Classifying Breast Cancer with Quantum Machine Learning using IBM Qiskit

Andrew A Borkowski
6 min read · Jan 16, 2023

Part 2. Variational Quantum Circuit Method

Source: Quantum.gov

Breast cancer is a common and often aggressive form of cancer that affects millions of people worldwide. Early diagnosis is critical to successful treatment, and machine-learning techniques have been widely used to help classify breast tumors as benign or malignant.

This blog post is the second part of a two-part series exploring how quantum machine learning can help diagnose breast cancer using the Wisconsin breast cancer dataset. The first article introduced the quantum kernel method for this task; here we show how to approach the same task with a variational quantum circuit. We again use the IBM Qiskit library to write the program and run it on both a quantum simulator and a quantum computer.

The first article also described the basics of quantum computing, quantum gates, superposition and entanglement, and the differences between bits and qubits. Therefore, I recommend you read it before reading this article.

A variational quantum circuit, also called a quantum neural network, is built around a quantum circuit with a fixed structure, called the “ansatz,” whose gates are parameterized by a set of variables. These parameters are adjusted to minimize a cost function that encodes the problem to be solved, such as the classification in our example. In practice, the parameters are the rotation angles applied to the qubits, and a classical optimizer updates them during model training.
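To make the idea of "adjusting rotation angles to minimize a cost function" concrete, here is a minimal single-qubit sketch in plain NumPy. It is not the article's classifier: the cost (the expectation value of Z), the learning rate, and the use of the parameter-shift rule for the gradient are all choices made for this illustration, whereas the article's training uses SPSA.

```python
import numpy as np

def ry(theta):
    """Matrix of a single-qubit rotation about the Y axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def cost(theta):
    """Expectation value <psi|Z|psi> for |psi> = Ry(theta)|0>."""
    psi = ry(theta) @ np.array([1.0, 0.0])
    return float(psi @ Z @ psi)

def train(theta=0.3, lr=0.4, steps=60):
    """Gradient descent on the rotation angle (illustrative settings)."""
    for _ in range(steps):
        # Parameter-shift rule: gradient from two cost evaluations.
        grad = 0.5 * (cost(theta + np.pi / 2) - cost(theta - np.pi / 2))
        theta -= lr * grad
    return theta

theta_opt = train()
final_cost = cost(theta_opt)
print(f"optimal angle = {theta_opt:.4f}, cost = {final_cost:.6f}")
```

The angle converges toward π, where the qubit has been rotated from |0> to |1> and the cost reaches its minimum of -1.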

In the next few steps, we will walk through the Python program for this project. Although the initial steps are the same as in the previous article, we will review them again for completeness.

1. We import basic Python libraries:

2. We load the Wisconsin Breast Cancer Dataset:

3. We convert the sklearn dataset to the Pandas dataframe:

4. We assign independent variables and dependent (target) variable:
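Steps 1 to 4 can be sketched as follows. This is a reconstruction under assumptions, not the article's original listing: the variable names (`data`, `df`, `X`, `y`) and the `target` column name are illustrative.

```python
# Step 1: basic imports.
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Step 2: load the dataset bundled with scikit-learn.
data = load_breast_cancer()

# Step 3: wrap the feature matrix in a DataFrame and append the labels.
df = pd.DataFrame(data.data, columns=data.feature_names)
df["target"] = data.target  # 0 = malignant, 1 = benign

# Step 4: independent variables X and dependent (target) variable y.
X = df.drop(columns="target")
y = df["target"]

print(X.shape, y.value_counts().to_dict())
```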

It is worth noting that the Wisconsin breast cancer dataset has 30 features, which makes it difficult to process and analyze on a quantum computer with a limited number of qubits. Therefore, to make the task more manageable, we will first use principal component analysis (PCA) to reduce the dimensionality of the dataset to just four variables for the simulator and two for the quantum computer.

5. We standardize features by removing the mean and scaling to unit variance, which is needed for PCA:

6. We reduce feature dimensionality with PCA:
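A minimal sketch of steps 5 and 6, assuming scikit-learn's `StandardScaler` and `PCA`. The names `X_std`, `X_sim`, and `X_qc` are illustrative, and the dataset is reloaded so the block runs on its own.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Step 5: zero mean and unit variance per feature, so that no single
# feature dominates the principal components because of its scale.
X_std = StandardScaler().fit_transform(X)

# Step 6: four components for the simulator, two for the quantum computer.
X_sim = PCA(n_components=4).fit_transform(X_std)
X_qc = PCA(n_components=2).fit_transform(X_std)

print(X_sim.shape, X_qc.shape)
```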

Dimensionality reduction for the QASM simulator
Dimensionality reduction for a quantum computer

7. We normalize the data with MinMaxScaler, which is needed for the Variational Quantum Classifier algorithm:

8. We split the dataset into training and testing:
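Steps 7 and 8 might look like the sketch below. The scaling range `(0, 1)`, the 80/20 split, the stratification, and the random seed are assumptions, since the article does not state them here. The earlier preprocessing is recreated compactly so the block is self-contained.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Steps 5-6 recreated compactly (four components, as for the simulator).
X_pca = make_pipeline(StandardScaler(), PCA(n_components=4)).fit_transform(X)

# Step 7: rescale every component to a fixed range (assumed to be [0, 1]).
X_scaled = MinMaxScaler(feature_range=(0, 1)).fit_transform(X_pca)

# Step 8: hold out part of the data for testing (assumed 20%).
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```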

Now we will code and train a variational quantum classifier (VQC). The VQC is the simplest variational quantum circuit classifier in the Qiskit Machine Learning library. Its two central elements are the feature map and the ansatz. Our data is classical, so it is stored as bits rather than qubits, and we need a way to encode it into qubit states. This encoding step, usually called mapping, is the role of the feature map, and it is crucial for obtaining an effective quantum model. The Qiskit library offers several feature maps, and you can also program a custom one. We will use ZZFeatureMap, as we did in the previous article.

In the next few steps, we will illustrate the code for the QASM simulator. The code for a quantum computer is similar, only with two instead of four feature dimensions.

9. We encode our data as qubits by creating the quantum circuit with four qubits (one qubit for each feature):

Feature map for QASM simulator

Once the data is encoded, we apply a parameterized quantum circuit, the ansatz mentioned earlier. This circuit is a direct analog of the layers in a classical neural network. It is built from unitary operations that depend on adjustable external parameters. Given a prepared state |ψ>, the model circuit U(w) maps |ψ> to another state |ψ'> = U(w)|ψ>, where U(w) is a sequence of unitary gates. The gates we will use are Ry gates, which rotate a qubit about the Y axis, and CNOT gates, which entangle two qubits and create quantum correlations between them. For our project, we will use the RealAmplitudes variational quantum circuit from the Qiskit library.
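The action of Ry and CNOT gates can be checked with a small NumPy sketch. This two-qubit U(w) is a toy model circuit chosen for illustration, not the RealAmplitudes circuit itself; the basis ordering |00>, |01>, |10>, |11> with the left-most qubit as the CNOT control is a convention assumed here.

```python
import numpy as np

def ry(theta):
    """Single-qubit rotation about the Y axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# CNOT with the left-most qubit as control.
CNOT = np.array(
    [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)

def U(w):
    """Toy model circuit: an Ry rotation on each qubit, then a CNOT."""
    return CNOT @ np.kron(ry(w[0]), ry(w[1]))

psi = np.array([1.0, 0.0, 0.0, 0.0])  # prepared state |00>
w = np.array([np.pi / 2, 0.0])        # illustrative weights
psi_out = U(w) @ psi                  # |psi'> = U(w)|psi>

print(np.round(psi_out, 4))
```

With these weights the output is the entangled state (|00> + |11>)/√2, and U(w) stays unitary for any choice of w.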

10. We code our parameterized circuit:

Our Variational Quantum Circuit

11. We now combine our feature map with our circuit:

Our complete circuit, including the feature map

12. We set up initial random values for the gates (“weights”):

13. We one-hot encode our labels, as required by the VQC algorithm:
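Steps 12 and 13 can be illustrated in NumPy. The number of weights (8), the sampling range [0, 1), the seed, and the toy label array are all hypothetical values chosen for this sketch; in the real program the weight count should match the ansatz and the labels come from the training split.

```python
import numpy as np

# Step 12: random starting values for the trainable rotation angles.
rng = np.random.default_rng(seed=42)
num_weights = 8  # hypothetical; should equal the ansatz parameter count
initial_point = rng.random(num_weights)

# Step 13: one-hot encode binary labels (toy labels for illustration).
y_train = np.array([0, 1, 1, 0, 1])
y_train_1h = np.eye(2)[y_train]  # row i is [1, 0] for class 0, [0, 1] for class 1

print(initial_point.shape)
print(y_train_1h)
```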

14. We create a VQC object. We will use Simultaneous Perturbation Stochastic Approximation (SPSA) as the optimizer. The main feature of SPSA is its stochastic gradient approximation, which requires only two measurements of the objective function per iteration, regardless of the dimension of the optimization problem. According to the Qiskit documentation, SPSA can be used in the presence of noise, so it is well suited to finding a minimum when quantum measurements are uncertain. This matters because we are still in the noisy intermediate-scale quantum (NISQ) era. As described in the previous article, to use an actual IBM quantum computer as a backend, we must first set up an IBM Quantum Experience account.

Create a VQC object for the QASM simulator
Create a VQC object for a quantum computer
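The two-evaluation gradient estimate behind SPSA, described in step 14, can be illustrated with a short NumPy sketch. This is not Qiskit's SPSA implementation: the function `spsa_minimize`, its gain constants, and the quadratic test objective are assumptions made for illustration.

```python
import numpy as np

n_evals = 0  # counts objective evaluations

def objective(x):
    """Toy cost with its minimum at x = (1, ..., 1)."""
    global n_evals
    n_evals += 1
    return float(np.sum((x - 1.0) ** 2))

def spsa_minimize(f, x0, steps=1000, a=0.1, c=0.1, seed=0):
    """Minimal SPSA loop with decaying step and perturbation sizes."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for k in range(1, steps + 1):
        ak = a / k ** 0.602  # learning-rate schedule
        ck = c / k ** 0.101  # perturbation-size schedule
        # Perturb all coordinates at once with random +/-1 signs.
        delta = rng.choice([-1.0, 1.0], size=x.shape)
        # Two evaluations per step, whatever the dimension of x.
        diff = f(x + ck * delta) - f(x - ck * delta)
        grad_est = diff / (2 * ck * delta)
        x = x - ak * grad_est
    return x

x_opt = spsa_minimize(objective, x0=np.zeros(8))
evals_used = n_evals
final_value = objective(x_opt)
print(f"evaluations: {evals_used}, final cost: {final_value:.2e}")
```

Even with eight parameters, each iteration costs two objective evaluations, which is what makes the method attractive when every evaluation means running a circuit on noisy hardware.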

15. We fit our classifier:

16. We print metrics for a testing dataset:

The result for the QASM simulator as a backend
The result for a quantum computer as a backend
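The metric computation in step 16 might be sketched as follows with scikit-learn. The arrays below are made-up toy values that only show the mechanics of decoding one-hot predictions; they are not the article's results.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report

# Toy ground truth and toy one-hot predictions (illustrative only).
y_test = np.array([0, 1, 1, 0, 1, 1])
y_pred_1h = np.array([[1, 0], [0, 1], [1, 0], [1, 0], [0, 1], [0, 1]])

# Convert one-hot rows back to class indices before scoring.
y_pred = np.argmax(y_pred_1h, axis=1)

acc = accuracy_score(y_test, y_pred)
print(f"accuracy: {acc:.3f}")
print(classification_report(y_test, y_pred, target_names=["malignant", "benign"]))
```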

Conclusion

The results for the breast cancer classification with the variational quantum circuit method are not as good as with the quantum kernel method. In addition, this method takes more time to train the model (275 seconds using the simulator and 176,247 seconds, roughly 49 hours, using the IBM quantum computer). Therefore, at this time, the quantum kernel method would be preferable for the Wisconsin Breast Cancer Dataset classification.

I hope you enjoyed reading this article.

All the best,
Andrew
