In recent years, quantum computing has attracted growing interest from researchers, businesses, and the public. “Quantum” has become a buzzword that many use to attract attention. As this field has gained popularity, quantum machine learning (QML) has emerged as an area where quantum computing and machine learning meet.
As someone with an interest in machine learning and a deep love for math and quantum computing, I found the concept of quantum machine learning very appealing. But as a researcher in the field, I was also somewhat skeptical about the near-term applications of QML.
Today, machine learning powers tools such as recommendation systems and medical diagnostics by finding patterns in data and making predictions. Quantum computing, in contrast, processes information differently by leveraging effects such as superposition and entanglement.
The field of quantum machine learning explores whether these quantum effects can be harnessed for learning, and seeks to answer one central question:
Can quantum computers help us learn from data more effectively?
However, as with anything related to quantum computing, it’s important to set clear expectations. Today’s quantum computers are noisy and cannot yet run large-scale programs. That said, they can already provide proofs of concept for the utility of QML in various applications.
Moreover, QML isn’t meant to replace classical machine learning. Instead, it looks for parts of the learning process where quantum systems might offer an advantage, such as data representation, exploring complex feature spaces, or optimization.
With that in mind, how can a data scientist or a machine learning engineer dip their toe in the pool that is QML? Any machine learning algorithm (quantum or classical) requires data. The first step is always data preparation and cleaning. So, how can we prepare the data for use in a QML algorithm?
This article is all about QML workflows and data encoding.
Quantum Machine Learning Workflows
Before we jump into data, let’s take a quick pause and briefly define what quantum machine learning is. At a high level, quantum machine learning refers to algorithms that use quantum systems to perform machine learning tasks, including:
1. Classification
2. Regression
3. Clustering
4. Optimization
Most approaches today fall into what we call hybrid quantum-classical models, in which classical computers handle data input and optimization, while quantum circuits are part of the model.
A helpful way to think about this is: Classical machine learning focuses on designing features, while quantum machine learning often focuses on encoding features into quantum states.
Since data can take many forms, QML workflows may look different depending on the type of input and algorithm.
If we have classical data and a classical algorithm, that is our typical machine learning workflow. The other three options are where things get somewhat interesting.
1. Quantum Data with a Quantum Model (Fully Quantum)
The most straightforward approach is to have some quantum data and use it with a quantum model. In theory, what would this workflow look like?
1- Quantum Data Input: The input is already a quantum state: ∣ψ⟩
2- Quantum Processing: A circuit transforms the state: U(θ)∣ψ⟩
3- Measurement
The data we are working with might come from:
1. A quantum experiment (e.g., a physical system being measured).
2. A quantum sensor.
3. Another quantum algorithm or simulation.
Because the data is already quantum, there is no need for an encoding step. At a conceptual level, this is the “purest” form of quantum machine learning, so we might expect the strongest form of quantum advantage here!
But, this workflow is still limited in practice due to some challenges, including:
1. Access to Quantum Data: Most real-world datasets (images, text, tabular data) are classical. Truly quantum data is much harder to obtain.
2. State Preparation and Control: Even with quantum data, preparing and maintaining the state ∣ψ⟩ with high fidelity is challenging due to noise and decoherence.
3. Measurement Constraints: Even when we delay measurement until the end, we still face limitations: each measurement extracts only partial information from the quantum state, and the observables must be designed carefully.
In this type of workflow, the goal is to learn directly from quantum systems.
2. Quantum Data with Classical Algorithms
So far, we have focused on workflows in which quantum data is used in a quantum system. But we should also consider the scenario where we have quantum data, and we want to use it with a classical ML algorithm.
At first glance, this seems like a natural extension. If quantum systems can generate rich, high-dimensional data, why not use classical machine learning models to analyze it?
In practice, this workflow is feasible, but with an important limitation.
A quantum system of n qubits is described by a state such as:
∣ψ⟩ = Σᵢ αᵢ∣i⟩, i = 0, …, 2ⁿ − 1
which contains exponentially many amplitudes αᵢ. However, classical algorithms cannot directly access this state. Instead, we must measure the system to extract classical information, for example, through expectation values:
⟨O⟩ = ⟨ψ∣O∣ψ⟩
These measured quantities can then be used as features in a classical model.
These measured quantities can then be used as features in a classical model.
The challenge is that measurement fundamentally limits the amount of information we can extract. Each measurement provides only partial information about the state, and recovering the full state would require an impractical number of repeated experiments.
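To see this limitation concretely, here is a small numpy sketch (not a Qiskit program) of estimating the expectation value ⟨Z⟩ of a hypothetical single-qubit state from a finite number of shots; the state, shot count, and seed are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-qubit state: cos(theta/2)|0> + sin(theta/2)|1>
theta = 0.8
psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])

# Exact expectation value of Z is |a0|^2 - |a1|^2 (= cos(theta) here)
exact_z = abs(psi[0]) ** 2 - abs(psi[1]) ** 2

# A finite number of measurements only *estimates* this value
shots = 1000
outcomes = rng.choice([1, -1], size=shots,
                      p=[abs(psi[0]) ** 2, abs(psi[1]) ** 2])
estimated_z = outcomes.mean()

print(exact_z, estimated_z)  # the estimate fluctuates around the exact value
```

Each expectation value is just one number, so recovering all the amplitudes of a large state this way would require exponentially many such experiments.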
That being said, classical machine learning can play a valuable role in analyzing noisy measurement data, identifying patterns, or improving signal processing.
Hence, most quantum machine learning approaches aim to keep data in the quantum system for as long as possible—bringing us back to the central challenge of this article:
How do we encode classical data into quantum states in the first place?
So, let’s talk about the final workflow.
3. Classical Data with a Quantum Model (Hybrid QML)
This is the most common workflow used today: we encode classical data into quantum states, process it with a quantum model, and read out the results. Hybrid QML algorithms like this have five steps:
1- Classical Data Input
Data starts in a familiar form: x = (x₁, x₂, …, xₙ)
2- Encoding Step
The data is mapped into a quantum state: x ↦ ∣ψ(x)⟩
3- Quantum Processing
A parameterized circuit processes the data: U(θ)∣ψ(x)⟩
4- Measurement
Results are extracted as expectation values: ⟨ψ(x)∣U†(θ) M U(θ)∣ψ(x)⟩
5- Classical Optimization Loop
Parameters θ are updated using classical optimizers.
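The loop above can be sketched end-to-end with numpy alone. This toy example (an illustrative assumption, not a Qiskit program) trains a single Ry(θ) rotation so that the measured ⟨Z⟩ is minimized, using the parameter-shift rule for gradients:

```python
import numpy as np

def expectation_z(theta):
    """<Z> measured after Ry(theta)|0>, which equals cos(theta)."""
    psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    return abs(psi[0]) ** 2 - abs(psi[1]) ** 2

# Classical optimization loop: gradient descent with the parameter-shift rule
theta, lr = 0.1, 0.4
for _ in range(100):
    grad = (expectation_z(theta + np.pi / 2)
            - expectation_z(theta - np.pi / 2)) / 2
    theta -= lr * grad

print(theta, expectation_z(theta))  # theta approaches pi, where <Z> = -1
```

In a real hybrid algorithm, expectation_z would be evaluated on quantum hardware while the update step stays classical.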
This workflow brings a new challenge that isn’t found in classical machine learning:
How do we efficiently encode classical data into a quantum system?
That’s what we will answer next!

Classical Data Encoding
If we step back and compare these workflows, one thing becomes clear: the main structural difference is the encoding step.
Because most real-world applications use classical datasets, this step is usually necessary. So, how do we represent classical data in a quantum system?
In classical computing, data is stored as numbers in memory.
In quantum computing, data must be represented as a quantum state. For a single qubit:
∣ψ⟩ = α∣0⟩ + β∣1⟩
For multiple qubits:
∣ψ⟩ = Σᵢ αᵢ∣i⟩
where the αᵢ are complex amplitudes satisfying Σᵢ ∣αᵢ∣² = 1. So, in simple terms, encoding means taking classical data and mapping it into the amplitudes, phases, or rotations of a quantum state.
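As a minimal numbers-only illustration (the two-feature vector is made up), here is what “mapping data into amplitudes” requires: the values must be normalized so the squared amplitudes sum to one:

```python
import numpy as np

# A hypothetical 2-feature data point mapped onto one qubit's amplitudes
x = np.array([3.0, 4.0])
amplitudes = x / np.linalg.norm(x)  # [0.6, 0.8] -> a valid quantum state

# A valid state must satisfy sum_i |alpha_i|^2 = 1
print(np.sum(np.abs(amplitudes) ** 2))  # ~1.0
```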
Now, let’s take a deeper look at the different types of data encoding.
1. Basis Encoding (Binary Mapping)
This is the simplest approach to encoding classical data. Basically, we represent classical binary data directly as qubit states.
Qiskit Example
from qiskit import QuantumCircuit
# Encode a bitstring by flipping the qubits that hold a 1
qc = QuantumCircuit(3)
qc.x(0) # qubit 0 -> |1⟩
qc.x(2) # qubit 2 -> |1⟩
qc.draw('mpl')

Here, each bit maps directly to a qubit, and no superposition is used. This approach only works for simple datasets, and it is usually used in demonstrations and teaching rather than in actual QML implementations.
In this type of data encoding, you would need one qubit per feature, which doesn’t scale well to larger, more realistic problems.
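Conceptually, basis encoding just places a single amplitude of 1 at the index given by the bitstring. A small numpy sketch (the bitstring "101" is an arbitrary example):

```python
import numpy as np

# Basis encoding: the bitstring "101" becomes the basis state |101>,
# i.e. a statevector with a single 1 at index int("101", 2) = 5
bits = "101"
index = int(bits, 2)
state = np.zeros(2 ** len(bits))
state[index] = 1.0

print(index, state)  # index 5 holds the only nonzero amplitude
```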
2. Angle Encoding
For a richer encoding, instead of mapping values to 0 or 1, we use rotations to encode our classical data. A qubit state can be rotated about three axes: X, Y, and Z.
In angle encoding, we take a classical feature x and map it onto a quantum state using a rotation:
∣ψ(x)⟩ = R_α(x)∣0⟩, where α ∈ {x, y, z}.
So in principle, you can use Rx(x), Ry(x), or Rz(x).
But not all of them encode data in the same way. In most cases, Rx or Ry is used for data encoding.
Qiskit Example
from qiskit import QuantumCircuit
import numpy as np
x = [0.5, 1.2] # two classical features
qc = QuantumCircuit(2)
qc.ry(x[0], 0) # feature 0 becomes a Y-rotation angle on qubit 0
qc.ry(x[1], 1) # feature 1 becomes a Y-rotation angle on qubit 1
qc.draw('mpl')

Angle encoding can, in principle, be implemented using rotations about any axis (e.g., Rx, Ry, Rz). However, rotations about the Y- and X-axes directly affect measurement probabilities, while Z-rotations encode information in phase and require additional operations to become observable.
Rotation-based encoding handles continuous data naturally and yields a compact representation that is easy to implement. On its own, however, each feature acts independently on its qubit, so the encoding stays mostly linear unless we add entanglement.
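To make the mapping explicit, here is the effect of a single Ry(x) rotation computed by hand in numpy (a sketch; the feature value 0.5 is arbitrary):

```python
import numpy as np

def ry_state(x):
    """State Ry(x)|0> = cos(x/2)|0> + sin(x/2)|1>."""
    return np.array([np.cos(x / 2), np.sin(x / 2)])

x = 0.5
psi = ry_state(x)
p1 = abs(psi[1]) ** 2  # probability of measuring |1> is sin^2(x/2)
print(p1)  # the feature value is now visible in the measurement statistics
```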
3. Amplitude Encoding
This is where things start to feel “quantum.” In amplitude encoding, the data is encoded into the amplitudes of a quantum state.
With n qubits, we can encode 2ⁿ values, which means we get exponential compression.
Qiskit Example
from qiskit import QuantumCircuit
import numpy as np
x = np.array([1, 1, 0, 0])
x = x / np.linalg.norm(x) # normalize so the values form valid amplitudes
qc = QuantumCircuit(2)
qc.initialize(x, [0, 1]) # amplitude-encode all four values into two qubits
qc.draw('mpl')

The challenge with this approach is that state preparation is expensive (circuit-wise), which can make circuits deep and noisy. So, even though amplitude encoding seems powerful in theory, it’s not always practical with current hardware.
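The exponential compression itself is easy to demonstrate classically. The numpy sketch below (with arbitrary random data) packs 1024 values into the amplitudes a 10-qubit state would hold:

```python
import numpy as np

rng = np.random.default_rng(1)

# 2^n values fit into n qubits: here 1024 features in just 10 qubits
n = 10
x = rng.normal(size=2 ** n)
state = x / np.linalg.norm(x)  # amplitude-encoded state

probs = np.abs(state) ** 2  # measurement probabilities are squared amplitudes
print(len(state), probs.sum())  # 1024 values, probabilities summing to ~1
```

Loading such a vector onto real hardware, however, is exactly the deep state-preparation circuit that makes this encoding costly.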
4. Feature Maps (Higher-Order Encoding)
So far, we’ve mostly just loaded classical data into quantum states. Feature maps go a step further by introducing nonlinearity, capturing feature interactions, and leveraging entanglement.
The structure of this encoding applies a unitary built from rotations and entangling gates:
∣ψ(x)⟩ = U_Φ(x)∣0…0⟩, where Φ(x) includes nonlinear terms such as the products xᵢxⱼ.
That means features don’t just act independently; they interact with each other.
Qiskit Example
from qiskit import QuantumCircuit
x1, x2 = 0.5, 1.0
qc = QuantumCircuit(2)
qc.ry(x1, 0) # encode each feature as a rotation
qc.ry(x2, 1)
qc.cx(0, 1) # entangle the qubits
qc.rz(x1 * x2, 1) # nonlinear interaction term x1·x2
qc.draw('mpl')

This type of encoding is the quantum equivalent of polynomial features or kernel transformations. This lets the model find complex relationships in the data.
You can think of feature maps as transforming data into a new space, much as kernels do in classical machine learning. Instead of mapping data into a higher-dimensional classical space, QML maps it into a quantum Hilbert space.
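One way to see the kernel analogy is to compute the overlap between two encoded data points. The numpy sketch below rebuilds the small feature-map circuit from the example (Ry, Ry, CX, Rz of the product), taking the first tensor factor as the control qubit (a convention choice; Qiskit orders qubits differently), and evaluates the fidelity kernel:

```python
import numpy as np

def ry(t):
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2),  np.cos(t / 2)]])

def rz(t):
    return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

# CNOT with the first tensor factor as control
CX = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0]], dtype=complex)

def feature_map_state(x1, x2):
    """State after Ry(x1), Ry(x2), CX, Rz(x1*x2) applied to |00>."""
    U = np.kron(np.eye(2), rz(x1 * x2)) @ CX @ np.kron(ry(x1), ry(x2))
    zero = np.zeros(4, dtype=complex)
    zero[0] = 1.0
    return U @ zero

def kernel(a, b):
    """Fidelity kernel |<psi(a)|psi(b)>|^2 between two encoded points."""
    return abs(np.vdot(feature_map_state(*a), feature_map_state(*b))) ** 2

print(kernel((0.5, 1.0), (0.5, 1.0)))  # identical points -> kernel 1.0
print(kernel((0.5, 1.0), (1.5, 0.2)))  # different points -> kernel < 1
```

A kernel matrix built this way is exactly what a classical SVM could consume, which is the idea behind quantum kernel methods.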

Final Thoughts
Even though quantum computers are not fully there, hardware-wise, there is a lot we can do with them today. One of the most promising applications of quantum computers is quantum machine learning. If there’s one idea worth holding onto from this article, it’s this:
In quantum machine learning, how you encode the data often matters as much as the model you are using.
This might seem surprising at first, but it’s actually similar to classical machine learning. The difference is that in QML, encoding isn’t just preprocessing; it’s part of the model itself.
And, just like the wider field of quantum computing, this area is still developing. We don’t yet know the “best” encoding strategies. The hardware constraints shape what’s practical today, and new approaches are still being explored.
So if you’re looking to get into quantum computing, quantum machine learning is one of the most impactful places to start. Not by jumping straight into complex algorithms, but by starting with a much simpler question: How can my data interact with a quantum system?
Answering that question allows us to fully utilize the power of the quantum computers we have today.
References & Further Reading
- Schuld, M., & Petruccione, F. (2018). Supervised learning with quantum computers (Vol. 17, p. 3). Berlin: Springer.
- Havlíček, V., Córcoles, A. D., Temme, K., Harrow, A. W., Kandala, A., Chow, J. M., & Gambetta, J. M. (2019). Supervised learning with quantum-enhanced feature spaces. Nature, 567(7747), 209-212.
- Qiskit Documentation: https://qiskit.org/documentation/
- Schuld, M., & Killoran, N. (2019). Quantum machine learning in feature Hilbert spaces. Physical Review Letters, 122(4), 040504.