Quantum Machine Learning
Giuseppe Di Molfetta & Hachem Kadri
CANA & QARMA,
Lab. d'Informatique Fondamentale de Marseille
Aix-Marseille Université, France
Outline
- Machine Learning
- Quantum Computing
- Quantum Machine Learning
Machine Learning?
Machine learning is the study of computer algorithms that improve automatically through experience
Machine learning is programming computers to optimize a performance criterion using example data or past experience
Image annotation/decoding
(from Wolfram, ML Mathematica toolbox) (from Haxby et al., 2001)
AlphaGo (Silver et al. 2016)
Process, Generalization
[Figure: learning pipeline — raw data → selection → preprocessing → prepared data → learning → models → validation/interpretation → knowledge]
Goal
From a training set of randomly sampled (input, target) pairs, learn a function (a predictor) that predicts the target of new data well.
Supervised learning / Generalization
Given l training examples (x₁, y₁), …, (x_l, y_l) ∈ X × Y and u test data x_{l+1}, …, x_{l+u} ∈ X
→ Learn f : X → Y that generalizes from training to test data
Background
By the end of the 1970s, the mathematical foundations of machine/statistical learning had been laid, at the intersection of computer science, mathematical statistics, and optimization.
Perceptron (Rosenblatt, 1958)
Inspiration: biological neural networks
Motivations:
- A learning system composed of simple, interconnected processing units
- Efficiency, scalability, and adaptability
Perceptron: a linear classifier, X = ℝ^d, Y = {−1, +1}
Output: σ(Σ_{i=1}^d w_i x_i + w₀)
Bias: activation = 1
[Figure: perceptron with inputs x₁, x₂, weights w₀, w₁, w₂, and activation σ]
Perceptron (Rosenblatt, 1958)
Perceptron: a linear classifier, X = Rd, Y = {−1, +1}
- Classifier weights: w ∈ ℝ^d
- Classifier prediction: f(x) = sign⟨w, x⟩
- Question: how to learn w from training data?
Perceptron (Rosenblatt, 1958)
Algorithm: S = {(X_n, Y_n)}_{n=1}^N
w ← 0
while there exists (X_n, Y_n) with Y_n⟨w, X_n⟩ ≤ 0 do
  w ← w + Y_n X_n
end while
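In code, the update loop above can be sketched as follows (a minimal NumPy illustration; the `max_epochs` cap and the toy data are additions for this sketch, not part of the slides):

```python
import numpy as np

def perceptron(X, Y, max_epochs=100):
    """Rosenblatt's perceptron: update on misclassified points until none remain."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):          # safety cap, not in the original algorithm
        mistakes = 0
        for x, y in zip(X, Y):
            if y * np.dot(w, x) <= 0:    # Y_n <w, X_n> <= 0: misclassified
                w = w + y * x            # update step: w <- w + Y_n X_n
                mistakes += 1
        if mistakes == 0:                # every training point correctly classified
            break
    return w

# Toy linearly separable data, labels in {-1, +1}
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
Y = np.array([1.0, 1.0, -1.0, -1.0])
w = perceptron(X, Y)
assert np.all(np.sign(X @ w) == Y)
```

On separable data the loop terminates by Novikoff's theorem; on non-separable data the epoch cap is what stops it.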
Perceptron in action
Perceptron: some results
Theorem (Number of iterations, Novikoff, 1962)
If there exist γ > 0 and w* with ‖w*‖ = 1 such that ‖X_n‖ ≤ R and Y_n⟨w*, X_n⟩ ≥ γ for all n = 1, …, N, then the number of mistakes made by the Perceptron algorithm is at most R²/γ²
Theorem (XOR, Minsky, Papert, 1969)
The perceptron algorithm cannot solve the XOR problem, which is not linearly separable
Neural Networks
Units i → j → k with weights w_ji, w_kj; each unit j has input s_j and activation a_j:
s_j = Σ_i w_ji a_i,  a_j = σ(s_j)
Activation functions: σ(x) = 1/(1 + exp(−x)) or σ(x) = tanh(x)
Bias: activation = 1
[Figure: feedforward network mapping input x = (x₁, x₂) to output y = (y₁, y₂)]
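The forward pass described above — weighted sums followed by a nonlinearity, with a constant bias unit — can be sketched as follows; the layer sizes and weight values here are arbitrary illustrations:

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, W2):
    """One hidden layer: s_j = sum_i w_ji * a_i, then a_j = sigma(s_j)."""
    a_hidden = sigmoid(W1 @ np.append(x, 1.0))   # appended 1.0 = bias unit
    y = sigmoid(W2 @ np.append(a_hidden, 1.0))
    return y

# Hypothetical weights: 2 inputs (+ bias) -> 2 hidden units (+ bias) -> 2 outputs
W1 = np.array([[0.5, -0.3, 0.1],
               [0.2,  0.8, -0.4]])
W2 = np.array([[1.0, -1.0, 0.0],
               [-0.5, 0.5, 0.2]])
y = forward(np.array([1.0, 2.0]), W1, W2)
assert y.shape == (2,) and np.all((y > 0) & (y < 1))
```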
SVM and Kernel Methods
Separating hyperplane: w·x + b = 0; margin = 2/‖w‖; the support vectors lie on the margin boundaries
Computer Science Department Aix-Marseille University
Quantum Walks
Grover Alg., an introduction
What is the essence of computation? Input → Computation → Output (e.g., Hello.c → "Hello World!", or 2 + 2 → 4)
Church-Turing Thesis: Computation is anything that can be done by a Turing machine. This definition coincides with our intuitive ideas of computation: addition, multiplication, binary logic, etc… What is a Turing machine?
[Figure: a Turing machine — an infinite tape of 0s and 1s (e.g., …0100101101010010110…), a read/write head, and a finite state automaton (control module); the computation transforms the input tape into an output tape (e.g., …1110010110100111101…)]
Desktop computers, billiard balls, DNA, cellular automata.
These can all be shown to be equivalent to each other and to a Turing machine! The Big Question: What next?
Conventional computers, no matter how exotic, all obey the laws of classical physics. On the other hand, a quantum computer obeys the laws of quantum physics.
A general qubit state α|0⟩ + β|1⟩ satisfies |α|² + |β|² = 1
The basic component of a classical computer is the bit, a single binary variable of value 0 or 1.
The state of a classical computer is described by some long bit string of 0s and 1s. 0001010110110101000100110101110110...
At any given time, a bit takes exactly one value: it is a 1-D point in one of the two states, 0 or 1.
Pbit: a point on the line between the two states 0 and 1, pbit = p·[1] + (1 − p)·[0]
Qubit: a point on a 3-D sphere (the Bloch sphere) with 0 and 1 at the poles, and infinitely many superpositions as the other points:
|ψ⟩ = cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩
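The Bloch-sphere parametrization can be checked numerically: normalization holds for any θ, φ, and measurement outcomes follow the |amplitude|² probabilities (a small NumPy sketch):

```python
import numpy as np

def qubit(theta, phi):
    """Bloch-sphere state |psi> = cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

psi = qubit(np.pi / 2, 0.0)                        # equator: (|0> + |1>)/sqrt(2)
assert np.isclose(np.sum(np.abs(psi) ** 2), 1.0)   # |alpha|^2 + |beta|^2 = 1

# Measurement is stochastic: sample outcomes with probabilities |amplitude|^2
rng = np.random.default_rng(0)
outcomes = rng.choice([0, 1], size=10000, p=np.abs(psi) ** 2)
assert abs(outcomes.mean() - 0.5) < 0.05           # roughly 50% '0', 50% '1'
```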
How does the use of qubits affect computation?
Classical Computation
Data unit: bit
Valid states: x = '0' or x = '1'
Quantum Computation
Data unit: qubit
Valid states: |ψ⟩ = c₁|0⟩ + c₂|1⟩, e.g. |ψ⟩ = |0⟩, |ψ⟩ = |1⟩, |ψ⟩ = (|0⟩ + |1⟩)/√2
How does the use of qubits affect computation?
Classical Computation
Operations: logical
Valid operations, e.g.: NOT = (0 1; 1 0) — 1-bit; AND — 2-bit
Quantum Computation
Operations: unitary
Valid operations, e.g.: σx = (0 1; 1 0), σy = (0 −i; i 0), σz = (1 0; 0 −1), Hd = 1/√2 (1 1; 1 −1) — 1-qubit; CNOT — 2-qubit
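The gates above can be written out as explicit matrices and checked for unitarity (a NumPy sketch):

```python
import numpy as np

# Standard 1-qubit gates as unitary matrices
X = np.array([[0, 1], [1, 0]])                 # sigma_x, the quantum NOT
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate Hd
# 2-qubit CNOT: flips the second qubit when the first qubit is |1>
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

for U in (X, H, CNOT):
    # Unitarity U^dagger U = I: every quantum gate is reversible
    assert np.allclose(U.conj().T @ U, np.eye(U.shape[0]))

# Hd maps |0> = (1, 0) to the equal superposition (|0> + |1>)/sqrt(2)
assert np.allclose(H @ np.array([1, 0]), np.array([1, 1]) / np.sqrt(2))
```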
How does the use of qubits affect computation?
Classical Computation
Measurement: deterministic — state x = '0' gives result '0'; state x = '1' gives result '1'
Quantum Computation
Measurement: stochastic — |ψ⟩ = |0⟩ gives '0'; |ψ⟩ = |1⟩ gives '1'; |ψ⟩ = (|0⟩ − |1⟩)/√2 gives '0' with 50% and '1' with 50%
Single qubit
Arbitrary state: |ψ⟩ = c₁|0⟩ + c₂|1⟩ = (c₁, c₂)ᵀ
Operator: a 2×2 unitary U = (u₁₁ u₁₂; u₂₁ u₂₂) acting on the Hilbert space as U|ψ⟩
H₂ = 1/√2 (1 1; 1 −1), basis |0⟩, |1⟩
Two qubits
Arbitrary state: |Ψ⟩ = c₁|00⟩ + c₂|01⟩ + c₃|10⟩ + c₄|11⟩ = (c₁, c₂, c₃, c₄)ᵀ
Operator: a 4×4 unitary U = (u_ij) acting as U|Ψ⟩
H₂⊗² = H₂ ⊗ H₂, basis |00⟩, |01⟩, |10⟩, |11⟩
CNOT = (1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0)
Example circuit: apply the one-qubit gate σx to the first qubit, then the two-qubit gate CNOT, then measure:
|0⟩ ⊗ |0⟩ → σx ⊗ I → |1⟩ ⊗ |0⟩ → CNOT → |1⟩ ⊗ |1⟩ → measurement: '1', '1'
with σx ⊗ I = (0 0 1 0; 0 0 0 1; 1 0 0 0; 0 1 0 0)
A second example circuit, starting from |0⟩ ⊗ |0⟩, produces the superposition state (|00⟩ + |11⟩)/√2 (amplitudes 1/√2, 1/√2); measurement yields '0', '0' with probability 50% and '1', '1' with probability 50%
Separable state: can be written as a tensor product, |Ψ⟩ = |φ⟩ ⊗ |χ⟩
— e.g., ((|0⟩ + |1⟩)/√2) ⊗ ((|0⟩ + |1⟩)/√2) is separable
Entangled state: cannot be written as a tensor product, |Ψ⟩ ≠ |φ⟩ ⊗ |χ⟩
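Putting the pieces together: applying Hd to the first qubit of |00⟩ and then CNOT produces the entangled state (|00⟩ + |11⟩)/√2. A NumPy sketch, using the standard determinant criterion for two-qubit separability (the criterion is background knowledge, not stated on the slides):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

# Start in |00> = |0> (x) |0>; apply H to the first qubit, then CNOT
psi00 = np.kron(np.array([1, 0]), np.array([1, 0]))
bell = CNOT @ np.kron(H, np.eye(2)) @ psi00          # (|00> + |11>)/sqrt(2)
assert np.allclose(bell, np.array([1, 0, 0, 1]) / np.sqrt(2))

# A pure state c1|00> + c2|01> + c3|10> + c4|11> is separable iff c1*c4 = c2*c3
c1, c2, c3, c4 = bell
assert not np.isclose(c1 * c4 - c2 * c3, 0)          # the Bell state is entangled
```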
Quantum Superordinacy
All classical computations can be performed by a quantum computer.
No cloning theorem
It is impossible to exactly copy an unknown quantum state: no operation maps |ψ⟩|0⟩ to |ψ⟩|ψ⟩ for every |ψ⟩
Reversibility
Since quantum mechanics is reversible (dynamics are unitary), quantum computation is reversible: |00000000⟩ → |ψ⟩ → |00000000⟩
Imagine we are looking for the solution to a problem with N possible solutions. We have a black box (or "oracle") that can check whether a given answer is correct.
Question: I'm thinking of a number between 1 and 100. What is it?
1 → Oracle → No
2 → Oracle → No
3 → Oracle → Yes
Classical computer: query candidates one at a time (1 → Oracle → No; 2 → Oracle → No; 3 → Oracle → Yes; …). The best a classical computer can do on average is N/2 queries.
Quantum computer: prepare a superposition over all N possible inputs and query the oracle on all of them at once (1 + 2 + 3 + … → Oracle → No + No + Yes + No + …). Using Grover's algorithm, a quantum computer can find the answer in √N queries!
Pros: can be used on any unstructured search problem, even NP-complete problems.
Cons: only a quadratic speed-up over classical search.
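Grover's algorithm is easy to simulate directly on the state vector: an oracle that flips the sign of the marked amplitude, then "inversion about the mean", repeated O(√N) times (a NumPy sketch; the marked item and qubit count are arbitrary choices):

```python
import numpy as np

def grover(n_qubits, marked):
    """Simulate Grover search by direct state-vector linear algebra."""
    N = 2 ** n_qubits
    psi = np.full(N, 1 / np.sqrt(N))                 # uniform superposition
    oracle = np.eye(N)
    oracle[marked, marked] = -1                      # flip the marked amplitude
    s = np.full(N, 1 / np.sqrt(N))
    diffusion = 2 * np.outer(s, s) - np.eye(N)       # inversion about the mean
    for _ in range(int(np.round(np.pi / 4 * np.sqrt(N)))):   # O(sqrt(N)) queries
        psi = diffusion @ (oracle @ psi)
    return np.abs(psi) ** 2                          # measurement probabilities

probs = grover(n_qubits=6, marked=42)
assert np.argmax(probs) == 42 and probs[42] > 0.99
```

After ~π/4·√N iterations, measuring the register returns the marked item with probability close to 1.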
The circuit is not complicated, but it doesn’t provide an immediately intuitive picture of how the algorithm works. Are there any more intuitive models for quantum search?
[Figure: Grover circuit — each qubit is initialized to |0⟩ and passed through a Hadamard gate Hd; then a block consisting of the oracle O, Hadamard gates, a conditional σz, and Hadamard gates (the diffusion step) is repeated for O(√N) iterations]
Idea: extend classical random walk formalism to quantum mechanics
Classical random walk: at each time step, flip a coin and move the walker accordingly.
Quantum random walk: |ψ_{t+1}⟩ = U|ψ_t⟩, with U = S · C, where the coin operator C flips the quantum coin and the shift operator S moves the walker based on the coin state.
To obtain a search algorithm, we use our "black box" to apply a different type of coin operator, C₁, at the marked node:
C₀ = 1/2 (−1 1 1 1; 1 −1 1 1; 1 1 −1 1; 1 1 1 −1) (the Grover diffusion coin), C₁ = −I = (−1 0 0 0; 0 −1 0 0; 0 0 −1 0; 0 0 0 −1)
Pros: as general as Grover's search algorithm.
Cons: same complexity as Grover's search algorithm; slightly more complicated implementation; slightly more memory used.
Interesting feature: the search algorithm flows naturally from the walk.
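The walk U = S·C is straightforward to simulate. The sketch below runs the plain (unmarked) Hadamard walk on a line, exhibiting the ballistic spreading that the search variant exploits; the lattice size, step count, and initial coin state are arbitrary choices for illustration:

```python
import numpy as np

def quantum_walk_line(steps, n_positions=101):
    """Discrete-time quantum walk on a line: state = position (x) coin, U = S.C."""
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard coin C
    psi = np.zeros((n_positions, 2), dtype=complex)
    psi[n_positions // 2, 0] = 1.0                 # walker at the center, coin |0>
    for _ in range(steps):
        psi = psi @ H.T                            # C: flip the coin everywhere
        shifted = np.zeros_like(psi)
        shifted[1:, 0] = psi[:-1, 0]               # S: coin |0> moves right
        shifted[:-1, 1] = psi[1:, 1]               #    coin |1> moves left
        psi = shifted
    return np.sum(np.abs(psi) ** 2, axis=1)        # position distribution

p = quantum_walk_line(steps=40)
assert np.isclose(p.sum(), 1.0)                    # unitarity preserves the norm
x = np.arange(101) - 50
std = np.sqrt(np.sum(p * x**2) - np.sum(p * x)**2)
assert std > 2 * np.sqrt(40)   # spreads ~t (ballistic), vs ~sqrt(t) classically
```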
What about Fourier-transform-based algorithms?
Find the factors of 57: 3 × 19. Now find the factors of:
1623847601650176238761076269172261217123987210397462187618712073623846129873982634897121861102379691863198276319276121
All known algorithms for factoring an n-bit number on a classical computer take super-polynomial time. But Shor's algorithm for factoring on a quantum computer takes time proportional to O(n² log n).
It makes use of the quantum Fourier transform, which requires exponentially fewer operations than the classical FFT.
With a classical computer:
# bits              1024         2048          4096
factoring in 2006   10⁵ years    5×10¹⁵ years  3×10²⁹ years
factoring in 2024   38 years     10¹² years    7×10²⁵ years
factoring in 2042   3 days       3×10⁸ years   2×10²² years

With a potential quantum computer (e.g., clock speed 100 MHz):
# bits              1024         2048          4096
# qubits            5124         10244         20484
# gates             3×10⁹        2×10¹¹        ~10¹²
factoring time      4.5 min      36 min        4.8 hours
The details of Shor’s factoring algorithm are more complicated than Grover’s search algorithm, but the results are clear:
Errors accumulate, lowering success rate of algorithm
[Figure: Grover's algorithm success rate vs. n = # of qubits — ideal vs. noisy]
“it from Qubit”
computer science and information theory
Quantum Machine Learning
- Quantum machine learning is an emerging interdisciplinary research area at the intersection of quantum physics and machine learning
- We now live in the era of big data . . .
- The time is ripe to initiate a long-term dialogue between the quantum computing and machine learning communities, with a view to fostering cross-fertilization of ideas
Refs:
- Machine learning in a quantum world (Aimeur et al., 2006)
- Quantum machine learning (Biamonte et al., Nature 2017)
Quantum Machine Learning
https://en.wikipedia.org/wiki/Quantum_machine_learning
Quantum Perceptron Models (Wiebe et al., 2016)
Algorithm: S = {(X_n, Y_n)}_{n=1}^N
w ← 0
while there exists (X_n, Y_n) with Y_n⟨w, X_n⟩ ≤ 0 do
  w ← w + Y_n X_n
end while
- Seek out training vectors that the current perceptron model misclassifies
- Use Grover's search
- Computational complexity: quadratic reduction, O(N) → O(√N)
Quantum Perceptron Models (Wiebe et al., 2016)
Version space
VS := {w | Y_n⟨w, X_n⟩ > 0}
https://en.wikipedia.org/wiki/Version_space_learning
https://tminka.github.io/papers/ep/minka-thesis.pdf
Quantum Perceptron Models (Wiebe et al., 2016)
Version space perceptron
- Generate K sample hyperplanes w₁, …, w_K from N(0, ·)
- Large enough K ⇒ at least one w_i lies in the version space and perfectly separates the data
- Use Grover's algorithm
- Number of samples K: O(1/γ) → O(1/√γ)
- Statistical efficiency: O(1/γ²) → O(1/√γ)
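The classical part of the version-space idea can be sketched as follows: sample K random hyperplanes and check which lie in the version space. The data, the identity covariance for the Gaussian, and the value of K are assumptions made for this illustration; the quantum speedup comes from searching the K candidates with Grover's algorithm rather than one by one:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linearly separable data: two Gaussian clusters, labels in {-1, +1}
X = np.vstack([rng.normal(size=(50, 2)) + np.array([3.0, 3.0]),
               rng.normal(size=(50, 2)) - np.array([3.0, 3.0])])
Y = np.concatenate([np.ones(50), -np.ones(50)])

def in_version_space(w):
    """w is in VS iff it classifies every training point correctly."""
    return np.all(Y * (X @ w) > 0)

# Sample K hyperplanes; for large enough K at least one separates the data
K = 200
samples = rng.normal(size=(K, 2))      # drawn from N(0, I) — assumed covariance
hits = [w for w in samples if in_version_space(w)]
assert len(hits) >= 1                  # some sampled w perfectly separates the data
```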
- Enhance machine learning: reduced computational complexity and improved generalization performance
  - Quantum SVM, fast kernel computation (Lloyd et al., 2013)
- Introduce new ideas and concepts to machine learning emerging from the field of quantum mechanics
  - Disentangling (NIPS workshop: Learning Disentangled Representations, https://sites.google.com/view/disentanglenips2017)
- Improve benchmarking and control of experimental quantum systems
  - Quantum tomography, estimation of density matrices (Koltchinskii, 2015)
- Fully quantum: less investigated
Quantum Machine Learning
There is lots of work to be done . . .
QARMA team: Machine Learning
CANA team: Natural Computing