SLIDE 1

Reinforcement Learning with Neural Networks for Quantum Multiple Hypothesis Testing

Sarah Brandsen1, Kevin D. Stubbs2, Henry D. Pfister2,3

1 Department of Physics, Duke University 2 Department of Mathematics, Duke University 3 Department of Electrical Engineering, Duke University

IEEE International Symposium on Information Theory June 21-26, 2020

(Duke University) Reinforcement Learning with Neural Networks for Quantum Multiple Hypothesis Testing 1 / 27

SLIDE 2

Outline

1. Overview of Multiple State Discrimination
2. Reinforcement Learning with Neural Networks (RLNN)
3. Comparing RLNN performance to known results
   - Binary pure state discrimination
   - RLNN performance as a function of subsystem number
   - Comparison to the "Pretty Good Measurement"
4. Performance of RLNN in more general cases
   - Trine ensemble
   - Comparison to semidefinite programming upper bounds
5. Open questions

SLIDE 3

Quantum State Discrimination

Given: ρ ∈ {ρ_j}_{j=1}^m with priors q = (q_1, ..., q_m), where q_j = Pr(ρ = ρ_j).

Objective: find a quantum measurement Π̂ = {Π_j}_{j=1}^m that maximizes

P_success = Σ_{j=1}^m q_j Tr[ρ_j Π_j]

[Figure: four candidate states ρ_1, ρ_2, ρ_3, ρ_4]
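This objective is straightforward to evaluate numerically. A minimal NumPy sketch (the states, priors, and POVM below are illustrative, not from the talk):

```python
import numpy as np

def success_probability(priors, states, povm):
    """P_success = sum_j q_j Tr[rho_j Pi_j] for a candidate POVM."""
    return sum(q * np.trace(rho @ P).real
               for q, rho, P in zip(priors, states, povm))

# Illustrative example: two pure qubit states, measured in the computational basis.
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)          # |0><0|
rho1 = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)  # |+><+|
povm = [np.diag([1.0, 0.0]).astype(complex),              # outcome 0 -> guess rho0
        np.diag([0.0, 1.0]).astype(complex)]              # outcome 1 -> guess rho1
p = success_probability([0.5, 0.5], [rho0, rho1], povm)   # 0.5*1 + 0.5*0.5 = 0.75
```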

SLIDE 4

Locally Adaptive Strategies

Locally adaptive protocols consist of measuring one subsystem at a time, then choosing the next subsystem and measurement based on previous results

[Figure: candidate states ρ_1, ..., ρ_4, each with subsystems ρ_j^(1), ρ_j^(2), ρ_j^(3), measured one subsystem at a time]

SLIDE 5

Motivation for Locally Adaptive Strategies

An analytic solution for the optimal collective measurement is generally not known when m ≥ 3. Approximately optimal solutions found via semidefinite programming [EMV03] may be experimentally impractical for large systems.

[Figure: same grid of candidate states ρ_j^(k) as the previous slide]

ρ_j = ⊗_{k=1}^n ρ_j^(k)

SLIDE 6

Reinforcement Learning

Main idea: the agent learns to maximize the expected future reward through repeated interactions with the environment.

[Figure: agent-environment loop with actions a_t ∈ A, states s_t ∈ S, and rewards r_t]

SLIDE 7

Advantage function

Agent’s policy: draw a random action a given state s according to π_θ(a|s) = Pr[A = a | S = s].

Advantage function: compares the expected reward of choosing action a given state s to the average expected reward for being in state s under policy π:

A^π(s_t, a_t) = Σ_{ℓ=t}^N γ^{ℓ−t} ( E_{π_θ}[r(s_ℓ, a_ℓ) | s_t, a_t] − E_{π_θ}[r(s_ℓ, a_ℓ) | s_t] )
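In practice the advantage is usually estimated from sampled trajectories as the discounted return minus a baseline V(s). A minimal Monte Carlo sketch (the rewards and baseline values below are made-up placeholders):

```python
def advantage_estimates(rewards, values, gamma=0.99):
    """A_t ~= (sum_{l>=t} gamma^(l-t) r_l) - V(s_t), for each timestep t."""
    returns, g = [], 0.0
    for r in reversed(rewards):       # accumulate discounted return backward
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return [g - v for g, v in zip(returns, values)]

# Illustrative trajectory: two penalties, then a terminal reward (gamma = 1 for clarity).
adv = advantage_estimates([-0.5, -0.5, 1.0], [0.0, 0.2, 0.5], gamma=1.0)
# returns = [0.0, 0.5, 1.0]  ->  adv = [0.0, 0.3, 0.5]
```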

SLIDE 8

Neural Networks for Function Approximation

Setup: we use a fully connected neural network where the input layer feeds into two parallel sets of sub-networks.

[Figure: input layer followed by two hidden layers, feeding a policy head π*(a_1|s), ..., π*(a_|A||s) and a value head V(s)]
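A forward pass through such a two-headed architecture can be sketched in plain NumPy. The layer sizes, activation, and random weights here are placeholders standing in for trained parameters, not the talk's actual hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_heads(s, dim_in=6, hidden=32, n_actions=4):
    """Input feeds two parallel sub-networks: a policy head pi(a|s) and a value head V(s)."""
    def layer(x, w, b):
        return np.tanh(x @ w + b)
    # Policy sub-network: two hidden layers, softmax over actions.
    h = layer(s, rng.standard_normal((dim_in, hidden)), np.zeros(hidden))
    h = layer(h, rng.standard_normal((hidden, hidden)), np.zeros(hidden))
    logits = h @ rng.standard_normal((hidden, n_actions))
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()
    # Value sub-network: two hidden layers, scalar output.
    g = layer(s, rng.standard_normal((dim_in, hidden)), np.zeros(hidden))
    g = layer(g, rng.standard_normal((hidden, hidden)), np.zeros(hidden))
    v = float(g @ rng.standard_normal(hidden))
    return pi, v

pi, v = mlp_heads(np.zeros(6))  # zero input -> uniform policy, zero value
```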

SLIDE 9

Set of allowed quantum measurements

Binary Projective Measurement Set – taken to be {Π̂(ℓ)}_{ℓ=0}^{Q−1}, where each Π̂(ℓ) = {Π₊(ℓ), Π₋(ℓ)} consists of the complementary projectors (writing x = ℓ/Q)

Π₊(ℓ) = [[ x², x√(1 − x²) ], [ x√(1 − x²), 1 − x² ]]
Π₋(ℓ) = [[ 1 − x², −x√(1 − x²) ], [ −x√(1 − x²), x² ]]

and ℓ ∈ {0, 1, ..., Q − 1}. Q = 20 in our experiments.

[Figure: neighboring measurement directions Π(ℓ − 1), Π(ℓ), Π(ℓ + 1)]
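Read this way, each Π̂(ℓ) is a pair of rank-1 projectors onto the real unit vector (x, √(1 − x²)) and its orthogonal complement. The matrix form above is a reconstruction from a garbled slide, so treat this NumPy sketch as illustrative:

```python
import numpy as np

def measurement_pair(l, Q=20):
    """Projective pair Pi_hat(l) = {Pi_plus, Pi_minus} for l in {0, ..., Q-1}."""
    x = l / Q
    y = np.sqrt(1 - x**2)
    v = np.array([x, y])    # Pi_plus projects onto (x, sqrt(1 - x^2))
    w = np.array([y, -x])   # Pi_minus projects onto the orthogonal vector
    return np.outer(v, v), np.outer(w, w)

# Sanity checks: each element is a projector and the pair resolves the identity.
for l in range(20):
    P, Pc = measurement_pair(l)
    assert np.allclose(P @ P, P)
    assert np.allclose(P + Pc, np.eye(2))
```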

SLIDE 10

Applying RLNN to Multiple State Discrimination

Initialize: randomly generate ρ ∈ {ρ_j}_{j=1}^m according to q = (q_1, ..., q_m), and initialize s = (s_1, ..., s_n) to the all-zeros vector, e.g. s = [0, 0, 0] for n = 3.

SLIDE 11

Applying RLNN to Multiple State Discrimination (cont)

Step: the agent chooses an action of the form (j, Π̂). The action is implemented and an outcome is sampled according to Tr[Π_out ρ]. The prior is updated via Bayes’ theorem and s_j is set to 1. For example, measuring subsystem j = 2 gives s → [0, 1, 0].
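The measurement-and-update step can be sketched as follows: sample an outcome with probability Tr[Π_out ρ], then update each hypothesis weight by its likelihood, q_j → q_j Tr[Π_out ρ_j] / Pr(out). The states and POVM below are illustrative:

```python
import numpy as np

def measure_and_update(priors, subsystem_states, povm, true_index, rng):
    """Sample an outcome from Tr[Pi_out rho] on one subsystem, then apply Bayes' rule."""
    probs = np.array([np.trace(P @ subsystem_states[true_index]).real for P in povm])
    out = rng.choice(len(povm), p=probs / probs.sum())
    likelihood = np.array([np.trace(povm[out] @ rho).real for rho in subsystem_states])
    posterior = priors * likelihood
    return out, posterior / posterior.sum()

rng = np.random.default_rng(1)
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho1 = np.array([[0, 0], [0, 1]], dtype=complex)
povm = [rho0.copy(), rho1.copy()]  # computational-basis measurement
out, post = measure_and_update(np.array([0.5, 0.5]), [rho0, rho1], povm, 0, rng)
# With orthogonal states, one measurement collapses the posterior onto the truth.
```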

SLIDE 12

Reward scheme

If subsystem j has already been measured in a previous round, return a penalty of r = −0.5.

When s_j = 1 for all j, return a reward of 1 if ρ_guess = ρ and 0 otherwise.¹

¹Results are generated using the default PPO algorithm from Ray version 0.7.6.
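This reward scheme is simple enough to write down directly; a minimal sketch (the function signature is my own, not from the talk's code):

```python
def reward(s, j, done, guess_correct):
    """Reward scheme from the slide: -0.5 for re-measuring subsystem j,
    terminal reward 1 for a correct guess once every subsystem is measured."""
    if s[j] == 1:              # subsystem j was measured in a previous round
        return -0.5
    if done:                   # all s_j == 1 after this step: episode ends with a guess
        return 1.0 if guess_correct else 0.0
    return 0.0                 # intermediate step, no reward

assert reward([0, 1, 0], 1, False, False) == -0.5
```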

SLIDE 13

Binary Pure State Discrimination

Setup: in the special case where m = 2, the state set is {ρ₊, ρ₋} with prior q = Pr(ρ = ρ₊).

Optimal solution: the Helstrom measurement Π_h = {Π₊, Π₋} is optimal, where Π_± are the projectors onto the positive/negative eigenspace of M = qρ₊ − (1 − q)ρ₋.

In the special case where ρ_± are both tensor products of pure subsystems, an adaptive greedy protocol is fully optimal.
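The Helstrom measurement follows directly from the eigendecomposition of M. A NumPy sketch (the example states are illustrative):

```python
import numpy as np

def helstrom(q, rho_plus, rho_minus):
    """Projectors onto the positive/negative eigenspaces of M = q rho_+ - (1-q) rho_-,
    and the resulting success probability."""
    M = q * rho_plus - (1 - q) * rho_minus
    vals, vecs = np.linalg.eigh(M)
    pos = vecs[:, vals > 0]
    P_plus = pos @ pos.conj().T
    P_minus = np.eye(len(M)) - P_plus
    p_succ = (q * np.trace(rho_plus @ P_plus) +
              (1 - q) * np.trace(rho_minus @ P_minus)).real
    return P_plus, P_minus, p_succ

# Equiprobable |0> vs |+>: success probability (1 + 1/sqrt(2))/2 ~ 0.854
rho_p = np.array([[1, 0], [0, 0]], dtype=complex)
rho_m = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
P_plus, P_minus, p = helstrom(0.5, rho_p, rho_m)
```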

SLIDE 14

RLNN Performance in the Binary Case

Setup: for each trial, we randomly select pure tensor-product quantum states with m = 2, n = 3. Results for the optimal RLNN policy are plotted after 1000 training iterations.

[Figure: P_succ for the Helstrom measurement and RLNN across 8 random trials]

SLIDE 15

RLNN Performance as Function of Training Iterations

[Figure: P_succ,Helstrom − P_succ,RLNN as a function of training iteration (200 to 1,000)]

SLIDE 16

Special known case

Given a base set {ρ0, ρ1}, consider:

S(1) = {ρ0, ρ1}
S(2) = {ρ0 ⊗ ρ0, ρ0 ⊗ ρ1, ρ1 ⊗ ρ0, ρ1 ⊗ ρ1}
S(3) = {ρ0 ⊗ ρ0 ⊗ ρ0, ρ0 ⊗ ρ0 ⊗ ρ1, ρ0 ⊗ ρ1 ⊗ ρ0, ρ0 ⊗ ρ1 ⊗ ρ1, ρ1 ⊗ ρ0 ⊗ ρ0, ρ1 ⊗ ρ0 ⊗ ρ1, ρ1 ⊗ ρ1 ⊗ ρ0, ρ1 ⊗ ρ1 ⊗ ρ1}
...

and for each state set, assume each candidate state is equally probable.
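The state set S(n) contains all 2^n tensor products of the base states, which is easy to enumerate with Kronecker products. A sketch (the base states are illustrative):

```python
import numpy as np
from itertools import product

def state_set(base, n):
    """S(n): all n-fold tensor products of states drawn from the base set."""
    out = []
    for bits in product(range(len(base)), repeat=n):
        rho = np.array([[1.0]])
        for b in bits:
            rho = np.kron(rho, base[b])   # build the tensor product factor by factor
        out.append(rho)
    return out

rho0 = np.array([[1, 0], [0, 0]], dtype=float)
rho1 = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=float)
S3 = state_set([rho0, rho1], 3)
# |S(3)| = 8 candidates of dimension 8x8, each assigned uniform prior 1/8
```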

SLIDE 17

Special known case

Results: the RLNN performance starts to show a significant gap from the optimal success probability when n ≥ 5.

[Figure: P_succ versus n (1 to 6) for the optimal measurement and the neural network]

SLIDE 18

The “Pretty Good Measurement” (PGM)

The “Pretty Good Measurement” defines the POVM

Π_PGM,k = (Σ_j q_j ρ_j)^{−1/2} q_k ρ_k (Σ_j q_j ρ_j)^{−1/2}   ∀k ∈ {1, ..., m}

Motivation: the PGM is known to be optimal in several cases:
- Symmetric pure states with uniform prior, where ρ_j = |ψ_j⟩⟨ψ_j| and |ψ_j⟩ = U^{j−1}|ψ_1⟩ with U^m = I
- Linearly independent pure states where the diagonal elements of the square root of the Gram matrix are all equal
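The PGM is straightforward to construct numerically: form S = Σ_j q_j ρ_j, take its inverse square root on its support, and conjugate each weighted state. A sketch with illustrative inputs:

```python
import numpy as np

def pgm(priors, states):
    """Pretty Good Measurement: Pi_k = S^{-1/2} q_k rho_k S^{-1/2}, S = sum_j q_j rho_j.
    The inverse square root is taken on the support of S (zero eigenvalues are dropped)."""
    S = sum(q * rho for q, rho in zip(priors, states))
    vals, vecs = np.linalg.eigh(S)
    inv_sqrt = np.array([1 / np.sqrt(v) if v > 1e-12 else 0.0 for v in vals])
    S_inv_half = vecs @ np.diag(inv_sqrt) @ vecs.conj().T
    return [S_inv_half @ (q * rho) @ S_inv_half for q, rho in zip(priors, states)]

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho1 = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
povm = pgm([0.5, 0.5], [rho0, rho1])
assert np.allclose(sum(povm), np.eye(2))  # POVM completeness on the support
```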

SLIDE 19

RLNN vs Pretty Good Measurement

Setup: we generate 10 trials of candidate states with n = 3, m = 5 and plot the difference in RLNN and PGM success probability.

[Figure: P_succ,NN − P_succ,PGM per trial]

SLIDE 20

Trine Ensemble Candidate States

The trine ensemble consists of three equally spaced real qubit states, namely

{ R(4π/3)^j |0⟩⟨0| (R(4π/3)†)^j }_{j=0}^{2}

[Figure: candidate states ρ(0) and ρ(1), each with subsystems indexed 1 and 2]
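The trine states can be constructed explicitly. Assuming R(θ) denotes the standard qubit rotation exp(−iθY/2), which acts on real amplitudes as a plane rotation by θ/2, equal spacing means every pair of distinct trine states has overlap of magnitude 1/2:

```python
import numpy as np

def R(theta):
    """Real qubit rotation exp(-i theta Y / 2): a plane rotation by theta/2."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

ket0 = np.array([1.0, 0.0])
trine = [np.linalg.matrix_power(R(4 * np.pi / 3), j) @ ket0 for j in range(3)]
# Equally spaced: |<psi_i|psi_j>| = 1/2 for every pair of distinct trine states.
```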

SLIDE 21

Conjectured optimal local method

Step 1: an “anti-trine” measurement {Π0, Π1, Π2} is implemented on subsystem 1.

[Figure: anti-trine POVM elements Π0, Π1, Π2 alongside candidate state ρ(0) and its subsystems]

SLIDE 22

Conjectured optimal local method

Step 2: the Helstrom measurement for the two remaining candidate states is implemented on subsystem 2 (shown here for outcome out = Π2).

[Figure: remaining candidate state ρ(1) with measurement elements Π0, Π1 on subsystem 2]

The success probability of this method is P_succ ≈ 0.933, whereas the success probability of a locally greedy method is P_succ,lg = 0.8.
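The anti-trine measurement of Step 1 can be sketched as a POVM whose element j is proportional to the projector orthogonal to trine state j, so outcome j rules out hypothesis j. This is a standard construction (the weight 2/3 makes the elements sum to the identity), sketched here under the same rotation convention as before:

```python
import numpy as np

def R(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

trine = [np.linalg.matrix_power(R(4 * np.pi / 3), j) @ np.array([1.0, 0.0])
         for j in range(3)]

# Anti-trine POVM: element j projects onto the state orthogonal to trine state j,
# so Tr[Pi_j rho_j] = 0 and outcome j eliminates hypothesis j.
perp = [np.array([-v[1], v[0]]) for v in trine]
anti_trine = [(2 / 3) * np.outer(w, w) for w in perp]

assert np.allclose(sum(anti_trine), np.eye(2))   # valid POVM
```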

SLIDE 23

RLNN Results for Trine Ensemble

The training curve for RLNN performance (with modified action space) indicates convergence to the conjectured optimal local success probability.

[Figure: P_succ versus training iteration (50 to 250) for RLNN, with reference lines for the conjectured optimal local and the optimal collective success probabilities]

SLIDE 24

General Results for Binary State Discrimination

Setup: we set m = 2 and n = 3, and randomly generate 7 trials with (depolarized) candidate state sets.

[Figure: P_succ per trial (1 to 7) for SDP and RLNN]

The optimal collective success probability is computed via semidefinite programming (SDP).

SLIDE 25

General Results for 3-ary State Discrimination

Setup: we set m = 3 and n = 3, and randomly generate 10 trials with (depolarized) candidate state sets.

[Figure: P_succ per trial (1 to 10) for SDP and RLNN]

SLIDE 26

Open Questions

What is the “worst-case” gap between the optimal locally adaptive protocol and optimal collective measurements, as a function of m and n?

How do these methods perform on entangled states such as graph states?

Can we use a multiscale approach for problems with large n?
SLIDE 27

Thank you!

Github Repository: https://github.com/SarahBrandsen/RLNN-QSD Related Work: arXiv:1912.05087

SLIDE 28

References I

A. Acín, E. Bagan, M. Baig, Ll. Masanes, R. Muñoz Tapia. Multiple-copy two-state discrimination with individual measurements. Phys. Rev. A, 71:032338, 2005.

Antonio Assalini, Nicola Dalla Pozza, Gianfranco Pierobon. Revisiting the Dolinar receiver through multiple-copy state discrimination theory. Phys. Rev. A, 84:022342, Aug 2011.

M. Ban. Optimum measurements for discrimination among symmetric quantum states and parameter estimation. International Journal of Theoretical Physics, 36(6):1269–1288, 1997.

Stephen M. Barnett. Minimum-error discrimination between multiply symmetric states. Phys. Rev. A, 64:030303, Aug 2001.
SLIDE 29

References II

Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, Wojciech Zaremba. OpenAI Gym, 2016.

Richard Bellman. The theory of dynamic programming. Bull. Amer. Math. Soc., 60(6):503–515, Nov 1954.

V. P. Belavkin. Optimum distinction of non-orthogonal quantum signals. Radio Engineering and Electronic Physics, 20:39–47, June 1975.

Sarah Brandsen, Mengke Lian, Kevin D. Stubbs, Narayanan Rengaswamy, Henry D. Pfister. Adaptive procedures for discrimination between arbitrary tensor-product quantum states, 2019.
SLIDE 30

References III

Sarah Brandsen, Kevin D. Stubbs, Henry D. Pfister. Reinforcement learning with neural networks for quantum multiple hypothesis testing. placeholder for arXiv version to be posted, 2020.

Y. C. Eldar, G. D. Forney. On quantum detection and the square-root measurement. IEEE Transactions on Information Theory, 47(3):858–872, March 2001.

Y. C. Eldar. A semidefinite programming approach to optimal unambiguous discrimination of quantum states. IEEE Transactions on Information Theory, 49(2):446–456, Feb 2003.
SLIDE 31

References IV

Y. C. Eldar, A. Megretski, G. C. Verghese. Designing optimal quantum detectors via semidefinite programming. IEEE Transactions on Information Theory, 49(4):1007–1012, Apr 2003.

Y. C. Eldar, A. Megretski, G. C. Verghese. Optimal detection of symmetric mixed quantum states. IEEE Transactions on Information Theory, 50(6), June 2004.

A. Ferdinand, M. DiMario, F. Becerra. Multi-state discrimination below the quantum noise limit at the single-photon level. npj Quantum Information, 3, Dec 2017.

Thomas Fösel, Petru Tighineanu, Talitha Weiss, Florian Marquardt. Reinforcement learning with neural networks for quantum feedback. Phys. Rev. X, 8:031084, Sep 2018.
SLIDE 32

References V

Geoffrey J. Gordon. Stable fitted reinforcement learning. Proceedings of the 8th International Conference on Neural Information Processing Systems, NIPS’95, pages 1052–1058, Cambridge, MA, USA, 1995. MIT Press.

Carl W. Helstrom. Quantum detection and estimation theory. Journal of Statistical Physics, 1(2):231–252, 1969.

Paul Hausladen, William K. Wootters. A ‘pretty good’ measurement for distinguishing quantum states. Journal of Modern Optics, 41(12):2385–2390, 1994.

Hari Krovi, Saikat Guha, Zachary Dutton, Marcus P. da Silva. Optimal measurements for symmetric quantum states with applications to optical communication. Physical Review A, 92(6), Dec 2015.
SLIDE 33

References VI

Alexander Holm Kiilerich, Klaus Mølmer. Multistate and multihypothesis discrimination with open quantum systems. Physical Review A, 97(5), May 2018.

Robert Koenig, Renato Renner, Christian Schaffner. The operational meaning of min- and max-entropy. IEEE Transactions on Information Theory, 55(9):4337–4347, Sep 2009.

Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica. RLlib: Abstractions for distributed reinforcement learning, 2017.

Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, Ion Stoica. Tune: A research platform for distributed model selection and training. arXiv preprint arXiv:1807.05118, 2018.
SLIDE 34

References VII

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller. Playing Atari with deep reinforcement learning, 2013.

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei Rusu, Joel Veness, Marc Bellemare, Alex Graves, Martin Riedmiller, Andreas Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis. Human-level control through deep reinforcement learning. Nature, 518:529–533, Feb 2015.

Carlos Mochon. Family of generalized “pretty good” measurements and the minimal-error pure-state discrimination problems for which they are optimal. Phys. Rev. A, 73:032328, Mar 2006.
SLIDE 35

References VIII

Narayanan Rengaswamy, Henry D. Pfister. Quantum advantage in classical communications via belief-propagation with quantum messages. 2020.

Masahide Sasaki, Kentaro Kato, Masayuki Izutsu, Osamu Hirota. Quantum channels showing superadditivity in classical capacity. Phys. Rev. A, 58:146–158, Jul 1998.

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov. Proximal policy optimization algorithms, 2017.

Gerald Tesauro. Practical issues in temporal difference learning. Mach. Learn., 8(3–4):257–277, May 1992.

Graeme Weir, Catherine Hughes, Stephen M. Barnett, Sarah Croke. Optimal measurement strategies for the trine states with arbitrary prior probabilities, 2018.