Reinforcement Learning with Neural Networks for Quantum Multiple Hypothesis Testing


  1. Reinforcement Learning with Neural Networks for Quantum Multiple Hypothesis Testing. Sarah Brandsen (1), Kevin D. Stubbs (2), Henry D. Pfister (2,3). (1) Department of Physics, Duke University; (2) Department of Mathematics, Duke University; (3) Department of Electrical Engineering, Duke University. IEEE International Symposium on Information Theory, June 21-26, 2020.

  2. Outline
1. Overview of multiple state discrimination
2. Reinforcement learning with neural networks (RLNN)
3. Comparing RLNN performance to known results: binary pure state discrimination; RLNN performance as a function of subsystem number; comparison to the "Pretty Good Measurement"
4. Performance of RLNN in more general cases: trine ensemble; comparison to semidefinite programming upper bounds
5. Open questions

  3. Quantum State Discrimination
Given: ρ ∈ {ρ_j}_{j=1}^m with priors q⃗ = (q_1, ..., q_m), where q_j = Pr(ρ = ρ_j).
Objective: find a quantum measurement Π̂ = {Π_j}_{j=1}^m that maximizes
    P_success = Σ_{j=1}^m q_j Tr[ρ_j Π_j].
(Figure: four candidate states ρ_1, ..., ρ_4.)
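As a concrete check of this objective, here is a minimal numpy sketch (function name and toy states are illustrative, not from the talk) that evaluates P_success for a given POVM:

```python
import numpy as np

# P_success = sum_j q_j Tr[rho_j Pi_j] for candidates {rho_j},
# measurement {Pi_j}, and priors {q_j}.
def success_probability(rhos, povm, priors):
    return sum(q * np.trace(rho @ pi).real
               for q, rho, pi in zip(priors, rhos, povm))

# Toy example: orthogonal states |0><0| and |1><1| are perfectly
# distinguished by the computational-basis measurement.
rhos = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
print(success_probability(rhos, povm, [0.5, 0.5]))  # 1.0
```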

  4. Locally Adaptive Strategies
Locally adaptive protocols measure one subsystem at a time, then choose the next subsystem and measurement based on previous results.
(Figure: candidate states ρ_j^(k), measured subsystem by subsystem.)

  5. Motivation for Locally Adaptive Strategies
An analytic solution for the optimal collective measurement is generally not known when m ≥ 3.
Approximately optimal solutions found via semidefinite programming [EMV03] may be experimentally impractical for large systems.
Each candidate state is a tensor product over subsystems: ρ_j = ⊗_{k=1}^n ρ_j^(k).

  6. Reinforcement Learning
Main idea: an agent learns to maximize the expected future reward through repeated interactions with the environment.
(Figure: agent-environment loop with action a_t ∈ A, state s_t ∈ S, and reward r_t.)

  7. Advantage Function
Agent's policy: draw a random action a given state s according to π_θ(a|s) = Pr[A = a | S = s].
Advantage function: compares the expected reward of choosing action a in state s to the average expected reward of being in state s under policy π:
    A_π(s_t, a_t) = Σ_{ℓ=t}^N γ^(ℓ−t) ( E_{π_θ}[ r(s_ℓ, a_ℓ) | s_t, a_t ] − E_{π_θ}[ r(s_ℓ, a_ℓ) | s_t ] ).
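Since the expectation conditioned only on s_t is the state value V(s_t), a single-trajectory estimate of the advantage is the discounted return minus a value baseline. A minimal sketch (the trajectory and baseline values are made up):

```python
# Empirical advantage estimate: discounted return from (s_t, a_t) minus
# the learned baseline V(s_t). Rewards below are a made-up trajectory.
def advantage(rewards, value_baseline, gamma=0.99):
    ret = sum(gamma ** i * r for i, r in enumerate(rewards))
    return ret - value_baseline

# Terminal reward of 1 three steps out, baseline 0.9, no discounting:
print(round(advantage([0.0, 0.0, 1.0], value_baseline=0.9, gamma=1.0), 2))  # 0.1
```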

  8. Neural Networks for Function Approximation
Setup: we use a fully connected neural network whose input layer feeds into two parallel sub-networks, each with two hidden layers: a policy head outputting π*(a_1|s), ..., π*(a_|A| | s) and a value head outputting V(s).

  9. Set of Allowed Quantum Measurements
Binary projective measurement set: taken to be {Π̂(ℓ)}_{ℓ=0}^{Q−1}, where each Π̂(ℓ) is a pair of orthogonal projectors

    Π₊(ℓ) = [ ℓ/Q                  √(ℓ/Q)·√(1 − ℓ/Q) ]
            [ √(ℓ/Q)·√(1 − ℓ/Q)   1 − ℓ/Q            ]

    Π₋(ℓ) = [ 1 − ℓ/Q              −√(ℓ/Q)·√(1 − ℓ/Q) ]
            [ −√(ℓ/Q)·√(1 − ℓ/Q)   ℓ/Q                ]

and ℓ ∈ {0, 1, ..., Q − 1}. We take Q = 20 in our experiments.
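A sketch of this measurement set, assuming each Π̂(ℓ) is the pair of projectors onto |v_ℓ⟩ = √(ℓ/Q)|0⟩ + √(1 − ℓ/Q)|1⟩ and its orthogonal complement (function name illustrative):

```python
import numpy as np

Q = 20  # grid resolution used in the experiments

def measurement(ell, Q=Q):
    """Binary projective measurement: projector onto
    v = (sqrt(ell/Q), sqrt(1 - ell/Q)) and its complement."""
    v = np.array([np.sqrt(ell / Q), np.sqrt(1 - ell / Q)])
    P = np.outer(v, v)
    return P, np.eye(2) - P

# Each element is a valid two-outcome projective measurement.
for ell in range(Q):
    P0, P1 = measurement(ell)
    assert np.allclose(P0 @ P0, P0)          # idempotent
    assert np.allclose(P0 + P1, np.eye(2))   # complete
```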

  10. Applying RLNN to Multiple State Discrimination
Initialize: randomly generate ρ ∈ {ρ_j}_{j=1}^m according to q⃗ = (q_1, ..., q_m).
Initialize s⃗ = (s_1, ..., s_n) to the all-zeros vector, e.g. s⃗ = [0, 0, 0] for n = 3.

  11. Applying RLNN to Multiple State Discrimination (cont.)
Step: the agent chooses an action of the form (j, Π̂), i.e. a subsystem and a measurement.
Implement the action and sample an outcome according to Tr[Π_out ρ].
Update the prior via Bayes' theorem.
Set s_j → 1; e.g. after measuring subsystem j = 2, s⃗ → [0, 1, 0].
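The Bayes update in this step can be sketched as follows (function and example are illustrative; the candidates are assumed to be product states, so only the measured subsystem's factor enters the likelihood):

```python
import numpy as np

def bayes_update(priors, subsystem_states, pi_out):
    """Posterior over candidates after observing POVM element pi_out on one
    subsystem; subsystem_states[i] is candidate i's factor on that subsystem."""
    likelihoods = np.array([np.trace(pi_out @ rho).real
                            for rho in subsystem_states])
    unnorm = np.asarray(priors) * likelihoods
    return unnorm / unnorm.sum()

# Observing |0><0| where candidate 0 is |0><0| and candidate 1 is |+><+|
# shifts the posterior toward candidate 0:
plus = 0.5 * np.ones((2, 2))
post = bayes_update([0.5, 0.5], [np.diag([1.0, 0.0]), plus], np.diag([1.0, 0.0]))
print(post)  # approximately [2/3, 1/3]
```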

  12. Reward Scheme
If subsystem j has already been measured in a previous round, return a penalty of r = −0.5.
When s_j = 1 for all j, return a reward of 1 if ρ_guess = ρ and 0 otherwise.
(Results are generated using the default PPO algorithm from Ray version 0.7.6.)
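The reward scheme can be sketched as a small function (argument names are illustrative):

```python
def reward(re_measured, all_measured, guess_correct):
    """Reward scheme: -0.5 penalty for measuring a subsystem twice;
    terminal reward 1 for a correct guess once every subsystem has been
    measured, 0 otherwise."""
    if re_measured:
        return -0.5
    if all_measured:
        return 1.0 if guess_correct else 0.0
    return 0.0
```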

  13. Binary Pure State Discrimination
Setup: in the special case m = 2, the state set is {ρ₊, ρ₋} with prior q = Pr(ρ = ρ₊).
Optimal solution: the Helstrom measurement Π_h = {Π₊, Π₋} is optimal, where Π₊ and Π₋ are the projectors onto the positive and negative eigenspaces of M ≜ q ρ₊ − (1 − q) ρ₋.
In the special case where ρ₊ and ρ₋ are both tensor products of pure subsystem states, an adaptive greedy protocol is fully optimal.
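A numpy sketch of the Helstrom measurement via the eigendecomposition of M (helper name illustrative), checked against the closed-form optimal success probability for two pure states with equal priors:

```python
import numpy as np

def helstrom_measurement(rho_plus, rho_minus, q):
    """Pi_plus / Pi_minus project onto the positive / negative
    eigenspaces of M = q*rho_plus - (1 - q)*rho_minus."""
    M = q * rho_plus - (1 - q) * rho_minus
    w, V = np.linalg.eigh(M)
    pi_plus = V[:, w > 0] @ V[:, w > 0].conj().T
    return pi_plus, np.eye(len(M)) - pi_plus

# Two pure states with overlap cos(pi/4), equal priors; the optimal
# success probability is (1 + sqrt(1 - |<a|b>|^2)) / 2 = (1 + sqrt(1/2)) / 2.
a = np.array([1.0, 0.0])
b = np.array([np.cos(np.pi / 4), np.sin(np.pi / 4)])
pi_p, pi_m = helstrom_measurement(np.outer(a, a), np.outer(b, b), 0.5)
p = 0.5 * (np.trace(pi_p @ np.outer(a, a)) + np.trace(pi_m @ np.outer(b, b))).real
print(round(p, 4))  # 0.8536
```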

  14. RLNN Performance in the Binary Case
Setup: for each trial, we randomly select pure tensor-product quantum states with m = 2, n = 3. Results for the optimal RLNN policy are plotted after 1000 training iterations.
(Plot: P_succ vs. trial for the Helstrom measurement and RLNN.)

  15. RLNN Performance as a Function of Training Iterations
(Plot: P_succ,Helstrom − P_succ,RLNN over 1000 training iterations.)

  16. Special Known Case
Given a base set {ρ_0, ρ_1}, consider:
    S(1) ≜ {ρ_0, ρ_1}
    S(2) ≜ {ρ_0 ⊗ ρ_0, ρ_0 ⊗ ρ_1, ρ_1 ⊗ ρ_0, ρ_1 ⊗ ρ_1}
    S(3) ≜ {ρ_0 ⊗ ρ_0 ⊗ ρ_0, ρ_0 ⊗ ρ_0 ⊗ ρ_1, ρ_0 ⊗ ρ_1 ⊗ ρ_0, ρ_0 ⊗ ρ_1 ⊗ ρ_1, ρ_1 ⊗ ρ_0 ⊗ ρ_0, ρ_1 ⊗ ρ_0 ⊗ ρ_1, ρ_1 ⊗ ρ_1 ⊗ ρ_0, ρ_1 ⊗ ρ_1 ⊗ ρ_1}
and so on; for each state set, assume each candidate state is equally probable.
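The state sets S(n) can be generated mechanically with Kronecker products; a sketch (function name illustrative):

```python
import numpy as np
from itertools import product

def state_set(rho0, rho1, n):
    """All 2^n n-fold tensor products of the base pair {rho0, rho1}."""
    states = []
    for bits in product([0, 1], repeat=n):
        rho = np.array([[1.0]])
        for b in bits:
            rho = np.kron(rho, rho1 if b else rho0)
        states.append(rho)
    return states

S3 = state_set(np.diag([1.0, 0.0]), np.diag([0.0, 1.0]), 3)
print(len(S3), S3[0].shape)  # 8 (8, 8)
```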

  17. Special Known Case (Results)
Results: RLNN performance starts to show a significant gap from the optimal success probability when n ≥ 5.
(Plot: P_succ vs. n for the optimal measurement and the NN, n = 1, ..., 6.)

  18. The "Pretty Good Measurement" (PGM)
The "Pretty Good Measurement" defines the POVM
    Π_PGM,k ≜ (Σ_j q_j ρ_j)^(−1/2) q_k ρ_k (Σ_j q_j ρ_j)^(−1/2)  ∀ k ∈ {1, ..., m}.
Motivation: the PGM is known to be optimal in several cases:
- Symmetric pure states with uniform prior, where ρ_j = |ψ_j⟩⟨ψ_j| and |ψ_j⟩ = U^(j−1) |ψ_1⟩ with U^m = I.
- Linearly independent pure states where the diagonal elements of the square root of the Gram matrix are all equal.
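A numpy sketch of the PGM construction (helper name illustrative; the inverse square root is taken on the support of the average state):

```python
import numpy as np

def pgm(rhos, priors):
    """Pi_k = S^(-1/2) q_k rho_k S^(-1/2), with S = sum_j q_j rho_j; the
    inverse square root is restricted to the support of S."""
    S = sum(q * r for q, r in zip(priors, rhos))
    w, V = np.linalg.eigh(S)
    inv_sqrt = sum(np.outer(V[:, i], V[:, i].conj()) / np.sqrt(w[i])
                   for i in range(len(w)) if w[i] > 1e-12)
    return [inv_sqrt @ (q * r) @ inv_sqrt for q, r in zip(priors, rhos)]

# The elements form a valid POVM: they sum to the identity (on supp S).
rhos = [np.diag([1.0, 0.0]), 0.5 * np.ones((2, 2))]
povm = pgm(rhos, [0.5, 0.5])
assert np.allclose(sum(povm), np.eye(2))
```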

  19. RLNN vs. Pretty Good Measurement
Setup: we generate 10 trials of candidate states with n = 3, m = 5 and plot the difference in RLNN and PGM success probability.
(Histogram: counts of P_succ,NN − P_succ,PGM over the 10 trials, ranging from about −0.05 to 0.1.)

  20. Trine Ensemble Candidate States
The trine ensemble consists of three equally spaced real qubit states, namely
    { R(4π/3)^j |0⟩⟨0| (R(4π/3)†)^j }_{j=0}^{2}.
(Figure: the trine states ρ_0^(k), ρ_1^(k), ρ_2^(k) on subsystems k = 0, 1.)
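A sketch constructing the trine states, assuming R(θ) is the real rotation by θ, so that all pairwise overlaps have magnitude 1/2:

```python
import numpy as np

theta = 4 * np.pi / 3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# psi_j = R^j |0>, j = 0, 1, 2: three equally spaced real qubit states.
trine = [np.linalg.matrix_power(R, j) @ np.array([1.0, 0.0]) for j in range(3)]

# Pairwise overlaps all have magnitude 1/2.
for i in range(3):
    for j in range(i + 1, 3):
        assert np.isclose(abs(trine[i] @ trine[j]), 0.5)
```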

  21. Conjectured Optimal Local Method
Step 1: an "anti-trine" measurement {Π_0, Π_1, Π_2} is implemented on subsystem 1.
(Figure: trine states ρ_0^(0), ρ_1^(0), ρ_2^(0) with the anti-trine POVM elements Π_0, Π_1, Π_2.)

  22. Conjectured Optimal Local Method (cont.)
Step 2: the Helstrom measurement for the remaining two candidate states is implemented on subsystem 2.
(Figure: remaining candidates ρ_0^(1), ρ_1^(1) after outcome Π_out = Π_2.)
The success probability of this method is P_succ ≈ 0.933, whereas the success probability of a locally greedy method is P_succ,lg = 0.8.
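The quoted value P_succ ≈ 0.933 can be reproduced by enumerating the protocol's outcomes exactly; a numpy sketch (the state and POVM constructions are assumptions consistent with the trine geometry, where each anti-trine element rules out one candidate):

```python
import numpy as np

# Two copies of a trine state, uniform prior: anti-trine measurement on
# copy 1 rules out one candidate, then Helstrom on copy 2 discriminates
# the remaining two.
psi = [np.array([np.cos(j * 2 * np.pi / 3), np.sin(j * 2 * np.pi / 3)])
       for j in range(3)]
perp = [np.array([-v[1], v[0]]) for v in psi]     # state orthogonal to psi_j
E = [(2 / 3) * np.outer(v, v) for v in perp]      # anti-trine POVM
assert np.allclose(sum(E), np.eye(2))             # completeness

def helstrom_povm(rho_a, rho_b):
    """Projectors onto the positive / negative eigenspaces of
    rho_a - rho_b (equal priors)."""
    w, V = np.linalg.eigh(rho_a - rho_b)
    pi_a = V[:, w > 0] @ V[:, w > 0].T
    return pi_a, np.eye(2) - pi_a

p_succ = 0.0
for t in range(3):                                # true state, prior 1/3
    rho_t = np.outer(psi[t], psi[t])
    for k in range(3):                            # outcome on copy 1
        p_k = np.trace(E[k] @ rho_t).real
        if p_k < 1e-12:
            continue                              # outcome t never occurs
        a, b = [j for j in range(3) if j != k]    # remaining candidates
        pi_a, pi_b = helstrom_povm(np.outer(psi[a], psi[a]),
                                   np.outer(psi[b], psi[b]))
        pi_t = pi_a if t == a else pi_b
        p_succ += (1 / 3) * p_k * np.trace(pi_t @ rho_t).real

print(round(p_succ, 4))  # 0.933
```

The exact value is (1 + √3/2)/2 ≈ 0.9330: each anti-trine outcome leaves two equally likely candidates with overlap 1/2, for which the Helstrom success probability is (1 + √(1 − 1/4))/2.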
