Strengths and weaknesses of quantum examples

Srinivasan Arunachalam (MIT)
Joint work with Ronald de Wolf (CWI, Amsterdam) and others

Machine learning

Classical machine learning. Grand goal:


Complexity of learning

How do we measure the efficiency of a classical or quantum learner?

- Sample complexity: the number of labeled examples used by the learner
- Time complexity: the number of time steps used by the learner

In this talk

Strengths of quantum examples:
- ACLW'18: the sample complexity of learning Fourier-sparse Boolean functions under the uniform distribution D
- Bshouty-Jackson'95: quantum polynomial-time learnability of DNFs under uniform D
- ACKW'18: quantum examples can help the coupon collector

Weaknesses of quantum examples:
- AW'17: quantum examples are not more powerful than classical examples for PAC learning

Fourier sampling: a useful trick under the uniform distribution D

Let c : {0,1}^n → {−1,1}. Its Fourier coefficients are

    ĉ(S) = (1/2^n) ∑_{x ∈ {0,1}^n} c(x) (−1)^{S·x}   for all S ∈ {0,1}^n.

Parseval's identity: ∑_S ĉ(S)^2 = E_x[c(x)^2] = 1, so {ĉ(S)^2}_S forms a probability distribution, the Fourier distribution of c.

Given a quantum example under the uniform distribution D, a Hadamard transform maps

    (1/√(2^n)) ∑_x |x, c(x)⟩   →   ∑_S ĉ(S) |S⟩.

Measuring this state lets us sample S from the Fourier distribution {ĉ(S)^2}_S.
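As a concrete illustration, here is a minimal classical simulation of Fourier sampling (a sketch added for exposition, not code from the talk): it computes ĉ(S) by brute force and then samples from {ĉ(S)^2}_S, which is exactly the distribution that measuring the Hadamard-transformed quantum example produces. The concept c below is a hypothetical example; any c : {0,1}^n → {−1,1} works.

```python
import itertools
import random

n = 4

def c(x):
    # Hypothetical example concept: c(x) = -1 iff x1 AND x2.
    return -1 if (x[0] and x[1]) else 1

def fourier_coefficient(S, f):
    """c_hat(S) = 2^{-n} * sum_x f(x) * (-1)^{S.x}, computed by brute force."""
    return sum(f(x) * (-1) ** (sum(s * xi for s, xi in zip(S, x)) % 2)
               for x in itertools.product([0, 1], repeat=n)) / 2 ** n

subsets = list(itertools.product([0, 1], repeat=n))
probs = [fourier_coefficient(S, c) ** 2 for S in subsets]
assert abs(sum(probs) - 1) < 1e-9  # Parseval: the squared coefficients sum to 1

# Five "Fourier samples", i.e., draws from {c_hat(S)^2}_S:
print("Fourier samples:", random.choices(subsets, weights=probs, k=5))
```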

Applications of Fourier sampling

Consider the concept class of linear functions C1 = {c_S(x) = S·x : S ∈ {0,1}^n}.

- Classical: Ω(n) classical examples are needed.
- Quantum: 1 quantum example suffices to learn C1 (Bernstein-Vazirani'93); see the sketch after this slide.

Consider C2 = {c : c is an ℓ-junta}, i.e., c(x) depends on only ℓ bits of x.

- Classical: efficient learning is notoriously hard, even for ℓ = O(log n) and uniform D.
- Quantum: C2 can be learned exactly from Õ(2^ℓ) quantum examples in time Õ(n·2^ℓ + 2^{2ℓ}) (Atıcı-Servedio'09).

Can we generalize both of these concept classes?

Definition: c is k-Fourier-sparse if |{S : ĉ(S) ≠ 0}| ≤ k.

Note that C1 is 1-Fourier-sparse and C2 is 2^ℓ-Fourier-sparse. Consider the concept class

    C = {c : {0,1}^n → {−1,1} : c is k-Fourier-sparse}.

- C1 ⊆ C: C contains the linear functions.
- C2 ⊆ C: C contains the (log k)-juntas.
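To see why a single quantum example suffices for linear functions, note that for c_S(x) = S·x, written in ±1 form as (−1)^{S·x}, the Fourier distribution is a point mass on S, so one Fourier sample reveals S outright. The short sketch below (a hypothetical instance, using the same brute-force simulation as above) checks this.

```python
import itertools

n = 5
S = (1, 0, 1, 1, 0)  # hidden string; hypothetical instance for the demo

def c(x):
    # c_S in +-1 form: (-1)^{S.x}
    return (-1) ** (sum(s * xi for s, xi in zip(S, x)) % 2)

def fourier_coefficient(T, f):
    return sum(f(x) * (-1) ** (sum(t * xi for t, xi in zip(T, x)) % 2)
               for x in itertools.product([0, 1], repeat=n)) / 2 ** n

support = [T for T in itertools.product([0, 1], repeat=n)
           if fourier_coefficient(T, c) ** 2 > 1e-12]
print(support)  # [(1, 0, 1, 1, 0)]: the whole Fourier distribution sits on S
```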

Learning C = {c : c is k-Fourier-sparse}

Exact learning of C under the uniform distribution D:

- Classical (Haviv-Regev'15): Θ̃(nk) classical examples (x, c(x)) are necessary and sufficient to learn the concept class C.
- Quantum (ACLW'18): Õ(k^1.5) quantum examples (1/√(2^n)) ∑_x |x, c(x)⟩ are sufficient to learn C (independent of the universe size n), and Ω̃(k) examples are necessary.

Sketch of the upper bound (a toy simulation of the first two steps follows this list):

1. Use Fourier sampling to sample S ∼ {ĉ(S)^2}_S.
2. Collect samples S until the learner has found the Fourier span of c, V = span{S : ĉ(S) ≠ 0}.
3. If dim(V) = r, then Õ(rk) quantum examples suffice to find V.
4. Use the result of [HR'15] to completely learn c′, i.e., c viewed as a function on the r-dimensional span V, from Õ(rk) classical examples.
5. Since r ≤ Õ(√k) for every c ∈ C [Sanyal'15], this gives the Õ(k^1.5) upper bound.
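Below is a toy simulation of steps 1 and 2, under assumptions of my own choosing (a 2-junta concept on 6 variables, hence 4-Fourier-sparse with dim(V) = 2): draw S from the Fourier distribution and insert each sample into a GF(2) basis until the basis spans V. The helper names (insert, mask) are hypothetical, not from the paper.

```python
import itertools
import random

n = 6

def c(x):
    # Hypothetical concept depending only on x1, x2: c(x) = -1 iff x1 AND x2.
    return -1 if (x[0] and x[1]) else 1

def fourier_coefficient(S, f):
    return sum(f(x) * (-1) ** (sum(s * xi for s, xi in zip(S, x)) % 2)
               for x in itertools.product([0, 1], repeat=n)) / 2 ** n

def insert(basis, v):
    """Insert bitmask v into a GF(2) basis (pivot bit -> vector); True if independent."""
    while v:
        pivot = v.bit_length() - 1
        if pivot not in basis:
            basis[pivot] = v
            return True
        v ^= basis[pivot]
    return False

def mask(S):
    return int("".join(map(str, S)), 2)

subsets = list(itertools.product([0, 1], repeat=n))
probs = [fourier_coefficient(S, c) ** 2 for S in subsets]

# dim(V), computed here from the true Fourier support just for comparison:
true_basis = {}
for S, pr in zip(subsets, probs):
    if pr > 1e-12:
        insert(true_basis, mask(S))

learned, samples = {}, 0
while len(learned) < len(true_basis):
    samples += 1
    insert(learned, mask(random.choices(subsets, weights=probs, k=1)[0]))
print(f"found the Fourier span after {samples} Fourier samples")
```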

Learning Disjunctive Normal Forms (DNFs)

A DNF is simply an OR of ANDs of variables, for example
(x1 ∧ x4 ∧ x3) ∨ (x4 ∧ x6 ∧ x7 ∧ x8).
A DNF on n variables is an s-term DNF if its number of clauses is at most s.

Learning C = {c : c is an s-term DNF on n variables} under uniform D:

- Classical: efficient learning from examples is a longstanding open question; the best known upper bound is n^{O(log n)} [Verbeurgt'90].
- Quantum: Bshouty-Jackson'95 gave a polynomial-time quantum algorithm!

Proof sketch of the quantum upper bound (an empirical check of the first step follows this list):

1. Structural property: if c is an s-term DNF, then there exists a U such that |ĉ(U)| ≥ 1/s.
2. Fourier sampling! Sample T ∼ {ĉ(T)^2}_T poly(s) many times to see such a U.
3. This gives a "weak learner" that outputs χ_U with Pr[χ_U(x) = c(x)] = 1/2 + 1/s.
4. Not good enough! We want a hypothesis that agrees with c on most inputs x.
5. Boosting: run the weak learner many times, in a suitable manner, to obtain a strong learner that outputs h satisfying Pr[h(x) = c(x)] ≥ 2/3.
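The sketch below empirically checks the structural property on one small instance: a hypothetical 2-term DNF on 6 variables (a truncated version of the example above), in the ±1 convention where −1 means "the DNF fires". It finds the largest Fourier coefficient and verifies that the corresponding sign-corrected parity χ_U is a weak predictor of c on this instance.

```python
import itertools

n = 6
terms = [(0, 3, 2), (3, 5)]  # hypothetical 2-term DNF: (x1&x4&x3) OR (x4&x6), 0-indexed
s = len(terms)

def c(x):
    # +1/-1 convention: the DNF "fires" -> -1
    return -1 if any(all(x[i] for i in t) for t in terms) else 1

def fourier_coefficient(U, f):
    return sum(f(x) * (-1) ** (sum(u * xi for u, xi in zip(U, x)) % 2)
               for x in itertools.product([0, 1], repeat=n)) / 2 ** n

best_U, best = max(((U, abs(fourier_coefficient(U, c)))
                    for U in itertools.product([0, 1], repeat=n)),
                   key=lambda t: t[1])
print(f"max |c_hat(U)| = {best:.3f} at U = {best_U}  (compare 1/s = {1/s:.3f})")

# Agreement of the sign-corrected parity chi_U with c:
sign = 1 if fourier_coefficient(best_U, c) >= 0 else -1
agree = sum(sign * (-1) ** (sum(u * xi for u, xi in zip(best_U, x)) % 2) == c(x)
            for x in itertools.product([0, 1], repeat=n)) / 2 ** n
print(f"Pr[chi_U(x) = c(x)] = {agree:.3f}")  # equals 1/2 + |c_hat(U)|/2
```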

Pretty good measurement for state identification

Consider a concept class C of n-bit Boolean functions, and let D : {0,1}^n → [0,1] be a distribution.

For c ∈ C, a quantum example is |ψ_c⟩ = ∑_{x ∈ {0,1}^n} √(D(x)) |x, c(x)⟩.

State identification: for a uniformly random (unknown) c ∈ C, given |ψ_c⟩^{⊗T}, identify c.

The optimal measurement could be quite complicated, but we can always use the Pretty Good Measurement (PGM). If P_opt is the success probability of the optimal measurement and P_pgm is the success probability of the PGM, then

    P_opt ≥ P_pgm ≥ P_opt^2   (Barnum-Knill'02).
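As a numerical sanity check of the Barnum-Knill sandwich, the sketch below (my own illustration, not from the talk) distinguishes two random pure states with unequal priors. Here Helstrom's theorem gives the optimal success probability in closed form, P_opt = (1 + √(1 − 4·p0·p1·|⟨ψ0|ψ1⟩|^2))/2, and the PGM operators are E_c = ρ^{−1/2} (p_c |ψ_c⟩⟨ψ_c|) ρ^{−1/2} with ρ = ∑_c p_c |ψ_c⟩⟨ψ_c|.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

def random_state():
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

psi = [random_state(), random_state()]
p = [0.7, 0.3]  # unequal priors, so the PGM is not automatically optimal

rho = sum(pc * np.outer(v, v.conj()) for pc, v in zip(p, psi))
w, V = np.linalg.eigh(rho)
# rho^{-1/2} on the support of rho (pseudo-inverse square root):
rho_inv_sqrt = V @ np.diag([x ** -0.5 if x > 1e-12 else 0.0 for x in w]) @ V.conj().T

# P_pgm = sum_c p_c <psi_c| E_c |psi_c> with E_c = rho^{-1/2} p_c |psi_c><psi_c| rho^{-1/2}:
P_pgm = sum(pc * np.real(v.conj() @ rho_inv_sqrt
                         @ (pc * np.outer(v, v.conj()))
                         @ rho_inv_sqrt @ v)
            for pc, v in zip(p, psi))

overlap = abs(np.vdot(psi[0], psi[1])) ** 2
P_opt = 0.5 * (1 + np.sqrt(1 - 4 * p[0] * p[1] * overlap))  # Helstrom bound
print(f"P_opt = {P_opt:.4f}, P_pgm = {P_pgm:.4f}, P_opt^2 = {P_opt ** 2:.4f}")
assert P_opt + 1e-9 >= P_pgm >= P_opt ** 2 - 1e-9  # Barnum-Knill sandwich
```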

Quantum examples help the coupon collector

Standard coupon collector problem: suppose there are N coupons. How many coupons must we draw (with replacement) before we have seen each coupon at least once?
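For reference, the classical answer is N·H_N ≈ N ln N expected draws. A quick simulation sketch (the parameters below are arbitrary) confirms this baseline, which the quantum results of ACKW'18 are compared against.

```python
import random

def draws_to_collect(N):
    """Number of uniform draws with replacement until all N coupons are seen."""
    seen, draws = set(), 0
    while len(seen) < N:
        seen.add(random.randrange(N))
        draws += 1
    return draws

N, trials = 100, 2000
avg = sum(draws_to_collect(N) for _ in range(trials)) / trials
H_N = sum(1 / i for i in range(1, N + 1))  # harmonic number, ~ ln N
print(f"empirical mean: {avg:.1f}, N*H_N = {N * H_N:.1f}")
```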
