1/18
Strengths and weaknesses of quantum examples
Srinivasan Arunachalam (MIT)
joint with Ronald de Wolf (CWI, Amsterdam) and others
2/18
Classical machine learning
Grand goal: enable AI systems to improve themselves
Practical goal: learn "something" from given data
Recent success: deep learning is extremely good at image recognition, natural language processing, even the game of Go
Why the recent interest? Flood of available data, increasing computational power, growing progress in algorithms

Quantum machine learning
What can quantum computing do for machine learning?
The learner will be quantum, the data may be quantum
Some examples of reductions in time complexity are known: clustering (Aïmeur et al. '13), principal component analysis (Lloyd et al. '13), perceptron learning (Wiebe et al. '16), recommendation systems (Kerenidis & Prakash '16)
4/18
Basic definitions
Concept class C: a collection of Boolean functions on n bits (known)
Target concept c: some function c ∈ C (unknown)
Distribution D: {0,1}^n → [0,1]
Labeled example for c ∈ C: (x, c(x)) where x ∼ D
The learner is trying to learn c
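To make the definitions concrete, here is a small illustrative sketch (mine, not from the talk): the concept class of parity functions on n bits plays the role of C, a random parity is the unknown target concept c, D is some distribution on {0,1}^n, and labeled examples are pairs (x, c(x)) with x ∼ D. The names parity_concept and labeled_examples are my own.

```python
import numpy as np

# Toy illustration of the definitions above (not from the talk): the concept
# class of parity functions on n bits, an unknown target concept, a distribution
# D on {0,1}^n, and labeled examples (x, c(x)) drawn with x ~ D.

rng = np.random.default_rng(1)
n = 4

def parity_concept(S):
    """Concept c_S(x) = S.x mod 2, as a function on n-bit strings encoded as ints."""
    return lambda x: bin(S & x).count("1") % 2

concept_class = [parity_concept(S) for S in range(2**n)]   # C (known to the learner)
c = concept_class[rng.integers(2**n)]                       # target concept (unknown)

D = rng.random(2**n); D /= D.sum()                          # some distribution on {0,1}^n

def labeled_examples(c, D, m):
    """Draw m labeled examples (x, c(x)) with x ~ D."""
    xs = rng.choice(2**n, size=m, p=D)
    return [(int(x), c(int(x))) for x in xs]

print(labeled_examples(c, D, 5))   # e.g. [(13, 1), (2, 0), ...]
```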
5/18
Learner is quantum. Data is quantum: Bshouty-Jackson'95 introduced the quantum example, a superposition
|ψ_c⟩ = Σ_{x ∈ {0,1}^n} √(D(x)) |x, c(x)⟩
Measuring this state gives (x, c(x)) with probability D(x), so quantum examples are at least as powerful as classical examples
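A minimal sketch (mine) of the quantum example as a state vector, assuming a toy parity concept and a small n: it builds |ψ_c⟩ = Σ_x √(D(x)) |x, c(x)⟩ and checks that a computational-basis measurement returns (x, c(x)) with probability D(x). All names are illustrative.

```python
import numpy as np

# Build |psi_c> = sum_x sqrt(D(x)) |x, c(x)> for a toy concept, then simulate a
# computational-basis measurement: the outcome is (x, c(x)) with probability D(x).

rng = np.random.default_rng(2)
n = 4
D = rng.random(2**n); D /= D.sum()          # some distribution on {0,1}^n
S = 5
c = lambda x: bin(S & x).count("1") % 2     # toy target concept c_S(x) = S.x mod 2

# State vector over basis states |x, b>, stored at index 2*x + b
psi = np.zeros(2**n * 2)
for x in range(2**n):
    psi[2 * x + c(x)] = np.sqrt(D[x])

assert np.isclose(np.linalg.norm(psi), 1.0)

# Measurement: outcome (x, b) occurs with probability |<x, b|psi>|^2 = D(x) when b = c(x)
probs = psi**2
outcome = int(rng.choice(len(probs), p=probs))
x, b = outcome // 2, outcome % 2
print(f"measured x={x:0{n}b}, label={b}, D(x)={D[x]:.3f}")   # label always equals c(x)
```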
6/18
Fix a concept class C and a distribution D: {0,1}^n → [0,1]
Question: understand the concept classes C and the distributions D for which fewer quantum examples suffice for a quantum learner
7/18
Focus on the Probably Approximately Correct (PAC) model of learning
Fix C ⊆ {c : {0,1}^n → {0,1}} and D: {0,1}^n → [0,1]
Using i.i.d. labeled examples, a learner for C should output a hypothesis h that is close to c w.r.t. D, i.e., err_D(c, h) = Pr_{x∼D}[c(x) ≠ h(x)] should be small

Distribution-dependent learning (for a fixed D)
An algorithm (ε, δ)-learns C under D if: ∀c ∈ C: Pr[ err_D(c, h) ≤ ε ] ≥ 1 − δ
("Probably": the outer probability is at least 1 − δ; "approximately correct": the error is at most ε)

PAC learning (distribution-independent learning, for every D)
An algorithm (ε, δ)-PAC-learns C if: ∀D ∀c ∈ C: Pr[ err_D(c, h) ≤ ε ] ≥ 1 − δ
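As a toy illustration of the PAC criterion (a sketch of mine, not the talk's algorithm), the learner below outputs any concept in C consistent with the sample, and err_D(c, h) is evaluated exactly for the small class of parity functions.

```python
import numpy as np

# A small sketch of the PAC quantities: a learner that outputs any concept
# consistent with the labeled sample, and the error err_D(c, h) = Pr_{x~D}[c(x) != h(x)].

rng = np.random.default_rng(3)
n = 4
concepts = list(range(2**n))                       # C = parity functions, indexed by S
f = lambda S, x: bin(S & x).count("1") % 2
D = rng.random(2**n); D /= D.sum()

target = int(rng.integers(2**n))                   # unknown target concept
sample = [(int(x), f(target, int(x))) for x in rng.choice(2**n, size=20, p=D)]

# Learner: pick any concept in C consistent with the sample (the target always is)
h = next(S for S in concepts if all(f(S, x) == y for x, y in sample))

err = sum(D[x] for x in range(2**n) if f(h, x) != f(target, x))
print("err_D(c, h) =", err)                        # typically small, i.e. "approximately correct"
```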
8/18
How to measure the efficiency of the classical or quantum learner?
Sample complexity: number of labeled examples used by the learner
Time complexity: number of time-steps used by the learner

In this talk
Strengths of quantum examples:
ACLW'18: sample complexity of learning Fourier-sparse Boolean functions under uniform D
Bshouty-Jackson'95: quantum polynomial-time learnability of DNFs under uniform D
ACKW'18: quantum examples can help the coupon collector
Weaknesses of quantum examples:
AW'17: quantum examples are not more powerful than classical examples for PAC learning
9/18
Let c: {0,1}^n → {−1,1}. Then the Fourier coefficients are ĉ(S) = (1/2^n) Σ_x c(x)(−1)^{S·x} for all S ∈ {0,1}^n
Parseval's identity: Σ_S ĉ(S)^2 = E_x[c(x)^2] = 1, so {ĉ(S)^2}_S forms a probability distribution
Given a quantum example under uniform D, (1/√(2^n)) Σ_x |x, c(x)⟩, apply the Hadamard transform; measuring then allows the learner to sample S from the Fourier distribution {ĉ(S)^2}_S
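The following sketch (mine) computes the Fourier coefficients of a small ±1-valued function, checks Parseval's identity, and then samples from the Fourier distribution {ĉ(S)^2}_S classically; it simulates the outcome distribution of quantum Fourier sampling rather than the quantum procedure itself, and the particular function g is an arbitrary choice.

```python
import numpy as np

# Compute c_hat(S) = (1/2^n) sum_x c(x) (-1)^{S.x}, verify Parseval, and sample
# from the Fourier distribution {c_hat(S)^2}_S (the distribution that quantum
# Fourier sampling of a uniform quantum example would produce).

rng = np.random.default_rng(4)
n = 4
X = np.arange(2**n)

g = lambda x: 1 if ((x & 0b0011) == 0b0011 or (x & 0b1100) == 0b1100) else 0
c = np.array([(-1)**g(x) for x in X])                  # a +-1-valued Boolean function

def chat(S):
    signs = np.array([(-1)**(bin(S & x).count("1") % 2) for x in X])
    return np.mean(c * signs)

coeffs = np.array([chat(S) for S in X])
print("Parseval:", np.isclose(np.sum(coeffs**2), 1.0))  # sum_S c_hat(S)^2 = 1

p = coeffs**2
p /= p.sum()                                            # guard against rounding
samples = rng.choice(X, size=5, p=p)                    # S appears with prob c_hat(S)^2
print("Fourier samples S:", [format(int(S), f"0{n}b") for S in samples])
```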
10/18
Consider the concept class of linear functions C1 = {c_S(x) = S·x}_{S ∈ {0,1}^n}
Classical: Ω(n) classical examples are needed
Quantum: 1 quantum example suffices to learn C1 (Bernstein-Vazirani'93)

Consider C2 = {c : c is an ℓ-junta}, i.e., c(x) depends on only ℓ bits of x
Classical: efficient learning is notoriously hard already for ℓ = O(log n) and uniform D
Quantum: C2 can be learnt exactly using O(2^ℓ) quantum examples

Generalizing both these concept classes?
Definition: we say c is k-Fourier-sparse if |{S : ĉ(S) ≠ 0}| ≤ k
Note that C1 is 1-Fourier-sparse and C2 is 2^ℓ-Fourier-sparse
Consider the concept class C = {c : {0,1}^n → {−1,1} : c is k-Fourier-sparse}
Observe that C1 ⊆ C (C contains linear functions) and C2 ⊆ C (C contains (log k)-juntas)
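A small illustration (mine) of why Fourier sampling helps for these two classes: for a linear function c_S the Fourier distribution is a point mass on S, so a single Fourier sample reveals S (the Bernstein-Vazirani phenomenon), while for an ℓ-junta the distribution is supported only on subsets of the ℓ relevant variables. The specific S and junta below are arbitrary choices.

```python
import numpy as np

# For a linear function all Fourier weight sits on S; for an l-junta the Fourier
# support only involves the l relevant variables (here bits 0, 1, 2).

n = 6
X = np.arange(2**n)
dot = lambda S, x: bin(S & x).count("1") % 2

def fourier_dist(c):
    """Return the Fourier distribution {c_hat(T)^2}_T of a {0,1}-valued function c."""
    vals = np.array([(-1)**c(x) for x in X])
    coeffs = np.array([np.mean(vals * np.array([(-1)**dot(T, x) for x in X])) for T in X])
    return coeffs**2

# Linear function c_S(x) = S.x: a single Fourier sample returns S with certainty
S = 0b101001
dist = fourier_dist(lambda x: dot(S, x))
print("weight on S:", dist[S])                                  # 1.0

# A 3-junta of bits 0, 1, 2: Fourier support only on T contained in {0,1,2}
junta = lambda x: ((x & 1) & ((x >> 1) & 1)) | ((x >> 2) & 1)
dist = fourier_dist(junta)
support = [format(int(T), f"0{n}b") for T in X if dist[T] > 1e-12]
print("Fourier support:", support)                              # all T have zeros outside bits 0-2
```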
11/18
Exact learning C under the uniform distribution D
Classically (Haviv-Regev'15): Θ(nk) classical examples (x, c(x)) are necessary and sufficient to learn the concept class C
Quantumly (ACLW'18): O(k^{1.5}) quantum examples (1/√(2^n)) Σ_x |x, c(x)⟩ are sufficient to learn C (independent of the universe size n)

Sketch of the upper bound
Use Fourier sampling to sample S ∼ {ĉ(S)^2}_S
Collect samples S until the learner knows the Fourier span of c, V = span{S : ĉ(S) ≠ 0}
If dim(V) = r, then O(rk) quantum examples suffice to find V
Use the result of [HR'15] to learn c completely using O(rk) classical examples
Since r ≤ O(√k) for every c ∈ C [Sanyal'15], we get the O(k^{1.5}) upper bound
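A sketch (mine) of the span-finding step: repeatedly Fourier-sample S ∼ {ĉ(S)^2}_S, here simulated by a toy distribution over a few fixed frequencies, and track the GF(2) span of the samples; its dimension r is dim(V). The frequencies and helper gf2_rank are my own illustrative choices.

```python
import numpy as np

# Collect (simulated) Fourier samples and track the dimension of their GF(2) span,
# which is the Fourier span V = span{S : c_hat(S) != 0} once enough samples are seen.

rng = np.random.default_rng(6)
n = 8

# Toy stand-in for the Fourier support of a k-Fourier-sparse c (k = 4 here);
# in the real algorithm each sample comes from measuring a quantum example.
freqs = [0b00010011, 0b01000001, 0b00010010, 0b01000000]
sample_S = lambda: int(rng.choice(freqs))

def gf2_rank(vectors):
    """Rank over GF(2) of n-bit integers, via Gaussian elimination on leading bits."""
    pivots = {}                                    # leading-bit position -> basis vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots), sorted(pivots.values(), reverse=True)

samples = [sample_S() for _ in range(50)]
r, basis = gf2_rank(samples)
print("Fourier span dimension r =", r)             # here 3: the 4 frequencies are GF(2)-dependent
print("basis:", [format(b, f"0{n}b") for b in basis])
```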
12/18
DNFs
A DNF is simply an OR of ANDs of variables, for example (x1 ∧ x4 ∧ x3) ∨ (x4 ∧ x6 ∧ x7 ∧ x8)
A DNF on n variables is an s-term DNF if its number of clauses is ≤ s

Learning C = {c : c is an s-term DNF on n variables} under uniform D
Classically: efficient learning from examples is a longstanding open question; the best known algorithm runs in time n^{O(log n)} [Verbeurgt'90]
Quantumly: Bshouty-Jackson'95 gave a polynomial-time quantum algorithm!

Proof sketch of the quantum upper bound
Structural property: if c is an s-term DNF, then there exists U such that |ĉ(U)| ≥ Ω(1/s)
Fourier sampling! Sample T ∼ {ĉ(T)^2}_T poly(s) many times to find such a U
This yields a "weak learner" that outputs χ_U with Pr[χ_U(x) = c(x)] ≥ 1/2 + Ω(1/s)
Not good enough! We want a hypothesis that agrees with c on most inputs x
Boosting: run the weak learner many times, in a suitable manner, to obtain a strong learner that outputs h satisfying Pr[h(x) = c(x)] ≥ 2/3
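A toy sketch (mine) of the weak learner behind this argument: for a small DNF we locate the parity χ_U with the largest |ĉ(U)| (which Fourier sampling would find after poly(s) samples) and verify that it agrees with c on noticeably more than half the inputs. The DNF, n, and variable choices below are arbitrary, and boosting is not implemented.

```python
import numpy as np

# Find the heaviest Fourier coefficient of a small s-term DNF and check that the
# corresponding parity chi_U is a weak approximator: agreement > 1/2.

n = 6
X = np.arange(2**n)
bit = lambda x, i: (x >> i) & 1

# A 2-term DNF: (x1 AND x3) OR (x4 AND x5)
dnf = lambda x: (bit(x, 1) & bit(x, 3)) | (bit(x, 4) & bit(x, 5))
c = np.array([(-1)**dnf(x) for x in X])             # +-1-valued version of the DNF

dot = lambda U, x: bin(U & x).count("1") % 2
coeffs = np.array([np.mean(c * np.array([(-1)**dot(U, x) for x in X])) for U in X])

U = int(np.argmax(np.abs(coeffs)))                  # the parity Fourier sampling would find
chi_U = np.array([(-1)**dot(U, x) for x in X])
agreement = np.mean(chi_U == c)
print(f"best parity U={U:0{n}b}, |c_hat(U)|={abs(coeffs[U]):.3f}, agreement={agreement:.3f}")
# agreement > 1/2: a weak learner; boosting turns this into a strong learner.
```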
13/18
Consider a concept class C of n-bit Boolean functions and let D: {0,1}^n → [0,1] be a distribution
For c ∈ C, a quantum example is |ψ_c⟩ = Σ_{x ∈ {0,1}^n} √(D(x)) |x, c(x)⟩
State identification: for a uniformly random c ∈ C (unknown), given |ψ_c⟩^{⊗T}, identify c
The optimal measurement could be quite complicated, but we can always use the Pretty Good Measurement (PGM)
If P_opt is the success probability of the optimal measurement and P_pgm is the success probability of the PGM, then P_opt ≥ P_pgm ≥ P_opt^2
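The sketch below (mine) builds the PGM for a toy ensemble of quantum-example states, E_c = ρ^{−1/2} p_c |Ψ_c⟩⟨Ψ_c| ρ^{−1/2} with |Ψ_c⟩ = |ψ_c⟩^{⊗T} and ρ = Σ_c p_c |Ψ_c⟩⟨Ψ_c|, and computes P_pgm numerically; the concept class (parities), distribution, and number of copies are arbitrary illustrative choices.

```python
import numpy as np

# Construct the Pretty Good Measurement for an ensemble of quantum-example states
# and compute its average success probability P_pgm.

rng = np.random.default_rng(0)
n = 3
X = np.arange(2**n)
D = np.ones(2**n) / 2**n                 # uniform distribution over {0,1}^n
parity = lambda S, x: bin(S & x).count("1") % 2
concepts = list(range(2**n))             # toy concept class: all parities on n bits

def example_state(S):
    """|psi_c> = sum_x sqrt(D(x)) |x, c(x)>, as a vector of dimension 2^n * 2."""
    psi = np.zeros(2**n * 2)
    for x in X:
        psi[2 * int(x) + parity(S, int(x))] = np.sqrt(D[x])
    return psi

T = 2                                    # number of copies of the quantum example
states = []
for S in concepts:
    psi = example_state(S)
    psiT = psi
    for _ in range(T - 1):
        psiT = np.kron(psiT, psi)        # |psi_c>^{tensor T}
    states.append(psiT)

p = np.ones(len(concepts)) / len(concepts)                  # uniform prior over concepts
rho = sum(pc * np.outer(s, s) for pc, s in zip(p, states))

# PGM POVM elements: E_c = rho^{-1/2} p_c |Psi_c><Psi_c| rho^{-1/2}
evals, evecs = np.linalg.eigh(rho)
inv_sqrt = evecs @ np.diag([1/np.sqrt(v) if v > 1e-12 else 0.0 for v in evals]) @ evecs.T

P_pgm = 0.0
for pc, s in zip(p, states):
    E = inv_sqrt @ (pc * np.outer(s, s)) @ inv_sqrt
    P_pgm += pc * (s @ E @ s)            # Pr[PGM outputs c | state Psi_c]

print("P_pgm =", P_pgm)
```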
14/18
Standard coupon collector
Problem: suppose there are N coupons. How many coupons must we draw (with replacement) before having seen each coupon at least once?
Answer: a simple probability analysis shows Θ(N log N)

Variation of the coupon collector
Problem: suppose there are N coupons and fix an unknown i* ∈ {1, ..., N}. How many coupons must we draw (with replacement) from {1, ..., N}\{i*} before learning i*?
Answer: the same analysis as before shows Θ(N log N)

What if we are given "quantum examples"?
Suppose a quantum learner obtains quantum examples (1/√(N−1)) Σ_{i ≠ i*} |i⟩
How many quantum examples are needed before learning i*?
Answer [ACKW'18]: we can learn i* using Θ(N) quantum examples
Proof idea: analyze the success probability using the pretty good measurement; if T = O(N), then P_opt ≥ P_pgm ≥ 2/3
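A quick Monte Carlo check (mine) of the classical Θ(N log N) bound for the variant: draw uniformly from {1,…,N}\{i*} until every coupon other than i* has appeared, at which point i* is known by elimination. The quantum Θ(N) bound is not simulated here; N and the number of trials are arbitrary.

```python
import numpy as np

# Simulate the classical coupon-collector variant: count draws from {1,...,N}\{i*}
# until all N-1 coupons have been seen (then i* is the one never observed).

rng = np.random.default_rng(9)
N = 200
trials = 50
counts = []
for _ in range(trials):
    i_star = rng.integers(N)
    others = np.delete(np.arange(N), i_star)
    seen, draws = set(), 0
    while len(seen) < N - 1:                 # until every coupon != i* has been seen
        seen.add(int(rng.choice(others)))
        draws += 1
    counts.append(draws)
print("average draws:", np.mean(counts), "vs N ln N =", round(N * np.log(N)))
```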
15/18
Recall: PAC learning
Given examples (x, c(x)) where x ∼ D, a learner (ε, δ)-PAC-learns C if: ∀D ∀c ∈ C: Pr[ err_D(c, h) ≤ ε ] ≥ 1 − δ
Complexity measure: number of labeled examples
To a concept class C we associate a combinatorial parameter, the VC-dimension of C; the classical PAC sample complexity is characterized by the VC-dimension of C

Fundamental theorem of PAC learning
Suppose VC-dim(C) = d
Blumer-Ehrenfeucht-Haussler-Warmuth'86: every (ε, δ)-PAC learner for C needs Ω(d/ε + log(1/δ)/ε) examples
Hanneke'16: there exists an (ε, δ)-PAC learner for C using O(d/ε + log(1/δ)/ε) examples
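For concreteness, a brute-force VC-dimension computation for a tiny concept class together with an evaluation of the d/ε + log(1/δ)/ε expression (a toy sketch of mine; constants and rounding are ignored, and the parity class is an arbitrary choice).

```python
from itertools import combinations
from math import log, ceil

# Brute-force VC dimension of a small concept class, plus the sample-complexity
# expression d/eps + log(1/delta)/eps from the fundamental theorem (up to constants).

n = 4
domain = list(range(2**n))
dot = lambda S, x: bin(S & x).count("1") % 2
concept_class = [[dot(S, x) for x in domain] for S in range(2**n)]   # parities on 4 bits

def vc_dimension(C, domain):
    d = 0
    for size in range(1, len(domain) + 1):
        shattered = False
        for pts in combinations(range(len(domain)), size):
            patterns = {tuple(c[i] for i in pts) for c in C}
            if len(patterns) == 2**size:            # every labeling of pts is realized
                shattered = True
                break
        if shattered:
            d = size
        else:
            break
    return d

d = vc_dimension(concept_class, domain)
eps, delta = 0.1, 0.05
print("VC-dim =", d)                                  # 4 for parities on 4 bits
print("sample complexity ~", ceil(d / eps + log(1 / delta) / eps))
```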
16/18
Quantum bounds
Classical upper bound: O(d/ε + log(1/δ)/ε) examples
Atıcı-Servedio'04: lower bound of Ω(√d/ε + log(1/δ)/ε) quantum examples
AW'17: lower bound of Ω(d/ε + log(1/δ)/ε) quantum examples, matching the classical bound
Proof idea: reduce to state identification. For a good learner P_opt ≥ 2/3, so P_pgm ≥ P_opt^2 ≥ 4/9; upper-bounding P_pgm in terms of the number of quantum examples T, d and ε then yields the lower bound

Let's get real! In computational learning theory, agnostic learning and learning under classification noise are theoretical ways to model noise in the data
Again, in these realistic models we show that the quantum sample complexity equals the classical sample complexity
17/18
More mileage out of Fourier sampling?
Extend the result of Bshouty-Jackson from depth-2 circuits (i.e., DNFs) to depth-3?
Can we PAC-learn DNFs? If so, then we could possibly learn depth-3 circuits under the uniform distribution
Scott Aaronson: can AC0 be learnt in quantum polynomial time? (One of his ten semi-grand challenges for quantum computing!)
Can TC0 be learnt in quantum polynomial time? A theoretical way to understand neural networks
Can we learn constant-depth quantum circuits?

More open questions!
Can we learn the concept class of k-Fourier-sparse Boolean functions using O(k log k) samples, matching our lower bound?
Theoretically, one could consider more optimistic PAC-like models where the learner need not succeed for all c ∈ C and all D
Find more distributions (other than uniform) where quantum provides a speedup
18/18
For PAC learning, quantum examples are no better than classical examples
Under the uniform distribution D, quantum examples seem to help tremendously in some cases
Quantum machine learning is still in its infancy! There are not many strong examples where quantum significantly improves ML
There are many recent surveys on quantum machine learning.