

Stochasticity in Algorithmic Statistics for Polynomial Time

Alexey Milovanov, Nikolay Vereshchagin

National Research University Higher School of Economics

CCC 2017, Riga

1 / 15

Algorithmic Statistics

A black box that samples from an unknown probability distribution → x = 1000010101…1 (an n-bit string)

A general question: Given the black box's output x and a distribution µ, is it plausible that the black box samples from µ?

Example: Let x = 101100101110100101010000101100101110100101010000 and let µ be the uniform distribution over strings of length n = |x|. Is it plausible that the black box samples from µ?

An answer: No, since x is a square (x = uu) and the probability of being a square is negligible (2^{−n/2}).

2 / 15
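The refutation in the example is concrete enough to check by machine. Below is a small Python sketch (my illustration, not from the slides) that tests whether a bit string is a square and, for a small n, counts squares exactly; the fraction comes out to 2^{−n/2}, which is why observing a square refutes the uniform hypothesis.

```python
def is_square(x: str) -> bool:
    """A string is a square if it is some string u repeated twice: x = uu."""
    half, rem = divmod(len(x), 2)
    return rem == 0 and x[:half] == x[half:]

x = "101100101110100101010000101100101110100101010000"
assert is_square(x)  # x = uu with u = "101100101110100101010000"

# For small n we can count squares exactly: there are 2^(n/2) squares
# among the 2^n strings of length n, so the fraction is 2^(-n/2).
n = 10
squares = sum(is_square(format(v, f"0{n}b")) for v in range(2 ** n))
print(squares / 2 ** n)  # 2^(-5) = 0.03125
```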

Algorithmic Statistics with no time bounds

Definition (Kolmogorov) A probability distribution µ is an acceptable explanation for x if the randomness deficiency of x with respect to µ, −log₂ µ(x) − C(x|µ), is negligible.

Majority Principle: for all µ, if x is sampled from µ, then the probability of having −log₂ µ(x) − C(x|µ) > β is less than 2^{−β}.

Proposition −log µ(x) − C(x|µ) is large if and only if there is a simple set T ∋ x (that is, T is enumerated by a short program) with negligible µ(T).

3 / 15
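The Majority Principle is a counting argument; a standard sketch (written here with prefix complexity K, for which Kraft's inequality Σ_x 2^{−K(x|µ)} ≤ 1 applies; for plain C the same bound holds up to logarithmic precision):

```latex
% x has deficiency > beta  iff  mu(x) < 2^{-K(x|mu)-beta}
\Pr_{x\sim\mu}\!\left[-\log_2\mu(x) - K(x\mid\mu) > \beta\right]
 \;=\; \sum_{x:\;\mu(x) < 2^{-K(x\mid\mu)-\beta}} \mu(x)
 \;<\; \sum_{x} 2^{-K(x\mid\mu)-\beta}
 \;=\; 2^{-\beta}\sum_{x} 2^{-K(x\mid\mu)}
 \;\le\; 2^{-\beta}.
```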

Algorithmic Statistics with no time bounds

Back to our example: x = 101100101110100101010000101100101110100101010000

If µ is the uniform distribution over n-bit strings, then −log µ(x) − C(x|µ) ≈ n − n/2 = n/2;

If µ is the uniform distribution over all n-bit squares, then −log µ(x) − C(x|µ) ≈ n/2 − n/2 = 0.

Another example: Let x be an arbitrary n-bit string and let µ be concentrated on x, i.e., µ(x) = 1. Then µ is acceptable for x, since −log µ(x) − C(x|µ) ≈ 0 − 0 = 0.

The goal: given x, find a simple (C(µ) ≈ 0) acceptable explanation µ for x.

Theorem (A. Shen 1983) This goal is not always achievable (there are non-stochastic strings).

4 / 15
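Kolmogorov complexity is uncomputable, but a general-purpose compressor gives a computable upper bound, which is enough to see the gap from this slide empirically. The Python sketch below (my illustration, not part of the talk) compresses a random string and a square of the same length with zlib; the square compresses to roughly half its length, mirroring C(x) ≈ n/2 for squares.

```python
import os
import zlib

n_bytes = 4096                      # length of each test string, in bytes
u = os.urandom(n_bytes // 2)        # random half
square = u + u                      # a "square": the half repeated twice
random_str = os.urandom(n_bytes)    # a fully random string of the same length

c_square = len(zlib.compress(square, 9))
c_random = len(zlib.compress(random_str, 9))

# Incompressible data stays near its original size; the square shrinks
# to about half because the second copy is one long back-reference.
print(c_random, c_square)
assert c_square < 0.75 * c_random
```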

Algorithmic Statistics with time bounds: acceptable explanations

Now we care about computation time!

Question: How do we define acceptable explanations? Why not say that the time-bounded version of Kolmogorov's randomness deficiency, −log µ(x) − C^t(x|µ), is small?

Answer: For polynomial-time bounded computations, we cannot prove that the randomness deficiency is small if and only if there is no simple refutation set. We will define acceptability using refutation sets.

Back to our example: x = 101100101110100101010000101100101110100101010000, and µ is the uniform distribution over strings of length n = |x|. We refute µ, since x falls into a simple set T ∋ x having negligible µ(T). Notice that T can be recognized by a short program in a short (polynomial) time.

5 / 15

Algorithmic Statistics with time bounds: acceptable explanations

Definition (informal) µ is an acceptable explanation for x if there is no T ∋ x with negligible µ(T) which is recognizable by a short program in a short time.

Definition (formal) µ is a (t, α, ε)-acceptable explanation for x if for all T ∋ x with CD^t(T) < α, we have µ(T) ≥ ε.

Majority principle: if ε ≪ 2^{−α}, then the µ-probability of the event "µ is not a (t, α, ε)-acceptable explanation for x" is negligible (the probability of this event is smaller than ε·2^α).

6 / 15
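The time-bounded majority principle is again a union bound; a sketch: there are fewer than 2^α sets T with CD^t(T) < α, and each "refuting" T has µ(T) < ε, so

```latex
\Pr_{x\sim\mu}\bigl[\exists\, T \ni x:\ \mathrm{CD}^{t}(T) < \alpha,\ \mu(T) < \varepsilon\bigr]
 \;\le\; \sum_{T:\ \mathrm{CD}^{t}(T) < \alpha,\ \mu(T) < \varepsilon} \mu(T)
 \;<\; 2^{\alpha}\,\varepsilon .
```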

Simple explanations

Example x is an arbitrary string and µ is concentrated on x. Then µ is (∗, ∗, 1)-acceptable for x.

Goal: Given x, find a simple acceptable explanation for x.

Definition (informal) A distribution µ is simple if there is a fast sampler with a short program for µ.

Definition (formal) µ is (t′, α′)-simple if there is a sampler for µ with a program of length less than α′ and running time less than t′.

Remark For one result we will need that µ can be computed, rather than merely sampled, in a short time.

7 / 15
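For the running example, a (t′, α′)-simple explanation exists: the uniform distribution over n-bit squares has a sampler whose program is short and fast. A Python sketch (my illustration, not from the slides):

```python
import random

def sample_square(n: int, rng: random.Random) -> str:
    """Sampler for the uniform distribution over n-bit squares:
    draw u uniformly from {0,1}^(n/2) and output uu.
    The program is short and runs in time polynomial in n."""
    assert n % 2 == 0
    u = "".join(rng.choice("01") for _ in range(n // 2))
    return u + u

rng = random.Random(0)
x = sample_square(48, rng)
print(x)  # a 48-bit square
assert len(x) == 48 and x[:24] == x[24:]
```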

The main result

We consider (t′, α′)-simple (t, α, ε)-acceptable explanations where α′, α are O(log n); t′, t, 1/ε are polynomial in n; and α ≫ α′, t ≫ t′, and ε ≪ 2^{−α}.

Conjecture (informal) There are strings x that have no simple acceptable explanations (non-stochastic strings).

Conjecture (formal) For all c there is d such that for infinitely many n there is an n-bit string x without (n^c, c log n)-simple (n^d, d log n, n^{−c})-acceptable explanations.

Theorem If NE ≠ RE, then the Conjecture holds (and, moreover, the Conjecture holds for a constant d, which does not depend on c).

8 / 15

Other results

Recall the formal Conjecture from the previous slide.

Theorem If the Conjecture holds for a constant d (not depending on c), then P ≠ PSPACE.

Theorem If P = PSPACE, then the Conjecture holds. (Moreover, the Conjecture holds unconditionally for space restrictions in place of time restrictions.)

9 / 15

The techniques

Definition A set T of strings is called elusive if T ∈ P, yet for all c there are infinitely many n such that T^{=n} ≠ ∅ but for any randomized machine M with program of length at most c log n running in time n^c we have

Prob[M's output falls in T^{=n}] < n^{−c}.

The sketch of the proof: NE ≠ RE ⇒ there exists an elusive set ⇒ the Conjecture holds.

10 / 15

A gap between Kolmogorov and distinguishing complexities

Theorem (informal) If there exists an elusive set, then there are strings x with CD^{poly(n)}(x|r) ≪ C^{poly(n)}(x|r) for 99% of r's of length poly(n).

Theorem (formal) If there exists an elusive set, then for some d, for all c, there are infinitely many strings x with CD^{|x|^d}(x|r) < C^{|x|^c}(x|r) − c log |x| for 99% of r's of length |x|^d.

11 / 15

Other approaches to define acceptable explanations: Plausible explanations

Definition µ is a (t, ε)-plausible explanation for x if for all T ∋ x we have µ(T) ≥ ε·2^{−CD^t(T)}.

Theorem Assume that there exists a PRNG G : {0, 1}^n → {0, 1}^{2n}. Then for all c, for all sufficiently large n, for 99% of strings s of length n, the uniform distribution is an (n^c, c log n, n^{−c}/200)-acceptable hypothesis for G(s).

On the other hand, the set T = {x} proves that the uniform distribution is not (poly(n), 2^{−n})-plausible for x = G(s): the fraction of T = {x} among all 2n-bit strings is 2^{−2n}, while CD^{poly(n)}(T) ≤ n.

12 / 15
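Why does a pseudorandom output still admit the uniform distribution as an acceptable hypothesis, even though a tiny set containing it exists? The range of G has uniform measure at most 2^{−n}, but recognizing it quickly would amount to distinguishing G's outputs. The toy sketch below (a hypothetical stand-in for G built from SHA-256, purely illustrative, with no pseudorandomness guarantee) just checks the measure claim for 8-bit seeds and 16-bit outputs.

```python
import hashlib

SEED_BITS, OUT_BITS = 8, 16

def G(seed: int) -> int:
    """Toy length-doubling generator: 8-bit seed -> 16 output bits.
    (A stand-in for a real PRNG, for illustration only.)"""
    digest = hashlib.sha256(seed.to_bytes(1, "big")).digest()
    return int.from_bytes(digest[:2], "big")

range_of_G = {G(s) for s in range(2 ** SEED_BITS)}
measure = len(range_of_G) / 2 ** OUT_BITS

# The set T = range(G) contains every output of G, yet its uniform
# measure is at most 2^(-8): a refuting set, if one could recognize it.
print(len(range_of_G), measure)
assert measure <= 2 ** -SEED_BITS
```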

Other approaches to define acceptable explanations: Optimal explanations

Definition µ is a (t, ε)-optimal explanation for x if µ(x) ≥ ε·2^{−C^t(x)}.

Relations between acceptability, plausibility and optimality:

(t, ε)-plausible ⇒ (t, α, ε·2^{−α})-acceptable for all α.

(t, ε)-plausible ⇒ (t, ε)-optimal (let T = {x} in the definition of plausibility).

Theorem (informal)
(1) If CD^{poly}(x) ≪ C^{poly}(x), then x has no simple plausible explanations (under the assumption that Time(2^{O(n)}) ⊄ Space(2^{o(n)}) for almost all n).
(2) If CD^{poly}(x) ≈ C^{poly}(x), then every simple optimal explanation is plausible (under the assumption that Time(2^{O(n)}) ⊄ Size(2^{o(n)}) for almost all n).

13 / 15
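Both implications between the notions are one-line calculations; a sketch (the second uses CD^t({x}) ≤ C^t(x) + O(1), which holds up to a polynomial overhead in time, since a program generating x can be turned into a distinguisher for {x}):

```latex
% plausible => acceptable: if CD^t(T) < alpha, then
\mu(T) \;\ge\; \varepsilon\, 2^{-\mathrm{CD}^{t}(T)} \;>\; \varepsilon\, 2^{-\alpha},
% so mu is (t, alpha, eps * 2^{-alpha})-acceptable.
\qquad
% plausible => optimal: take T = {x}:
\mu(x) \;\ge\; \varepsilon\, 2^{-\mathrm{CD}^{t}(\{x\})}
       \;\ge\; \varepsilon\, 2^{-C^{t}(x)-O(1)}.
```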

Open questions

Question Under which assumptions do there exist non-stochastic strings with polynomial bounds on time and linear bounds on program length?

Question Under which assumptions do there exist strings that do not possess simple optimal hypotheses?

Question How is acceptability related to optimality for strings x with CD^{poly}(x) ≪ C^{poly}(x)?

14 / 15


Thank you!

15 / 15