 
Safe Probability
Peter Grünwald
Centrum Wiskunde & Informatica – Amsterdam
Mathematisch Instituut – Universiteit Leiden
November 2015

Prelude: Kelly Gambling
• Suppose we observe a sequence X_1, X_2, … of 0s and 1s
• At each point in time i, we can buy a ticket T_{i,1} that pays off $2 iff X_i = 1, and a ticket T_{i,0} that pays off $2 iff X_i = 0. Both tickets cost $1
• Crucially: we are allowed to divide our capital any way we like and re-invest our capital at each point in time
  – e.g. by putting 50% of your capital at time i on T_{i,1} and 50% on T_{i,0} you make sure that your capital remains the same
• A gambling strategy in this game is a function mapping each observed past x_1, …, x_{i−1} to the fraction of capital put on T_{i,1} (the rest goes on T_{i,0}), and thus defines a probability distribution P on {0,1}^∞ via setting P(X_i = 1 | x_1, …, x_{i−1}) equal to that fraction
• If we follow such a strategy and start with $1, our capital after n rounds will be 2^n · P(x_1, …, x_n)

How to design a gambling strategy?
• A gambling strategy in this game is formally equivalent to a probability distribution P on infinite sequences. Which strategy should we adopt?
• Strict Subjective Bayesian: think very long about the situation, come up with a subjective distribution P*, and then play the distribution P maximizing expected gain (we may have P ≠ P*)
• Imprecise Probabilist: come up with a set of distributions 𝒫*, and then play the distribution P optimal relative to 𝒫*, with optimality defined relative to some additional criterion (which one?)

Safe Probability – Workshop Teddy Seidenfeld
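The capital bookkeeping of the Kelly prelude can be sketched in a few lines. This is a minimal illustration, assuming the whole capital is re-invested every round and using the Laplace rule of succession as the strategy; the function names `laplace_prob_one` and `play` and the sample sequence are hypothetical, not taken from the slides.

```python
from fractions import Fraction


def laplace_prob_one(past):
    """Laplace rule of succession: P(next = 1 | past) = (#ones + 1) / (n + 2)."""
    return Fraction(sum(past) + 1, len(past) + 2)


def play(outcomes, prob_one=laplace_prob_one):
    """Return (capital, sequence_probability) after betting on `outcomes`.

    Assumption: each round the whole capital is split, a fraction p on the
    ticket for outcome 1 and 1 - p on the ticket for outcome 0.
    Tickets cost $1 and pay $2, so the winning stake is doubled.
    """
    capital = Fraction(1)   # start with $1
    seq_prob = Fraction(1)  # P(x_1, ..., x_n), built up by the chain rule
    for i, x in enumerate(outcomes):
        p = prob_one(outcomes[:i])
        winning_fraction = p if x == 1 else 1 - p
        capital *= 2 * winning_fraction
        seq_prob *= winning_fraction
    return capital, seq_prob


outcomes = [1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical data
capital, seq_prob = play(outcomes)
# Capital after n rounds equals 2^n times the probability of the sequence.
assert capital == 2 ** len(outcomes) * seq_prob
print(float(capital))
```

Exact fractions are used so that the identity between capital and sequence probability holds without rounding error. Passing `prob_one=lambda past: Fraction(1, 2)` reproduces the 50/50 split, under which the capital stays at $1.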
How to design a gambling strategy?
• Strict Subjective Bayesian: determine subjective P*, and then play optimal P (we may have P ≠ P*)
• Imprecise: determine set 𝒫* and play “optimal” P
• Information Theorist: pick any gambling strategy which you think might gain you a lot. E.g. if you think frequency might converge to θ ≠ 0.5, you might play Laplace rule of succession...
  ...if your hypothesis about frequency is correct, you gain an exponential amount of money even if at the same time you think data are not Bernoulli (or not even stationary)

Starting Point
• Adopting a Bayesian predictive distribution like the Laplace Rule of Succession if you think data are not Bernoulli is o.k. (and I think, rational!) for some prediction tasks...
  – Sequential gambling, Data Compression
  ...but not for others:
  – 0/1-loss prediction (no fractional bets!) when you are only asked to predict X_i in the situation that X_{i−1} = 1
• I want to design a theory which can cope with such “partially useable” distributions

A Middle Ground between strict Bayes and imprecise probability
• Set of distributions 𝒫* has unique representative P̃, as in “objective Bayes”, fiducial inference, Maximum Entropy, data compression...
• One absolutely crucial difference: we restrict use of P̃ to a subset of all possible prediction tasks: we know in advance that P̃ should not be taken too seriously
• Provides unifying and demystifying view

Menu
1. The Setting
2. Definition 1, Example 1: Dilation
3. Definition 2, Example 1 cont.
4. Definition 3-4, Example 2: Calibration
5. Example 3: Fiducial Distributions
6. Dessert: Monty Hall Problem, Decision Safety

The Setting
• Let 𝒫* be a set of distributions on a space Ω, representing Decision-Maker (DM)’s uncertainty about a domain
• DM has to make predictions/assertions about some U (or a function thereof), upon observing V. Both U and V are RVs (random variables) on Ω, taking values in 𝒰 and 𝒱, resp.
• She does so using a pragmatic distribution P̃(U|V), defined as a conditional distribution of U given V, i.e. a function mapping each v ∈ 𝒱 to a distribution on 𝒰
• Whenever 𝒰 is finite, we think of P̃(U | V = v) as a column vector

The Setting
• A Bayesian would have a singleton 𝒫* = {P*} and could then set P̃(U|V) := P*(U|V)
• Note that P* is a distribution on Ω, inducing a joint distribution of (U, V), which in turn induces a conditional distribution of U given V, while P̃(U|V) is directly defined as a conditional (hence P̃ in picture to be taken with grain of salt)
The Setting
• We have to do something else – sometimes equivalent to conditioning on a special element of 𝒫*, sometimes really different...
• i.e. P̃(U|V) is really a probability update rule!!

First Definition: Weak Safety
• We say that P̃(U|V) is safe for ⟨U⟩ | ⟨V⟩ if for all P ∈ 𝒫*: E_P[U] = E_P[E_{P̃}[U | V]]
• i.e. we can expect our expectation of U to be “correct” (in a relative sense)
• we will usually want somewhat stronger versions of “safety”

First Example: Dilation
• Given: marginal probability of U. U may depend on V, but we have no idea how
• Task: predict U given V.
• Suppose we observe V = 0. Now conditional probability could be anything...
• Similarly if we observe V = 1

Dilation (Seidenfeld & Wasserman, ’93)
• Before observing V we had a precise probability; after, we only know it is in a large superset
  – extra information ⇒ less knowledge
• “no matter what you observe!”
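The dilation effect can be made concrete with a small numerical sketch. It assumes binary U and V, a hypothetical marginal P(U = 1) = 0.5, and a finite grid of conditional probabilities standing in for the full set of joint distributions; the function name `conditional_range` is illustrative only.

```python
from itertools import product


def conditional_range(p_u1, observed_v, grid_size=101):
    """Range of P(U=1 | V=observed_v) over joints on {0,1}^2 with P(U=1) = p_u1.

    The dependence of V on U is unknown, so it is swept over a grid:
    a = P(V=1 | U=1) and b = P(V=1 | U=0).
    """
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    values = []
    for a, b in product(grid, repeat=2):
        like_u1 = a if observed_v == 1 else 1 - a
        like_u0 = b if observed_v == 1 else 1 - b
        evidence = p_u1 * like_u1 + (1 - p_u1) * like_u0  # P(V = observed_v)
        if evidence > 0:  # conditioning is undefined on probability-zero events
            values.append(p_u1 * like_u1 / evidence)      # Bayes' rule
    return min(values), max(values)


print(conditional_range(0.5, observed_v=0))  # (0.0, 1.0)
print(conditional_range(0.5, observed_v=1))  # (0.0, 1.0)
```

Whichever value of V is observed, the set of possible conditional probabilities spans the whole unit interval, although the marginal was a single number.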
Ignoring instead of Dilating
• Pointwise conditioning gives dilation
• Instead we may decide to ignore V, i.e. act as if U and V are independent, and predict with the pragmatic distribution that equals the given marginal of U whatever the value of V
• Proposition: P̃(U|V) is safe for ⟨U⟩ | ⟨V⟩
• i.e. for all P ∈ 𝒫*: E_P[U] = E_P[E_{P̃}[U | V]]

First Example of “Safety”
• REALITY: U may be dependent on V
• PRAGMATICS: we nevertheless decide to predict U with a distribution that assumes U and V are independent
• Our predictions will be just as accurate as we would expect them to be if our pragmatic distribution P̃ were “correct”...
• ...as long as we use P̃ only for certain, not all prediction tasks...

Definition 2, Preparation
• We write Y ⇝ Z if there exists a function f such that f(Y) ≡ Z (“Y determines Z”)
• P̃(U|V) can be used to predict not just U, but also any U′ determined by (U, V), i.e. with (U, V) ⇝ U′

Definition 2
• Recall: P̃(U|V) is safe for ⟨U′⟩ | ⟨V⟩ if (U, V) ⇝ U′ and for all P ∈ 𝒫*: E_P[U′] = E_P[E_{P̃}[U′ | V]]
• We say that P̃(U|V) is safe for U | ⟨V⟩ if for all U′ with U ⇝ U′, all P ∈ 𝒫*: E_P[U′] = E_P[E_{P̃}[U′ | V]]

Example 1(b) - dilation again
• Task: predict U given V.
• Again we decide to ignore V and set e.g. for all v:
• Then P̃ is safe for ⟨U⟩ | ⟨V⟩ but not for U | ⟨V⟩
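The proposition about ignoring V can be checked numerically. The sketch below assumes that weak safety means E_P[U] = E_P[E_{P̃}[U | V]] for every P in the set, which is our reading of the "expectation of our expectation" gloss; U and V are taken to be binary, the marginal 0.3 is a hypothetical value, and a finite grid stands in for the full set of distributions. The names `joints_with_marginal` and `is_weakly_safe` are illustrative.

```python
from itertools import product


def joints_with_marginal(p_u1, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Yield joint pmfs {(u, v): prob} on {0,1}^2 with P(U=1) = p_u1.

    Parametrised by a = P(V=1 | U=1) and b = P(V=1 | U=0).
    """
    for a, b in product(grid, repeat=2):
        yield {
            (1, 1): p_u1 * a,
            (1, 0): p_u1 * (1 - a),
            (0, 1): (1 - p_u1) * b,
            (0, 0): (1 - p_u1) * (1 - b),
        }


def is_weakly_safe(pragmatic_p1, joints, tol=1e-12):
    """Check E_P[U] == E_P[E_pragmatic[U | V]] for every joint P.

    pragmatic_p1[v] is the pragmatic probability of U=1 given V=v; since U
    is binary, this is also the pragmatic conditional expectation of U.
    """
    for joint in joints:
        true_mean = sum(prob * u for (u, v), prob in joint.items())
        # Average the pragmatic conditional mean over P's marginal of V.
        predicted_mean = sum(prob * pragmatic_p1[v] for (u, v), prob in joint.items())
        if abs(true_mean - predicted_mean) > tol:
            return False
    return True


P_U1 = 0.3                           # hypothetical known marginal P(U=1)
ignore_v = {0: P_U1, 1: P_U1}        # act as if U and V were independent
depends_on_v = {0: 0.2, 1: 0.8}      # an arbitrary V-dependent alternative

print(is_weakly_safe(ignore_v, joints_with_marginal(P_U1)))      # True
print(is_weakly_safe(depends_on_v, joints_with_marginal(P_U1)))  # False
```

The V-dependent alternative fails because its average prediction depends on the marginal of V, which differs between the distributions in the set, whereas the marginal of U is the same under all of them.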
Example 1(c)
• Task: predict U given V.
• Again we decide to ignore V and set e.g. for all v:
• Then, again, P̃ is safe for ⟨U⟩ | ⟨V⟩ but not for U | ⟨V⟩

Example 1(c): use the marginal
• Task: predict U given V.
• Again we decide to ignore V and set e.g. for all v:
• Then P̃ is safe for ⟨U⟩ | ⟨V⟩ and also for U | ⟨V⟩

Definition 3, Preparation
• Recall: P̃(U|V) is safe for ⟨U′⟩ | ⟨V⟩ if (U, V) ⇝ U′ and for all P ∈ 𝒫*:
• Leave out the “(U, V) ⇝ U′” part from now on, for brevity

Definition 3, 3b
• Recall: P̃(U|V) is safe for ⟨U′⟩ | ⟨V⟩ if for all P ∈ 𝒫*:
• We say that P̃(U|V) is safe for ⟨U′⟩ | [V] if for all P ∈ 𝒫*:
  – Our expectation of U′ is (relatively) correct
  – i.e. P* is unique and P̃ is almost surely “correct”
• We say that P̃(U|V) is safe for U′ | [V] if for all P ∈ 𝒫*: