Analogies and Theories in Belief Formation
Itzhak Gilboa – Tel Aviv University and HEC, Paris
ISIPTA 2015
Joint works of subsets of A. Billot, G. Gayer, I. Gilboa, O. Lieberman, A. Postlewaite, D. Samet, L. Samuelson, D. Schmeidler

Background
• Classics:
– Ramsey (1926), de Finetti (1931, 1937)
– von Neumann–Morgenstern (1944)
– Savage (1954)
– Anscombe–Aumann (1963)
• Problems:
– Descriptive
– Normative

Background – cont.
• Alternative theories
– Schmeidler (1989): Choquet EU
– Gilboa–Schmeidler (1989): Maxmin EU
– Klibanoff, Marinacci, Mukerji (2005) (Nau, Seo, …): "Smooth Model"
– Maccheroni, Marinacci, Rustichini (2006): "Variational Preferences"
• Still the "black box" paradigm

Background – cont.
• Case-Based Decision Theory
– (w/ Schmeidler, A Theory of Case-Based Decisions, CUP, 2001)
• Probabilities from cases
– (w/ Schmeidler and others, Case-Based Predictions, World Scientific, 2012)
• Analogies and Theories
– (w/ Samuelson, Schmeidler and others, Analogies and Theories, OUP, 2015)

Statistics and Psychology
• This project touches on both
• And we found ourselves axiomatizing known formulae
• Surprisingly, known in both domains
– Which goes beyond this project
– Sometimes, even the mistakes

Probabilities from Cases: Similarity-Weighted Frequencies
The data: $(x_i^1, \dots, x_i^m, y_i)$ for $i = 1, \dots, n$,
where $x_i^1, \dots, x_i^m \in \mathbb{R}$ and $y_i \in \{0, 1\}$.
We are asked about the probability that $y_p = 1$ for a new data point $(x_p^1, \dots, x_p^m)$.

Similarity-weighted frequencies – Formula (Kernel)
Choose a similarity function $s : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_{++}$.
Given observations $(x_i^1, \dots, x_i^m, y_i)_{i=1}^n$ and a new data point $x_p = (x_p^1, \dots, x_p^m)$, estimate
$$y_p^s = P(y_p = 1) = \frac{\sum_{i \le n} s(x_i, x_p)\, y_i}{\sum_{i \le n} s(x_i, x_p)}$$

Similarity-weighted frequencies – Interpretation
• Special cases of $y_p^s = \frac{\sum_{i \le n} s(x_i, x_p)\, y_i}{\sum_{i \le n} s(x_i, x_p)}$:
– If $s$ is constant: an estimate of the expectation (in fact, "repeated experiment" is always a matter of subjective judgment of equal similarity)
– If $s(x_i, x_p) = 1_{\{x_i = x_p\}}$: an estimate of the conditional expectation
• Useful when precise updating leaves us with a sparse database
• Akin to interpolation
• But not to extrapolation!
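The two special cases above can be checked with a minimal implementation of the similarity-weighted frequency formula (the data and similarity functions below are illustrative):

```python
import numpy as np

def sw_prob(xs, ys, x_p, s):
    """Similarity-weighted frequency estimate:
    sum_i s(x_i, x_p) y_i / sum_i s(x_i, x_p)."""
    w = np.array([s(x, x_p) for x in xs], dtype=float)
    return float(w @ np.asarray(ys, dtype=float) / w.sum())

xs = [0.0, 0.0, 1.0, 1.0, 1.0]
ys = [1, 0, 1, 1, 0]
x_p = 1.0

# Constant s: the raw empirical frequency of y = 1 (3 of 5).
assert np.isclose(sw_prob(xs, ys, x_p, lambda x, y: 1.0), 3 / 5)

# Indicator s = 1{x_i = x_p}: the conditional frequency among
# observations with x_i = x_p (2 of 3) -- i.e., precise updating.
assert np.isclose(sw_prob(xs, ys, x_p, lambda x, y: float(x == y)), 2 / 3)
```

With a smooth similarity, e.g. $s(x, x_p) = e^{-|x - x_p|}$, the estimate interpolates between these two extremes, which is what makes the formula useful on sparse databases.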

Axiomatization – Setup
Finitely many case types: a set $M$ of possible observations $(x^1, \dots, x^m, y)$.
A database is a multi-set of observations, i.e., a counting function $I : M \to \mathbb{Z}_+$.
We will refer to a database as a sequence or a multi-set interchangeably.

Axiomatization I: Observables
• A state space $\Omega = \{1, \dots, s\}$
• Fix a new data point $x_p = (x_p^1, \dots, x_p^m) \in \mathbb{R}^m$
• Databases $I : M \to \mathbb{Z}_+$
• A probability assignment function $p : \mathbb{I} \to \Delta(\Omega)$, where $\mathbb{I} = \{I : M \to \mathbb{Z}_+ \mid I \neq 0\}$

The combination axiom
[Figure: a database $I + J$, listing the counts of each case type, is split into databases $I$ and $J$; the probability assignments lie in the simplex $\Delta(\Omega)$, $\Omega = \{1, 2, 3, \dots, s\}$, with $p(I + J)$ on the segment between $p(I)$ and $p(J)$.]

The combination axiom
• Formally:
$$p(I + J) = \lambda\, p(I) + (1 - \lambda)\, p(J)$$
for some $0 \le \lambda \le 1$
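The similarity-weighted frequency formula satisfies this axiom: concatenating two databases yields a convex combination of their estimates, with $\lambda$ equal to $I$'s share of the total similarity mass. A quick numerical check (the similarity function and data are illustrative):

```python
import numpy as np

def sw_estimate(xs, ys, x_p, s):
    """Similarity-weighted frequency estimate of P(y_p = 1)."""
    w = np.array([s(x, x_p) for x in xs], dtype=float)
    return float(w @ np.asarray(ys, dtype=float) / w.sum())

s = lambda x, y: np.exp(-abs(x - y))     # illustrative similarity
x_p = 0.5
I_x, I_y = [0.0, 1.0, 0.2], [1, 0, 1]    # database I
J_x, J_y = [0.6, 2.0], [0, 1]            # database J

p_I = sw_estimate(I_x, I_y, x_p, s)
p_J = sw_estimate(J_x, J_y, x_p, s)
p_IJ = sw_estimate(I_x + J_x, I_y + J_y, x_p, s)   # database I + J

# lambda is database I's share of the total similarity mass
S_I = sum(s(x, x_p) for x in I_x)
S_J = sum(s(x, x_p) for x in J_x)
lam = S_I / (S_I + S_J)

assert np.isclose(p_IJ, lam * p_I + (1 - lam) * p_J)
```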

Theorem I
• The combination axiom holds, and not all $p(I)$, $I \in \mathbb{I}$, are collinear, if and only if
• For each $c \in M$ there are $p^c \in \Delta(\Omega)$, not all collinear, and $s_c > 0$, such that
$$p(I) = \frac{\sum_{c \in M} I(c)\, s_c\, p^c}{\sum_{c \in M} I(c)\, s_c}$$
– In "Probabilities as Similarity-Weighted Frequencies", w/ Billot, Samet, Schmeidler

The perspective

Probability = Frequency in perspective
[Figure: the frequency vector of cases $F = (F_1, F_2, F_3)$ is mapped to a probability of states $p(F) = p(I) \in \Delta(\Omega)$, with each case type $c$ contributing a point $s_c\, p^c$.]

Theorem II
Some axioms hold iff there exists a function $s : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_{++}$ such that, for every database $I = ((x_i^1, \dots, x_i^m, y_i))_{i=1}^n$ with $x_i = (x_i^1, \dots, x_i^m)$, values are ranked by their proximity to
$$y_p^s = \frac{\sum_{i \le n} s(x_i, x_p)\, y_i}{\sum_{i \le n} s(x_i, x_p)}$$
The function $s$ is unique up to multiplication by a positive number.
• In "Empirical Similarity", w/ Lieberman and Schmeidler

Theorem III
Some additional axioms hold iff there exists a norm $n : \mathbb{R}^m \to \mathbb{R}_+$ such that
$$s(x, z) = e^{-n(x - z)}$$
• Satisfies "multiplicative transitivity": $s(x, z) = s(x, y)\, s(y, z)$ (for $y$ on the segment between $x$ and $z$)
• In "Exponential Similarity", w/ Billot and Schmeidler

The Similarity – whence?
– In "Empirical Similarity" w/ Lieberman and Schmeidler we propose an empirical approach:
– Estimate the similarity function from the data
– A parametrized approach: consider a certain functional form
– Choose a criterion to measure goodness of fit
– Find the best parameters

A functional form
• Consider a weighted Euclidean distance
$$d_w(x_i, x_t) = \left[ \sum_{j=1}^m w_j (x_{ij} - x_{tj})^2 \right]^{1/2}$$
and
$$s_w(x_i, x_t) = e^{-d_w(x_i, x_t)}$$
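A direct transcription of this functional form, which also illustrates Theorem III's multiplicative transitivity along a line segment (the weights and points are illustrative):

```python
import numpy as np

def d_w(x_i, x_t, w):
    """Weighted Euclidean distance d_w(x_i, x_t)."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_t, dtype=float)
    return float(np.sqrt(np.asarray(w) @ diff**2))

def s_w(x_i, x_t, w):
    """Exponential similarity s_w = exp(-d_w)."""
    return float(np.exp(-d_w(x_i, x_t, w)))

w = [2.0, 0.5]                            # illustrative weights
x, y, z = [0.0, 0.0], [0.5, 0.5], [1.0, 1.0]

# y is the midpoint of [x, z], so the distances add up and the
# similarities multiply: s(x, z) = s(x, y) * s(y, z).
assert np.isclose(s_w(x, z, w), s_w(x, y, w) * s_w(y, z, w))
```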

Selection criteria
• Find weights that minimize $\sum_i (y_i - \hat{y}_i)^2$
• Or: round off $\hat{y}_i$ to get a prediction $\hat{p}_i \in \{0, 1\}$
– and then minimize the number of prediction errors
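A sketch of the first criterion, with $\hat{y}_i$ computed leave-one-out so that observation $i$ does not predict itself (the data and the crude grid search are illustrative stand-ins for the papers' estimation procedure):

```python
import numpy as np
from itertools import product

def loo_sq_error(X, y, w):
    """Sum of squared leave-one-out errors of the similarity-weighted
    estimator with exponential similarity and weights w."""
    X, y, w = np.asarray(X, float), np.asarray(y, float), np.asarray(w, float)
    err = 0.0
    for i in range(len(y)):
        s = np.exp(-np.sqrt((X - X[i])**2 @ w))  # similarity to x_i
        s[i] = 0.0                               # leave observation i out
        err += (y[i] - s @ y / s.sum())**2
    return err

# Illustrative data: y depends only on the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] > 0).astype(float)

# Crude grid search for the best weight vector.
grid = [0.0, 0.5, 1.0, 2.0, 4.0]
best_w = min(product(grid, grid), key=lambda w: loo_sq_error(X, y, w))
```

One would expect the fitted weights to load on the informative first coordinate and to discount the second.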

How objective is it?
• Modeling choices that can affect the "probability":
– Choice of X's and of sample
– Choice of functional form
– Choice of goodness-of-fit criterion
• As usual, objectivity may be an unattainable ideal
• But that doesn't mean we shouldn't try.

Statistical inference
– In "Empirical Similarity" w/ Lieberman and Schmeidler we also develop statistical inference tools for our estimation procedure
– Assume that the data were generated by a DGP of the type
$$P(Y_t = 1) = \frac{\sum_{i < t} s(X_i, X_t)\, Y_i}{\sum_{i < t} s(X_i, X_t)}$$
– Estimate the similarity function from the data
– Perform statistical inference

Statistical inference – cont.
• Estimate the weights $w_j$ by maximum likelihood
• Test hypotheses of the form $H_0 : w_j = 0$
• Predict out-of-sample by the maximum likelihood estimators (via the similarity-weighted average formula)
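Under the sequential DGP above, each $Y_t$ is Bernoulli with parameter equal to the similarity-weighted average of past outcomes, so the weights can be estimated by maximizing the log-likelihood numerically. A sketch (the data are illustrative, and scipy's generic Nelder–Mead optimizer stands in for the papers' estimation procedure):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(w, X, Y, t0=5):
    """Negative log-likelihood of weights w under the sequential DGP
    P(Y_t = 1) = sum_{i<t} s_w(X_i, X_t) Y_i / sum_{i<t} s_w(X_i, X_t),
    with s_w(x, x') = exp(-sqrt(sum_j w_j (x_j - x'_j)^2))."""
    w = np.abs(w)                       # keep the weights nonnegative
    nll = 0.0
    for t in range(t0, len(Y)):         # condition on the first t0 obs.
        s = np.exp(-np.sqrt((X[:t] - X[t])**2 @ w))
        p = np.clip(s @ Y[:t] / s.sum(), 1e-10, 1 - 1e-10)
        nll -= Y[t] * np.log(p) + (1 - Y[t]) * np.log(1 - p)
    return nll

# Illustrative data: Y is driven by the first coordinate of X.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 2))
Y = (X[:, 0] + 0.1 * rng.normal(size=80) > 0).astype(float)

res = minimize(neg_log_lik, x0=np.ones(2), args=(X, Y), method="Nelder-Mead")
w_hat = np.abs(res.x)                   # maximum likelihood estimates
```

A hypothesis $H_0 : w_j = 0$ could then be tested, e.g., by a likelihood-ratio statistic comparing restricted and unrestricted maxima (the papers develop the appropriate asymptotics).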

Failures of the combination axiom
• Integration of induction and deduction
– Learning the parameter of a coin
– Linear regression
• Limited to case-to-case induction, generalizing empirical frequencies

Failures of the combination axiom – cont.
• Second-order induction
– Learning the similarity function
• In particular, doesn't allow the similarity function to become more concentrated for large databases
• Combination is restricted to periods of "no learning".

Combining Theories and Analogies

Learning in the Model

Modes of Reasoning

Dynamics of Reasoning
• Under mild assumptions, namely that
– The reasoner doesn't know the nature of the process
– The reasoner is "open-minded"
• The reasoner converges away from Bayesian reasoning

Example

Example – cont.
