Belief Formation
Itzhak Gilboa, Tel Aviv University and HEC, Paris


Analogies and Theories in Belief Formation. Itzhak Gilboa, Tel Aviv University and HEC, Paris. ISIPTA 2015. Joint works of subsets of A. Billot, G. Gayer, I. Gilboa, O. Lieberman, A. Postlewaite, D. Samet, L. Samuelson, D. Schmeidler.


slide-1
SLIDE 1

1

Analogies and Theories in Belief Formation

Itzhak Gilboa – Tel Aviv University and HEC, Paris
ISIPTA 2015

Joint works of subsets of

  • A. Billot, G. Gayer, I. Gilboa, O. Lieberman, A. Postlewaite, D. Samet, L. Samuelson, D. Schmeidler

slide-2
SLIDE 2

2

Background

  • Classics:

– Ramsey (1926), de Finetti (1931, 1937)
– von Neumann-Morgenstern (1944)
– Savage (1954)
– Anscombe-Aumann (1963)

  • Problems:

– Descriptive
– Normative

slide-3
SLIDE 3

3

Background – cont.

  • Alternative theories

– Schmeidler (1989): Choquet EU
– Gilboa-Schmeidler (1989): Maxmin EU
– Klibanoff, Marinacci, Mukerji (2005) (Nau, Seo…): “Smooth Model”
– Maccheroni, Marinacci, Rustichini (2006): “Variational Preferences”

  • Still the “black box” paradigm
slide-4
SLIDE 4

4

Background – cont.

  • Case-Based Decision Theory

– (w/ Schmeidler, A Theory of Case-Based Decisions, CUP 2001)

  • Probabilities from cases

– (w/ Schmeidler and others, Case-Based Prediction, World Scientific 2012)

  • Analogies and Theories

– (w/ Samuelson, Schmeidler and others, Analogies and Theories, OUP, 2015)

slide-5
SLIDE 5

5

Statistics and Psychology

  • This project touches on both
  • And we found ourselves axiomatizing known formulae
  • Surprisingly, known in both domains

– Which goes beyond this project
– Sometimes, even the mistakes

slide-6
SLIDE 6

6

Probabilities from Cases: Similarity-weighted frequencies

The data: $(x_i^1, \dots, x_i^m, y_i)_{i \le n}$, where $x_i = (x_i^1, \dots, x_i^m) \in \mathbb{R}^m$ and $y_i \in \{0, 1\}$.

We are asked about the probability that $y_p = 1$ for a new data point $x_p = (x_p^1, \dots, x_p^m)$.

slide-7
SLIDE 7

7

Similarity-weighted frequencies – Formula (Kernel)

Choose a similarity function $s : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_{++}$.

Given observations $(x_i^1, \dots, x_i^m, y_i)_{i \le n}$ and a new data point $x_p = (x_p^1, \dots, x_p^m)$, estimate $P(y_p = 1)$ by

$$\hat{y}_p^s = \frac{\sum_{i \le n} s(x_i, x_p)\, y_i}{\sum_{i \le n} s(x_i, x_p)}$$

slide-8
SLIDE 8

8

Similarity-weighted frequencies – Interpretation

  • Special cases of

$$\hat{y}_p^s = \frac{\sum_{i \le n} s(x_i, x_p)\, y_i}{\sum_{i \le n} s(x_i, x_p)}$$

– If $s$ is constant: an estimate of the expectation (in fact, “repeated experiment” is always a matter of subjective judgment of equal similarity)
– If $s(x_i, x_p) = 1_{\{x_i = x_p\}}$: an estimate of the conditional expectation

  • Useful when precise updating leaves us with a sparse database
  • Akin to interpolation
  • But not to extrapolation!

slide-9
SLIDE 9

9

Axiomatization – Setup

  • Observations (case types): a set $M$, with elements of the form $(x^1, \dots, x^m, y)$

A database is a multi-set of observations, $I : M \to \mathbb{Z}_+$.

We will refer to a database as a sequence or a multi-set interchangeably.

slide-10
SLIDE 10

10

Axiomatization I: Observables

  • A state space $\Omega = \{1, \dots, s\}$
  • Fix a new data point $x_p = (x_p^1, \dots, x_p^m) \in \mathbb{R}^m$
  • Databases $I : M \to \mathbb{Z}_+$
  • A probability assignment function $p$, mapping each database $I$ to $p(I) \in \Delta(\Omega)$

slide-11
SLIDE 11

11

The combination axiom

[Figure: databases I = (4, 6, …, 8), J = (5, 12, …, 3) and I + J = (9, 18, …, 11), each a vector of counts over the case types M = {1, 2, …, m}, are mapped to the points p(I), p(J) and p(I + J) in the simplex Δ(Ω) over the states of the world Ω = {1, 2, 3, …, s}.]

slide-12
SLIDE 12

12

The combination axiom

  • Formally

$$p(I + J) = \lambda\, p(I) + (1 - \lambda)\, p(J)$$

for some $\lambda \in (0, 1)$

slide-13
SLIDE 13

13

Theorem I

  • The combination axiom holds, and not all $(p(I))_I$ are collinear

if and only if

  • For each $c \in M$ there are $p_c \in \Delta(\Omega)$, not all collinear, and $s_c > 0$ such that

$$p(I) = \frac{\sum_{c \in M} I(c)\, s_c\, p_c}{\sum_{c \in M} I(c)\, s_c}$$

– In “Probabilities as Similarity-Weighted Frequencies” w/ Billot, Samet, Schmeidler
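
A small Python check of the "if" direction, assuming the representation p(I) = Σ_c I(c) s_c p_c / Σ_c I(c) s_c. The case labels, similarity weights and case-specific distributions below are made-up illustrations; only the counts in the two databases echo the figure. The sketch shows that p(I + J) is a convex combination of p(I) and p(J), with weight λ equal to the share of similarity-weighted mass contributed by I.

```python
def p_of(database, s, p):
    """p(I) = sum_c I(c) s_c p_c / sum_c I(c) s_c, for counts I(c)."""
    mass = sum(n * s[c] for c, n in database.items())
    n_states = len(next(iter(p.values())))
    return [
        sum(n * s[c] * p[c][w] for c, n in database.items()) / mass
        for w in range(n_states)
    ]


# Hypothetical case types "a", "b", "c" with weights s_c and distributions p_c
# over three states (the p_c are not collinear).
s = {"a": 1.0, "b": 2.0, "c": 0.5}
p = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.2, 0.3, 0.5]}

# Two databases as counts per case type, and their union as a multi-set.
db_i = {"a": 4, "b": 6, "c": 8}
db_j = {"a": 5, "b": 12, "c": 3}
db_sum = {c: db_i[c] + db_j[c] for c in db_i}

# lambda is the share of total similarity-weighted mass coming from db_i.
mass_i = sum(n * s[c] for c, n in db_i.items())
mass_j = sum(n * s[c] for c, n in db_j.items())
lam = mass_i / (mass_i + mass_j)

lhs = p_of(db_sum, s, p)
rhs = [lam * a + (1 - lam) * b for a, b in zip(p_of(db_i, s, p), p_of(db_j, s, p))]
```

Here `lhs` and `rhs` agree up to floating-point error, and `lam` lies strictly between 0 and 1 because all counts and weights are positive.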

slide-14
SLIDE 14

14

The perspective

slide-15
SLIDE 15

15

Probability = Frequency in perspective

[Figure: the frequency of cases in database I, F = (F1, F2, F3), is related to a probability of states in Δ(Ω). Each case has its own point (p1, p2, p3 for cases 1, 2, 3), scaled by its similarity weight to s1p1, s2p2, s3p3. The combination F1s1p1 + F2s2p2 + F3s3p3 gives p(F) = p(I).]

slide-16
SLIDE 16

16

Theorem II

Some axioms hold iff there exists a function $s : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}_{++}$ such that $\succsim_I$ ranks values by their proximity to

$$\hat{y}_p^s = \frac{\sum_{i \le n} s(x_i, x_p)\, y_i}{\sum_{i \le n} s(x_i, x_p)}$$

where $I = (x_i^1, \dots, x_i^m, y_i)_{i=1}^n$ and $x_i = (x_i^1, \dots, x_i^m)$.

The function $s$ is unique up to multiplication by $\lambda > 0$.

  • In “Empirical Similarity” w/ Lieberman and Schmeidler

slide-17
SLIDE 17

17

Theorem III

Some additional axioms hold iff there exists a norm $n : \mathbb{R}^m \to \mathbb{R}_+$ such that

$$s(x, z) = e^{-n(x - z)}$$

  • Satisfies “multiplicative transitivity”: $s(x, z) \ge s(x, y)\, s(y, z)$
  • In “Exponential Similarity” w/ Billot and Schmeidler
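
Multiplicative transitivity follows from the triangle inequality: n(x − z) ≤ n(x − y) + n(y − z), so exp(−n(x − z)) ≥ exp(−n(x − y)) · exp(−n(y − z)). A quick numerical illustration in Python, assuming the L1 norm as one example of a norm and randomly drawn points (both are choices made here for illustration only):

```python
import itertools
import math
import random


def l1_norm(v):
    # One example of a norm on R^m; any other norm could be used instead.
    return sum(abs(a) for a in v)


def exp_similarity(x, z, norm=l1_norm):
    """s(x, z) = exp(-n(x - z)) for a norm n."""
    return math.exp(-norm([a - b for a, b in zip(x, z)]))


random.seed(0)
points = [[random.uniform(-2, 2) for _ in range(3)] for _ in range(20)]

# Count triples (x, y, z) that violate s(x, z) >= s(x, y) * s(y, z),
# allowing a tiny relative tolerance for floating-point rounding.
violations = sum(
    1
    for x, y, z in itertools.product(points, repeat=3)
    if exp_similarity(x, z) < exp_similarity(x, y) * exp_similarity(y, z) * (1 - 1e-12)
)
```

No violations are found, as the triangle inequality guarantees.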

slide-18
SLIDE 18

18

The Similarity – whence?

– In “Empirical Similarity” w/ Lieberman and Schmeidler we propose an empirical approach:
– Estimate the similarity function from the data
– A parametrized approach: Consider a certain functional form
– Choose a criterion to measure goodness of fit
– Find the best parameters

slide-19
SLIDE 19

19

A functional form

  • Consider a weighted Euclidean distance

$$d_w(x_i, x_t) = \sqrt{\sum_{j=1}^{m} w_j\, (x_{ij} - x_{tj})^2}$$

and

$$s_w(x_i, x_t) = e^{-d_w(x_i, x_t)}$$
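
A short Python sketch of this functional form, assuming the distance is the square root of the weighted sum of squared differences and the similarity is exp(−d_w). The function names and the sample points are illustrative.

```python
import math


def weighted_distance(x_i, x_t, w):
    """d_w(x_i, x_t) = sqrt(sum_j w_j * (x_ij - x_tj)^2)."""
    return math.sqrt(sum(w_j * (a - b) ** 2 for w_j, a, b in zip(w, x_i, x_t)))


def similarity(x_i, x_t, w):
    """s_w(x_i, x_t) = exp(-d_w(x_i, x_t))."""
    return math.exp(-weighted_distance(x_i, x_t, w))


# A zero weight switches a variable off: the two points below differ
# only in the second coordinate, which carries weight 0.
w = [1.0, 0.0]
ignored_difference = similarity((0.0, 0.0), (0.0, 5.0), w)

# A difference of 3 in the first coordinate (weight 1) gives distance 3.
relevant_difference = similarity((0.0, 0.0), (3.0, 0.0), w)
```

The weights therefore say how much each variable matters for judging two cases as similar, which is what the estimation step tries to learn from the data.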

slide-20
SLIDE 20

20

Selection criteria

  • Find weights that would minimize

$$\sum_i (\hat{y}_i - y_i)^2$$

  • Or: round off $\hat{y}_i$ to get a prediction $y_i^p \in \{0, 1\}$

– and then minimize the resulting prediction error
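
A minimal Python sketch of both criteria. It makes several assumptions not stated on the slide: each fitted value is computed leave-one-out (from all other observations), the similarity is exp(−d_w) with a weighted Euclidean distance, and the weights are chosen by a coarse grid search rather than a proper optimizer. The data and grid are made up.

```python
import itertools
import math


def s_w(x_i, x_t, w):
    d = math.sqrt(sum(wj * (a - b) ** 2 for wj, a, b in zip(w, x_i, x_t)))
    return math.exp(-d)


def loo_predictions(xs, ys, w):
    """Similarity-weighted average for each point, using all the others."""
    preds = []
    for t, x_t in enumerate(xs):
        sims = [s_w(x_i, x_t, w) for i, x_i in enumerate(xs) if i != t]
        outs = [y for i, y in enumerate(ys) if i != t]
        preds.append(sum(s * y for s, y in zip(sims, outs)) / sum(sims))
    return preds


def sse(xs, ys, w):
    # First criterion: sum of squared errors of the fitted values.
    return sum((p - y) ** 2 for p, y in zip(loo_predictions(xs, ys, w), ys))


def misclassifications(xs, ys, w):
    # Second criterion: round each fitted value to 0 or 1, count the errors.
    rounded = [1 if p >= 0.5 else 0 for p in loo_predictions(xs, ys, w)]
    return sum(r != y for r, y in zip(rounded, ys))


# Hypothetical data: the outcome depends on the first variable only.
xs = [(0.0, 1.0), (0.1, -2.0), (0.2, 3.0), (2.0, 1.5), (2.1, -1.0), (2.2, 2.5)]
ys = [0, 0, 0, 1, 1, 1]

grid = [0.0, 1.0, 4.0, 16.0]
best_w = min(itertools.product(grid, repeat=2), key=lambda w: sse(xs, ys, w))
```

The rounded criterion is a step function of the weights, so it is harder to minimize with gradient-based methods; the squared-error criterion is smooth.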

slide-21
SLIDE 21

21

How objective is it?

  • Modeling choices that can affect the “probability”:

– Choice of X’s and of sample
– Choice of functional form
– Choice of goodness of fit criterion

  • As usual, objectivity may be an unattainable ideal
  • But it doesn’t mean we shouldn’t try.
slide-22
SLIDE 22

22

Statistical inference

– In “Empirical Similarity” w/ Lieberman and Schmeidler we also develop statistical inference tools for our estimation procedure
– Assume that the data were generated by a data-generating process (DGP) of the type

$$P(Y_t = 1) = \frac{\sum_{i < t} s(X_i, X_t)\, Y_i}{\sum_{i < t} s(X_i, X_t)}$$

– Estimate the similarity function from the data
– Perform statistical inference

slide-23
SLIDE 23

23

Statistical inference – cont.

  • Estimate the weights by maximum likelihood
  • Test hypotheses of the form $H_0 : w_j = 0$
  • Predict out-of-sample by the maximum likelihood estimators $\hat{w}_j$ (via the similarity-weighted average formula)
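
A rough Python sketch of the estimation step only (no hypothesis tests), under several assumptions made here: the process is P(Y_t = 1) = Σ_{i<t} s_w(X_i, X_t) Y_i / Σ_{i<t} s_w(X_i, X_t) with s_w = exp(−d_w); the first observation is conditioned on; probabilities are clipped away from 0 and 1 so the log-likelihood stays finite; and the maximization is a coarse grid search. None of these details is taken from the talk, and the data are made up.

```python
import math


def s_w(x_i, x_t, w):
    d = math.sqrt(sum(wj * (a - b) ** 2 for wj, a, b in zip(w, x_i, x_t)))
    return math.exp(-d)


def log_likelihood(xs, ys, w, eps=1e-6):
    """Bernoulli log-likelihood of y_2..y_n given the past (eps is an assumed clip)."""
    total = 0.0
    for t in range(1, len(xs)):
        sims = [s_w(xs[i], xs[t], w) for i in range(t)]
        p1 = sum(s * y for s, y in zip(sims, ys[:t])) / sum(sims)
        p1 = min(max(p1, eps), 1.0 - eps)
        total += math.log(p1 if ys[t] == 1 else 1.0 - p1)
    return total


# Hypothetical ordered sample with one variable: y tends to be 1 when x is near 2.
xs = [(0.0,), (2.0,), (0.1,), (2.1,), (0.2,), (1.9,), (0.0,), (2.0,)]
ys = [0, 1, 0, 1, 0, 1, 0, 1]

# Grid search for the single weight that maximizes the log-likelihood.
grid = [0.0, 1.0, 4.0, 16.0]
w_hat = max(((w,) for w in grid), key=lambda w: log_likelihood(xs, ys, w))

# Out-of-sample prediction for a new point via the similarity-weighted average.
x_new = (1.8,)
sims = [s_w(x, x_new, w_hat) for x in xs]
prediction = sum(s * y for s, y in zip(sims, ys)) / sum(sims)
```

In this toy sample the outcome is perfectly separated by x, so larger weights always fit better and the grid maximum sits at the edge of the grid; real data would generally give an interior estimate.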

slide-24
SLIDE 24

24

Failures of the combination axiom

  • Integration of induction and deduction

– Learning the parameter of a coin
– Linear regression

Limited to case-to-case induction, generalizing empirical frequencies

slide-25
SLIDE 25

25

Failures of the combination axiom – cont.

  • Second order induction

– Learning the similarity function

In particular, doesn’t allow the similarity function to get more concentrated for large databases

Combination restricted to periods of “no learning”.

slide-26
SLIDE 26

26

Combining Theories and Analogies

slide-27
SLIDE 27

27

Learning in the Model

slide-28
SLIDE 28

28

Modes of Reasoning

slide-29
SLIDE 29

29

Dynamics of Reasoning

  • Under mild assumptions that mean that

– The reasoner doesn’t know the nature of the process
– The reasoner is “open-minded”

  • The reasoner converges away from Bayesian reasoning

slide-30
SLIDE 30

30

Example

slide-31
SLIDE 31

31

Example – cont.