Statistical methods, Tomaž Podobnik, Oddelek za fiziko, FMF, UNI-LJ (PowerPoint PPT Presentation)

SLIDE 1

Statistical methods

Tomaž Podobnik, Oddelek za fiziko, FMF, UNI-LJ; Odsek za eksperimentalno fiziko delcev, IJS

SLIDE 2

I. Prologue:
  • 1. Literature,
  • 2. Basic Desiderata,
  • 3. Example.

4/15/2010 2

SLIDE 3

  • 1. Literature:
  • no single textbook covers all topics appropriately,
  • relevant references will be given for each of the topics separately.

SLIDE 4

  • 2. Basic Desiderata:

A. Every system must be (self-) consistent.
B. Every scientific (empirical) theory must be operational (i.e., it must specify operations that ensure falsifiability of its predictions).

  • K. Popper (1959), The Logic of Scientific Discovery, § 24, pp. 91-92.
  • J. Neyman (1937), Phil. Trans. R. Soc., A 236, 333-380.
  • G. Pólya (1954), Mathematics and Plausible Reasoning, Vol. 2 – Patterns of Plausible Inference, Chap. XIV, § 4, p. 64, and Chap. XV, § 4, p. 117.

SLIDE 5

Definition 1 (Inconsistency). A system is inconsistent iff, within the rules of the system, a conclusion can be reasoned out in more than one way, and not all the ways lead to the same result.

Definition 2 (Consistency). A system that is not inconsistent (i.e., a system without a single demonstrated inconsistency) is said to be consistent.

SLIDE 6

  • 3. Example 1. A decay of an atom (nucleus, particle) at time t1:

$$f_I(t_1\,|\,\tau) = \frac{1}{\tau}\exp\left\{-\frac{t_1}{\tau}\right\};\qquad \tau \in \mathbb{R}^{+},\qquad \int_0^{\infty} f_I(t_1\,|\,\tau)\,dt_1 = 1.$$

The probability of a decay within the time interval $A = (t_a, t_b)$:

$$P(A\,|\,\tau) = \int_{t_a}^{t_b} f_I(t\,|\,\tau)\,dt;\qquad P(\Omega\,|\,\tau) = 1.$$

[Figure: $f_I(t\,|\,\tau)$ vs. t, with the interval $(t_a, t_b)$ marked.]
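Example 1 can be checked numerically. Below is a minimal Python sketch (illustrative only, not part of the lecture; the function and variable names are mine): the exponential decay density, its normalization, and the interval probability $P(A\,|\,\tau)$ from the analytic cdf.

```python
import math

def decay_pdf(t, tau):
    """f_I(t|tau) = (1/tau) * exp(-t/tau), the exponential decay density."""
    return math.exp(-t / tau) / tau

def interval_prob(t_a, t_b, tau):
    """P(A|tau) for A = (t_a, t_b), using the analytic cdf F(t) = 1 - exp(-t/tau)."""
    F = lambda t: 1.0 - math.exp(-t / tau)
    return F(t_b) - F(t_a)

tau = 2.0
# Midpoint-rule check of the normalization integral over (0, 50*tau):
dt = 1e-3
numeric = sum(decay_pdf(k * dt + dt / 2, tau) * dt for k in range(int(50 * tau / dt)))
p = interval_prob(1.0, 3.0, tau)
```

With τ = 2, the interval probability comes out as e^(−0.5) − e^(−1.5), and the numerical normalization is 1 up to truncation error.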

SLIDE 7

Given t1, what is the value of the parameter $\tau$?

The Maximum Likelihood Method (MLM):

  • R. A. Fisher (1922), Phil. Trans. R. Soc., A 222, 309-368.

Likelihood function: $L(\tau; t_1) \equiv f_I(t_1\,|\,\tau)$.

"The principle of Maximum Likelihood states that, when confronted with a choice of $\tau$, we choose that one (if any) which maximizes L."

  • A. Stuart, J. K. Ord (1994), Kendall's Advanced Theory of Statistics, Vol. 1 – Distribution Theory, § 8.22, p. 300.
  • I. Kuščer, A. Kodre (1994), Matematika v fiziki in tehniki, § 11.8, p. 309.
  • R. Jamnik (1975), Verjetnostni račun in statistika, § 24, pp. 529-530. In I. Vidav (ed.), Višja matematika II.

SLIDE 8

General method:

$$\hat\tau:\quad \left.\frac{\partial}{\partial\tau}\ln L(\tau; t_1)\right|_{\tau = \hat\tau} = 0 \quad\Longrightarrow\quad \hat\tau = t_1.$$

What justifies the MLM?

SLIDE 9

(The fallacy of) The limiting property of $\hat\tau$:

Set of measurements $\{t_1, \ldots, t_n\}$: i.i.d.,

$$L(\tau; t_1, \ldots, t_n) = f_I(t_1, \ldots, t_n\,|\,\tau) = \prod_{i=1}^{n} f_I(t_i\,|\,\tau),$$

$$\hat\tau:\quad \left.\frac{\partial}{\partial\tau}\ln L(\tau; t_1, \ldots, t_n)\right|_{\tau = \hat\tau} = 0 \quad\Longrightarrow\quad \hat\tau = \frac{1}{n}\sum_{i=1}^{n} t_i,$$

$$\lim_{n\to\infty} f(\hat\tau\,|\,\tau) \sim N\!\left(\tau,\ \frac{\tau}{\sqrt{n}}\right).$$
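The limiting behaviour above can be illustrated with a seeded simulation (a sketch of my own, not from the slides): for i.i.d. exponential decay times the ML estimate is the sample mean, which concentrates around the true τ as n grows.

```python
import random

random.seed(42)

tau_true = 2.0
n = 100_000
# i.i.d. decay times t_i ~ Exp(tau_true); random.expovariate takes the rate 1/tau.
samples = [random.expovariate(1.0 / tau_true) for _ in range(n)]

# The ML estimate solves d/dtau ln L = 0, giving tau_hat = mean of the t_i.
tau_hat = sum(samples) / n
```

The standard deviation of the estimate is roughly τ/√n ≈ 0.006 here, so τ̂ lands very close to 2.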

SLIDE 10

The conditional probability fallacy:

$$f_I(t_1\,|\,\tau) \neq f_I(\tau\,|\,t_1).$$

Interested in: $f(\tau\,|\,t_1) = \max$; calculated: $f_I(t_1\,|\,\tau) = \max$ $\Rightarrow$ an implicit assumption. Note also that, as a function of the parameter,

$$\int_0^{\infty} f_I(t_1\,|\,\tau)\,d\tau = \infty.$$

Example (L. Lyons):
A: "The first person I'll meet is female."
B: "The first person I'll meet is pregnant."

$$P(A\,|\,B) \neq P(B\,|\,A).$$
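Lyons' example can be made concrete with a toy population (the counts below are entirely made up for illustration):

```python
# Toy population for L. Lyons' example:
# A = "person is female", B = "person is pregnant".
n_total = 1000
n_female = 500           # P(A) = 0.5
n_pregnant = 20          # assume every pregnant person is female, so A ∩ B = B

p_B_given_A = n_pregnant / n_female    # P(B|A) = 20/500 = 0.04
p_A_given_B = n_pregnant / n_pregnant  # P(A|B) = 1.0
```

The two conditional probabilities are wildly different, exactly as the slide warns.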

SLIDE 11

The fallacy of point estimates:

The parameter $\tau$ may take on every value in a continuum, and the measure of a single point $\hat\tau$ in the continuum is 0. For verifiable predictions we must turn to interval estimations.

"Ignorance is preferable to error; and he is less remote from the truth who believes nothing, than he who believes what is wrong." (T. Jefferson (1781), Notes on Virginia.)

SLIDE 12

Contents:

  • I. Prologue,
  • II. Mathematical Preliminaries,
  • III. Frequency Interpretation of Probability Distributions,
  • IV. Confidence Intervals,
  • V. Testing of Hypotheses,
  • VI. Inverse Probability Distributions,
  • VII. Interpretation of Inverse Probability Distributions,
  • VIII. Time Series and Dynamical Models.

SLIDE 13

  • II. Mathematical Preliminaries:
  • 1. Motivation,
  • 2. Probability spaces,
  • 3. Conditional probabilities,
  • 4. Random variables,
  • 5. Probability distributions,
  • 6. Transformations of probability distributions,
  • 7. Conditional distributions,
  • 8. Parametric families of (direct) probability distributions,
  • 9. The Central Limit Theorem,
  • 10. Invariant parametric families.

SLIDE 14

  • 1. Motivation:
  • (self-) consistency,
  • "Jede axiomatische (abstrakte) Theorie läßt bekanntlich unbegrenzt viele konkrete Interpretationen zu." ["Every axiomatic (abstract) theory admits, as is well known, an unlimited number of concrete interpretations."] (A. N. Kolmogorov (1933), Grundbegriffe der Wahrscheinlichkeitsrechnung, Chap. I, p. 1.)

  • A. Stuart, J. K. Ord (1994), Kendall's Advanced Theory of Statistics, Vol. 1 – Distribution Theory.
  • M. M. Rao (1993), Conditional Measures and Applications.

SLIDE 15

  • 2. Probability spaces:

$\Omega$ = universal set (non-empty, abstract); $\Sigma = \{A, B, C, \ldots\}$, $A, B, C, \ldots \subseteq \Omega$.

Definition 3 (σ-algebra). $\Sigma$ is called a σ-algebra (σ-field) on $\Omega$ iff:
i) $\Omega \in \Sigma$;
ii) $A \in \Sigma \Rightarrow \Omega \setminus A \in \Sigma$;
iii) $A_i \in \Sigma \Rightarrow \bigcup_{i=1}^{\infty} A_i \in \Sigma$.

Definition 4 (Measurable space). An ordered pair $(\Omega, \Sigma)$, consisting of a universal set $\Omega$ and a σ-algebra $\Sigma$ on $\Omega$, is called a measurable space.

SLIDE 16

Example 2 (Measurable space). $(\Omega, \Sigma)$, where $\Sigma = \{\Omega, \emptyset\}$.
Example 3 (Measurable space). $(\mathbb{R}^n, \mathcal{B}^n)$, where $\mathcal{B}^n$ is the Borel σ-algebra (the smallest σ-algebra containing all n-dimensional open rectangles).

Definition 5 (Probability measure). A real-valued function P on a σ-algebra $\Sigma$ is called a probability measure iff:
i) $P(A) \geq 0$, $A \in \Sigma$;
ii) $P(\Omega) = 1$;
iii) $P\!\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$ for $A_i \cap A_j = \emptyset$, $i \neq j$.

Definition 6 (Probability space). An ordered triple $(\Omega, \Sigma, P)$, consisting of a universal set $\Omega$, a σ-algebra $\Sigma$ on $\Omega$, and a probability (measure) P on $\Sigma$, is called a probability space.

SLIDE 17

  • 3. Conditional probability:

Definition 7 (Conditional probability). Consider a probability space $(\Omega, \Sigma, P)$ and sets $A, B \in \Sigma$, $P(B) > 0$. Then, the conditional probability $P(A\,|\,B)$ is defined as

$$P(A\,|\,B) \equiv \frac{P(A \cap B)}{P(B)}.$$

Definition 8 (Independent sets). Consider a probability space $(\Omega, \Sigma, P)$ and sets $A, B \in \Sigma$. The sets are (P-) independent iff

$$P(A \cap B) = P(A)\,P(B).$$

Definition 9 (Mutually exclusive and exhaustive sets). Consider a space $(\Omega, \Sigma, P)$ and a set $\{B_1, \ldots, B_n\}$, $B_i \in \Sigma$ and $P(B_i) > 0$. When

$$B_i \cap B_j = \emptyset \ (i \neq j) \quad\text{and}\quad \bigcup_{i=1}^{n} B_i = \Omega,$$

the sets $B_i$ are called mutually exclusive and ($\Omega$-) exhaustive.

SLIDE 18

Proposition 1 (Law of Total Probability, LTP). Let $(\Omega, \Sigma, P)$ be a probability space and $\{B_j\}$ mutually exclusive and $\Omega$-exhaustive sets. Then $P(A)$, $A \in \Sigma$, decomposes as

$$P(A) = \sum_{j=1}^{n} P(B_j)\,P(A\,|\,B_j).$$

Proof.

$$P(A) = P(A \cap \Omega) = P\!\left(A \cap \bigcup_{j=1}^{n} B_j\right) = P\!\left(\bigcup_{j=1}^{n} (A \cap B_j)\right) = \sum_{j=1}^{n} P(A \cap B_j) = \sum_{j=1}^{n} P(B_j)\,P(A\,|\,B_j).\ \square$$

Theorem 1 (Bayes). Let $(\Omega, \Sigma, P)$ be a probability space, $\{B_j\}$ mutually exclusive and $\Omega$-exhaustive sets, and $P(A) > 0$, $A \in \Sigma$. Then,

$$P(B_i\,|\,A) = \frac{P(B_i)\,P(A\,|\,B_i)}{\sum_{j=1}^{n} P(B_j)\,P(A\,|\,B_j)}.$$

Proof. $P(B_i)\,P(A\,|\,B_i) = P(A \cap B_i) = P(A)\,P(B_i\,|\,A)$ follows from Def. 7, whereas $P(A)$ decomposes according to the LTP. $\square$
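The LTP and Bayes' theorem can be traced numerically. A small sketch with two made-up hypotheses (the priors and likelihoods below are hypothetical numbers, chosen only for illustration):

```python
# Two mutually exclusive, exhaustive hypotheses B_1, B_2
# with priors P(B_j) and likelihoods P(A|B_j).
priors = [0.3, 0.7]
likelihoods = [0.9, 0.2]

# Law of Total Probability: P(A) = sum_j P(B_j) * P(A|B_j)
p_A = sum(p * l for p, l in zip(priors, likelihoods))

# Bayes: P(B_i|A) = P(B_i) * P(A|B_i) / P(A)
posteriors = [p * l / p_A for p, l in zip(priors, likelihoods)]
```

As required, the posteriors sum to one because P(A) is exactly the LTP normalization.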

SLIDE 19

  • 4. Random variables:

Definition 10 (Scalar random variable). Given a probability space $(\Omega, \Sigma, P)$, let a function $X: \Omega \to \mathbb{R}$ be $\Sigma$-measurable, i.e.

$$A_{X \le x} \equiv \{\omega \in \Omega : X(\omega) \le x\} \in \Sigma;\qquad x \in \mathbb{R}.$$

Then, X is called a (real-valued, scalar) random variable, while x is called a realization of X.

Definition 11 (Random vector). Given a probability space $(\Omega, \Sigma, P)$, let a function $\mathbf{X}: \Omega \to \mathbb{R}^n$ be $\Sigma$-measurable,

$$A_{\mathbf{X} \le \mathbf{x}} \equiv \{\omega \in \Omega : \mathbf{X}(\omega) \le \mathbf{x}\} \in \Sigma;\qquad \mathbf{x} \in \mathbb{R}^n.$$

Then, $\mathbf{X}$ is called a (real-valued) random vector, while $\mathbf{x}$ is called a realization of $\mathbf{X}$.

Remark 1. The notion of chance ('physical randomness') is avoided in Defs. 10 and 11.

SLIDE 20

  • 5. Probability distributions:

Definition 12 (Probability distribution). A function $\Pr: \mathcal{B}^1 \to [0,1]$, called a probability distribution, is defined as the image measure of P by the random variable X, such that

$$\Pr(S^1) = P\bigl[X^{-1}(S^1)\bigr] \equiv P \circ X^{-1}(S^1);\qquad S^1 \in \mathcal{B}^1,\quad X^{-1}(S^1) \in \Sigma,$$

i.e. $(\Omega, \Sigma, P) \xrightarrow{X} (\mathbb{R}, \mathcal{B}^1, \Pr)$.

Remark 2. Probability vs. Quantum Mechanics:

$$\hat{X}\,|x\rangle = x\,|x\rangle;\qquad \psi(x) \equiv \langle x\,|\,\psi\rangle,\qquad \langle\psi\,|\,\psi\rangle = \int \psi^{*}(x)\,\psi(x)\,dx = 1;\qquad \langle\Psi\,|\,\Phi\rangle, \ldots, \langle\Psi\,|\,\Psi\rangle = 1.$$

SLIDE 21

Definition 13 (Cumulative distribution function, cdf). Given a probability space $(\Omega, \Sigma, P)$, a random variable X, and a set $A_{X \le x} \equiv \{\omega \in \Omega : X(\omega) \le x\}$, the (cumulative) distribution function $F_X: \mathbb{R} \to [0,1]$ is defined as

$$F_X(x) \equiv P(A_{X \le x}).$$

Properties of $F_X(x)$:
  • non-decreasing,
  • $F_X(-\infty) = 0$, $F_X(+\infty) = 1$,
  • $\Pr(S) = F_X(x_2) - F_X(x_1)$; $S = (x_1, x_2] \subseteq \mathbb{R}$.

SLIDE 22

Example 4 (Discrete distribution). $F_X(x)$ discontinuous at $x = x_i$. [Figure: step-like cdf $F_X(x)$ vs. x.]

Definition 14 (Probability mass function, pmf). Given a discrete distribution with $F_X(x)$ discontinuous at $x = x_i$, the probability mass function $p_X(x)$ is defined as

$$p_X(x) \equiv \Pr(X = x).$$

SLIDE 23

Example 5 (Continuous distribution). $F_X(x)$ a continuous function. [Figure: smooth cdf $F_X(x)$ vs. x.]

Definition 15 (Probability density function, pdf). The cdf of a continuous distribution is expressible as a (Lebesgue) integral of a (non-negative) probability density function $f_X(x)$:

$$F_X(x) = \int_{-\infty}^{x} f_X(x')\,dx'.$$

Also,

$$f_X(x) = \frac{dF_X(x)}{dx} = [F_X(x)]' \quad\text{and}\quad \Pr(S) = \int_{S} f_X(x)\,dx;\qquad S \subseteq \mathbb{R}.$$

SLIDE 24

Definition 16 (Mean of a distribution).

$$\langle x\rangle \equiv \begin{cases} \int x\,f_X(x)\,dx \\ \sum_i x_i\,p_X(x_i) \end{cases}$$

Definition 17 (Variance of a distribution).

$$\mathrm{Var}(x) \equiv \bigl\langle (x - \langle x\rangle)^2 \bigr\rangle \equiv \begin{cases} \int (x - \langle x\rangle)^2 f_X(x)\,dx \\ \sum_i (x_i - \langle x\rangle)^2\,p_X(x_i) \end{cases}$$

Definition 18 (Support of a continuous distribution).

$$V_X \equiv \{x \in \mathbb{R} : f_X(x) > 0\}.$$
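Defs. 16 and 17 can be exercised directly, once for a pmf and once for a pdf. A small sketch of mine (stdlib only; the continuous case integrates the exponential density of Example 1 by the midpoint rule):

```python
import math

# Discrete case: a fair die, p_X(x_i) = 1/6.
xs = [1, 2, 3, 4, 5, 6]
mean_d = sum(x / 6 for x in xs)                     # <x> = 3.5
var_d = sum((x - mean_d) ** 2 / 6 for x in xs)      # Var = 35/12

# Continuous case: exponential with tau = 2; <x> = tau and Var(x) = tau^2,
# checked here by numerically integrating Defs. 16 and 17.
tau = 2.0
dt = 1e-3
ts = [k * dt + dt / 2 for k in range(int(60 * tau / dt))]
f = [math.exp(-t / tau) / tau for t in ts]
mean_c = sum(t * fi * dt for t, fi in zip(ts, f))
var_c = sum((t - mean_c) ** 2 * fi * dt for t, fi in zip(ts, f))
```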

SLIDE 25

  • Defs. 12 – 18 extend without change to random vectors:

$$\Pr(\mathbf{S}),\ F_{\mathbf{X}}(\mathbf{x}),\ p_{\mathbf{X}}(\mathbf{x}),\ f_{\mathbf{X}}(\mathbf{x}),\ \langle\mathbf{x}\rangle,\ \mathrm{Var}(\mathbf{x}),\ V_{\mathbf{X}}.$$

Definition 19 (Independent random variables). Components of a random vector $\mathbf{X} = (X_1, X_2)$ are called independent random variables iff

$$F_{\mathbf{X}}(\mathbf{x}) = F_{X_1}(x_1)\,F_{X_2}(x_2).$$

If, in addition, the functions $F_{X_1}$ and $F_{X_2}$ are the same, $X_1$ and $X_2$ are called independent identically distributed random variables (i.i.d.).

Definition 20 (Marginal distributions). Consider a random vector $\mathbf{X} = (X_1, X_2)$. The marginal cdf $F_{X_1}(x_1)$ and pdf $f_{X_1}(x_1)$ are defined as

$$F_{X_1}(x_1) \equiv F_{\mathbf{X}}(x_1, +\infty) \quad\text{and}\quad f_{X_1}(x_1) \equiv \int f_{\mathbf{X}}(x_1, x_2)\,dx_2.$$

SLIDE 26

  • 6. Transformations of probability distributions:

Proposition 2. Let X and Y be continuous random variables, defined on a probability space $(\Omega, \Sigma, P)$, and let s be a function on $V_X$, such that $Y = s(X)$:

$$(\Omega, \Sigma, P) \xrightarrow{X} (\mathbb{R}, \mathcal{B}, \Pr\nolimits_X) \xrightarrow{s} (\mathbb{R}, \mathcal{B}, \Pr\nolimits_Y).$$

Let, in addition, $s'(x) \neq 0\ \forall x \in V_X$. Then,

$$F_Y(y) = \begin{cases} F_X[s^{-1}(y)]; & s'(x) > 0 \\ 1 - F_X[s^{-1}(y)]; & s'(x) < 0 \end{cases} \qquad (1)$$

$$f_Y(y) = f_X[s^{-1}(y)]\,\bigl|[s^{-1}(y)]'\bigr|. \qquad (2)$$

SLIDE 27

Proof.

$$A_{Y \le y} = \{\omega \in \Omega : Y(\omega) \le y\} = \{\omega : s[X(\omega)] \le y\} = \begin{cases} \{\omega : X(\omega) \le s^{-1}(y)\}; & s'(x) > 0 \\ \{\omega : X(\omega) \ge s^{-1}(y)\}; & s'(x) < 0 \end{cases} = \begin{cases} A_{X \le s^{-1}(y)}; & s'(x) > 0 \\ A_{X \ge s^{-1}(y)}; & s'(x) < 0. \end{cases}\ \square$$

  • Eq. (2) extends to random vectors:

$$f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}[\mathbf{s}^{-1}(\mathbf{y})]\,\left|\frac{\partial\,\mathbf{s}^{-1}(\mathbf{y})}{\partial\,\mathbf{y}}\right|.$$
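Eq. (2) is easy to check on a concrete monotone map. A sketch of my own choosing (not from the slides): with X ~ Uniform(0,1) and y = s(x) = −ln x, the transformation formula must reproduce the Exponential(1) density e^(−y).

```python
import math

# X ~ Uniform(0,1): f_X = 1 on (0,1).
def f_X(x):
    return 1.0 if 0.0 < x < 1.0 else 0.0

# y = s(x) = -ln x  =>  s^{-1}(y) = exp(-y), with derivative -exp(-y).
def f_Y(y):
    """f_Y(y) = f_X(s^{-1}(y)) * |[s^{-1}(y)]'|, per Eq. (2)."""
    s_inv = math.exp(-y)
    jacobian = abs(-math.exp(-y))
    return f_X(s_inv) * jacobian

# Should equal exp(-y) at each test point:
vals = [f_Y(y) for y in (0.5, 1.0, 2.0)]
```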

SLIDE 28

Example 6 (Convolution). Let $X_1$ and $X_2$ be independent, $f_{\mathbf{X}}(x_1, x_2) = f_{X_1}(x_1)\,f_{X_2}(x_2)$, and let

$$S \equiv X_1 + X_2,\quad X' \equiv X_1:\qquad f_{X'S}(x_1, s) = f_{X_1}(x_1)\,f_{X_2}(s - x_1)$$

$$\Longrightarrow\qquad f_S(s) = \int f_{X_1}(x_1)\,f_{X_2}(s - x_1)\,dx_1.$$

Properties of convolutions:

$$1.\ \langle s\rangle = \langle x_1\rangle + \langle x_2\rangle;\qquad 2.\ \mathrm{Var}(s) = \mathrm{Var}(x_1) + \mathrm{Var}(x_2);$$

$$3.\ X_1 \sim N(\mu_1, \sigma_1),\ X_2 \sim N(\mu_2, \sigma_2) \ \Longrightarrow\ S \sim N\!\left(\mu_1 + \mu_2,\ \sqrt{\sigma_1^2 + \sigma_2^2}\right).$$
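The three convolution properties can be verified by a seeded Monte Carlo sketch (illustrative; the parameter values are my own):

```python
import random
import statistics

random.seed(1)

n = 100_000
# Independent X1 ~ N(1, 2) and X2 ~ N(3, 1.5); S = X1 + X2.
x1 = [random.gauss(1.0, 2.0) for _ in range(n)]
x2 = [random.gauss(3.0, 1.5) for _ in range(n)]
s = [a + b for a, b in zip(x1, x2)]

mean_s = statistics.fmean(s)       # expected: 1 + 3 = 4
var_s = statistics.pvariance(s)    # expected: 2^2 + 1.5^2 = 6.25
```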

SLIDE 29

  • 7. Conditional probability distributions:

Let $\mathbf{X} = (Y, Z)$ with joint density $f_{\mathbf{X}}(y, z)$, and

$$A \equiv (-\infty, y] \times \mathbb{R},\qquad B \equiv \mathbb{R} \times (z, z+h];\qquad \Pr(B) = \int_{z}^{z+h} f_Z(z')\,dz' > 0.$$

Recall:

$$\Pr(A\,|\,B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{\displaystyle\int_{z}^{z+h}\!\left[\int_{-\infty}^{y} f_{\mathbf{X}}(y', z')\,dy'\right] dz'}{\displaystyle\int_{z}^{z+h} f_Z(z')\,dz'}.$$

a) In the limit $h \to 0$:

$$\Pr(A\,|\,B) \to \frac{\int_{-\infty}^{y} f_{\mathbf{X}}(y', z)\,dy'}{f_Z(z)};$$

b) $(Y, Z) \in A \Leftrightarrow Y \in (-\infty, y]$, while $B$ shrinks to $\{Z = z\}$, so

$$\Pr(A\,|\,B) \to F(y\,|\,Z = z) \equiv F(y\,|\,z) = \int_{-\infty}^{y} f(y'\,|\,z)\,dy'.$$

SLIDE 30

$$\Longrightarrow\qquad f(y\,|\,z) = \frac{f_{\mathbf{X}}(y, z)}{f_Z(z)};\qquad f_Z(z) = \int f_{\mathbf{X}}(y, z)\,dy > 0.$$

  • R. Jamnik (1975), § 11, pp. 475-477.
  • I. Kuščer, A. Kodre (1994), § 11.3, pp. 281-283.

Borel-Kolmogorov paradox:
  • A. N. Kolmogorov (1933), Chap. V, § 2, pp. 44-45.
  • M. M. Rao (1993), § 3.2, pp. 65-66.

Consistent definition (Kolmogorov):
  • M. M. Rao (1993), § 2.1, pp. 25-26 and pp. 29-30; § 2.4, pp. 51-54.

SLIDE 31

Example 7 (Transformation of a conditional probability distribution).

$$\mathbf{X} = (X_1, X_2),\ f_{\mathbf{X}}(x_1, x_2);\qquad \mathbf{Y} = \mathbf{s}(\mathbf{X}):\ Y_1 = s_1(X_1),\ Y_2 = s_2(X_2),\quad [s_{1,2}^{-1}(y_{1,2})]' \neq 0$$

$$\Longrightarrow\qquad f(y_1\,|\,y_2) = f\bigl[s_1^{-1}(y_1)\,\big|\,s_2^{-1}(y_2)\bigr]\,\bigl|[s_1^{-1}(y_1)]'\bigr|.$$

SLIDE 32

  • 8. Parametric families:

Definition 21 (Parametric family). The term parametric family stands for a collection of probability distributions that differ only in the value θ of a parameter Θ:

$$I = \{\Pr\nolimits_{\theta} : \theta \in V_{\Theta}\}.$$

Example 8 (Reparametrization). Let a probability distribution for a continuous random variable X belong to a family I, $f_X(x) = f_I(x, \theta)$, and let

$$y \equiv s(x),\ s'(x) \neq 0 \quad\text{and}\quad \lambda \equiv \tilde{s}(\theta),\ \tilde{s}'(\theta) \neq 0.$$

Then,

$$f_{I'}(y, \lambda) = f_I\bigl[s^{-1}(y), \tilde{s}^{-1}(\lambda)\bigr]\,\bigl|[s^{-1}(y)]'\bigr|.$$

Remark 3. Due to a complete analogy between the transformations in Examples 7 and 8 we define

$$f_I(x, \theta) \equiv f_I(x\,|\,\theta).$$

SLIDE 33

Example 10 (Families of discrete distributions).

  • 1. Binomial:

$$p_I(n\,|\,N, \theta) = \binom{N}{n}\,\theta^{n}(1-\theta)^{N-n};\qquad \theta \in [0,1],\ n, N \in \mathbb{N}_0,\ n \le N.$$

  • 2. Poisson:

$$p_I(n\,|\,\mu) = \frac{\mu^{n} e^{-\mu}}{n!};\qquad \mu \in \mathbb{R}^{+},\ n \in \mathbb{N}_0.$$

  • 3. Multinomial:

$$p_I(n_1, \ldots, n_k\,|\,N, \theta_1, \ldots, \theta_k) = \frac{N!}{n_1!\cdots n_k!}\,\theta_1^{n_1}\cdots\theta_k^{n_k};$$

$$\theta_i \in [0,1],\ n_i \in \mathbb{N}_0,\ k \ge 2;\qquad \sum_{i=1}^{k}\theta_i = 1,\quad \sum_{i=1}^{k} n_i = N.$$
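The binomial and Poisson pmfs above can be coded directly and sanity-checked for normalization and mean (a stdlib sketch of mine; the parameter values are arbitrary):

```python
import math

def binom_pmf(n, N, theta):
    """p_I(n|N, theta) = C(N,n) theta^n (1-theta)^(N-n)."""
    return math.comb(N, n) * theta**n * (1 - theta)**(N - n)

def poisson_pmf(n, mu):
    """p_I(n|mu) = mu^n e^{-mu} / n!."""
    return mu**n * math.exp(-mu) / math.factorial(n)

N, theta = 10, 0.3
binom_total = sum(binom_pmf(n, N, theta) for n in range(N + 1))          # = 1
binom_mean = sum(n * binom_pmf(n, N, theta) for n in range(N + 1))       # = N*theta

mu = 4.0
poisson_total = sum(poisson_pmf(n, mu) for n in range(60))               # ≈ 1
poisson_mean = sum(n * poisson_pmf(n, mu) for n in range(60))            # ≈ mu
```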

SLIDE 34

Example 11 (Families of continuous distributions).

  • 1. Uniform:

$$f_I(x\,|\,\mu, \sigma) = \begin{cases} \dfrac{1}{2\sigma}; & -1 \le \dfrac{x-\mu}{\sigma} \le 1 \\ 0; & \text{otherwise} \end{cases}\qquad x, \mu \in \mathbb{R},\ \sigma \in \mathbb{R}^{+};$$

  • 2. Triangular:

$$f_I(x\,|\,\mu, \sigma) = \begin{cases} \dfrac{1}{\sigma}\left(1 + \dfrac{x-\mu}{\sigma}\right); & -1 \le \dfrac{x-\mu}{\sigma} < 0 \\ \dfrac{1}{\sigma}\left(1 - \dfrac{x-\mu}{\sigma}\right); & 0 \le \dfrac{x-\mu}{\sigma} \le 1 \\ 0; & \text{otherwise.} \end{cases}$$

SLIDE 35

Example 11 (Continued).

  • 3. Normal (Gauss):

$$f_I(x\,|\,\mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\};\qquad x \sim N(\mu, \sigma);$$

  • 4. Cauchy:

$$f_I(x\,|\,\mu, \sigma) = \frac{1}{\pi\sigma}\left[1 + \left(\frac{x-\mu}{\sigma}\right)^2\right]^{-1};$$

  • 5. Exponential:

$$f_I(t\,|\,\tau) = \frac{1}{\tau}\exp\left\{-\frac{t}{\tau}\right\};\qquad t, \tau \in \mathbb{R}^{+};$$

  • 6. Weibull:

$$f_I(t\,|\,\tau, \theta) = \frac{\theta}{\tau}\left(\frac{t}{\tau}\right)^{\theta-1}\exp\left\{-\left(\frac{t}{\tau}\right)^{\theta}\right\};\qquad t, \tau, \theta \in \mathbb{R}^{+};$$

$$x \equiv \ln t,\qquad \mu \equiv \ln\tau,\qquad \sigma \equiv \theta^{-1}.$$

SLIDE 36

[Figure: densities $f_I$ of $N(0,1)$ and $C(0,1)$ vs. x.]

Definition 22 (Location and scale parameters). When

$$F_I(x\,|\,\mu, \sigma) \equiv \int_{-\infty}^{x} f_I(x'\,|\,\mu, \sigma)\,dx' = \Phi\!\left(\frac{x-\mu}{\sigma}\right),$$

$\mu$ is called a location and $\sigma$ a scale parameter.

SLIDE 37

Example 12 (Exponential family revisited).

$$f_I(t\,|\,\tau) = \frac{1}{\tau}\exp\left\{-\frac{t}{\tau}\right\};\ t, \tau \in \mathbb{R}^{+} \ \Longrightarrow\ F_I(t\,|\,\tau) = 1 - \exp\left\{-\frac{t}{\tau}\right\};$$

$$x \equiv \ln t,\ \mu \equiv \ln\tau \ \Longrightarrow\ F_{I'}(x\,|\,\mu) = 1 - \exp\left\{-e^{x-\mu}\right\} \equiv \Phi(x - \mu).$$

Example 11 (Continued).

  • 7. Multidimensional Normal:

$$f_I(\mathbf{x}\,|\,\mathbf{m}, V) = \frac{1}{(2\pi)^{n/2}\sqrt{\det V}}\exp\left\{-\frac{1}{2}(\mathbf{x}-\mathbf{m})^{T} V^{-1}(\mathbf{x}-\mathbf{m})\right\};$$

$\mathbf{x}, \mathbf{m} \in \mathbb{R}^n$; $V$ an $n \times n$ symmetric, positive-definite matrix.
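Example 12 can be verified numerically: under x = ln t, μ = ln τ the exponential cdf becomes a pure location family. A small sketch of mine:

```python
import math

def F_exp(t, tau):
    """Exponential cdf: F_I(t|tau) = 1 - exp(-t/tau)."""
    return 1.0 - math.exp(-t / tau)

def F_loc(x, mu):
    """Transformed cdf: F_{I'}(x|mu) = 1 - exp(-exp(x - mu))."""
    return 1.0 - math.exp(-math.exp(x - mu))

tau = 2.0
# The two cdfs must agree at corresponding points t <-> x = ln t:
checks = [abs(F_exp(t, tau) - F_loc(math.log(t), math.log(tau)))
          for t in (0.1, 1.0, 5.0)]

# Location property: F_loc depends only on the difference x - mu.
shift = abs(F_loc(1.0, 0.5) - F_loc(1.0 + 3.0, 0.5 + 3.0))
```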

SLIDE 38

  • 9. The Central Limit Theorem (CLT):

Example 13 (Binomial vs. Normal distribution).

$$p_I(n\,|\,N, \theta) = \binom{N}{n}\,\theta^{n}(1-\theta)^{N-n};\qquad \mu = N\theta,\quad \sigma = \sqrt{N\theta(1-\theta)}:$$

$$F_I(n\,|\,N, \theta) = \sum_{i=0}^{n} p_I(i\,|\,N, \theta) \approx \int_{-\infty}^{n+0.5} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{(x'-\mu)^2}{2\sigma^2}\right\} dx' = F_{I'}(n+0.5\,|\,\mu, \sigma).$$

[Figure: $F_I(n\,|\,N, \theta)$ compared with $F_{I'}(x\,|\,\mu, \sigma)$ vs. x.]

SLIDE 39

Theorem 2 (CLT, Lévy). Consider $X_1, \ldots, X_n$ i.i.d. with $\langle x_i\rangle = \langle x\rangle$ and $\mathrm{Var}(x_i) = \mathrm{Var}(x) < \infty$. Then,

$$\lim_{n\to\infty} \bar{x}_n \sim N\!\left(\langle x\rangle,\ \sqrt{\frac{\mathrm{Var}(x)}{n}}\right);\qquad \bar{x}_n \equiv \frac{1}{n}\sum_{i=1}^{n} x_i.$$

  • Proof. By invoking characteristic functions.

(Counter-)Example 14 (CLT vs. Cauchy family). $\mathrm{Var}(x_i)$ does not exist;

$$\mathrm{FWHM}(\bar{x}_n) = \mathrm{FWHM}(x_i) \quad\Longrightarrow\quad \bar{x}_n \sim C\bigl(\mu,\ \mathrm{FWHM}(x)/2\bigr).$$
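Theorem 2 and the Cauchy counterexample can both be seen in a seeded simulation (my own sketch; sample sizes and seeds are arbitrary). For exponential draws the spread of sample means shrinks like 1/√n; for Cauchy draws the sample mean is Cauchy again, so a robust width measure (the interquartile range, used here because the moments diverge) does not shrink at all.

```python
import math
import random
import statistics

random.seed(7)

def sample_means(draw, n, reps):
    """Means of `reps` samples, each of size n, drawn by `draw()`."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(reps)]

# Exponential(tau=1): Var exists, so means of n=100 draws concentrate
# near <x> = 1 with spread ~ 1/sqrt(100) = 0.1 (Theorem 2).
exp_means = sample_means(lambda: random.expovariate(1.0), 100, 2000)
exp_spread = statistics.pstdev(exp_means)

# Standard Cauchy via inverse-cdf sampling: tan(pi*(U - 1/2)).
def cauchy():
    return math.tan(math.pi * (random.random() - 0.5))

c_means_10 = sample_means(cauchy, 10, 1000)
c_means_1000 = sample_means(cauchy, 1000, 1000)

def iqr(xs):
    """Interquartile range: quartiles are at ±1 for the standard Cauchy, IQR = 2."""
    q = statistics.quantiles(xs, n=4)
    return q[2] - q[0]
```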

SLIDE 40

  • 10. Invariant families:

Example 15 (Invariance of a location-scale family).

$$F_I(x\,|\,\mu, \sigma) = \Phi\!\left(\frac{x-\mu}{\sigma}\right);\qquad l[(a,b), x] \equiv a x + b,\quad l[(a,b), (\mu, \sigma)] \equiv (a\mu + b,\ a\sigma);\quad (a, b) \in \mathbb{R}^{+} \times \mathbb{R} = G:$$

$$F_{I'}\bigl(l[(a,b), x]\,\big|\,l[(a,b), (\mu, \sigma)]\bigr) = F_I(x\,|\,\mu, \sigma).$$

Example 16 (Invariance under inversion). The Normal and the Cauchy family are invariant under simultaneous inversions of x and $\mu$. We say that they have positive parity.

SLIDE 41

Example 17 (Invariance of the exponential family).

$$F_I(t\,|\,\tau) = 1 - \exp\left\{-\frac{t}{\tau}\right\};\qquad l[a, t] \equiv a t,\quad l[a, \tau] \equiv a\tau;\quad a \in \mathbb{R}^{+} = G:$$

$$F_I\bigl(l[a, t]\,\big|\,l[a, \tau]\bigr) = F_I(t\,|\,\tau).$$

Proposition 3 (Invariance under Lie group). A family of probability distributions of a scalar random variable that is invariant under a scalar Lie group is reducible to a location family.
