Statistical methods
Tomaž Podobnik
Oddelek za fiziko, FMF, UNI-LJ, and Odsek za eksperimentalno fiziko delcev, IJS
Contents:
- I. Prologue,
- II. Mathematical Preliminaries,
- III. Frequency Interpretation of Probability Distributions,
- IV. Confidence Intervals,
- V. Testing of Hypotheses,
- VI. Inverse Probability Distributions,
- VII. Interpretation of Inverse Probability Distributions,
- VIII. Time Series and Dynamical Models.
4/15/2010 2
- II. Mathematical Preliminaries:
- 1. Motivation,
- 2. Probability spaces,
- 3. Conditional probabilities,
- 4. Random variables,
- 5. Probability distributions,
- 6. Transformations of probability distributions,
- 7. Conditional distributions,
- 8. Parametric families of (direct) probability distributions,
- 9. The Central Limit Theorem,
- 10. Invariant parametric families.
Theorem 2 (CLT, Lévy). Consider $X_1, \ldots, X_n$ i.i.d. with $\langle x_i \rangle = \langle x \rangle$ and $\mathrm{Var}(x_i) = \mathrm{Var}(x) < \infty$. Then,
$$\lim_{n \to \infty} \bar{x}_n \sim N\!\left(\langle x \rangle, \frac{\mathrm{Var}(x)}{n}\right), \qquad \bar{x}_n \equiv \frac{1}{n} \sum_{i=1}^{n} x_i .$$
Proposition. Consider $\{X_1, \ldots, X_n\}$ i.i.d. and suppose that $\langle x \rangle$, $\langle x^2 \rangle$, $\langle x^3 \rangle$, $\langle x^4 \rangle$ all exist and are finite. Then,
$$\langle s_n^2 \rangle = \mathrm{Var}(x), \qquad s_n^2 \equiv \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}_n)^2 .$$
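A quick numerical check of the CLT statement and of the unbiasedness of $s_n^2$ may help here. This is only a sketch under stated assumptions: the draws are uniform on [0, 1) (so $\langle x \rangle = 1/2$, $\mathrm{Var}(x) = 1/12$), and the sample size, the number of repetitions and all function names are illustrative choices that do not come from the slides.

```python
import random
import statistics

def sample_mean_and_s2(n, rng):
    """Return the sample mean and s_n^2 (divisor n - 1) of n uniform draws."""
    xs = [rng.random() for _ in range(n)]
    return statistics.fmean(xs), statistics.variance(xs)

def clt_summary(n=50, repeats=4000, seed=1):
    """Repeat the experiment and summarize the means and the s_n^2 values."""
    rng = random.Random(seed)
    means, s2_values = [], []
    for _ in range(repeats):
        mean, s2 = sample_mean_and_s2(n, rng)
        means.append(mean)
        s2_values.append(s2)
    return {
        "mean_of_means": statistics.fmean(means),     # expected: <x> = 1/2
        "var_of_means": statistics.variance(means),   # expected: Var(x)/n = 1/600
        "mean_of_s2": statistics.fmean(s2_values),    # expected: Var(x) = 1/12
    }

summary = clt_summary()
print(summary)
```

The three printed numbers should lie close to 1/2, 1/600 and 1/12, up to statistical fluctuations.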
- III. Frequency Interpretation of Probability Distributions:
“In order to make the theory operational, we must introduce a concept of probability that links the mathematics to an external world of measurable phenomena.” (A. Stuart, J. K. Ord (1994), § 8.8, p. 290.)
“The most striking achievement of the physical sciences is prediction.” (G. Pólya (1954), Chap. XIV, § 4, p. 64.)
“The pure mathematician can do what he pleases, but the applied mathematician must be at least partially sane.” (M. Kline (1980), Mathematics: The Loss of Certainty, Chap. XIII, p. 285.)
- III. Frequency Interpretation of Probability Distributions:
- 1. Example,
- 2. Binary random sequence,
- 3. Random sequence of real numbers,
- 4. Monte Carlo methods.
- 1. Example 1 (Bertrand’s paradox).
A straw is tossed at random so that the line determined by the straw intersects the unit circle. What is the expected length $\langle l \rangle$ of the chord thus defined?
- J.L. Bertrand (1889), Calcul des probabilités, pp. 4-5.
- J.B. Paris (1994), The Uncertain Reasoner’s Companion, Chap. 6, pp. 71-72.
- E.T. Jaynes (2003), Probability Theory, § 12.4.4, pp. 386-394.
[Figure: unit circle with a chord of length $l$ joining the points $(x_1, y_1)$ and $(x_2, y_2)$; $h$ and $\theta$ parametrize the chord.]
a) $f(h) = 1$ for $0 \le h \le 1$, 0 otherwise $\Rightarrow \langle l \rangle = \pi/2 \approx 1.57$;
b) $f(\theta) = 2/\pi$ for $0 \le \theta \le \pi/2$, 0 otherwise $\Rightarrow \langle l \rangle = 4/\pi \approx 1.27$;
c) $f(x_2) = 1/2$ for $-1 \le x_2 \le 1$, 0 otherwise $\Rightarrow \langle l \rangle = 4/3 \approx 1.33$.
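The three answers can be reproduced numerically. The sketch below rests on an assumed reading of the figure, which is not spelled out in the text: in a) $h$ is the distance of the chord from the centre, so $l = 2\sqrt{1 - h^2}$; in b) $\theta$ is the angle between the chord and the diameter through one of its end points, so $l = 2\cos\theta$; in c) one end point is fixed at $(1, 0)$ and $x_2$ is the abscissa of the other, so $l = \sqrt{2(1 - x_2)}$. The function name and the sample size are illustrative.

```python
import math
import random

def expected_chord_lengths(n=200_000, seed=7):
    """Monte Carlo estimates of <l> for the three assumed parametrizations."""
    rng = random.Random(seed)
    total_a = total_b = total_c = 0.0
    for _ in range(n):
        h = rng.random()                          # a) h uniform on [0, 1]
        total_a += 2.0 * math.sqrt(1.0 - h * h)
        theta = rng.uniform(0.0, math.pi / 2.0)   # b) theta uniform on [0, pi/2]
        total_b += 2.0 * math.cos(theta)
        x2 = rng.uniform(-1.0, 1.0)               # c) x_2 uniform on [-1, 1]
        total_c += math.sqrt(2.0 * (1.0 - x2))
    return total_a / n, total_b / n, total_c / n

mean_a, mean_b, mean_c = expected_chord_lengths()
print(mean_a, mean_b, mean_c)
```

Under these assumptions the estimates should approach $\pi/2$, $4/\pi$ and $4/3$: "at random" gives three different answers for three different uniform variables.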
- 2. Binary random sequences.
Consider an infinite binary sequence 1,0,1,1,0,1,0,0,0,1,0,1,1,0,1,... with equal relative frequencies of appearance of 1’s and 0’s, $\nu_1 = \nu_0 = 1/2$; or, more precisely,
$$\lim_{n \to \infty} P\left(\left|\frac{n_1}{n} - \frac{1}{2}\right| < \varepsilon\right) = 1,$$
where $n_1$ is the number of 1’s among the first $n$ terms.
We say that $\nu_1 = \nu_0 = 1/2$ is true almost everywhere with respect to the Bernoulli measure Bn(1/2) on the space of infinite binary sequences, called Cantor space (Bn(1/2) on the Cantor space is isomorphic to the Lebesgue measure on the interval [0,1]).
For a Bn(1/2)-typical binary sequence we would further expect that
$$\nu_{0,0} = \nu_{0,1} = \nu_{1,0} = \nu_{1,1} = \tfrac{1}{4},$$
$$\nu_{0,0,0} = \nu_{0,0,1} = \cdots = \nu_{1,1,1} = \tfrac{1}{8}, \quad \ldots$$
holds Bn(1/2)-almost everywhere.
That is, from a Bn(1/2)-typical binary sequence we would naively expect to satisfy all properties true Bn(1/2)-almost everywhere. Unfortunately, such a definition is vacuous: every single sequence has measure zero, so "being different from that sequence" is itself a property true almost everywhere, and no sequence can satisfy all such properties.
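The block frequencies can be inspected on a long finite stretch of pseudo-random bits. This is a sketch only: a finite, algorithmically generated sequence is not random in the sense used in these slides, and the choice of overlapping blocks, the sequence length and the helper name `block_frequencies` are assumptions made for illustration.

```python
import random
from collections import Counter

def block_frequencies(bits, k):
    """Relative frequencies of all overlapping length-k blocks in the list of bits."""
    n_blocks = len(bits) - k + 1
    counts = Counter(tuple(bits[i:i + k]) for i in range(n_blocks))
    return {block: count / n_blocks for block, count in counts.items()}

rng = random.Random(3)
bits = [rng.getrandbits(1) for _ in range(200_000)]

freq1 = block_frequencies(bits, 1)   # expected near 1/2 each
freq2 = block_frequencies(bits, 2)   # expected near 1/4 each
freq3 = block_frequencies(bits, 3)   # expected near 1/8 each
print(freq1, freq2, freq3, sep="\n")
```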
Definition 1 (Bn(1/2)-random binary sequence). An infinite binary sequence is called (Martin-Löf) Bn(1/2)-random iff it is not rejected by the Martin-Löf test (i.e., if it satisfies a (special) countable sequence of properties true Bn(1/2)-almost everywhere).
- P. Martin-Löf (1966), Inform. Control 9, 602-619.
The limiting frequencies $\nu_1$ and $\nu_0$ need not be the same, e.g., $\nu_1 = 2/3$ and $\nu_0 = 1/3$.
Definition 2 (Bn($\nu_1$)-random binary sequence). An infinite binary sequence is called (Martin-Löf) Bn($\nu_1$)-random iff it is not rejected by the Martin-Löf test (i.e., if it satisfies a countable sequence of properties true Bn($\nu_1$)-almost everywhere).
Remark 1. No finite binary sequence is random.
- 3. Real random sequences.
Given a probability space $(\mathbb{R}^n, \mathcal{B}^n, \Pr_X)$, a set $A \in \mathcal{B}^n$ and an infinite sequence $x_1, x_2, \ldots$ ($x_i \in \mathbb{R}^n$) give rise to a binary sequence $b_1, b_2, \ldots$, where
$$b_i = \begin{cases} 1; & x_i \in A \\ 0; & \text{otherwise.} \end{cases}$$
Definition 3 ($\Pr_X$-random sequence). Given a probability space $(\mathbb{R}^n, \mathcal{B}^n, \Pr_X)$, an infinite sequence $x_1, x_2, \ldots$ ($x_i \in \mathbb{R}^n$) is $\Pr_X$-random iff for every $A \in \mathcal{B}^n$ the corresponding binary sequence $b_1, b_2, \ldots$ is Bn[$\Pr_X(A)$]-random.
In this way, the probability distribution $\Pr_X$ on $\mathbb{R}^n$ coincides with the (frequency) distribution of the sequence $x_1, x_2, \ldots$, which is characteristic of the frequency interpretation of probability.
Remark 2. Every finite sequence is non-random. Consequently, the randomness of QM cannot be verified, it can only be postulated.
Remark 3. Every (possibly infinite) sequence that results from an algorithm is non-random. Consequently, none of the numbers from random number generators, based on algorithms, is truly random. Rather, they are pseudo-random numbers.
There are random number generators based on QM processes such as, for example, radioactive decays. The numbers from these generators may be (parts of) truly random sequences.
- 4. Monte Carlo methods.
Basis: Generator of (pseudo-)random numbers, uniformly distributed on an interval, often [0,1].
MC integration:
1. $x_i = x_a + \mathrm{rndm} \times (x_b - x_a)$
2. $y_i = \mathrm{rndm}' \times f_{\max}$
3. $y_i \le f(x_i) \Rightarrow N_{\mathrm{acc}} = N_{\mathrm{acc}} + 1$
$$\int_{x_a}^{x_b} f(x)\, dx \approx \frac{N_{\mathrm{acc}}}{N_{\mathrm{gen}}} \times (x_b - x_a) \times f_{\max}$$
[Figure: $f(x)$ on $[x_a, x_b]$ inside the rectangle of height $f_{\max}$.]
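The three steps translate almost line by line into code. The following is a minimal sketch; the integrand ($\sin x$ on $[0, \pi]$ with $f_{\max} = 1$, exact integral 2), the number of generated points and the function name are illustrative assumptions.

```python
import math
import random

def hit_or_miss_integral(f, x_a, x_b, f_max, n_gen=200_000, seed=11):
    """Estimate the integral of f over [x_a, x_b], assuming 0 <= f(x) <= f_max there."""
    rng = random.Random(seed)
    n_acc = 0
    for _ in range(n_gen):
        x_i = x_a + rng.random() * (x_b - x_a)   # step 1
        y_i = rng.random() * f_max               # step 2
        if y_i <= f(x_i):                        # step 3
            n_acc += 1
    return n_acc / n_gen * (x_b - x_a) * f_max

estimate = hit_or_miss_integral(math.sin, 0.0, math.pi, 1.0)
print(estimate)
```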
(Pseudo-)random numbers for arbitrary $f_X(x)$, $V_X = [x_a, x_b]$:
1. $x_i = x_a + \mathrm{rndm} \times (x_b - x_a)$
2. $y_i = \mathrm{rndm}' \times f_{\max}$
3. $y_i \le f_X(x_i) \Rightarrow$ accept $x_i$
$$\{x_i\}_{\mathrm{accepted}} \sim f_X(x)$$
[Figure: $f_X(x)$ on $[x_a, x_b]$ inside the rectangle of height $f_{\max}$.]
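A sketch of the accept/reject recipe, under illustrative assumptions: the target density is $f_X(x) = 2x$ on $[0, 1]$ with $f_{\max} = 2$ (mean 2/3), and the function names are hypothetical. The fraction of accepted points is recorded as well, since it equals the ratio of the area under $f_X$ to the area of the bounding rectangle, here 1/2.

```python
import random

def accept_reject(f_x, x_a, x_b, f_max, n_gen, seed=5):
    """Return the accepted x_i, which follow the density proportional to f_x."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_gen):
        x_i = x_a + rng.random() * (x_b - x_a)   # step 1
        y_i = rng.random() * f_max               # step 2
        if y_i <= f_x(x_i):                      # step 3: accept x_i
            accepted.append(x_i)
    return accepted

def linear_pdf(x):
    """Assumed example density f_X(x) = 2x on [0, 1]."""
    return 2.0 * x

n_gen = 100_000
samples = accept_reject(linear_pdf, 0.0, 1.0, 2.0, n_gen)
efficiency = len(samples) / n_gen
sample_mean = sum(samples) / len(samples)
print(efficiency, sample_mean)
```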
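Low efficiencies may represent a serious problem:
$V_X = [x_a, x_b]$
1. $x_i = x_a + \mathrm{rndm} \times (x_b - x_a)$
2. $y_i = \mathrm{rndm}' \times f_{\max}$
3. $y_i \le f_X(x_i) \Rightarrow$ accept $x_i$
$$S_{\mathrm{rec}} = (x_b - x_a) \times f_{\max}, \qquad \frac{N_{\mathrm{acc}}}{N_{\mathrm{gen}}} = \frac{S_{\mathrm{shad}}}{S_{\mathrm{rec}}}$$
[Figure: $f_X(x)$ on $[x_a, x_b]$; the shaded area $S_{\mathrm{shad}}$ under the curve lies inside the rectangle of area $S_{\mathrm{rec}}$.]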
Solution if $F_X(x)$ is a simple (analytic) expression (e.g., for the Exponential distribution):
$$y \equiv F_X(x) \Rightarrow f_Y(y) = \begin{cases} 1; & 0 \le y \le 1 \\ 0; & \text{otherwise} \end{cases}$$
1. $y_i = \mathrm{rndm}$
2. $x_i = F_X^{-1}(y_i)$ (100% efficiency)
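For the Exponential distribution the inversion is explicit: assuming $F_X(x) = 1 - e^{-x/\tau}$, one gets $x_i = -\tau \ln(1 - y_i)$. The sketch below uses $\tau = 2$ and a hypothetical function name; every generated number is used, hence the 100% efficiency.

```python
import math
import random

def exponential_by_inversion(tau, n, seed=2):
    """Draw n values from the Exponential distribution with mean tau via x = F^{-1}(y)."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so the argument of the logarithm stays in (0, 1].
    return [-tau * math.log(1.0 - rng.random()) for _ in range(n)]

samples = exponential_by_inversion(2.0, 100_000)
sample_mean = sum(samples) / len(samples)
print(sample_mean)
```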
Solutions for Normal distributions:
a) sum of n uniform i.i.d. variables,
b) 2D Normal distribution:
$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y) \Rightarrow f_{R,\Phi}(r, \phi) = f_R(r)\, f_\Phi(\phi),$$
$$f_\Phi(\phi) = \begin{cases} \frac{1}{2\pi}; & 0 \le \phi \le 2\pi \\ 0; & \text{otherwise,} \end{cases} \qquad f_R(r) = r \exp\{-r^2/2\},\ r \ge 0 \Rightarrow F_R(r) = 1 - \exp\{-r^2/2\};$$
$$z \equiv F_R(r) \Rightarrow f_{Z,\Phi}(z, \phi) = f_Z(z)\, f_\Phi(\phi), \qquad f_Z(z) = \begin{cases} 1; & 0 \le z \le 1 \\ 0; & \text{otherwise.} \end{cases}$$
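Solution b) is the polar (Box-Muller) recipe: draw $z$ and $\phi$ uniformly, invert $F_R$ to get $r = \sqrt{-2 \ln(1 - z)}$, and return $x = r\cos\phi$, $y = r\sin\phi$. A minimal sketch, with an illustrative function name and sample size:

```python
import math
import random
import statistics

def normal_pairs(n_pairs, seed=23):
    """Return 2 * n_pairs standard Normal numbers generated pairwise from (z, phi)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_pairs):
        z = rng.random()                          # z uniform on [0, 1)
        phi = 2.0 * math.pi * rng.random()        # phi uniform on [0, 2 pi)
        r = math.sqrt(-2.0 * math.log(1.0 - z))   # r = F_R^{-1}(z)
        values.append(r * math.cos(phi))
        values.append(r * math.sin(phi))
    return values

values = normal_pairs(50_000)
sample_mean = statistics.fmean(values)
sample_var = statistics.pvariance(values)
print(sample_mean, sample_var)
```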
- IV. (Classical) Confidence Intervals:
- 1. Motivation,
- 2. Construction,
- 3. Intervals based on likelihood-ratio ordering,
- 4. Intervals for constrained parameters,
- 5. Confidence intervals for discrete distributions,
- 6. On the shortest confidence intervals.
- 1. Motivation.
Example 1 (Prologue): given $t_1$, can we say anything about $\tau$?
The parameter $\tau$ may take on every value in a continuum $\Rightarrow$ the measure of a single point in the continuum is 0.
For verifiable predictions we must turn to interval estimations.
- J. Neyman (1937), Phil. Trans. R. Soc. A 236, 333-380.
- 2. Construction.
1) choose $\alpha \in [0, 1 - \gamma]$;
2) pick a value of $\tau$;
3) $t_a$: $F_I(t_a | \tau) = \alpha$;
4) $t_b$: $F_I(t_b | \tau) = \alpha + \gamma \Rightarrow \Pr_I(t_a < t \le t_b \,|\, \tau) = F_I(t_b | \tau) - F_I(t_a | \tau) = \gamma$;
5) repeat for all $\tau \in (0, \infty)$;
6) measured $t_1$; $\tau$: true value;
7) $t_1 \in (t_a, t_b) \Leftrightarrow \tau \in (\tau_a, \tau_b)$.
[Figure: confidence belt in the $(t, \tau)$ plane between the curves $t_a(\tau)$ and $t_b(\tau)$; the measured $t_1$ cuts the belt at $\tau_a$ and $\tau_b$.]
Remark 4. $F_I(t_1 | \tau_b) = \alpha$ and $F_I(t_1 | \tau_a) = \alpha + \gamma$ $\Rightarrow$ $\gamma = F_I(t_1 | \tau_a) - F_I(t_1 | \tau_b)$.
Remark 5. $\alpha = \alpha(\tau)$, as long as $t_a(\tau)$ and $t_b(\tau)$ are strictly monotone in $\tau$.
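The construction can be tried out on a concrete model. The sketch below assumes the Exponential model $F_I(t|\tau) = 1 - e^{-t/\tau}$, which the construction slide itself does not fix, together with $\alpha = 0.05$ and $\gamma = 0.9$; function names and the number of trials are illustrative. Solving $F_I(t_1|\tau_b) = \alpha$ and $F_I(t_1|\tau_a) = \alpha + \gamma$ gives $\tau_b = -t_1/\ln(1 - \alpha)$ and $\tau_a = -t_1/\ln(1 - \alpha - \gamma)$, and a simulation estimates how often the interval contains the true $\tau$.

```python
import math
import random

def exponential_interval(t1, alpha, gamma):
    """(tau_a, tau_b) for one measurement t1; requires alpha > 0 and alpha + gamma < 1."""
    tau_a = -t1 / math.log(1.0 - alpha - gamma)
    tau_b = -t1 / math.log(1.0 - alpha)
    return tau_a, tau_b

def coverage(tau_true, alpha, gamma, trials=100_000, seed=13):
    """Fraction of simulated measurements whose interval contains tau_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        t1 = -tau_true * math.log(1.0 - rng.random())   # t1 drawn from the model
        tau_a, tau_b = exponential_interval(t1, alpha, gamma)
        if tau_a < tau_true <= tau_b:
            hits += 1
    return hits / trials

estimated_coverage = coverage(tau_true=3.0, alpha=0.05, gamma=0.9)
print(exponential_interval(1.0, 0.05, 0.9), estimated_coverage)
```

The estimated coverage should be close to $\gamma$ whatever value of `tau_true` is chosen, which is the defining property of the construction.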
- 3. Confidence intervals based on likelihood-ratio ordering.
- G.J. Feldman, R.D. Cousins (1998), Phys. Rev. D 57, 3873-3889.
May be regarded as a definition of $\alpha(\tau)$ (of $\alpha(\theta)$).
Given $f_I(x | \theta)$:
$$R(x, \theta) \equiv \frac{f_I(x | \theta)}{f_I(x | \hat{\theta})}, \qquad \hat{\theta}:\ f_I(x | \theta = \hat{\theta}) = \max.,$$
$$A(\theta) = \{x:\ x \in V_X \ \text{and}\ R(x, \theta) \ge R_0\} \quad \text{and} \quad \Pr_X(A | \theta) = \gamma .$$
The rest of the procedure is identical to the one on the previous slide.
Remark 6. These confidence intervals are equivariant under one-to-one reparametrizations $y \equiv s(x)$ and $\nu \equiv \bar{s}(\theta)$:
$$\big(\nu_a[s(x_1)],\ \nu_b[s(x_1)]\big) = \big(\bar{s}[\theta_a(x_1)],\ \bar{s}[\theta_b(x_1)]\big).$$
Example 2. Confidence intervals for $x \sim N(\mu, \sigma = 1)$, $\gamma = 0.9$: a) $\alpha = 0.05 = \mathrm{const.}$, b) intervals from the likelihood-ratio ordering principle.
[Figure: confidence belts in the $(x, \mu)$ plane; a measured $x_0$ yields the interval $(\mu_1, \mu_2)$.]
Important because of CLT: for $n \to \infty$,
$$\bar{x}_n \sim N\!\left(\langle x \rangle, \frac{\mathrm{Var}(x)}{n}\right) \approx N\!\left(\langle x \rangle, \frac{s_n^2}{n}\right).$$
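- 4. Intervals for constrained parameters.
Example 3. Confidence intervals for $x \sim N(\mu, \sigma = 1)$, $0 \le \mu \le 3.92$: a), b).
[Figure: panels a) and b), confidence belts in the $(x, \mu)$ plane for the constrained parameter; measured values $x_0$ and $-x_0$ with the interval end points $\mu_1$, $\mu_2$, $\mu_a$, $\mu_b$.]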
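A numerical sketch may show why the constraint matters. It compares, for a measurement well below the allowed range, a central interval ($\alpha = 0.05$, $\gamma = 0.9$) clipped to $[0, 3.92]$ with an interval obtained from likelihood-ratio ordering. Everything beyond the model $x \sim N(\mu, 1)$ and the range of $\mu$ is an assumption: the coverage 0.9, the grids, the measured value $x_0 = -2$, and all function names. The likelihood-ratio interval is computed on a grid, so its end points are only accurate to the grid step.

```python
import math

MU_MIN, MU_MAX = 0.0, 3.92   # allowed range of mu, as in Example 3
GAMMA = 0.9                  # assumed coverage

def pdf(x, mu):
    """Density of N(mu, sigma = 1)."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

def ratio(x, mu):
    """R(x, mu) = f(x|mu) / f(x|mu_hat), with mu_hat restricted to the allowed range."""
    mu_hat = min(max(x, MU_MIN), MU_MAX)
    return math.exp(-0.5 * (x - mu) ** 2 + 0.5 * (x - mu_hat) ** 2)

def accepts(x0, mu, dx=0.01):
    """True if x0 belongs to the likelihood-ratio acceptance region of mu."""
    xs = [-8.0 + dx * k for k in range(2001)]        # grid on [-8, 12]
    cells = sorted(((ratio(x, mu), pdf(x, mu) * dx) for x in xs), reverse=True)
    total = 0.0
    for r_cell, prob in cells:                       # add cells by decreasing R
        total += prob
        if total >= GAMMA:
            return ratio(x0, mu) >= r_cell
    return True                                      # not expected on this grid

def lr_interval(x0, dmu=0.02):
    """Smallest and largest mu on the grid whose acceptance region contains x0."""
    n_steps = int(round((MU_MAX - MU_MIN) / dmu))
    accepted = [MU_MIN + dmu * k for k in range(n_steps + 1)
                if accepts(x0, MU_MIN + dmu * k)]
    return min(accepted), max(accepted)

def central_interval(x0, z=1.645):
    """Central 90% interval clipped to the allowed range; None if nothing is left."""
    lo, hi = max(x0 - z, MU_MIN), min(x0 + z, MU_MAX)
    return (lo, hi) if lo <= hi else None

lr_lo, lr_hi = lr_interval(-2.0)
print(central_interval(-2.0), (lr_lo, lr_hi))
```

Under these assumptions the clipped central interval is empty for $x_0 = -2$, while the likelihood-ratio interval is a short, non-empty interval starting at $\mu = 0$.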
- 5. Confidence intervals for discrete distributions.
For discrete $F_I(n | \mu)$, equations $F_I(n_a | \mu) = \alpha$, $F_I(n_b | \mu) = \alpha + \gamma$ do not have solutions for all $\mu$.
Construct the shortest CI’s whose coverage $\ge \gamma$ (conservative CI’s).
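The over-coverage is easy to see on an example. The sketch below assumes a Poisson distribution for $n$ (the slide does not name the discrete family) and builds, for a given $\mu$, the shortest set of values of $n$ whose total probability is at least $\gamma$, by adding values in order of decreasing probability. This is the acceptance region in $n$; the interval in $\mu$ would follow by inversion, as in the continuous case. Function names and `n_max` are illustrative.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of n for mean mu > 0."""
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))

def acceptance_set(mu, gamma, n_max=200):
    """Shortest set {n_lo, ..., n_hi} with probability >= gamma, and that probability."""
    ranked = sorted(range(n_max + 1), key=lambda n: poisson_pmf(n, mu), reverse=True)
    chosen, total = [], 0.0
    for n in ranked:
        chosen.append(n)
        total += poisson_pmf(n, mu)
        if total >= gamma:
            break
    return min(chosen), max(chosen), total

for mu in (0.5, 1.0, 2.5, 5.0, 10.0):
    print(mu, acceptance_set(mu, 0.9))
```

For $\mu = 1$ and $\gamma = 0.9$, for instance, the set is $\{0, 1, 2\}$ with probability $2.5\,e^{-1} \approx 0.92$: the coverage exceeds $\gamma$ because no set of integers hits 0.9 exactly.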
- 6. On the shortest confidence intervals.
Example 4 (Exponential family): Given $\gamma$ and $\alpha(\tau) = \mathrm{const.}$, choose $\alpha$ such that the length of the confidence interval $(\tau_a, \tau_b)$ will be minimal.
For $\gamma = 0.2$: $\alpha = 0.740976 \Rightarrow \tau_a = 0.353\, t_1$, $\tau_b = 0.74\, t_1$.
Note: $(\tau_a, \tau_b)$ does not contain $\hat{\tau} = t_1$, but contains $t_1/2$.
Example 4 (cont’d): $\mu \equiv \ln \tau$, $x \equiv \ln t$ $\Rightarrow$ $f'_I(x | \mu) = e^{x - \mu} \exp\{-e^{x - \mu}\}$.
For $\gamma = 0.2$: $\alpha = 0.527573 \Rightarrow (\mu_a, \mu_b) = (\ln t_1 - 0.263,\ \ln t_1 + 0.288)$.
Note: $(\mu_a, \mu_b) \ne (\ln \tau_a, \ln \tau_b)$.
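The quoted numbers can be recovered with a simple scan over $\alpha$. The sketch assumes the Exponential model $F_I(t|\tau) = 1 - e^{-t/\tau}$, for which $\tau_a = -t_1/\ln(1 - \alpha - \gamma)$ and $\tau_b = -t_1/\ln(1 - \alpha)$, sets $t_1 = 1$, and minimizes the interval length once in $\tau$ and once in $\mu = \ln\tau$. The grid search and the function names are illustrative choices, not the method used on the slides.

```python
import math

def tau_interval(alpha, gamma, t1=1.0):
    """(tau_a, tau_b) of the assumed Exponential model for measurement t1."""
    return -t1 / math.log(1.0 - alpha - gamma), -t1 / math.log(1.0 - alpha)

def tau_length(alpha, gamma):
    tau_a, tau_b = tau_interval(alpha, gamma)
    return tau_b - tau_a

def mu_length(alpha, gamma):
    tau_a, tau_b = tau_interval(alpha, gamma)
    return math.log(tau_b) - math.log(tau_a)   # length in mu = ln(tau)

def best_alpha(length, gamma, steps=100_000):
    """Scan alpha over the open interval (0, 1 - gamma) for the smallest length."""
    best, best_value = None, float("inf")
    for k in range(1, steps):
        alpha = (1.0 - gamma) * k / steps
        value = length(alpha, gamma)
        if value < best_value:
            best, best_value = alpha, value
    return best

gamma = 0.2
alpha_tau = best_alpha(tau_length, gamma)
alpha_mu = best_alpha(mu_length, gamma)
tau_a, tau_b = tau_interval(alpha_tau, gamma)
mu_a, mu_b = (math.log(v) for v in tau_interval(alpha_mu, gamma))
print(alpha_tau, (tau_a, tau_b))
print(alpha_mu, (mu_a, mu_b))
```

The two optimal values of $\alpha$ differ, so the shortest interval in $\mu$ is not the image of the shortest interval in $\tau$.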
Example 5 (Hypothesis testing): $H_0: \tau = \tau_1$. $H_0$ is rejected at confidence (significance) level $\gamma$ if $\tau_1$ is outside the shortest confidence interval $(\tau_a, \tau_b)$ whose coverage is $\gamma$.
The choice of parametrization depends on what you want, accept or reject $H_0$ (ideology!).
- V. Hypothesis testing:
- 1. Basic definitions,
- 2. Errors of the first and the second kind,
- 3. Neyman-Pearson Lemma,
- 4. Uniformly most powerful tests.
- 1. Basic definitions.
- H. Frank, S.C. Althoen (1994), Statistics, Chaps. 9-11, pp. 326-480.
Inference: what is the value of parameter $\theta$?
Test of a hypothesis: is the value of parameter $\theta$ $\theta_0$ or $\theta_1$?
Test (null) hypothesis: $H_0: \theta = \theta_0$.
Alternative hypothesis: $H_1: \theta = \theta_1$ ($\theta \ne \theta_0$, $\theta > \theta_0$).
Test statistic: $W = W(x_1, x_2, \ldots)$.
Test $W$: a numerical index that is expected to take the value $w_0$ if $H_0$ is correct and some other value if $H_1$ is correct.
Rejection (critical) region $R_C$: the region of the values of a test $W$ that are unlikely (??) if $H_0$ is correct but relatively likely (??) if $H_1$ is correct.
Critical value (significance level, size of $R_C$): $\alpha \equiv \Pr_I(R_C | \theta_0)$.
Confidence level: $Cl \equiv 1 - \alpha$.
Decision: if $w \in R_C$, abandon $H_0$ in favor of $H_1$ at confidence level $Cl$; if $w \notin R_C$, $H_1$ is rejected.
- 2. Errors of the first and second kind.
Error I (false positive): rejecting $H_0$ when it is correct.
Error II (false negative): rejecting $H_1$ when it is correct (i.e., fail to reject $H_0$ when it is indeed false).
$P(\text{Error I}) = \alpha$, $P(\text{Error II}) \equiv \beta$.
Power of the test: $1 - \beta$ (probability that a test will reject a false $H_0$).
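A small worked case may make $\alpha$, $\beta$ and the power concrete. The sketch assumes $n$ i.i.d. measurements from $N(\mu, 1)$, the hypotheses $H_0: \mu = 0$ and $H_1: \mu = 0.5$, and a test that rejects $H_0$ when the sample mean exceeds a critical value; the numbers ($n = 25$, $\alpha = 0.05$) and function names are illustrative. The sample mean is simulated directly from $N(\mu, 1/n)$.

```python
import math
import random
from statistics import NormalDist

def z_test_power(mu0, mu1, n, alpha):
    """Critical sample mean for size alpha, and the power against mu1 > mu0 (sigma = 1)."""
    critical_mean = mu0 + NormalDist().inv_cdf(1.0 - alpha) / math.sqrt(n)
    beta = NormalDist(mu1, 1.0 / math.sqrt(n)).cdf(critical_mean)   # P(Error II)
    return critical_mean, 1.0 - beta

def rejection_rate(mu_true, critical_mean, n, trials=50_000, seed=17):
    """Simulated fraction of experiments in which H0 is rejected."""
    rng = random.Random(seed)
    sigma_mean = 1.0 / math.sqrt(n)
    rejected = sum(rng.gauss(mu_true, sigma_mean) > critical_mean for _ in range(trials))
    return rejected / trials

critical_mean, power = z_test_power(0.0, 0.5, 25, 0.05)
type_one_rate = rejection_rate(0.0, critical_mean, 25)     # estimates alpha
simulated_power = rejection_rate(0.5, critical_mean, 25)   # estimates 1 - beta
print(critical_mean, power, type_one_rate, simulated_power)
```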
- 3. Neyman-Pearson Lemma.
A best critical region $R_C$ of size $\alpha$: $P(R_C | H_0) = \alpha$ and $P(R_C | H_1) \ge P(Q_C | H_1)$ for all $Q_C$ for which $P(Q_C | H_0) = \alpha$.
Lemma 1 (Neyman-Pearson). $H_0: \theta = \theta_0$, $H_1: \theta = \theta_1$, likelihood $L(x_1, \ldots, x_n; \theta)$:
$$W(x_1, \ldots, x_n; \theta_0, \theta_1) = \frac{L(x_1, \ldots, x_n; \theta_0)}{L(x_1, \ldots, x_n; \theta_1)}, \qquad R_C(\eta) = \{(x_1, \ldots, x_n):\ W(x_1, \ldots, x_n; \theta_0, \theta_1) \le \eta\}$$
$\Rightarrow$ $R_C(\eta)$ is a best critical region of size $\alpha = P(R_C | H_0)$, i.e., $W(x_1, \ldots, x_n; \theta_0, \theta_1)$ is a most powerful test of size $\alpha$.
- 4. Uniformly most powerful test.
Uniformly most powerful test of size $\alpha$: $H_0: \theta = \theta_0$, $H_1: \theta \in A$ ($\theta_0 \notin A$); $W(x_1, \ldots, x_n; \theta_0, \theta_1)$ most powerful for all $\theta_1 \in A$ $\Rightarrow$ $W(x_1, \ldots, x_n; \theta_0, \theta_1)$ is a uniformly most powerful test of size $\alpha$ for alternatives in $A$.
Example 6 (UMPT for Normal distr.): $\{X_1, \ldots, X_n\}$ i.i.d., $X_i \sim N(\mu, 1)$, $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$ $\Rightarrow$ $W = L(x_1, \ldots, x_n; \mu_0) / L(x_1, \ldots, x_n; \mu_1)$, $\mu_1 > \mu_0$, is UMPT.
Example 7: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$. There is no UMPT.
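A short derivation, added here as a sketch, may explain why Example 6 has a UMPT and Example 7 does not. It assumes unit variance as in Example 6 and writes $z_{1-\alpha}$ for the $(1-\alpha)$ quantile of the standard Normal distribution, a symbol not used on the slides.

```latex
\begin{align*}
W &= \frac{L(x_1,\ldots,x_n;\mu_0)}{L(x_1,\ldots,x_n;\mu_1)}
   = \exp\Big\{-\tfrac{1}{2}\sum_{i=1}^{n}(x_i-\mu_0)^2
               +\tfrac{1}{2}\sum_{i=1}^{n}(x_i-\mu_1)^2\Big\} \\
  &= \exp\Big\{-n(\mu_1-\mu_0)\Big(\bar{x}_n-\tfrac{\mu_0+\mu_1}{2}\Big)\Big\}, \\
W \le \eta \ &\Leftrightarrow\ \bar{x}_n \ge \tfrac{\mu_0+\mu_1}{2}
   - \frac{\ln\eta}{n(\mu_1-\mu_0)} \equiv c \qquad (\mu_1 > \mu_0), \\
\alpha &= P(\bar{x}_n \ge c \,|\, H_0)
   \ \Rightarrow\ c = \mu_0 + \frac{z_{1-\alpha}}{\sqrt{n}} .
\end{align*}
```

The critical region $\bar{x}_n \ge c$ does not depend on $\mu_1$ as long as $\mu_1 > \mu_0$, so the same region is best for every alternative in $H_1: \mu > \mu_0$. For $\mu_1 < \mu_0$ the inequality reverses and the best region becomes $\bar{x}_n \le c'$, so no single region is best for all alternatives in $H_1: \mu \ne \mu_0$.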