
Likelihood Methods of Inference



  1. Likelihood Methods of Inference
     Toss a coin 6 times and get Heads twice. Let $p$ be the probability of getting H. The probability of getting exactly 2 heads is $15 p^2 (1 - p)^4$. This function of $p$ is the likelihood function.
     Definition: The likelihood function is the map $L$ with domain $\Theta$ and values given by $L(\theta) = f_\theta(X)$.
     Key Point: think about how the density depends on $\theta$, not about how it depends on $X$.
     Notice: $X$, the observed value of the data, has been plugged into the formula for the density.
     Notice: the coin tossing example uses the discrete density for $f$.
     We use likelihood for most inference problems:

  2. 1. Point estimation: we must compute an estimate $\hat\theta = \hat\theta(X)$ which lies in $\Theta$. The maximum likelihood estimate (MLE) of $\theta$ is the value $\hat\theta$ which maximizes $L(\theta)$ over $\theta \in \Theta$, if such a $\hat\theta$ exists.
     2. Point estimation of a function of $\theta$: we must compute an estimate $\hat\phi = \hat\phi(X)$ of $\phi = g(\theta)$. We use $\hat\phi = g(\hat\theta)$ where $\hat\theta$ is the MLE of $\theta$.
     3. Interval (or set) estimation: we must compute a set $C = C(X)$ in $\Theta$ which we think will contain $\theta_0$. We will use $\{\theta \in \Theta : L(\theta) > c\}$ for a suitable $c$.
     4. Hypothesis testing: decide whether or not $\theta_0 \in \Theta_0$ where $\Theta_0 \subset \Theta$. We base our decision on the likelihood ratio
        $$\frac{\sup\{L(\theta);\ \theta \in \Theta_0\}}{\sup\{L(\theta);\ \theta \in \Theta \setminus \Theta_0\}}$$
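As a rough illustration of items 1 and 3 on the coin tossing example, the sketch below evaluates $L(p) = 15 p^2 (1-p)^4$ on a grid, picks the grid point with the largest likelihood, and collects the set $\{p : L(p) > c\}$. The grid resolution and the cutoff $c = 0.15\,L(\hat p)$ are assumptions made only for this example; the slides do not say how $c$ should be chosen.

```python
from math import comb

def likelihood(p, n=6, heads=2):
    """Binomial likelihood L(p) = C(n, heads) * p^heads * (1 - p)^(n - heads)."""
    return comb(n, heads) * p**heads * (1 - p)**(n - heads)

# Evaluate L on a grid of p values in [0, 1] (grid spacing is an arbitrary choice).
grid = [i / 1000 for i in range(1001)]
values = [likelihood(p) for p in grid]

# Point estimate: the grid point with the largest likelihood.
p_hat = grid[values.index(max(values))]

# Set estimate: all p whose likelihood exceeds c.
# The cutoff c = 0.15 * L(p_hat) is an illustrative assumption, not from the slides.
c = 0.15 * likelihood(p_hat)
region = [p for p in grid if likelihood(p) > c]

print(f"p_hat ~ {p_hat:.3f}")
print(f"likelihood region ~ [{region[0]:.3f}, {region[-1]:.3f}]")
```

The grid maximizer should land next to $2/6$, the observed proportion of heads, which is what setting the derivative of $L$ to zero gives for this example.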

  3. Maximum Likelihood Estimation
     To find the MLE, maximize $L$. This is a typical function maximization problem:
     • Set the gradient of $L$ equal to 0.
     • Check that the root is a maximum, not a minimum or saddle point.
     Examine some likelihood plots in examples:
     Cauchy Data: iid sample $X_1, \ldots, X_n$ from the Cauchy($\theta$) density
     $$f(x; \theta) = \frac{1}{\pi (1 + (x - \theta)^2)}$$
     The likelihood function is
     $$L(\theta) = \prod_{i=1}^{n} \frac{1}{\pi (1 + (X_i - \theta)^2)}$$
     [Examine likelihood plots.]

  4. [Figure: six panels, each titled "Likelihood Function: Cauchy, n=5". Vertical axis: Likelihood, 0 to 1. Horizontal axis: Theta, -10 to 10.]

  5. [Figure: six panels, each titled "Likelihood Function: Cauchy, n=5". Vertical axis: Likelihood, 0 to 1. Horizontal axis: Theta, -2 to 2.]

  6. [Figure: six panels, each titled "Likelihood Function: Cauchy, n=25". Vertical axis: Likelihood, 0 to 1. Horizontal axis: Theta, -10 to 10.]

  7. [Figure: six panels, each titled "Likelihood Function: Cauchy, n=25". Vertical axis: Likelihood, 0 to 1. Horizontal axis: Theta, -1 to 1.]

  8. I want you to notice the following points:
     • The likelihood functions have peaks near the true value of $\theta$ (which is 0 for the data sets I generated).
     • The peaks are narrower for the larger sample size.
     • The peaks have a more regular shape for the larger value of $n$.
     • I actually plotted $L(\theta)/L(\hat\theta)$, which has exactly the same shape as $L$ but runs from 0 to 1 on the vertical scale.
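The rescaled curve $L(\theta)/L(\hat\theta)$ can be reproduced numerically along the following lines. This is only a sketch: the seed, the grid, and the use of the grid maximum as a stand-in for $L(\hat\theta)$ are my assumptions, and the simulated samples are not the data sets behind the original plots.

```python
import math
import random

def cauchy_likelihood(theta, xs):
    """L(theta) = prod_i 1 / (pi * (1 + (x_i - theta)^2))."""
    return math.prod(1.0 / (math.pi * (1.0 + (x - theta) ** 2)) for x in xs)

def relative_likelihood(xs, grid):
    """Return L(theta) / max L over the grid, so values run from 0 to 1."""
    values = [cauchy_likelihood(t, xs) for t in grid]
    peak = max(values)  # grid maximum, used in place of L(theta_hat)
    return [v / peak for v in values]

rng = random.Random(0)  # arbitrary seed, for reproducibility only
grid = [-10 + 0.01 * i for i in range(2001)]

for n in (5, 25):
    # Inverse-CDF draw from the Cauchy density centred at theta = 0.
    xs = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
    rel = relative_likelihood(xs, grid)
    theta_peak = grid[rel.index(1.0)]
    # Total length of the grid region where the ratio exceeds 1/2.
    width = 0.01 * sum(r > 0.5 for r in rel)
    print(f"n={n}: peak near {theta_peak:.2f}, length above half height ~ {width:.2f}")
```

Comparing the printed lengths for the two sample sizes gives a crude numerical check on the "narrower peaks" observation, though any single pair of simulated samples can behave atypically.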

  9. To maximize this likelihood: differentiate $L$ and set the result equal to 0. Notice that $L$ is a product of $n$ terms; the derivative is
     $$\sum_{i=1}^{n} \left[ \prod_{j \neq i} \frac{1}{\pi (1 + (X_j - \theta)^2)} \right] \frac{2 (X_i - \theta)}{\pi (1 + (X_i - \theta)^2)^2}$$
     which is quite unpleasant. It is much easier to work with the logarithm of $L$: the log of a product is a sum, and the logarithm is monotone increasing.
     Definition: The Log Likelihood function is $\ell(\theta) = \log\{L(\theta)\}$.
     For the Cauchy problem we have
     $$\ell(\theta) = -\sum_{i=1}^{n} \log(1 + (X_i - \theta)^2) - n \log(\pi)$$
     [Examine log likelihood plots.]
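A small sketch of why the log likelihood is the more convenient object. The data values and the grid are hypothetical, chosen only for illustration. The code checks that $\ell$ agrees with $\log L$, that both peak at the same grid point, and that the raw product underflows in floating point for a large sample while the sum stays finite.

```python
import math

def likelihood(theta, xs):
    """L(theta) = prod_i 1 / (pi * (1 + (x_i - theta)^2))."""
    return math.prod(1.0 / (math.pi * (1.0 + (x - theta) ** 2)) for x in xs)

def log_likelihood(theta, xs):
    """ell(theta) = -sum_i log(1 + (x_i - theta)^2) - n log(pi)."""
    return -sum(math.log1p((x - theta) ** 2) for x in xs) - len(xs) * math.log(math.pi)

xs = [-1.2, -0.3, 0.1, 0.4, 2.5]  # hypothetical data, for illustration only
grid = [-5 + 0.01 * i for i in range(1001)]

# The two functions agree after taking logs...
assert math.isclose(math.log(likelihood(0.2, xs)), log_likelihood(0.2, xs))

# ...and, because log is increasing, they should peak at the same grid point.
argmax_L = max(grid, key=lambda t: likelihood(t, xs))
argmax_ell = max(grid, key=lambda t: log_likelihood(t, xs))
print(argmax_L, argmax_ell)

# With many observations the product underflows to 0.0; the sum stays finite.
big = xs * 200  # 1000 observations, again purely illustrative
print(likelihood(0.2, big), log_likelihood(0.2, big))
```

The underflow is a numerical point rather than one made on the slides, but it is a second practical reason, beside the simpler derivative, to work with $\ell$.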

  10. [Figure: six panels, each titled "Likelihood Ratio Intervals: Cauchy, n=5". Vertical axis: Log Likelihood. Horizontal axis: Theta, -10 to 10.]

  11. [Figure: six panels, each titled "Likelihood Ratio Intervals: Cauchy, n=5". Vertical axis: Log Likelihood. Horizontal axis: Theta, -2 to 2.]

  12. [Figure: six panels, each titled "Likelihood Ratio Intervals: Cauchy, n=25". Vertical axis: Log Likelihood. Horizontal axis: Theta, -10 to 10.]

  13. [Figure: six panels, each titled "Likelihood Ratio Intervals: Cauchy, n=25". Vertical axis: Log Likelihood. Horizontal axis: Theta, -1 to 1.]

  14. Notice the following points:
      • Plots of $\ell$ for $n = 25$ are quite smooth and rather parabolic.
      • For $n = 5$ there are many local maxima and minima of $\ell$.
      The likelihood tends to 0 as $|\theta| \to \infty$, so the maximum of $\ell$ occurs at a root of $\ell'$, the derivative of $\ell$ with respect to $\theta$.
      Definition: The Score Function is the gradient of $\ell$:
      $$U(\theta) = \frac{\partial \ell}{\partial \theta}$$
      The MLE $\hat\theta$ is usually a root of the Likelihood Equations
      $$U(\theta) = 0$$
      In our Cauchy example we find
      $$U(\theta) = \sum_{i=1}^{n} \frac{2 (X_i - \theta)}{1 + (X_i - \theta)^2}$$
      [Examine plots of score functions.]
      Notice: there are often multiple roots of the likelihood equations.
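To see multiple roots of the likelihood equations concretely, the sketch below scans the Cauchy score function for sign changes on a grid and refines each bracket by bisection. Bisection is my choice for the illustration, not a method named on the slides, and the five data values are hypothetical, picked so that two outlying points create extra local maxima. The MLE is then taken as the root with the largest log likelihood.

```python
import math

def score(theta, xs):
    """U(theta) = sum_i 2 (x_i - theta) / (1 + (x_i - theta)^2)."""
    return sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in xs)

def log_likelihood(theta, xs):
    """ell(theta) = -sum_i log(1 + (x_i - theta)^2) - n log(pi)."""
    return -sum(math.log1p((x - theta) ** 2) for x in xs) - len(xs) * math.log(math.pi)

def bisect_root(f, lo, hi, tol=1e-10):
    """Bisection on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xs = [-8.0, -0.5, 0.0, 0.7, 9.0]  # hypothetical sample with two outlying points
grid = [-15 + 0.05 * i for i in range(601)]

# Bracket every sign change of U on the grid, then refine each bracket.
roots = [
    bisect_root(lambda t: score(t, xs), a, b)
    for a, b in zip(grid, grid[1:])
    if (score(a, xs) > 0) != (score(b, xs) > 0)
]

# The MLE is the root with the largest log likelihood.
theta_hat = max(roots, key=lambda t: log_likelihood(t, xs))
print("roots of U:", [round(r, 3) for r in roots])
print("MLE ~", round(theta_hat, 3))
```

Roots alternate between local maxima and local minima of $\ell$, so comparing $\ell$ across all of them (or checking the sign of $U'$) is what separates the MLE from the other solutions.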
