

  1. Lecture 23: How to find estimators §6.2

  2. We have been discussing the problem of estimating an unknown parameter θ in a probability distribution when we are given a sample x_1, x_2, ..., x_n from that distribution. We introduced two examples. Use the sample mean x̄ = (x_1 + ... + x_n)/n to estimate the population mean µ; X̄ is an unbiased estimator of µ.

  3. Also we had the more subtle problem of estimating B in U(0, B): W = ((n+1)/n) max(x_1, x_2, ..., x_n) is an unbiased estimator of θ = B. We discussed two desirable properties of estimators: (i) unbiased, (ii) minimum variance.

  4. Now the general problem. Given a sample x_1, x_2, ..., x_n, how do you find an estimator θ̂ = h(x_1, x_2, ..., x_n) for θ? There are two methods: (i) the method of moments, (ii) the method of maximum likelihood.

  5. The Method of Moments. Definition 1: Let k be a non-negative integer and X a random variable. Then the k-th moment m_k(X) of X is given by m_k(X) = E(X^k), k ≥ 0, so m_0(X) = 1, m_1(X) = E(X) = µ, m_2(X) = E(X²) = σ² + µ². Definition 2: Let x_1, x_2, ..., x_n be a sample from X. Then the k-th sample moment S_k is S_k = (1/n) Σ_{i=1}^{n} x_i^k, so S_1 = x̄.
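The two definitions are easy to see numerically. The following is a minimal sketch (not part of the slides; the helper name sample_moment and the parameter values are my own) that computes S_k for a simulated sample and compares it with the corresponding population moment.

```python
import numpy as np

def sample_moment(x, k):
    """k-th sample moment S_k = (1/n) * sum of x_i**k."""
    x = np.asarray(x, dtype=float)
    return np.mean(x ** k)

# Sample from N(mu = 2, sigma = 3): S_1 should be near m_1 = mu = 2,
# and S_2 should be near m_2 = sigma^2 + mu^2 = 9 + 4 = 13.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)
print(sample_moment(x, 1))  # approximately 2
print(sample_moment(x, 2))  # approximately 13
```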

  6. Key Point: the k-th moment m_k(X) (the k-th population moment) depends on θ, whereas the k-th sample moment does not; it is just the average of the k-th powers of the x's. The method of moments says: (i) equate the k-th population moment m_k(X) to the k-th sample moment S_k; (ii) solve the resulting system of equations for θ.

  7. (∗) m_k(X) = S_k, 1 ≤ k < ∞. We will denote the answer by θ̂_mme. Example 1: Estimating p in a Bernoulli distribution. The first population moment m_1(X) is the mean E(X) = p = θ. The first sample moment S_1 is the sample mean, so looking at the first equation of (∗), m_1(X) = S_1, so p = x̄ gives us the sample mean as an estimator for p.

  8. Example 1 (Cont.) Recall that because the x's are all either 1 or 0, x_1 + ... + x_n = # of successes and x̄ = (# of successes)/n = the sample proportion, so p̂_mme = X̄. Example 2: The method of moments works well when you have several unknown parameters. Suppose we want to estimate both the mean µ and the variance σ² of a normal distribution (or any distribution), X ∼ N(µ, σ²).
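As a concrete check of Example 1 (a sketch with made-up 0/1 data, not from the slides), the method-of-moments estimate is just the sample proportion:

```python
import numpy as np

# Hypothetical Bernoulli sample: 1 = success, 0 = failure.
x = np.array([1, 0, 1, 1, 0, 1, 0, 1])

p_mme = x.mean()   # first sample moment S_1 = xbar = sample proportion
print(p_mme)       # 5 successes out of 8 gives 0.625
```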

  9. Example 2 (Cont.) We equate the first two population moments to the first two sample moments: m_1(X) = S_1 and m_2(X) = S_2, so µ = X̄ and σ² + µ² = (1/n) Σ_{i=1}^{n} x_i². Solving (we get µ for free, µ̂_mme = X̄): σ² = (1/n) Σ_{i=1}^{n} X_i² − µ² = (1/n) Σ_{i=1}^{n} X_i² − (Σ X_i / n)² = (1/n) (Σ_{i=1}^{n} X_i² − (1/n)(Σ X_i)²).

  10. Example 2 (Cont.) So σ̂²_mme = (1/n) (Σ X_i² − (Σ X_i)²/n). Actually the best estimator for σ² is the sample variance S² = (1/(n − 1)) (Σ_{i=1}^{n} X_i² − (Σ X_i)²/n); σ̂²_mme is a biased estimator. Example 3: Estimating B in U(0, B). Recall that we came up with the unbiased estimator B̂ = ((n+1)/n) max(x_1, x_2, ..., x_n). Put w = max(x_1, ..., x_n).
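The bias remark can be seen side by side in code. This is a sketch under assumed values (true mean 5, true variance 4, n = 50); it computes the method-of-moments estimate, which divides by n, and the sample variance, which divides by n − 1.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=50)   # true sigma^2 = 4
n = x.size

mu_mme = x.mean()
# Method-of-moments estimator of sigma^2: divides by n (biased).
sigma2_mme = (np.sum(x**2) - np.sum(x)**2 / n) / n
# Sample variance: divides by n - 1 (unbiased).
s2 = (np.sum(x**2) - np.sum(x)**2 / n) / (n - 1)

print(mu_mme, sigma2_mme, s2)
# Equivalent library calls: np.var(x, ddof=0) and np.var(x, ddof=1).
```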

  11. What do we get from the Method of Moments? Here E(X) = (0 + B)/2 = B/2. So equating the first population moment m_1(X) = µ to the first sample moment S_1 = x̄, we get B/2 = x̄, so B = 2x̄ and B̂_mme = 2X̄. This is unbiased because E(X̄) = population mean = B/2, so E(2X̄) = B.
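A quick empirical check of the unbiasedness claim (a sketch under assumed values B = 10, n = 5, not from the lecture): averaging B̂_mme = 2X̄ over many simulated samples lands near B.

```python
import numpy as np

B, n, trials = 10.0, 5, 100_000
rng = np.random.default_rng(3)
samples = rng.uniform(0.0, B, size=(trials, n))

b_mme = 2 * samples.mean(axis=1)   # B_mme = 2 * xbar for each simulated sample
print(b_mme.mean())                # approximately 10, consistent with E(2*Xbar) = B
```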

  12. So we have a new unbiased estimator B̂_1 = B̂_mme = 2X̄. Recall the other was B̂_2 = ((n+1)/n) W, where W = max(X_1, ..., X_n). Which one is better? We will interpret this to mean "which one has the smaller variance?"

  13. V(B̂_1) = V(2X̄). Recall from the distribution handout that X ∼ U(A, B) ⇒ V(X) = (B − A)²/12. Now X ∼ U(0, B), so V(X) = B²/12. This is the population variance. We also know V(X̄) = σ²/n = (population variance)/n, so V(X̄) = B²/(12n). Then V(B̂_1) = V(2X̄) = 4B²/(12n) = B²/(3n).

  14. V(B̂_2) = V(((n+1)/n) max(X_1, ..., X_n)). We have W = max(X_1, X_2, ..., X_n). We have from Problem 32, pg. 252, E(W) = (n/(n+1)) B, and f_W(w) = n w^(n−1)/B^n for 0 ≤ w ≤ B, and 0 otherwise. Hence E(W²) = ∫_0^B w² · n w^(n−1)/B^n dw = (n/B^n) ∫_0^B w^(n+1) dw = (n/B^n) [w^(n+2)/(n+2)]_0^B = (n/(n+2)) B².

  15. Hence V(W) = E(W²) − E(W)² = (n/(n+2)) B² − ((n/(n+1)) B)² = B² (n/(n+2) − n²/(n+1)²) = B² (n(n+1)² − n²(n+2)) / ((n+1)²(n+2)) = B² (n³ + 2n² + n − n³ − 2n²) / ((n+1)²(n+2)) = (n/((n+1)²(n+2))) B². Then V(B̂_2) = V(((n+1)/n) W) = ((n+1)²/n²) V(W) = ((n+1)²/n²) · (n/((n+1)²(n+2))) B² = (1/(n(n+2))) B².

  16. B̂_2 is the winner because n(n+2) ≥ 3n for n ≥ 1, so B²/(n(n+2)) ≤ B²/(3n). If n = 1 they tie, but of course n ≫ 1 in practice, so B̂_2 is a lot better.
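The two variance formulas can be checked by simulation. This is a sketch under assumed values (B = 10, n = 20, 200,000 trials); the empirical variances of the two estimators should land near B²/(3n) and B²/(n(n+2)) respectively.

```python
import numpy as np

B, n, trials = 10.0, 20, 200_000
rng = np.random.default_rng(2)
x = rng.uniform(0.0, B, size=(trials, n))

b1 = 2 * x.mean(axis=1)            # B_1 = 2 * xbar
b2 = (n + 1) / n * x.max(axis=1)   # B_2 = ((n+1)/n) * max

print(b1.var(), B**2 / (3 * n))          # both approximately 1.67
print(b2.var(), B**2 / (n * (n + 2)))    # both approximately 0.23
```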

  17. The Method of Maximum Likelihood (a brilliant idea). Suppose we have an actual sample x_1, x_2, ..., x_n from a discrete random variable X whose pmf p_X(x, θ) depends on an unknown parameter θ. What is the probability P of getting the sample x_1, x_2, ..., x_n that we actually obtained? It is P(X_1 = x_1, X_2 = x_2, ..., X_n = x_n), which by independence = P(X_1 = x_1) P(X_2 = x_2) ··· P(X_n = x_n).

  18. But since X_1, X_2, ..., X_n are samples from X, they have the same pmf as X, so P(X_1 = x_1) = P(X = x_1) = p_X(x_1, θ), P(X_2 = x_2) = P(X = x_2) = p_X(x_2, θ), ..., P(X_n = x_n) = P(X = x_n) = p_X(x_n, θ). Hence P = p_X(x_1, θ) p_X(x_2, θ) ··· p_X(x_n, θ). P is a function of θ; it is called the likelihood function and denoted L(θ). It is the likelihood of getting the sample we actually obtained.

  19. Note, θ is unknown but x_1, x_2, ..., x_n are known (given). So what is the best guess for θ? The number that maximizes the probability of getting the sample we actually observed. This is the value of θ that is most compatible with the observed data. Bottom Line: find the value of θ that maximizes the likelihood function L(θ). This is the "method of maximum likelihood".
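A minimal numerical sketch of this bottom line (my own illustration, not the lecture's method), anticipating Example 1 below: for made-up Bernoulli data, evaluate L(θ) on a grid and take the maximizer.

```python
import numpy as np

# Hypothetical 0/1 sample for illustration.
x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])

def likelihood(theta, x):
    """L(theta) = product of p_X(x_i, theta) = theta**x_i * (1 - theta)**(1 - x_i)."""
    return np.prod(theta ** x * (1 - theta) ** (1 - x))

thetas = np.linspace(0.001, 0.999, 999)
L = np.array([likelihood(t, x) for t in thetas])
print(thetas[np.argmax(L)])   # close to 0.7, the sample mean of this data
```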

  20. The resulting estimator will be called the maximum likelihood estimator, abbreviated mle and denoted θ̂_mle. Remark (we will be lazy): in doing problems, following the text, we won't really maximize L(θ); we will just find a critical point of L(θ), i.e. a point where L′(θ) is zero. Later in your career, if you have to do this, you should check that the critical point is indeed a maximum.

  21. Examples. 1. The mle for p in Bin(1, p). X ∼ Bin(1, p) means the pmf of X is P(X = 0) = 1 − p, P(X = 1) = p. There is a simple formula for this: p_X(x) = p^x (1 − p)^(1−x), x = 0, 1. Now since p is our unknown parameter θ, we write p_X(x, θ) = θ^x (1 − θ)^(1−x), x = 0, 1, so p_X(x_1, θ) = θ^(x_1) (1 − θ)^(1−x_1), ..., p_X(x_n, θ) = θ^(x_n) (1 − θ)^(1−x_n).

  22. Hence L(θ) = p_X(x_1, θ) ··· p_X(x_n, θ), and so L(θ) = θ^(x_1) (1 − θ)^(1−x_1) θ^(x_2) (1 − θ)^(1−x_2) ··· θ^(x_n) (1 − θ)^(1−x_n), a positive number. Now we want to: (∗) 1. compute L′(θ); 2. set L′(θ) = 0 and solve for θ in terms of x_1, x_2, ..., x_n. We can make things much simpler by using the following trick. Suppose f(x) is a real-valued function that takes only positive values. Put h(x) = ln f(x).

  23. So the critical points of h are the same points as those of f: h′(x) = 0 ⇔ f′(x)/f(x) = 0 ⇔ f′(x) = 0. Also, h takes a maximum value at x∗ ⇔ f takes a maximum value at x∗. This is because ln is an increasing function, so it preserves order relations (a < b ⇔ ln a < ln b, where we assume a > 0 and b > 0). Bottom Line: change (∗) to (∗∗).
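A short check of the trick (my own sketch, using the same made-up data as the likelihood sketch above, not from the slides): maximizing ln L(θ) picks out the same θ as maximizing L(θ), and for Bernoulli data the maximizer matches the sample mean.

```python
import numpy as np

# Same hypothetical 0/1 sample as before.
x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])

def log_likelihood(theta, x):
    """ln L(theta) = (sum x_i) ln(theta) + (n - sum x_i) ln(1 - theta)."""
    s = x.sum()
    return s * np.log(theta) + (x.size - s) * np.log(1 - theta)

thetas = np.linspace(0.001, 0.999, 999)
ll = log_likelihood(thetas, x)          # vectorized over the grid of theta values
print(thetas[np.argmax(ll)], x.mean())  # both about 0.7: same maximizer as L(theta)
```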
