SLIDE 1

Lecture 23: How to find estimators §6.2

SLIDE 2

We have been discussing the problem of estimating an unknown parameter θ in a probability distribution when we are given a sample x_1, x_2, . . . , x_n from that distribution. We introduced two examples.

Use the sample mean x̄ = (x_1 + · · · + x_n)/n to estimate the population mean µ. X̄ is an unbiased estimator of µ.

SLIDE 3

Also we had the more subtle problem of estimating B in U(0, B):

$$\hat{B} = \frac{n+1}{n}\,\max(x_1, x_2, \ldots, x_n)$$

is an unbiased estimator of θ = B. We discussed two desirable properties of estimators: (i) unbiased, (ii) minimum variance.

SLIDE 4

The general problem. Given a sample x_1, x_2, . . . , x_n, how do you find an estimator θ̂ = h(x_1, x_2, . . . , x_n) for θ?

There are two methods:
(i) the method of moments,
(ii) the method of maximum likelihood.

SLIDE 5

The Method of Moments

Definition 1. Let k be a non-negative integer and X be a random variable. Then the k-th moment m_k(X) of X is given by

$$m_k(X) = E(X^k), \quad k \ge 0,$$

so m_0(X) = 1, m_1(X) = E(X) = µ, and m_2(X) = E(X²) = σ² + µ².

Definition 2. Let x_1, x_2, . . . , x_n be a sample from X. Then the k-th sample moment S_k is

$$S_k = \frac{1}{n}\sum_{i=1}^{n} x_i^k, \quad \text{so } S_1 = \bar{x}.$$
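To see the definition in action, here is a minimal Python sketch (not part of the original slides; the data values are made up) that computes the k-th sample moment of a data set.

```python
def sample_moment(xs, k):
    """k-th sample moment S_k = (1/n) * sum(x_i**k)."""
    n = len(xs)
    return sum(x ** k for x in xs) / n

data = [1.2, 0.7, 2.4, 1.9, 0.3]       # hypothetical sample
print(sample_moment(data, 1))           # S_1 = sample mean x-bar
print(sample_moment(data, 2))           # S_2 = average of the squares
```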

SLIDE 6

Key Point

The k-th moment m_k(X) (the k-th population moment) depends on θ, whereas the k-th sample moment does not: it is just the average of the k-th powers of the x's. The method of moments says:
(i) Equate the k-th population moment m_k(X) to the k-th sample moment S_k.
(ii) Solve the resulting system of equations for θ.

SLIDE 7

$$(\ast)\qquad m_k(X) = S_k, \quad 1 \le k < \infty.$$

We will denote the answer by θ̂_mme.

Example 1. Estimating p in a Bernoulli distribution. The first population moment m_1(X) is the mean E(X) = p = θ. The first sample moment S_1 is the sample mean, so looking at the first equation of (∗),

$$m_1(X) = S_1 \quad\text{so}\quad p = \bar{x},$$

which gives us the sample mean as an estimator for p.
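As a quick sanity check, here is a minimal Python sketch (an added illustration with a hypothetical 0/1 sample, not from the text) of this method-of-moments estimate.

```python
# Method-of-moments estimate of p for Bernoulli data:
# equate m_1(X) = p to S_1 = x-bar.
sample = [1, 0, 1, 1, 0, 1, 0, 1]      # hypothetical 0/1 observations

p_mme = sum(sample) / len(sample)       # sample mean = sample proportion
print(p_mme)                            # 0.625 for this sample
```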

SLIDE 8

Example 1 (cont.) Recall that because the x's are all either 1 or 0,

x_1 + . . . + x_n = # of successes and x̄ = (# of successes)/n = the sample proportion,

so p̂_mme = X̄.

Example 2. The method of moments works well when you have several unknown parameters. Suppose we want to estimate both the mean µ and the variance σ² from a normal distribution (or any distribution), X ∼ N(µ, σ²).

SLIDE 9

Example 2 (cont.) We equate the first two population moments to the first two sample moments:

$$m_1(X) = S_1, \qquad m_2(X) = S_2,$$

so

$$\mu = \bar{X}, \qquad \sigma^2 + \mu^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2.$$

Solving (we get µ for free: µ̂_mme = X̄),

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2
= \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \left(\frac{\sum X_i}{n}\right)^2
= \frac{1}{n}\left[\sum_{i=1}^{n} X_i^2 - \frac{(\sum X_i)^2}{n}\right].$$

SLIDE 10

Example 2 (cont.) So

$$\hat{\sigma}^2_{\mathrm{mme}} = \frac{1}{n}\left[\sum X_i^2 - \frac{(\sum X_i)^2}{n}\right].$$

Actually the best estimator for σ² is the sample variance

$$S^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} X_i^2 - \frac{(\sum x_i)^2}{n}\right];$$

σ̂²_mme is a biased estimator.
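To make the 1/n versus 1/(n − 1) distinction concrete, here is a small Python sketch (an added illustration with made-up data, not from the text) computing both estimates on one sample.

```python
# Method-of-moments estimates of mu and sigma^2, compared with the
# sample variance that divides by n - 1 (the unbiased estimator).
data = [4.8, 5.6, 4.1, 5.9, 5.2, 4.7]   # hypothetical sample

n = len(data)
mu_hat = sum(data) / n                                        # mu-hat (mme) = x-bar
var_mme = sum(x ** 2 for x in data) / n - mu_hat ** 2         # divides by n (biased)
var_sample = sum((x - mu_hat) ** 2 for x in data) / (n - 1)   # divides by n - 1

print(mu_hat, var_mme, var_sample)
```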

Example 3. Estimating B in U(0, B). Recall that we came up with the unbiased estimator

$$\hat{B} = \frac{n+1}{n}\,\max(x_1, x_2, \ldots, x_n).$$

Put W = max(x_1, . . . , x_n).

SLIDE 11

What do we get from the method of moments? We have

$$E(X) = \frac{0 + B}{2} = \frac{B}{2}.$$

So equating the first population moment m_1(X) = µ to the first sample moment S_1 = x̄, we get B/2 = x̄, so B = 2x̄ and B̂_mme = 2X̄. This is unbiased because E(X̄) = population mean = B/2, so E(2X̄) = B.

SLIDE 12

So we have a new unbiased estimator

$$\hat{B}_1 = \hat{B}_{\mathrm{mme}} = 2\bar{X}.$$

Recall the other was

$$\hat{B}_2 = \frac{n+1}{n}\,W \quad\text{where}\quad W = \max(X_1, \ldots, X_n).$$

Which one is better? We will interpret this to mean: which one has the smaller variance?

SLIDE 13

V(B̂_1) = V(2X̄). Recall from the distribution handout that

$$X \sim U(A, B) \;\Rightarrow\; V(X) = \frac{(B-A)^2}{12}.$$

Now X ∼ U(0, B), so V(X) = B²/12. This is the population variance. We also know

$$V(\bar{X}) = \frac{\sigma^2}{n} = \frac{\text{population variance}}{n}, \quad\text{so}\quad V(\bar{X}) = \frac{B^2}{12n}.$$

Then

$$V(\hat{B}_1) = V(2\bar{X}) = 4\,\frac{B^2}{12n} = \frac{B^2}{3n}.$$

SLIDE 14

$$V(\hat{B}_2) = V\!\left(\frac{n+1}{n}\,\max(X_1, \ldots, X_n)\right)$$

We have W = max(X_1, X_2, . . . , X_n). From Problem 32, pg. 252,

$$E(W) = \frac{n}{n+1}B \quad\text{and}\quad f_W(w) = \begin{cases} \dfrac{n w^{n-1}}{B^n}, & 0 \le w \le B \\[4pt] 0, & \text{otherwise.} \end{cases}$$

Hence

$$E(W^2) = \int_0^B w^2\,\frac{n w^{n-1}}{B^n}\,dw = \frac{n}{B^n}\int_0^B w^{n+1}\,dw = \frac{n}{B^n}\left[\frac{w^{n+2}}{n+2}\right]_{w=0}^{w=B} = \frac{n}{n+2}B^2.$$

SLIDE 15

Hence

$$V(W) = E(W^2) - E(W)^2 = \frac{n}{n+2}B^2 - \left(\frac{n}{n+1}B\right)^2
= B^2\left[\frac{n}{n+2} - \frac{n^2}{(n+1)^2}\right]
= B^2\,\frac{n(n+1)^2 - n^2(n+2)}{(n+1)^2(n+2)}
= B^2\,\frac{n^3 + 2n^2 + n - n^3 - 2n^2}{(n+1)^2(n+2)}
= \frac{n}{(n+1)^2(n+2)}\,B^2.$$

Therefore

$$V(\hat{B}_2) = V\!\left(\frac{n+1}{n}\,W\right) = \frac{(n+1)^2}{n^2}\,V(W)
= \frac{(n+1)^2}{n^2}\cdot\frac{n}{(n+1)^2(n+2)}\,B^2
= \frac{1}{n(n+2)}\,B^2.$$

SLIDE 16

B̂_2 is the winner: comparing V(B̂_1) = B²/(3n) with V(B̂_2) = B²/(n(n + 2)), we have n + 2 ≥ 3 whenever n ≥ 1, so V(B̂_2) ≤ V(B̂_1). If n = 1 they tie, but of course in practice n ≫ 1, so B̂_2 is a lot better.
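The following Python sketch (an illustration added here, not from the slides; B, n, and the trial count are arbitrary choices) simulates both estimators for U(0, B) and compares their empirical variances with the formulas above.

```python
import random

# Compare the two unbiased estimators of B in U(0, B) by simulation:
# B1_hat = 2 * x-bar   and   B2_hat = (n + 1)/n * max(sample).
B, n, trials = 10.0, 20, 5000            # hypothetical settings
b1_vals, b2_vals = [], []

for _ in range(trials):
    sample = [random.uniform(0, B) for _ in range(n)]
    b1_vals.append(2 * sum(sample) / n)
    b2_vals.append((n + 1) / n * max(sample))

def var(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

print("V(B1_hat) ~", var(b1_vals), "theory:", B ** 2 / (3 * n))
print("V(B2_hat) ~", var(b2_vals), "theory:", B ** 2 / (n * (n + 2)))
```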

SLIDE 17

The Method of Maximum Likelihood (a brilliant idea)

Suppose we have an actual sample x_1, x_2, . . . , x_n from a discrete random variable X whose pmf p_X(x, θ) depends on an unknown parameter θ. What is the probability P of getting the sample x_1, x_2, . . . , x_n that we actually obtained? It is

$$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = P(X_1 = x_1)\,P(X_2 = x_2)\cdots P(X_n = x_n)$$

by independence.

SLIDE 18

But since X_1, X_2, . . . , X_n are samples from X, they have the same pmf as X, so

$$P(X_1 = x_1) = P(X = x_1) = p_X(x_1, \theta),\quad P(X_2 = x_2) = P(X = x_2) = p_X(x_2, \theta),\quad \ldots,\quad P(X_n = x_n) = P(X = x_n) = p_X(x_n, \theta).$$

Hence

$$P = p_X(x_1, \theta)\,p_X(x_2, \theta)\cdots p_X(x_n, \theta).$$

P is a function of θ; it is called the likelihood function and denoted L(θ). It is the likelihood of getting the sample we actually obtained.
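A minimal Python sketch (added illustration; the pmf shown and the sample are placeholders, not from the text) of building L(θ) as a product of pmf values:

```python
from functools import reduce

# Likelihood L(theta) = product of pmf values at the observed sample,
# illustrated with the Bernoulli pmf p_X(x, theta) = theta**x * (1 - theta)**(1 - x).
def bernoulli_pmf(x, theta):
    return theta ** x * (1 - theta) ** (1 - x)

def likelihood(pmf, sample, theta):
    return reduce(lambda acc, x: acc * pmf(x, theta), sample, 1.0)

sample = [1, 0, 1, 1, 0]                          # hypothetical observed data
print(likelihood(bernoulli_pmf, sample, 0.6))     # L(0.6) for this sample
```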

SLIDE 19

Note: θ is unknown, but x_1, x_2, . . . , x_n are known (given). So what is the best guess for θ? The number that maximizes the probability of getting the sample we actually observed. This is the value of θ that is most compatible with the observed data.

Bottom Line

Find the value of θ that maximizes the likelihood function L(θ). This is the "method of maximum likelihood".

SLIDE 20

The resulting estimator will be called the maximum likelihood estimator, abbreviated mle and denoted θ̂_mle.

Remark (we will be lazy). In doing problems, following the text, we won't really maximize L(θ); we will just find a critical point of L(θ), i.e. a point where L′(θ) is zero. Later in your career, if you have to do this, you should check that the critical point is indeed a maximum.

SLIDE 21

Examples

1. The mle for p in Bin(1, p)

X ∼ Bin(1, p) means the pmf of X is

x          | 0     | 1
P(X = x)   | 1 − p | p

There is a simple formula for this:

$$p_X(x) = p^x (1-p)^{1-x}, \quad x = 0, 1.$$

Now since p is our unknown parameter θ, we write

$$p_X(x, \theta) = \theta^x (1-\theta)^{1-x}, \quad x = 0, 1,$$

so

$$p_X(x_1, \theta) = \theta^{x_1}(1-\theta)^{1-x_1},\ \ldots,\ p_X(x_n, \theta) = \theta^{x_n}(1-\theta)^{1-x_n}.$$

SLIDE 22

Hence L(θ) = p_X(x_1, θ) · · · p_X(x_n, θ), and so

$$L(\theta) = \theta^{x_1}(1-\theta)^{1-x_1}\,\theta^{x_2}(1-\theta)^{1-x_2}\cdots \theta^{x_n}(1-\theta)^{1-x_n},$$

a positive number. Now we want to

(∗)
  1. Compute L′(θ).
  2. Set L′(θ) = 0 and solve for θ in terms of x_1, x_2, . . . , x_n.

We can make things much simpler by using the following trick. Suppose f(x) is a real-valued function that takes only positive values. Put h(x) = ln f(x).

SLIDE 23

Then the critical points of h are the same as those of f:

$$h'(x) = 0 \;\Leftrightarrow\; \frac{f'(x)}{f(x)} = 0 \;\Leftrightarrow\; f'(x) = 0.$$

Also, h takes a maximum value at x* ⇔ f takes a maximum value at x*. This is because ln is an increasing function, so it preserves order relations (a < b ⇔ ln a < ln b, where we assume a > 0 and b > 0).

Bottom Line: change (∗) to (∗∗).

SLIDE 24

(∗∗)
  1. Compute h(θ) = ln L(θ).
  2. Compute h′(θ).
  3. Set h′(θ) = 0 and solve for θ in terms of x_1, x_2, . . . , x_n.

Now back to Bin(1, p):

$$L(\theta) = \theta^{x_1}(1-\theta)^{1-x_1}\cdots \theta^{x_n}(1-\theta)^{1-x_n}.$$

Rearranging,

$$L(\theta) = \theta^{x_1}\theta^{x_2}\cdots\theta^{x_n}\,(1-\theta)^{1-x_1}(1-\theta)^{1-x_2}\cdots(1-\theta)^{1-x_n} = \theta^{x_1+x_2+\cdots+x_n}(1-\theta)^{\,n-(x_1+x_2+\cdots+x_n)}.$$

Now take the natural logarithm:

$$h(\theta) = \ln L(\theta) = (x_1 + \cdots + x_n)\ln\theta + \bigl(n - (x_1 + \cdots + x_n)\bigr)\ln(1-\theta).$$

Now apply d/dθ to each side, using

$$\frac{d}{d\theta}\ln(1-\theta) = \frac{1}{1-\theta}\,\frac{d}{d\theta}(1-\theta) = \frac{1}{1-\theta}\,(-1) = \frac{-1}{1-\theta}.$$

SLIDE 25

So

$$h'(\theta) = \frac{x_1 + \cdots + x_n}{\theta} - \frac{n - (x_1 + \cdots + x_n)}{1-\theta},$$

and we have to solve h′(θ) = 0, i.e.

$$\frac{x_1 + \cdots + x_n}{\theta} = \frac{n - (x_1 + \cdots + x_n)}{1-\theta}$$

$$(1-\theta)(x_1 + \cdots + x_n) = \theta\bigl(n - (x_1 + \cdots + x_n)\bigr)$$

$$x_1 + \cdots + x_n - \theta(x_1 + \cdots + x_n) = n\theta - \theta(x_1 + \cdots + x_n)$$

$$x_1 + \cdots + x_n = n\theta$$

$$\theta = \frac{x_1 + \cdots + x_n}{n} = \bar{x},$$

so θ̂_mle = X̄.
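For a concrete check, here is a small Python sketch (an added illustration with a hypothetical 0/1 sample; the grid search is just one simple way to maximize numerically) that maximizes the Bernoulli log-likelihood and compares the result with x̄.

```python
import math

# Numerically maximize the Bernoulli log-likelihood
# h(theta) = (sum x_i) * ln(theta) + (n - sum x_i) * ln(1 - theta)
# and compare the maximizer with the sample mean x-bar.
sample = [1, 0, 1, 1, 0, 1, 1, 0]       # hypothetical data
n, s = len(sample), sum(sample)

def log_lik(theta):
    return s * math.log(theta) + (n - s) * math.log(1 - theta)

grid = [k / 1000 for k in range(1, 1000)]   # theta values in (0, 1)
theta_mle = max(grid, key=log_lik)

print(theta_mle, s / n)                 # both should be 0.625 for this sample
```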

SLIDE 26

2. The mle for λ in Exp(λ)

We have

$$f(x, \lambda) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0. \end{cases}$$

Now that we have a continuous distribution, we define L(θ) by

$$L(\theta) = f(x_1, \theta)\,f(x_2, \theta)\cdots f(x_n, \theta)$$

and proceed as before. L(θ) no longer has a nice interpretation as a probability.

SLIDE 27

Let's try to guess the answer. We have E(X) = µ = 1/λ, and we know that x̄ is the best estimator for µ, so it is reasonable to guess that the best estimator for λ = 1/µ will be 1/x̄. This is far from correct logically, but it helps to know where you are going. Away we go; let's not bother changing λ to θ.

$$L(\lambda) = \lambda e^{-\lambda x_1}\,\lambda e^{-\lambda x_2}\cdots \lambda e^{-\lambda x_n} = \lambda^n e^{-\lambda x_1}e^{-\lambda x_2}\cdots e^{-\lambda x_n} = \lambda^n e^{-\lambda(x_1+\cdots+x_n)}.$$

SLIDE 28

Now we suspect we are looking for a function of x̄, so let's use x_1 + x_2 + · · · + x_n = n x̄ (sum = n × average) to obtain

$$L(\lambda) = \lambda^n e^{-\lambda n \bar{x}}.$$

Once again it helps to take the natural logarithm:

$$h(\lambda) = \ln L(\lambda) = \ln(\lambda^n e^{-\lambda n \bar{x}}) = \ln \lambda^n + \ln e^{-\lambda n \bar{x}} = n\ln\lambda - \lambda n \bar{x}.$$

Now

$$h'(\lambda) = \frac{n}{\lambda} - n\bar{x}, \quad\text{so}\quad h'(\lambda) = 0 \;\Leftrightarrow\; \frac{n}{\lambda} = n\bar{x} \;\Leftrightarrow\; \lambda = \frac{1}{\bar{x}}.$$
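As with the Bernoulli example, a short Python sketch (added for illustration, with made-up data and a simple grid search) confirms numerically that the exponential log-likelihood peaks at 1/x̄.

```python
import math

# Numerically maximize the exponential log-likelihood
# h(lambda) = n * ln(lambda) - lambda * n * x-bar
# and compare the maximizer with 1 / x-bar.
sample = [0.4, 1.7, 0.9, 2.3, 0.6]      # hypothetical data
n = len(sample)
xbar = sum(sample) / n

def log_lik(lam):
    return n * math.log(lam) - lam * n * xbar

grid = [k / 1000 for k in range(1, 10000)]   # lambda values 0.001 .. 9.999
lam_mle = max(grid, key=log_lik)

print(lam_mle, 1 / xbar)                # agree to within the grid spacing
```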

SLIDE 29

Hence

$$\hat{\lambda}_{\mathrm{mle}} = \frac{1}{\bar{X}}.$$

Problem: what if we wanted the mle of λ² instead of λ? The answer would be

$$\widehat{\lambda^2}_{\mathrm{mle}} = \frac{1}{\bar{X}^2}$$

by the invariance principle, stated on the next slide.

SLIDE 30

Invariance Principle

Suppose we are given a sample x_1, x_2, . . . , x_n from a probability distribution whose pdf (or pmf) depends on k unknown parameters θ_1, θ_2, . . . , θ_k. Suppose we have computed the mle's (θ̂_1)_mle, . . . , (θ̂_k)_mle of these parameters in terms of x_1, x_2, . . . , x_n. Then the mle of h(θ_1, θ_2, . . . , θ_k) is h((θ̂_1)_mle, . . . , (θ̂_k)_mle), or

$$\widehat{h(\theta_1, \ldots, \theta_k)}_{\mathrm{mle}} = h\bigl((\hat{\theta}_1)_{\mathrm{mle}}, \ldots, (\hat{\theta}_k)_{\mathrm{mle}}\bigr).$$

One more example. In Example 6.17 of the text it is shown that

$$\hat{\sigma}^2_{\mathrm{mle}} = \frac{1}{n}\left[\sum X_i^2 - \frac{(\sum X_i)^2}{n}\right] = \hat{\sigma}^2_{\mathrm{mme}}.$$

Hence

$$\hat{\sigma}_{\mathrm{mle}} = \sqrt{\frac{1}{n}\left[\sum X_i^2 - \frac{(\sum X_i)^2}{n}\right]}$$

(here h(θ) = √θ and θ = σ²).
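A last Python sketch (illustration only, reusing the style of the hypothetical exponential data above) shows the invariance principle in the λ² case: the mle of λ² is just the square of λ̂_mle.

```python
# Invariance principle, illustrated for Exp(lambda):
# the mle of lambda is 1 / x-bar, so the mle of lambda**2 is (1 / x-bar)**2.
sample = [0.4, 1.7, 0.9, 2.3, 0.6]      # hypothetical data
xbar = sum(sample) / len(sample)

lam_mle = 1 / xbar
lam_sq_mle = lam_mle ** 2               # = h(lam_mle) with h(t) = t**2

print(lam_mle, lam_sq_mle)
```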
