Biostatistics 602 - Statistical Inference

Biostatistics 602 - Statistical Inference April 16th, 2013 - PowerPoint PPT Presentation

Biostatistics 602 - Lecture 24: E-M Algorithm & Practice. Hyun Min Kang, April 16th, 2013.


  1-2. Recap: Interval Estimator
A point estimator is usually represented as a single statistic θ̂(X). Let [L(X), U(X)], where L(X) and U(X) are functions of the sample X and L(X) ≤ U(X). Based on the observed sample x, we can make the inference θ ∈ [L(X), U(X)]. Then we call [L(X), U(X)] an interval estimator of θ. Three types of intervals:
• Two-sided interval [L(X), U(X)]
• One-sided (with lower-bound) interval [L(X), ∞)
• One-sided (with upper-bound) interval (−∞, U(X)]

  3-7. Recap: Definitions
Definition: Coverage Probability. Given an interval estimator [L(X), U(X)] of θ, its coverage probability is defined as Pr(θ ∈ [L(X), U(X)]). In other words, it is the probability that the random interval [L(X), U(X)] covers the parameter θ.
Definition: Confidence Coefficient. The confidence coefficient is defined as inf_{θ ∈ Ω} Pr(θ ∈ [L(X), U(X)]).
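The coverage probability can be checked by simulation. A minimal sketch (an assumed illustration, not from the slides) using the usual z-interval x̄ ± z·σ/√n for a normal mean with known σ:

```python
import math
import random

random.seed(1)
mu, sigma, n, z = 5.0, 2.0, 30, 1.96  # z is the 0.975 standard normal quantile
trials, covered = 20000, 0
for _ in range(trials):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(x) / n
    half = z * sigma / math.sqrt(n)
    if xbar - half <= mu <= xbar + half:  # does [L(X), U(X)] cover theta?
        covered += 1
print(round(covered / trials, 3))  # close to the nominal 0.95
```

Here θ = μ is fixed and the interval endpoints are random, which is exactly what "coverage probability" measures.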

  8-11. Recap: Definitions (cont'd)
Definition: Confidence Interval. Given an interval estimator [L(X), U(X)] of θ, if its confidence coefficient is 1 − α, we call it a (1 − α) confidence interval.
Definition: Expected Length. Given an interval estimator [L(X), U(X)] of θ, its expected length is defined as E[U(X) − L(X)], where X are random samples from f_X(x | θ). In other words, it is the average length of the interval estimator.
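The expected length E[U(X) − L(X)] can likewise be approximated by Monte Carlo. A hedged sketch (assumed example): an interval of the form x̄ ± z·S/√n, where S is the sample standard deviation, has random length 2zS/√n, and its average is close to (slightly below) 2zσ/√n since E[S] < σ:

```python
import math
import random
import statistics

random.seed(2)
sigma, n, z = 2.0, 25, 1.96
lengths = []
for _ in range(10000):
    x = [random.gauss(0.0, sigma) for _ in range(n)]
    s = statistics.stdev(x)  # sample standard deviation (n-1 denominator)
    lengths.append(2 * z * s / math.sqrt(n))
avg_len = sum(lengths) / len(lengths)
print(round(avg_len, 3))  # a bit below 2*z*sigma/sqrt(n) = 1.568, since E[S] < sigma
```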

  12-15. Recap: Confidence set and confidence interval
1. To obtain a (1 − α) two-sided CI [L(X), U(X)], we invert the acceptance region of a level α test for H0: θ = θ0 vs. H1: θ ≠ θ0.
2. To obtain a lower-bounded CI [L(X), ∞), we invert the acceptance region of a test for H0: θ = θ0 vs. H1: θ > θ0, where Ω = {θ : θ ≥ θ0}.
3. To obtain an upper-bounded CI (−∞, U(X)], we invert the acceptance region of a test for H0: θ = θ0 vs. H1: θ < θ0, where Ω = {θ : θ ≤ θ0}.
There is no guarantee that the confidence set obtained from Theorem 9.2.2 is an interval, but quite often it is.
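Test inversion can be checked numerically. In this assumed sketch (not the lecturer's code), θ0 lies inside the two-sided z-interval exactly when the level-0.05 z-test accepts H0: θ = θ0:

```python
import math
import random

random.seed(3)
sigma, n, z = 1.0, 16, 1.96
x = [random.gauss(10.0, sigma) for _ in range(n)]
xbar = sum(x) / n

def accepts(theta0):
    """Acceptance region of the level-0.05 two-sided z-test of H0: theta = theta0."""
    return abs(xbar - theta0) / (sigma / math.sqrt(n)) <= z

L = xbar - z * sigma / math.sqrt(n)
U = xbar + z * sigma / math.sqrt(n)
# Points just inside/outside the CI are accepted/rejected accordingly
for theta0 in [L - 0.01, L + 0.01, xbar, U - 0.01, U + 0.01]:
    assert accepts(theta0) == (L <= theta0 <= U)
print("inversion check passed")
```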

  16-19. Typical strategies for finding MLEs
1. Write the joint (log-)likelihood function, L(θ | x) = f_X(x | θ).
2. Find candidates that make the first-order derivative zero.
3. Check the second-order derivative to verify a local maximum.
• For a one-dimensional parameter, a negative second-order derivative implies a local maximum.
4. Check boundary points to see whether a boundary gives the global maximum.
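The four steps can be walked through on a simple model. A sketch (an assumed Exponential(λ) example, not from the slides): the log-likelihood is ℓ(λ) = n log λ − λ Σx_i, the first-order condition n/λ − Σx_i = 0 gives λ̂ = 1/x̄, and ℓ''(λ) = −n/λ² < 0 confirms a local maximum (the boundary λ → 0⁺ sends ℓ → −∞, so it is global):

```python
import math
import random

random.seed(4)
lam_true = 2.0
x = [random.expovariate(lam_true) for _ in range(500)]
n, s = len(x), sum(x)

def loglik(lam):
    # step 1: joint log-likelihood for Exponential(lambda)
    return n * math.log(lam) - lam * s

lam_hat = n / s  # step 2: n/lam - sum(x) = 0  =>  lam_hat = 1/xbar
# steps 3-4: second derivative -n/lam^2 < 0 (local max); boundary gives -inf
assert loglik(lam_hat) >= loglik(lam_hat * 0.9)
assert loglik(lam_hat) >= loglik(lam_hat * 1.1)
print(round(lam_hat, 2))  # near the true value 2.0
```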

  20. Example: A mixture distribution

  21-27. A general mixture distribution
f(x | π, φ, η) = Σ_{i=1}^k π_i f(x | φ_i, η)
• x: observed data
• f: the probability density function
• π: mixture proportion of each component
• φ: parameters specific to each component
• η: parameters shared among components
• k: number of mixture components

  28-31. MLE Problem for mixture of normals
Problem.
f(x | θ = (π, μ, σ²)) = Σ_{i=1}^k π_i f_i(x | μ_i, σ_i²)
f_i(x | μ_i, σ_i²) = (1/√(2πσ_i²)) exp[−(x − μ_i)² / (2σ_i²)]
Σ_{i=1}^k π_i = 1
Find MLEs for θ = (π, μ, σ²).

  32-33. Solution when k = 1
f(x | θ) = Σ_{i=1}^k π_i f_i(x | μ_i, σ_i²) reduces to a single normal density, so
• π = π_1 = 1
• μ = μ_1 = x̄
• σ² = σ_1² = Σ_{i=1}^n (x_i − x̄)² / n

  34-35. Incomplete data problem when k > 1
f(x | θ) = Π_{i=1}^n [ Σ_{j=1}^k π_j f_j(x_i | μ_j, σ_j²) ]
The MLE solution is not analytically tractable, because it involves multiple sums of exponential functions.

  36-42. Converting to a complete data problem
Let z_i ∈ {1, · · · , k} denote the source distribution from which each x_i was sampled.
f(x | z, θ) = Π_{i=1}^n Π_{j=1}^k f_j(x_i | μ_j, σ_j²)^{I(z_i = j)} = Π_{i=1}^n f_{z_i}(x_i | μ_{z_i}, σ_{z_i}²)
The MLE solution is analytically tractable if z is known:
π̂_j = Σ_{i=1}^n I(z_i = j) / n
μ̂_j = Σ_{i=1}^n I(z_i = j) x_i / Σ_{i=1}^n I(z_i = j)
σ̂_j² = Σ_{i=1}^n I(z_i = j)(x_i − μ̂_j)² / Σ_{i=1}^n I(z_i = j)
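The complete-data MLEs above are simple label-wise averages. A minimal sketch with assumed toy data, where the labels z are observed alongside x:

```python
import random

random.seed(5)
# Assumed two-component mixture: (pi, mu, sigma) per component
comps = [(0.3, -2.0, 1.0), (0.7, 3.0, 2.0)]
data = []
for _ in range(5000):
    j = 0 if random.random() < comps[0][0] else 1
    _, m, sd = comps[j]
    data.append((j, random.gauss(m, sd)))  # observed pair (z_i, x_i)

n = len(data)
for j in range(2):
    xs = [x for (z, x) in data if z == j]  # the I(z_i = j) selections
    pi_hat = len(xs) / n
    mu_hat = sum(xs) / len(xs)
    var_hat = sum((x - mu_hat) ** 2 for x in xs) / len(xs)
    print(j, round(pi_hat, 2), round(mu_hat, 2), round(var_hat, 2))
```

With z known the problem decouples into k separate normal-MLE problems, which is exactly why the complete-data likelihood is tractable.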

  43-47. E-M Algorithm
The E-M (Expectation-Maximization) algorithm is
• A procedure for iteratively solving for the MLE.
• Guaranteed to converge to the MLE (!)
• Particularly suited to "missing data" problems, where an analytic solution of the MLE is not tractable.
The algorithm was derived and used in various special cases by a number of authors, but it was not identified as a general algorithm until the seminal paper by Dempster, Laird, and Rubin in the Journal of the Royal Statistical Society, Series B (1977).

48. Overview of E-M Algorithm

Basic Structure: complete and incomplete data likelihood
• y is the observed (or incomplete) data
• z is the missing (or augmented) data
• x = (y, z) is the complete data
• Complete data likelihood: f(x | θ) = f(y, z | θ)
• Incomplete data likelihood: g(y | θ) = ∫ f(y, z | θ) dz

We are interested in the MLE for L(θ | y) = g(y | θ).
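For a discrete missing variable z, the integral defining g(y | θ) becomes a sum over the values of z. A minimal sketch of the complete/incomplete relationship (the two-component mixture weights and parameters here are illustrative assumptions, not values from the lecture):

```python
import math

def norm_pdf(y, mu, s2):
    """Density of N(mu, s2) at y."""
    return math.exp(-(y - mu) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)

# Hypothetical two-component mixture: z in {0, 1} with weights pi
pi = [0.7, 0.3]
mu = [0.0, 5.0]
s2 = [1.0, 1.0]

def f_complete(y, z):
    """Complete-data likelihood f(y, z | theta) = pi_z * N(y; mu_z, s2_z)."""
    return pi[z] * norm_pdf(y, mu[z], s2[z])

def g_incomplete(y):
    """Incomplete-data likelihood g(y | theta): sum the joint over z."""
    return sum(f_complete(y, z) for z in (0, 1))

print(g_incomplete(1.2))  # marginal density of the observed y = 1.2
```

The E-M machinery below exists precisely because g(y | θ), not f(y, z | θ), is what the observed data let us evaluate.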

52. Maximizing incomplete data likelihood

L(θ | y, z) = f(y, z | θ)
L(θ | y) = g(y | θ)
k(z | θ, y) = f(y, z | θ) / g(y | θ)
log L(θ | y) = log L(θ | y, z) − log k(z | θ, y)

Because z is missing data, we replace the right side with its expectation under k(z | θ′, y), creating the new identity

log L(θ | y) = E[ log L(θ | y, Z) | θ′, y ] − E[ log k(Z | θ, y) | θ′, y ]

Iteratively maximizing the first term on the right-hand side results in the E-M algorithm.
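The identity holds for any θ′ because the left-hand side does not depend on Z, and log f(y, z | θ) − log k(z | θ, y) = log g(y | θ) for every z. With a discrete z this can be verified numerically; the two-component mixture below is a hypothetical example, not from the lecture:

```python
import math

def norm_pdf(y, mu, s2):
    return math.exp(-(y - mu) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)

def f(y, z, theta):
    """Complete-data likelihood for a two-component normal mixture."""
    pi, mu, s2 = theta
    return pi[z] * norm_pdf(y, mu[z], s2[z])

def g(y, theta):
    """Incomplete-data likelihood: marginal over z."""
    return sum(f(y, z, theta) for z in (0, 1))

def k(z, theta, y):
    """Conditional pdf of the missing z given y: f / g."""
    return f(y, z, theta) / g(y, theta)

y = 1.2
theta   = ([0.6, 0.4], [0.0, 4.0], [1.0, 2.0])  # theta at which both sides are evaluated
theta_p = ([0.5, 0.5], [1.0, 3.0], [1.0, 1.0])  # theta' defining the expectation

lhs = math.log(g(y, theta))
# E[log L(theta | y, Z) | theta', y] - E[log k(Z | theta, y) | theta', y]
rhs = sum(k(z, theta_p, y) * (math.log(f(y, z, theta)) - math.log(k(z, theta, y)))
          for z in (0, 1))
print(lhs, rhs)  # the two sides agree up to floating-point error
```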

59. Overview of E-M Algorithm (cont'd)

Objective
• Maximize L(θ | y) or l(θ | y).
• Let f(y, z | θ) denote the pdf of the complete data. In the E-M algorithm, rather than working with l(θ | y) directly, we work with the surrogate Q function

Q(θ | θ^(r)) = E[ log f(y, Z | θ) | y, θ^(r) ]

where θ^(r) is the estimate of θ in the r-th iteration.
• Q(θ | θ^(r)) is the expected log-likelihood of the complete data, conditioning on the observed data and θ^(r).

64. Key Steps of the E-M algorithm

Expectation Step
• Compute Q(θ | θ^(r)).
• This typically involves estimating the conditional distribution Z | Y, assuming θ = θ^(r).
• After computing Q(θ | θ^(r)), move to the M-step.

Maximization Step
• Maximize Q(θ | θ^(r)) with respect to θ.
• The argmax_θ Q(θ | θ^(r)) becomes the (r+1)-th θ, to be fed into the next E-step.
• Repeat until convergence.

66. E-M algorithm for mixture of normals

E-step

Q(θ | θ^(r)) = E[ log f(y, Z | θ) | y, θ^(r) ]
            = Σ_z k(z | θ^(r), y) log f(y, z | θ)
            = Σ_{i=1}^n Σ_{z_i=1}^k k(z_i | θ^(r), y_i) log f(y_i, z_i | θ)
            = Σ_{i=1}^n Σ_{z_i=1}^k [ f(y_i, z_i | θ^(r)) / g(y_i | θ^(r)) ] log f(y_i, z_i | θ)

where, for the mixture of normals,

f(y_i, z_i | θ) = π_{z_i} N(y_i; μ_{z_i}, σ²_{z_i})
g(y_i | θ) = Σ_{j=1}^k f(y_i, z_i = j | θ) = Σ_{j=1}^k π_j N(y_i; μ_j, σ²_j)

72. E-M algorithm for mixture of normals (cont'd)

M-step

Q(θ | θ^(r)) = Σ_{i=1}^n Σ_{z_i=1}^k [ f(y_i, z_i | θ^(r)) / g(y_i | θ^(r)) ] log f(y_i, z_i | θ)

Writing k(z_i = j | y_i, θ^(r)) = f(y_i, z_i = j | θ^(r)) / g(y_i | θ^(r)), maximizing Q(θ | θ^(r)) gives the closed-form updates

π_j^(r+1) = (1/n) Σ_{i=1}^n k(z_i = j | y_i, θ^(r))

μ_j^(r+1) = [ Σ_{i=1}^n y_i k(z_i = j | y_i, θ^(r)) ] / [ Σ_{i=1}^n k(z_i = j | y_i, θ^(r)) ]
          = [ Σ_{i=1}^n y_i k(z_i = j | y_i, θ^(r)) ] / (n π_j^(r+1))

σ²_j^(r+1) = [ Σ_{i=1}^n (y_i − μ_j^(r+1))² k(z_i = j | y_i, θ^(r)) ] / [ Σ_{i=1}^n k(z_i = j | y_i, θ^(r)) ]
           = [ Σ_{i=1}^n (y_i − μ_j^(r+1))² k(z_i = j | y_i, θ^(r)) ] / (n π_j^(r+1))
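The E-step weights k(z_i = j | y_i, θ^(r)) and the three M-step updates above translate directly into code. A sketch in Python (the simulated sample, seed, iteration count, and starting values are my own choices for illustration, loosely mirroring the working example that follows; this is not the course's mix.dat):

```python
import math
import random

def norm_pdf(y, mu, s2):
    return math.exp(-(y - mu) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)

def em_mixture(ys, pi, mu, s2, iters=200):
    """E-M for a k-component normal mixture, following the slide updates."""
    n, kcomp = len(ys), len(pi)
    for _ in range(iters):
        # E-step: w[i][j] = k(z_i = j | y_i, theta^(r)) = f(y_i, z_i=j) / g(y_i)
        w = []
        for y in ys:
            f = [pi[j] * norm_pdf(y, mu[j], s2[j]) for j in range(kcomp)]
            g = sum(f)
            w.append([fj / g for fj in f])
        # M-step: closed-form maximizers of Q(theta | theta^(r))
        for j in range(kcomp):
            nj = sum(w[i][j] for i in range(n))          # = n * pi_j^(r+1)
            pi[j] = nj / n
            mu[j] = sum(w[i][j] * ys[i] for i in range(n)) / nj
            s2[j] = sum(w[i][j] * (ys[i] - mu[j]) ** 2 for i in range(n)) / nj
    return pi, mu, s2

# Simulate data resembling the working example: 2/3 from N(0,1), 1/3 from N(5,1)
random.seed(602)
ys = [random.gauss(0, 1) if random.random() < 2 / 3 else random.gauss(5, 1)
      for _ in range(1500)]

pi, mu, s2 = em_mixture(ys, pi=[0.5, 0.5], mu=[1.0, 4.0], s2=[1.0, 1.0])
print(pi, mu, s2)  # estimates near (2/3, 1/3), (0, 5), (1, 1)
```

Because each sweep is an exact maximization of Q, the observed-data log-likelihood never decreases across iterations, which is the content of the convergence theorem below.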

77. Does the E-M iteration converge to the MLE?

Theorem 7.2.20 - Monotonic EM sequence
The sequence {θ̂^(r)} defined by the E-M procedure satisfies

L(θ̂^(r+1) | y) ≥ L(θ̂^(r) | y)

with equality holding if and only if successive iterations yield the same value of the maximized expected complete-data log likelihood, that is,

E[ log L(θ̂^(r+1) | y, Z) | θ̂^(r), y ] = E[ log L(θ̂^(r) | y, Z) | θ̂^(r), y ]

Theorem 7.5.2 further guarantees that L(θ̂^(r) | y) converges monotonically to L(θ̂ | y) for some stationary point θ̂.

82. A working example (from BIOSTAT615/815 Fall 2012)

Example Data (n=1,500), running the implemented software:

user@host~/> ./mixEM ./mix.dat
Maximum log-likelihood = 3043.46, at pi = (0.667842,0.332158)
between N(-0.0299457,1.00791) and N(5.0128,0.913825)

84. Practice Problem 1

Problem
Let X_1, …, X_n be a random sample from a population with pdf

f(x | θ) = 1/(2θ),  −θ < x < θ,  θ > 0

Find, if one exists, a best unbiased estimator of θ.

Strategy to solve the problem
• Can we use the Cramér-Rao bound? No, because the interchangeability condition does not hold.
• Then, can we use complete sufficient statistics?
  1. Find a complete sufficient statistic T.
  2. For a trivial unbiased estimator W for θ, compute E(W | T).
  3. Or make a function φ(T) such that E[φ(T)] = θ.
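Following the third strategy step: since |X_i| ∼ Uniform(0, θ), a natural candidate (my completion, not shown on the slide) is T = max_i |X_i|, with E[T] = nθ/(n+1), suggesting φ(T) = (n+1)T/n. A quick Monte Carlo sketch of the unbiasedness claim under these assumptions:

```python
import random

def phi(xs):
    """Candidate unbiased estimator (n+1)/n * T, assuming T = max|X_i|."""
    n = len(xs)
    return (n + 1) / n * max(abs(x) for x in xs)

random.seed(1)
theta, n, reps = 2.0, 10, 20000
# Draw X_i ~ Uniform(-theta, theta) and average the estimator over many samples
est = sum(phi([random.uniform(-theta, theta) for _ in range(n)])
          for _ in range(reps)) / reps
print(est)  # close to theta = 2.0
```

The simulation only checks unbiasedness; whether T is complete and sufficient (and hence whether φ(T) is best unbiased) is the analytical part of the exercise.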
