Biostatistics 602 - Statistical Inference
Lecture 24: E-M Algorithm & Practice Examples
Hyun Min Kang
April 16th, 2013

Last Lecture

  • What is an interval estimator?
  • What are the coverage probability, confidence coefficient, and confidence interval?
  • How can a 1 − α confidence interval typically be constructed?
  • To obtain a lower-bounded (upper-tail) CI, the acceptance region of which test should be inverted?
    (a) H0 : θ = θ0 vs H1 : θ > θ0    (b) H0 : θ = θ0 vs H1 : θ < θ0

Interval Estimation

θ̂(X) is usually represented as a point estimator.

Interval Estimator

Consider an interval [L(X), U(X)], where L(X) and U(X) are functions of the sample X and L(X) ≤ U(X). Based on the observed sample x, we can make the inference that θ ∈ [L(X), U(X)]. Then we call [L(X), U(X)] an interval estimator of θ.

Three types of intervals

  • Two-sided interval [L(X), U(X)]
  • One-sided (with lower-bound) interval [L(X), ∞)
  • One-sided (with upper-bound) interval (−∞, U(X)]

Definitions

Definition: Coverage Probability

Given an interval estimator [L(X), U(X)] of θ, its coverage probability is defined as Pr(θ ∈ [L(X), U(X)]). In other words, it is the probability that the random interval [L(X), U(X)] covers the parameter θ.

Definition: Confidence Coefficient

The confidence coefficient is defined as inf_{θ∈Ω} Pr(θ ∈ [L(X), U(X)]).
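These definitions can be checked numerically. Below is a minimal Python sketch (not from the lecture) that estimates the coverage probability of the familiar z-interval x̄ ± z_{α/2} σ/√n for a normal mean with known σ; the function name and the settings (n, σ, number of replicates) are illustrative choices.

```python
import random
import math

def coverage_probability(mu, sigma, n, reps=20000, seed=1):
    """Empirically estimate Pr(mu in [L(X), U(X)]) for the 95% z-interval
    xbar +/- z_{0.025} * sigma / sqrt(n), with sigma assumed known."""
    rng = random.Random(seed)
    z = 1.959963984540054  # z_{0.025}, the 97.5% standard normal quantile
    half_width = z * sigma / math.sqrt(n)
    covered = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            covered += 1
    return covered / reps

print(coverage_probability(mu=0.0, sigma=1.0, n=25))  # close to 0.95
```

Because this interval is built from the pivot (x̄ − µ)/(σ/√n), its coverage probability is the same for every µ, so the infimum over Ω (the confidence coefficient) equals the common coverage probability 1 − α.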

Definitions

Definition: Confidence Interval

Given an interval estimator [L(X), U(X)] of θ, if its confidence coefficient is 1 − α, we call it a (1 − α) confidence interval.

Definition: Expected Length

Given an interval estimator [L(X), U(X)] of θ, its expected length is defined as E[U(X) − L(X)], where X are random samples from fX(x|θ). In other words, it is the average length of the interval estimator.
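A hypothetical numerical illustration of expected length (not from the lecture): with σ known, the z-interval has the constant length 2 z_{α/2} σ/√n, while the t-interval based on the sample standard deviation has random length, whose expectation can be estimated by simulation. The t quantile below is hard-coded for n = 25.

```python
import random
import math

def expected_length_t_interval(sigma=1.0, n=25, reps=20000, seed=2):
    """Estimate E[U(X) - L(X)] for the 95% t-interval
    xbar +/- t_{0.025, n-1} * s / sqrt(n) by Monte Carlo."""
    rng = random.Random(seed)
    t = 2.0638985616280205  # t_{0.025} quantile with 24 degrees of freedom
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)  # sample variance
        total += 2 * t * math.sqrt(s2) / math.sqrt(n)    # interval length
    return total / reps

# With sigma known, the z-interval length is fixed: 2 * z_{0.025} * sigma / sqrt(n)
fixed = 2 * 1.959963984540054 * 1.0 / math.sqrt(25)
print(fixed, expected_length_t_interval())
```

The t-interval's expected length is slightly larger than the fixed z-length, the price paid for not knowing σ.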

Confidence set and confidence interval

There is no guarantee that the confidence set obtained from Theorem 9.2.2 is an interval, but quite often it is.

1. To obtain a (1 − α) two-sided CI [L(X), U(X)], we invert the acceptance region of a level α test for H0 : θ = θ0 vs. H1 : θ ̸= θ0.
2. To obtain a lower-bounded CI [L(X), ∞), we invert the acceptance region of a test for H0 : θ = θ0 vs. H1 : θ > θ0, where Ω = {θ : θ ≥ θ0}.
3. To obtain an upper-bounded CI (−∞, U(X)], we invert the acceptance region of a test for H0 : θ = θ0 vs. H1 : θ < θ0, where Ω = {θ : θ ≤ θ0}.
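For X1, . . . , Xn iid n(θ, σ²) with σ known, the three inversions above yield explicit intervals from the z-test. A minimal sketch (function names and inputs are illustrative, not from the lecture):

```python
import math

Z_05 = 1.6448536269514722   # z_{0.05} standard normal quantile
Z_025 = 1.959963984540054   # z_{0.025} standard normal quantile

def two_sided_ci(xbar, sigma, n):
    """Invert the level-0.05 test of H0: theta = theta0 vs H1: theta != theta0,
    which accepts iff |xbar - theta0| <= z_{0.025} * sigma / sqrt(n)."""
    h = Z_025 * sigma / math.sqrt(n)
    return (xbar - h, xbar + h)

def lower_bounded_ci(xbar, sigma, n):
    """Invert H0: theta = theta0 vs H1: theta > theta0, giving [L(X), inf)."""
    return (xbar - Z_05 * sigma / math.sqrt(n), math.inf)

def upper_bounded_ci(xbar, sigma, n):
    """Invert H0: theta = theta0 vs H1: theta < theta0, giving (-inf, U(X)]."""
    return (-math.inf, xbar + Z_05 * sigma / math.sqrt(n))

print(two_sided_ci(0.0, 1.0, 100))
```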

Typical strategies for finding MLEs

1. Write the joint (log-)likelihood function, L(θ|x) = fX(x|θ).
2. Find candidates that make the first-order derivative zero.
3. Check the second-order derivative to verify a local maximum.
   • For a one-dimensional parameter, a negative second-order derivative implies a local maximum.
4. Check boundary points to see whether a boundary gives the global maximum.
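As a worked instance of the four steps (not from the lecture), consider X1, . . . , Xn iid Exponential with rate λ, f(x|λ) = λ e^{−λx}. The log-likelihood is ℓ(λ) = n log λ − λ ∑xi; setting ℓ′(λ) = n/λ − ∑xi = 0 gives the candidate λ = n/∑xi, and ℓ″(λ) = −n/λ² < 0 confirms a maximum.

```python
def exponential_mle(xs):
    """Four-step MLE recipe for f(x|lam) = lam * exp(-lam * x):
    step 1: l(lam) = n*log(lam) - lam*sum(xs)
    step 2: l'(lam) = n/lam - sum(xs) = 0  =>  lam = n / sum(xs)
    step 3: l''(lam) = -n/lam**2 < 0, so the candidate is a local maximum
    step 4: the boundary lam -> 0+ gives l -> -inf, so it is global."""
    n = len(xs)
    lam = n / sum(xs)
    assert -n / lam ** 2 < 0  # second-order condition (step 3)
    return lam

print(exponential_mle([0.5, 1.5, 2.0]))  # 3 / 4.0 = 0.75
```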

Example: A mixture distribution

[Figure: example density of a mixture distribution]

A general mixture distribution

f(x|π, φ, η) = ∑_{i=1}^{k} πi f(x|φi, η)

  • x : observed data
  • π : mixture proportion of each component
  • f : the probability density function
  • φ : parameters specific to each component
  • η : parameters shared among components
  • k : number of mixture components
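A sketch of evaluating such a density for a mixture of normals (parameter values below are illustrative, not from the lecture):

```python
import math

def normal_pdf(x, mu, var):
    """Density of n(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def mixture_pdf(x, pis, mus, vars_):
    """f(x | pi, mu, sigma^2) = sum_i pi_i * f_i(x | mu_i, sigma_i^2)."""
    assert abs(sum(pis) - 1.0) < 1e-12  # mixture proportions must sum to one
    return sum(p * normal_pdf(x, m, v) for p, m, v in zip(pis, mus, vars_))

# Two-component example: 30% n(0, 1) + 70% n(4, 1), evaluated at x = 0
print(mixture_pdf(0.0, [0.3, 0.7], [0.0, 4.0], [1.0, 1.0]))
```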

MLE Problem for mixture of normals

Problem

f(x|θ = (π, µ, σ²)) = ∑_{i=1}^{k} πi fi(x|µi, σi²)

fi(x|µi, σi²) = (1 / √(2πσi²)) exp[ −(x − µi)² / (2σi²) ]

∑_{i=1}^{k} πi = 1

Find MLEs for θ = (π, µ, σ²).

Solution when k = 1

f(x|θ) = ∑_{i=1}^{k} πi fi(x|µi, σi²) reduces to a single normal density, and

  • π̂ = π̂1 = 1
  • µ̂ = µ̂1 = x̄
  • σ̂² = σ̂1² = ∑_{i=1}^{n} (xi − x̄)² / n
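The k = 1 solution is easy to compute directly; a minimal sketch (note the divisor n in the variance MLE, not n − 1):

```python
def mle_single_normal(xs):
    """MLEs when k = 1: mu_hat is the sample mean and
    sigma2_hat = sum (xi - xbar)^2 / n (the biased variance, divisor n)."""
    n = len(xs)
    mu = sum(xs) / n
    sigma2 = sum((x - mu) ** 2 for x in xs) / n
    return mu, sigma2

print(mle_single_normal([1.0, 2.0, 3.0]))  # (2.0, 0.666...)
```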

Incomplete data problem when k > 1

f(x|θ) = ∏_{i=1}^{n} [ ∑_{j=1}^{k} πj fj(xi|µj, σj²) ]

The MLE solution is not analytically tractable, because it involves multiple sums of exponential functions.
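The obstacle is visible in the log-likelihood, which is a sum of logs of sums, so the score equations have no closed-form zero. A sketch evaluating it at illustrative parameter values (not from the lecture):

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def mixture_loglik(xs, pis, mus, vars_):
    """log L(theta|x) = sum_i log( sum_j pi_j * f_j(x_i | mu_j, sigma_j^2) ).
    The log of a sum of exponentials is what blocks a closed-form MLE."""
    return sum(math.log(sum(p * normal_pdf(x, m, v)
                            for p, m, v in zip(pis, mus, vars_)))
               for x in xs)

ll = mixture_loglik([0.0, 4.1], [0.5, 0.5], [0.0, 4.0], [1.0, 1.0])
print(ll)
```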

slide-53
SLIDE 53


Converting to a complete data problem

Let z_i ∈ {1, · · · , k} denote the source distribution from which each x_i was sampled.

f(x|z, θ) = ∏_{i=1}^{n} [ ∑_{j=1}^{k} I(z_i = j) f_j(x_i|µ_j, σ²_j) ] = ∏_{i=1}^{n} f_{z_i}(x_i|µ_{z_i}, σ²_{z_i})

π̂_j = ∑_{i=1}^{n} I(z_i = j) / n

µ̂_j = ∑_{i=1}^{n} I(z_i = j) x_i / ∑_{i=1}^{n} I(z_i = j)

σ̂²_j = ∑_{i=1}^{n} I(z_i = j)(x_i − µ̂_j)² / ∑_{i=1}^{n} I(z_i = j)

The MLE solution is analytically tractable if z is known.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 13 / 33
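With the labels z observed, these estimators are simple per-component counts and averages. A hedged Python sketch (helper name is mine):

```python
def complete_data_mle(xs, zs, k):
    """Complete-data MLE for a k-component normal mixture with known labels:
    pi_j = #{z_i = j}/n, and mu_j, sigma2_j are the within-component
    sample mean and (1/n_j) variance, as in the formulas above."""
    n = len(xs)
    pis, mus, sigma2s = [], [], []
    for j in range(1, k + 1):
        members = [x for x, z in zip(xs, zs) if z == j]
        nj = len(members)
        mu_j = sum(members) / nj
        pis.append(nj / n)
        mus.append(mu_j)
        sigma2s.append(sum((x - mu_j) ** 2 for x in members) / nj)
    return pis, mus, sigma2s

pis, mus, s2s = complete_data_mle([0.0, 2.0, 10.0, 12.0], [1, 1, 2, 2], k=2)
# pis = [0.5, 0.5], mus = [1.0, 11.0], s2s = [1.0, 1.0]
```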

slide-55
SLIDE 55


E-M Algorithm

The E-M (Expectation-Maximization) algorithm is:

  • A procedure for iteratively solving for the MLE.
  • Guaranteed to converge to the MLE (!)
  • Particularly suited to "missing data" problems, where an analytic solution of the MLE is not tractable.

The algorithm was derived and used in various special cases by a number of authors, but it was not identified as a general algorithm until the seminal paper by Dempster, Laird, and Rubin in the Journal of the Royal Statistical Society, Series B (1977).

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 14 / 33

slide-63
SLIDE 63


Overview of E-M Algorithm

Basic Structure

  • y is observed (or incomplete) data
  • z is missing (or augmented) data
  • x = (y, z) is complete data

Complete and incomplete data likelihood

  • Complete data likelihood : f(x|θ) = f(y, z|θ)
  • Incomplete data likelihood : g(y|θ) = ∫ f(y, z|θ) dz

We are interested in the MLE for L(θ|y) = g(y|θ).

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 15 / 33

slide-69
SLIDE 69


Maximizing incomplete data likelihood

L(θ|y, z) = f(y, z|θ)
L(θ|y) = g(y|θ)
k(z|θ, y) = f(y, z|θ) / g(y|θ)

log L(θ|y) = log L(θ|y, z) − log k(z|θ, y)

Because z is missing data, we replace the right-hand side with its expectation under k(z|θ′, y), creating the new identity

log L(θ|y) = E[ log L(θ|y, Z) | θ′, y ] − E[ log k(Z|θ, y) | θ′, y ]

Iteratively maximizing the first term on the right-hand side yields the E-M algorithm.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 16 / 33
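The decomposition above follows in one line from the definition of k(z|θ, y):

```latex
\log L(\theta\mid y) \;=\; \log g(y\mid\theta)
  \;=\; \log\frac{f(y,z\mid\theta)}{k(z\mid\theta,y)}
  \;=\; \log L(\theta\mid y,z) \;-\; \log k(z\mid\theta,y)
```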

slide-75
SLIDE 75


Overview of E-M Algorithm (cont’d)

Objective

  • Maximize L(θ|y) or l(θ|y).
  • Let f(y, z|θ) denote the pdf of the complete data. In the E-M algorithm, rather than working with l(θ|y) directly, we work with the surrogate function

Q(θ|θ^(r)) = E[ log f(y, Z|θ) | y, θ^(r) ]

where θ^(r) is the estimate of θ in the r-th iteration.

  • Q(θ|θ^(r)) is the expected log-likelihood of the complete data, conditioning on the observed data and θ^(r).

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 17 / 33

slide-77
SLIDE 77


Key Steps of E-M algorithm

Expectation Step

  • Compute Q(θ|θ^(r)).
  • This typically involves estimating the conditional distribution Z|Y, assuming θ = θ^(r).
  • After computing Q(θ|θ^(r)), move to the M-step.

Maximization Step

  • Maximize Q(θ|θ^(r)) with respect to θ.
  • The arg max_θ Q(θ|θ^(r)) becomes θ^(r+1), to be fed into the next E-step.
  • Repeat the E- and M-steps until convergence.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 18 / 33

slide-78
SLIDE 78

. .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. .

. . . . . Recap . . . . . . . . . . . . . . . . E-M . . . P1 . . . . P2 . . . P3 . Summary

E-M algorithm for mixture of normals

.

E-step

. . Q(θ|θ(r)) = E [ log f(y, Z|θ)|y, θ(r)]

z

k z

r y log f y z n i k zi

k zi

r yi log f yi zi n i k zi

f yi zi

r

g yi

r

log f yi zi f yi zi

zi zi

g yi

k j if yi zi

j

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 19 / 33

slide-79
SLIDE 79

. .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. .

. . . . . Recap . . . . . . . . . . . . . . . . E-M . . . P1 . . . . P2 . . . P3 . Summary

E-M algorithm for mixture of normals

.

E-step

. . Q(θ|θ(r)) = E [ log f(y, Z|θ)|y, θ(r)] = ∑

z

k(z|θ(r), y) log f(y, z|θ)

n i k zi

k zi

r yi log f yi zi n i k zi

f yi zi

r

g yi

r

log f yi zi f yi zi

zi zi

g yi

k j if yi zi

j

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 19 / 33

slide-80
SLIDE 80

. .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. .

. . . . . Recap . . . . . . . . . . . . . . . . E-M . . . P1 . . . . P2 . . . P3 . Summary

E-M algorithm for mixture of normals

.

E-step

. . Q(θ|θ(r)) = E [ log f(y, Z|θ)|y, θ(r)] = ∑

z

k(z|θ(r), y) log f(y, z|θ) =

n

i=1 k

zi=1

k(zi|θ(r), yi) log f(yi, zi|θ)

n i k zi

f yi zi

r

g yi

r

log f yi zi f yi zi

zi zi

g yi

k j if yi zi

j

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 19 / 33

slide-81
SLIDE 81

. .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. .

. . . . . Recap . . . . . . . . . . . . . . . . E-M . . . P1 . . . . P2 . . . P3 . Summary

E-M algorithm for mixture of normals

.

E-step

. . Q(θ|θ(r)) = E [ log f(y, Z|θ)|y, θ(r)] = ∑

z

k(z|θ(r), y) log f(y, z|θ) =

n

i=1 k

zi=1

k(zi|θ(r), yi) log f(yi, zi|θ) =

n

i=1 k

zi=1

f(yi, zi|θ(r)) g(yi|θ(r)) log f(yi, zi|θ) f yi zi

zi zi

g yi

k j if yi zi

j

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 19 / 33

slide-82
SLIDE 82

. .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. . . .. .

. . . . . Recap . . . . . . . . . . . . . . . . E-M . . . P1 . . . . P2 . . . P3 . Summary

E-M algorithm for mixture of normals

.

E-step

. . Q(θ|θ(r)) = E [ log f(y, Z|θ)|y, θ(r)] = ∑

z

k(z|θ(r), y) log f(y, z|θ) =

n

i=1 k

zi=1

k(zi|θ(r), yi) log f(yi, zi|θ) =

n

i=1 k

zi=1

f(yi, zi|θ(r)) g(yi|θ(r)) log f(yi, zi|θ) f(yi, zi|θ) ∼ N(µzi, σ2

zi)

g yi

k j if yi zi

j

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 19 / 33

slide-83
SLIDE 83


E-M algorithm for mixture of normals

E-step

Q(θ|θ^(r)) = E[ log f(y, Z|θ) | y, θ^(r) ]
           = ∑_z k(z|θ^(r), y) log f(y, z|θ)
           = ∑_{i=1}^{n} ∑_{z_i=1}^{k} k(z_i|θ^(r), y_i) log f(y_i, z_i|θ)
           = ∑_{i=1}^{n} ∑_{z_i=1}^{k} [ f(y_i, z_i|θ^(r)) / g(y_i|θ^(r)) ] log f(y_i, z_i|θ)

where

f(y_i, z_i|θ) ∼ N(µ_{z_i}, σ²_{z_i})
g(y_i|θ) = ∑_{j=1}^{k} π_j f(y_i, z_i = j|θ)

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 19 / 33
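The ratio f(y_i, z_i = j|θ^(r)) / g(y_i|θ^(r)) is the posterior probability k(z_i = j|y_i, θ^(r)) that observation i came from component j. A sketch of this E-step computation in Python (function names are mine, not the lecture's):

```python
import math

def normal_pdf(y, mu, sigma2):
    """Density of N(mu, sigma2) at y."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def e_step(ys, pis, mus, sigma2s):
    """Responsibilities k(z_i = j | y_i, theta): each row is
    pi_j f(y_i | mu_j, sigma2_j) normalized by the mixture density g(y_i)."""
    resp = []
    for y in ys:
        joint = [p * normal_pdf(y, m, s2) for p, m, s2 in zip(pis, mus, sigma2s)]
        g = sum(joint)  # g(y_i | theta)
        resp.append([w / g for w in joint])
    return resp

resp = e_step([0.0, 5.0], pis=[0.5, 0.5], mus=[0.0, 5.0], sigma2s=[1.0, 1.0])
# each row sums to 1; the first point is assigned almost entirely to component 1
```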

slide-88
SLIDE 88


E-M algorithm for mixture of normals (cont’d)

M-step

Q(θ|θ^(r)) = ∑_{i=1}^{n} ∑_{z_i=1}^{k} [ f(y_i, z_i|θ^(r)) / g(y_i|θ^(r)) ] log f(y_i, z_i|θ)

π_j^(r+1) = (1/n) ∑_{i=1}^{n} k(z_i = j|y_i, θ^(r)) = (1/n) ∑_{i=1}^{n} f(y_i, z_i = j|θ^(r)) / g(y_i|θ^(r))

µ_j^(r+1) = ∑_{i=1}^{n} x_i k(z_i = j|y_i, θ^(r)) / ∑_{i=1}^{n} k(z_i = j|y_i, θ^(r)) = ∑_{i=1}^{n} x_i k(z_i = j|y_i, θ^(r)) / (n π_j^(r+1))

σ²_j^(r+1) = ∑_{i=1}^{n} (x_i − µ_j^(r+1))² k(z_i = j|y_i, θ^(r)) / ∑_{i=1}^{n} k(z_i = j|y_i, θ^(r)) = ∑_{i=1}^{n} (x_i − µ_j^(r+1))² k(z_i = j|y_i, θ^(r)) / (n π_j^(r+1))

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 20 / 33
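Given the responsibilities from the E-step, the three updates above are responsibility-weighted versions of the complete-data estimators. A hedged Python sketch of the M-step (assumes a `resp[i][j]` matrix of k(z_i = j|y_i, θ^(r)) values; the names are mine):

```python
def m_step(ys, resp):
    """M-step: pi_j, mu_j, sigma2_j updated by responsibility-weighted
    counts, means, and variances, as in the formulas above."""
    n, k = len(ys), len(resp[0])
    # nj[j] = sum_i k(z_i = j | y_i, theta^(r)) = n * pi_j^(r+1)
    nj = [sum(resp[i][j] for i in range(n)) for j in range(k)]
    pis = [nj[j] / n for j in range(k)]
    mus = [sum(resp[i][j] * ys[i] for i in range(n)) / nj[j] for j in range(k)]
    sigma2s = [sum(resp[i][j] * (ys[i] - mus[j]) ** 2 for i in range(n)) / nj[j]
               for j in range(k)]
    return pis, mus, sigma2s

# With hard 0/1 responsibilities this reduces to the complete-data MLE:
pis, mus, s2s = m_step([0.0, 2.0, 10.0, 12.0],
                       [[1, 0], [1, 0], [0, 1], [0, 1]])
# pis = [0.5, 0.5], mus = [1.0, 11.0], s2s = [1.0, 1.0]
```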

slide-93
SLIDE 93


Does E-M iteration converge to MLE?

Theorem 7.2.20 - Monotonic EM sequence

The sequence {ˆθ(r)} defined by the E-M procedure satisfies L(ˆθ(r+1)|y) ≥ L(ˆθ(r)|y), with equality holding if and only if successive iterations yield the same value of the maximized expected complete-data log likelihood, that is,

E[log L(ˆθ(r+1)|y, Z) | ˆθ(r), y] = E[log L(ˆθ(r)|y, Z) | ˆθ(r), y]

Theorem 7.5.2 further guarantees that L(ˆθ(r)|y) converges monotonically to L(ˆθ|y) for some stationary point ˆθ.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 21 / 33


slide-95
SLIDE 95


A working example (from BIOSTAT615/815 Fall 2012)

Example Data (n=1,500)

Running example of implemented software

user@host~/> ./mixEM ./mix.dat
Maximum log-likelihood = 3043.46, at pi = (0.667842, 0.332158)
between N(-0.0299457, 1.00791) and N(5.0128, 0.913825)

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 22 / 33
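The mixture fit above can be sketched as a plain E-M loop. This is a minimal Python sketch under stated assumptions, not the course's mixEM program: the data are simulated to resemble the example (a roughly 2:1 mix of N(0,1) and N(5,1)), the quartile-based initialization is an arbitrary choice, and the final assertion checks the monotonicity promised by Theorem 7.2.20.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def em_mixture(xs, iters=100):
    """E-M for a two-component Gaussian mixture; returns (pi, mu, sd, loglik trace)."""
    n = len(xs)
    s = sorted(xs)
    pi = [0.5, 0.5]
    mu = [s[n // 4], s[3 * n // 4]]   # crude initialization from the quartiles
    sd = [1.0, 1.0]
    trace = []
    for _ in range(iters):
        # E-step: responsibilities (posterior component memberships)
        resp = []
        ll = 0.0
        for x in xs:
            w = [pi[k] * normal_pdf(x, mu[k], sd[k]) for k in (0, 1)]
            tot = w[0] + w[1]
            ll += math.log(tot)
            resp.append([wk / tot for wk in w])
        trace.append(ll)
        # M-step: weighted updates of mixture weights, means, and variances
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = math.sqrt(max(var, 1e-12))
    return pi, mu, sd, trace

# Simulated data resembling the example: a 2:1 mix of N(0,1) and N(5,1)
rng = random.Random(602)
xs = [rng.gauss(0, 1) if rng.random() < 2 / 3 else rng.gauss(5, 1) for _ in range(1500)]
pi, mu, sd, trace = em_mixture(xs)
# Theorem 7.2.20 in action: the observed-data log likelihood never decreases
assert all(b >= a - 1e-6 for a, b in zip(trace, trace[1:]))
print(pi, mu, sd)
```

The estimated weights and component parameters land close to the generating values, mirroring the mixEM run shown above.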


slide-102
SLIDE 102


Practice Problem 1

Problem

Let X1, · · · , Xn be a random sample from a population with pdf f(x|θ) = 1/(2θ), −θ < x < θ, θ > 0. Find, if one exists, a best unbiased estimator of θ.

Strategy to solve the problem

  • Can we use the Cramer-Rao bound? No, because the interchangeability (of differentiation and integration) condition does not hold when the support depends on θ.
  • Then, can we use complete sufficient statistics?
    1 Find a complete sufficient statistic T.
    2 For a trivial unbiased estimator W for θ, compute φ(T) = E[W|T],
    3 or make a function φ(T) such that E[φ(T)] = θ.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 23 / 33


slide-109
SLIDE 109


Solution

First, we need to find a complete sufficient statistic.

fX(x|θ) = (1/(2θ)) I(|x| < θ), so the joint pdf is fX(x|θ) = (1/(2θ)^n) I(max_i |xi| < θ)

Let T(X) = max_i |Xi|; then fT(t|θ) = (n t^(n−1)/θ^n) I(0 ≤ t < θ).

E[g(T)] = ∫_0^θ n t^(n−1) g(t)/θ^n dt = 0 for all θ > 0
⇒ ∫_0^θ t^(n−1) g(t) dt = 0 for all θ > 0
⇒ θ^(n−1) g(θ) = 0 (differentiating both sides with respect to θ)
⇒ g(θ) = 0 for all θ > 0.

Therefore the family of T is complete.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 24 / 33


slide-116
SLIDE 116


Solution

We need to make a φ(T) such that E[φ(T)] = θ. First, let's see what the expectation of T is:

E[T] = ∫_0^θ t · n t^(n−1)/θ^n dt = ∫_0^θ n t^n/θ^n dt = (n/(n + 1)) θ

φ(T) = ((n + 1)/n) T is an unbiased estimator and a function of a complete sufficient statistic. Therefore, φ(T) is the best unbiased estimator by Theorem 7.3.23.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 25 / 33
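The unbiasedness of φ(T) = ((n + 1)/n) max_i |Xi| is easy to confirm by simulation. A small Monte Carlo sketch (the values of θ, n, and the replicate count are illustrative choices, not part of the problem):

```python
import random

def phi(xs):
    """phi(T) = ((n+1)/n) * max_i |X_i|, the best unbiased estimator derived above."""
    n = len(xs)
    return (n + 1) / n * max(abs(x) for x in xs)

rng = random.Random(1)
theta, n, reps = 2.0, 10, 100000  # illustrative values
mean_est = sum(
    phi([rng.uniform(-theta, theta) for _ in range(n)]) for _ in range(reps)
) / reps
print(mean_est)  # close to theta = 2.0, reflecting unbiasedness
```

The average of the estimates settles at θ, whereas the uncorrected max_i |Xi| alone would be biased downward by the factor n/(n + 1).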


slide-121
SLIDE 121


Practice Problem 2

Problem

Let X1, · · · , Xn+1 be iid Bernoulli(p), and define the function h(p) by

h(p) = Pr(∑_{i=1}^n Xi > Xn+1 | p),

the probability that the first n observations exceed the (n + 1)-st.

  1 Show that W(X1, · · · , Xn+1) = I(∑_{i=1}^n Xi > Xn+1) is an unbiased estimator of h(p).
  2 Find the best unbiased estimator of h(p).

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 26 / 33


slide-126
SLIDE 126


Solution for (a)

E[W] = ∑_X W(X) Pr(X) = ∑_X I(∑_{i=1}^n Xi > Xn+1) Pr(X) = ∑_{X : ∑_{i=1}^n Xi > Xn+1} Pr(X) = Pr(∑_{i=1}^n Xi > Xn+1) = h(p)

Therefore W is an unbiased estimator of h(p).

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 27 / 33


slide-133
SLIDE 133


Solution for (b)

T = ∑_{i=1}^{n+1} Xi is a complete sufficient statistic for p.

φ(T) = E[W|T] = Pr(W = 1|T) = Pr(∑_{i=1}^n Xi > Xn+1 | T)

  • If T = 0, then ∑_{i=1}^n Xi = 0 = Xn+1, so W = 0.
  • If T = 1, then
    • Pr(∑_{i=1}^n Xi = 1 > Xn+1 = 0 | T = 1) = n/(n + 1)
    • Pr(∑_{i=1}^n Xi = 0 < Xn+1 = 1 | T = 1) = 1/(n + 1)
  • If T = 2, then
    • Pr(∑_{i=1}^n Xi = 2 > Xn+1 = 0 | T = 2) = C(n, 2)/C(n + 1, 2) = (n − 1)/(n + 1)
    • Pr(∑_{i=1}^n Xi = 1 = Xn+1 = 1 | T = 2) = 2/(n + 1)
  • If T > 2, then ∑_{i=1}^n Xi ≥ 2 > 1 ≥ Xn+1, so W = 1.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 28 / 33

slide-134
SLIDE 134


Solution for (b) (cont’d)

Therefore, the best unbiased estimator is

φ(T) = Pr(∑_{i=1}^n Xi > Xn+1 | T) =
  0                  T = 0
  n/(n + 1)          T = 1
  (n − 1)/(n + 1)    T = 2
  1                  T ≥ 3

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 29 / 33
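The case analysis above can be checked by brute-force enumeration for a small n (n = 5 and p = 0.3 below are arbitrary choices): conditioning on T makes every arrangement of the successes equally likely, so Pr(W = 1 | T = t) can be counted directly, and E[φ(T)] can be compared with h(p).

```python
from itertools import product

n = 5  # arbitrary small sample size, so all 2^(n+1) outcomes can be enumerated

def phi(t):
    """Best unbiased estimator phi(T) from the case analysis above."""
    if t == 0:
        return 0.0
    if t == 1:
        return n / (n + 1)
    if t == 2:
        return (n - 1) / (n + 1)
    return 1.0

p = 0.3  # arbitrary success probability
h = e_phi = 0.0
for xs in product((0, 1), repeat=n + 1):
    pr = p ** sum(xs) * (1 - p) ** (n + 1 - sum(xs))
    h += pr * (sum(xs[:n]) > xs[n])   # h(p) = Pr(sum of first n exceeds X_{n+1})
    e_phi += pr * phi(sum(xs))        # E[phi(T)]
print(h, e_phi)  # the two agree: phi(T) is unbiased for h(p)
```

Counting within each value of T also reproduces the table: e.g. for T = 2, 10 of the 15 arrangements put both successes among the first five draws, giving (n − 1)/(n + 1) = 2/3.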


slide-137
SLIDE 137


Practice Problem 3

Problem

Suppose X1, · · · , Xn are iid samples from f(x|θ) = θ exp(−θx). Suppose the prior distribution of θ is π(θ) = θ^(α−1) e^(−θ/β) / (Γ(α) β^α), where α, β are known.
(a) Derive the posterior distribution of θ.
(b) If we use the loss function L(θ, a) = (a − θ)^2, what is the Bayes rule estimator for θ?

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 30 / 33


slide-142
SLIDE 142

(a) Posterior distribution of θ

Suppose X_1, …, X_n are i.i.d. with density f(x_i|θ) = θ exp(−θ x_i) and the prior on θ is Gamma(α, β). The joint density of the data and the parameter is

f(x, θ) = π(θ) f(x|θ)
        = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \theta^{\alpha-1} e^{-\theta/\beta} \prod_{i=1}^{n} \left[ \theta \exp(-\theta x_i) \right]
        = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \theta^{\alpha-1} e^{-\theta/\beta} \, \theta^{n} \exp\left( -\theta \sum_{i=1}^{n} x_i \right)
        = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \theta^{\alpha+n-1} \exp\left[ -\theta \left( 1/\beta + \sum_{i=1}^{n} x_i \right) \right]
        \propto \mathrm{Gamma}\left( \alpha + n, \frac{1}{\beta^{-1} + \sum_{i=1}^{n} x_i} \right) \text{ kernel in } \theta

so the posterior is

π(θ|x) = \mathrm{Gamma}\left( \alpha + n, \frac{1}{\beta^{-1} + \sum_{i=1}^{n} x_i} \right)

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 31 / 33
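As a numerical sanity check, the conjugacy above can be verified by comparing the log of the unnormalized joint f(x, θ) against the log density of the Gamma posterior at several values of θ: the two should differ only by a constant not involving θ. This is a minimal sketch; the data values and hyperparameters α, β below are made-up illustrations, not from the lecture.

```python
import math

# Hypothetical prior hyperparameters and observed data (illustration only).
alpha, beta = 2.0, 3.0            # prior: theta ~ Gamma(alpha, scale=beta)
x = [0.5, 1.2, 0.8, 2.1]          # x_i ~ Exponential with rate theta
n, s = len(x), sum(x)

def log_joint(theta):
    # log[ pi(theta) * f(x|theta) ], dropping terms free of theta:
    # (alpha - 1 + n) log(theta) - theta (1/beta + sum x_i)
    return (alpha + n - 1) * math.log(theta) - theta * (1.0 / beta + s)

def log_gamma_pdf(theta, shape, scale):
    # log density of Gamma(shape, scale) at theta
    return ((shape - 1) * math.log(theta) - theta / scale
            - math.lgamma(shape) - shape * math.log(scale))

post_shape = alpha + n
post_scale = 1.0 / (1.0 / beta + s)

# If the posterior really is Gamma(alpha + n, 1/(1/beta + sum x_i)),
# these differences are the same constant at every theta.
diffs = [log_joint(t) - log_gamma_pdf(t, post_shape, post_scale)
         for t in (0.3, 0.9, 1.7, 2.5)]
assert max(diffs) - min(diffs) < 1e-9
```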


slide-145
SLIDE 145

(b) Bayes’ rule estimator with squared error loss

The Bayes rule estimator under squared error loss is the posterior mean. Note that the mean of Gamma(α, β) is αβ. Since

π(θ|x) = \mathrm{Gamma}\left( \alpha + n, \frac{1}{\beta^{-1} + \sum_{i=1}^{n} x_i} \right),

the Bayes estimator is

E[θ|x] = \frac{\alpha + n}{\beta^{-1} + \sum_{i=1}^{n} x_i}

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 32 / 33
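The closed-form posterior mean (shape times scale for a Gamma distribution) can be cross-checked by Monte Carlo, averaging draws from the posterior. A small sketch with the same hypothetical data and hyperparameters as before, not values from the lecture:

```python
import random
import statistics

random.seed(0)

# Hypothetical prior hyperparameters and data (illustration only).
alpha, beta = 2.0, 3.0
x = [0.5, 1.2, 0.8, 2.1]
n, s = len(x), sum(x)

post_shape = alpha + n
post_scale = 1.0 / (1.0 / beta + s)

# Closed-form Bayes estimator: posterior mean = shape * scale.
bayes_est = post_shape * post_scale

# Monte Carlo check: average of draws from the Gamma posterior.
# random.gammavariate takes (shape, scale), so E[draw] = shape * scale.
draws = [random.gammavariate(post_shape, post_scale) for _ in range(200_000)]
mc_mean = statistics.fmean(draws)
assert abs(mc_mean - bayes_est) < 0.01
```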

slide-146
SLIDE 146

Summary

Today
  • E-M Algorithm
  • Practice Problems for the Final Exam

Next Lectures
  • Bayesian Tests
  • Bayesian Intervals
  • More practice problems

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 33 / 33
