Likelihood inference in complex settings — a PowerPoint presentation by Nancy Reid, with Uyen Hoang, Wei Lin, Ximing Xu


SLIDE 1

Likelihood inference for simple problems · Higher order approximation · Harder problems · Approximations to likelihoods

Likelihood inference in complex settings

Nancy Reid with Uyen Hoang, Wei Lin, Ximing Xu

1 / 30

SLIDE 2

Likelihood inference for simple problems · Higher order approximation · Harder problems · Approximations to likelihoods

SLIDE 3

Why likelihood?

  • the likelihood function depends on the data only through sufficient statistics
  • "likelihood map is sufficient" (Fraser & Naderi, 2006)
  • provides summary statistics with known limiting distributions
  • leading to approximate pivotal functions, based on the normal distribution
  • in some models the likelihood function gives exact inference
  • "likelihood function as pivotal" (Hinkley, 1980)
  • likelihood function + sample space derivative gives better approximate inference

SLIDE 4

Summary statistics and approximate pivotals

  • model: f(y; θ), y ∈ ℝⁿ, θ ∈ ℝᵈ
  • log-likelihood function: ℓ(θ; y) = log f(y; θ) + a(y)
  • score function: u(θ) = ∂ℓ(θ; y)/∂θ
  • maximum likelihood estimate: θ̂ = arg sup_θ ℓ(θ; y)
  • log-likelihood ratio: w(θ) = 2{ℓ(θ̂; y) − ℓ(θ; y)}
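These quantities can be computed directly in a simple model; the sketch below uses an exponential sample (the data, sample size, and rate parameterization are illustrative, not from the slides):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative data: n i.i.d. Exponential(theta) observations, rate parameterization.
rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=50)   # true theta = 1

def loglik(theta):
    # l(theta; y) = n log(theta) - theta * sum(y), dropping a(y)
    return len(y) * np.log(theta) - theta * y.sum()

# Maximum likelihood estimate by direct maximization; for this model the
# closed form is theta_hat = 1/mean(y), which the optimizer should recover.
res = minimize_scalar(lambda t: -loglik(t), bounds=(1e-6, 100.0), method="bounded")
theta_hat = res.x

# Score at the MLE is near zero; log-likelihood ratio at the true value:
score = len(y) / theta_hat - y.sum()
w = 2 * (loglik(theta_hat) - loglik(1.0))
```

The same pattern (write ℓ, maximize numerically, evaluate u and w) carries over to models without closed-form estimates.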

SLIDE 5

Approximate pivotals

  • √n(θ̂ − θ) ∼ N_d{0, j⁻¹(θ̂)} (approximately)
  • w(θ) = 2{ℓ(θ̂) − ℓ(θ)} ∼ χ²_d (approximately)
  • n^{−1/2} U(θ) ∼ N_d{0, j(θ̂)} (approximately)
  • n^{−1/2} U(θ) → N_d{0, I(θ)} (in distribution)
  • where j(θ̂) = −ℓ″(θ̂)/n and I(θ) = E{j(θ)}
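A quick simulation check of the χ²_d approximation for w(θ), using an illustrative exponential model where the MLE has a closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
n, theta0, reps = 40, 1.0, 5000

# Exponential(rate theta) samples: the MLE is theta_hat = 1/ybar, so
# w(theta0) = 2n{log(theta_hat/theta0) - ybar(theta_hat - theta0)}
# has a closed form for each simulated sample.
ybar = rng.exponential(scale=1 / theta0, size=(reps, n)).mean(axis=1)
theta_hat = 1 / ybar
w = 2 * n * (np.log(theta_hat / theta0) - ybar * (theta_hat - theta0))

# Against the chi-squared(1) reference: P{w < 3.841} should be near 0.95.
coverage = np.mean(w < 3.841)
```

With n = 40 the first-order χ² approximation is already close to nominal; higher-order methods tighten it further at small n.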

SLIDE 6

...approximate pivotals

[Figure: log-likelihood function plotted against θ]

SLIDE 7

...approximate pivotals

[Figure: log-likelihood function against θ, with θ̂ marked]

SLIDE 8

...approximate pivotals

[Figure: log-likelihood function against θ, annotated with θ̂ and θ̂ − θ]

SLIDE 9

...approximate pivotals

[Figure: log-likelihood function against θ, annotated with θ̂ and θ̂ − θ]

SLIDE 10

...approximate pivotals

[Figure: log-likelihood function against θ, with the cutoff w(θ)/2 = 1.92 marked]

SLIDE 11

...approximate pivotals

w(θ) = 2{ℓ(θ̂) − ℓ(θ)} ∼ χ²_d (approximately)

[Figure: panel of plots labelled m and M with levels 1–4; the content is not recoverable from the extraction]

SLIDE 12

Likelihood as pivotal

  • Example: location model f(y; θ) = ∏_{i=1}^n f₀(yᵢ − θ), θ ∈ ℝ
  • Fisher (1934): f(θ̂ | a; θ) = exp{ℓ(θ; y)} / ∫ exp{ℓ(θ; y)} dθ
  • (y₁, …, yₙ) ↔ (θ̂, a₁, …, aₙ), with aᵢ = yᵢ − θ̂
  • exact (conditional) distribution of the maximum likelihood estimator given by the renormalized likelihood function
  • p* approximation: p*(θ̂ | a; θ) = c(θ, a) |j(θ̂)|^{1/2} exp{ℓ(θ; θ̂, a) − ℓ(θ̂; θ̂, a)}
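In the normal location model the renormalized likelihood is exactly the density of θ̂, which gives a cheap numerical check of Fisher's formula; n and ȳ below are illustrative:

```python
import numpy as np

# For the N(theta, 1) location model the conditional density of theta_hat is
# the renormalized likelihood itself, so the p* formula is exact here.
n, ybar = 10, 0.3
theta = np.linspace(-2.0, 2.0, 4001)
dx = theta[1] - theta[0]
loglik = -0.5 * n * (ybar - theta) ** 2          # log-likelihood up to a constant
lik = np.exp(loglik - loglik.max())
pstar = lik / (lik.sum() * dx)                    # renormalized likelihood on the grid
exact = np.sqrt(n / (2 * np.pi)) * np.exp(-0.5 * n * (theta - ybar) ** 2)
err = np.max(np.abs(pstar - exact))               # should be near zero
```

For non-normal location families the renormalization is the same computation; only the exact-density comparison is lost.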

SLIDE 13

A simpler approach

  • avoid the transformation (y₁, …, yₙ) ↔ (θ̂, a)
  • define a derivative: φ(θ) ≡ ℓ_{;V}(θ; y⁰) = ∂ℓ(θ; y)/∂V(y) |_{y = y⁰}
  • a directional derivative on the sample space
  • used along with ℓ(θ; y⁰), the observed log-likelihood function
  • can be extended to a derivative of the mean likelihood, usable in a wider context (Fraser & Reid, Biometrika, 2009)

SLIDE 14

Tangent exponential model

  • a continuous model f(y; θ) on ℝⁿ can be approximated by an exponential family model on ℝᵈ:
      f_TEM(s; θ) ds = exp{φ(θ)ᵀ s + ℓ⁰(θ)} h(s) ds   (1)
  • s is a score variable on ℝᵈ: s(y) = −ℓ_φ(θ̂⁰; y)
  • ℓ⁰(θ) = ℓ(θ; y⁰) is the observed log-likelihood function
  • φ(θ) = φ(θ; y⁰) is the directional derivative ℓ_{;V}(θ; y⁰)
  • (1) approximates the original model to O(n⁻¹)
  • gives an approximation to the p-value for testing θ
  • the p-value is accurate to O(n^{−3/2})

SLIDE 15

[Figure: Cauchy density and its tangent exponential model (TEM) approximation; y on the horizontal axis, density on the vertical axis]

SLIDE 16

Example: microscopic fluorescence

  • "tracking of microscopic fluorescent particles attached to biological specimens" (Hughes et al., AOAS, 2010)
  • "CCD (charge-coupled device) camera attached to a microscope used to observe the specimens repeatedly"
  • "we introduce an improved technique for analyzing such images over time"
  • model for counts: Zᵢ ∼ N(fᵢ, fᵢ + ψ), with fᵢ ≈ B + Σⱼ Aⱼ exp[−{(xᵢ − xⱼ)² + (yᵢ − yⱼ)²} / S²]
  • fᵢ is developed from a model for photon emission; normal approximation to the Poisson; ψ captures the instrument error
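A sketch of simulating from this count model on a small pixel grid; the background B, instrument error ψ, spot width S, and the two spot positions/amplitudes are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Count model Z_i ~ N(f_i, f_i + psi) on a 10x10 pixel grid, with
# f_i = B + sum_j A_j exp(-((x_i - x_j)^2 + (y_i - y_j)^2) / S^2).
B, psi, S = 5.0, 2.0, 1.5
spots = [(3.0, 3.0, 40.0), (7.0, 6.0, 25.0)]          # (x_j, y_j, A_j), two fluorophores

xx, yy = np.meshgrid(np.arange(10.0), np.arange(10.0))
f = np.full_like(xx, B)                                # constant background
for xj, yj, Aj in spots:
    f += Aj * np.exp(-((xx - xj) ** 2 + (yy - yj) ** 2) / S ** 2)

Z = rng.normal(loc=f, scale=np.sqrt(f + psi))          # normal approx. to Poisson + noise
```

Fitting then amounts to maximizing the resulting normal log-likelihood over (B, ψ, S) and the spot parameters, with the number of spots chosen by model selection as in the paper.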

SLIDE 17

... microscopic fluorescence

  • "Our method, which applies maximum likelihood principles, improves the fit to the data, derives accurate standard errors from the data with minimal computation, and uses model-selection criteria to 'count' the fluorophores in an image"
  • "likelihood ratio tests are used to select the final model"
  • potential for improved inference using likelihood methods?

SLIDE 18

... a simpler model

  • Yᵢ ∼ N(μᵢ, μᵢ + ψ), μᵢ = exp(β₀ + β₁xᵢ)
  • the approximate pivot r* constructed from ℓ(θ; y⁰) and φ(θ; y⁰) should follow a N(0, 1) distribution; checked by simulation

[Figure: normal Q-Q plot of simulated pivot values; sample quantiles against theoretical quantiles]
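A minimal simulate-and-fit sketch for this simpler model; the true parameter values, the design, and the log-scale parameterization of ψ are illustrative choices, and the fit below is first-order maximum likelihood, not the r* construction:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)

# Simulate Y_i ~ N(mu_i, mu_i + psi) with mu_i = exp(b0 + b1*x_i).
b0, b1, psi = 1.0, 0.5, 2.0
x = np.linspace(0.0, 2.0, 200)
mu = np.exp(b0 + b1 * x)
y = rng.normal(mu, np.sqrt(mu + psi))

def negloglik(par):
    t0, t1, lpsi = par
    m = np.exp(t0 + t1 * x)
    v = m + np.exp(lpsi)          # variance mu + psi, psi > 0 via log scale
    return 0.5 * np.sum(np.log(v) + (y - m) ** 2 / v)

fit = minimize(negloglik, x0=[0.0, 0.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
b0_hat, b1_hat = fit.x[0], fit.x[1]
```

A first-order Wald or signed-root pivot from this fit is the natural comparison point for the r* simulations on the slide.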

SLIDE 19

More realistic models

  • for example, analytic inference for survey data
  • stochastic processes in space or space-time
  • extremes in several dimensions
  • frailty models for survival data
  • longitudinal data
  • family-based genetic data and other forms of clustering
  • estimation of recombination rates from SNP data
  • ...

SLIDE 20

Example: Gaussian random field

  • scalar output y at a p-dimensional input x = (x₁, …, x_p)
  • y(x) = φ(x)ᵀβ + Z(x), with Z(x) a Gaussian process on ℝᵖ
  • Cov{Z(x₁), Z(x₂)} = σ² ∏_{i=1}^p R(|x_{1i} − x_{2i}|; θ)
  • R(|x_{1i} − x_{2i}|) = exp{−γᵢ |x_{1i} − x_{2i}|^α}
  • anisotropic covariance matrix for inputs on different scales
  • application to computer experiments (Ximing Xu, U Toronto; Derek Bingham, SFU)
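The separable (product) correlation over input dimensions can be sketched as follows; the γᵢ values and α = 2 are illustrative:

```python
import numpy as np

# Separable correlation over the p input dimensions:
# R(x1, x2) = prod_i exp(-gamma_i |x1_i - x2_i|^alpha);
# the product becomes a sum inside the exponent.
def product_correlation(X, gamma, alpha=2.0):
    # X: (n, p) inputs; gamma: (p,) positive per-dimension scales
    d = np.abs(X[:, None, :] - X[None, :, :]) ** alpha   # (n, n, p) distances
    return np.exp(-np.sum(gamma * d, axis=-1))

rng = np.random.default_rng(4)
X = rng.uniform(size=(30, 3))
R = product_correlation(X, gamma=np.array([1.0, 5.0, 0.2]))
```

Letting γᵢ differ across dimensions is what makes the covariance anisotropic: a large γᵢ means the field decorrelates quickly along that input.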

SLIDE 21

... Gaussian random field

  • yⁿ = (y₁, …, yₙ) = {y(x₁), …, y(xₙ)}, at n locations xᵢ in ℝᵖ
  • ℓ(β, σ, θ) = −(1/2){n log σ² + log |R(θ)| + (1/σ²)(yⁿ − Φβ)ᵀ R⁻¹(θ)(yⁿ − Φβ)}
  • computation of R⁻¹ is O(n³); n is typically in the hundreds or thousands
  • solution: make the correlation matrix sparse
  • solution: simplify the likelihood function
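The O(n³) cost enters through factorizing R(θ); a standard Cholesky-based evaluation of this log-likelihood, checked against the direct formula on a small illustrative example:

```python
import numpy as np

# Gaussian random-field log-likelihood via a Cholesky factor of R(theta);
# the factorization is the O(n^3) step the slide refers to.
def grf_loglik(y, Phi, beta, sigma2, R):
    n = len(y)
    L = np.linalg.cholesky(R)                  # O(n^3)
    z = np.linalg.solve(L, y - Phi @ beta)     # triangular solve
    logdetR = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (n * np.log(sigma2) + logdetR + z @ z / sigma2)

# Small example with an illustrative exponential correlation in one dimension.
rng = np.random.default_rng(5)
n = 20
x = np.sort(rng.uniform(size=n))
R = np.exp(-5.0 * np.abs(x[:, None] - x[None, :]))
Phi = np.column_stack([np.ones(n), x])
beta, sigma2 = np.array([1.0, -2.0]), 0.5
y = rng.normal(size=n)

sign, logdet = np.linalg.slogdet(R)
resid = y - Phi @ beta
direct = -0.5 * (n * np.log(sigma2) + logdet + resid @ np.linalg.solve(R, resid) / sigma2)
```

The two evaluations agree; the point is that every likelihood evaluation at a new θ repeats the O(n³) factorization, which is what sparse-matrix and simplified-likelihood approaches avoid.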

SLIDE 22

Example: spatial GLM

  • generalized linear geostatistical model: E{Y(x) | Z(x)} = g{φ(x)ᵀβ + Z(x)}, x ∈ ℝ² or ℝ³
  • random intercept Z(x) a stationary Gaussian process
  • observed at n locations: y(xᵢ), i = 1, …, n
  • joint density: f(y; θ) = ∫_{ℝⁿ} ∏_{i=1}^n f(yᵢ | zᵢ; θ) f(z; θ) dz₁ … dzₙ
  • all random effects are correlated
  • simulation methods to evaluate the integral (MCMC, etc.)
  • simplify the likelihood function using bivariate integrals
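In tiny examples the integral over the correlated random effects can be approximated by plain Monte Carlo; a toy Poisson log-linear version at n = 3 locations, with β, Σ, and y invented for illustration:

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(6)

# f(y; theta) = E_Z[ prod_i Poisson(y_i; exp(beta + Z_i)) ],  Z ~ N(0, Sigma).
beta = 1.0
Sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 1.0, 0.5],
                  [0.2, 0.5, 1.0]])
y = np.array([3, 2, 4])

Z = rng.multivariate_normal(np.zeros(3), Sigma, size=200_000)
lam = np.exp(beta + Z)                          # conditional Poisson means
lik = poisson.pmf(y, lam).prod(axis=1).mean()   # Monte Carlo estimate of f(y; theta)
```

This brute-force average is exactly what becomes infeasible as n grows, since the integral is n-dimensional and the random effects do not factor; hence MCMC, or composite likelihoods built from bivariate integrals.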

SLIDE 23

Composite likelihood

  • an m-dimensional vector variable Y with model f(y; θ)
  • a set of marginal or conditional events {A₁, …, A_K} with associated "sub" log-likelihoods ℓ_k(θ; y) = log f(y ∈ A_k; θ) + a(y)
  • composite log-likelihood: ℓ_C(θ; y) = Σ_{k=1}^K ℓ_k(θ; y) + a
  • an inference function obtained by pretending the sub-models are independent (Lindsay, 1988)
  • with a set of non-negative weights w₁, …, w_K: ℓ_C(θ; y) = Σ_{k=1}^K w_k ℓ_k(θ; y)

SLIDE 24

... composite likelihood

  • Example: pairwise log-likelihood
      ℓ_pair(θ) = Σ_{r=1}^m Σ_{s>r} log f₂(y_r, y_s; θ)
  • Example: Besag's pseudo-likelihood
      ℓ_pseudo(θ) = Σ_{r=1}^m log f(y_r | {y_s : y_s a neighbour of y_r}; θ)
  • Example: Gaussian random field, with σ² = 1:
      −(1/2) Σ_{r=1}^{n−1} Σ_{s=r+1}^n {log |R_{r,s}| + (y_{r,s} − Φ_{r,s}β)ᵀ R⁻¹_{r,s} (y_{r,s} − Φ_{r,s}β)}
    where y_{r,s} = (y_r, y_s), with 2 × 2 correlation matrix R_{r,s}
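A direct implementation of the pairwise log-likelihood for the Gaussian random field with σ² = 1 and a constant mean β; each term involves only a 2 × 2 matrix, so no O(n³) inverse appears (the correlation function and data are illustrative):

```python
import numpy as np

def pairwise_loglik(y, x, beta, gamma):
    # Sum over all pairs (r, s), r < s, of bivariate normal log-densities
    # (up to constants), using only each pair's 2x2 correlation matrix.
    n = len(y)
    total = 0.0
    for r in range(n - 1):
        for s in range(r + 1, n):
            rho = np.exp(-gamma * abs(x[r] - x[s]))
            Rrs = np.array([[1.0, rho], [rho, 1.0]])
            e = np.array([y[r] - beta, y[s] - beta])
            total += -0.5 * (np.log(1.0 - rho ** 2) + e @ np.linalg.solve(Rrs, e))
    return total

# One simulated realization of the field, with true mean beta = 2.
rng = np.random.default_rng(7)
x = np.linspace(0.0, 1.0, 15)
R = np.exp(-3.0 * np.abs(x[:, None] - x[None, :]))
y = 2.0 + np.linalg.cholesky(R) @ rng.normal(size=15)
```

The pairwise surface prefers parameter values near the truth, so it can be maximized like an ordinary log-likelihood; its curvature, however, no longer estimates the asymptotic variance directly (next slide).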

SLIDE 25

Estimation from composite likelihood

  • ℓ_C(θ) = Σ_{k=1}^K ℓ_k(θ; y)
  • U_C(θ) = ℓ′_C(θ) is an unbiased estimating function
  • the estimate θ̂_C from U_C(θ̂_C) = 0 is asymptotically normally distributed: θ̂_C ∼ N{θ, G⁻¹(θ)} (approximately)
  • asymptotic variance given by the Godambe information G(θ) = E{−U′_C(θ)} Var{U_C(θ)}⁻¹ E{−U′_C(θ)}
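A sandwich (Godambe-style) variance computation in the simplest possible setting, with K independent replicates of an unbiased estimating function; the toy score below is illustrative:

```python
import numpy as np

rng = np.random.default_rng(8)

# With K independent replicates of U_i(theta) = y_i - theta, the estimate
# solves sum U_i = 0, and avar(theta_hat) = H^{-1} J H^{-1} / K with
# H = E{-U'} and J = var{U}. Data are deliberately non-normal, so the
# sandwich form, not a model-based variance, is the right one.
K = 2000
y = rng.standard_t(df=5, size=K)
theta_hat = y.mean()                       # root of the estimating equation

U = y - theta_hat                          # scores evaluated at the estimate
H = 1.0                                    # -dU_i/dtheta = 1 for each term
J = U.var()
avar = (1.0 / H) * J * (1.0 / H) / K       # sandwich variance of theta_hat
```

For a composite likelihood the same recipe applies with U_C in place of U, except that H and J must be estimated from the correlated sub-scores, which is where the practical difficulty lies.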

SLIDE 26

Inference from composite likelihood

  • inference function ℓ_C(θ)
  • "log-likelihood ratio statistic": w_C(θ) = 2{ℓ_C(θ̂_C) − ℓ_C(θ)}
  • complicated asymptotic distribution: w_C(θ) ∼ Σ_{i=1}^d λᵢ χ²_{1i} (approximately)
  • the λᵢ are eigenvalues of H⁻¹(θ)G(θ), where H(θ) = E{−U′_C(θ)} and G(θ) = H(θ) J⁻¹(θ) H(θ)
  • rescaling based on the score function can restore the χ²_d distribution for w_C (Pace, Salvan & Sartori, 2011)
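The weighted chi-squared limit is easy to simulate, which makes the miscalibration of the naive χ²_d reference concrete; the eigenvalues below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(9)

# w_C is asymptotically sum_i lambda_i * chi2_1; its mean is sum(lambda),
# not d, so treating it as chi2_d gives tests with the wrong size.
lam = np.array([2.0, 1.0, 0.25])
wC = (lam * rng.chisquare(1, size=(100_000, 3))).sum(axis=1)

mean_wC = wC.mean()                       # near lam.sum() = 3.25, not d = 3
q95_naive = 7.815                         # chi2_3 0.95 quantile
actual_level = np.mean(wC > q95_naive)    # nominal 5% test rejects more often
```

The rescaling referred to on the slide adjusts w_C so that its limiting distribution is again χ²_d, removing exactly this size distortion.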

SLIDE 27

Connections to inference from surveys?

  • descriptive parameters defined through an estimating equation: Σ_{i∈P} Uᵢ(θ_P) = 0
  • the estimating equation might be motivated by a model, e.g. a superpopulation model
  • "model assisted inference"
  • estimating equation from the sample: Σ_{i=1}^n wᵢ Uᵢ(θ̂) = 0
  • for example, wᵢ = 1/πᵢ or wᵢ = 1/(πᵢqᵢ)
  • sandwich estimate of variance
  • it's all in the weights... (Wei Lin, Changbao Wu)
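A toy version of the design-weighted estimating equation with wᵢ = 1/πᵢ under Poisson sampling; the population and design are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(10)

# Population mean via U_i(theta) = y_i - theta, with known inclusion
# probabilities pi_i and design weights w_i = 1/pi_i.
N = 10_000
Y = rng.gamma(shape=2.0, scale=3.0, size=N)     # finite population; mean near 6
pi = rng.uniform(0.02, 0.2, size=N)             # unequal inclusion probabilities
take = rng.uniform(size=N) < pi                 # Poisson sampling indicator
w, y = 1.0 / pi[take], Y[take]

# Solve sum_i w_i (y_i - theta_hat) = 0:
theta_hat = np.sum(w * y) / np.sum(w)           # design-weighted (Hajek-type) estimate
```

Changing the weights changes which population quantity the equation targets and how efficiently, which is the sense in which "it's all in the weights".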

SLIDE 28

Guidance from composite likelihood?

  • in composite likelihood inference, some surprises
  • optimal weights may be non-computable, or even negative (Lindsay, Yi, Sun)
  • choice of sub-likelihoods needs some care
  • in some models, including more sub-likelihood terms leads to poorer inference
  • in some models, including higher-dimensional sub-components leads to poorer inference (Ximing Xu)
  • both the choice of weights and the choice of component likelihoods need care

SLIDE 29

Approximate likelihood inference in surveys

  • example: empirical likelihood for nonparametric models
  • ℓ(F) = Σ log pᵢ, with constraints pᵢ > 0, Σ pᵢ = 1, Σ pᵢ yᵢ = θ
  • for inference about θ = E_F(Y), or more generally for parameters defined by estimating functions
  • Chen, Sitter, Wu: pseudo-empirical likelihood
  • design-assisted modelling
  • replace Σ log pᵢ by Σ wᵢ log pᵢ, and the constraint by post-stratification such as Σ_{i=1}^n pᵢ xᵢ = X̄_P
  • confidence intervals using a profile pseudo-empirical likelihood
  • needs adjustment to have an asymptotic χ² distribution: rescaling by the design effect

SLIDE 30

Likelihood for complex models

  • Approximate Bayesian Computation (ABC)
  • "an essential tool for the analysis of complex stochastic models" (Robert et al., 2011, PNAS)
  • generate θ′ from the prior π(θ)
  • generate z from the model p(z | θ′)
  • compare S(z) to S(y) using some distance measure ρ{S(z), S(y)}; if ρ < ε then θ′ is a sample from the posterior π(θ | y)
  • actually a sample from π(θ | y, z), but this is assumed ≈ π(θ | y)
  • Robert et al. show that the method can be poor if "S(·) is far from sufficient"
  • especially for choosing between models
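The steps above can be sketched as a rejection sampler; a conjugate normal example where S(y) = ȳ is sufficient, so ABC can be checked against the exact posterior (n, ε, the prior, and the number of draws are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(11)

# Rejection ABC for a N(theta, 1) model with a N(0, 10^2) prior.
n, eps = 50, 0.02
y = rng.normal(2.0, 1.0, size=n)                      # "observed" data
S_obs = y.mean()                                      # sufficient summary here

theta_prime = rng.normal(0.0, 10.0, size=400_000)     # theta' ~ prior
# S(z) for z ~ p(z | theta'): the mean of n draws can be sampled directly.
S_z = rng.normal(theta_prime, 1.0 / np.sqrt(n))
accepted = theta_prime[np.abs(S_z - S_obs) < eps]     # rho < eps: keep theta'

# Exact conjugate posterior for comparison:
post_var = 1.0 / (n + 1.0 / 100.0)
post_mean = post_var * n * S_obs
```

Shrinking ε trades acceptance rate for accuracy; with a non-sufficient S the accepted sample converges to π{θ | S(y)} rather than π(θ | y), which is the failure mode Robert et al. highlight.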