Nested logit models Michel Bierlaire michel.bierlaire@epfl.ch - PowerPoint PPT Presentation

Nested logit models Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Nested logit models – p. 1/23

Red bus/Blue bus paradox
• Mode choice example
• Two alternatives: car and bus
• There are red buses and blue buses
• Car and bus travel times are equal: $T$

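Red bus/Blue bus paradox
Model 1
\[ U_{\text{car}} = \beta T + \varepsilon_{\text{car}} \]
\[ U_{\text{bus}} = \beta T + \varepsilon_{\text{bus}} \]
Therefore,
\[ P(\text{car} \mid \{\text{car}, \text{bus}\}) = P(\text{bus} \mid \{\text{car}, \text{bus}\}) = \frac{e^{\beta T}}{e^{\beta T} + e^{\beta T}} = \frac{1}{2} \]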

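Red bus/Blue bus paradox
Model 2
\[ U_{\text{car}} = \beta T + \varepsilon_{\text{car}} \]
\[ U_{\text{blue bus}} = \beta T + \varepsilon_{\text{blue bus}} \]
\[ U_{\text{red bus}} = \beta T + \varepsilon_{\text{red bus}} \]
\[ P(\text{car} \mid \{\text{car}, \text{blue bus}, \text{red bus}\}) = \frac{e^{\beta T}}{e^{\beta T} + e^{\beta T} + e^{\beta T}} = \frac{1}{3} \]
\[ \left. \begin{array}{l} P(\text{car} \mid \{\text{car}, \text{blue bus}, \text{red bus}\}) \\ P(\text{blue bus} \mid \{\text{car}, \text{blue bus}, \text{red bus}\}) \\ P(\text{red bus} \mid \{\text{car}, \text{blue bus}, \text{red bus}\}) \end{array} \right\} = \frac{1}{3}. \]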
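The two logit models above can be checked with a few lines of Python. This is only a minimal sketch: the function name `logit_probabilities` and the numerical values of `beta` and `T` are illustrative assumptions, not part of the slides. Because all systematic utilities equal $\beta T$, any values give the same shares.

```python
import math

def logit_probabilities(utilities):
    """Multinomial logit probabilities for a dict {alternative: V}."""
    # Subtract the largest utility before exponentiating for numerical stability.
    v_max = max(utilities.values())
    weights = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(weights.values())
    return {alt: w / total for alt, w in weights.items()}

# Hypothetical values: every alternative has the same utility beta * T.
beta, T = -0.1, 30.0

# Model 1: car and bus.
model_1 = logit_probabilities({"car": beta * T, "bus": beta * T})

# Model 2: the bus is split into two colours, logit treats them as unrelated.
model_2 = logit_probabilities(
    {"car": beta * T, "blue bus": beta * T, "red bus": beta * T}
)

print(model_1)
print(model_2)
```

Painting half of the buses a different colour lowers the car share from 1/2 to 1/3 under logit, which is the paradox.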

Red bus/Blue bus paradox
• Assumption of logit: $\varepsilon$ i.i.d.
• $\varepsilon_{\text{blue bus}}$ and $\varepsilon_{\text{red bus}}$ contain common unobserved attributes:
  ◮ fare
  ◮ headway
  ◮ comfort
  ◮ convenience
  ◮ etc.

Capturing the correlation
[Figure: two-level tree. The root branches into Bus and Car; the Bus node branches into Blue and Red.]

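Capturing the correlation
If bus is chosen, then
\[ U_{\text{blue bus}} = V_{\text{blue bus}} + \varepsilon_{\text{blue bus}} \]
\[ U_{\text{red bus}} = V_{\text{red bus}} + \varepsilon_{\text{red bus}} \]
where $V_{\text{blue bus}} = V_{\text{red bus}} = \beta T$, and
\[ P(\text{blue bus} \mid \{\text{blue bus}, \text{red bus}\}) = \frac{e^{\beta T}}{e^{\beta T} + e^{\beta T}} = \frac{1}{2} \]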

Capturing the correlation
[Figure repeated: two-level tree. The root branches into Bus and Car; the Bus node branches into Blue and Red.]

Capturing the correlation
What about the choice between bus and car?
\[ U_{\text{car}} = \beta T + \varepsilon_{\text{car}} \]
\[ U_{\text{bus}} = V_{\text{bus}} + \varepsilon_{\text{bus}} \]
with
\[ V_{\text{bus}} = V_{\text{bus}}(V_{\text{blue bus}}, V_{\text{red bus}}), \qquad \varepsilon_{\text{bus}} = \,? \]
Define $V_{\text{bus}}$ as the expected maximum utility of red bus and blue bus.

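Expected maximum utility
For a set of alternatives $\mathcal{C}$, define
\[ U_{\mathcal{C}} = \max_{i \in \mathcal{C}} U_i = \max_{i \in \mathcal{C}} (V_i + \varepsilon_i) \]
and
\[ V_{\mathcal{C}} = E[U_{\mathcal{C}}]. \]
For logit,
\[ E\left[\max_{i \in \mathcal{C}} U_i\right] = \frac{1}{\mu} \ln \sum_{i \in \mathcal{C}} e^{\mu V_i}. \]
Actually,
\[ E\left[\max_{i \in \mathcal{C}} U_i\right] = \frac{1}{\mu} \ln \sum_{i \in \mathcal{C}} e^{\mu V_i} + \frac{\gamma}{\mu}, \]
where $\gamma$ is Euler's constant, but the constant term can be ignored.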
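The logsum formula can be checked by simulation. The sketch below assumes that EV$(0, \mu)$ denotes the extreme value (Gumbel) distribution with CDF $F(\varepsilon) = \exp(-e^{-\mu \varepsilon})$, and draws from it by inverting that CDF. The utilities in `V`, the scale `mu`, the seed and the number of draws are arbitrary illustrative choices.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def draw_ev(rng, mu):
    """Draw from EV(0, mu), assuming the CDF F(e) = exp(-exp(-mu * e))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u)) / mu

def logsum(values, mu):
    """(1/mu) * ln(sum_i exp(mu * V_i))."""
    return math.log(sum(math.exp(mu * v) for v in values)) / mu

def simulated_expected_max(values, mu, draws=100_000, seed=0):
    """Monte Carlo estimate of E[max_i (V_i + eps_i)] with i.i.d. EV(0, mu) errors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += max(v + draw_ev(rng, mu) for v in values)
    return total / draws

V = [1.0, 0.5, -0.2]  # hypothetical systematic utilities
mu = 1.5              # hypothetical scale parameter

closed_form = logsum(V, mu) + EULER_GAMMA / mu
simulated = simulated_expected_max(V, mu)
print(closed_form, simulated)
```

The two numbers should agree up to simulation noise. Dropping `EULER_GAMMA / mu` shifts every expected maximum by the same amount, which is why the constant can be ignored when all nests share the same scale.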

Expected maximum utility
\[ V_{\text{bus}} = \frac{1}{\mu_b} \ln\left( e^{\mu_b V_{\text{blue bus}}} + e^{\mu_b V_{\text{red bus}}} \right) = \frac{1}{\mu_b} \ln\left( e^{\mu_b \beta T} + e^{\mu_b \beta T} \right) = \beta T + \frac{1}{\mu_b} \ln 2 \]
where $\mu_b$ is the scale parameter for the logit model associated with the choice between red bus and blue bus.

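Nested Logit Model
Probability model:
\[ P(\text{car}) = \frac{e^{\mu V_{\text{car}}}}{e^{\mu V_{\text{car}}} + e^{\mu V_{\text{bus}}}} = \frac{e^{\mu \beta T}}{e^{\mu \beta T} + e^{\mu \beta T + \frac{\mu}{\mu_b} \ln 2}} = \frac{1}{1 + 2^{\mu/\mu_b}} \]
If $\mu = \mu_b$, then $P(\text{car}) = \frac{1}{3}$ (Model 2).
If $\mu_b \to \infty$, then $\frac{\mu}{\mu_b} \to 0$, and $P(\text{car}) \to \frac{1}{2}$ (Model 1).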

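Nested Logit Model
Probability model:
\[ P(\text{bus}) = \frac{e^{\mu V_{\text{bus}}}}{e^{\mu V_{\text{car}}} + e^{\mu V_{\text{bus}}}} = \frac{e^{\mu \beta T + \frac{\mu}{\mu_b} \ln 2}}{e^{\mu \beta T} + e^{\mu \beta T + \frac{\mu}{\mu_b} \ln 2}} = \frac{1}{1 + 2^{-\mu/\mu_b}} \]
If $\mu = \mu_b$, then $P(\text{bus}) = \frac{2}{3}$ (Model 2).
If $\frac{\mu}{\mu_b} \to 0$, then $P(\text{bus}) \to \frac{1}{2}$ (Model 1).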
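The closed forms for $P(\text{car})$ and $P(\text{bus})$ can be reproduced by first computing the bus logsum and then applying the upper-level logit. A minimal sketch, in which the function names and the value of `beta_T` are hypothetical:

```python
import math

def bus_logsum(beta_T, mu_b):
    """V_bus = (1/mu_b) * ln(exp(mu_b*V_blue) + exp(mu_b*V_red)), V_blue = V_red = beta_T."""
    return math.log(2.0 * math.exp(mu_b * beta_T)) / mu_b

def upper_level(beta_T, mu, mu_b):
    """Return (P(car), P(bus)) from the logit between car and the bus nest."""
    e_car = math.exp(mu * beta_T)
    e_bus = math.exp(mu * bus_logsum(beta_T, mu_b))
    return e_car / (e_car + e_bus), e_bus / (e_car + e_bus)

beta_T = -3.0  # hypothetical value of beta * T
mu = 1.0       # upper-level scale, fixed for illustration

for ratio in (1.0, 0.5, 0.1, 0.01):
    mu_b = mu / ratio
    p_car, p_bus = upper_level(beta_T, mu, mu_b)
    # Compare with the closed forms 1 / (1 + 2**ratio) and 1 / (1 + 2**(-ratio)).
    print(ratio, p_car, 1.0 / (1.0 + 2.0 ** ratio), p_bus, 1.0 / (1.0 + 2.0 ** (-ratio)))
```

At `ratio = 1.0` the shares are those of Model 2, and as the ratio shrinks they approach the 1/2 and 1/2 of Model 1.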

Nested Logit Model
[Figure: P(car) and P(bus) plotted against $\mu/\mu_b$, for $\mu/\mu_b$ between 0 and 1; the vertical axis runs from 0 to 1.]

Solving the paradox
If $\frac{\mu}{\mu_b} \to 0$, we have
\[ P(\text{car}) = 1/2, \qquad P(\text{bus}) = 1/2 \]
\[ P(\text{red bus} \mid \text{bus}) = 1/2, \qquad P(\text{blue bus} \mid \text{bus}) = 1/2 \]
\[ P(\text{red bus}) = P(\text{red bus} \mid \text{bus}) \, P(\text{bus}) = 1/4 \]
\[ P(\text{blue bus}) = P(\text{blue bus} \mid \text{bus}) \, P(\text{bus}) = 1/4 \]

Comments
• A group of similar alternatives is called a nest
• Each alternative belongs to exactly one nest
• The model is named Nested Logit
• The ratio $\mu/\mu_b$ must be estimated from the data
• $0 < \mu/\mu_b \leq 1$ (between models 1 and 2)

Derivation from random utility
• Let $\mathcal{C}$ be the choice set.
• Let $\mathcal{C}_1, \ldots, \mathcal{C}_M$ be a partition of $\mathcal{C}$.
• The model is derived as
\[ P(i \mid \mathcal{C}) = \sum_{m=1}^{M} \Pr(i \mid m, \mathcal{C}) \Pr(m \mid \mathcal{C}). \]
• Each $i$ belongs to exactly one nest $m$:
\[ P(i \mid \mathcal{C}) = \Pr(i \mid m) \Pr(m \mid \mathcal{C}). \]
• Utility: error components
\[ U_i = V_i + \varepsilon_i = V_i + \varepsilon_m + \varepsilon_{im}. \]

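Derivation: $\Pr(i \mid m)$
\[ \begin{aligned} \Pr(i \mid m) &= \Pr(U_i \geq U_j, \; j \in \mathcal{C}_m) \\ &= \Pr(V_i + \varepsilon_m + \varepsilon_{im} \geq V_j + \varepsilon_m + \varepsilon_{jm}, \; j \in \mathcal{C}_m) \\ &= \Pr(V_i + \varepsilon_{im} \geq V_j + \varepsilon_{jm}, \; j \in \mathcal{C}_m) \end{aligned} \]
Assumption: $\varepsilon_{im}$ i.i.d. EV$(0, \mu_m)$
\[ \Pr(i \mid m) = \frac{e^{\mu_m V_i}}{\sum_{j \in \mathcal{C}_m} e^{\mu_m V_j}}. \]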

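Derivation: $\Pr(m \mid \mathcal{C})$
\[ \begin{aligned} \Pr(m \mid \mathcal{C}) &= \Pr\left( \max_{i \in \mathcal{C}_m} U_i \geq \max_{j \in \mathcal{C}_\ell} U_j, \; \forall \ell \neq m \right) \\ &= \Pr\left( \varepsilon_m + \max_{i \in \mathcal{C}_m} (V_i + \varepsilon_{im}) \geq \varepsilon_\ell + \max_{j \in \mathcal{C}_\ell} (V_j + \varepsilon_{j\ell}), \; \forall \ell \neq m \right). \end{aligned} \]
As $\varepsilon_{im}$ are i.i.d. EV$(0, \mu_m)$,
\[ \max_{i \in \mathcal{C}_m} (V_i + \varepsilon_{im}) \sim \text{EV}(\tilde{V}_m, \mu_m), \]
where
\[ \tilde{V}_m = \frac{1}{\mu_m} \ln \sum_{i \in \mathcal{C}_m} e^{\mu_m V_i}. \]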

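Derivation: $\Pr(m \mid \mathcal{C})$
Denote
\[ \max_{i \in \mathcal{C}_m} (V_i + \varepsilon_{im}) = \tilde{V}_m + \varepsilon'_m, \]
to obtain
\[ \Pr(m \mid \mathcal{C}) = \Pr(\tilde{V}_m + \varepsilon'_m + \varepsilon_m \geq \tilde{V}_\ell + \varepsilon'_\ell + \varepsilon_\ell, \; \forall \ell \neq m), \]
where $\varepsilon'_m \sim \text{EV}(0, \mu_m)$. Define
\[ \tilde{\varepsilon}_m = \varepsilon'_m + \varepsilon_m, \]
to obtain
\[ \Pr(m \mid \mathcal{C}) = \Pr(\tilde{V}_m + \tilde{\varepsilon}_m \geq \tilde{V}_\ell + \tilde{\varepsilon}_\ell, \; \forall \ell \neq m). \]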

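Derivation: $\Pr(m \mid \mathcal{C})$
Assumption: $\tilde{\varepsilon}_m$ i.i.d. EV$(0, \mu)$
\[ \Pr(m \mid \mathcal{C}) = \Pr(\tilde{V}_m + \tilde{\varepsilon}_m \geq \tilde{V}_\ell + \tilde{\varepsilon}_\ell, \; \forall \ell \neq m) = \frac{e^{\mu \tilde{V}_m}}{\sum_{p=1}^{M} e^{\mu \tilde{V}_p}}. \]
We obtain the nested logit model
\[ \begin{aligned} P(i \mid \mathcal{C}) &= \frac{e^{\mu_m V_i}}{\sum_{j \in \mathcal{C}_m} e^{\mu_m V_j}} \cdot \frac{e^{\mu \tilde{V}_m}}{\sum_{p=1}^{M} e^{\mu \tilde{V}_p}} \\ &= \frac{e^{\mu_m V_i}}{\sum_{j \in \mathcal{C}_m} e^{\mu_m V_j}} \cdot \frac{\exp\left( \frac{\mu}{\mu_m} \ln \sum_{\ell \in \mathcal{C}_m} e^{\mu_m V_\ell} \right)}{\sum_{p=1}^{M} \exp\left( \frac{\mu}{\mu_p} \ln \sum_{\ell \in \mathcal{C}_p} e^{\mu_p V_\ell} \right)} \end{aligned} \]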
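The final formula translates directly into code. The sketch below is one possible way to organise the computation, not an estimation routine. The function name, the dictionary-based data layout and the numerical values are all assumptions made for illustration, and the exponentials are evaluated naively, so very large utilities would need rescaling.

```python
import math

def nested_logit_probabilities(V, nests, mu, mu_nest):
    """
    V        : dict alternative -> systematic utility V_i
    nests    : dict nest -> list of alternatives (must partition the choice set)
    mu       : scale of the upper-level (nest) choice
    mu_nest  : dict nest -> scale mu_m of the within-nest choice
    """
    # Logsum V~_m = (1/mu_m) ln sum_{i in C_m} exp(mu_m V_i) for each nest.
    logsums = {
        m: math.log(sum(math.exp(mu_nest[m] * V[i]) for i in alts)) / mu_nest[m]
        for m, alts in nests.items()
    }
    # Denominator of Pr(m | C): sum_p exp(mu * V~_p).
    denom = sum(math.exp(mu * v) for v in logsums.values())

    probs = {}
    for m, alts in nests.items():
        p_nest = math.exp(mu * logsums[m]) / denom                    # Pr(m | C)
        within = sum(math.exp(mu_nest[m] * V[j]) for j in alts)
        for i in alts:
            p_within = math.exp(mu_nest[m] * V[i]) / within           # Pr(i | m)
            probs[i] = p_within * p_nest
    return probs

# Red bus/blue bus example with a hypothetical beta * T = -3 and mu / mu_b = 0.5.
V = {"car": -3.0, "blue bus": -3.0, "red bus": -3.0}
nests = {"car": ["car"], "bus": ["blue bus", "red bus"]}
probs = nested_logit_probabilities(V, nests, mu=1.0, mu_nest={"car": 1.0, "bus": 2.0})
print(probs)
```

With a single-alternative nest such as car, the logsum reduces to $V_{\text{car}}$ whatever scale is assigned to that nest, so the value given for it in `mu_nest` has no effect.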

Nested Logit Model
• If $\frac{\mu}{\mu_m} = 1$ for all $m$, NL becomes logit.
• Sequential estimation:
  • Estimation of NL decomposed into two estimations of logit
  • Estimator is consistent but not efficient
• Simultaneous estimation:
  • Log-likelihood function is generally non-concave
  • No guarantee of global maximum
  • Estimator asymptotically efficient
• Log-likelihood for observation $n$ is
\[ \ln P(i_n \mid \mathcal{C}_n) = \ln P(i_n \mid \mathcal{C}_{mn}) + \ln P(\mathcal{C}_{mn} \mid \mathcal{C}_n) \]
where $i_n$ is the chosen alternative.

Correlation
Correlation matrix is block diagonal:
\[ \text{Corr}(U_i, U_j) = \begin{cases} 1 & \text{if } i = j, \\ 1 - \dfrac{\mu^2}{\mu_m^2} & \text{if } i \neq j, \; i \text{ and } j \text{ are in the same nest } m, \\ 0 & \text{otherwise.} \end{cases} \]
Variance-covariance matrix is block diagonal:
\[ \text{Cov}(U_i, U_j) = \begin{cases} \dfrac{\pi^2}{6 \mu^2} & \text{if } i = j, \\ \dfrac{\pi^2}{6 \mu^2} - \dfrac{\pi^2}{6 \mu_m^2} & \text{if } i \neq j, \; i \text{ and } j \text{ are in the same nest } m, \\ 0 & \text{otherwise.} \end{cases} \]
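The block-diagonal structure can be made concrete by building both matrices for the red bus/blue bus example. This is a minimal sketch: the function name, the `nest_of` mapping and the scale values are hypothetical choices for illustration.

```python
import math

def nested_logit_moments(alternatives, nest_of, mu, mu_nest):
    """Return (cov, corr) as lists of lists, following the case formulas above."""
    n = len(alternatives)
    variance = math.pi ** 2 / (6.0 * mu ** 2)
    cov = [[0.0] * n for _ in range(n)]
    corr = [[0.0] * n for _ in range(n)]
    for a, i in enumerate(alternatives):
        for b, j in enumerate(alternatives):
            if a == b:
                cov[a][b] = variance
                corr[a][b] = 1.0
            elif nest_of[i] == nest_of[j]:
                mu_m = mu_nest[nest_of[i]]
                cov[a][b] = variance - math.pi ** 2 / (6.0 * mu_m ** 2)
                corr[a][b] = 1.0 - mu ** 2 / mu_m ** 2
            # Alternatives in different nests keep covariance and correlation 0.
    return cov, corr

alternatives = ["car", "blue bus", "red bus"]
nest_of = {"car": "car", "blue bus": "bus", "red bus": "bus"}
cov, corr = nested_logit_moments(
    alternatives, nest_of, mu=1.0, mu_nest={"car": 1.0, "bus": 2.0}
)
for row in corr:
    print(row)
```

With $\mu/\mu_b = 0.5$ the two buses have correlation $1 - 0.25 = 0.75$, while car is uncorrelated with both. Setting the bus scale equal to `mu` gives the identity correlation matrix of logit.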
