The Price of Competition: Effect Size Heterogeneity Matters in High Dimensions!


  1. The Price of Competition: Effect Size Heterogeneity Matters in High Dimensions! Joint work with Yachong Yang and Weijie Su. Hua Wang, The Wharton School, University of Pennsylvania. June 2, 2020.

  2–3. Settings: Model selection in high dimensions. High-dimensional linear regression: $y = X\beta + z$, where $y$ is $n \times 1$, $X$ is $n \times p$, $\beta$ is $p \times 1$, and $z$ is $n \times 1$. An important question of great practical value is model selection. How hard is model selection? An intuitive answer: it depends on sparsity (as long as the signals are large enough, e.g., a beta-min condition).
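Not part of the slides: a minimal sketch of this data-generating process in Python, using the simulation sizes quoted later in the deck ($n = 1000$, $p = 1000$, $s = 200$, $\sigma = 0.01$) and the Gaussian design from the main-results slide; the equal-magnitude choice for $\beta$ is a placeholder.

```python
import numpy as np

# Minimal sketch of the linear model y = X beta + z in the deck's notation.
# n, p, s, sigma match the deck's simulation slide; the beta values are
# placeholders, not the slides' actual coefficients.
rng = np.random.default_rng(0)
n, p, s = 1000, 1000, 200

X = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, p))  # iid N(0, 1/n) entries
beta = np.zeros(p)
beta[:s] = 1.0                         # s nonzero (relevant) coefficients
z = 0.01 * rng.normal(size=n)          # weak noise, sigma = 0.01
y = X @ beta + z
```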

  4–6. Performance criteria: FDP and TPP. Relevant variables (or signals): $S = \{ j : \beta_j \neq 0 \}$. Discoveries, or the model selected at $\lambda$: $\widehat{S} = \{ j : \widehat{\beta}_j(\lambda) \neq 0 \}$. Then $$\mathrm{FDP}(\lambda) := \frac{\#\{ j \in \widehat{S} : \beta_j = 0 \}}{\#\widehat{S}}, \qquad \mathrm{TPP}(\lambda) := \frac{\#\{ j \in \widehat{S} : \beta_j \neq 0 \}}{\#\{ j : \beta_j \neq 0 \}}.$$ [Figure: Venn diagram of the true and estimated models, with 300 signals missed, 100 true discoveries, and 200 false discoveries, giving $\mathrm{FDP} = 200/(100+200)$ and $\mathrm{TPP} = 100/(300+100)$.]
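As an illustration (not from the deck), these two criteria are straightforward to compute from the true $\beta$ and any estimate, e.g. the Lasso solution at a given $\lambda$:

```python
def fdp_tpp(beta, beta_hat):
    """False discovery proportion and true positive proportion of a selection."""
    true_support = beta != 0
    selected = beta_hat != 0
    false_disc = (selected & ~true_support).sum()
    true_disc = (selected & true_support).sum()
    fdp = false_disc / max(selected.sum(), 1)    # FDP = 0 if nothing is selected
    tpp = true_disc / max(true_support.sum(), 1)
    return fdp, tpp
```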

  7–9. Folklore theorem of signal strength. When $p > n$, the Lasso is the popular method for variable selection. Belief (some folks, nowadays): with $\|\beta\|_0$ fixed, the stronger all signals are, the better a model selector (e.g., the Lasso) will perform. Is it really the case?

  10. In which setting does the Lasso perform best? $n = 1000$, $p = 1000$, $s = 200$, with weak noise $\sigma = 0.01$. The structure of the signals: Setting 1: strongest; Setting 2: strong; Setting 3: weak; Setting 4: weakest.
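The actual coefficients live in a figure that did not survive extraction, so the sketch below is only a guess at the settings' shape: four signal vectors of identical sparsity, ranging from equal large magnitudes ("strongest", and, as the deck later re-labels it, most homogeneous) to widely spread magnitudes ("weakest", most heterogeneous). All numeric values are hypothetical.

```python
# Hypothetical construction: the real coefficient values are in the slide's
# figure. A larger `spread` makes magnitudes decay faster, so the vector is
# both weaker on average and more heterogeneous, matching the deck's re-labels.
def make_beta(p, s, spread):
    """s nonzero coefficients decaying geometrically; spread=1.0 is homogeneous."""
    beta = np.zeros(p)
    beta[:s] = 50.0 * spread ** (-np.arange(s))  # illustrative values only
    return beta

betas = {
    "Setting 1 (strongest / most homogeneous)":  make_beta(p, s, 1.00),
    "Setting 2 (strong / homogeneous)":          make_beta(p, s, 1.01),
    "Setting 3 (weak / heterogeneous)":          make_beta(p, s, 1.03),
    "Setting 4 (weakest / most heterogeneous)":  make_beta(p, s, 1.05),
}
```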

  11–14. The result... Surprisingly... [Figure: FDP-versus-TPP curves for the four settings, revealed one setting at a time.] The TPP and FDP are calculated along the Lasso path as $\lambda$ varies from $\infty$ to $0$.
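One way to trace such curves, assuming scikit-learn's `lasso_path` and the helpers sketched above (the grid size and what gets printed are my choices, not the slides'):

```python
from sklearn.linear_model import lasso_path

def path_curve(X, beta, y):
    """(FDP, TPP) at each lambda on the Lasso path, from large lambda to small."""
    alphas, coefs, _ = lasso_path(X, y, n_alphas=100)  # coefs: (p, n_alphas)
    return alphas, [fdp_tpp(beta, coefs[:, k]) for k in range(coefs.shape[1])]

for name, beta_k in betas.items():
    y_k = X @ beta_k + 0.01 * rng.normal(size=n)
    _, curve = path_curve(X, beta_k, y_k)
    print(name, curve[-1])  # (FDP, TPP) at the smallest lambda on the grid
```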

  15–20. Lasso prefers weak signals?? Everything (including sparsity) except the strength of the signals is the same, yet the Lasso performs better with the weaker signals! Our explanation: the Lasso favors strong signals, as we expected, but it also "prefers" signals that differ wildly from each other. We term this diverse structure of signals "Effect Size Heterogeneity". With everything else fixed, the Lasso performs best with the most heterogeneous signals. Effect Size Heterogeneity matters!

  21. Which setting will the Lasso perform best in? (Revisit) Setting 1: most homogeneous; Setting 2: homogeneous; Setting 3: heterogeneous; Setting 4: most heterogeneous.

  22–23. Theory of the Lasso in the literature. Belief (literature¹, nowadays; informal): given $k = \|\beta\|_0$ and the structure of $X$ ($n$, $p$, RIP conditions, etc.), we can understand the Lasso (as a model selector) well, especially if the signals are sufficiently large (beta-min condition). Theorem (W., Yang and Su, 2020; informal): the information $(\|\beta\|_0, X)$ is not enough; we need to know more about the inner structure of $\beta$. ¹ E.g., E. Candès and T. Tao, 2007; P. J. Bickel, Y. Ritov, and A. B. Tsybakov, 2009; M. J. Wainwright, 2009.

  24. Main results. Assume $X$ has iid $N(0, 1/n)$ entries, $\sigma = 0$ (i.e., noise $z_i = 0$), the regression coefficients $\beta_i$ are iid draws from a prior $\Pi$ with $\mathbb{E}\,\Pi^2 < \infty$ and $\mathbb{P}(\Pi \neq 0) = \epsilon \in (0, 1)$, and $n/p \to \delta \in (0, \infty)$. Theorem (W., Yang and Su, 2020+): with probability tending to one, $$q^{\triangle}(\mathrm{TPP}(\lambda)) - 0.001 \;\le\; \mathrm{FDP}(\lambda) \;\le\; q^{\triangledown}(\mathrm{TPP}(\lambda)) + 0.001$$ uniformly for all $\lambda$, where $q^{\triangle}(\cdot) = q^{\triangle}(\cdot\,; \delta, \epsilon) > 0$ and $q^{\triangledown}(\cdot) = q^{\triangledown}(\cdot\,; \delta, \epsilon) < 1$ are two deterministic functions.

  25. The Lasso Crescent. [Figure: in the TPP–FDP plane (TPP from 0 to 1), the curves $q^{\triangledown}$ and $q^{\triangle}$ bound the "Lasso Crescent"; the region below $q^{\triangle}$ is the Unachievable Zone.]

  26. The sharpest of the Lasso Crescent. Definition (most favorable prior): for $M > 0$ and an integer $m > 0$, we call the following the $(\epsilon, m, M)$-prior: $$\Pi^{\triangle} = \begin{cases} 0 & \text{w.p. } 1 - \epsilon \\ M & \text{w.p. } \epsilon/m \\ M^2 & \text{w.p. } \epsilon/m \\ \cdots & \cdots \\ M^m & \text{w.p. } \epsilon/m. \end{cases}$$ Definition (least favorable prior): for $M > 0$, we call the following the $(\epsilon, M)$-prior: $$\Pi^{\triangledown} = \begin{cases} 0 & \text{w.p. } 1 - \epsilon \\ M & \text{w.p. } \epsilon. \end{cases}$$ Theorem (Effect Size Heterogeneity Matters!): $\Pi^{\triangledown}$ achieves $q^{\triangledown}$, and $\Pi^{\triangle}$ achieves $q^{\triangle}$, as $M, m \to \infty$.
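A small sketch of samplers for these two priors; $\epsilon$, $M$, and $m$ are the free parameters from the definitions:

```python
def sample_het_prior(p, eps, M, m, rng):
    """(eps, m, M)-prior: 0 w.p. 1 - eps, else M**k with k uniform on {1,...,m}."""
    k = rng.integers(1, m + 1, size=p)     # exponent, uniform over 1..m
    nonzero = rng.random(p) < eps          # each coordinate is a signal w.p. eps
    return np.where(nonzero, M ** k.astype(float), 0.0)

def sample_hom_prior(p, eps, M, rng):
    """(eps, M)-prior: 0 w.p. 1 - eps, M w.p. eps (absolutely homogeneous)."""
    return np.where(rng.random(p) < eps, M, 0.0)
```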

  27. The Lasso Crescent (revisit). [Figure: the Lasso Crescent again, with its boundaries now identified: $q^{\triangledown}$ above, $q^{\triangle}$ below, and the Unachievable Zone underneath.]

  28. Remarks on the results. Theorem (W., Yang and Su, 2020+): with probability tending to one, $$q^{\triangle}(\mathrm{TPP}(\lambda)) - 0.001 \;\le\; \mathrm{FDP}(\lambda) \;\le\; q^{\triangledown}(\mathrm{TPP}(\lambda)) + 0.001$$ for all $\lambda > 0.01$, where $q^{\triangle}(\cdot)$ and $q^{\triangledown}(\cdot)$ are two deterministic functions. Moreover, $\Pi^{\triangledown}$ (absolutely homogeneous) gives $q^{\triangledown}$, and $\Pi^{\triangle}$ (absolutely heterogeneous) gives $q^{\triangle}$.
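To see the theorem's content empirically, one can draw $\beta$ from each extreme prior, run the Lasso path in the noiseless setting, and compare the FDP at a matched TPP; an end-to-end sketch reusing the helpers above, with purely illustrative parameter values:

```python
# Illustrative check: beta from the homogeneous prior (Pi triangledown) should
# give an FDP-TPP curve near the upper boundary q triangledown, while the
# heterogeneous prior (Pi triangle) should land near the lower boundary
# q triangle. eps, M, m below are illustrative, not from the paper.
eps, M, m = 0.2, 10.0, 5
for name, beta_k in [
    ("homogeneous  (Pi triangledown)", sample_hom_prior(p, eps, M, rng)),
    ("heterogeneous (Pi triangle)   ", sample_het_prior(p, eps, M, m, rng)),
]:
    y_k = X @ beta_k               # noiseless, sigma = 0, as in the theorem
    _, curve = path_curve(X, beta_k, y_k)
    fdp_at_half = next((f for f, t in curve if t >= 0.5), float("nan"))
    print(name, "FDP when TPP first reaches 0.5:", round(fdp_at_half, 3))
```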
