Feature Selection Risk Alex Chinco University of Illinois at - PowerPoint PPT Presentation

Feature Selection Risk Alex Chinco University of Illinois at Urbana-Champaign September 15, 2014

“Our model allows us to identify and interpret events faster than more traditional methods used by other investors.” (Quant. Fund Pitch Book)

Imagine you’re a trader. Each stock can have Y/N exposure to 7 features. Whether or not...

1. It’s involved in a crowded trade
2. It’s mentioned in M&A rumors
3. Its major supplier closed down
4. Its labor force unionized
5. It belongs to the alcohol/tobacco/gaming industry
6. It’s referenced in a scientific article
7. It’s been added to the S&P 500

1 of the 7 features might have realized a shock. Exposure to this mystery feature raises demand for a stock by α > 0 shares.

Question: How many observations do you need to see in order to decide which (if any) of the 7 features has realized a shock?

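Answer: Only 3!

◮ Stock 1: crowded trade, supplier closure, ATG ind., S&P 500 add.
◮ Stock 2: M&A rumor, supplier closure, sci. article, S&P 500 add.
◮ Stock 3: labor unionization, ATG ind., sci. article, S&P 500 add.

Data matrix X (3 × 7) tells you if stock n has attribute q:

\[
x_{n,q} = \begin{cases} 1 & \text{if yes} \\ 0 & \text{if no} \end{cases}
\qquad \text{with } \epsilon_n \overset{iid}{\sim} N(0, \sigma_\epsilon^2), \quad \alpha \gg \sigma_\epsilon
\]

e.g., if only d_1 ≈ α, then crowded trade shock:

\[
\underbrace{\begin{pmatrix} \alpha \\ 0 \\ 0 \end{pmatrix}}_{(d)_{3 \times 1}}
\approx
\underbrace{\begin{bmatrix}
1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 & 1
\end{bmatrix}}_{(X)_{3 \times 7}}
\underbrace{\begin{pmatrix} \alpha \\ 0 \\ \vdots \\ 0 \end{pmatrix}}_{(\alpha)_{7 \times 1}}
+
\underbrace{\begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \end{pmatrix}}_{(\epsilon)_{3 \times 1}}
\]

e.g., if d_1 ≈ d_2 ≈ d_3 ≈ α, then S&P 500 addition shock.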
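The decoding step in this example can be sketched in a few lines. The following is only an illustration under stated assumptions: the function name `identify_shock` and the α/2 cutoff are hypothetical choices (not from the slides), and the cutoff is sensible only because α ≫ σ_ε.

```python
import numpy as np

FEATURES = [
    "crowded trade", "M&A rumor", "supplier closure", "labor unionization",
    "alcohol/tobacco/gaming", "scientific article", "S&P 500 addition",
]

# Rows are stocks 1-3, columns are features 1-7 (1 = exposed, 0 = not exposed).
X = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])

def identify_shock(d, alpha):
    """Return the name of the shocked feature, or None if no stock moved.

    Assumption: noise is small relative to alpha, so alpha / 2 is a safe cutoff.
    """
    pattern = (np.asarray(d, dtype=float) > alpha / 2).astype(int)
    if not pattern.any():
        return None
    # Each feature has a distinct column, so at most one column can match.
    for q in range(X.shape[1]):
        if np.array_equal(X[:, q], pattern):
            return FEATURES[q]
    return None

crowded = identify_shock([1.02, -0.01, 0.03], alpha=1.0)
sp500 = identify_shock([0.98, 1.01, 0.97], alpha=1.0)
nothing = identify_shock([0.01, -0.02, 0.00], alpha=1.0)
```

Three observations suffice here because the 7 columns of X are exactly the 7 distinct nonzero 0/1 patterns of length 3, so every demand pattern points to one feature.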

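Key Insight: Inference problem changes character at N⋆ = 3.

First, imagine you’ve seen N = 4 observations:

\[
\underbrace{\begin{pmatrix} \alpha \\ 0 \\ 0 \\ \alpha \end{pmatrix}}_{(d)_{4 \times 1}}
\approx
\underbrace{\begin{bmatrix}
1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 1 & 1 & 0
\end{bmatrix}}_{(X)_{4 \times 7}}
\underbrace{\begin{pmatrix} \alpha \\ 0 \\ \vdots \\ 0 \end{pmatrix}}_{(\alpha)_{7 \times 1}}
+
\underbrace{\begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \end{pmatrix}}_{(\epsilon)_{4 \times 1}}
\]

Estimate of α is now (d_1 + d_4)/2 ≈ α ± σ_ε/√2.

Now, imagine you’ve instead seen only N = 2 observations:

\[
\underbrace{\begin{pmatrix} \alpha \\ 0 \end{pmatrix}}_{(d)_{2 \times 1}}
\approx
\underbrace{\begin{bmatrix}
1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1
\end{bmatrix}}_{(X)_{2 \times 7}}
\underbrace{\begin{pmatrix} \alpha \\ 0 \\ \vdots \\ 0 \end{pmatrix}}_{(\alpha)_{7 \times 1}}
+
\underbrace{\begin{pmatrix} \epsilon_1 \\ \epsilon_2 \end{pmatrix}}_{(\epsilon)_{2 \times 1}}
\]

Could be either crd. trade or ATG ind. How to value 3rd asset?

\[
x_3 = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}
\]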
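To make the contrast between N = 2 and N = 4 concrete, here is a small sketch that lists every feature consistent with the observed demand pattern. The helper name `candidates` and the α/2 cutoff are assumptions made for illustration.

```python
import numpy as np

# Exposure matrix for stocks 1-4 (rows) and features 1-7 (columns).
X4 = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1, 0],
])

def candidates(d, X, alpha):
    """Return the 1-based indices of features whose column matches the demand pattern."""
    pattern = (np.asarray(d, dtype=float) > alpha / 2).astype(int)
    return [q + 1 for q in range(X.shape[1]) if np.array_equal(X[:, q], pattern)]

# Only two observations: features 1 (crowded trade) and 5 (ATG) look identical.
with_two = candidates([1.0, 0.0], X4[:2], alpha=1.0)

# Four observations: the pattern is unique, and two exposed stocks can be averaged.
d4 = np.array([1.04, 0.01, -0.02, 0.98])
with_four = candidates(d4, X4, alpha=1.0)
alpha_hat = d4[[0, 3]].mean()  # (d_1 + d_4) / 2
```

With two observations the third stock cannot be valued, because its demand depends on whether feature 1 or feature 5 was shocked and x_3 loads on only one of them.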

This is a stylized example, but... the problem scales!

Suppose Q = 400, K = 5, and x_{n,q} \overset{iid}{\sim} N(0, 1):

\[
d_n = \tilde{d}_n - \mathrm{E}[\tilde{d}_n \mid f] = \sum_{q=1}^{400} \alpha_q \cdot x_{n,q} + \epsilon_n
\]

[Figure: three panels (Bonferroni Threshold, FDR Threshold, LASSO) plotting the selection error \( \tfrac{1}{25} \cdot \big( \sum_{q=1}^{400} 1\{\alpha_q \neq \hat{\alpha}_q\} \big)^2 \) on a 0 to 1 scale against log(N), with N⋆ ≈ 22 marked in each panel.]

1) Derive feature selection bound
2) Embed in eqm. asset-pricing model
3) Outline empirical predictions:
   ◮ Noise trader and feature selection risks are substitutes.
   ◮ Derivatives more informative than Arrow securities.

Slogan: There are fundamental limits on how quickly even the most sophisticated trader can interpret market signals.

Sparse B.R.: Gabaix (2012)
Compressed Sensing: Candes, Romberg, and Tao (2004); Candes and Tao (2005); Donoho (2006)
Cogn. Control: Chinco (2014)
High-D. Inference: Chinco and Clark-Joseph (2014)
Info-Based Asset Pricing: Grossman and Stiglitz (1980); Kyle (1985); Veldkamp (2006)
Behavioral Finance: Barberis, Shleifer, and Wurgler (2005); Garleanu and Pedersen (2012)

Consider sequences of Kyle (1985)-type markets where:

\[
\lim_{N \to \infty} Q_N, K_N = \infty, \qquad N \geq K_N, \qquad \lim_{N \to \infty} K_N / Q_N = 0
\]

Agents must use feature selection rule, φ(d, X), to identify shocks:

\[
\phi : \mathbb{R}^N \times \mathbb{R}^{N \times Q} \mapsto \mathbb{R}^Q
\]

where FSE[φ] is prob. that φ identifies wrong features.

Proposition (Feature Selection Bound)
If there exists some constant C > 0 such that:

\[
N < C \times K_N \cdot \log(Q_N / K_N)
\]

as N → ∞, then there exists some constant c > 0 such that:

\[
\min_{\phi \in \Phi} \mathrm{FSE}[\phi] > c
\]

N⋆(Q, K) ≍ K · log(Q / K) is the feature selection bound.
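As a quick arithmetic check, the bound can be evaluated for the Q = 400, K = 5 example used earlier in the deck. This is a sketch only: the function name is hypothetical, the natural logarithm is assumed, and the proportionality constant hidden by ≍ is taken to be 1.

```python
import math

def feature_selection_bound(Q, K):
    """Evaluate K * log(Q / K), the feature selection bound up to a constant."""
    return K * math.log(Q / K)

# Q = 400 features with K = 5 shocks gives roughly 21.9 observations.
n_star = feature_selection_bound(400, 5)
```

Under these assumptions the value rounds to the N⋆ ≈ 22 reported for that example.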

Static Kyle (1985)-type model with N assets. N informed traders each get priv. signal about value of single asset.

Single market maker (MM) views agg. demand for N assets:

\[
\hat{\alpha} = \arg\min_{\alpha \in \mathbb{R}^Q} \left\{ \big\| X \alpha - (1/\theta) \cdot d \big\|_2^2 + \gamma \cdot \| \alpha \|_1 \right\}
\]

◮ Informed trader demand rule: y_n = θ · v_n
◮ Market maker pricing rule: p_n = λ · d_n

Proposition (Equilibrium Using the LASSO)
If MM uses the LASSO and N > N⋆, then there exists an equilibrium:

\[
\lambda = \frac{1}{2 \cdot \theta} \qquad \text{and} \qquad \theta = \sqrt{C \cdot \frac{K}{N} \cdot \log(Q)} \times \left( \frac{\sigma_z}{\sigma_v} \right)
\]

for C > 0 and \( \gamma = 2 \cdot (\sigma_z / \theta) \cdot \sqrt{2 \cdot \log(Q)} \).
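To show what solving an ℓ1-penalized least-squares problem of this form involves, here is a minimal NumPy sketch using iterative soft-thresholding (ISTA). It is not the deck's implementation: the solver choice, the function names, the simulated data (unit-size shocks, σ = 0.1, N = 200), and the penalty scaling in the demo are all assumptions made for illustration, and `y` stands in for the rescaled demand (1/θ)·d.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, gamma, n_iter=2000):
    """Minimize ||X a - y||_2^2 + gamma * ||a||_1 by iterative soft-thresholding."""
    # Step size 1/L, where L = 2 * sigma_max(X)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    a = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ a - y)
        a = soft_threshold(a - step * grad, step * gamma)
    return a

# Simulated example (all values hypothetical): Q = 400 features, K = 5 shocks.
rng = np.random.default_rng(0)
N, Q, K, sigma = 200, 400, 5, 0.1
X = rng.standard_normal((N, Q))
support = rng.choice(Q, size=K, replace=False)
alpha = np.zeros(Q)
alpha[support] = 1.0
y = X @ alpha + sigma * rng.standard_normal(N)

# Penalty of order sigma * sqrt(2 * N * log(Q)); the factor 4 is an ad hoc safety margin.
gamma = 4.0 * sigma * np.sqrt(2.0 * N * np.log(Q))
alpha_hat = lasso_ista(X, y, gamma)
recovered = sorted(np.flatnonzero(np.abs(alpha_hat) > 0.5).tolist())
```

With N well above K·log(Q/K), the large entries of `alpha_hat` should line up with the true shocked features; the √N in the demo penalty appears only because the simulated columns are not normalized.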

Informed trader expected profit:

\[
\sqrt{C/2 \cdot K/N \cdot \log(Q)} \times \sigma_v \cdot \sigma_z
\]

Question: What is the feature count for noise trader demand volatility exchange rate that leaves informed traders indifferent?

Consider transformations:

\[
Q \mapsto Q' = Q \cdot (1 + \Delta Q) \qquad \text{and} \qquad \sigma_z \mapsto \sigma_z' = \sigma_z \cdot (1 + \Delta \sigma_z)
\]

Proposition (Substituting Risks)
If σ_z decreases by ∆σ_z < 0, then informed trader expected profits are unchanged if Q increases by ∆Q > 0:

\[
\Delta Q = 2 \cdot \log(Q) \cdot \left( \frac{Q}{\sigma_z} \right) \times -\Delta \sigma_z
\]
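The trade-off can be checked numerically. The sketch below relies only on the fact that expected profit is proportional to √log(Q) · σ_z when C, K, N, and σ_v are held fixed; the function names and the 1% example are hypothetical. It works with relative changes, as in Q′ = Q·(1 + ∆Q), where the first-order condition reads ∆Q ≈ 2·log(Q)·(−∆σ_z); multiplying by Q and dividing by σ_z restates the same condition in levels.

```python
import math

def profit_scale(Q, sigma_z):
    """Part of expected profit that varies with Q and sigma_z: sqrt(log Q) * sigma_z."""
    return math.sqrt(math.log(Q)) * sigma_z

def offsetting_Q(Q, rel_change_sigma_z):
    """Exact Q' that keeps profit_scale fixed when sigma_z is scaled by (1 + rel_change)."""
    return math.exp(math.log(Q) / (1.0 + rel_change_sigma_z) ** 2)

Q, sigma_z = 400, 1.0
d_sigma = -0.01  # noise trader volatility falls by 1%

Q_new = offsetting_Q(Q, d_sigma)
exact_dQ = Q_new / Q - 1.0                       # exact relative change in Q
first_order_dQ = 2.0 * math.log(Q) * (-d_sigma)  # first-order approximation
```

In this example a 1% drop in σ_z is offset by a rise in Q of roughly 12 to 13%, which illustrates how sensitive the exchange rate is when profit grows only with √log(Q).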

Question: What kind of asset reveals shocks using fewest obs.?

Could look at Arrow securities:

\[
\begin{pmatrix} d^{(A)}_1 \\ d^{(A)}_2 \\ d^{(A)}_3 \\ \vdots \\ d^{(A)}_Q \end{pmatrix}
=
\underbrace{\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}}_{X^{(A)}}
\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \vdots \\ \alpha_Q \end{pmatrix}
+ \text{“Noise”}
\]

...but this is over-kill!

Could also look at N deriv. constr. by fin. eng. from Q Arrow sec.:

\[
D_{N \times Q} \, X^{(A)}_{Q \times Q} = X_{N \times Q}
\]

Can’t have ind. exposures to all Q features since N ≪ Q. e.g., all deriv. must have sim. exp. to, say, crwd. trade and S&P 500 incl.

Key insight: Don’t need complete independence!

If any (2 · K) columns of X are lin. indep., then any K-sparse signal α ∈ R^Q can be reconstructed uniquely from Xα.

Why? Suppose not, i.e., there exist two distinct K-sparse signals α, α′ ∈ R^Q with Xα = Xα′. This implies X(α − α′) = 0 with α − α′ ≠ 0, which is a contradiction: α − α′ is at most (2 · K)-sparse, and there can’t be lin. dep. betw. (2 · K) cols. of X by asm.

Proposition (Seemingly Redundant Assets)
If N ≥ N⋆(Q, K), then MM studying deriv. using the LASSO can identify K-sparse shocks with prob. greater than \( 1 - C_1 \cdot e^{-C_2 \cdot K} \) using:

\[
\Theta[ K / Q \cdot \log(Q / K) ]
\]

times fewer assets than MM studying Arrow sec., with C_1, C_2 > 0.
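The linear-independence condition is easy to test by brute force on a small matrix. As a hedged illustration, the sketch below applies it to the 3 × 7 exposure matrix from the opening example with K = 1, so 2 · K = 2 columns at a time; the helper name is hypothetical, and the exhaustive search is only practical for small Q.

```python
from itertools import combinations

import numpy as np

def every_subset_independent(X, size):
    """Check that every choice of `size` columns of X is linearly independent."""
    X = np.asarray(X, dtype=float)
    return all(
        np.linalg.matrix_rank(X[:, list(cols)]) == size
        for cols in combinations(range(X.shape[1]), size)
    )

# Exposure matrix from the 7-feature example: rows are stocks, columns are features.
X3 = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])

K = 1
ok_with_three = every_subset_independent(X3, 2 * K)      # all column pairs independent
ok_with_two = every_subset_independent(X3[:2], 2 * K)    # columns 1 and 5 coincide
```

With all three stocks every pair of columns is independent, so a single shock is identified uniquely; dropping the third stock makes the crowded trade and ATG columns identical and the condition fails.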

Thanks!
