

  1. Latent and Network Models with Applications to Finance
     Jingchen Liu, Department of Statistics, Columbia University
     Joint work with Yunxiao Chen, Xiaoou Li, and Zhiliang Ying
     At ISFA-Columbia Workshop, June 28, 2016

  2. Modeling multivariate distribution
     ◮ Multivariate random vector: (R_1, ..., R_J)
     ◮ Continuous vectors: multivariate Gaussian, multivariate t-distribution, ...
     ◮ Categorical vectors: loglinear model, ...
     ◮ Copula
     ◮ Regression

  3. Latent variable modeling
     ◮ There exists α such that f(R_1, ..., R_J | α) is simple.
     ◮ What is considered simple?
     ◮ Independence, small variance, ...

  4. Graphical representation [figure]

  5. Local independence
     f(R_1, ..., R_J | α) = ∏_j f(R_j | α)
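A minimal sketch of what local independence buys us (the loadings, noise scale, and linear-Gaussian form below are illustrative assumptions, not from the talk): conditional on the latent vector α the coordinates are drawn independently, yet marginally they are correlated.

```python
import numpy as np

# Illustration of local independence: given alpha, R_1, ..., R_J are independent draws;
# integrating alpha out induces marginal correlation among them.
rng = np.random.default_rng(0)
J, K, n = 5, 2, 100_000
A = rng.normal(size=(K, J))            # assumed loading matrix, columns a_j

alpha = rng.normal(size=(n, K))        # latent vectors, one per observation
noise = rng.normal(scale=0.5, size=(n, J))
R = alpha @ A + noise                  # conditionally independent coordinates

print(np.corrcoef(R, rowvar=False).round(2))   # marginal correlations are far from zero
```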

  6. Applications
     ◮ Finance, political sciences
     ◮ Education
     ◮ Psychiatry/psychology
     ◮ Marketing and e-commerce

  7. Linear factor models
     ◮ (R_1, ..., R_J) is continuous.
     ◮ Linear factor models: α = (α_1, ..., α_K), R_j = a_j^⊤ α + ε_j
     ◮ Principal component analysis
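As a concrete companion to the slide, here is a small PCA-style factor extraction sketch. The return matrix `R` (days by stocks) and the number of factors `K` are assumed inputs; this is one standard way to estimate loadings and scores, not necessarily the procedure used in the talk.

```python
import numpy as np

# PCA factor extraction for a centered return matrix R (n_days x J stocks).
def pca_factors(R, K):
    R = R - R.mean(axis=0)                            # center each stock's returns
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    loadings = Vt[:K].T * s[:K] / np.sqrt(len(R))     # J x K matrix with a_j in rows
    scores = U[:, :K] * np.sqrt(len(R))               # n_days x K factor scores (alpha)
    return loadings, scores                           # scores @ loadings.T ≈ rank-K fit of R
```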

  8. Categorical variable and item response theory model
     ◮ Binary R_j ∈ {0, 1}.
     ◮ P(R_j = 1 | α) = exp(a_j^⊤ α − b_j) / (1 + exp(a_j^⊤ α − b_j)), α ∈ R^K
     [Plot: item response curve, P(R_j = 1 | α) against the latent trait]
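The item response function on this slide is just a logistic function of a_j^⊤ α − b_j. A minimal sketch, with illustrative values for a_j, b_j, and α (not estimates from the talk):

```python
import numpy as np

# Two-parameter logistic item response function:
# P(R_j = 1 | alpha) = exp(a_j' alpha - b_j) / (1 + exp(a_j' alpha - b_j)).
def irt_prob(alpha, a_j, b_j):
    z = np.dot(a_j, alpha) - b_j
    return 1.0 / (1.0 + np.exp(-z))      # equivalent logistic form

print(irt_prob(alpha=np.array([0.5, -1.0]), a_j=np.array([1.2, 0.3]), b_j=0.4))
```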

  9. Stock Price Structure
     ◮ Data 1: 97 stocks selected from the S&P 100, over 1013 trading days from 2009 to 2014.
     ◮ Data 2: 117 stocks selected from the SSE 180 (Shanghai Stock Exchange), over 1159 trading days from 2009 to 2014.
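A sketch of how a return matrix like "Data 1" could be prepared. The file name and column layout are hypothetical; the talk does not specify how the prices are stored.

```python
import numpy as np
import pandas as pd

# Hypothetical input: a CSV of daily closing prices, one column per stock.
prices = pd.read_csv("sp100_close_prices.csv", index_col=0, parse_dates=True)  # days x stocks
log_returns = np.log(prices).diff().dropna()     # daily log returns
print(log_returns.shape)                          # roughly (1013, 97) for this sample
```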

  10. Exploratory analysis
     ◮ E.g., the block circled in blue contains mostly energy companies: APA (Apache Corp), APC (Anadarko Petroleum), BHI (Baker Hughes), COP (ConocoPhillips), CVX (Chevron), DVN (Devon), ...
     ◮ The block circled in black contains the financial companies: C (Citigroup), BAC (Bank of America), MS (Morgan Stanley), BK (Bank of New York Mellon), JPM (JPMorgan), ...
     [Figure: heatmap of stock-stock correlation (Data 1; based on daily log returns); stocks have been reordered]
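One way to reproduce this kind of block structure is to reorder the correlation matrix by hierarchical clustering before plotting; the slide does not say how the reordering was done, so the clustering choice below is an assumption. `log_returns` is the days-by-stocks matrix from the earlier sketch.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

# Correlate daily log returns, then reorder stocks so correlated blocks
# (energy, financials, ...) sit together along the heatmap diagonal.
corr = np.corrcoef(log_returns.values, rowvar=False)
dist = squareform(1.0 - corr, checks=False)             # condensed dissimilarity
order = leaves_list(linkage(dist, method="average"))    # dendrogram leaf ordering
corr_reordered = corr[np.ix_(order, order)]             # heatmap-ready matrix
```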

  11. Linear factor model
     ◮ Linear factor models: R_j = a_j^⊤ α + ε_j
     ◮ Fama-French model: R = R_f + β(K − R_f) + b_s · SMB + b_v · HML + α
       (K: return of the market portfolio, R_f: risk-free rate)
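For concreteness, a least-squares sketch of fitting the three Fama-French loadings for one stock. The excess return and factor series are assumed inputs; this is a generic regression illustration, not the estimation used in the talk.

```python
import numpy as np

# excess_ret: stock return minus risk-free rate; mkt_excess, smb, hml: factor series.
def fama_french_fit(excess_ret, mkt_excess, smb, hml):
    X = np.column_stack([np.ones_like(mkt_excess), mkt_excess, smb, hml])
    coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
    alpha, beta, b_s, b_v = coef          # intercept and the three factor loadings
    return alpha, beta, b_s, b_v
```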

  12. Linear factor model
     ◮ (R_1, ..., R_J) is not multivariate Gaussian in many ways if J is large!
     ◮ Marginal tails, joint tails, asymmetric correlation, ...
     ◮ Too many factors!
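One quick diagnostic for the "asymmetric correlation" point, not taken from the slides: compare the correlation of two return series on joint down days versus joint up days. For jointly Gaussian returns the two numbers should be roughly equal; for equity returns the downside number is typically larger.

```python
import numpy as np

# x, y: two return series (assumed inputs).
def exceedance_corr(x, y):
    lo = (x < np.median(x)) & (y < np.median(y))   # both below their medians
    hi = (x > np.median(x)) & (y > np.median(y))   # both above their medians
    return np.corrcoef(x[lo], y[lo])[0, 1], np.corrcoef(x[hi], y[hi])[0, 1]
```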

  13. Nonlinear factor model
     ◮ Dichotomize: R_ji = 1 if S_i^close > S_i^open for stock j on day i
     ◮ P(R_j = 1 | α) = exp(a_j^⊤ α − b_j) / (1 + exp(a_j^⊤ α − b_j)), α ∈ R^K
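The dichotomization step is a one-liner; the price arrays below are assumed inputs with days in rows and stocks in columns (the talk does not give the data layout).

```python
import numpy as np

# R_ji = 1 when stock j closes above its open on day i.
def dichotomize(open_px, close_px):
    return (close_px > open_px).astype(int)   # binary response matrix R
```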

  14. Latent graphical model [figure]
  15. Latent graphical model [figure]
  16. Latent graphical model [figure]

  17. Issues of concern
     ◮ Parametric/nonparametric models: latent variable and graph
     ◮ Inference: identifiability

  18. The latent variable component – IRT model
     ◮ Alternative formulation:
       P(R_j = 1 | α) = exp(a_j^⊤ α − b_j) / (1 + exp(a_j^⊤ α − b_j))  ⇔  P(R_j | α) ∝ exp{R_j (a_j^⊤ α − b_j)}
     ◮ Local independence:
       P(R_1, ..., R_J | α) = ∏_j P(R_j | α) ∝ exp{ Σ_{j=1}^J R_j (a_j^⊤ α − b_j) }

  19. Graphical component – Ising model
     P(R_1, ..., R_J | S) ∝ exp{ (1/2) Σ_{i,j} s_ij R_i R_j }
     ◮ Physics
     ◮ Graphical representation
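To make the Ising form concrete, a brute-force sketch that evaluates the normalized probabilities for a small number of variables. The matrix S below is an arbitrary illustrative example, not an estimate from the data; brute force only works for small J because the state space has 2^J points.

```python
import numpy as np
from itertools import product

# P(R | S) ∝ exp{ (1/2) R' S R } over binary vectors R.
def ising_pmf(S):
    J = S.shape[0]
    states = np.array(list(product([0, 1], repeat=J)))          # all 2^J binary vectors
    energy = 0.5 * np.einsum("ij,jk,ik->i", states, S, states)  # (1/2) R' S R per state
    w = np.exp(energy)
    return states, w / w.sum()                                   # normalized probabilities

S = np.array([[0.0, 1.5, 0.0],
              [1.5, 0.0, -1.0],
              [0.0, -1.0, 0.0]])
states, probs = ising_pmf(S)
```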

  20. Latent graphical model: IRT model + Ising model
     ◮ Local independence no longer holds:
       P(R_1, ..., R_J | α) ∝ exp{ Σ_j R_j (a_j^⊤ α − b_j) + (1/2) Σ_{i,j} s_ij R_i R_j }
     ◮ Simplification (since R_j² = R_j, the −b_j terms can be absorbed into the diagonal of S):
       P(R_1, ..., R_J | α) ∝ exp{ Σ_j R_j a_j^⊤ α + (1/2) Σ_{i,j} s_ij R_i R_j }
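Under the combined model, and assuming S is symmetric with zero diagonal, the conditional distribution of each R_j given α and the other responses is logistic with an extra network term, so responses can be simulated by Gibbs sampling. A minimal sketch (A, b, S, α are assumed inputs, not values from the talk):

```python
import numpy as np

# P(R | alpha) ∝ exp{ sum_j R_j (a_j' alpha - b_j) + (1/2) R' S R }, S symmetric, zero diagonal.
# Then P(R_j = 1 | alpha, R_{-j}) = logistic( a_j' alpha - b_j + sum_{i != j} s_ij R_i ).
def gibbs_sample(A, b, S, alpha, n_sweeps=200, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    J = S.shape[0]
    R = rng.integers(0, 2, size=J)                      # random binary start
    for _ in range(n_sweeps):
        for j in range(J):
            z = A[:, j] @ alpha - b[j] + S[j] @ R - S[j, j] * R[j]
            R[j] = rng.random() < 1.0 / (1.0 + np.exp(-z))
    return R
```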

  21. Latent variable and network modeling
     ◮ The item response function:
       f_{A,S}(R | α) ∝ exp{ α^⊤ A R + (1/2) R^⊤ S R },
       where A_{K×J} = (a_1, ..., a_J) and S_{J×J} = (s_ij)
     ◮ Population (prior) distribution such that
       f_{A,S}(R, α) ∝ exp{ −|α|²/2 + α^⊤ A R + (1/2) R^⊤ S R }

  22. Latent variable and network modeling
     ◮ Marginalized likelihood:
       L(A, S) = ∫ f(R, α) dα ∝ exp{ (1/2) R^⊤ (A^⊤ A + S) R }
     ◮ Let L_{J×J} = A^⊤ A:
       L(L, S) = f(R | L, S) ∝ exp{ (1/2) R^⊤ (L + S) R }
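The marginalization on this slide is a Gaussian integral; for completeness, a short derivation by completing the square (standard, though not spelled out in the slides):

```latex
\begin{aligned}
L(A,S) = \int f_{A,S}(R,\alpha)\,d\alpha
  &\;\propto\; e^{\frac{1}{2}R^{\top}SR}\int e^{-\frac{1}{2}|\alpha|^{2}+\alpha^{\top}AR}\,d\alpha\\
  &= e^{\frac{1}{2}R^{\top}SR}\,e^{\frac{1}{2}|AR|^{2}}\int e^{-\frac{1}{2}|\alpha-AR|^{2}}\,d\alpha\\
  &\;\propto\; \exp\!\Big\{\tfrac{1}{2}R^{\top}\big(A^{\top}A+S\big)R\Big\},
\end{aligned}
```

since |AR|² = R^⊤ A^⊤ A R and the remaining Gaussian integral is a constant not depending on R.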

  23. Identifiability
     ◮ Identifiability of L and S
     ◮ Low-dimensional latent factor: L_{J×J} = A^⊤ A is positive semi-definite with rank K ≪ J
     ◮ Small remaining dependence: S is sparse
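Once the low-rank part L = A^⊤ A has been separated from the sparse part S, the loading matrix A is identified only up to rotation; one representative can be read off from the eigendecomposition of L. A minimal sketch, with `L_hat` and `K` as assumed inputs:

```python
import numpy as np

# Recover a K x J loading matrix A_hat with A_hat' A_hat ≈ L_hat.
def loadings_from_L(L_hat, K):
    vals, vecs = np.linalg.eigh(L_hat)                # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:K]                  # K leading eigenpairs
    A_hat = (vecs[:, top] * np.sqrt(np.clip(vals[top], 0, None))).T
    return A_hat                                      # determined only up to rotation
```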
