SLIDE 1

Algebraic models for multilinear dependence

Jason Morton

Stanford University

February 21, 2009 NSF Tensor Workshop

Joint work with Lek-Heng Lim of U.C. Berkeley

  • J. Morton (Stanford)

Algebraic models for multilinear dependence 2/21/09 NSF Tensor 1 / 37

slide-2
SLIDE 2

Univariate cumulants

Mean, variance, skewness and kurtosis describe the shape of a univariate distribution.

SLIDE 3

Covariance matrices

The covariance matrix partly describes the dependence structure of a multivariate distribution:

  • Principal Component Analysis
  • Factor models
  • Risk: the bilinear form computes the variance h⊤Σh of holdings h

But if the variables are not multivariate Gaussian, it is not the whole story. This is one point of view on the financial crisis: too much reliance on a quadratic, Gaussian perspective on risk, exploited by trading skewness and kurtosis risk for an apparent reduction in variance.

SLIDE 4

[Figure: Sharpe Ratio ((µ − µf)/σ) vs. Skewness, Hedge Fund Research Indices daily returns]

SLIDE 5

Non-multivariate Gaussian returns are common

[Figure: HFRI Distressed/Restructuring Index vs. Merger Arbitrage daily returns]

SLIDE 6

Even if marginals normal, dependence might not be

[Figure: 1000 simulated Clayton(3)-dependent N(0,1) values; X1 ∼ N(0,1), X2 ∼ N(0,1)]
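A sketch of how such a figure can be reproduced: sample Clayton(θ)-dependent uniforms via the standard Marshall–Olkin frailty construction, then push them through the normal quantile function so the marginals are N(0,1) while the dependence is non-Gaussian. The function name and parameters here are illustrative, not from the talk.

```python
import numpy as np
from scipy.stats import norm

def clayton_normal_sample(n, theta, rng):
    """Sample (X1, X2) with N(0,1) marginals and Clayton(theta) dependence,
    via the Marshall-Olkin construction of the Clayton copula."""
    v = rng.gamma(1.0 / theta, 1.0, size=n)        # shared gamma frailty
    e = rng.exponential(1.0, size=(n, 2))          # independent Exp(1) pair
    u = (1.0 + e / v[:, None]) ** (-1.0 / theta)   # Clayton-dependent uniforms
    return norm.ppf(u)                             # transform to N(0,1) marginals

X = clayton_normal_sample(5000, 3.0, np.random.default_rng(0))
```

Each marginal is exactly standard normal, yet the joint distribution has strong lower-tail dependence that no covariance matrix captures.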

SLIDE 7

Covariance matrix analogs: multivariate cumulants

The cumulant tensors are the multivariate analogs of skewness and kurtosis. They describe higher-order dependence among random variables. The covariance matrix lets us optimize with respect to variance; the cumulant tensors let us optimize with respect to skewness, kurtosis, . . .

1. Definitions: tensors and cumulants
2. Properties of cumulant tensors
3. Low multilinear rank model (subspace variety)
4. Quasi-Newton algorithm on Grassmannian
5. Multi-moment portfolio optimization
6. Dimension reduction

SLIDE 8

1. Introduction
2. Definitions
3. Properties
4. Principal Cumulant Component Analysis
5. Algorithm
6. Applications

SLIDE 9

Symmetric multilinear matrix multiplication

If Q is a p × r matrix and C an r × r × r tensor, we obtain a p × p × p tensor K = (Q, Q, Q) · C, also written K = Q · C, with entries

κℓmn = Σ_{i,j,k=1}^{r} qℓi qmj qnk cijk.
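The entry formula above can be written as a single contraction; a minimal numpy sketch (random Q and C, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p, r = 5, 3
Q = rng.standard_normal((p, r))      # p x r factor matrix
C = rng.standard_normal((r, r, r))   # r x r x r core tensor

# kappa_lmn = sum_{i,j,k} q_li q_mj q_nk c_ijk, i.e. K = (Q, Q, Q) . C
K = np.einsum('li,mj,nk,ijk->lmn', Q, Q, Q, C)
```

Equivalently, K is C with Q multiplied into each of the three modes, which can also be computed by three successive mode products.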

SLIDE 10

Moments and Cumulants are symmetric tensors

Vector-valued random variable x = (X1, . . . , Xp). Three natural d-way tensors are:

  • The dth non-central moment of x: Sd(x) = [E(xi1 xi2 · · · xid)]_{i1,...,id=1}^{p}.
  • The dth central moment: Md(x) = Sd(x − E[x]).
  • The dth cumulant of x: Kd(x) = [ Σ_{A1⊔···⊔Aq={i1,...,id}} (−1)^{q−1} (q − 1)! sA1 · · · sAq ]_{i1,...,id=1}^{p}.

Conversely, moments are sums over set partitions B of {i1, . . . , id} of products of cumulants: si1,...,id = Σ_B Π_{b∈B} κb.

For example, in the zero-mean case, κijkℓ = mijkℓ − (mij mkℓ + mik mjℓ + miℓ mjk).
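The fourth-order identity above translates directly into a sample estimator; a minimal numpy sketch (the function name is my own):

```python
import numpy as np

def fourth_cumulant(X):
    """Sample 4th-order cumulant tensor of the rows of X (n samples x p variables),
    using kappa_ijkl = m_ijkl - (m_ij m_kl + m_ik m_jl + m_il m_jk)
    on the central moments of the data."""
    Xc = X - X.mean(axis=0)                        # center, so m's are central moments
    n = len(Xc)
    m2 = Xc.T @ Xc / n                                              # m_ij
    m4 = np.einsum('ti,tj,tk,tl->ijkl', Xc, Xc, Xc, Xc) / n        # m_ijkl
    return m4 - (np.einsum('ij,kl->ijkl', m2, m2)
                 + np.einsum('ik,jl->ijkl', m2, m2)
                 + np.einsum('il,jk->ijkl', m2, m2))
```

For Gaussian data this tensor should be near zero (up to sampling error), in line with the vanishing property discussed later.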

SLIDE 11

Measuring useful properties

For univariate x, the cumulants Kd(x) for d = 1, 2, 3, 4 are the expectation κi = E[x], the variance κii = σ², the skewness κiii/κii^{3/2}, and the kurtosis κiiii/κii².

The tensor versions κijk, κijkℓ, . . . are the multivariate generalizations; they provide a natural measure of non-Gaussianity.

SLIDE 12

Alternative Definitions of Cumulants

In terms of the log characteristic function,

κα1···αd(x) = (−i)^d ∂^d/(∂tα1 · · · ∂tαd) log E(exp(i⟨t, x⟩)) |_{t=0}.

In terms of the Edgeworth series,

log E(exp(i⟨t, x⟩)) = Σ_α i^{|α|} κα(x) t^α / α!,

where α = (α1, . . . , αp) is a multi-index, t^α = t1^{α1} · · · tp^{αp}, and α! = α1! · · · αp!.

See [Fisher 1929, McCullagh 1984, 1987] for definitions and properties.

SLIDE 13

1. Introduction
2. Definitions
3. Properties
4. Principal Cumulant Component Analysis
5. Algorithm
6. Applications

SLIDE 14

Properties of cumulants: Multilinearity

Multilinearity: if x is an Rr-valued random variable and A ∈ Rp×r, then

Kd(Ax) = A · Kd(x),

where · is the multilinear action.

  • This is what makes factor models work: y = Ax implies Kd(y) = A · Kd(x).
  • Covariance factor model: K2(y) = A K2(x) A⊤.
  • Independent Component Analysis finds an A that approximately diagonalizes Kd(x).
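Multilinearity can be checked numerically at d = 2, where it is the familiar covariance identity K2(Ax) = A K2(x) A⊤; a minimal sketch with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, p = 10000, 3, 5
x = rng.standard_normal((n, r)) @ rng.standard_normal((r, r))  # correlated factors
A = rng.standard_normal((p, r))
y = x @ A.T                                   # rows are samples of y = A x

K2x = np.cov(x, rowvar=False)                 # sample K2(x)
K2y = np.cov(y, rowvar=False)                 # sample K2(y)
```

The identity holds exactly for the sample covariance (not just in expectation), since it is a quadratic form of the centered data.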

SLIDE 15

Properties of cumulants: Independence

Independence: if x1, . . . , xp are random variables mutually independent of y1, . . . , yp, then

Kd(x1 + y1, . . . , xp + yp) = Kd(x1, . . . , xp) + Kd(y1, . . . , yp).

Moreover, Ki1,...,id(x) = 0 whenever {i1, . . . , id} can be partitioned into two nonempty sets I and J such that xI and xJ are independent.

  • This is why we want to diagonalize in Independent Component Analysis.
  • Exploitable in other sparse cumulant techniques (breaks rotational symmetry).

SLIDE 16

Properties of cumulants: Vanishing and Extending

Gaussian: if x is multivariate normal, then Kd(x) = 0 for all d ≥ 3.

◮ Why one might not have heard of cumulants: for Gaussians, the covariance matrix does tell the whole story.

Marcinkiewicz Theorem: there are no distributions with a bound D such that Kd(x) ≠ 0 for some 3 ≤ d ≤ D and Kd(x) = 0 for all d > D.

◮ Parametrization is trickier when K2 doesn't tell the whole story.

SLIDE 17

Making cumulants useful, tractable and estimable

Cumulant tensors are a useful generalization, but too big: a symmetric d-way tensor in p variables has (p+d−1 choose d) distinct entries, too many to

  • estimate with a reasonable amount of data,
  • optimize over, and
  • store.

Needed: small, implicit factor models analogous to Principal Component Analysis (PCA).

  • PCA: eigenvalue decomposition of a positive semidefinite real symmetric matrix.
  • We need a tensor analog. But it isn't as easy as it looks. . .
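The entry count grows quickly with both p and d; a one-line check of the binomial formula:

```python
from math import comb

# Number of distinct entries of a symmetric d-way cumulant tensor in p variables
counts = {(p, d): comb(p + d - 1, d) for p in (10, 50) for d in (2, 3, 4)}
```

Already for 50 variables, the fourth cumulant has hundreds of thousands of distinct entries, which motivates the implicit low-rank models below.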

SLIDE 18

Tensor decomposition

Three possible generalizations of matrix rank are the same in the matrix case but not in the tensor case. For a p × p × p tensor K, each is the minimum r such that:

  • Tensor rank: K = Σ_{i=1}^{r} ui ⊗ vi ⊗ wi. Not closed.
  • Border rank: K = lim_{ε→0} Sε with tensor rank(Sε) = r. Closed but hard to represent; defining equations unknown.
  • Multilinear rank: K = A · C, C ∈ Rr×r×r, A ∈ Rp×r. Closed and well understood.
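Multilinear rank is also the easiest of the three to compute: it is the tuple of ranks of the mode unfoldings. A minimal sketch (the helper name is my own):

```python
import numpy as np

def multilinear_rank(T):
    """Multilinear rank of a 3-way tensor: the ranks of its three mode unfoldings."""
    return tuple(
        np.linalg.matrix_rank(np.moveaxis(T, n, 0).reshape(T.shape[n], -1))
        for n in range(3)
    )

rng = np.random.default_rng(2)
p, r = 6, 2
A = rng.standard_normal((p, r))
C = rng.standard_normal((r, r, r))
K = np.einsum('li,mj,nk,ijk->lmn', A, A, A, C)   # K = (A, A, A) . C
```

By construction K = A · C, so each unfolding of K has rank at most r.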

SLIDE 19

Geometric perspective

  • Secants of the Veronese in Sd(Rp) and rank subsets: difficult to study.
  • Symmetric subspace variety in Sd(Rp): closed, easy to study.

We take the long skinny matrix to be orthonormal:

◮ The Stiefel manifold O(p, r) is the set of p × r real matrices Q with orthonormal columns.

◮ The Grassmannian Gr(p, r) is the set of equivalence classes [Q] of O(p, r) under right multiplication by O(r).

Parametrization of Sd(Rp) via Gr(p, r) × Sd(Rr) → Sd(Rp).

SLIDE 20

1. Introduction
2. Definitions
3. Properties
4. Principal Cumulant Component Analysis
5. Algorithm
6. Applications

SLIDE 21

Multilinear rank factor model

Let y = (Y1, . . . , Yp) be a random vector. Write the dth-order cumulant Kd(y) as a best multilinear rank-r approximation in terms of the cumulant Kd(x) of a smaller set of r factors x:

Kd(y) ≈ Q · Kd(x),

where Q is orthonormal and Q⊤ projects to the factors.

  • The column space of Q defines the r-dimensional subspace that best explains the dth-order dependence.
  • In place of eigenvalues, we have the core tensor Kd(x), the cumulant of the factors, analogous to the r × r covariance matrix of the factors when d = 2.
  • We have a model; we still need a loss and an algorithm.
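For intuition, here is a one-shot HOSVD-style fit of this model, taking Q from the top-r left singular subspace of a flattening of the cumulant. This is a simple stand-in sketch, not the quasi-Newton fit described later in the talk; the function name is my own.

```python
import numpy as np

def factor_approx(K, r):
    """One-shot fit K ≈ (Q, Q, Q) . C for a p x p x p tensor K:
    Q spans the top-r left singular subspace of the mode-1 flattening,
    and the core is C = (Q^T, Q^T, Q^T) . K."""
    p = K.shape[0]
    Q = np.linalg.svd(K.reshape(p, -1), full_matrices=False)[0][:, :r]
    C = np.einsum('li,mj,nk,lmn->ijk', Q, Q, Q, K)
    return Q, C

# Synthetic check: a tensor of exact multilinear rank 2 is recovered exactly.
rng = np.random.default_rng(2)
A = np.linalg.qr(rng.standard_normal((6, 2)))[0]
C0 = rng.standard_normal((2, 2, 2))
K = np.einsum('li,mj,nk,ijk->lmn', A, A, A, C0)
Q, C = factor_approx(K, 2)
Khat = np.einsum('li,mj,nk,ijk->lmn', Q, Q, Q, C)
```

When K has exact multilinear rank r, the flattening's column space equals the column space of A, so the reconstruction is exact.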

SLIDE 22

Principal cumulant component analysis 1

Factors/principal components that account for variation in each cumulant separately:

min_{Q∈O(p,r), Cd∈Sd(Rr)} ‖K̂d(y) − Q · Cd‖²

  • Minimize over Cd ≈ K̂d(x), a small (r × · · · × r) symmetric tensor, NOT necessarily diagonal.
  • Q an orthonormal matrix.

SLIDE 23

Principal cumulant component analysis 2

Or, factors/principal components that account for variation in all cumulants simultaneously:

min_{Q∈O(p,r), Cd∈Sd(Rr)} Σ_{d=1}^{∞} αd ‖K̂d(y) − Q · Cd‖²,

with each Cd ≈ K̂d(x) not necessarily diagonal.

  • Appears intractable: optimization over the infinite-dimensional manifold O(p, r) × Π_{d=1}^{∞} Sd(Rr).
  • Reduces to optimization over a single Grassmannian Gr(p, r) (the set of r-dimensional subspaces of p-dimensional space), of dimension r(p − r):

max_{Q∈Gr(p,r)} Σ_{d=1}^{∞} αd ‖Q⊤ · K̂d(y)‖².

  • In practice, ∞ = 3 or 4.
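A minimal sketch of evaluating this Grassmannian objective (helper names and the weights `alphas` are my own; the key point is that the value depends only on the column space of Q, not on Q itself):

```python
import numpy as np

def compress(K, Q):
    """Contract Q^T into every mode of K, i.e. Q^T . K, giving an r x ... x r tensor."""
    C = K
    for _ in range(K.ndim):
        C = np.tensordot(C, Q, axes=(0, 0))   # cycling through the modes keeps the order
    return C

def pcca_objective(Q, cumulants, alphas):
    """sum_d alpha_d * ||Q^T . K_d||^2 over a list of cumulant tensors."""
    return sum(a * np.sum(compress(K, Q) ** 2) for K, a in zip(cumulants, alphas))

rng = np.random.default_rng(3)
p, r = 5, 2
K2 = rng.standard_normal((p, p)); K2 = K2 + K2.T   # synthetic 2nd cumulant
K3 = rng.standard_normal((p, p, p))                # synthetic 3rd cumulant
Q = np.linalg.qr(rng.standard_normal((p, r)))[0]
```

Invariance under Q → QR for orthogonal R is exactly what makes this a well-defined function on the Grassmannian.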

SLIDE 24

1. Introduction
2. Definitions
3. Properties
4. Principal Cumulant Component Analysis
5. Algorithm
6. Applications

SLIDE 25

ALS / Quasi-Newton

Alternating Least Squares is commonly used for the objective

Ψ(X, Y, Z) = ‖(X⊤, Y⊤, Z⊤) · T‖²

with T ∈ Rl×m×n, cycling between X, Y, and Z and solving a least-squares problem at each iteration.

What if T = K is symmetric and Φ(X) = ‖X⊤ · K‖²?

Better: quasi-Newton methods, L-BFGS on the Grassmannian.
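For concreteness, here is a plain alternating (HOOI-style) sketch for the symmetric problem max_Q ‖Q⊤ · K‖². This is the simple baseline, not the quasi-Newton/L-BFGS Grassmannian method of Savas and Lim; all names and defaults are my own.

```python
import numpy as np
from itertools import permutations

def symmetric_hooi(K, r, iters=30, seed=3):
    """Alternating sketch for max over orthonormal p x r matrices Q of ||Q^T . K||^2:
    compress two modes with the current Q, then refresh Q as the top-r left
    singular vectors of the remaining p x r^2 flattening."""
    p = K.shape[0]
    Q = np.linalg.qr(np.random.default_rng(seed).standard_normal((p, r)))[0]
    for _ in range(iters):
        M = np.einsum('lmn,mj,nk->ljk', K, Q, Q).reshape(p, -1)
        Q = np.linalg.svd(M, full_matrices=False)[0][:, :r]
    return Q

# Synthetic symmetric test tensor of exact multilinear rank 2
rng = np.random.default_rng(4)
A = np.linalg.qr(rng.standard_normal((6, 2)))[0]
C0 = rng.standard_normal((2, 2, 2))
C0 = sum(np.transpose(C0, s) for s in permutations(range(3))) / 6   # symmetrize
K = np.einsum('li,mj,nk,ijk->lmn', A, A, A, C0)
Q = symmetric_hooi(K, 2)
```

On exact multilinear rank-r input the iteration recovers the factor subspace, so the compressed norm matches the full norm.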

SLIDE 26

1. Introduction
2. Definitions
3. Properties
4. Principal Cumulant Component Analysis
5. Algorithm
6. Applications

SLIDE 27

Mean-variance portfolio optimization

Markowitz mean-variance portfolio optimization defines risk to be variance. For portfolio holdings h, solve

min_h h⊤ K2(x) h   s.t.   h⊤ E[x] > r.

  • Evidence indicates that investors optimizing variance with respect to the covariance matrix accept unwanted skewness and kurtosis risk.
  • Extreme example: selling out-of-the-money puts looks safe and uncorrelated.
  • Many strategies take on this type of risk.
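When the return constraint is treated as an equality and there are no other constraints, the Markowitz problem has a closed-form solution h ∝ Σ⁻¹µ; a minimal sketch on synthetic inputs (function name and data are my own):

```python
import numpy as np

def min_variance_holdings(Sigma, mu, target):
    """Closed-form minimizer of h' Sigma h subject to h' mu = target.
    (Equality return constraint only; no budget or long-only constraints.)"""
    w = np.linalg.solve(Sigma, mu)        # Sigma^{-1} mu
    return (target / (mu @ w)) * w

rng = np.random.default_rng(5)
p = 8
B = rng.standard_normal((p, p))
Sigma = B @ B.T + np.eye(p)               # synthetic positive-definite covariance
mu = rng.uniform(0.01, 0.1, size=p)       # synthetic expected returns
h = min_variance_holdings(Sigma, mu, 0.05)
```

The first-order condition is Σh = λµ: at the optimum the marginal variance of every asset is proportional to its expected return.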

SLIDE 28

[Figure: Sharpe Ratio ((µ − µf)/σ) vs. Skewness, Hedge Fund Research Indices daily returns]

SLIDE 29

Multi-moment portfolio optimization

So, take skewness and kurtosis into account in the objective. This requires the skewness tensor K3 and kurtosis tensor K4. Use the low multilinear rank model to

◮ regularize, and

◮ make the optimization computable with many assets.

To do this:
  • Choose an r.
  • Approximate each cumulant: Kd ≈ Q · Cd.
  • For holdings h, the multilinear forms Kd(h, . . . , h) ≈ Cd(Q⊤h, . . . , Q⊤h) give the variance, skewness, and kurtosis.
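The computational payoff of the last step is that the form can be evaluated through the small core after compressing the holdings once; a minimal sketch at d = 3 with synthetic Q and core (all data illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
p, r = 50, 5
Q = np.linalg.qr(rng.standard_normal((p, r)))[0]   # orthonormal factor loadings
C3 = rng.standard_normal((r, r, r))                # factor skewness core
h = rng.standard_normal(p)                         # holdings

c = Q.T @ h                                        # compressed holdings, r numbers
cheap = np.einsum('ijk,i,j,k->', C3, c, c, c)      # O(r^3) evaluation

K3 = np.einsum('li,mj,nk,ijk->lmn', Q, Q, Q, C3)   # full p x p x p tensor
full = np.einsum('lmn,l,m,n->', K3, h, h, h)       # O(p^3) evaluation
```

The two values agree exactly, so an optimizer never needs to materialize the p × p × p tensor.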

SLIDE 30

Regularization and optimal number of factors

[Figure: reconstruction and generalization error vs. number of factors for a 50-stock portfolio]

SLIDE 31

Dimension reduction

To use PCCA for dimension reduction:
  • Compute the PCCA approximation Kd(y) ≈ Qd · Kd(x).
  • Discard the cumulant of the factors Kd(x); keep the projector Qd.
  • In PCCA2 (one Q for all d), we are done.
  • In PCCA1, combine [Q2 : Q3 : Q4] and orthogonalize.
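The PCCA1 combine-and-orthogonalize step is a stack followed by a QR factorization; a minimal sketch with synthetic per-order projectors (the projectors here are random stand-ins for fitted ones):

```python
import numpy as np

rng = np.random.default_rng(7)
p, r = 20, 3
# Stand-ins for the fitted per-order projectors Q2, Q3, Q4 (orthonormal p x r each)
Q2, Q3, Q4 = (np.linalg.qr(rng.standard_normal((p, r)))[0] for _ in range(3))

stacked = np.hstack([Q2, Q3, Q4])                  # p x 3r, columns span the combined space
Q, _ = np.linalg.qr(stacked)                       # orthonormalize
Q = Q[:, :np.linalg.matrix_rank(stacked)]          # basis of the combined span
```

The resulting Q spans the union of the three subspaces, so each per-order projector is recovered by projecting onto it.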

SLIDE 32

Conclusion

  • Introduced cumulant tensors, which generalize skewness, kurtosis, and the covariance matrix,
  • showed they have the expected properties and that we can build factor models from them,
  • used multilinear rank to get around the difficulties of generalizing covariance factor models,
  • estimated out-of-sample higher-order portfolio statistics and optimized with them, and
  • performed dimension reduction incorporating higher-order statistics.

SLIDE 33

References

  • B. Bader and T. Kolda, MATLAB Tensor Toolbox Version 2.2, http://csmr.ca.sandia.gov/∼tgkolda/TensorToolbox/, January 2007.
  • P. Comon, "Independent component analysis: a new concept?," Signal Processing, 36 (1994), no. 3, pp. 287–314.
  • R.J. Davies, H.M. Kat, and S. Lu, "Fund of hedge funds portfolio selection: a multiple-objective approach," Cass Business School Research Paper, 2006.
  • L. De Lathauwer, B. De Moor, and J. Vandewalle, "An introduction to independent component analysis," Journal of Chemometrics, 14 (2000), no. 3, pp. 123–149.
  • R.A. Fisher, "Moments and product moments of sampling distributions," Proceedings of the London Mathematical Society, 30 (1929), pp. 199–238.
  • D.G. Kaiser, D. Schweizer, and L. Wu, "Strategic hedge fund portfolio construction that incorporates higher moments," 2008.

SLIDE 34

References

  • J.M. Landsberg and J. Morton, The Geometry of Tensors: Applications to Complexity, Statistics and Engineering, book draft.
  • J. Marcinkiewicz, "Sur une propriété de la loi de Gauss," Math. Z., 44 (1938), pp. 612–618.
  • J.M. Mendel, "Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications," Proceedings of the IEEE, 79 (1991), no. 3, pp. 278–305.
  • P. McCullagh, Tensor Methods in Statistics, Chapman and Hall, 1987.
  • J. Nocedal and S. Wright, Numerical Optimization (2nd ed.), Springer-Verlag, 2006.

SLIDE 35

References

  • C.L. Nikias and J.M. Mendel, "Signal processing with higher-order spectra," Signal Processing, 10 (1993), no. 3, pp. 10–37.
  • M. Rubinstein, E. Jurczenko, and B. Maillet, Multi-moment Asset Allocation and Pricing Models, Wiley Finance, 2006.
  • F. Samaria and A. Harter, "Parameterisation of a stochastic model for human face identification," IEEE Workshop on Applications of Computer Vision, Sarasota, FL, December 1994. Database of Faces courtesy of AT&T Laboratories.
  • B. Savas and L.-H. Lim, "Best multilinear rank approximation of tensors and symmetric tensors with quasi-Newton methods on Grassmannians," preprint, October 2008.
  • A. Swami, G.B. Giannakis, and G. Zhou, "Bibliography on higher-order statistics," Signal Processing, 60 (1997), no. 1, pp. 65–126.
SLIDE 36

References

  • M. Turk and A. Pentland, "Face recognition using eigenfaces," Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 586–591, 1991.
  • J. Weyman, Cohomology of Vector Bundles and Syzygies, Cambridge University Press, 2003.

SLIDE 37

End

jason@math.stanford.edu
