SLIDE 1

Tensor Tutorial

Misha Kilmer, Department of Mathematics, Tufts University. Research thanks: NSF 0914957, NSF 1319653, NSF 1821148; IBM JSA.

SLIDE 2

Motivation

Real-world data is naturally multidimensional, with different characteristics: hyperspectral images (classification) [1]

[1] Bannon, "Hyperspectral imaging: Cubes and Slices," Nature Photonics, 2009.

SLIDE 3

Motivation

Real-world data is naturally multidimensional, with different characteristics: discrete solutions u(x_j, y_i, t_k) to PDEs [1]

[1] Jiani Zhang, "Design and Application of Tensor Decompositions to Problems in Model and Image Compression and Analysis," Tufts Mathematics Ph.D. Thesis, 2017.

SLIDE 4

Motivation

Traditional algorithms for compressing, analyzing, and clustering data work by 'unfolding' the data into a matrix (a 2D array) and employing matrix-algebra tools.

SLIDE 6

Motivation

CLAIM: Traditional matrix-based methods for dimension reduction, classification, and training, based on vectorizing the data, generally do not make the most of possible high-dimensional correlations/structure for compression and analysis. There is much to be gained by designing mathematical and computational techniques for the data in its natural form. Goal: review current mathematical definitions, constructs, theory, and algorithms for multiway data compression, plus applications.

SLIDE 7

Tensors: Definition

$\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_j}$ ← $j$th-order tensor. (Figure: examples of 1st-, 2nd-, 3rd-, and 4th-order tensors.)

SLIDE 8

Notation

Uppercase script: $\mathcal{A}$ is a 3rd-order tensor. Uppercase bold: X is a matrix. Bold lowercase: y is a vector OR a 1 × 1 × n tensor.

SLIDE 9

Data Organization Reveals Latent Structure

Suppose $y \in \mathbb{R}^{mn}$. Reshape it as an $m \times n$ matrix:

$Y = uv^\top = u \circ v \;\Rightarrow\; y = v \otimes u = \begin{bmatrix} v_1 u \\ v_2 u \\ \vdots \\ v_n u \end{bmatrix}$

This implies storage is reduced from $mn$ to $m + n$ numbers. Moving to higher dimensions reveals compressible structure.
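A quick MATLAB check of this identity (the sizes here are illustrative):

  m = 4; n = 3;
  u = randn(m,1); v = randn(n,1);
  y = kron(v, u);                 % length m*n vector with latent structure
  Y = reshape(y, m, n);           % the m x n matrix u*v'
  disp(norm(Y - u*v', 'fro'))     % ~0: Y is rank-1
  % storing u and v costs m+n = 7 numbers instead of m*n = 12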

SLIDE 11

Goals

Uncover hidden patterns in data by computing an appropriate tensor decomposition/approximation; use this to compress or constrain data in applications. Patterns are application dependent, and the type of tensor decomposition should respect this. We consider tensor decompositions that are synonymous with 'factorization' in a matrix-mimetic sense vs. those that are not.

SLIDE 12

Reference, Toolbox

Required reading for my students: Kolda and Bader, "Tensor Decompositions and Applications," SIAM Review, Vol. 51, 2009.
MATLAB Tensor Toolbox Version 3.1, available online, June 2019. URL: https://gitlab.com/tensors/tensor_toolbox
There are other free toolboxes as well that use slightly different constructs.

SLIDE 13

Notation - The Basics [2]

Modes: the different dimensions. Fibers: hold all indices fixed except one. Slices: hold all indices fixed except two. (See the MATLAB sketch below.)
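In MATLAB indexing, with illustrative sizes:

  A  = randn(4, 5, 6);
  f1 = A(:, 2, 3);      % mode-1 (column) fiber: all but one index fixed
  t3 = A(1, 2, :);      % mode-3 (tube) fiber, stored as 1 x 1 x 6
  Sf = A(:, :, 3);      % frontal slice: all but two indices fixed
  Sl = A(:, 2, :);      % lateral slice, stored as 4 x 1 x 6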

[2] Graphics: Elizabeth Newman, "A Step in the Right Dimension," Tufts Ph.D. Thesis, 2019.

SLIDE 14

Norms

The norm is the extension of the Frobenius norm:

$\|\mathcal{A}\| = \left( \sum_{i_1=1}^{I_1} \sum_{i_2=1}^{I_2} \cdots \sum_{i_N=1}^{I_N} a_{i_1,\ldots,i_N}^2 \right)^{1/2}$

If $\mathcal{X}$, $\mathcal{Y}$ are of the same dimension, we can take an inner product (collapsing along dimensions) to a scalar:

$\langle \mathcal{X}, \mathcal{Y} \rangle = \sum_{i_1=1}^{I_1} \sum_{i_2=1}^{I_2} \cdots \sum_{i_N=1}^{I_N} x_{i_1,\ldots,i_N}\, y_{i_1,\ldots,i_N}$
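Both collapse all modes at once, so in MATLAB they reduce to operations on the vectorized array (a small sketch):

  X = randn(3, 4, 5); Y = randn(3, 4, 5);
  nrmX = sqrt(sum(X(:).^2));    % ||X||, the Frobenius-type norm
  ipXY = sum(X(:).*Y(:));       % <X, Y>, collapses all modes to a scalar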

SLIDE 15

Matricization [3]

A tensor "matricization" refers to (specific) mappings of the tensor to a matrix. The mode-$n$ unfolding maps $\mathcal{A}$ to $A_{(n)}$ via $(i_1, \ldots, i_N) \to (i_n, j)$, with

$j = 1 + \sum_{k=1,\, k \neq n}^{N} (i_k - 1)\, J_k, \qquad J_k = \prod_{m=1,\, m \neq n}^{k-1} I_m.$

A graphical illustration is illuminating:
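In MATLAB, the mode-n unfolding is a permute followed by a reshape; a minimal sketch (a hypothetical helper, saved as unfold.m, reused in later sketches):

  function An = unfold(A, n)
    % mode-n unfolding: bring mode n to the front, then flatten the rest
    order = [n, 1:n-1, n+1:ndims(A)];
    An = reshape(permute(A, order), size(A,n), []);
  end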

[3] Graphics: Elizabeth Newman, "A Step in the Right Dimension," Tufts Mathematics Ph.D. Thesis, 2019.

SLIDE 16

Matricization (continued)

(Figure: graphical illustration of the mode-wise unfoldings [3].)

SLIDE 17

Tensor-Matrix products

$\mathcal{C} = \mathcal{A} \times_n X \;\Leftrightarrow\; C_{(n)} = X \cdot A_{(n)}$

Note that $\mathcal{A} \times_m X \times_n Y = \mathcal{A} \times_n Y \times_m X$ for $m \neq n$.

Example (acting on the frontal slices $\mathcal{A}_{:,:,k}$): $\mathcal{A} := \mathcal{A} \times_1 X \times_2 Y \;\Rightarrow\; \mathcal{A}_{:,:,i} = X \mathcal{A}_{:,:,i} Y^\top$.
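A plain-MATLAB sketch of the mode-n product (a hypothetical helper, modeprod.m, built on the same permute/reshape idea as the unfold sketch above):

  function C = modeprod(A, X, n)
    % C = A x_n X, i.e. C_(n) = X * A_(n)
    sz = size(A); sz(n) = size(X,1);
    order = [n, 1:n-1, n+1:numel(sz)];
    An = reshape(permute(A, order), size(A,n), []);   % unfold mode n
    C  = ipermute(reshape(X*An, sz(order)), order);   % multiply, fold back
  end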

SLIDE 18

Step Back to the Matrix SVD

Traditional workhorse for dimension reduction/feature extraction: the matrix SVD.
PCA: directions of most variability; projections onto 'dominant' directions allow dimension reduction/relative comparison.
Compression (reducing near redundancies) via the truncated SVD expansion is optimal (Eckart-Young theorem):

$A = USV^\top = \sum_{i=1}^{r} \sigma_i\, (u^{(i)} \circ v^{(i)}), \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge 0$

$B = \sum_{i=1}^{p} \sigma_i\, (u^{(i)} \circ v^{(i)})$ solves $\min \|A - B\|_F$ s.t. $B$ has rank $p \le r$.

Implicit storage: for an $m \times n$ matrix, $p(n + m)$ numbers stored, vs. $mn$.
Question: What's the right high-dimensional analogue? (For the history, see Kolda & Bader.)
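The Eckart-Young statement is easy to verify numerically; a small MATLAB sketch:

  A = randn(50, 40);
  [U, S, V] = svd(A, 'econ');
  p = 10;
  B = U(:,1:p)*S(1:p,1:p)*V(:,1:p)';        % best rank-p approximation
  err  = norm(A - B, 'fro');
  tail = norm(diag(S(p+1:end, p+1:end)));   % sqrt(sum of trailing sigma_i^2)
  % err and tail agree to roundoff; storage: p*(50+40) vs. 50*40 numbers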

SLIDE 21

Rank-1 Tensor

Idea 1 (Hitchcock, 1927): like the SVD, try to decompose as a sum of rank-1 tensors.

$\mathcal{X} = a \circ b \circ c \;\Rightarrow\; \mathcal{X}_{\ell,j,k} = a_\ell\, b_j\, c_k$

Note that $\mathrm{vec}(\mathcal{X}) = c \otimes b \otimes a$. Thus, some papers use Kronecker notation in place of outer-product notation.
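A one-line MATLAB check of the vec/Kronecker identity (illustrative sizes):

  a = randn(3,1); b = randn(4,1); c = randn(5,1);
  X = reshape(kron(c, kron(b, a)), 3, 4, 5);   % X(l,j,k) = a(l)*b(j)*c(k)
  disp(abs(X(2,3,4) - a(2)*b(3)*c(4)))         % ~0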

SLIDE 22

Tensor Decompositions - CP

CP (CANDECOMP/PARAFAC) decomposition:

$\mathcal{X} \approx \sum_{i=1}^{r} a^{(i)} \circ b^{(i)} \circ c^{(i)}$

◮ If equality holds and r is minimal, then r is called the rank of the tensor.
◮ Not generally orthogonal.
◮ Not based on a 'product-based factorization'.
◮ Finding the rank is NP-hard!
◮ No perfect procedure for fitting a CP model with k terms.
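For reference, a minimal fitting sketch using cp_als from the MATLAB Tensor Toolbox cited earlier (assumes the toolbox is on the path; the fit depends on the number of terms and the starting guess):

  X = tensor(randn(51, 201, 5));     % tensor object from the toolbox
  r = 3;
  M = cp_als(X, r);                  % Kruskal tensor: lambda; A, B, C
  relerr = norm(X - full(M)) / norm(X);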

SLIDE 23

Kruskal Notation

$\mathcal{X} \approx \sum_{i=1}^{r} a^{(i)} \circ b^{(i)} \circ c^{(i)}$

Kruskal notation: $[\![A, B, C]\!]$ or, if the columns are unit-normalized, $[\![\lambda;\, A, B, C]\!]$, where the columns of $A$, $B$, $C$ are the vectors $a^{(i)}$, $b^{(i)}$, $c^{(i)}$.

SLIDE 24

Demo - Chemical Mixing

Bro, R., Multi-way Analysis in the Food Industry: Models, Algorithms, and Applications. Ph.D. Thesis, Univ. of Amsterdam (NL) & Royal Veterinary and Agricultural University (DK), 1998. (See http://www.models.kvl.dk/amino_acid_fluo)
Five simple lab-made samples. Each sample: varying amounts of tyrosine, tryptophan, and phenylalanine dissolved in phosphate-buffered water.
Samples measured by fluorescence (excitation 250-300 nm, emission 250-450 nm, 1 nm intervals) → a 51 × 201 × 5 tensor.
Brett W. Bader, Tamara G. Kolda, and others, MATLAB Tensor Toolbox Version 3.1, available online, June 2019. URL: https://gitlab.com/tensors/tensor_toolbox
MATLAB script: thanks, T. Kolda, July 2019.

SLIDE 25

Math Interpretation

Each of the three chemicals has a fluorescence signature described as $u^{(i)} \circ v^{(i)}$, $i = 1, 2, 3$. The $j$th sample is

$w_1^{(j)}\, u^{(1)} \circ v^{(1)} + w_2^{(j)}\, u^{(2)} \circ v^{(2)} + w_3^{(j)}\, u^{(3)} \circ v^{(3)}.$

Then, if the samples are the frontal slices, we ideally should have

$\mathcal{A} = \sum_{i=1}^{3} u^{(i)} \circ v^{(i)} \circ w^{(i)}$

Independent of orientation...

SLIDE 26

Some Results

SLIDE 27

CP Example

The importance of fitting the right number of terms, and of the starting guesses.

SLIDE 28

Other Decompositions

Other decompositions in the literature:
Tucker (and HOSVD)
Tensor Train (TT), hierarchical TT (ex: "Tensor-Train Decomposition," Ivan Oseledets, SISC, 2011)
Matrix-mimetic decompositions based on tensor-tensor products (K. & Martin 2011; Kernfeld, K., Aeron 2015) and the corresponding algebraic framework:
◮ Highly parallelizable
◮ Amenable to orientation-dependent data
◮ Robust (e.g., to overfitting)
Each has advantages/disadvantages. The choice of decomposition should be application dependent!

SLIDE 29

Truncated Tucker/HOSVD

Tucker-3 decomposition:

$\mathcal{X} \approx \mathcal{C} \times_1 G \times_2 T \times_3 S = \sum_{i=1}^{r_1} \sum_{j=1}^{r_2} \sum_{k=1}^{r_3} c_{ijk}\, (g^{(i)} \circ t^{(j)} \circ s^{(k)})$

$\mathcal{C}$ is the core tensor, not generally diagonal or non-negative.
$G$, $T$, $S$ with orthonormal columns = HOSVD (De Lathauwer et al.).
Specify 3 ranks $(r_1, r_2, r_3)$; truncation is not optimal.
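A minimal usage sketch with the MATLAB Tensor Toolbox referenced earlier (assumes the toolbox is on the path; tucker_als and hosvd are toolbox routines, and the ranks/tolerance here are illustrative):

  X  = tensor(randn(30, 40, 50));
  T1 = tucker_als(X, [5 5 5]);   % rank-(5,5,5) Tucker fit via ALS
  T2 = hosvd(X, 1e-4);           % HOSVD, truncated to a tolerance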

SLIDE 30

HOSVD, ST-HOSVD

Computing the HOSVD of a 3rd-order tensor, using the left singular vectors of the SVDs of the matricizations:
Compute $U^{(1)}$ from the SVD of $A_{(1)}$.
Compute $U^{(2)}$ from the SVD of $A_{(2)}$.
Compute $U^{(3)}$ from the SVD of $A_{(3)}$.
$\mathcal{C} = \mathcal{A} \times_1 (U^{(1)})^\top \times_2 (U^{(2)})^\top \times_3 (U^{(3)})^\top$
We can truncate terms to get a compressed representation. For an $m \times p \times n$ tensor, numbers stored: $O(mk_1 + pk_2 + nk_3 + k_1 k_2 k_3)$.
We can also truncate sequentially [4]. In our experience, there is little difference in performance for applications (it will depend on the processing order). A sketch follows below.

[4] N. Vannieuwenhoven, R. Vandebril, and K. Meerbergen, "A new truncation strategy for the higher-order singular value decomposition," SIAM J. Sci. Comput., 2012.
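A minimal sequentially truncated HOSVD sketch in plain MATLAB, assuming the unfold and modeprod helpers sketched on earlier slides (the ranks are illustrative):

  A = randn(30, 40, 50); k = [5 8 10];
  C = A;  U = cell(3,1);
  for n = 1:3
    [Un, ~, ~] = svd(unfold(C, n), 'econ');
    U{n} = Un(:, 1:k(n));
    C = modeprod(C, U{n}', n);   % shrink mode n before moving on
  end
  % C is the k1 x k2 x k3 core; A ~ C x1 U{1} x2 U{2} x3 U{3}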

SLIDE 31

Large Scale Data Compression

Ballard, Klinvex, Kolda, "TuckerMPI: A Parallel C++/MPI Software Package for Large-scale Data Compression via the Tucker Tensor Decomposition," arXiv, 2019. "We test the software on 4.5 terabyte and 6.7 terabyte data sets distributed across 100s of nodes (1000s of MPI processes), achieving compression ratios between 100 and 200,000, which equates to 99-99.999% compression (depending on the desired accuracy), in substantially less time than it would take to even read the same dataset from a parallel file system."

SLIDE 32

Randomized Variants

Capitalizing on recent successes in randomized numerical linear algebra, we develop randomized variants:
Che and Wei, "Randomized algorithms for the approximations of Tucker and the Tensor Train decompositions," Advances in Computational Mathematics, 2018.
Minster, Saibaba, K., "Randomized Algorithms for Low-rank Tensor Decompositions in the Tucker Format," SIAM J. Mathematics of Data Science, to appear.
Randomized variants that respect the sparsity of the datasets.
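The basic building block in such variants is a randomized range finder applied to an unfolding; a minimal sketch (assumes the unfold helper from earlier; the target rank and oversampling are illustrative):

  A  = randn(60, 50, 40);            % illustrative dense tensor
  A1 = unfold(A, 1);                 % mode-1 unfolding
  k = 20; p = 5;                     % target rank + oversampling
  Omega = randn(size(A1,2), k+p);
  [Q, ~] = qr(A1*Omega, 0);          % orthonormal basis for the sketch
  U1 = Q(:, 1:k);                    % approximate leading mode-1 factors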

SLIDE 33

Randomized Variants that Handle Sparsity

The Formidable Repository of Open Sparse Tensors and Tools (FROSTT) database:

Tensor   Dimensions                    Nonzeros
NELL-2   12092 × 9184 × 28818          76,879,419
Enron    6066 × 5699 × 244268 × 1176   54,202,099

NELL-2: entity × relation × entity (NELL is a machine learning system that relates different categories).
Enron: sender × receiver × word × date (word counts in emails released during an investigation by the FERC).
Approximate truncated (r, r, r) HOSVD and ST-HOSVD.

SLIDE 34

Results

        Relative error            Runtime (seconds)
r       SP-STHOSVD  R-STHOSVD    SP-STHOSVD  R-STHOSVD
20      0.6015      0.2081       0.4086      31.5615
45      0.3854      0.1259       0.7965      34.5802
145     0.0976      0.0332       3.5659      42.0969
195     0.0578      0.0180       6.8285      50.2907

Table: Results, subsampled Enron dataset.

Taking advantage of the sparsity structure allows for faster compression [5].

[5] R. Minster, A. K. Saibaba, and M. E. Kilmer, "Randomized Algorithms for Low-rank Decompositions in the Tucker Format," SIMODS, to appear.

SLIDE 35

TT and TT-SVD [6]

Suppose we can express each element of $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ as

$\mathcal{A}_{i_1, i_2, \ldots, i_d} = (\mathcal{G}_1)_{:,:,i_1} \cdot (\mathcal{G}_2)_{:,:,i_2} \cdots (\mathcal{G}_d)_{:,:,i_d}$

where each $\mathcal{G}_k$ is a core of size $r_{k-1} \times r_k \times n_k$ and $(\mathcal{G}_k)_{:,:,i_k}$ is an $r_{k-1} \times r_k$ matrix, with $r_0 = r_d = 1$. Then the TT-rank is the length-$(d+1)$ tuple $r = (r_0, r_1, \ldots, r_d)$.
$\mathcal{G}_k$ is a stack of $n_k$ matrices of size $r_{k-1} \times r_k$. Storage: $\sum_{k=1}^{d} r_{k-1} n_k r_k$.

[6] I. Oseledets, "Tensor-train decomposition," SIAM J. Sci. Comput., 2011.

SLIDE 36

3rd Order Example

Example, 3rd order: $\mathcal{A}_{i,j,k} = (\mathcal{G}_1)_{1,:,i} \cdot (\mathcal{G}_2)_{:,:,j} \cdot (\mathcal{G}_3)_{:,1,k}$, where $(\mathcal{G}_1)_{1,:,i}$ is $1 \times r_1$, $(\mathcal{G}_2)_{:,:,j}$ is $r_1 \times r_2$, and $(\mathcal{G}_3)_{:,1,k}$ is $r_2 \times 1$. If $r_1 = r_2 = 1$, then this reduces to a CP decomposition format.
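A small MATLAB sketch evaluating one entry of a 3rd-order tensor train (the core sizes are illustrative):

  n = [4 5 6]; r = [1 2 2 1];            % r0 = r3 = 1
  G1 = randn(r(1), r(2), n(1));
  G2 = randn(r(2), r(3), n(2));
  G3 = randn(r(3), r(4), n(3));
  i = 2; j = 3; k = 4;
  Aijk = G1(:,:,i) * G2(:,:,j) * G3(:,:,k);   % 1 x 1 result: A(i,j,k)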

SLIDE 37

TT SVD, 3rd Order [7]

From a mode-wise unfolding: (Figure: TT-SVD construction for a 3rd-order tensor.)

[7] Graphics: Newman, Tufts Ph.D. Thesis, 2019.

SLIDE 38

So Far...

We have seen:
CP: orientation independent, but no orthogonality; hard to find k; difficulties with algorithms.
HOSVD: orientation independent, orthogonal factor matrices, but no optimality on truncation with a dense core.
ST-HOSVD: processing is orientation dependent; orthogonal factor matrices; truncations prespecified.
TT-SVD: repeated unfoldings (processing is orientation dependent) and accumulating truncation errors; can be highly compressive.
None relates to a framework wherein there is a product-based factorization of tensors. Optimality bounds, but no Eckart-Young-like results.

SLIDE 39

Tensor-Tensor Products

Orientation-dependent data: storage as an mn × J matrix A or an m × J × n tensor $\mathcal{A}$? Which is more compressible/interpretable?

SLIDE 40

Tensor-Tensor Products

Products between tensors of appropriate dimensions that are well defined [8]. This allows us to define different tensor decompositions!

[8] K. and Martin, LAA (2011); Kernfeld, K., Aeron, LAA (2015)

SLIDE 41

Notation

The basics

Tensor $\mathcal{A}$; lateral slices $\vec{\mathcal{A}}_j$; frontal slices $\mathcal{A}^{(k)}$; tube fibers $\mathbf{a}_{ij}$.
Indexing is also done using MATLAB-like notation: e.g., $\vec{\mathcal{A}}_j = \mathcal{A}_{:,j,:}$.
Goal: find a way to express a tensor that leads to the possibility of a compressed representation that maintains important features of the original tensor.

SLIDE 42

Outline for Remainder

Algebraic framework for tensors as operators
◮ Tensor-tensor products
◮ Identities, transposes, orthogonality, etc.
Tensor-tensor SVDs reminiscent of the matrix SVD
Eckart-Young theorem
Randomized variants
Applications (incl. POD)

K. & Martin, LAA 2011; K., Braman, Hoover, Hao, SIMAX 2013; Kernfeld, K., Aeron, LAA 2015

SLIDE 43

Operations for Tensor Manipulation

If $\vec{\mathcal{A}}_j$ is $m \times 1 \times n$, then $\mathrm{sq}(\vec{\mathcal{A}}_j) = A_j$ is $m \times n$. The map is inverted by 'twisting'.
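In MATLAB, sq is squeeze and the twist is a reshape; a small sketch:

  A   = randn(4, 3, 6);
  Aj  = A(:, 2, :);              % lateral slice, 4 x 1 x 6
  Mj  = squeeze(Aj);             % sq(Aj): a 4 x 6 matrix
  Aj2 = reshape(Mj, 4, 1, 6);    % 'twist' inverts sq
  disp(norm(Aj(:) - Aj2(:)))     % 0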

SLIDE 44

Mode-3 Multiplication

Lateral slices $\vec{\mathcal{A}}_j$ of the $m \times p \times n$ tensor $\mathcal{A}$:

$A_{(3)} := [\mathrm{sq}(\vec{\mathcal{A}}_1)^\top, \mathrm{sq}(\vec{\mathcal{A}}_2)^\top, \ldots, \mathrm{sq}(\vec{\mathcal{A}}_p)^\top]$

Let $M$ be $r \times n$. To find $\mathcal{A} \times_3 M$: compute the matrix-matrix product $M A_{(3)}$, then reshape the result to an $m \times p \times r$ tensor. This is equivalent to applying $M$ along the tube fibers.

SLIDE 45

Star-M Product

Let $M$ be any invertible $n \times n$ matrix. Then $\hat{\mathcal{A}} = \mathcal{A} \times_3 M$ and $\mathcal{A} = \hat{\mathcal{A}} \times_3 M^{-1}$.

Definition
Given any invertible $n \times n$ matrix $M$, $\mathcal{A} \in \mathbb{C}^{m \times p \times n}$, and $\mathcal{B} \in \mathbb{C}^{p \times \ell \times n}$, the product $\mathcal{C} = \mathcal{A} \star_M \mathcal{B}$ is defined via $\hat{\mathcal{C}}_{:,:,i} = \hat{\mathcal{A}}_{:,:,i}\, \hat{\mathcal{B}}_{:,:,i}$ [9].

(Diagram: move $\mathcal{A}$ and $\mathcal{B}$ from the spatial domain to the transform domain via $\times_3 M$, take facewise matrix products there, then return $\mathcal{C}$ to the spatial domain via $\times_3 M^{-1}$.)

[9] Kernfeld, K., Aeron, LAA 2015
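A minimal ⋆M-product sketch in plain MATLAB, assuming the modeprod helper from the tensor-matrix products slide (with M = fft(eye(n)) this reproduces the t-product of the next slide, up to the FFT implementation):

  function C = starM(A, B, M)
    % C = A *_M B: facewise products in the transform domain
    n  = size(A,3);
    Ah = modeprod(A, M, 3);  Bh = modeprod(B, M, 3);
    Ch = zeros(size(A,1), size(B,2), n);
    for i = 1:n
      Ch(:,:,i) = Ah(:,:,i) * Bh(:,:,i);
    end
    C = modeprod(Ch, inv(M), 3);   % for real data and DFT M, take real(C)
  end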

SLIDE 46

Special Case: The t-product

Special case: let $M$ be the unnormalized DFT matrix [10]. The t-product can be computed in place using FFTs:

$\hat{\mathcal{A}}$ ← fft(A, [ ], 3)
$\hat{\mathcal{B}}$ ← fft(B, [ ], 3)
$\hat{\mathcal{C}}_{:,:,i} = \hat{\mathcal{A}}_{:,:,i} \cdot \hat{\mathcal{B}}_{:,:,i}$, i = 1, ..., n
C = ifft($\hat{\mathcal{C}}$, [ ], 3)

[10] K. and Martin, 2011

SLIDE 47

Other Properties

Definition (Conjugate Transpose)
Given $\mathcal{A} \in \mathbb{C}^{m \times p \times n}$, its $p \times m \times n$ conjugate transpose under $\star_M$, $\mathcal{A}^H$, is defined such that $(\widehat{\mathcal{A}^H})^{(i)} = (\hat{\mathcal{A}}^{(i)})^H$, $i = 1, \ldots, n$.

Definition (Unitary/Orthogonal Tensors)
$\mathcal{Q} \in \mathbb{C}^{m \times m \times n}$ ($\mathcal{Q} \in \mathbb{R}^{m \times m \times n}$) is called $\star_M$-unitary ($\star_M$-orthogonal) if $\mathcal{Q}^H \star_M \mathcal{Q} = \mathcal{I} = \mathcal{Q} \star_M \mathcal{Q}^H$, where $H$ is replaced by transpose for real tensors. Note that the identity tensor $\mathcal{I}$ is also defined under $\star_M$.

Kernfeld, K., Aeron, LAA 2015

SLIDE 48

Entry-wise M-product

$c = a \star_M b$ for tube fibers. Tube fiber interpretation:

$c = \mathrm{fold}\big( (M^{-1}\, \mathrm{diag}(\hat{a})\, M)\, \mathrm{vec}(b) \big) = \mathrm{fold}\big( (M^{-1}\, \mathrm{diag}(\hat{b})\, M)\, \mathrm{vec}(a) \big)$

Commutativity, and a characterization via the set of matrices diagonalized by $M$ and its inverse.
Special case: $M$ is the DFT ⇒ convolution, circulant matrices.

SLIDE 50

Matrix-mimeticity

Observation: overloading scalar products with $\star_M$ in matrix-matrix algorithms gives the product for higher-dimensional tensors. If $\mathcal{A}$ is $m \times k \times n$ and $\mathcal{B}$ is $k \times p \times n$, then $\mathcal{C}$ is $m \times p \times n$, and

$\vec{\mathcal{C}}_j = \sum_{i=1}^{k} \vec{\mathcal{A}}_i \star_M \mathbf{b}_{ij}, \qquad j = 1, \ldots, p.$

SLIDE 51

Unitary Invariance

Theorem

If $M$ is a non-zero multiple of a unitary/orthogonal matrix [a] and $\mathcal{Q}$ is $\star_M$-unitary, then $\|\mathcal{Q} \star_M \mathcal{A}\|_F = \|\mathcal{A}\|_F$.

[a] K., Horesh, Avron, Newman (2019)

SLIDE 52

Tensor-tensor SVDs

Theorem (K., Horesh, Avron, Newman)
Let $\mathcal{A}$ be an $m \times p \times n$ tensor and $M$ a non-zero multiple of a unitary/orthogonal matrix. The (full) $\star_M$ tensor SVD (t-SVDM) is

$\mathcal{A} = \mathcal{U} \star_M \mathcal{S} \star_M \mathcal{V}^H = \sum_{i=1}^{\min(m,p)} \mathcal{U}_{:,i,:} \star_M \mathcal{S}_{i,i,:} \star_M \mathcal{V}^H_{:,i,:}$

with $\mathcal{U}$, $\mathcal{V}$ $\star_M$-unitary, and $\|\mathcal{S}_{1,1,:}\|_F^2 \ge \|\mathcal{S}_{2,2,:}\|_F^2 \ge \cdots$

SLIDE 53

Algorithm

1: $\hat{\mathcal{A}} \leftarrow \mathcal{A} \times_3 M$
2: for all i = 1, ..., n do
3:   $[\hat{\mathcal{U}}_{:,:,i}, \hat{\mathcal{S}}_{:,:,i}, \hat{\mathcal{V}}_{:,:,i}] = \mathrm{svd}(\hat{\mathcal{A}}_{:,:,i})$   % note: the rank of $\hat{\mathcal{A}}_{:,:,i}$ is $\rho_i$
4: end for
5: $\mathcal{U} = \hat{\mathcal{U}} \times_3 M^{-1}$, $\mathcal{S} = \hat{\mathcal{S}} \times_3 M^{-1}$, $\mathcal{V} = \hat{\mathcal{V}} \times_3 M^{-1}$

Perfectly parallelizable! For face $i$, there exist singular values $\hat{\sigma}^{(i)}_j$, $j = 1, \ldots, \rho_i$.
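A direct MATLAB transcription of this algorithm, again assuming the modeprod helper from earlier (a sketch, not the toolbox routine; the loop over faces could be a parfor):

  function [U, S, V] = tsvdM(A, M)
    % t-SVDM sketch: SVD of each face in the transform domain
    [m, p, n] = size(A);
    Ah = modeprod(A, M, 3);
    Uh = zeros(m,m,n); Sh = zeros(m,p,n); Vh = zeros(p,p,n);
    for i = 1:n                      % faces are independent: parallelizable
      [Uh(:,:,i), Sh(:,:,i), Vh(:,:,i)] = svd(Ah(:,:,i));
    end
    Mi = inv(M);
    U = modeprod(Uh, Mi, 3); S = modeprod(Sh, Mi, 3); V = modeprod(Vh, Mi, 3);
  end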

SLIDE 54

Eckart-Young

Let $\mathcal{A} \in \mathbb{R}^{m \times p \times n}$. For $k < \min(m, p)$, and $M$ as previously, define

$\mathcal{A}_k = \sum_{i=1}^{k} \mathcal{U}_{:,i,:} \star_M \mathcal{S}_{i,i,:} \star_M \mathcal{V}^\top_{:,i,:}.$

Then $\mathcal{A}_k = \arg\min_{\tilde{\mathcal{A}} \in \Omega} \|\mathcal{A} - \tilde{\mathcal{A}}\|_F$, where $\Omega = \{\mathcal{X} \star_M \mathcal{Y} \mid \mathcal{X} \in \mathbb{R}^{m \times k \times n},\ \mathcal{Y} \in \mathbb{R}^{k \times p \times n}\}$.

Error: $\|\mathcal{A} - \mathcal{A}_k\|_F^2 = \sum_{j>k} \|\mathcal{S}_{j,j,:}\|_F^2 = c \sum_{i=1}^{n} \sum_{j>k} \big(\hat{\sigma}^{(i)}_j\big)^2$, where $c$ depends on $M$.

SLIDE 55

Application: Facial Recognition when M is the DFT [11]

$\vec{\mathcal{X}}_j$, $j = 1, 2, \ldots, m$, are the training images.
$\vec{\mathcal{Y}}$ is the mean image.
$\vec{\mathcal{A}}_j = \vec{\mathcal{X}}_j - \vec{\mathcal{Y}}$ holds the mean-subtracted images.
The left orthogonal $\mathcal{U}$ contains the principal components, so

$\vec{\mathcal{A}}_j \approx \mathcal{U}_{:,1:k,:} \star_M \underbrace{(\mathcal{U}^\top_{:,1:k,:} \star_M \vec{\mathcal{A}}_j)}_{\text{tensor coeffs}}$

Compare the tensor coefficients with $\mathcal{U}^\top_{:,1:k,:} \star_M \vec{\mathcal{B}}$, for a new image (tensor) $\vec{\mathcal{B}}$.

[11] Hao, K., Braman, Hoover, SIIMS (2013)

SLIDE 56

Facial Recognition Task

Experiment 1: randomly select 15 images of each person as training; test all remaining images.
Experiment 2: randomly select 5 images of each person as training; test all remaining images.
20 trials for each experiment.

The Extended Yale Face Database B, http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html

SLIDE 57

t-SVD vs. PCA

SLIDE 58

Data Comparison

In general, consider J pieces of 2D, m × n data. Storage as mn × J matrix A or m × J × n tensor A. Which is more compressible?

SLIDE 59

Theoretical Result

Theorem (K., Horesh, Avron, Newman (2019))
Suppose $\mathcal{A}_k$ is the optimal k-term t-SVDM approximation to $\mathcal{A}$, and let $A_k$ be the optimal k-term matrix SVD approximation to $A$. Then $\|\mathcal{A} - \mathcal{A}_k\|_F \le \|A - A_k\|_F$, where strict inequality is achievable.
The result holds for any $M$ that is a multiple of a unitary (orthogonal) matrix. Why? It takes advantage of latent structure in the data.

SLIDE 60

t-SVDMII

The truncated t-SVDM ignores the relative importance of the faces. Global approach: order the $\hat{\sigma}^{(j)}_i := \hat{\mathcal{S}}_{i,i,j}$ and truncate at an energy level. This gives $\mathcal{A}_\rho$, with $\rho_i = \mathrm{rank}(\hat{\mathcal{A}}^{(i)})$.

SLIDE 61

Comparison

Implicit rank = total number of non-zero $\hat{\sigma}^{(j)}_i$.

Theorem (K., Horesh, Avron, Newman, 2019)
Let $\mathcal{A}_k$ be the t-SVDM t-rank-$k$ approximation to $\mathcal{A}$, and suppose its implicit rank is $r$. Define $\mu = \|\mathcal{A}_k\|_F^2 / \|\mathcal{A}\|_F^2$. There exists $\gamma \le \mu$ such that the t-SVDMII approximation $\mathcal{A}_\rho$ obtained for this $\gamma$ has implicit rank less than or equal to the implicit rank of $\mathcal{A}_k$, and

$\|\mathcal{A} - \mathcal{A}_\rho\|_F \le \|\mathcal{A} - \mathcal{A}_k\|_F \le \|A - A_k\|_F.$

SLIDE 62

Yale Example

SLIDE 63

Truncated-HOSVD in the ⋆M Framework

Define $M = (U^{(3)})^\top$ from the HOSVD. Then we can express the HOSVD in our tensor framework, and we can show that our t-SVDM and t-SVDMII are superior to the truncated HOSVD for appropriate truncation levels as well.

SLIDE 64

Data Compression

Not all tensor decompositions are created equal!

(Figure: (a) Original, (b) tr-tSVDMII, (c) truncated matrix SVD, (d) tr-HOSVD(m, 25, n), (e) tr-HOSVD(70, 53, 53).)

SLIDE 65

Other Data? An Application in POD.

Discretize the dynamical system by $n_x$, $n_y$ points in space:

$\frac{\partial \bar{u}(t)}{\partial t} = A \bar{u}(t) + f(\bar{u}(t)) + q(t), \qquad t \ge 0$

We want $\bar{u} \approx P_r \bar{u} = B \underbrace{B^\top \bar{u}}_{\tilde{u}}$, where $B = [b_1, \ldots, b_r]$ is an orthonormal basis for the projected state space. Then we replace $\bar{u}(t)$ by $B \tilde{u}$ and solve the projected problem:

$\frac{\partial \tilde{u}(t)}{\partial t} = B^\top A B \tilde{u}(t) + B^\top f(B \tilde{u}(t)) + B^\top q(t)$

SLIDE 66

An Application in POD

In practice, we get a snapshot matrix $X = [\bar{u}^{(1)}, \ldots, \bar{u}^{(s)}]$, and $B$ solves

$\min_{B \in \mathbb{R}^{n \times r}} \|X - B B^\top X\|_F \quad \text{s.t.} \quad B^\top B = I.$

Thus, $B$ is given by the first $r$ left singular vectors of $X$.
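In MATLAB (the sizes are illustrative):

  X = randn(1000, 40);           % snapshot matrix [u(1), ..., u(s)]
  r = 10;
  [U, ~, ~] = svd(X, 'econ');
  B = U(:, 1:r);                 % minimizes ||X - B*B'*X||_F over B'*B = I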

SLIDE 67

An Application in POD

Our idea [12]: compute a snapshot tensor, and construct $\mathcal{B}$ from the left singular tensor instead, since the t-SVDM (t-SVDMII) solves the corresponding optimization problem under $\star_M$.
So far, we have tested this for $M$ being the DFT matrix. (Ultimately, forming the projected problem requires some manipulation back in 'matrix-vector' land.)

[12] See Jiani Zhang's Ph.D. Thesis, Tufts, 2017.

SLIDE 68

Example

Diffusion equation:

$\frac{\partial u(r,t)}{\partial t} - \nabla \cdot (\kappa \nabla u(r,t)) = 0$

Figure: Sample snapshots of the solution $\bar{u}_j$, $j = 1, 3, 7, 9, 12, 15$.

SLIDE 69

Better Basis? - Numerical Support

Diffusion equation:

$\frac{\partial u(r,t)}{\partial t} - \nabla \cdot (\kappa \nabla u(r,t)) = 0$

Figure: The first three basis vectors from the SVD.

SLIDE 70

Better Basis? - Numerical Support

Diffusion equation:

$\frac{\partial u(r,t)}{\partial t} - \nabla \cdot (\kappa \nabla u(r,t)) = 0$

Figure: The first three basis slices from the t-SVD.

SLIDE 71

Better Basis? - Numerical Support

Diffusion equation:

$\frac{\partial u(r,t)}{\partial t} - \nabla \cdot (\kappa \nabla u(r,t)) = 0$

SLIDE 72

Computation Cost

What about the computational cost comparison?

Computing the basis: tensor SVD = independent matrix computations in the transform domain.
Size of the reduced model: $n_y k$ ("expensive" by comparison to $k$), if we assume the same value of $k$ is needed in the matrix and tensor cases.
Two improvements that address the cost issue (details omitted):
◮ Use t-SVDMII.
◮ Reduce the snapshot data from two directions (a sequential-truncation variant!).

SLIDE 76

Prelim Results with Enhancements, k = 30

Diffusion equation:

$\frac{\partial u(r,t)}{\partial t} - \nabla \cdot (\kappa \nabla u(r,t)) = 0$

SLIDE 77

Summary

Multiway data can be compressed through various tensor decompositions; we covered only a small number.
The various decompositions offer distinct features; some may be better than others for certain applications.
Randomized variants are possible for speed (see also Zhang et al. for a randomized t-SVD).
Variants are available that address concerns with sparsity.
Parallelizable and memory-efficient computations.
We showed only one POD example, but other uses of tensors in the context of ROM are still under investigation.
There is a great deal of matrix structure that I barely touched on; there may be more problems amenable to tensor treatment.
