

SLIDE 1

Partial Trace Regression and Low-Rank Kraus Decomposition

Hachem Kadri1, Stéphane Ayache1, Riikka Huusari2, Alain Rakotomamonjy3,4, Liva Ralaivola4

1 Aix-Marseille University, CNRS, LIS, Marseille, France
2 Helsinki Institute for Information Technology, Aalto University, Espoo, Finland
3 Université Rouen Normandie, LITIS, Rouen, France
4 Criteo AI Lab, Paris

SLIDE 2

Trace Regression

y = tr(B_*^T X) + ε

Generalization of linear regression to a matrix input X
→ Spatio-temporal data, covariance descriptors, ...
Output y is a real number

H. Kadri et al., Partial Trace Regression and Low-Rank Kraus Decomposition, 2/17
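The trace regression model can be sketched in a few lines of numpy. Dimensions, variable names, and the noise scale are illustrative choices of ours, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5
B_star = rng.standard_normal((p, p))   # unknown parameter matrix B_*

def trace_regression_sample(X, B, noise_scale, rng):
    """One observation of the model y = tr(B^T X) + eps."""
    eps = noise_scale * rng.standard_normal()
    return np.trace(B.T @ X) + eps

X = rng.standard_normal((p, p))        # matrix input
y = trace_regression_sample(X, B_star, 0.0, rng)

# tr(B^T X) is the Frobenius inner product <B, X> = sum_ij B_ij X_ij
assert np.isclose(y, np.sum(B_star * X))
```

The Frobenius inner-product identity in the last line is why the model generalizes linear regression: vectorizing B and X recovers an ordinary linear model in p² variables.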

SLIDE 3

Trace Regression (cont.)

Low-Rank Estimation [Koltchinskii et al., 2011]

B̂ = arg min_B Σ_{i=1}^l (y_i − tr(B^T X_i))² + λ ‖B‖_1

where ‖B‖_1 is the trace (nuclear) norm.

SLIDE 4

Trace Regression (cont.)

PSD-Constrained Estimation [Slawski et al., 2015]

B̂ = arg min_{B ∈ S+_p} Σ_{i=1}^l (y_i − tr(B^T X_i))²

SLIDE 5

Trace Regression (cont.)

Relevant to:
→ Matrix completion
→ Phase retrieval
→ Quantum state tomography
→ ...

SLIDE 6

Trace Regression (cont.)

→ Matrix completion:

arg min_B ‖P_Ω(B_*) − P_Ω(B)‖² s.t. rank(B) = r

SLIDE 7

Trace Regression (cont.)

→ Matrix completion, written entry by entry as trace regression:

arg min_B Σ_{(i,j)∈Ω} (P_Ω(B_*)_{ij} − tr(B^T E_{ij}))² s.t. rank(B) = r
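The matrix-unit formulation works because tr(B^T E_ij) simply reads off entry (i, j) of B, so observing entries of a matrix is a special case of trace regression measurements. A small numpy check, with sizes of our choosing:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
B = rng.standard_normal((p, p))

def matrix_unit(i, j, p):
    """E_ij: all zeros except a 1 at position (i, j)."""
    E = np.zeros((p, p))
    E[i, j] = 1.0
    return E

# A trace regression measurement with input X = E_ij observes B_ij:
i, j = 1, 3
assert np.isclose(np.trace(B.T @ matrix_unit(i, j, p)), B[i, j])
```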

SLIDE 8

Partial Trace Regression

Generalizes trace regression to the case where both inputs and outputs are matrices.

→ Domain adaptation [figure from Liu et al., 2019]
→ EEG data → covariance matrix [figure from Williamson et al., 2012]

Inspiration: the partial trace, completely positive (CP) maps, and the Kraus decomposition from quantum computing

SLIDE 9

Notational Conventions

M_p := M_p(R), the space of all p × p real matrices
M_p(M_q), the space of p × p block matrices whose (i, j) entry is an element of M_q
L(M_p, M_q), the space of linear maps from M_p to M_q

SLIDE 10

From Trace to Partial Trace

The trace of a p × p matrix sums its diagonal entries and returns a scalar:

tr(A) = Σ_{i=1}^p a_{ii}

SLIDE 11

From Trace to Partial Trace (cont.)

The trace of a block matrix is still a scalar: it sums the diagonal entries of all diagonal blocks.

SLIDE 12

From Trace to Partial Trace (cont.)

The partial trace tr_m replaces each m × m block with its trace:

(tr_m(M))_{ij} = tr(M_{ij}), 1 ≤ i, j ≤ q

The partial trace operation applied to the m × m blocks of a qm × qm matrix gives a q × q matrix as an output.
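The partial trace is straightforward to implement with a reshape. This is our own sketch following the definition above, with illustrative sizes:

```python
import numpy as np

def partial_trace(M, q, m):
    """Partial trace over the m x m blocks of a (q*m) x (q*m) matrix.

    Entry (i, j) of the q x q output is the trace of block (i, j) of M.
    """
    blocks = M.reshape(q, m, q, m)          # blocks[i, :, j, :] is block (i, j)
    return np.trace(blocks, axis1=1, axis2=3)

rng = np.random.default_rng(0)
q, m = 3, 4
M = rng.standard_normal((q * m, q * m))

# Tracing the partial trace recovers the full trace:
assert np.isclose(np.trace(partial_trace(M, q, m)), np.trace(M))
# Sanity check: tr_m(I_{qm}) = m * I_q
assert np.allclose(partial_trace(np.eye(q * m), q, m), m * np.eye(q))
```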

SLIDE 13

Partial Trace Regression

Y = tr_m(A_* X B_*^T) + ε

Matrix input X ∈ M_p and matrix output Y ∈ M_q
A_*, B_* ∈ M_{qm×p} are the unknown parameters of the model
We recover the trace regression model when q = 1

Learning the model parameters: our solution relies on the Kraus representation of completely positive maps
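A generative sketch of this model in numpy, with hypothetical dimensions of our choosing (and no noise, for clarity):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, m = 6, 2, 3
A_star = rng.standard_normal((q * m, p))   # unknown parameters A_*, B_*
B_star = rng.standard_normal((q * m, p))

def ptr_regression(X, A, B, q, m):
    """Y = tr_m(A X B^T): maps a p x p input to a q x q output."""
    Z = A @ X @ B.T                                  # (q*m) x (q*m)
    return np.trace(Z.reshape(q, m, q, m), axis1=1, axis2=3)

X = rng.standard_normal((p, p))
Y = ptr_regression(X, A_star, B_star, q, m)
assert Y.shape == (q, q)

# With q = 1 the output is 1 x 1 and the model reduces to trace
# regression with parameter matrix A^T B:
A1, B1 = rng.standard_normal((m, p)), rng.standard_normal((m, p))
y = ptr_regression(X, A1, B1, 1, m)
assert np.isclose(y[0, 0], np.trace((A1.T @ B1).T @ X))
```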

SLIDE 14

Positive and Completely Positive Maps [Bhatia, 2009]

Positive maps: Φ ∈ L(M_p, M_q) is positive if Φ(M) ∈ S+_q for all M ∈ S+_p

SLIDE 15

Positive and Completely Positive Maps [Bhatia, 2009] (cont.)

m-positive maps: Φ ∈ L(M_p, M_q) is m-positive if Φ_m : M_m(M_p) → M_m(M_q), defined blockwise by

Φ_m([A_{ij}]_{i,j=1}^m) := [Φ(A_{ij})]_{i,j=1}^m,

is positive.

SLIDE 16

Positive and Completely Positive Maps [Bhatia, 2009] (cont.)

Completely positive maps: Φ is completely positive if it is m-positive for every m ≥ 1

SLIDE 17

A Positive But Not Completely Positive Map

Example: the transpose map. Define Φ : M_2 → M_2 by Φ(A) = A^T. Then Φ_1 ≥ 0 but Φ_2 is not positive: applied to the PSD block matrix [E_{ij}]_{i,j=1}^2 of matrix units, Φ_2 returns [E_{ji}]_{i,j=1}^2, the swap matrix, which has eigenvalue −1.

SLIDE 18

A Completely Positive Map

Example: let V ∈ M_{q×p} and define Φ : M_p → M_q by Φ(A) = V A V^T. Then Φ is completely positive. For instance,

Φ_2([[A_11, A_12], [A_21, A_22]]) = [[V A_11 V^T, V A_12 V^T], [V A_21 V^T, V A_22 V^T]]
= (I_2 ⊗ V) [[A_11, A_12], [A_21, A_22]] (I_2 ⊗ V^T),

which is positive whenever the input block matrix is.
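The Kronecker identity on this slide is easy to check numerically. A quick sketch with arbitrary sizes of our choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 4, 3
V = rng.standard_normal((q, p))

A11, A12, A21, A22 = (rng.standard_normal((p, p)) for _ in range(4))
M = np.block([[A11, A12], [A21, A22]])   # an element of M_2(M_p)

# Applying Phi(A) = V A V^T blockwise equals conjugation by I_2 (x) V:
lhs = np.block([[V @ A11 @ V.T, V @ A12 @ V.T],
                [V @ A21 @ V.T, V @ A22 @ V.T]])
K = np.kron(np.eye(2), V)                # block-diagonal: diag(V, V)
rhs = K @ M @ K.T
assert np.allclose(lhs, rhs)
```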

SLIDE 19

Stinespring Representation

Stinespring's Theorem (1955): Let Φ ∈ L(M_p, M_q). Then Φ can be written as Φ(X) = tr_m(A X A^T) for some A ∈ M_{qm×p} if and only if Φ is completely positive.

→ Partial trace regression ↔ learning a completely positive map
→ A partial trace version of the PSD-constrained trace regression
→ Efficient optimization via the Kraus decomposition

SLIDE 20

Kraus Representation

Choi's Theorem (1975), Kraus Decomposition (1971): Let Φ ∈ L(M_p, M_q) be a completely positive linear map. Then there exist A_j ∈ M_{q×p}, 1 ≤ j ≤ r, with r ≤ pq, such that

∀X ∈ M_p, Φ(X) = Σ_{j=1}^r A_j X A_j^T.

→ Learning a completely positive map ↔ finding a Kraus decomposition
→ Small values of r correspond to a low-rank Kraus representation
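A rank-r Kraus map is cheap to apply and, being completely positive, sends PSD inputs to PSD outputs. A small numpy illustration, with sizes of our choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, r = 5, 3, 2
kraus = [rng.standard_normal((q, p)) for _ in range(r)]

def apply_kraus(X, ops):
    """Phi(X) = sum_j A_j X A_j^T, a rank-r Kraus representation."""
    return sum(A @ X @ A.T for A in ops)

G = rng.standard_normal((p, p))
X = G @ G.T                              # PSD input
Y = apply_kraus(X, kraus)

# Each term A_j X A_j^T is PSD, so the sum is PSD:
assert np.all(np.linalg.eigvalsh(Y) >= -1e-10)
```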

SLIDE 21

Back to Partial Trace Regression

Low-Rank Kraus Estimation

arg min_{A_j ∈ M_{q×p}} Σ_{i=1}^l ℓ(Y_i, Σ_{j=1}^r A_j X_i A_j^T)

where ℓ is a loss between output matrices.
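The slides do not spell out an optimizer here, so the following is only our own plain gradient-descent sketch for the squared Frobenius loss, not the authors' implementation. With residual R = Y − Σ_k A_k X A_k^T, the gradient of ‖R‖_F² with respect to A_k is −2(R A_k X^T + R^T A_k X):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, r, n = 6, 3, 2, 50     # illustrative sizes

# Ground-truth Kraus operators and noiseless training pairs
true_ops = [rng.standard_normal((q, p)) for _ in range(r)]
Xs = []
for _ in range(n):
    G = rng.standard_normal((p, p))
    Xs.append(G @ G.T)       # PSD inputs

def phi(X, ops):
    return sum(A @ X @ A.T for A in ops)

Ys = [phi(X, true_ops) for X in Xs]

def loss(ops):
    return sum(np.sum((Y - phi(X, ops)) ** 2) for X, Y in zip(Xs, Ys)) / n

ops = [0.1 * rng.standard_normal((q, p)) for _ in range(r)]
loss_init = loss(ops)
lr = 1e-6                    # deliberately tiny step for a stable illustration
for _ in range(100):
    grads = [np.zeros((q, p)) for _ in range(r)]
    for X, Y in zip(Xs, Ys):
        R = Y - phi(X, ops)  # residual on this sample
        for k, A in enumerate(ops):
            # d/dA_k ||R||_F^2 = -2 (R A_k X^T + R^T A_k X)
            grads[k] -= 2 * (R @ A @ X.T + R.T @ A @ X)
    ops = [A - lr * g / n for A, g in zip(ops, grads)]

assert loss(ops) < loss_init
```

Note that by construction every iterate parameterizes a completely positive map, so PSD outputs come for free, unlike unconstrained matrix regression.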

SLIDE 22

Back to Partial Trace Regression (cont.)

Generalization Bound

F = {Φ : M_p → M_q : Φ is completely positive and its Kraus rank is equal to r}

Under some assumptions on ℓ, for any δ > 0, with probability at least 1 − δ, the following holds for all h ∈ F:

R(h) ≤ R̂(h) + γ √( pqr log(8e pq/r) log(l/(pqr)) / l ) + γ √( log(1/δ) / (2l) )

SLIDE 23

Back to Matrix Completion

(Block) PSD Matrix Completion

Let Φ : M_p → M_q be a linear mapping. Then the following conditions are equivalent:

1. Φ is completely positive.
2. The block matrix M ∈ M_p(M_q) defined by M_{ij} = Φ(E_{ij}), 1 ≤ i, j ≤ p, is positive, where the E_{ij} are the matrix units of M_p.
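This equivalence (Choi's characterization) can be sanity-checked numerically: build the block matrix [Φ(E_ij)] for a Kraus map and verify it is PSD. Sizes below are our own:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, r = 3, 2, 2
kraus = [rng.standard_normal((q, p)) for _ in range(r)]

def phi(X):
    """A completely positive map given by its Kraus operators."""
    return sum(A @ X @ A.T for A in kraus)

def matrix_unit(i, j, p):
    E = np.zeros((p, p))
    E[i, j] = 1.0
    return E

# Block matrix M in M_p(M_q): block (i, j) is Phi(E_ij)
M = np.block([[phi(matrix_unit(i, j, p)) for j in range(p)]
              for i in range(p)])

# Phi is completely positive, so M must be symmetric PSD:
assert np.allclose(M, M.T)
assert np.all(np.linalg.eigvalsh(M) >= -1e-10)
```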

SLIDE 24

Experiments

→ (PSD) matrix-to-matrix regression
→ (Block) PSD matrix completion

SLIDE 25

(PSD) Matrix-to-Matrix Regression

Simulated data: 10000 samples, 20 × 20 → 10 × 10

Type           Model                     MSE
Vectorized     Multivariate Regression   0.058 ± 0.0134
Vectorized     Reduced-Rank Regression   0.245 ± 0.1023
Tensorized     HOPLS                     1.602 ± 0.0011
Tensorized     Tucker-NN                 0.595 ± 0.0252
Tensorized     TensorTrain-NN            0.001 ± 0.0009
Trace          Trace Regression          0.028 ± 0.0144
Partial Trace  Partial Trace Regression  0.001 ± 0.0008

SLIDE 26

(PSD) Matrix-to-Matrix Regression (cont.)

Simulated data: 100 samples, 20 × 20 → 10 × 10

Model                     MSE            Param.
TensorTrain-NN            0.662 ± 0.364  4,000
Partial Trace Regression  0.007 ± 0.013  1,000

→ Partial Trace Regression preserves PSD structure

SLIDE 27

(PSD) Matrix-to-Matrix Regression (cont.)

Real BCI data: mapping covariance matrices for domain adaptation. Comparison of no adaptation, Optimal Transport (full adaptation), and Partial Trace (out-of-sample adaptation); accuracy per subject:

Subject  NoAdapt  FullAdapt  OoSAdapt
1        73.66    73.66      72.24
3        65.93    72.89      68.86
7        53.42    64.62      59.20
8        73.80    75.27      73.06
9        73.10    75.00      76.89

SLIDE 28

(Block) PSD Matrix Completion

Simulated data: missing blocks, 28 × 28 (p = 7, q = 4)

[Figure: original matrix, matrix with missing values, reconstructed matrix]

SLIDE 29

(Block) PSD Matrix Completion (cont.)

Simulated data: missing blocks, 28 × 28 (p = 7, q = 4)

Model                     MSE
TensorTrain-NN            3.942 ± 1.463
Trace Regression          2.996 ± 1.078
Partial Trace Regression  0.572 ± 0.019

SLIDE 30

(Block) PSD Matrix Completion (cont.)

Simulated data: missing entries, 28 × 28 (p = 7, q = 4)

[Figure: original matrix, matrix with missing values, reconstructed matrix]

SLIDE 31

(Block) PSD Matrix Completion (cont.)

Simulated data: missing entries, 28 × 28 (p = 7, q = 4)

[Figure: original matrix, matrix with missing values, depth-1 reconstruction, depth-2 reconstruction]

SLIDE 32

(Block) PSD Matrix Completion (cont.)

Multiple features digits dataset: kernel matrix completion in a multi-view setting

[Figure: SVM accuracy, depth-1 vs depth-2]

SLIDE 33

Conclusion

→ New model: partial trace regression
→ Novel concepts in ML: completely positive maps and the low-rank Kraus decomposition
→ Theoretically well founded
→ Promising performance
→ A bridge between machine learning and quantum computing

H. Kadri et al., Partial Trace Regression and Low-Rank Kraus Decomposition, 17/17