  1. A randomized block sampling approach to the canonical polyadic decomposition of large-scale tensors. Nico Vervliet, joint work with Lieven De Lathauwer. SIAM AN17, July 13, 2017.

  2. Classification of hazardous gases using e-noses: 900 experiments containing 72 time series with 26 000 samples each. [Figure: sensor readout fed to a classifier; data arranged as experiment × time]

  3. Overview
  ◮ Decomposing large-scale tensors
  ◮ Randomized block sampling
  ◮ Experimental results
  ◮ Chemo-sensing application

  4–5. Canonical polyadic decomposition
  ◮ Sum of R rank-1 terms:
    \mathcal{T} = a_1 \otimes b_1 \otimes c_1 + \cdots + a_R \otimes b_R \otimes c_R
  ◮ Mathematically, for a general Nth-order tensor \mathcal{T}:
    \mathcal{T} = \sum_{r=1}^{R} a_r^{(1)} \otimes a_r^{(2)} \otimes \cdots \otimes a_r^{(N)} = [\![ A^{(1)}, A^{(2)}, \ldots, A^{(N)} ]\!]
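The rank-1 sum above can be made concrete with a small NumPy sketch; `cpd_reconstruct` is a hypothetical helper for illustration, not Tensorlab code.

```python
import numpy as np

def cpd_reconstruct(factors):
    """Rebuild a tensor from CPD factor matrices A^(1), ..., A^(N):
    the sum of R outer products of corresponding columns."""
    R = factors[0].shape[1]
    T = 0
    for r in range(R):
        term = factors[0][:, r]
        for A in factors[1:]:
            term = np.multiply.outer(term, A[:, r])
        T = T + term
    return T

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((4, 2)),
           rng.standard_normal((5, 2)),
           rng.standard_normal((6, 2)))
T = cpd_reconstruct([A, B, C])
print(T.shape)  # (4, 5, 6)
```

For a third-order tensor the same reconstruction is `np.einsum('ir,jr,kr->ijk', A, B, C)`.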

  6–7. Computing a CPD
  ◮ Optimization problem:
    \min_{A^{(1)}, A^{(2)}, \ldots, A^{(N)}} \frac{1}{2} \left\| \mathcal{T} - [\![ A^{(1)}, A^{(2)}, \ldots, A^{(N)} ]\!] \right\|_F^2
  ◮ Algorithms:
    ◮ Alternating least squares
    ◮ CPOPT [Acar et al. 2011a]
    ◮ (Damped) Gauss–Newton [Phan et al. 2013]
    ◮ (Inexact) nonlinear least squares [Sorber et al. 2013]
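The first algorithm in the list, plain alternating least squares, can be sketched for a third-order tensor as follows. This is a minimal illustration of ALS, not the talk's randomized block sampling method; `als_cpd` and `khatri_rao` are hypothetical helper names.

```python
import numpy as np

def khatri_rao(B, C):
    # Column-wise Kronecker product: shape (J*K, R).
    return np.einsum('jr,kr->jkr', B, C).reshape(-1, B.shape[1])

def als_cpd(T, R, iters=500, seed=1):
    """Plain ALS for a third-order CPD: each mode in turn solves a
    linear least squares problem with the other factors fixed."""
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A = rng.standard_normal((I, R))
    B = rng.standard_normal((J, R))
    C = rng.standard_normal((K, R))
    for _ in range(iters):
        A = T.reshape(I, -1) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.moveaxis(T, 1, 0).reshape(J, -1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.moveaxis(T, 2, 0).reshape(K, -1) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

rng0 = np.random.default_rng(0)
A0, B0, C0 = (rng0.standard_normal((6, 2)) for _ in range(3))
T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = als_cpd(T, R=2)
```

Each update uses the mode-n unfolding of T and the property that the Gram matrix of a Khatri-Rao product is the Hadamard product of the factor Gram matrices.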

  8–10. Curse of dimensionality
  ◮ Suppose \mathcal{T} \in \mathbb{C}^{I \times I \times \cdots \times I} is of order N; then
    ◮ number of entries: I^N
    ◮ memory and time complexity: \mathcal{O}(I^N)
    ◮ number of variables: NIR
  Example [Vervliet et al. 2014]: a ninth-order tensor with I = 100 and rank R = 5 has
    ◮ number of entries: 10^{18}
    ◮ number of variables: 4500
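The arithmetic behind the example is easy to verify:

```python
# The example above: a 9th-order tensor with I = 100 and rank R = 5.
N, I, R = 9, 100, 5
entries = I ** N        # full tensor: 100**9 = 10**18 entries
variables = N * I * R   # CPD factors: 9 * 100 * 5 = 4500 variables
print(entries, variables)  # 1000000000000000000 4500
```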

  11. How to handle large tensors?
  ◮ Use incomplete tensors [Acar et al. 2011b; Vervliet et al. 2014; Vervliet et al. 2016a]
  ◮ Exploit sparsity [Kang et al. 2012; Papalexakis et al. 2012; Bader and Kolda 2007]
  ◮ Compress the tensor [Sidiropoulos et al. 2014; Oseledets and Tyrtyshnikov 2010; Vervliet et al. 2016b]
  ◮ Decompose subtensors and combine the results [Papalexakis et al. 2012; Phan and Cichocki 2011]
  ◮ Parallelize [Liavas and Sidiropoulos 2015], combined with many of the above

  12. Overview
  ◮ Decomposing large-scale tensors
  ◮ Randomized block sampling
  ◮ Experimental results
  ◮ Chemo-sensing application

  13–17. Randomized block sampling CPD: idea
  [Figure: \mathcal{T} \approx \text{sum of rank-1 terms}]
  Initialization → take sample → compute step → update, repeated until convergence.

  18–19. Randomized block sampling CPD: algorithm

  input : data \mathcal{T} and initial guess A^{(n)}, n = 1, \ldots, N
  output: A^{(n)}, n = 1, \ldots, N such that \mathcal{T} \approx [\![ A^{(1)}, \ldots, A^{(N)} ]\!]
  while k < K and not converged do
      create a sample \mathcal{T}_s and the corresponding A_s^{(n)}, n = 1, \ldots, N
      let \bar{A}_s^{(n)} be the result of one iteration of a restricted CPD algorithm on \mathcal{T}_s with initial guess A_s^{(n)}, n = 1, \ldots, N, and restriction \Delta
      update the affected variables A^{(n)} using \bar{A}_s^{(n)}, n = 1, \ldots, N
      k \leftarrow k + 1
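The loop above can be sketched for a third-order tensor with an ALS inner step. This is an illustrative simplification (blocks are drawn independently each iteration rather than from per-mode permutations, and the schedule constants are made up), not the Tensorlab implementation.

```python
import numpy as np

def khatri_rao(B, C):
    # Column-wise Kronecker product: shape (J*K, R).
    return np.einsum('jr,kr->jkr', B, C).reshape(-1, B.shape[1])

def rbs_cpd(T, R, block, K=300, delta0=1.0, alpha=0.5, K_search=150, Q=4, seed=0):
    """Illustrative randomized block sampling CPD loop (third-order)."""
    rng = np.random.default_rng(seed)
    dims = T.shape
    factors = [rng.standard_normal((d, R)) for d in dims]
    for k in range(K):
        # Sample a random block of T and the matching factor rows.
        idx = [rng.choice(d, size=b, replace=False) for d, b in zip(dims, block)]
        Ts = T[np.ix_(*idx)]
        sub = [A[i] for A, i in zip(factors, idx)]
        # Step restriction: constant during search, then geometric decay.
        dk = delta0 if k < K_search else delta0 * alpha ** ((k - K_search) / Q)
        # One restricted ALS sweep on the sampled block.
        for n in range(3):
            others = [sub[j] for j in range(3) if j != n]
            G = (others[0].T @ others[0]) * (others[1].T @ others[1])
            Tn = np.moveaxis(Ts, n, 0).reshape(block[n], -1)
            new = Tn @ khatri_rao(others[0], others[1]) @ np.linalg.pinv(G)
            sub[n] = (1 - dk) * sub[n] + dk * new
        # Write the updated rows back into the full factor matrices.
        for A, i, S in zip(factors, idx, sub):
            A[i] = S
    return factors

rng = np.random.default_rng(7)
A0, B0, C0 = (rng.standard_normal((8, 2)) for _ in range(3))
T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
factors = rbs_cpd(T, 2, (4, 4, 4))
```

Only the factor rows touched by the sampled block are updated in each iteration, which is what makes the per-iteration cost depend on the block size rather than on the tensor size.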

  20–22. Ingredient 1: randomized block sampling. Each mode keeps a random permutation of its indices and takes consecutive blocks from it, reshuffling once the permutation is exhausted, so every slice is sampled before any is revisited. For a 6 × 6 tensor and block size 3 × 2:
  ◮ mode 1 (blocks of 3): I_1 = \{3, 1, 2, 6, 5, 4\}, reshuffled to I_1 = \{6, 1, 4, 2, 5, 3\} after two blocks
  ◮ mode 2 (blocks of 2): I_2 = \{1, 2, 4, 6, 3, 5\}, which supplies three blocks
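The sampling scheme can be sketched as a generator per mode; `block_index_stream` is a hypothetical helper name.

```python
import numpy as np

def block_index_stream(I, block, rng):
    """Ingredient 1: yield consecutive blocks of `block` indices from a
    random permutation of range(I), reshuffling once it is used up."""
    perm, pos = rng.permutation(I), 0
    while True:
        if pos + block > I:
            perm, pos = rng.permutation(I), 0
        yield perm[pos:pos + block]
        pos += block

rng = np.random.default_rng(0)
rows = block_index_stream(6, 3, rng)   # mode-1 indices, blocks of 3
cols = block_index_stream(6, 2, rng)   # mode-2 indices, blocks of 2
first_two_row_blocks = [next(rows) for _ in range(2)]
```

Two consecutive row blocks exactly cover one permutation of the 6 row indices, matching the example above.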

  23–24. Ingredient 2: restricted CPD algorithm
  ◮ ALS variant:
    A_{k+1}^{(n)} = (1 - \alpha) A_k^{(n)} + \alpha \, T_{(n)} \bar{V}^{(n)} (\bar{W}^{(n)})^{-1}
    Enforce the restriction by setting \alpha = \Delta_k.
  ◮ NLS variant:
    \min_{p_k} \frac{1}{2} \| \mathrm{vec}(F(x_k)) - J_k p_k \|^2 \quad \text{s.t.} \quad \| p_k \| \leq \Delta_k
    in which F = \mathcal{T} - [\![ A^{(1)}, \ldots, A^{(N)} ]\!].
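The ALS-variant update can be read as a convex blend between the previous iterate and the plain ALS solution; a minimal sketch, with `restricted_update` a hypothetical name:

```python
import numpy as np

def restricted_update(A_prev, A_als, delta_k):
    """Ingredient 2 (ALS variant): Delta_k = 1 recovers the plain ALS
    solution, Delta_k -> 0 freezes the previous iterate, so Delta_k
    bounds how far one iteration can move the factor."""
    return (1 - delta_k) * A_prev + delta_k * A_als

A_prev = np.zeros((3, 2))
A_als = np.ones((3, 2))
half = restricted_update(A_prev, A_als, 0.5)  # halfway between the two
```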

  25–26. Ingredient 3: restriction
  Use a restriction of the form
    \Delta_k = \Delta_0 \text{ if } k < K_{\text{search}}, \qquad \Delta_k = \hat{\Delta}_0 \cdot \alpha^{(k - K_{\text{search}})/Q} \text{ if } k \geq K_{\text{search}}
  [Figure: \Delta_k versus iteration k: constant, then decaying geometrically]
  Example (selecting Q): for a 100 × 100 × 100 tensor and block size 25 × 25 × 25, Q = 4.
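The schedule follows directly from the piecewise formula; the parameter values in the example calls below are made up for illustration.

```python
def delta_schedule(k, K_search, delta0, delta0_hat, alpha, Q):
    """Ingredient 3: constant restriction during the search phase,
    geometric decay (a factor alpha every Q iterations) afterwards."""
    if k < K_search:
        return delta0
    return delta0_hat * alpha ** ((k - K_search) / Q)

print(delta_schedule(10, 50, 1.0, 0.1, 0.5, 4))  # 1.0 (search phase)
print(delta_schedule(58, 50, 1.0, 0.1, 0.5, 4))  # 0.1 * 0.5**2 = 0.025
```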

  27–28. Ingredient 4: a stopping criterion
  ◮ Function evaluation: f_{\text{val}} = \frac{1}{2} \left\| \mathcal{T} - [\![ A^{(1)}, \ldots, A^{(N)} ]\!] \right\|^2
  [Figure: f_{\text{val}} and CPD error versus iteration k]
  ◮ Step size
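The function value is straightforward to evaluate for a third-order tensor; a sketch with illustrative names:

```python
import numpy as np

def f_val(T, factors):
    """Half the squared Frobenius norm of the residual between T and
    its CPD reconstruction (third-order case)."""
    approx = np.einsum('ir,jr,kr->ijk', *factors)
    return 0.5 * np.linalg.norm(T - approx) ** 2

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 2)) for _ in range(3))
T = np.einsum('ir,jr,kr->ijk', A, B, C)
print(f_val(T, [A, B, C]))  # 0.0 for an exact decomposition
```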

  29–31. Intermezzo: Cramér–Rao bound
  ◮ Uncertainty of an estimate [Figure: Gaussian density; 68% of the mass lies within \pm\sigma]
  ◮ CRB \leq \sigma^2
  ◮ C = \tau^2 (J^H J)^{-1}

  32–34. Ingredient 4: Cramér–Rao bound based stopping criterion
  ◮ Experimental bound:
    ◮ use the estimates A_k^{(n)}
    ◮ use f_{\text{val}} to estimate the noise \tau
  ◮ Stopping criterion:
    D_{\text{CRB}} = \frac{1}{R \sum_n I_n} \sum_{n=1}^{N} \sum_{i=1}^{I_n} \sum_{r=1}^{R} \frac{\left| A_k^{(n)}(i,r) - A_{k-K_{\text{CRB}}}^{(n)}(i,r) \right|}{\sqrt{C^{(n)}(i,r)}} \leq \gamma
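Assuming `crb[n]` holds the per-entry Cramér–Rao values C^(n)(i, r) for factor n, the criterion can be sketched as follows (`d_crb` is a hypothetical name); iteration stops once the value drops below the threshold γ.

```python
import numpy as np

def d_crb(factors_k, factors_prev, crb):
    """Average absolute change of the factor entries between iterations
    k and k - K_CRB, measured in units of each entry's CRB."""
    R = factors_k[0].shape[1]
    total_rows = sum(A.shape[0] for A in factors_k)
    num = sum(np.sum(np.abs(Ak - Ap) / np.sqrt(C))
              for Ak, Ap, C in zip(factors_k, factors_prev, crb))
    return num / (R * total_rows)

Ak = [np.array([[2.0]])]
Ap = [np.array([[0.0]])]
C = [np.array([[4.0]])]
print(d_crb(Ak, Ap, C))  # |2 - 0| / sqrt(4) = 1.0
```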

  35–37. Unrestricted phase vs restricted phase
  [Figure: CPD error versus iteration k, with phases 1, 2, 3]
  ◮ Unrestricted phase (1 + 2): converge to a neighborhood of an optimum
  ◮ Restricted phase (3): pull the iterates towards the optimum
  Assumptions:
  ◮ a CPD of rank R exists
  ◮ the SNR is high enough
  ◮ most block dimensions are > R

  38. Overview
  ◮ Decomposing large-scale tensors
  ◮ Randomized block sampling
  ◮ Experimental results
  ◮ Chemo-sensing application

  39–41. Experiment overview
  ◮ Experiments:
    ◮ comparison of ALS vs NLS (see paper)
    ◮ influence of the block size
    ◮ influence of the step size (see paper)
  ◮ Performance:
    ◮ 50 Monte Carlo experiments
    ◮ CPD error: \max_n \left\| A_0^{(n)} - A_{\text{res}}^{(n)} \right\| / \left\| A_0^{(n)} \right\|
  ◮ Implementation: cpd_rbs in Tensorlab 3.0 [Vervliet et al. 2016c]
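Assuming the permutation and scaling ambiguity between the true factors A_0 and the recovered factors A_res has already been resolved, the error measure can be sketched as:

```python
import numpy as np

def cpd_error(true_factors, est_factors):
    """Worst relative factor error across all modes, as in the
    performance measure above."""
    return max(np.linalg.norm(A0 - A) / np.linalg.norm(A0)
               for A0, A in zip(true_factors, est_factors))

true_f = [np.eye(2)]
est_f = [1.1 * np.eye(2)]
err = cpd_error(true_f, est_f)
print(err)
```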
