(Some) Challenges in Tensor Mining
Evrim Acar
Sandia National Labs., Livermore, CA
Unsupervised: decompose the data tensor, X ≈ model + E, using Parafac or Tucker; X may be dense or sparse.
Supervised: learn from Xtrain and labels ytrain; predict ytest for Xtest.
Data evolving over time:
– DBLP dataset: Authors × Conferences × Years (10K × 2K × 14; ~0.1% dense)
[Figure: the tensor as a stack of authors × conferences slices, one per year, 1991–2004]
Q1: Can we use tensor decompositions to model the data and extract meaningful underlying factors?
Q2: Can we predict who is going to publish at which conferences in the future? (Link prediction in time)
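One simple heuristic for Q2 (an illustration, not necessarily the method used in this work): given CP factors A (authors), B (conferences), and C (years), forecast each component's activity from its recent time-factor values and score author–conference pairs accordingly. The function name `link_scores` and the averaging forecast are my own assumptions.

```python
import numpy as np

def link_scores(A, B, C, recent=3):
    """Score future author-conference links from CP factors.

    A: authors x R, B: conferences x R, C: years x R.
    gamma_r forecasts component r's activity as the mean of its
    last `recent` time-factor values (a simple assumption).
    scores[i, j] = sum_r A[i, r] * B[j, r] * gamma_r
    """
    gamma = C[-recent:].mean(axis=0)   # per-component activity forecast
    return (A * gamma) @ B.T
```

Pairs with high scores are predicted as likely future links.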
Joint work with T. G. Kolda and D. M. Dunlavy
SIAM CS&E March 2-6, 2009
xijk: # of papers by the ith author at the jth conference in year k
X ≈ a1 ∘ b1 ∘ c1 + a2 ∘ b2 ∘ c2 + … + aR ∘ bR ∘ cR  (authors × conferences × years)
Solve using a gradient-based optimization method.
Initialization: first two modes using SVD; last mode: random.
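The slides use a gradient-based solver with SVD initialization; as a minimal self-contained illustration, here is a plain-NumPy CP (Parafac) fit via alternating least squares instead, with random initialization. The helper names `kr` and `cp_als` are my own.

```python
import numpy as np

def kr(U, V):
    """Column-wise Khatri-Rao product of U (m x R) and V (n x R) -> (m*n x R)."""
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def cp_als(X, R, n_iter=200, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r] by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
    X1 = X.reshape(I, J * K)                    # mode-1 unfolding
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K) # mode-2 unfolding
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J) # mode-3 unfolding
    for _ in range(n_iter):
        A = X1 @ kr(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ kr(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ kr(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C
```

Each update solves the least-squares normal equations for one factor matrix while the other two are held fixed.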
X ≈ a1 ∘ b1 ∘ c1 + a2 ∘ b2 ∘ c2 + … + aR ∘ bR ∘ cR
Each component r pairs an author factor ar and a conference factor br with a time profile cr over the years.
[Figure: sample extracted components with time-mode coefficients over 1992–2004 — e.g., a medical-imaging component (conferences BILDMED, CARS, DAGM; authors Hans Peter Meinzer, Heinrich Niemann, Thomas Martin Lehmann) and an AI component (conference IJCAI; authors Craig Boutilier, Daphne Koller)]
Success with 70% randomly missing data [Tomasi & Bro, 2005]
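One common way to handle missing entries (an illustration; [Tomasi & Bro, 2005] study this setting, but this sketch is not their exact algorithm) is an EM-style loop: impute missing values from the current model, refit, repeat. All function names here are my own.

```python
import numpy as np

def kr(U, V):
    """Column-wise Khatri-Rao product."""
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def cp_missing(X, mask, R, n_iter=300, seed=0):
    """CP with missing data: mask is True for observed entries.

    Alternates one ALS sweep on the imputed tensor with re-imputation
    of the missing entries from the current model.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
    Xf = np.where(mask, X, X[mask].mean())      # initial fill: observed mean
    for _ in range(n_iter):
        X1 = Xf.reshape(I, J * K)
        X2 = np.moveaxis(Xf, 1, 0).reshape(J, I * K)
        X3 = np.moveaxis(Xf, 2, 0).reshape(K, I * J)
        A = X1 @ kr(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ kr(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ kr(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        # keep observed entries, impute the rest from the model
        Xf = np.where(mask, X, np.einsum('ir,jr,kr->ijk', A, B, C))
    return A, B, C
```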
EEG data: Time samples × Channels
Joint work with B. Yener, C. A. Bingol, and H. Bingol
xij: electrical potential at the ith time sample, jth channel
The continuous wavelet transform (CWT) expands the data to a third-order tensor: Time samples × Scales (freq.) × Channels
xijk: power of a wavelet coefficient at the ith sample, jth scale, kth channel
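To illustrate the construction, a rough NumPy sketch that builds a time × scales × channels power tensor using a hand-rolled complex Morlet wavelet; the wavelet choice, scale grid, and function names are my own assumptions, not from the slides.

```python
import numpy as np

def morlet(scale, w=5.0):
    """A simple complex Morlet wavelet sampled over a scale-dependent support."""
    n = int(10 * scale)
    t = np.arange(-(n // 2), n // 2 + 1) / scale
    return np.exp(1j * w * t) * np.exp(-t**2 / 2)

def cwt_power_tensor(eeg, scales):
    """eeg: (time x channels) -> (time x scales x channels) wavelet power tensor."""
    T, C = eeg.shape
    out = np.empty((T, len(scales), C))
    for si, s in enumerate(scales):
        wav = morlet(s)
        for c in range(C):
            coef = np.convolve(eeg[:, c], wav, mode='same')
            out[:, si, c] = np.abs(coef) ** 2   # power of the wavelet coefficient
    return out
```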
X ≈ a1 ∘ b1 ∘ c1 + a2 ∘ b2 ∘ c2  [Acar et al. '07; De Vos et al. '07]
[Figure: two extracted components, each with a signature in the time domain (coefficients vs. time samples), the frequency domain (coefficients vs. scales), and the electrode domain (channels Fp1, F3, F7, C3, T3, T5, O1, Fp2, F4, F8, C4, T4, T6, P4, O2, Fz, Pz)]
[Figure: time-sample and scale signatures from multiple ALS runs under different initializations (HOSVD vs. random)]
EEG data: Channels × Time samples
xij: electrical potential at the ith channel, jth time sample
Epilepsy Feature Tensor: each time epoch s of each channel is summarized by a feature vector [f1(s), f2(s), …, fn(s)], yielding a third-order tensor: Time epochs × Features × Channels
xijk: value of the jth feature at the ith epoch recorded at the kth channel
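As an illustration of building such a feature tensor, a sketch with four hypothetical per-epoch features (mean, standard deviation, line length, peak-to-peak); the actual feature set used in the study is not specified here.

```python
import numpy as np

def epoch_feature_tensor(eeg, epoch_len):
    """eeg: (channels x time samples) -> (epochs x features x channels).

    Features per epoch and channel (illustrative choices):
    mean, std, line length (sum of absolute sample-to-sample
    differences), and peak-to-peak amplitude.
    """
    C, T = eeg.shape
    n_epochs = T // epoch_len
    feats = []
    for e in range(n_epochs):
        seg = eeg[:, e * epoch_len:(e + 1) * epoch_len]  # channels x samples
        f = np.stack([
            seg.mean(axis=1),
            seg.std(axis=1),
            np.abs(np.diff(seg, axis=1)).sum(axis=1),
            seg.max(axis=1) - seg.min(axis=1),
        ])                                               # features x channels
        feats.append(f)
    return np.stack(feats)                               # epochs x features x channels
```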
Training set: tensor Xtrain (I time epochs × J features × K channels) with labels ytrain (seizure / non-seizure); epochs come from pre-seizure, seizure, and post-seizure periods (Pre1, Seizure1, Post1, …, Pre3, Seizure3, Post3).
Test set: Xtest, with held-out labels ytest.
– Modify multiway regression models, e.g., multilinear PLS [Bro, 1996; Bro et al., 2001], to act as classifiers.
[Figure: multilinear PLS on Xtrain (I × J × K) yields scores Ttrain (I × R) and weight vectors WJ, WK; Xtest is projected onto scores Ttest, and linear discriminant analysis on the scores predicts ytest]
– Unfold the data (Time Epochs × Features · Channels) and apply two-way classification, e.g., SVM.
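The unfolding baseline can be sketched as follows; to keep the sketch self-contained, a simple nearest-centroid classifier stands in for the SVM/LDA step (the helper names are my own).

```python
import numpy as np

def unfold_epochs(X):
    """Unfold (epochs x features x channels) to (epochs x features*channels)."""
    return X.reshape(X.shape[0], -1)

def nearest_centroid_fit(X2, y):
    """Class centroid per label; a stand-in for SVM/LDA training."""
    return {c: X2[y == c].mean(axis=0) for c in np.unique(y)}

def nearest_centroid_predict(centroids, X2):
    """Assign each row to the class with the nearest centroid."""
    classes = list(centroids)
    D = np.stack([np.linalg.norm(X2 - centroids[c], axis=1) for c in classes])
    return np.array([classes[i] for i in D.argmin(axis=0)])
```

Usage: unfold Xtrain, fit on (unfolded Xtrain, ytrain), then predict on the unfolded Xtest.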
– We need models to capture the underlying sparse factors in sparse tensors with missing entries.
– This is also important in practice.
– Algorithms suffer from the local minima problem; in practice, different solutions may lead us to interpret the results differently.
– We need classification models for tensors that are as good as state-of-the-art two-way classification approaches such as SVMs.
References:
– Social network analysis [Tensor Toolbox & Poblano Toolbox (by Sandia)]: … Decompositions, SAND2009-0857, Feb. 2009.
– Understanding epileptic seizures [PLS Toolbox (by Eigenvector Research)]: 23(13): i10-i18, 2007; 29th Int. Conf. IEEE Engineering in Medicine and Biology Society, 2007.
– Survey: IEEE Transactions on Knowledge and Data Engineering, 21(1): 6-20, 2009.
Evrim Acar, Sandia National Laboratories, eacarat@sandia.gov