Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames



SLIDE 1

Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames

Geneviève Robin¹, Hoi-To Wai², Julie Josse¹, Olga Klopp³, Éric Moulines¹

December 6, 2018. Thirty-second Conference on Neural Information Processing Systems

Poster #87, 210 & 230 AB, 5-7pm

¹École Polytechnique, ²University of Hong Kong, ³ESSEC Business School

SLIDE 2

Motivation: species monitoring

White-headed duck: endangered

  • lead poisoning
  • wetland loss

Eurasian curlew: declining

  • lead poisoning
  • habitat destruction
  • disturbances

Site      2008   2009   2010
site 1      NA     16     32
site 2     299    286    346
site 3      NA     96    151
site 4      NA     NA     NA
site 5      NA     NA     NA
site 6    4647   6054   2442
site 7      16     45     30
site 8    5916   6485   1249

Site   Surface   Country   Latitude
1      0.35      Algeria   36.64
2      15.4      Tunisia   34.11
3      1.12      Libya     35.75
4      0.34      Morocco   35.56
5      2.8       Algeria   34.49
6      2.6       Algeria   35.91
7      0.98      Tunisia   35.75
8      7.2       Morocco   30.36

Year   Spring N/O   Spring N/E   Winter S/O
2008   0,499        1,672        0,505
2009   0,175        2,527        0,215
2010   0,36         1,453        0,290

Y: waterbird counts
U: site and year covariates

1) Characteristics of the data

  • Mixed: categorical, real and discrete
  • Large scale: 25,000+ survey sites
  • Incomplete: missing values
  • Side information: row & column covariates

2) Goal: estimate

  • Main effects: effect of covariates
  • Interactions: the remaining effects
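To make the "incomplete" characteristic concrete, here is a small sketch that stores the eight-site count table from this slide with explicit missing values and derives the set of observed entries. The variable names (`counts`, `mask`) and the choice of `None` to encode NA are assumptions made for illustration, not notation from the slides.

```python
# Waterbird counts from the slide: one row per site, one column per year.
# None encodes a missing value (NA); this encoding is an illustrative choice.
NA = None
years = [2008, 2009, 2010]
counts = {
    1: [NA, 16, 32],
    2: [299, 286, 346],
    3: [NA, 96, 151],
    4: [NA, NA, NA],
    5: [NA, NA, NA],
    6: [4647, 6054, 2442],
    7: [16, 45, 30],
    8: [5916, 6485, 1249],
}

# Boolean mask of observed entries; estimation only uses these cells.
mask = {site: [v is not None for v in row] for site, row in counts.items()}

n_total = len(counts) * len(years)
n_obs = sum(sum(row) for row in mask.values())
fully_missing = [site for site, row in mask.items() if not any(row)]

print(f"{n_obs} of {n_total} entries observed")
print("sites with no observation at all:", fully_missing)
```

Sites 4 and 5 have no counts at all, so anything said about them has to come from the covariates and from the structure shared with other sites.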

SLIDE 3

Low-rank Interaction and Sparse main effects

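Heterogeneous exponential family parametric model:

$f_{Y_{ij}}(y) = f_{ij}(y, X_{ij})$

Here $f_{ij}$ depends on the entry and $X_{ij}$ is the (unknown) parameter.

Main effects and interactions in parameter space:

$X_{ij} = \langle u_{ij}, \alpha \rangle + L_{ij}$

where $\langle u_{ij}, \alpha \rangle$ is the regression term and $L_{ij}$ is the “residual”. In matrix form:

$X = \sum_{k=1}^{q} \alpha_k U(k) + L$

The sum is a sparse regression on the dictionary $U(1), \dots, U(q)$, and $L$ is the low-rank design.

Estimation:

$(\hat\alpha, \hat L) \in \operatorname{argmin} \; \mathcal{L}(Y; X) + \lambda_1 \|L\|_* + \lambda_2 \|\alpha\|_1$

Two-fold generalisation of “sparse plus low-rank” matrix recovery:

  • 1. general sparsity pattern
  • 2. exponential family noise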
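As an illustration of the model and the penalised objective on this slide, the following NumPy sketch builds the parameter matrix from main effects and interactions and evaluates the objective on the observed entries. Everything beyond the formula itself is an assumption: the dictionary is stored as an array `U` of shape `(q, n, p)`, missing values are given by a boolean `mask`, and two single-family losses (Gaussian with unit variance, Poisson with log link) stand in for the heterogeneous exponential family likelihood. The function names are hypothetical.

```python
import numpy as np

def parameter_matrix(alpha, U, L):
    """X = sum_k alpha_k U(k) + L, with U of shape (q, n, p)."""
    return np.tensordot(alpha, U, axes=1) + L

def gaussian_nll(Y, X, mask):
    """Half squared error over observed entries (Gaussian, unit variance)."""
    resid = np.where(mask, Y - X, 0.0)
    return 0.5 * np.sum(resid ** 2)

def poisson_nll(Y, X, mask):
    """Poisson negative log-likelihood with log link, up to a constant in Y."""
    return np.sum(np.where(mask, np.exp(X) - Y * X, 0.0))

def objective(Y, mask, alpha, U, L, lam1, lam2, nll=gaussian_nll):
    """Data-fitting term + lam1 * nuclear norm of L + lam2 * l1 norm of alpha."""
    X = parameter_matrix(alpha, U, L)
    nuclear_norm = np.linalg.svd(L, compute_uv=False).sum()
    return nll(Y, X, mask) + lam1 * nuclear_norm + lam2 * np.abs(alpha).sum()
```

For mixed data, one would pick the loss column by column (for example Poisson for counts, Gaussian for real-valued columns) and sum the contributions; the sketch keeps a single family to stay short.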

SLIDE 4

Statistical guarantees

$(\hat\alpha, \hat L) \in \operatorname{argmin} \; \mathcal{L}(Y; X) + \lambda_1 \|L\|_* + \lambda_2 \|\alpha\|_1$

Theorem 1: Near optimal error bounds for main effects and interactions

$\|\hat\alpha - \alpha^0\|_2^2 \lesssim \frac{\|\alpha^0\|_1}{\pi} \times \frac{\max_k \|U(k)\|_1}{\kappa^2} + D_\alpha$

$\|\hat L - L^0\|_F^2 \lesssim \frac{\operatorname{rank}(L^0) \max(n, p)}{\pi} + D_L$

Convergence results

Mixed Coordinate Gradient Descent (MCGD) Algorithm:

  • proximal update for $\alpha$
  • conditional gradient/Frank-Wolfe update for $L$

Theorem 2: The MCGD method converges to an $\epsilon$-solution in $O(1/\epsilon)$ iterations

Sublinear convergence and computationally efficient
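The two updates named on this slide can be sketched as follows. This is not the authors' implementation: it assumes a Gaussian loss, a fixed proximal step size, the classical 2/(t+2) Frank-Wolfe step, and an epigraph treatment of the nuclear-norm penalty (an auxiliary variable `R` with `||L||_* <= R <= r_ub`), which is one common way to run a conditional gradient step on a penalised problem. The function name `mcgd_sketch` and the bound `r_ub` are hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def mcgd_sketch(Y, U, lam1, lam2, r_ub, n_iter=200):
    """Alternate a proximal step on alpha and a Frank-Wolfe step on (L, R).

    Y: (n, p) array with NaN for missing entries; U: (q, n, p) dictionary.
    r_ub: assumed upper bound on the nuclear norm of L (hypothetical parameter).
    """
    mask = ~np.isnan(Y)
    Y0 = np.where(mask, Y, 0.0)
    q, n, p = U.shape
    alpha, L, R = np.zeros(q), np.zeros((n, p)), 0.0
    # Masked dictionary, one row per U(k); its squared spectral norm is the
    # Lipschitz constant of the alpha-gradient for the squared loss.
    A = (U * mask).reshape(q, -1)
    step = 1.0 / max(np.linalg.norm(A, 2) ** 2, 1e-12)
    for t in range(n_iter):
        # Proximal gradient update for alpha (L fixed).
        G = np.where(mask, np.tensordot(alpha, U, axes=1) + L - Y0, 0.0)
        alpha = soft_threshold(alpha - step * (A @ G.ravel()), step * lam2)
        # Frank-Wolfe update for (L, R) (alpha fixed): the linear subproblem
        # over {||L||_* <= R <= r_ub} is solved by the top singular pair of G.
        G = np.where(mask, np.tensordot(alpha, U, axes=1) + L - Y0, 0.0)
        u, s, vt = np.linalg.svd(G, full_matrices=False)
        if s[0] > lam1:
            L_dir, R_dir = -r_ub * np.outer(u[:, 0], vt[0]), r_ub
        else:
            L_dir, R_dir = np.zeros_like(L), 0.0
        gamma = 2.0 / (t + 2.0)
        L = (1.0 - gamma) * L + gamma * L_dir
        R = (1.0 - gamma) * R + gamma * R_dir
    return alpha, L
```

The sketch calls a full SVD at every iteration for brevity; only the leading singular pair is used, so at scale one would replace it with a truncated or power-iteration routine, which is what makes a conditional gradient step attractive for large data frames.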

SLIDE 5

[Figure: three panels plotting $\|\hat\alpha - \alpha^0\|_2^2$, $\|\hat L - L^0\|_F^2$ and Time (s) against the size of the data frame (0 to 4e+07), comparing LORIS with a two-step method (plot legend: MIEL, group mean + svd).]

[Figure: Imputation error. Relative RMSE against the percentage of missing values for the methods CA, GLMM, LORI, MEAN and TRIM.]

  • Fast in large dimensions
  • Estimation of main effects constant with dimensions
  • Robust to large proportions of missing values

SLIDE 6

Poster #87 210 & 230 AB 5-7pm