Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames


  1. Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames

     Geneviève Robin (1), Hoi-To Wai (2), Julie Josse (1), Olga Klopp (3), Éric Moulines (1)
     (1) École Polytechnique, (2) University of Hong Kong, (3) ESSEC Business School

     Thirty-second Conference on Neural Information Processing Systems, December 6, 2018
     Poster #87, 210 & 230 AB, 5-7pm

  2. Motivation: species monitoring

     Waterbird counts (Y), by site and year:

     | Site   | 2008 | 2009 | 2010 |
     |--------|------|------|------|
     | site 1 | NA   | 16   | 32   |
     | site 2 | 299  | 286  | 346  |
     | site 3 | NA   | 96   | 151  |
     | site 4 | NA   | NA   | NA   |
     | site 5 | NA   | NA   | NA   |
     | site 6 | 4647 | 6054 | 2442 |
     | site 7 | 16   | 45   | 30   |
     | site 8 | 5916 | 6485 | 1249 |

     Site and year covariates (U):

     | Site | Surface | Country | Latitude |
     |------|---------|---------|----------|
     | 1    | 0.35    | Algeria | 36.64    |
     | 2    | 15.4    | Tunisia | 34.11    |
     | 3    | 1.12    | Libya   | 35.75    |
     | 4    | 0.34    | Morocco | 35.56    |
     | 5    | 2.8     | Algeria | 34.49    |
     | 6    | 2.6     | Algeria | 35.91    |
     | 7    | 0.98    | Tunisia | 35.75    |
     | 8    | 7.2     | Morocco | 30.36    |

     | Year | Spring N/O | Spring N/E | Winter S/O |
     |------|------------|------------|------------|
     | 2008 | 0.499      | 1.672      | 0.505      |
     | 2009 | 0.175      | 2.527      | 0.215      |
     | 2010 | 0.36       | -1.453     | 0.290      |

     White-headed duck: endangered
     • lead poisoning
     • wetland loss

     Eurasian curlew: declining
     • lead poisoning
     • habitat destruction
     • disturbances

     1) Characteristics of the data
     • Mixed: categorical, real and discrete
     • Large scale: 25,000+ survey sites
     • Incomplete: missing values
     • Side information: row & column covariates

     2) Goal: estimate
     • Main effects: effect of covariates
     • Interactions: the remaining effects

  3. Low-rank Interaction and Sparse main effects

     Heterogeneous exponential family parametric model. The model depends on the entry, and $X_{ij}$ is the (unknown) parameter:

     $$f_{Y_{ij}}(y) = f_{ij}(y, X_{ij})$$

     Main effects and interactions in parameter space:

     $$X = \sum_{k=1}^{q} \alpha_k U^k + L, \qquad X_{ij} = \langle u_{ij}, \alpha \rangle + L_{ij}$$

     The sum is a regression on a dictionary (sparse regression); $L$ is a "residual" term (low-rank design).

     Estimation:

     $$(\hat\alpha, \hat L) \in \operatorname{argmin} \; \mathcal{L}(Y; X) + \lambda_1 \|L\|_\star + \lambda_2 \|\alpha\|_1$$

     Two-fold generalisation of "sparse plus low-rank" matrix recovery:
     1. general sparsity pattern
     2. exponential family noise
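
To make the penalised criterion on this slide concrete, here is a minimal NumPy sketch that evaluates it for one particular exponential family. The slide does not fix the family: the Poisson model with log link used below is an assumption, chosen because the motivating data are counts. The names `parameter_matrix`, `penalized_objective` and `mask` are hypothetical and are not taken from the authors' software.

```python
import numpy as np

def parameter_matrix(alpha, U, L):
    """X = sum_k alpha_k * U[k] + L, with the dictionary U of shape (q, n, p)."""
    return np.tensordot(alpha, U, axes=1) + L

def penalized_objective(Y, mask, alpha, U, L, lam1, lam2):
    """Negative log-likelihood + lam1 * nuclear norm of L + lam2 * l1 norm of alpha.

    Assumption: Poisson entries with log link, so the per-entry negative
    log-likelihood is exp(X_ij) - Y_ij * X_ij (up to a constant in Y).
    `mask` is True where Y is observed; missing entries are ignored.
    """
    X = parameter_matrix(alpha, U, L)
    Y_filled = np.where(mask, Y, 0.0)          # neutralise NaNs in missing cells
    nll = np.sum(mask * (np.exp(X) - Y_filled * X))
    nuclear = np.linalg.norm(L, ord="nuc")     # sum of singular values of L
    l1 = np.abs(alpha).sum()
    return nll + lam1 * nuclear + lam2 * l1

# Tiny usage example with one missing count.
Y = np.array([[1.0, np.nan], [2.0, 3.0]])
mask = ~np.isnan(Y)
U = np.ones((1, 2, 2))                          # a single constant covariate
value = penalized_objective(Y, mask, np.zeros(1), U, np.zeros((2, 2)), 1.0, 1.0)
```

With all parameters at zero, every observed entry contributes exp(0) = 1 to the likelihood term and both penalties vanish, so `value` is the number of observed entries.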

  4. Statistical guarantees

     Near optimal error bounds for main effects and interactions.

     Theorem 1:

     $$\|\hat\alpha - \alpha^0\|_2^2 \lesssim \frac{\|\alpha^0\|_0 \, \max_k \|U^{(k)}\|_1}{\kappa^2 \pi} + D_\alpha$$

     $$\|\hat L - L^0\|_F^2 \lesssim \frac{\operatorname{rank}(L^0) \, \max(n, p)}{\pi} + D_L$$

     Convergence results

     Mixed Coordinate Gradient Descent (MCGD) algorithm for

     $$(\hat\alpha, \hat L) \in \operatorname{argmin} \; \mathcal{L}(Y; X) + \lambda_1 \|L\|_\star + \lambda_2 \|\alpha\|_1$$

     • proximal update for $\alpha$
     • conditional gradient/Frank-Wolfe update for $L$

     Sublinear convergence and computationally efficient.

     Theorem 2: The MCGD method converges to an $\epsilon$-solution in $O(1/\epsilon)$ iterations.
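
The two updates named on this slide can be sketched in a few lines of NumPy. This is an illustration under simplifying assumptions, not the authors' MCGD implementation: the loss is quadratic on observed entries rather than a general exponential-family likelihood, the nuclear-norm penalty is replaced by a nuclear-norm ball constraint of hypothetical radius `radius` so that a plain Frank-Wolfe step applies, and the step size 2/(t+2) is the textbook Frank-Wolfe choice. The function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def mcgd_sketch(Y, mask, U, lam2, radius, n_iter=200):
    """Alternate a proximal step on alpha and a Frank-Wolfe step on L.

    Assumed objective (not the slide's general one):
        0.5 * ||mask * (sum_k alpha_k U[k] + L - Y)||_F^2 + lam2 * ||alpha||_1
        subject to ||L||_* <= radius.
    """
    q, n, p = U.shape
    Y0 = np.where(mask, Y, 0.0)              # zero out missing entries
    A = (U * mask).reshape(q, -1)            # masked dictionary, one row per U[k]
    lip = max(np.linalg.norm(A, 2) ** 2, 1e-12)  # Lipschitz constant for alpha
    alpha = np.zeros(q)
    L = np.zeros((n, p))
    for t in range(n_iter):
        # Proximal gradient update for alpha.
        resid = mask * (np.tensordot(alpha, U, axes=1) + L) - Y0
        grad_alpha = A @ resid.ravel()
        alpha = soft_threshold(alpha - grad_alpha / lip, lam2 / lip)
        # Frank-Wolfe update for L: the gradient w.r.t. L is the masked residual,
        # and the linear minimiser over the nuclear ball is a rank-one matrix.
        resid = mask * (np.tensordot(alpha, U, axes=1) + L) - Y0
        u, _, vt = np.linalg.svd(resid, full_matrices=False)
        S = -radius * np.outer(u[:, 0], vt[0])
        gamma = 2.0 / (t + 2.0)
        L = (1.0 - gamma) * L + gamma * S
    return alpha, L
```

Each Frank-Wolfe step only uses the leading singular pair of the gradient. The sketch computes a full SVD for brevity; a truncated solver would be the natural replacement on large data frames.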

  5. [Figure: three panels plotted against the size of the data frame (0 to 4e+07): estimation error $\|\hat\alpha - \alpha^0\|_2^2$, estimation error $\|\hat L - L^0\|_F^2$, and running time (s). Methods compared: LORIS, MIEL, Two-step, group mean + svd.]

     [Figure: imputation error, relative RMSE against the percentage of missing values (10 to 80). Methods compared: CA, GLMM, LORIS, MEAN, TRIM.]

     • Fast in large dimensions
     • Estimation of main effects constant with dimensions
     • Robust to large proportions of missing values

  6. Poster #87, 210 & 230 AB, 5-7pm
