

SLIDE 1

Joint variable and rank selection for parsimonious estimation of high dimensional matrices

Florentina Bunea, Department of Statistical Science, Cornell University. High-dimensional Problems in Statistics Workshop, ETH, September 2011.

SLIDE 2

1. Framework and motivation

2. Joint Rank and Row Selection (JRRS) Methods
   The construction of the one-step JRRS estimator
   Row and rank sparsity oracle inequalities via one-step JRRS
   One-step JRRS to select the best estimator from a finite list

3. Two-step JRRS estimators
   Rank Constrained Group Lasso (RCGL)
   Adaptive RCGL for joint row and rank selection
   Row and rank sparsity oracle inequalities via two-step JRRS

4. Numerical performance and examples

5. Summary

SLIDE 3

A rank and row sparse model

  • Model: Y = XA + E, with E a noise matrix.
  • Data: m × n matrix Y and m × p matrix X.
  • Target: p × n matrix A ↔ pn unknown parameters.
  • Rank of A is r ≤ n ∧ p; number of non-zero rows of A is |J| ≤ p.
  • Row and rank sparse target ↔ r(|J| + n − r) free parameters.
  • Full rank + all rows + large n and p = hopeless, if m is small.
  • Low rank + small |J| = HOPE, if m is small.
  • Estimate A under joint rank and row constraints (a simulation sketch of such a model is given below).
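A minimal simulation sketch of a row and rank sparse model of this kind, for concreteness. The sizes, the Gaussian design, and the factorized construction of A below are illustrative assumptions, not the simulation design of the talk.

```python
import numpy as np

# Illustrative sizes only; the target A has rank r and |J| non-zero rows.
m, p, n, r, J_size = 30, 100, 10, 2, 15
rng = np.random.default_rng(0)

X = rng.normal(size=(m, p))                    # design matrix
J = rng.choice(p, size=J_size, replace=False)  # indices of the non-zero rows of A

# Rank-r, row-sparse target: the non-zero block of A is a product of thin factors.
A = np.zeros((p, n))
A[J] = rng.normal(size=(J_size, r)) @ rng.normal(size=(r, n))

E = rng.normal(size=(m, n))                    # iid Gaussian noise, sigma^2 = 1
Y = X @ A + E

# Free parameters of the target, r(|J| + n - r), versus the nominal p*n.
print(r * (J_size + n - r), "vs", p * n)
```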

SLIDE 4

Why a rank and row sparse Y = XA + E?

  • Multivariate response regression

Measure n response variables for m subjects: Yi ∈ R^n, 1 ≤ i ≤ m. Measure p predictor variables for m subjects: Xi ∈ R^p, 1 ≤ i ≤ m.
  • No (rank/row) constraints on A ⇐⇒ n separate univariate regressions.
  • Zero rows in A ⇐⇒ not all predictors enter the model.
  • Low rank of A ⇐⇒ only a few orthogonal scores are relevant.

Goal: estimation tailored to row and rank sparsity. Use only a subset of the predictors to construct a few scores, with high predictive power, under JOINT rank and row restrictions on A.

SLIDE 5

Why a row and rank sparse Y = XA + E? (Continued)

  • Supervised row and rank sparse PCA.
  • Provides framework for row and rank sparse PCA and CCA.
  • Building block in functional data analysis (with predictors).

Y = matrix of discretized trajectories for n subjects; X = matrix of basis functions evaluated at discrete data points + possibly other predictors of interest.

  • Building block in multiple time series analysis.

(Macro-economics and forecasting.) Y = matrix of n time series observed over m time periods (n types of interest rates); X = Y in the past + other predictive time series (other potentially connected macro-economic factors).

SLIDE 6

A historical perspective on sparse Y = XA + E

Rank Sparse Models

  • Reduced-Rank Regression: Y = XA + E, rank(A) = k known.

Asymptotic results as m → ∞: Anderson (1951, 1999, 2002); Rao (1979); Reinsel and Velu (1998); Izenman (1975, 2008).

  • Low rank approximations: Y = XA + E, rank(A) = r unknown.

Adaptive estimation + finite sample theoretical analysis, valid for any m, n, p and any r. Rank Selection Criterion (RSC): Bunea, She and Wegkamp (2011). Nuclear Norm Penalized (NNP) estimators: Candès and Plan, Candès and Tao (2009+); Rohde and Tsybakov (2011); Negahban and Wainwright (2011); Koltchinskii, Lounici and Tsybakov (2011).

SLIDE 7

A historical perspective on sparse Y = XA + E (continued)

Row-Sparse Models

  • Predictor Xj not in the model ⇐⇒ the j-th row of A is zero.

  • Individual variable selection in multivariate response regression
  • Group selection in univariate response regression.

Popular method: the Group Lasso. Yuan and Lin (2006); Lounici, Pontil, Tsybakov and van de Geer (2011).

No rank and row sparse models; no adaptive methods tailored to both.

SLIDE 8

Joint rank and row selection: JRRS

  • We develop new criteria for joint rank and predictor selection.
  • r ≤ n ∧ |J|; rank(X) = q ≤ m ∧ p; |J| ≤ p; r and J unknown.
  • Optimal risk rates, achievable adaptively by the G-Lasso, RSC/NNP, and (to show) JRRS:
      G-Lasso: |J| n, in row-sparse models.
      RSC or NNP: (p + n) r, in rank-sparse models.
      JRRS: (|J| + n) r, in rank and row sparse models.
  • The JRRS rates are never worse, and typically much better.

SLIDE 9

A penalized least squares estimator

  • Y is an m × n matrix; X is an m × p matrix.
  • ‖M‖²_F is the sum of the squared entries of M ∈ M_{p×n}.
  • A candidate model B ∈ M_{p×n} has number of parameters
    (n + |J(B)| − rank(B)) rank(B) ≤ (n + |J(B)|) rank(B).

The one-step JRRS estimator

    Â = argmin_{B ∈ M_{p×n}} { ‖Y − XB‖²_F + c σ² (2n + |J(B)|) rank(B) }.

  • Generalizes to multivariate response models the AIC/Cp-type criteria developed for univariate response (a sketch of evaluating the criterion follows below).
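A minimal sketch of evaluating this penalized criterion for a given candidate B; the function name, the constant c = 2, and the use of numpy's rank routine are illustrative assumptions, not the talk's tuned choices.

```python
import numpy as np

def jrrs_criterion(Y, X, B, sigma2, c=2.0):
    # One-step JRRS objective: squared Frobenius residual plus
    # c * sigma^2 * (2n + |J(B)|) * rank(B); c = 2.0 is an arbitrary illustrative constant.
    resid = np.linalg.norm(Y - X @ B, ord="fro") ** 2
    J_B = int(np.sum(np.any(B != 0, axis=1)))   # number of non-zero rows of B
    rank_B = np.linalg.matrix_rank(B)
    n = Y.shape[1]
    return resid + c * sigma2 * (2 * n + J_B) * rank_B
```

Minimizing this over all of M_{p×n} is the intractable part; the finite-list and two-step variants described later address that.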

SLIDE 10

More on the one-step JRRS penalty

  • B ∈ M_{p×n} with |J(B)| non-zero rows.
  • JRRS penalty: pen(B) ∝ σ² (n + |J(B)|) rank(B).
  • B ∈ M_{p×n} (ignoring non-zero rows), rank(X) = q.
  • RSC penalty: pen(B) ∝ σ² (n + q) rank(B).
  • Squared "error level" in the full model: E d₁²(PE) ≈ σ² (n + q), for E with iid sub-Gaussian entries and P = X(X′X)⁻X′.
  • JRRS generalizes RSC to allow for variable selection.
  • To reduce rank and select variables, work with E d₁²(P_{J(B)} E) ≈ σ² (n + |J(B)|) (a small Monte Carlo sketch follows below).
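A small Monte Carlo sketch of the "error level" heuristic: d₁(PE) concentrates around σ(√n + √q), so E d₁²(PE) is of order σ²(n + q). The sizes and the number of replications below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
m, p, n, sigma = 200, 30, 15, 1.0
X = rng.normal(size=(m, p))
P = X @ np.linalg.pinv(X)                  # projection onto the column space of X
q = np.linalg.matrix_rank(X)

# Largest squared singular value of PE over Monte Carlo draws of the noise E.
d1_sq = [np.linalg.norm(P @ rng.normal(scale=sigma, size=(m, n)), ord=2) ** 2
         for _ in range(200)]
print(np.mean(d1_sq), "~", sigma**2 * (np.sqrt(n) + np.sqrt(q)) ** 2, "i.e. of order sigma^2 (n + q)")
```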

SLIDE 11

Oracle-type bounds for the risk of the one-step JRRS

  • rank(A) = r; the non-zero rows of A have indices in J(A) = J.

Adaptation to row and rank sparsity via one-step JRRS. For all A and X,

    E ‖XA − XÂ‖²_F ≲ inf_B { ‖XA − XB‖²_F + σ² (n + |J(B)|) r(B) } ≲ σ² {n + |J|} r.

  • RHS = the best bias-variance trade-off across B.
    Â is adaptive: it mimics the behavior of an optimal estimator computed knowing r and J. Minimax rate, under suitable conditions.
  • Bound valid for any m, n, p.

SLIDE 12

Select the best from a finite list

  • If p > 20, JRRS estimation over all B becomes computationally intractable.
  • B = {B1, . . . , BL} = a finite (large) collection of (random) matrices with different sparsity patterns; may depend on the data X and Y.

Optimal selection from a finite list via JRRS. For all A and X,

    E ‖XA − XÂ‖²_F ≲ inf_{1 ≤ j ≤ L} { ‖XA − XBj‖²_F + σ² (n + |J(Bj)|) r(Bj) },

where

    Â = argmin_{B ∈ B} { ‖Y − XB‖²_F + c σ² (2n + |J(B)|) rank(B) }.
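A minimal sketch of the finite-list selection step, reusing the hypothetical jrrs_criterion helper from the earlier sketch; candidate names and the list structure are illustrative.

```python
import numpy as np

def select_from_list(Y, X, B_list, sigma2):
    # Evaluate the one-step JRRS criterion on each candidate and keep the minimizer.
    scores = [jrrs_criterion(Y, X, B, sigma2) for B in B_list]
    return B_list[int(np.argmin(scores))]
```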

SLIDE 13

Rank Constrained Group Lasso: main building block

  • One-step JRRS penalty pen(B) ∝ (n + |J(B)|) rank(B).
    J(B) forces complete enumeration; for large p that's a problem!
  • Idea: use the convex relaxation ‖B‖_{2,1} = Σ_{j=1}^p ‖bj‖₂.
  • Set λ_k ∝ σ √k d₁(X), for each k.
  • B̂_k = argmin_{rank(B) ≤ k} { ‖Y − XB‖²_F + λ_k ‖B‖_{2,1} }.

B̂_k is a Rank Constrained G-Lasso (RCGL). Other "group" penalties are possible (a heuristic optimization sketch follows below).
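To make the RCGL objective concrete, here is a naive heuristic that alternates a proximal (row soft-thresholding) gradient step for the ‖·‖_{2,1} penalty with a hard rank-k truncation. This is only a sketch under simple assumptions: it is not the efficient algorithm of Bunea, She and Wegkamp (2011), and the rank constraint makes the problem non-convex, so convergence is not guaranteed.

```python
import numpy as np

def rcgl_heuristic(Y, X, k, lam, n_iter=500, step=None):
    # Heuristic for:  minimize ||Y - XB||_F^2 + lam * ||B||_{2,1}  s.t. rank(B) <= k.
    m, p = X.shape
    n = Y.shape[1]
    if step is None:
        step = 1.0 / (2 * np.linalg.norm(X, ord=2) ** 2)   # 1 / Lipschitz constant of the gradient
    B = np.zeros((p, n))
    for _ in range(n_iter):
        G = B - step * (2 * X.T @ (X @ B - Y))              # gradient step on the squared loss
        # Row-wise soft thresholding: prox of step * lam * ||.||_{2,1}.
        norms = np.linalg.norm(G, axis=1, keepdims=True)
        shrink = np.maximum(1 - step * lam / np.maximum(norms, 1e-12), 0.0)
        B = shrink * G
        # Project onto the rank-k set by truncated SVD.
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        B = (U[:, :k] * s[:k]) @ Vt[:k]
    return B
```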

SLIDE 14

B̂_k = argmin_{rank(B) ≤ k} { ‖Y − XB‖²_F + λ_k ‖B‖_{2,1} }.

  • For k = n ∧ p, the estimator B̂_k is the G-Lasso.
  • For λ = 0, the estimator B̂_k is a reduced-rank estimator.
  • Otherwise, B̂_k is a synthesis of the two; a new algorithm is needed.
    Efficient algorithm: Bunea, She and Wegkamp (2011).
  • Works in high dimensions.

SLIDE 15

Two-step JRRS: Method 1

Method 1
Step 1. Use the Rank Selection Criterion (RSC) to estimate r consistently by r̂.
Step 2. Compute the Rank Constrained G-Lasso estimator B̂_k with k = r̂ to obtain the final estimator B̃ = B̂_{r̂}.

Major practical advantage: easy tuning, backed up by theory.

  • For Step 1: the same tuning parameter of RSC gives the best MSE and the correct rank. Can use CV safely; other alternatives exist.
  • For Step 2: we want the best MSE; CV is safe.

(A sketch of this two-step pipeline follows below.)
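A hedged sketch of Method 1. The thresholding rule used for the RSC step below (count singular values of PY above a multiple of σ(√n + √q)) and the constant c are illustrative assumptions; rcgl_heuristic is the sketch from the previous section, not the paper's algorithm.

```python
import numpy as np

def rsc_rank(Y, X, sigma, c=2.0):
    # Count singular values of PY above a threshold of order sigma*(sqrt(n)+sqrt(q)).
    P = X @ np.linalg.pinv(X)              # projection onto the column space of X
    q = np.linalg.matrix_rank(X)
    n = Y.shape[1]
    d = np.linalg.svd(P @ Y, compute_uv=False)
    return int(np.sum(d >= c * sigma * (np.sqrt(n) + np.sqrt(q))))

def two_step_jrrs_method1(Y, X, sigma, lam):
    # Step 1: estimate the rank by the RSC sketch; Step 2: RCGL at that rank.
    r_hat = max(rsc_rank(Y, X, sigma), 1)
    return rcgl_heuristic(Y, X, k=r_hat, lam=lam)
```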

SLIDE 16

Two-step JRRS: Method 2

Method 2
Step 1. Pre-specify a grid of values Λ for λ. Use RCGL to construct B = { B̂_{k,λ} : k ∈ {1, . . . , q}, λ ∈ Λ }.
Step 2. Compute

    B̂ = argmin_{B ∈ B} { ‖Y − XB‖²_F + pen(B) },   with pen(B) ∝ σ² (n + |J(B)|) rank(B).

  • Requires a 2-D grid search: more computationally involved than Method 1 (a sketch follows below).
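A sketch of the 2-D grid search in Method 2, again reusing the hypothetical rcgl_heuristic from the earlier sketch; the penalty constant c and the zero-row tolerance are illustrative assumptions.

```python
import itertools
import numpy as np

def two_step_jrrs_method2(Y, X, sigma2, lambdas, c=2.0):
    # Build RCGL candidates over the (k, lambda) grid, then pick the candidate
    # minimizing the penalized criterion with pen(B) ~ sigma^2 (n + |J(B)|) rank(B).
    n = Y.shape[1]
    q = np.linalg.matrix_rank(X)
    best, best_val = None, np.inf
    for k, lam in itertools.product(range(1, q + 1), lambdas):
        B = rcgl_heuristic(Y, X, k=k, lam=lam)
        J_B = int(np.sum(np.any(np.abs(B) > 1e-8, axis=1)))     # non-zero rows of B
        val = (np.linalg.norm(Y - X @ B, ord="fro") ** 2
               + c * sigma2 * (n + J_B) * np.linalg.matrix_rank(B))
        if val < best_val:
            best, best_val = B, val
    return best
```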

SLIDE 17

Oracle-type bounds for the risk of the two-step JRRS

  • Method 1 (RSC + RCGL) → B̃; Method 2 (RCGL + AIC-M) → B̂.

Adaptation to row and rank sparsity via two-step JRRS. For all A and for X satisfying Assumption 1,

    E ‖XA − XB̂‖²_F ≲ inf_B { ‖XA − XB‖²_F + σ² (n + |J(B)|) r(B) } ≲ σ² {n + |J(A)|} r(A).

If, in addition, d_r(XA) > 2√2 σ (√n + √q), the same inequality holds for B̃.

  • RHS = the best bias-variance trade-off across all matrices B.
    B̃ and B̂ are adaptive: they mimic the behavior of an optimal estimator computed knowing r(A) and J(A).
  • Bound valid for any m, n, p; computationally efficient.

SLIDE 18

Mild conditions on the design matrix

Assumption 1. There exist a set J ⊂ {1, . . . , p} and a number δ_J > 0 such that

    (1/m) ‖XB‖²_F ≥ δ_J Σ_{j∈J} ‖bj‖²₂,   for all B = [b1 ··· bp]ᵀ ∈ R^{p×n}.

  • Essentially requires only that a sub-matrix of X′X has a non-zero smallest eigenvalue.
    A mild condition (an illustrative check follows below).
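An illustrative check of Assumption 1 in the easiest case: for matrices B whose non-zero rows all lie in J, the smallest eigenvalue of the sub-Gram matrix (1/m) X_J′X_J is a valid δ_J. The full assumption allows arbitrary B, so this sketch only probes that restricted case.

```python
import numpy as np

def delta_J_restricted(X, J):
    # For B supported on the row set J:
    #   (1/m) ||XB||_F^2 = (1/m) ||X_J B_J||_F^2 >= lambda_min((1/m) X_J' X_J) * sum_{j in J} ||b_j||_2^2,
    # so the smallest eigenvalue of the sub-Gram matrix is a valid delta_J in that case.
    m = X.shape[0]
    XJ = X[:, list(J)]
    eigvals = np.linalg.eigvalsh(XJ.T @ XJ / m)
    return float(eigvals[0])
```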

SLIDE 19

Large p - small m numerical performance comparison

  • m = 30, |J| = 15, p = 100, n = 10, r = 2, σ² = 1.
  • Performance comparison between: rank and row reduction via RSC→RCGL and G-LASSO→RSC, only row reduction via G-LASSO, and only rank reduction via RSC.
  • All optimally tuned on a very large independent set.

    Method         MSE
    RSC→RCGL       363
    G-LASSO→RSC    402
    G-LASSO        511
    RSC           1905
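For concreteness, a hedged sketch of how an MSE column like the one above could be estimated for the two-step Method 1 sketch. The sizes match the slide, but the data-generating details and the tuning value lam are arbitrary (not optimally tuned), so the numbers will not reproduce the table; two_step_jrrs_method1 is the hypothetical helper from the earlier sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
m, p, n, r, J_size = 30, 100, 10, 2, 15
X = rng.normal(size=(m, p))
J = rng.choice(p, size=J_size, replace=False)
A = np.zeros((p, n))
A[J] = rng.normal(size=(J_size, r)) @ rng.normal(size=(r, n))

# Monte Carlo estimate of MSE = E ||XA - X B_hat||_F^2 over fresh noise draws.
mses = []
for rep in range(20):
    Y = X @ A + rng.normal(size=(m, n))
    B_hat = two_step_jrrs_method1(Y, X, sigma=1.0, lam=1.0)   # lam = 1.0 is arbitrary, not tuned
    mses.append(np.linalg.norm(X @ A - X @ B_hat, ord="fro") ** 2)
print(sum(mses) / len(mses))
```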

SLIDE 20

Large m - small p numerical performance comparison

  • m = 100, |J| = 15, p = 25, n = 25, r = 5, σ² = 1.
  • Performance comparison between: rank and row reduction via RSC→RCGL and G-LASSO→RSC, only row reduction via G-LASSO, and only rank reduction via RSC.
  • All optimally tuned on a very large independent set.

    Method         MSE
    RSC→RCGL       8.1
    G-LASSO→RSC    8.1
    RSC           11.5
    G-LASSO       17.7

SLIDE 21

A study of the effect of HIV infection on human cognitive abilities
  • HIV-Neuroimaging laboratory at Brown University, PI R. Cohen.
  • m = 62 HIV+ patients, also infected with Hepatitis C and with a history of drug abuse.

  • n = 13 neuro-cognitive indices (NCIs) from five domains: attention/working memory, speed of information processing, psychomotor abilities, executive function, and learning and memory.

  • p = 234 predictors: (a) clinical and demographic predictors and (b) brain volumetric and diffusion tensor imaging (DTI) derived measures of several white-matter regions of interest, such as fractional anisotropy, mean diffusivity, axial diffusivity, and radial diffusivity, along with all volumetrics × DTI interactions.

SLIDE 22

RSC and JRRS: two rank-1 models

  • Both methods: one new predictive score S.
  • Left = RSC; MSE = 193; S = a linear combination of the p = 234 predictors.
  • Right = JRRS; MSE = 138; S = a linear combination of the |J| = 10 selected predictors.
[Figure: bar plots of the weights assigned to the original predictors (x-axis: original predictors, including HIV_stage, demographic variables, DTI-derived measures, regional volumetrics, and their interactions; y-axis: weights). Left panel: RSC score. Right panel: JRRS score.]

SLIDE 23

  • JRRS selected rank 1 and only 10 predictors.
  • Education is one of them, confirming past findings.
  • The fractional anisotropy at the corpus callosum stands out, in terms of predictive power, among the very many DTI-derived measures.
  • A new finding in the lab, and its first quantitative confirmation.

SLIDE 24

Summary

Method                         | Adaptation to RR-sparsity | Assumptions on X and/or A                            | Restrictions on p
One-step JRRS (AIC-M)          | Yes                       | None                                                 | p ≤ 20
Two-step JRRS1 (RSC → RCGL)    | Yes                       | Restricted Eigenvalue; d_r(XA) > "noise level"       | None
Two-step JRRS2 (RCGL → AIC-M)  | Yes                       | Restricted Eigenvalue                                | None
GL → RSC                       | Yes                       | Mutual coherence et al.; min_j ‖aj‖₂ > noise level   | None

  • RSC → RCGL is easy to tune in practice, backed up by theory. Best!
  • RCGL → AIC-M tuning requires a search over a 2-D grid. Second best!
  • GL → RSC: (1) most restrictive theoretical assumptions; (2) requires tuning for consistent group selection, an open problem!

SLIDE 25

Summary: Our contribution

Jointly rank and row-sparse models and their estimation

1. Introduced jointly rank and row sparse models.

2. Offered new procedures tailored to the new class of models.

3. Showed that the one-step JRRS is a theoretically optimal adaptive procedure: finite sample oracle inequalities for E ‖XA − XÂ‖²_F, for all A and X.

4. Introduced the computationally efficient two-step JRRS.

5. The two-step JRRS estimators satisfy finite sample oracle inequalities under minimal conditions on X.

6. Guaranteed small E ‖XA − XÂ‖²_F if A has low rank and few non-zero rows. The analysis is valid for all m, n, p, rank r and J; in particular, r and |J| can grow with m and n.

SLIDE 26

Bibliography and acknowledgment

Talk based on

  • Florentina Bunea, Yiyuan She and Marten Wegkamp. Joint variable and rank selection for parsimonious estimation of high dimensional matrices. Cornell University Technical Report, 2011.

  • Florentina Bunea, Yiyuan She and Marten Wegkamp. Optimal selection of reduced rank estimators of high-dimensional matrices. Annals of Statistics, Vol. 39, 2011.

  • Research partially supported by NSF-DMS 1007444.
