Graphlet Screening (GS) Achieves Optimal Rate in Variable Selection - PowerPoint PPT Presentation

Graphlet Screening (GS) Achieves Optimal Rate in Variable Selection Jiashun Jin Carnegie Mellon University Collaborated with Cun-Hui Zhang (Rutgers) Qi Zhang (Univ. of Pittsburgh) Jiashun Jin Graphlet Screening (GS)

Variable selection Y = X β + z , X = X n , p , z ∼ N (0 , I n ) ◮ p ≫ n ≫ 1 ◮ signals are rare and weak ◮ let G = X ′ X be the Gram matrix ◮ diagonals of G are normalized to 1 ◮ G is sparse (few large entries each row) Jiashun Jin Graphlet Screening (GS)

Subset selection 2 + λ 2 1 2 � Y − X β � 2 2 � β � 0 ◮ L 0 -penalization method ◮ Variants: Cp, AIC, BIC, RIC ◮ Computationally challenging Mallows (1973), Akaike (1974), Schwartz (1978), Foster & George (1994) Jiashun Jin Graphlet Screening (GS)

The lasso 1 2 � Y − X β � 2 2 + λ � β � 1 ◮ L 1 -penalization method; Basis Pursuit ◮ Widely used ◮ computationally efficient even when p is large ◮ in the noiseless case , if signals sufficiently sparse, equivalent to L 0 -penalization Chen et al. (1998); Tibshirani (1996); Donoho (2006) Jiashun Jin Graphlet Screening (GS)

Limitation of L 0 -Penalization, I Ex. Y = X β + z , z ∼ N (0 , I n ), β j take values from { 0 , τ } and D 0 . . . 0   0 D . . . 0 � � 1 a G = X ′ X =    , D = . . . ...  . . .  a 1 . . .  0 0 . . . D { 1 , 2 , . . . p } partitions into 3 types of 2 × 2 blocks: ◮ I. No signal ◮ II. One signal ◮ III. Two signals Jiashun Jin Graphlet Screening (GS)

Limitation of L 0 penalization, II ◮ one-stage method ◮ one tuning parameter ◮ does not exploit ‘local’ graphical structure Therefore, many penalization methods (e.g. lasso, SCAD, MC+, Dantzig selector) are non-optimal, as L 0 -penalization is the ‘idol’ these methods mimic ‘local’: neighboring nodes in geodesic distance of a graph (TBD) Jiashun Jin Graphlet Screening (GS)

Where are the signals? Tukey, J.W. (1965). Which part of the sample contains the information, Proc. Natl. Sci. Acad. John Wilder Tukey (1915-2000) Jiashun Jin Graphlet Screening (GS)

Graph of Strong Dependence (GOSD) GOSD is the graph G = ( V , E ): ◮ V = { 1 , 2 , . . . , p } : each variable is a node ◮ An edge between nodes i and j iff 1 � ≥ � � � G ( i , j ) log( p ) , say ◮ G = X ′ X sparse = ⇒ G sparse Jiashun Jin Graphlet Screening (GS)

Signal sparsity and graph sparsity ◮ Despite its sparsity, G is usually complicate ◮ Denote the support of β by S = S ( β ) = { 1 ≤ i ≤ p , β i � = 0 } Restricting nodes to S forms a subgraph G S ◮ Key insight : G S decomposes into many small-size components that are disconnected to each other Component: a maximal connected subgraph Jiashun Jin Graphlet Screening (GS)

For today Graphlet Screening (GS): ◮ gs -step: graphlet screening by sequential χ 2 -tests ◮ gc -step: graphlet cleaning by Penalized MLE ◮ Focus: rare and weak signals Jiashun Jin Graphlet Screening (GS)

Graphlet screening (gs-step), Initial stage Y = X β + z , X = X n , p , z ∼ N (0 , I n ); G : GOSD ◮ Fix m ≥ 1 (small) ◮ Let {G t : 1 ≤ t ≤ T } be all connected subgraphs of G with size ≤ m ◮ arranged by size, ties breaking lexicographically: p = 10, m = 3, T = 30; 2 1 1 4 {G t , 1 ≤ t ≤ T } : 5 3 { 1 } , { 2 } , . . . { 10 } 10 { 1 , 2 } , { 1 , 7 } , . . . , { 9 , 10 } 6 { 1 , 2 , 4 } , { 1 , 2 , 7 } , . . . , { 8 , 9 , 10 } 7 9 8 8 Jiashun Jin Graphlet Screening (GS)

gs -step, II. Updating stage {G t } T X = [ x 1 , x 2 , . . . , x p ] , t =1 : all connected subgraphs with size ≤ m For t = 1 , 2 , . . . , T ◮ S t − 1 : set of retained indices in last stage ◮ Define T ( Y ; D , F ) = � P G t Y � 2 − � P F Y � 2 ◮ F = G t ∩ S t − 1 : nodes accepted previously ◮ D = G t \ F : nodes currently under investigation ◮ P F : projection from R n to subspace { x j : j ∈ F } ◮ Adding nodes in D to S t − 1 iff T ( Y ; D , F ) > t ( D , F ) , t ( D , F ): threshold TBD Once accepted, a node is kept until the end of gs-step Jiashun Jin Graphlet Screening (GS)

Comparison with marginal regression (computational complexity) ◮ Marginal screening ◮ ineffective (neglects ‘local’ graphical structure) ◮ ‘brute-forth’ m -variate screening is computationally challenging: O ( p m ) ◮ gs -step ◮ only screens connected subgraphs of G ◮ if maximum degree of G ≤ K , then there are ≤ C ( eK ) m p such subgraphs Fan & Lv (2008), Wasserman & Roeder (2009), Frieze & Molloy (1999) Jiashun Jin Graphlet Screening (GS)

Two important properties of gs-step S ∗ ≡ S T : set of survived nodes in the end of gs -step If both signals and Graph G are sparse: ◮ Sure Screening (SS) : S ∗ retains all but a small proportion of signals ◮ Separable After Screening (SAS) : S ∗ decomposes into many small-size components Jiashun Jin Graphlet Screening (GS)

Reduce to many small-size regression, I I 0 ⊂ S ∗ : a component G = X ′ X , G I 0 : G I 0 , I 0 : row restriction; row & column restriction ◮ Restrict regression to I 0 ⇒ X ′ Y = X ′ X β + X ′ z Y = X β + z = ⇒ ( X ′ Y ) I 0 = ( G β ) I 0 + ( X ′ z ) I 0 = ◮ ( X ′ z ) I 0 ∼ N (0 , G I 0 , I 0 ) since z ∼ N (0 , I n ) ◮ Key : ( G β ) I 0 ≈ G I 0 , I 0 β I 0 ◮ Result : many small-size regression: ( X ′ Y ) I 0 ≈ N � G I 0 , I 0 β I 0 , G I 0 , I 0 � Jiashun Jin Graphlet Screening (GS)

Reduce to small-size regression, II Why ( G β ) I 0 ≡ G I 0 β ≈ G I 0 , I 0 β I 0 ?  β I 0  0   G I 0 β = G I 0 , I 0 � G I 0 , J 0 � . . . � �  β J 0      0   . . . ◮ I 0 , J 0 ⊂ S ∗ : components ◮ By SS property, β � = 0 ◮ By SAS property, G I 0 , J 0 ≈ 0 Jiashun Jin Graphlet Screening (GS)

Graphlet cleaning (gc-step) Y = X β + z , z ∼ N (0 , I n ) ◮ I 0 : a component of S ∗ ; S ∗ : set of all survived nodes ◮ β I 0 : restricting rows of β to I 0 ◮ X ∗ , I 0 : restricting columns of X to I 0 Fixing ( u gs , v gs ), ∈ S ∗ : set ˆ ◮ j / β j = 0 ◮ j ∈ S ∗ : estimate β I 0 via minimizing � P I 0 ( Y − X ∗ , I 0 θ ) � 2 + ( u gs ) 2 � θ � 0 , where an entry of θ is 0 or ≥ v gs in magnitude Jiashun Jin Graphlet Screening (GS)

Random design model  X ′  1 ∼ N (0 , 1 iid  , Y = X β + z , X = . . . X i n Ω)  X ′ n ◮ Ω: unknown correlation matrix ◮ Ex: Compressive Sensing, Computer Security Dinur and Nissim (2004), Nowak et al. (2007) Jiashun Jin Graphlet Screening (GS)

Rare and Weak signal model Y = X β + z , z ∼ N (0 , I n ) iid µ ∈ Θ ∗ β = b ◦ µ, ∼ Bernoulli ( ǫ ) , b i p ( τ, a ) ◮ b ◦ µ ∈ R p : ( b ◦ µ ) j = b j µ j p ( τ, a ) = { µ ∈ R p : τ ≤ | µ j | ≤ a τ } , a > 1 ◮ Θ ∗ ◮ Two key parameters: ǫ : sparsity; τ : (minimum) signal strength Jiashun Jin Graphlet Screening (GS)

Asymptotic framework Use p as driving asymptotic parameter, and tie ( ǫ, τ, n ) to p by fixed parameters ◮ Signal rarity: ǫ = ǫ p = p − ϑ , 0 < ϑ < 1 ◮ Signal weakness: � τ = τ p = 2 r log( p ) , r > 0 ◮ Sample size: n = p θ , (1 − ϑ ) < θ < 1 , so that p ǫ p ≪ n p ≪ p Jiashun Jin Graphlet Screening (GS)

Limitation of ‘Oracle Property’ Oracle property or probability of exact support recovery is a widely used criterion for assessing optimality in variable selection However, when signals are rare and weak, it is usually impossible to have exact recovery Jiashun Jin Graphlet Screening (GS)

Minimax Hamming distance Measuring errors with Hamming distance: � p �� H p (ˆ � sgn (ˆ � β, ǫ p , µ ; Ω) = E 1 β j ) � = sgn ( β j ) j =1 Minimax Hamming distance: H p (ˆ Hamm ∗ p ( ϑ, θ, r , a , Ω) = inf sup β, ǫ p , µ ; Ω) ˆ β µ ∈ Θ ∗ p ( τ p , a ) Jiashun Jin Graphlet Screening (GS)

Exponent ρ ∗ j = ρ ∗ j ( ϑ, r , Ω) � � Define ω = ω ( S 0 , S 1 ; Ω) = inf δ δ ′ Ω δ where � u ( k ) = 0 , i / ∈ S k δ ≡ u (0) − u (1) : i , k = 0 , 1 1 ≤ | u ( k ) | ≤ a , i ∈ S k i Define 4 + ( | S 1 | − | S 0 | ) 2 ϑ 2 ρ ( S 0 , S 1 ; ϑ, r , a , Ω) = | S 0 | + | S 1 | ϑ + ω r 2 4 ω r Minimax rate critically depends on the exponents: ρ ∗ j = ρ ∗ j ( ϑ, r ; Ω) = ( S 0 , S 1 ): j ∈ S 0 ∪ S 1 ρ ( S 0 , S 1 , ϑ, r , a , Ω) min ◮ not dependent on ( θ, a ) (mild regularity cond.) ◮ computable; has explicit form for some Ω Jiashun Jin Graphlet Screening (GS)

Graph of Least Favorable (GOLF) Define sets of least favorable configuration at site j � � ( S ∗ 0 j , S ∗ 1 j ) = argmax { ( S 0 , S 1 ): j ∈ S 0 ∪ S 1 } ρ ( S 0 , S 1 ; ϑ, r , a , Ω) Definition . GOLF is the graph G ⋄ = ( V , E ) where V = { 1 , 2 , . . . , p } and there is an edge between i and j if and only if ( S ∗ 0 j ∪ S ∗ 1 j ) ∩ ( S ∗ 0 k ∪ S ∗ 1 k ) � = ∅ Jiashun Jin Graphlet Screening (GS)

Lower bound iid µ ∈ Θ ∗ β = b ◦ µ, ∼ Bernoulli ( ǫ p ) , b j p ( τ p , a ) � ǫ p = p − ϑ , τ p = 2 r log( p ) Theorem 1 . Let d ( G ⋄ ) be the maximum degree of GOLF. As p → ∞ , � p j =1 p − ρ ∗ L p j Hamm ∗ p ( ϑ, θ, r , a , Ω) ≥ d p ( G ⋄ ) where L p is a generic multi-log( p ) term. Jiashun Jin Graphlet Screening (GS)

Graphlet Screening (GS) Achieves Optimal Rate in Variable Selection - PowerPoint PPT Presentation

Graphlet Screening (GS) Achieves Optimal Rate in Variable Selection Jiashun Jin Carnegie Mellon University Collaborated with Cun-Hui Zhang (Rutgers) Qi Zhang (Univ. of Pittsburgh) Jiashun Jin Graphlet Screening (GS) Variable selection Y = X

Metal - - screening screening Metal Thomas-Fermi (static) screening potential of point charge

Graphlet Kernels Karsten Borgwardt and Nino Shervashidze joint work with SVN Vishwanathan, Tobias

Pyramidal Stochastic Graphlet Embedding for Document Pattern Classification Anjan Dutta , Pau

Colorectal Cancer Screening Fall 2018 Agenda CRC Screening Landscape Colonoscopy: The

DIABETIC EYE SCREENING What is Diabetic Eye Screening? Diabetic eye screening means taking a

Screening Controlled Substance Screening Controlled Substance Screening Controlled Substance

Diabetic Eye Screening Extended Screening Intervals Public Health England leads the NHS Screening

Screening Screening By: Michael OReilly Technical Advisor FETP Thailand Session Objectives

Goals CRC/Screening Facts Available CRC Screening Tests Tools To Help you Talk About

Gaussian ensemble screening (GES): A new Gaussian ensemble screening (GES): A new approach to

1 2 3 4 5 F I N A N C I A L P R O T E C T I O N S S E R V I C E SCREENING The Screening

Participant Manual August 2017 Pre-CERCLA Screening Training Pre-CERCLA Screening Course 1

Participant Manual July 2017 Pre-CERCLA Screening Training Pre-CERCLA Screening Course 1

BOWEL CANCER SCREENING PR0GRAMME OVERVIEW Amy Smith and Marie-Francoise Lawson Bowel Cancer

Secondary Suicide Screening in Acute Care Settings Screening for Suicide Risk Saves Lives!

What Parents And Caregivers Should Know What is Newborn Screening? Who Receives Newborn

Sparse Recovery via Differential Inclusions Yuan Yao School of Mathematical Sciences Peking

Global solution to non-convex optimization problems involving an approximate 0 penalization

Additional Topics on Linear Regression Ping Yu School of Economics and Finance The University of

Growth and Scalability Joe Black "product/market fit means being in a good market with a

Forecasting intraday-load curve using sparse learning methods Dominique Picard LPMA- Universit

Boosting: more than an ensemble method for prediction Peter B uhlmann ETH Z urich

German EoI for Power Converters of SIS100 - SIS100 Dipole Power Converter 1678 k -

TENET: Tail-Event-driven NETwork Risk Wolfgang Karl Hrdle Weining Wang Lining Yu Ladislaus

Sambuz

Useful Links

Newsletter

Mail Us

Graphlet Screening (GS) Achieves Optimal Rate in Variable Selection - PowerPoint PPT Presentation

Graphlet Screening (GS) Achieves Optimal Rate in Variable Selection Jiashun Jin Carnegie Mellon University Collaborated with Cun-Hui Zhang (Rutgers) Qi Zhang (Univ. of Pittsburgh) Jiashun Jin Graphlet Screening (GS) Variable selection Y = X

Metal - - screening screening Metal Thomas-Fermi (static) screening potential of point charge

Graphlet Kernels Karsten Borgwardt and Nino Shervashidze joint work with SVN Vishwanathan, Tobias

Pyramidal Stochastic Graphlet Embedding for Document Pattern Classification Anjan Dutta , Pau

Colorectal Cancer Screening Fall 2018 Agenda CRC Screening Landscape Colonoscopy: The

DIABETIC EYE SCREENING What is Diabetic Eye Screening? Diabetic eye screening means taking a

Screening Controlled Substance Screening Controlled Substance Screening Controlled Substance

Diabetic Eye Screening Extended Screening Intervals Public Health England leads the NHS Screening

Screening Screening By: Michael OReilly Technical Advisor FETP Thailand Session Objectives

Goals CRC/Screening Facts Available CRC Screening Tests Tools To Help you Talk About

Gaussian ensemble screening (GES): A new Gaussian ensemble screening (GES): A new approach to

1 2 3 4 5 F I N A N C I A L P R O T E C T I O N S S E R V I C E SCREENING The Screening

Participant Manual August 2017 Pre-CERCLA Screening Training Pre-CERCLA Screening Course 1

Participant Manual July 2017 Pre-CERCLA Screening Training Pre-CERCLA Screening Course 1

BOWEL CANCER SCREENING PR0GRAMME OVERVIEW Amy Smith and Marie-Francoise Lawson Bowel Cancer

Secondary Suicide Screening in Acute Care Settings Screening for Suicide Risk Saves Lives!

What Parents And Caregivers Should Know What is Newborn Screening? Who Receives Newborn

Sparse Recovery via Differential Inclusions Yuan Yao School of Mathematical Sciences Peking

Global solution to non-convex optimization problems involving an approximate 0 penalization

Additional Topics on Linear Regression Ping Yu School of Economics and Finance The University of

Growth and Scalability Joe Black &quot;product/market fit means being in a good market with a

Forecasting intraday-load curve using sparse learning methods Dominique Picard LPMA- Universit

Boosting: more than an ensemble method for prediction Peter B uhlmann ETH Z urich

German EoI for Power Converters of SIS100 - SIS100 Dipole Power Converter 1678 k -

TENET: Tail-Event-driven NETwork Risk Wolfgang Karl Hrdle Weining Wang Lining Yu Ladislaus

Sambuz

Useful Links

Newsletter

Mail Us

Growth and Scalability Joe Black "product/market fit means being in a good market with a