Outline 1 - PDF document

� �� Outline �� 1 �� 2 † ‡ �� 3 n ≫ p �� † �� n ≪ p �� ‡ JST �� 2015 � 12 � 24 � �� 4 �� 5 1 / 95 2 / 95 �� : �� 1992 Donoho and Johnstone Wavelet shrinkage (Soft-thresholding) �� Lasso �� 1996 Tibshirani �� 2000 Knight and Fu Lasso �� ( n ≫ p ) �� 2006 Candes and Tao, ( �� p ≫ n ) Donoho �� d = 10000 �� n = 1000 �� 2009 Bickel et al., Zhang �� (Lasso �� , p ≫ n ) �� ( �� ) �� 2013 van de Geer et al., Lockhart et al. ( p ≫ n ) �� L 1 �� (2010) �� . 3 / 95 4 / 95 �� Outline �� 1 �� 2 �� 3 n ≫ p �� n ≪ p �� ≪ �� 4 �� 5 �� 5 / 95 6 / 95

�� ≪ �� ≪ �� ≫ �� Lasso �� R. Tsibshirani (1996). Regression shrinkage and selection via the lasso. J. Royal. Statist. Soc B., Vol. 58, No. 1, pages 267–288. �� 14728 (2015 � 12 � 23 � ) �� 6 / 95 7 / 95 �� X = ( X ij ) ∈ R n × p , �� X = ( X ij ) ∈ R n × p , �� Y = ( Y i ) ∈ R n �� Y = ( Y i ) ∈ R n p ( �� ) ≫ n ( �� ) . p ( �� ) ≫ n ( �� ) . �� β ∗ ∈ R p : �� d � ( �� ). �� β ∗ ∈ R p : �� d � ( �� ). Y = X β ∗ + ǫ Y = X β ∗ + ǫ �� : �� : ( Y i = � p ( Y i = � p j =1 X ij β ∗ j =1 X ij β ∗ j + ǫ i ) ( i = 1 , . . . , n )) j + ǫ i ) ( i = 1 , . . . , n )) ( Y , X ) �� β ∗ �� ( Y , X ) �� β ∗ �� d �� d �� Mallows’ C p , AIC: � Y − X β � 2 + 2 σ 2 � β � 0 ˆ β MC = argmin β ∈ R p �� β � 0 = |{ j | β j � = 0 }| . � 2 p �� NP- �� 8 / 95 8 / 95 Lasso �� Y − X β � 2 + 2 σ 2 � β � 0 . ˆ Mallows’ C p �� : β MC = argmin β ∈ R p �� : � β � 0 �� Lasso [ L 1 �� ] � Y − X β � 2 + λ � β � 1 ˆ β Lasso = argmin β ∈ R p �� β � 1 = � p j =1 | β j | . http://www.astroml.org/sklearn_tutorial/practical.html � �� L 1 �� L 0 �� [ − 1 , 1] p �� y = b + β 1 x + β 2 x 2 + · · · + β d x d + ǫ �� ( �� ) L 1 �� Lov´ asz �� 9 / 95 10 / 95

Lasso �� Lasso �� p = n , X = I �� p � 1 ˆ n � X β − Y � 2 β = arg min 2 + λ n | β j | . 1 2 � Y − β � 2 + C � β � 1 ˆ β Lasso = argmin β ∈ R p j =1 β ∈ R p 1 2( y i − b ) 2 + C | b | ˆ ⇒ β Lasso , i = argmin b ∈ R � sign ( y i )( y i − C ) ( | y i | > C ) = 0 ( | y i | ≤ C ) . �� 0 �� 11 / 95 12 / 95 �� Y = X β + ǫ. p � 1 ˆ n � X β − Y � 2 β = arg min 2 + λ n | β j | . n = 1 , 000, p = 10 , 000, d = 500. β ∈ R p j =1 10 True Theorem (Lasso �� ) 9 8 �� C �� 7 6 2 ≤ C d log( p ) � ˆ β − β ∗ � 2 . 5 n 4 �� log( p ) �� d �� 3 �� 2 � �� 1 0 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 13 / 95 14 / 95 �� Y = X β + ǫ. Y = X β + ǫ. n = 1 , 000, p = 10 , 000, d = 500. n = 1 , 000, p = 10 , 000, d = 500. 12 12 True True Lasso Lasso 10 10 LeastSquares 8 8 6 6 4 4 2 2 0 0 -2 -2 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 14 / 95 14 / 95

�� Lasso �� Outline �� 1 Lasso: n � 1 i β ) 2 + � β � 1 �� ( y i − x ⊤ 2 min . n �� β ∈ R p i =1 �� 3 n ≫ p �� n ≪ p �� 4 �� 5 15 / 95 16 / 95 Lasso �� L 1 �� Lasso: n � 1 i β ) 2 + � β � 1 ( y i − x ⊤ min . n �� β ∈ R p i =1 �� : � n 1 min ℓ ( z i , β ) + ψ ( β ) . n w ∈ R p i =1 L 1 �� 16 / 95 17 / 95 �� Lounici et al. (2009) T �� : � � β g � C y ( t ) ≈ x ( t ) ⊤ β ( t ) ( i = 1 , . . . , n ( t ) , t = 1 , . . . , T ) . i i g ∈ G n ( t ) p � T � � β ( t ) ) 2 + C ( y i − x ( t ) ⊤ � ( β (1) k , . . . , β ( T ) min ) � . i k β ( t ) t =1 i =1 k =1 � �� 0 �� β (1) β (2) β ( T ) �� 18 / 95 19 / 95

�� 映画 �� ユーザ �� Lounici et al. (2009) T �� : y ( t ) ≈ x ( t ) ⊤ β ( t ) ( i = 1 , . . . , n ( t ) , t = 1 , . . . , T ) . W : M × N �� i i n ( t ) min { M , N } T p � � � � β ( t ) ) 2 + C 1 ( y i − x ( t ) ⊤ � ( β (1) k , . . . , β ( T ) � W � Tr = Tr [( WW ⊤ ) 2 ] = σ j ( W ) min ) � . i k β ( t ) j =1 t =1 i =1 k =1 � �� σ j ( W ) � W � j �� ( �� ) � β (1) β (2) β ( T ) �� = �� L 1 �� = �� 19 / 95 20 / 95 � : �� : �� 1 �� A �� B �� C · · · �� X �� A �� B �� C · · · �� X �� 1 4 8 * · · · 2 �� 1 4 8 4 · · · 2 �� 2 · · · �� 2 · · · 2 * 2 * 2 4 2 1 �� 3 �� 3 2 4 * · · · * 2 4 2 · · · 1 . . . . . . (e.g., Srebro et al. (2005), NetFlix Bennett and Lanning (2007)) (e.g., Srebro et al. (2005), NetFlix Bennett and Lanning (2007)) � ( Y i , j − W i , j ) 2 + λ � W � Tr ( i , j ) ∈ T 21 / 95 21 / 95 � : �� : �� (Anderson, 1951, Burket, 1964, Izenman, 1975) N �� (Argyriou et al., 2008) �� W *= Y X W * W M = + n N N M � �� : �� Rademacher Complexity: Srebro et al. (2005). Compressed sensing: Cand` es and Tao (2009), Cand` es and Recht (2009). W ∗ � �� . 22 / 95 23 / 95

Outline 1 - PDF document

Outline 1 2

Ins Domingues Breast Cancer Workshop April 7th 2015 Outline Outline Outline Outline

Presentation Preparation Outline Speech Outline Template ***Use this outline to guide you in

Outline for St Outline for St Outline for

Beob Kyun Kim, S oonwook Hwang {kyun, hwang}@ kisti.re.kr KIS TI, Korea Outline Outline

Catherine Revels, World Bank November 2009 Presentation outline Presentation outline

Battlestar Galactica Battlestar Galactica Galactica Battlestar Outline Outline Outline

Outline 2 Outline 2 ZSim core simulation techniques Outline 2 ZSim core simulation

Appendix J: Capstone Presentation Outline Revised Spring 2016 CAPSTONE PRESENTATION OUTLINE This

PT1 TMP Presentation Outline 1 Group Members: ___________________________________ Use this outline

Broverview Outline 2 Outline Philosophy and Architecture A framework for network traffic

Xingqian Peng, Huaqiao University, China Presented by Zhen Wu Presented by Zhen Wu October 30,2011

1 Web Application Development 2 3 Web Application Development CSS Outline An outline is a

Lecture Outline Strengthening Induction Hypothesis. Lecture Outline Strengthening Induction

STAT 213 Simple Linear Regression I Colin Reimer Dawson Oberlin College 5 October 2016 Outline

High Dimensional Approximation - Outline Background and Sources Wolfgang Dahmen Seminar: USC,

Outline Outline Deaf and Hearing Impaired Deaf and Hearing Impaired Physical Structures of

Truncations of unitary matrices and Brownian bridges Alain Rouault (Laboratoire de

Generalization Error of Generalized Linear Models in High Dimensions Melika Emami 1 , Mojtaba

Steins Method for Matrix Concentration Lester Mackey Collaborators: Michael I. Jordan ,

Towards Demystifying Overparameterization in Deep Learning Mahdi Soltanolkotabi Department of

Data Science in the Wild Lecture 8: Advanced Experimental Analysis Eran Toch Data Science in the

Variational Model Selection for Sparse Gaussian Process Regression Michalis K. Titsias School of

Representing Images and Sounds Class 4. 3 Sep 2009 Instructor: Bhiksha Raj Representing an

Investigating the security properties of MACs based on stream ciphers Leonie Simpson, Mufeed Al