
HAL Id: hal-00668212
https://hal.archives-ouvertes.fr/hal-00668212
Submitted on 9 Feb 2012

To cite this version: Nathalie Villa-Vialaneix, Fabrice Rossi. Classification and regression based on derivatives: a consistency result. II Simposio sobre Modelamiento Estadístico, Dec 2010, Valparaiso, Chile. hal-00668212

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Classification and regression based on derivatives: a consistency result
Nathalie Villa-Vialaneix (joint work with Fabrice Rossi)
http://www.nathalievilla.org
II Simposio sobre Modelamiento Estadístico, Valparaiso, December 3rd

Outline
1 Introduction and motivations
2 A general consistency result
3 Examples

Introduction and motivations: Regression and classification from an infinite dimensional predictor

Settings
$(X, Y)$ is a random pair of variables where $Y \in \{-1, 1\}$ (binary classification problem) or $Y \in \mathbb{R}$ (regression problem).
$X \in (\mathcal{X}, \langle \cdot, \cdot \rangle_{\mathcal{X}})$, an infinite dimensional Hilbert space.
We are given a learning set $S_n = \{(X_i, Y_i)\}_{i=1}^{n}$ of $n$ i.i.d. copies of $(X, Y)$.

Purpose: Find $\phi_n : \mathcal{X} \to \{-1, 1\}$ or $\mathbb{R}$ that is universally consistent.

Classification case:
$$\lim_{n \to +\infty} P(\phi_n(X) \neq Y) = L^*$$
where $L^* = \inf_{\phi : \mathcal{X} \to \{-1, 1\}} P(\phi(X) \neq Y)$ is the Bayes risk.

Regression case:
$$\lim_{n \to +\infty} E\left([\phi_n(X) - Y]^2\right) = L^*$$
where $L^* = \inf_{\phi : \mathcal{X} \to \mathbb{R}} E\left([\phi(X) - Y]^2\right)$ will also be called the Bayes risk.

Introduction and motivations: An example

Predicting the rate of yellow berry in durum wheat from its NIR spectrum.

Introduction and motivations: Using derivatives

In practice, $X^{(m)}$ is often more relevant than $X$ for the prediction.

But $X \to X^{(m)}$ induces information loss, and
$$\inf_{\phi : D^m\mathcal{X} \to \{-1, 1\}} P\left(\phi(X^{(m)}) \neq Y\right) \geq \inf_{\phi : \mathcal{X} \to \{-1, 1\}} P(\phi(X) \neq Y) = L^*$$
and
$$\inf_{\phi : D^m\mathcal{X} \to \mathbb{R}} E\left([\phi(X^{(m)}) - Y]^2\right) \geq \inf_{\phi : \mathcal{X} \to \mathbb{R}} E\left([\phi(X) - Y]^2\right) = L^*.$$
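A toy case, given here as an illustration and not taken from the slides, shows that the classification inequality can be strict for $m = 1$. The random variables $A$ and $B$ below are assumptions introduced for the example only.

```latex
% Toy model: a random affine curve whose label depends only on the intercept.
X(t) = A + B\,t, \quad t \in [0, 1], \qquad
A \text{ and } B \text{ independent}, \qquad
P(A > 0) = P(A < 0) = \tfrac{1}{2}, \qquad
Y = \operatorname{sign}(A).

% From X itself: \phi(X) = \operatorname{sign}(X(0)) recovers Y exactly, so
L^* = 0.

% From the derivative alone: X^{(1)} = B is independent of Y, so for every \phi
P\left(\phi(X^{(1)}) \neq Y\right) = \tfrac{1}{2} > 0 = L^*.
```

The derivative discards the intercept, which is exactly the part of $X$ that carries the label in this construction.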

Introduction and motivations: Sampled functions

In practice, the $(X_i)_i$ are not perfectly known; only a discrete sampling is given: $X_i^{\tau_d} = (X_i(t))_{t \in \tau_d}$ where $\tau_d = \{t_1^{\tau_d}, \ldots, t_{|\tau_d|}^{\tau_d}\}$.

The sampling can be non-uniform, and the data can be corrupted by noise.

Then $X_i^{(m)}$ is estimated from $X_i^{\tau_d}$ by $\widehat{X}^{(m)}_{\tau_d}$, which also induces information loss:
$$\inf_{\phi : D^m\mathcal{X} \to \{-1, 1\}} P\left(\phi(\widehat{X}^{(m)}_{\tau_d}) \neq Y\right) \geq \inf_{\phi : D^m\mathcal{X} \to \{-1, 1\}} P\left(\phi(X^{(m)}) \neq Y\right) \geq L^*$$
and
$$\inf_{\phi : D^m\mathcal{X} \to \mathbb{R}} E\left([\phi(\widehat{X}^{(m)}_{\tau_d}) - Y]^2\right) \geq \inf_{\phi : D^m\mathcal{X} \to \mathbb{R}} E\left([\phi(X^{(m)}) - Y]^2\right) \geq L^*.$$

Introduction and motivations: Purpose of the presentation

Find a classifier or a regression function $\phi_{n,\tau_d}$ built from $\widehat{X}^{(m)}_{\tau_d}$ such that the risk of $\phi_{n,\tau_d}$ asymptotically reaches the Bayes risk $L^*$:
$$\lim_{|\tau_d| \to +\infty} \lim_{n \to +\infty} P\left(\phi_{n,\tau_d}(\widehat{X}^{(m)}_{\tau_d}) \neq Y\right) = L^*$$
or
$$\lim_{|\tau_d| \to +\infty} \lim_{n \to +\infty} E\left([\phi_{n,\tau_d}(\widehat{X}^{(m)}_{\tau_d}) - Y]^2\right) = L^*.$$

Main idea: Use a relevant way to estimate $X^{(m)}$ from $X^{\tau_d}$ (by smoothing splines) and combine the consistency of splines with the consistency of an $\mathbb{R}^{|\tau_d|}$-classifier or regression function.
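The two-stage idea (estimate the derivative from the sampled curve, then feed the estimate to a finite-dimensional learner) can be sketched in a few lines. This is only an illustration under loud assumptions: finite differences stand in for the smoothing-spline estimate used in the talk, a k-nearest-neighbour vote stands in for the unspecified $\mathbb{R}^{|\tau_d|}$ classifier, and the function names and toy data are hypothetical.

```python
def finite_difference_derivative(ts, xs):
    """Slopes between consecutive sampling points (the grid may be non-uniform).

    Assumption: a crude stand-in for the smoothing-spline derivative estimate.
    """
    return [(xs[i + 1] - xs[i]) / (ts[i + 1] - ts[i]) for i in range(len(ts) - 1)]


def knn_classify(train_feats, train_labels, query, k=1):
    """Majority vote among the k nearest training vectors; labels are in {-1, 1}."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(feat, query)), label)
        for feat, label in zip(train_feats, train_labels)
    )
    vote = sum(label for _, label in dists[:k])
    return 1 if vote >= 0 else -1


# Toy data (hypothetical): curves a + b*t sampled on a non-uniform grid,
# labelled by the sign of the slope b, so the derivative carries the label.
ts = [0.0, 0.1, 0.35, 0.6, 1.0]
train = [(2.0, 1.0), (-1.0, 0.5), (0.5, -1.0), (3.0, -0.4)]  # (a, b) pairs
train_feats = [
    finite_difference_derivative(ts, [a + b * t for t in ts]) for a, b in train
]
train_labels = [1 if b > 0 else -1 for _, b in train]

# Classify a new sampled curve from its estimated derivative only.
query_curve = [5.0 - 0.8 * t for t in ts]
pred = knn_classify(
    train_feats, train_labels, finite_difference_derivative(ts, query_curve)
)
print(pred)
```

The learner never sees the raw samples, only the derivative estimates, which mirrors the structure of $\phi_{n,\tau_d}(\widehat{X}^{(m)}_{\tau_d})$.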


A general consistency result: Basics about smoothing splines I

Suppose that $\mathcal{X}$ is the Sobolev space
$$H^m = \left\{ h \in L^2([0, 1]) \mid \forall j = 1, \ldots, m,\ D^j h \text{ exists (weak sense) and } D^m h \in L^2 \right\}$$
equipped with the scalar product
$$\langle u, v \rangle_{H^m} = \langle D^m u, D^m v \rangle_{L^2} + \sum_{j=1}^{m} B^j u \, B^j v$$
where $B$ are $m$ boundary conditions such that $\operatorname{Ker} B \cap \mathbb{P}^{m-1} = \{0\}$.

$(H^m, \langle \cdot, \cdot \rangle_{H^m})$ is a RKHS: $\exists\, k_0 : \mathbb{P}^{m-1} \times \mathbb{P}^{m-1} \to \mathbb{R}$ and $k_1 : \operatorname{Ker} B \times \operatorname{Ker} B \to \mathbb{R}$ such that
$$\forall u \in \mathbb{P}^{m-1},\ t \in [0, 1],\quad \langle u, k_0(t, \cdot) \rangle_{H^m} = u(t)$$
and
$$\forall u \in \operatorname{Ker} B,\ t \in [0, 1],\quad \langle u, k_1(t, \cdot) \rangle_{H^m} = u(t).$$

See [Berlinet and Thomas-Agnan, 2004] for further details.

A general consistency result: Basics about smoothing splines II

A simple example of boundary conditions: $h(0) = h^{(1)}(0) = \ldots = h^{(m-1)}(0) = 0$.

Then,
$$k_0(s, t) = \sum_{k=0}^{m-1} \frac{t^k s^k}{(k!)^2}$$
and
$$k_1(s, t) = \int_0^1 \frac{(t - w)_+^{m-1} (s - w)_+^{m-1}}{((m-1)!)^2} \, dw.$$
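The two kernels can be evaluated numerically for a given $m$. The sketch below is an illustration, not code from the talk: the function names are hypothetical, the integral is approximated with a composite Simpson rule, and $k_1$ is normalised by $((m-1)!)^2$, the constant needed for the reproducing property $\langle u, k_1(t, \cdot) \rangle_{H^m} = u(t)$ on $\operatorname{Ker} B$.

```python
from math import factorial


def k0(s: float, t: float, m: int) -> float:
    """Kernel of the polynomial part: sum over k < m of (s*t)^k / (k!)^2."""
    return sum((s * t) ** k / factorial(k) ** 2 for k in range(m))


def k1(s: float, t: float, m: int, n_steps: int = 2000) -> float:
    """Kernel of Ker B for the boundary conditions h(0) = ... = h^(m-1)(0) = 0.

    Integrates (t-w)_+^(m-1) (s-w)_+^(m-1) / ((m-1)!)^2 over w with a
    composite Simpson rule; n_steps must be even.
    """
    upper = min(s, t)  # the truncated powers vanish for w > min(s, t)
    if upper <= 0.0:
        return 0.0
    h = upper / n_steps

    def integrand(w: float) -> float:
        return ((t - w) * (s - w)) ** (m - 1) / factorial(m - 1) ** 2

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3


print(k0(0.5, 0.5, 2), k1(0.3, 0.7, 1), k1(0.3, 0.7, 2))
```

For $m = 1$ the formula reduces to $k_1(s, t) = \min(s, t)$, and for $m = 2$ to $a^2 b / 2 - a^3 / 6$ with $a = \min(s, t)$ and $b = \max(s, t)$, which gives a quick check of the numerical values.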

A general consistency result: Estimating the predictors with smoothing splines I

Assumption (A1)
The $|\tau_d| \geq m - 1$ sampling points are distinct in $[0, 1]$.
The $B^j$ are linearly independent from $h \mapsto h(t)$ for all $t \in \tau_d$.
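The sampling part of (A1) is easy to check on a concrete grid. The helper below is a hypothetical sketch: it tests only the count, range and distinctness conditions, and leaves the linear independence condition on the $B^j$ to the caller, since that depends on the chosen boundary operators.

```python
def satisfies_a1_sampling(tau, m):
    """Check the sampling conditions of (A1) for a grid tau and derivative order m.

    Verifies that there are at least m - 1 points, that all of them lie in
    [0, 1], and that they are pairwise distinct. The independence of the
    boundary conditions from the point evaluations is NOT checked here.
    """
    points = list(tau)
    enough_points = len(points) >= m - 1
    in_unit_interval = all(0.0 <= t <= 1.0 for t in points)
    distinct = len(set(points)) == len(points)
    return enough_points and in_unit_interval and distinct


print(satisfies_a1_sampling([0.1, 0.35, 0.6, 1.0], m=2))
print(satisfies_a1_sampling([0.1, 0.1, 0.6], m=2))
```

With the simple boundary conditions $h(0) = \ldots = h^{(m-1)}(0) = 0$, the map $h \mapsto h(0)$ is itself one of the boundary conditions, so a grid containing $t = 0$ would violate the independence condition even though this helper accepts it.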
