SLIDE 1
Scalable Non-Parametric Statistical Estimation
Aymeric Dieuleveut
ENS Paris, INRIA
February 6, 2017
SLIDE 2–7 (incremental build)
Statistics: statistical model, performance measure, estimator; convergence measured as F(#obs).
Optimization: minimize a given function; algorithm focused; scales with dimension and number of observations; convergence measured as F(#iter).
Accurate & Efficient: scalable estimators with optimal statistical properties.
Non-parametric regression: square loss, Tikhonov regularization.
Stochastic algorithms: first-order methods, few passes over the data.
Non-parametric Stochastic Approximation, Annals of Statistics, 2015.
SLIDE 8–16 (incremental build)
Non-parametric Stochastic Approximation with Large Step Sizes (1/2)
Aymeric Dieuleveut & Francis Bach, Annals of Statistics, 2015.

Random-design least-squares regression: ε(f) := E_(X,Y) [(f(X) − Y)²].
Minimization within a reproducing kernel Hilbert space H: min_{f ∈ H} ε(f).
(x_i, y_i) i.i.d. observations.
Sequence of estimators f_t ∈ H, updated after each observation, using unbiased gradients of the loss function:
f_{t+1} = f_t − γ_t (f_t(x_t) − y_t) K_{x_t},
where K is the kernel and K_x = K(x, ·).
This is stochastic approximation. Results depend on assumptions on:
◮ the Gaussian complexity of the unit ball of the kernel space,
◮ the smoothness in H of the optimal predictor f∗(X) = E[Y | X].
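The kernel stochastic-gradient recursion above can be simulated directly by tracking the expansion coefficients of f_t on the observed points. The sketch below (Gaussian kernel, constant step size, Polyak–Ruppert averaging of the iterates) is illustrative only; the step size and bandwidth are arbitrary choices, not values from the paper.

```python
import numpy as np

def kernel_sgd(xs, ys, gamma=0.5, bandwidth=0.2):
    """One pass of stochastic gradient in the RKHS:
        f_{t+1} = f_t - gamma * (f_t(x_t) - y_t) * K(x_t, .)
    Since f_t = sum_{s < t} c_s K(x_s, .), it suffices to track the
    coefficients c_s.  Returns the coefficients of the averaged iterate."""
    n = len(xs)
    kern = lambda a, b: np.exp(-(a - b) ** 2 / (2 * bandwidth ** 2))
    coef = np.zeros(n)   # coefficients of the current iterate f_t
    avg = np.zeros(n)    # coefficients of the running average of iterates
    for t in range(n):
        f_xt = coef[:t] @ kern(xs[:t], xs[t])   # evaluate f_t(x_t)
        coef[t] = -gamma * (f_xt - ys[t])       # new coefficient on K(x_t, .)
        avg += (coef - avg) / (t + 1)           # online Polyak-Ruppert average
    return avg

def predict(avg, xs, x_new, bandwidth=0.2):
    k = np.exp(-(xs[:, None] - x_new[None, :]) ** 2 / (2 * bandwidth ** 2))
    return avg @ k

# Toy run: learn f*(x) = sin(2 pi x) from noisy samples in a single pass.
rng = np.random.default_rng(0)
n = 1000
xs = rng.uniform(0, 1, n)
ys = np.sin(2 * np.pi * xs) + 0.1 * rng.standard_normal(n)
avg = kernel_sgd(xs, ys)
x_grid = np.linspace(0, 1, 100)
mse = float(np.mean((predict(avg, xs, x_grid) - np.sin(2 * np.pi * x_grid)) ** 2))
```

Each update costs O(t) kernel evaluations, so one pass is O(n²): this is the scalability pressure behind the talk's focus on efficient estimators.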
SLIDE 17–23 (incremental build)
Non-parametric Stochastic Approximation with Large Step Sizes (2/2)
Aymeric Dieuleveut & Francis Bach, Annals of Statistics, 2015.

[Figure: the RKHS H drawn inside L²_ρX, with the optimal predictor f∗ placed either inside H (well-specified) or outside H (mis-specified).]

Theorem: the averaged, unregularized least-mean-squares algorithm with large step sizes achieves the statistically optimal rate of convergence.
◮ Recovers the finite-dimensional situation, with rate O(σ²d / n).
◮ Optimal rates in the well-specified regime and in some mis-specified situations.
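In the finite-dimensional special case, the O(σ²d/n) behavior can be checked numerically with constant-step averaged least-mean-squares. A minimal sketch, assuming Gaussian design with identity covariance and an arbitrary step size γ = 0.05:

```python
import numpy as np

def averaged_lms(X, y, gamma):
    """Constant-step least-mean-squares with Polyak-Ruppert averaging:
        theta_{t+1} = theta_t - gamma * (x_t . theta_t - y_t) * x_t,
    returning theta_bar_n, the average of the iterates."""
    n, d = X.shape
    theta = np.zeros(d)
    theta_bar = np.zeros(d)
    for t in range(n):
        theta = theta - gamma * (X[t] @ theta - y[t]) * X[t]
        theta_bar += (theta - theta_bar) / (t + 1)
    return theta_bar

rng = np.random.default_rng(1)
d, sigma = 5, 0.5
theta_star = rng.standard_normal(d)

def excess_risk(n):
    X = rng.standard_normal((n, d))
    y = X @ theta_star + sigma * rng.standard_normal(n)
    theta_bar = averaged_lms(X, y, gamma=0.05)
    # With identity design covariance, the excess risk is |theta_bar - theta*|^2.
    return float(np.sum((theta_bar - theta_star) ** 2))

err = excess_risk(2000)   # compare with sigma^2 * d / n ~ 6e-4
```

Note that the step size stays constant rather than decaying; averaging is what provides the statistical optimality.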
SLIDE 24–25 (outline revisited)
Faster Rates for Least-Squares Regression, technical report, 2016.
SLIDE 26–29 (incremental build)
Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression
Aymeric Dieuleveut, Nicolas Flammarion & Francis Bach, technical report, 2016.

Classical tradeoff: a bias term and a variance term appear.
◮ The bias measures the difficulty of forgetting the initial condition.
◮ The variance is linked to the statistical hardness of the problem.
Lower bounds:
◮ An optimal first-order algorithm forgets initial conditions at rate Ω(‖θ_0 − θ∗‖² / t²).
◮ Optimal statistical estimation is Ω(σ²d / n).
◮ A single pass over the data means t = n.
New algorithm, based on Nesterov acceleration, achieving both optimal terms:
E[ε(θ̄_n) − ε(θ∗)] ≤ L‖θ_0 − θ∗‖² / n² + σ²d / n.
It also improves the convergence rate for mis-specified non-parametric regression.
SLIDE 30–31 (outline revisited)
Adaptation to the smoothness for learning in kernel spaces.
SLIDE 32–37 (incremental build)
Density estimation: shape constraint (log-concave), MLE.
Non-smooth optimization: new ideas.
Scalable MLE algorithm in high dimension?