SLIDE 1
IFAC-SYSID2003
Functional Analytic Framework for Model Selection
Masashi Sugiyama
Tokyo Institute of Technology, Tokyo, Japan / Fraunhofer FIRST-IDA, Berlin, Germany
SLIDE 2
Regression Problem

From training examples $\{(x_i, y_i)\}_{i=1}^{n}$, where $y_i = f(x_i) + \epsilon_i$, obtain a good approximation $\hat{f}$ to $f$.

$f$: underlying function; $\hat{f}$: learned function; $\{(x_i, y_i)\}_{i=1}^{n}$: training examples; $\epsilon_i$: noise
SLIDE 3
Model Selection

[Figure: target function and three learned functions: too simple, appropriate, too complex]

The choice of the model is extremely important for obtaining a good learned function! (Here "model" refers to, e.g., the regularization parameter.)
SLIDE 4
Aims of Our Research

A model is chosen such that an estimator of the generalization error is minimized. Model selection research is therefore essentially the pursuit of an accurate estimator of the generalization error. We are interested in:
- having a novel method in a different framework;
- estimating the generalization error with small (finite) samples.
SLIDE 5
Formulating the Regression Problem as a Function Approximation Problem

We assume $f \in \mathcal{H}$, where $\mathcal{H}$ is a functional Hilbert space. We shall measure the "goodness" of the learned function (i.e., the generalization error) by
$$J = \mathbb{E}_{\epsilon}\, \| \hat{f} - f \|^{2}$$
$\|\cdot\|$: norm in $\mathcal{H}$; $\mathbb{E}_{\epsilon}$: expectation over the noise
SLIDE 6
Function Spaces for Learning

In learning problems, we sample values of the target function at sample points (e.g., $y_i = f(x_i) + \epsilon_i$). Therefore, the values of the target function at the sample points should be specified. This means that the usual $L^{2}$-space is not suitable for learning problems:
two functions may have different values at a sample point, yet be treated as the same function in $L^{2}$.
SLIDE 7
Reproducing Kernel Hilbert Spaces

In a reproducing kernel Hilbert space (RKHS), the value of a function at an input point is always specified. Indeed, an RKHS $\mathcal{H}$ has the reproducing kernel $K(\cdot, \cdot)$ with the reproducing property:
$$\langle f, K(\cdot, x) \rangle = f(x) \quad \text{for all } f \in \mathcal{H}$$
$\langle \cdot, \cdot \rangle$: inner product in $\mathcal{H}$
SLIDE 8
Sampling Operator

For any RKHS $\mathcal{H}$, there exists a linear operator $A$ from $\mathcal{H}$ to $\mathbb{R}^{n}$ such that
$$Af = \big( f(x_1), f(x_2), \ldots, f(x_n) \big)^{\top}$$
Indeed, by the reproducing property,
$$A = \sum_{i=1}^{n} e_i \otimes K(\cdot, x_i)$$
$\otimes$: Neumann-Schatten product; $e_i$: $i$-th standard basis vector in $\mathbb{R}^{n}$. For vectors, $(f \otimes g)\, h = \langle h, g \rangle\, f$.
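A minimal numerical sketch may help fix ideas; the Gaussian kernel, sample points, and coefficients below are our illustrative assumptions, not part of the slides.

```python
import numpy as np

def gauss_kernel(x, xp, width=1.0):
    """Gaussian reproducing kernel K(x, x') (an assumed kernel choice)."""
    return np.exp(-(x - xp) ** 2 / (2 * width ** 2))

# Hypothetical sample points x_1, ..., x_n and a function in the kernel
# regression model f(x) = sum_j alpha_j K(x, x_j).
x_train = np.array([-1.0, 0.0, 0.5, 2.0])
alpha = np.array([0.3, -0.7, 1.2, 0.1])

# For such f, the sampling operator A acts through the Gram matrix:
# (A f)_i = f(x_i) = (K alpha)_i.
K = gauss_kernel(x_train[:, None], x_train[None, :])
Af = K @ alpha

# Reproducing-property check: f(x_i) = <f, K(., x_i)>_H, evaluated directly.
f_vals = np.array([np.dot(alpha, gauss_kernel(x, x_train)) for x in x_train])
assert np.allclose(Af, f_vals)
```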
SLIDE 9
Our Framework

[Diagram] The learning target function $f$ lives in the RKHS $\mathcal{H}$. The sampling operator $A$ (always linear) maps it to the sample value space $\mathbb{R}^{n}$, yielding $y = Af + \epsilon$ (sample values plus noise). The learning operator $X$ (generally non-linear) maps $y$ back to the learned function $\hat{f} = Xy$ in $\mathcal{H}$.

$\mathbb{E}_{\epsilon}$: expectation over the noise
SLIDE 10
Tricks for Estimating the Generalization Error

We want to estimate $J = \mathbb{E}_{\epsilon} \|\hat{f} - f\|^{2}$. But it includes the unknown $f$, so estimating it is not straightforward. To cope with this problem:
- We shall estimate only its essential part:
$$J = \underbrace{\mathbb{E}_{\epsilon} \|\hat{f}\|^{2} - 2\, \mathbb{E}_{\epsilon} \langle \hat{f}, f \rangle}_{\text{essential part}} + \underbrace{\|f\|^{2}}_{\text{constant}}$$
- We focus on the kernel regression model:
$$\hat{f}(x) = \sum_{i=1}^{n} \alpha_i K(x, x_i)$$
$K(\cdot, \cdot)$: reproducing kernel of $\mathcal{H}$
SLIDE 11
A Key Lemma

For the kernel regression model, the essential generalization error is expressed as
$$J' = \mathbb{E}_{\epsilon} \Big[ \|\hat{f}\|^{2} - 2 \big\langle (AA^{*})^{\dagger} A \hat{f},\; y - \epsilon \big\rangle \Big]$$
The unknown target function $f$ has been erased!

$\dagger$: generalized inverse; $\mathbb{E}_{\epsilon}$: expectation over the noise
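The slide's displayed equations were lost in extraction; the following is our hedged reconstruction of why $f$ drops out, assuming $\hat{f}$ lies in the range of $A^{*}$ (the span of $\{K(\cdot, x_i)\}$, where the kernel regression model lives):

```latex
\begin{align*}
  % Orthogonal projection onto R(A^*); a standard identity:
  P &= A^{*} (A A^{*})^{\dagger} A \\
  % Since \hat{f} = P \hat{f} and P is self-adjoint:
  \langle \hat{f}, f \rangle_{\mathcal{H}}
    &= \langle \hat{f}, P f \rangle_{\mathcal{H}}
     = \langle A \hat{f},\, (A A^{*})^{\dagger} A f \rangle_{\mathbb{R}^{n}} \\
  % Finally, A f = y - \epsilon and (A A^*)^\dagger is self-adjoint:
    &= \big\langle (A A^{*})^{\dagger} A \hat{f},\; y - \epsilon \big\rangle_{\mathbb{R}^{n}}
\end{align*}
```

Substituting this into the essential part $J' = \mathbb{E}_{\epsilon} \|\hat{f}\|^{2} - 2\, \mathbb{E}_{\epsilon} \langle \hat{f}, f \rangle$ yields the displayed lemma.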
SLIDE 12
Estimating the Essential Part

The key lemma suggests an estimator of the essential generalization error $J'$. However, the noise vector $\epsilon$ is unknown. Let us define
$$\hat{J} = \|\hat{f}\|^{2} - 2 \big\langle (AA^{*})^{\dagger} A \hat{f},\; y - \epsilon \big\rangle$$
Clearly, it is still unbiased: $\mathbb{E}_{\epsilon}\, \hat{J} = J'$. We would like to handle the noise term $\langle (AA^{*})^{\dagger} A \hat{f},\, \epsilon \rangle$ well.
SLIDE 13
How to Deal with the Noise Term

Depending on the type of the learning operator $X$, we consider the following three cases:
A) $X$ is linear.
B) $X$ is non-linear but twice almost differentiable.
C) $X$ is a general non-linear operator.
SLIDE 14
A) Examples of Linear Learning Operators

- Kernel ridge regression (see the sketch below)
- A particular Gaussian process regression
- Least-squares support vector machine

$\alpha$: parameters to be learned; $\lambda$: ridge parameter
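As a concrete instance, here is a minimal kernel ridge regression sketch; the Gaussian kernel and the particular form $\alpha = (K + \lambda I)^{-1} y$ are one common formulation, assumed here for illustration.

```python
import numpy as np

def gauss_kernel(x, xp, width=1.0):
    """Gaussian reproducing kernel K(x, x') (an assumed kernel choice)."""
    return np.exp(-(x - xp) ** 2 / (2 * width ** 2))

def kernel_ridge(x_train, y_train, lam, width=1.0):
    """Kernel ridge regression: alpha with f_hat(x) = sum_j alpha_j K(x, x_j).

    alpha = (K + lam I)^{-1} y, so the map y -> f_hat is linear in y,
    which is what case A) requires of the learning operator.
    """
    K = gauss_kernel(x_train[:, None], x_train[None, :], width)
    return np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)
```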
SLIDE 15
A) Linear Learning

When the learning operator $X$ is linear, the noise term has the closed form $\mathbb{E}_{\epsilon} \langle (AA^{*})^{\dagger} A \hat{f},\, \epsilon \rangle = \sigma^{2}\, \mathrm{tr}\big( (AA^{*})^{\dagger} A X \big)$. This induces the subspace information criterion (SIC):
$$\mathrm{SIC} = \|\hat{f}\|^{2} - 2 \big\langle (AA^{*})^{\dagger} A \hat{f},\; y \big\rangle + 2 \sigma^{2}\, \mathrm{tr}\big( (AA^{*})^{\dagger} A X \big)$$
SIC is unbiased with finite samples: $\mathbb{E}_{\epsilon}\, \mathrm{SIC} = J'$.

M. Sugiyama & H. Ogawa (Neural Computation, 2001); M. Sugiyama & K.-R. Müller (JMLR, 2002)
$A^{*}$: adjoint of $A$; $\sigma^{2}$: noise variance
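Under the reconstruction above, SIC is a few matrix operations. The sketch below assumes an invertible Gram matrix $K$ (so $(AA^{*})^{\dagger} = K^{-1}$ and the trace term collapses to $\mathrm{tr}(L)$ for a learner $\alpha = Ly$) and a known noise variance; both assumptions are ours.

```python
import numpy as np

def sic_linear(K, L, y, sigma2):
    """SIC for a linear learner alpha = L y, assuming invertible Gram matrix K.

    With f_hat = sum_j alpha_j K(., x_j):
      ||f_hat||^2_H            = alpha' K alpha
      <(AA*)^† A f_hat, y>     = alpha' y          (since K^{-1} K = I)
      sigma^2 tr((AA*)^† A X)  = sigma^2 tr(L)
    """
    alpha = L @ y
    return alpha @ K @ alpha - 2.0 * alpha @ y + 2.0 * sigma2 * np.trace(L)
```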
SLIDE 16
How to Deal with the Noise Term

Depending on the type of the learning operator $X$, we consider the following three cases:
A) $X$ is linear.
B) $X$ is non-linear but twice almost differentiable.
C) $X$ is a general non-linear operator.
SLIDE 17
B) Examples of Twice Almost Differentiable Learning Operators

Support vector regression with Huber's loss, which is quadratic for residuals up to a threshold and linear beyond it.

$\lambda$: ridge parameter; $\tau$: threshold
SLIDE 18
B) Twice Almost Differentiable Learning

For Gaussian noise, we have the (Stein-type) identity
$$\mathbb{E}_{\epsilon} \langle u(\epsilon), \epsilon \rangle = \sigma^{2}\, \mathbb{E}_{\epsilon} \big[ \operatorname{div} u(\epsilon) \big]$$
This yields a SIC for twice almost differentiable learning. It reduces to the original SIC if $X$ is linear, and it is still unbiased with finite samples: $\mathbb{E}_{\epsilon}\, \mathrm{SIC} = J'$.

$u$: vector-valued function
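The Gaussian identity above can be sanity-checked by Monte Carlo in one dimension (where the divergence is an ordinary derivative); the test function below is an arbitrary smooth choice of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5
eps = rng.normal(0.0, sigma, size=1_000_000)

g = np.tanh                                  # a smooth test function
g_prime = lambda e: 1.0 - np.tanh(e) ** 2    # its derivative

lhs = np.mean(eps * g(eps))                  # E[eps g(eps)]
rhs = sigma ** 2 * np.mean(g_prime(eps))     # sigma^2 E[g'(eps)]
print(lhs, rhs)                              # the two estimates agree closely
```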
SLIDE 19
How to Deal with the Noise Term

Depending on the type of the learning operator $X$, we consider the following three cases:
A) $X$ is linear.
B) $X$ is non-linear but twice almost differentiable.
C) $X$ is a general non-linear operator.
SLIDE 20
C) Examples of General Non-Linear Learning Operators

- Kernel sparse regression
- Support vector regression with Vapnik's loss
SLIDE 21
C) General Non-Linear Learning

The noise term is approximated by the bootstrap, giving the bootstrap approximation of SIC (BASIC): the intractable term $\langle (AA^{*})^{\dagger} A \hat{f},\, \epsilon \rangle$ is replaced by its average over bootstrap replications. BASIC is almost unbiased.

$\mathbb{E}^{*}$: expectation over bootstrap replications
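A minimal sketch of the bootstrap step, under the same invertible-$K$ assumption as before (so the noise term reduces to $\langle \alpha, \epsilon \rangle$); the residual-resampling scheme and the number of replications are our illustrative choices, not the paper's prescription.

```python
import numpy as np

def basic_noise_term(K, y, learner, n_boot=200, seed=0):
    """Bootstrap-approximate the noise term E_eps <(AA*)^† A f_hat, eps>
    for a general (possibly non-linear) learner y -> alpha.

    With invertible K, (AA*)^† A f_hat = K^{-1} K alpha = alpha.
    """
    rng = np.random.default_rng(seed)
    fitted = K @ learner(y)                       # f_hat at the sample points
    residuals = y - fitted
    residuals = residuals - residuals.mean()      # centred noise proxy
    total = 0.0
    for _ in range(n_boot):
        eps_star = rng.choice(residuals, size=len(y), replace=True)
        alpha_star = learner(fitted + eps_star)   # re-learn on resampled data
        total += alpha_star @ eps_star            # <alpha*, eps*>
    return total / n_boot
```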
SLIDE 22
Simulation: Learning the Sinc Function

Kernel ridge regression is used to learn the sinc function; the model is the ridge parameter $\lambda$.

[Figure: simulation results]
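To tie the pieces together, here is a hedged re-creation of this experiment's shape, reusing gauss_kernel and sic_linear from the sketches above; the sample size, noise level, and $\lambda$ grid are our guesses, not the slide's actual settings.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma2 = 50, 0.01

# Noisy samples of the sinc function (np.sinc(x / pi) = sin(x) / x).
x = rng.uniform(-np.pi, np.pi, size=n)
y = np.sinc(x / np.pi) + rng.normal(0.0, np.sqrt(sigma2), size=n)

# Kernel ridge learners over a grid of ridge parameters, scored by SIC.
K = gauss_kernel(x[:, None], x[None, :])
scores = {}
for lam in [1e-4, 1e-3, 1e-2, 1e-1, 1.0]:
    L = np.linalg.inv(K + lam * np.eye(n))   # alpha = L y (linear learner)
    scores[lam] = sic_linear(K, L, y, sigma2)

best_lam = min(scores, key=scores.get)       # model selected by SIC
print(best_lam, scores[best_lam])
```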
SLIDE 23
Simulation: DELVE Data Sets

[Table: normalized test errors on the DELVE data sets; red: best or comparable (95% t-test)]
SLIDE 24
Conclusions

We provided a functional analytic framework for regression, where the generalization error is measured using the RKHS norm: $J = \mathbb{E}_{\epsilon} \|\hat{f} - f\|_{\mathcal{H}}^{2}$. Within this framework, we derived a generalization error estimator called SIC.
A) Linear learning (kernel ridge regression, GPR, LS-SVM): SIC is exactly unbiased with finite samples.
B) Twice almost differentiable learning (SVR with Huber's loss): SIC is exactly unbiased with finite samples.
C) General non-linear learning (kernel sparse regression, SVR with Vapnik's loss): BASIC is almost unbiased.