  1. 9.54 class 4: Supervised learning. Shimon Ullman + Tomaso Poggio. Danny Harari + Daniel Zysman + Darren Seibert. 9.54, fall semester 2014

  2. Intro 9.54, fall semester 2014

  3. An old and simple model of supervised learning: associate $b$ to $a$ and store $\phi_{b,a}(x) = (b * a)(x) = \int b(\xi)\, a(x - \xi)\, d\xi$. Retrieve output $b$ from input $a$: if $a \star a \approx \delta$, then $(a \star \phi_{b,a})(x) = \int a(\tau)\, \phi_{b,a}(\tau + x)\, d\tau \approx b(x)$. 9.54, fall semester 2014

  4. An old and simple model of supervised learning: when $\phi(x) = \sum_i b_i * a_i$ and $a_j \star a_i \approx \delta_{i,j}$, retrieve output $b_j$ from input $a_j$: $a_j \star \phi \approx b_j$. It is a special case… 9.54, fall semester 2014
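
A minimal numerical sketch of this store/retrieve scheme, assuming circular convolution/correlation on random high-dimensional vectors as the discrete analogue of the slides' integrals (the dimension, the number of stored pairs, and the FFT-based implementation are illustrative choices, not from the slides):

```python
# Store phi = sum_i b_i * a_i, retrieve b_j via a_j (star) phi (slides 3-4).
import numpy as np

rng = np.random.default_rng(0)
n, pairs = 2048, 5

def cconv(a, b):
    # circular convolution b * a via the FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):
    # circular correlation a (star) b via the FFT
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

# random keys: autocorrelation ~ delta, cross-correlations ~ 0
A = rng.normal(0.0, 1.0 / np.sqrt(n), size=(pairs, n))
B = rng.normal(0.0, 1.0 / np.sqrt(n), size=(pairs, n))

phi = sum(cconv(B[i], A[i]) for i in range(pairs))   # store all pairs

retrieved = ccorr(A[0], phi)                          # a_0 (star) phi ~ b_0
cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
print(cos(retrieved, B[0]))   # large: the stored output, plus crosstalk noise
print(cos(retrieved, B[1]))   # near 0: an unrelated output
```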

  5. Linear 9.54, fall semester 2014

  6. “Linear” learning: suppose $x_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}^m$, $i = 1, \cdots, N$. Define $(x_1, \cdots, x_N) = X$ and $(y_1, \cdots, y_N) = Y$. Find a linear operator $M$ (e.g., a matrix) such that $MX = Y$. 9.54, fall semester 2014

  7. “Linear” learning: if $X^{-1}$ exists, then $MX = Y \Rightarrow M = Y X^{-1}$. If $X^{-1}$ does not exist, then $MX = Y \Rightarrow M = Y X^{\dagger}$, where the pseudoinverse is the solution of $\min_M \| MX - Y \|_F$ with $\| A \|_F = \left( \sum_{i,j} |a_{i,j}|^2 \right)^{1/2}$, and $X^{\dagger} = (X^T X)^{-1} X^T$ if $X$ is full column rank. 9.54, fall semester 2014
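
A minimal sketch of the pseudoinverse solution, assuming NumPy and synthetic data (the names $X$, $Y$, $M$ follow the slides; `np.linalg.pinv` computes the Moore–Penrose pseudoinverse, covering both rank cases):

```python
# Solve MX = Y in the least-squares sense: M = Y X^+ (slide 7).
import numpy as np

rng = np.random.default_rng(1)
n, m, N = 4, 2, 10                 # input dim, output dim, number of examples

X = rng.normal(size=(n, N))        # columns are the x_i
M_true = rng.normal(size=(m, n))
Y = M_true @ X                     # columns are the y_i

M = Y @ np.linalg.pinv(X)          # minimizes ||MX - Y||_F
print(np.allclose(M @ X, Y))       # True: here the fit is exact
print(np.allclose(M, M_true))      # True: X has full row rank, so M is recovered
```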

  8. “Linear” learning is linear regression: if, e.g., the output $y$ is scalar, then $m = 1$ and $Mx = y \Rightarrow y = m^T x = \sum_i m_i x_i$, with $M = Y X^{-1}$. 9.54, fall semester 2014

  9. Nonlinear 9.54, fall semester 2014

  10. Nonlinear learning: suppose $x_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}^m$, $i = 1, \cdots, N$. Define $(x_1, \cdots, x_N) = X$ and $(y_1, \cdots, y_N) = Y$. Find an operator $N$ such that $N \circ X = Y$. In general impossible, but… assume $N$ is in the class of polynomial mappings of degree $k$ on the vector space $V$ (over the real field), e.g., $N$ has a convergent Taylor series expansion. The Weierstrass theorem ensures approximation of any continuous function.

  11. Nonlinear learning: $Y = L_0 + L_1(X) + L_2(X, X) + \ldots + L_k(X, \ldots, X)$. $f(x)$ is a polynomial with all monomials, as in this 2D example: $y = a_1 x_1 + a_2 x_2 + b_1 x_1^2 + b_{12} x_1 x_2 + \cdots$
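
A sketch of fitting such a polynomial map by linear regression on monomial features, assuming NumPy and a synthetic degree-2 target in 2D (the feature list and target coefficients are illustrative):

```python
# Degree-2 polynomial regression in 2D: linear in the monomial features.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=(200, 2))                    # rows are samples (x1, x2)

def monomials(x):
    # all monomials up to degree 2: [1, x1, x2, x1^2, x1*x2, x2^2]
    x1, x2 = x[:, 0], x[:, 1]
    return np.stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2], axis=1)

# synthetic target with known coefficients
y = 1.0 + 2.0 * x[:, 0] - 1.0 * x[:, 1] + 0.5 * x[:, 0]**2 + 3.0 * x[:, 0] * x[:, 1]

coef, *_ = np.linalg.lstsq(monomials(x), y, rcond=None)
print(np.round(coef, 3))                         # ~[1, 2, -1, 0.5, 3, 0]
```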

  12. Classification and Regression 9.54, fall semester 2014

  13. (figure)

  14. $y = \mathrm{sign}(Mx)$

  15. In our language: is $L_1$ enough?

  16. XOR function: $y = \mathrm{sign}(L_1 x + L_2(x, x)) = \mathrm{sign}(a_1 u_1 + a_2 u_2 + b\, u_1 u_2)$; $\mathrm{sign}(u_1 u_2)$ is in fact enough. This corresponds to a universal, one-hidden-layer network: input variables feed a hidden layer of all monomials, which feeds the output layer.
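
A minimal check of this claim, assuming inputs coded in $\{-1, +1\}$ (the coding choice is an assumption; with it, the single monomial $u_1 u_2$ separates the two classes):

```python
# XOR is not linearly separable, but one second-order monomial solves it.
import numpy as np

U = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])   # inputs in {-1, +1}
y = np.array([1, -1, -1, 1])                         # +1 iff u1 and u2 agree
                                                     # (flip the sign for XOR proper)
pred = np.sign(U[:, 0] * U[:, 1])                    # sign(u1 * u2)
print(np.array_equal(pred, y))                       # True
```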

  17. A few non-standard remarks:
  • Regression is king; Gauss knew everything…
  • Perhaps no need of multiple layers… are 2 layers universal?
  • An interesting junction here: RBFs vs. MLPs

  18. Radial Basis Functions 9.54, fall semester 2014

  19. Nonlinear learning: later we will see that RBF expansions are a good approximation of functions in high dimensions: $\sum_{k=1}^{N} c_k\, e^{-\| x_k - x \|^2}$
  • RBF can be written as a 1-hidden-layer network
  • RBF is a rewriting of our polynomial (infinite radius of convergence): $e^{\| \hat{x}_k - x \|^2} = \sum_{n=0}^{\infty} \frac{\| \hat{x}_k - x \|^{2n}}{n!}$
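
A quick numerical check of the series claim, using a truncated Taylor series of the exponential (the truncation order and test range are illustrative assumptions):

```python
# The exponential series converges for every argument, so the Gaussian unit
# is an (infinite) polynomial in r2 = ||x_k - x||^2.
import numpy as np
from math import factorial

r2 = np.linspace(0.0, 4.0, 9)                        # sample values of r2
series = sum(r2**n / factorial(n) for n in range(30))
print(np.max(np.abs(series - np.exp(r2))))           # tiny: series matches e^{r2}
```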

  20. Memory-based computation: $f(x) = \sum_i c_i\, G(x, x_i) = \sum_i c_i\, e^{-\| x - x_i \|^2 / 2\sigma^2}$. The training set is $(x_1, \cdots, x_N) = X$ and $(y_1, \cdots, y_N) = Y$. Suppose now that $e^{-\| x - x_i \|^2 / 2\sigma^2} \to \delta(x - x_i)$: then it is a memory, a lookup table: $f(x) = y_i$ if $x = x_i$, and $f(x) = 0$ if $x \neq x_i$. 9.54, fall semester 2014
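
A sketch of this limit in 1D, assuming NumPy and interpolation coefficients fit on a toy training set (the target function, grid, and widths are illustrative):

```python
# Gaussian RBF expansion f(x) = sum_i c_i exp(-(x - x_i)^2 / (2 sigma^2)).
# For small sigma the kernel approaches a delta and f becomes a lookup table.
import numpy as np

X = np.linspace(0.0, 1.0, 9)             # stored inputs x_i (0.5 is stored)
Y = np.cos(2 * np.pi * X)                # stored outputs y_i

def rbf_predict(x, sigma):
    G = np.exp(-(X[:, None] - X[None, :])**2 / (2 * sigma**2))
    c = np.linalg.solve(G, Y)            # interpolation: f(x_i) = y_i
    g = np.exp(-(x[:, None] - X[None, :])**2 / (2 * sigma**2))
    return g @ c

x = np.array([0.5, 0.55])                # one stored point, one new point
print(rbf_predict(x, 0.2))               # smooth: both close to cos(2 pi x)
print(rbf_predict(x, 1e-3))              # lookup table: ~[-1, 0]; zero off the grid
```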

  21. Memory-based computation: of course learning is much more than memory, but in this model the difference is between a Gaussian and a delta function. 9.54, fall semester 2014

  22. $f(x) = \sum_i c_i\, G(x, x_i) = \sum_i c_i\, e^{-\| x - x_i \|^2 / 2\sigma^2}$. From Learning-from-Examples to View-based Networks for Object Recognition. [Figure: view-tuned units combined by a $\Sigma$ output unit; response plotted against view angle.] Poggio & Edelman, Nature, 1990.

  23. Recording Sites in Anterior IT (Logothetis, Pauls, and Poggio, 1995)

  24. Garfield 9.54, fall semester 2014

  25. Image Analysis ⇒ Bear (0° view) ⇒ Bear (45° view)

  26. Image Synthesis (unconventional graphics): Θ = 0° view ⇒ Θ = 45° view

  27. (figure)

  28. HyperBF 9.54, fall semester 2014

  29. (figure)

  30. Cartoon male 9.54, fall semester 2014

  31. A toy problem: Gender Classification

  32. Brunelli, Poggio ’91 (IRST, MIT)

  33. An example: HyperBF and gender classification. Some of the geometrical features (white) used in the gender classification experiments.

  34. HyperBF and gender classification Typical stimuli used in the (informal!) psychophysical experiments of gender classification (about 90% correct)

  35. Figure 3: Feature weights for gender classification as computed by the HyperBF networks

  36. Radial Basis Functions and MLPs 9.54, fall semester 2014

  37. Sigmoidal units are radial basis functions (for normalized inputs). Since $\| x - w \|^2 = \| x \|^2 + \| w \|^2 - 2(x \cdot w)$, if $\| x \| = 1$ then $(x \cdot w) = \frac{1 + \| w \|^2 - \| x - w \|^2}{2}$. Consider the MLP units $\sigma(x \cdot w - \theta) = \frac{1}{1 + e^{-(x \cdot w - \theta)}}$; thus $\sigma(w \cdot x + b)$ is a radial function.

  38. Sigmoidal units are radial basis functions (for normalized inputs). The corresponding radial function is $\sigma\!\left( \frac{1 + \| w \|^2 - \| x - w \|^2}{2} - \theta \right)$, a function of $\| x - w \|$ alone.

  39. Sigmoidal units are radial basis functions (for normalized inputs)
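
A quick numerical check of this identity, assuming NumPy (the dimension, weight vector, and threshold are arbitrary choices):

```python
# On the unit sphere, sigma(x . w - theta) depends on x only through ||x - w||.
import numpy as np

rng = np.random.default_rng(4)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

w, theta = rng.normal(size=5), 0.3
x = rng.normal(size=5)
x /= np.linalg.norm(x)                            # normalize: ||x|| = 1

lhs = sigmoid(x @ w - theta)                      # the MLP unit
r2 = np.sum((x - w)**2)                           # ||x - w||^2
rhs = sigmoid((1 + w @ w - r2) / 2 - theta)       # the radial form of slide 38
print(np.isclose(lhs, rhs))                       # True
```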
