Non-parametric regression model: distribution over functions

• Non-parametric regression model
• Distribution over functions
• Fully specified by training data and kernel function
• Output variables are jointly Gaussian
• Covariance given by distance of inputs in kernel space
2/21/12 CSE-571: Probabilistic Robotics

Picture from [Bishop: Pattern Recognition and Machine Learning, 2006]
$$p(x) = \mathcal{N}(\mu, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right)$$

• Outputs are a noisy function of inputs: $y_i = f(x_i) + \varepsilon$
• GP prior: outputs are jointly zero-mean Gaussian: $p(y \mid X) = \mathcal{N}(0, K + \sigma_n^2 I)$
• Covariance given by kernel matrix over inputs:
$$K = \begin{pmatrix} k(x_1, x_1) & \cdots & k(x_1, x_n) \\ k(x_2, x_1) & \ddots & \vdots \\ \vdots & k(x_i, x_i) & \vdots \\ k(x_n, x_1) & \cdots & k(x_n, x_n) \end{pmatrix}, \qquad k(x, x') = \sigma_f^2 \exp\left(-\frac{1}{2} (x - x')^T W (x - x')\right)$$
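The GP prior above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's code: the function name `sq_exp_kernel`, the identity default for `W`, the noise level, and the input grid are all assumptions made for the example.

```python
import numpy as np

def sq_exp_kernel(A, B, sigma_f=1.0, W=None):
    """k(x, x') = sigma_f^2 * exp(-0.5 * (x - x')^T W (x - x')).

    A has shape (n, d), B has shape (m, d); returns the (n, m) kernel matrix.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if W is None:
        W = np.eye(A.shape[1])  # assumption: identity weighting of input dimensions
    diff = A[:, None, :] - B[None, :, :]               # pairwise differences, (n, m, d)
    quad = np.einsum("nmi,ij,nmj->nm", diff, W, diff)  # (x - x')^T W (x - x')
    return sigma_f**2 * np.exp(-0.5 * quad)

rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 25).reshape(-1, 1)  # hypothetical 1-D inputs
sigma_n = 0.1                                  # hypothetical noise standard deviation

K = sq_exp_kernel(X, X)
cov = K + sigma_n**2 * np.eye(len(X))          # K + sigma_n^2 I

# Three draws of the output vector y from the zero-mean GP prior.
y_samples = rng.multivariate_normal(np.zeros(len(X)), cov, size=3)
```

Each row of `y_samples` is one noisy function evaluated at the inputs in `X`; nearby inputs receive similar outputs because their kernel value is close to `sigma_f**2`.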

Pictures from [Bishop: PRML, 2006]

• Training data: $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} = (X, y)$
• Prediction given training samples: $p(y_* \mid x_*, y, X) = \mathcal{N}(\mu, \sigma^2)$
$$\mu = k_*^T \left(K + \sigma_n^2 I\right)^{-1} y$$
$$\sigma^2 = k(x_*, x_*) - k_*^T \left(K + \sigma_n^2 I\right)^{-1} k_*$$
$$k_*[i] = k(x_*, x_i)$$
Recall conditional:
$$p(x_a \mid x_b) = \mathcal{N}\left(\mu_{a|b}, \Sigma_{a|b}\right), \qquad \mu_{a|b} = \mu_a + \Sigma_{ab} \Sigma_{bb}^{-1} (x_b - \mu_b), \qquad \Sigma_{a|b} = \Sigma_{aa} - \Sigma_{ab} \Sigma_{bb}^{-1} \Sigma_{ba}$$
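The predictive mean and variance above can be computed as in the following sketch. It assumes an isotropic squared exponential kernel (that is, W = I / length^2), a Cholesky factorization in place of an explicit inverse, and toy sine data; the names `kernel` and `gp_predict` are made up for the example.

```python
import numpy as np

def kernel(A, B, sigma_f=1.0, length=1.0):
    # Isotropic squared exponential kernel; W = I / length^2 is an assumption.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length**2)

def gp_predict(X, y, X_star, sigma_n=0.1):
    """Return mu and sigma^2 at each row of X_star."""
    K_noisy = kernel(X, X) + sigma_n**2 * np.eye(len(X))
    L = np.linalg.cholesky(K_noisy)                      # K + sigma_n^2 I = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # (K + sigma_n^2 I)^{-1} y
    K_star = kernel(X, X_star)                           # column j holds k_* for test point j
    mu = K_star.T @ alpha                                # k_*^T (K + sigma_n^2 I)^{-1} y
    v = np.linalg.solve(L, K_star)
    # k(x_*, x_*) - k_*^T (K + sigma_n^2 I)^{-1} k_*
    var = np.diag(kernel(X_star, X_star)) - (v**2).sum(axis=0)
    return mu, var

X = np.linspace(0.0, 5.0, 20).reshape(-1, 1)  # hypothetical training inputs
y = np.sin(X[:, 0])                           # hypothetical training targets
X_star = np.array([[2.5], [10.0]])            # one query inside the data, one far away
mu, var = gp_predict(X, y, X_star)
```

The query at 2.5 lies between training points, so its variance is small and its mean tracks the sine. The query at 10.0 is far from all training inputs, so `k_*` is nearly zero and the variance returns to the prior value `k(x_*, x_*)`.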


Pictures from [Bishop: PRML, 2006]
$$k(x, x') = \sigma_f^2 \exp\left(-\frac{1}{2} (x - x')^T W (x - x')\right)$$

• Maximize data log likelihood: $\theta^* = \arg\max_\theta \, p(y \mid X, \theta)$
$$\log p(y \mid X, \theta) = -\frac{1}{2} y^T \left(K + \sigma_n^2 I\right)^{-1} y - \frac{1}{2} \log \left|K + \sigma_n^2 I\right| - \frac{n}{2} \log 2\pi$$
• Compute derivatives wrt. params $\theta = \langle \sigma_n^2, l, \sigma_f^2 \rangle$
• Optimize using conjugate gradient descent
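The log likelihood above is cheap to evaluate once the Cholesky factor is available. The sketch below computes it and, as a simple stand-in for the conjugate gradient optimization named on the slide, compares a handful of candidate length scales on a grid. The isotropic kernel, the toy data, and the fixed values of `sigma_n` and `sigma_f` are assumptions for illustration.

```python
import numpy as np

def kernel(A, B, sigma_f, length):
    # Isotropic squared exponential kernel; W = I / length^2 is an assumption.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length**2)

def log_marginal_likelihood(X, y, sigma_n, length, sigma_f):
    n = len(X)
    K_noisy = kernel(X, X, sigma_f, length) + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(K_noisy)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # (K + sigma_n^2 I)^{-1} y
    log_det = 2.0 * np.log(np.diag(L)).sum()             # log |K + sigma_n^2 I|
    return -0.5 * y @ alpha - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)

rng = np.random.default_rng(1)
X = np.linspace(0.0, 6.0, 40).reshape(-1, 1)         # hypothetical inputs
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)  # hypothetical noisy targets

# Grid search over the length scale only; a gradient-based optimizer would
# adjust sigma_n, the length scale and sigma_f jointly.
lengths = [0.01, 0.1, 0.3, 1.0, 3.0, 10.0]
scores = {l: log_marginal_likelihood(X, y, 0.1, l, 1.0) for l in lengths}
best_length = max(scores, key=scores.get)
```

Very short length scales make the kernel matrix nearly diagonal, so the model treats the data as uncorrelated noise. Very long ones cannot explain the curvature of the data. The likelihood penalizes both.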

• Learn hyperparameters via numerical methods
• Learn noise model at the same time

• System modeling
  – Learn behavior of the system given ground truth states
• Dynamics model
  – Discrete-time model of change over time
• Observation model
  – Mapping between states and observations
• Traditionally use parametric models
  – Derive system equations then learn parameters

[Ferris-Haehnel-Fox: RSS-06]

[Figure panels: Mean, Variance]

System:
• Commercial blimp envelope with custom gondola
• XScale based computer with Bluetooth connectivity
• Two main motors with tail motor (3D control)
Observations:
• Two cameras each operating at 1Hz
• Extract ellipse using computer vision (5D observations)
• Ground truth obtained via VICON motion capture system

[Diagram: states $s_1, s_2, s_3$ linked by controls $c_1, c_2$ and state changes $\Delta s_1, \Delta s_2$, with observations $o_2, o_3$]
• Use ground truth state to extract:
  – Dynamics data: $D_S = \langle [s_1, c_1], \Delta s_1 \rangle, \langle [s_2, c_2], \Delta s_2 \rangle, \ldots$
  – Observation data: $D_O = \langle s_2, o_2 \rangle, \langle s_3, o_3 \rangle, \ldots$
• Learn models using Gaussian process regression
  – Learn process noise inherent in system
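Building the two training sets from a logged trajectory might look like the sketch below. It assumes that row k of each array holds $s_k$, $c_k$ and $o_k$, and that $\Delta s_k = s_{k+1} - s_k$; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def extract_training_data(states, controls, observations):
    """Split a ground truth trajectory into dynamics and observation data.

    states: (T, ds), controls: (T, dc), observations: (T, do).
    """
    states = np.asarray(states, dtype=float)
    controls = np.asarray(controls, dtype=float)
    observations = np.asarray(observations, dtype=float)
    # Dynamics data: input [s_k, c_k], target delta s_k = s_{k+1} - s_k (assumption).
    dyn_inputs = np.hstack([states[:-1], controls[:-1]])
    dyn_targets = states[1:] - states[:-1]
    # Observation data: input s_k, target o_k, starting at the second time step.
    obs_inputs = states[1:]
    obs_targets = observations[1:]
    return (dyn_inputs, dyn_targets), (obs_inputs, obs_targets)

# Toy 1-D trajectory with made-up values.
states = [[0.0], [1.0], [3.0], [6.0]]
controls = [[1.0], [2.0], [3.0], [0.0]]
observations = [[0.0], [2.0], [6.0], [12.0]]
(dyn_X, dyn_y), (obs_X, obs_y) = extract_training_data(states, controls, observations)
```

Each pair of arrays can then be passed to a GP regression routine, one GP per output dimension.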

[Diagram: state $s_1$ and control $c_1$ lead to $s_2$; the parametric prediction $f([s_1, c_1])$ is compared with the true change $\Delta s_1$]
• Combine GP model with parametric model: $D_X = \langle [s_1, c_1], \Delta s_1 - f([s_1, c_1]) \rangle$
• Advantages
  – Captures aspects of system not considered by parametric model
  – Learns noise model in same way as GP-only models
  – Higher accuracy for same amount of training data
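The residual idea can be illustrated with a toy system. In the sketch below, `parametric_model` and `true_delta` are invented stand-ins (they are not the blimp model), and the GP is trained on the difference between the true state change and the parametric prediction, as in $D_X$ above.

```python
import numpy as np

def kernel(A, B, length=1.0):
    # Isotropic squared exponential kernel with unit signal variance (assumption).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_mean(X, y, X_star, sigma_n=0.05):
    # Predictive mean k_*^T (K + sigma_n^2 I)^{-1} y for every row of X_star.
    K_noisy = kernel(X, X) + sigma_n**2 * np.eye(len(X))
    return kernel(X_star, X) @ np.linalg.solve(K_noisy, y)

def parametric_model(inputs):
    # Hypothetical parametric model f([s, c]): predicts delta s = c.
    return inputs[:, 1]

def true_delta(inputs):
    # Hypothetical real system with an effect the parametric model ignores.
    return inputs[:, 1] - 0.3 * np.sin(inputs[:, 0])

rng = np.random.default_rng(2)
train = rng.uniform(-2.0, 2.0, size=(80, 2))                     # rows are [s, c]
residual_targets = true_delta(train) - parametric_model(train)  # delta s - f([s, c])

test = rng.uniform(-1.5, 1.5, size=(50, 2))
param_pred = parametric_model(test)
egp_pred = param_pred + gp_mean(train, residual_targets, test)  # parametric + GP residual

param_err = np.abs(param_pred - true_delta(test)).mean()
egp_err = np.abs(egp_pred - true_delta(test)).mean()
```

Because the GP only has to model what the parametric model misses, its targets are smaller and smoother than the raw state changes, which is one way to read the accuracy claim on the slide.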

Dynamic model error

Propagation method   pos(mm)   rot(deg)   vel(mm/s)   rotvel(deg/s)
Param                3.3       0.5        14.6        1.5
GPonly               1.8       0.2        9.8         1.1
EGP                  1.6       0.2        9.6         1.3

Observation model error

Modeling method   pos(pix)   Major axis(pix)   Minor axis(pix)   Theta(deg)
Param             7.1        2.9               5.7               9.2
GPonly            4.7        3.2               1.9               9.1
EGP               3.9        2.4               1.9               9.4

• 1800 training points, mean error over 900 test points
• For dynamic model, 0.25 sec predictions

[Diagram: controls u(k-1), u(k), u(k+1) drive states s(k-1), s(k), s(k+1), which generate observations z(k-1), z(k), z(k+1); the GP dynamics model supplies mean µ and process noise Q(k) from σ, and the GP observation model supplies mean µ and observation noise R(k) from σ]
• Traditional Bayesian filtering
  – Parametric dynamics and observation models
• GP-BayesFilters
  – GP dynamics and observation models
  – Noise derived from GP prediction uncertainty
  – Can be integrated into Bayes filters: EKF, UKF, PF, ADF

• Learn GP:
  – Input: sequence of ground truth states along with controls and observations: <s, u, z>
  – Learn GPs for dynamics and observation models
• Filters
  – Particle filter: sample from dynamics GP, weigh by Gaussian GP observation function
  – EKF: GP for mean state, GP derivative for linearization
  – UKF: GP for sigma points
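One step of the particle filter variant can be sketched as follows for a 1-D state. The callables `dyn_gp` and `obs_gp` are hypothetical stand-ins that return a predictive mean and variance; here they are simple closed-form functions rather than trained GPs, and the dynamics output is assumed to be a state change that is added to the current state.

```python
import numpy as np

def gp_pf_step(particles, control, z, dyn_gp, obs_gp, rng):
    """One predict, weigh and resample step for a 1-D state."""
    # 1. Sample from the dynamics model: s' = s + delta, delta ~ N(mu, var).
    mu, var = dyn_gp(particles, control)
    particles = particles + mu + np.sqrt(var) * rng.standard_normal(particles.shape)
    # 2. Weigh each particle by the Gaussian observation likelihood N(z; z_mu, z_var).
    z_mu, z_var = obs_gp(particles)
    log_w = -0.5 * (z - z_mu) ** 2 / z_var - 0.5 * np.log(2.0 * np.pi * z_var)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    # 3. Resample particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

def dyn_gp(particles, control):
    # Stand-in for a trained dynamics GP: mean state change and its variance.
    return np.full_like(particles, control), np.full_like(particles, 0.01)

def obs_gp(particles):
    # Stand-in for a trained observation GP: expected observation and its variance.
    return particles.copy(), np.full_like(particles, 0.04)

rng = np.random.default_rng(3)
particles = rng.standard_normal(500)  # initial belief around state 0
true_state = 0.0
for _ in range(5):
    true_state += 1.0                 # the system moves by 1.0 per step
    particles = gp_pf_step(particles, 1.0, true_state, dyn_gp, obs_gp, rng)
estimate = particles.mean()
```

With trained GPs, the variances returned by `dyn_gp` and `obs_gp` would depend on the input, so the filter would spread particles more widely in regions with little training data.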

$$\dot{s} = \frac{d}{dt} \begin{bmatrix} p \\ \xi \\ v \\ \omega \end{bmatrix} = \begin{bmatrix} R_b^e \, v \\ H(\xi) \, \omega \\ M^{-1} \left( \sum \text{Forces} - \omega * M v \right) \\ J^{-1} \left( \sum \text{Torques} - \omega * J \omega \right) \end{bmatrix}$$
• 12-D state = [pos, rot, transvel, rotvel]
• Describes evolution of state as ODE
• Forces / torques considered: buoyancy, gravity, drag, thrust
• 16 parameters are learned by optimization on ground truth motion capture data

Tracking algorithm   pos(mm)      rot(deg)      vel(mm/s)    rotvel(deg/s)   MLL            time(sec)
GP-PF                91 +/- 7     6.4 +/- 1.6   52 +/- 3.7   5.0 +/- .2      9.4 +/- 1.9    449.4 +/- 21
GP-EKF               93 +/- 1     5.2 +/- .1    52 +/- .5    4.6 +/- .1      13.0 +/- .2    .29 +/- .1
GP-UKF               89 +/- 1     4.7 +/- .2    50 +/- .4    4.5 +/- .1      14.9 +/- .5    1.28 +/- .3
ParaPF               115 +/- 5    7.9 +/- .1    64 +/- 1.2   7.6 +/- .1      -4.5 +/- 4.2   30.7 +/- 5.8
ParaEKF              112 +/- 4    8.0 +/- .2    65 +/- 2     7.5 +/- .2      8.4 +/- 1      .21 +/- .1
ParaUKF              111 +/- 4    7.9 +/- .1    64 +/- 1     7.6 +/- .1      10.1 +/- 1     .33 +/- .1

• Blimp tracking using multiple cameras
• Ground truth obtained via Vicon motion tracking system
• Average tracking error
• Trajectory ~12 min long
• 0.5 sec timesteps

[Figure panels: Full process model tracking; No right turn process model tracking]
• Training data for right turns removed
