
Nonlinear Optimization for Optimal Control
Pieter Abbeel, UC Berkeley EECS

[optional] Boyd and Vandenberghe, Convex Optimization, Chapters 9–11
[optional] Betts, Practical Methods for Optimal Control Using Nonlinear Programming

Bellman’s curse of dimensionality
- n-dimensional state space
- Number of states grows exponentially in n (assuming some fixed number of discretization levels per coordinate)
- In practice
  - Discretization is considered computationally feasible only up to 5- or 6-dimensional state spaces, even when using
    - Variable resolution discretization
    - Highly optimized implementations
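To make the exponential growth concrete, here is a minimal sketch that counts grid cells for a uniform discretization. The function name and the choice of 10 levels per coordinate are illustrative assumptions, not values from the slides.

```python
def num_grid_states(levels_per_dim: int, n_dims: int) -> int:
    """Number of cells in a uniform grid with the same resolution on every axis."""
    return levels_per_dim ** n_dims


# Assumed resolution for illustration: 10 discretization levels per coordinate.
for n in range(1, 9):
    print(f"n = {n}: {num_grid_states(10, n):,} states")
```

Under this assumption a 6-dimensional state space already needs a million cells, and every extra dimension multiplies that count by 10.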

This Lecture: Nonlinear Optimization for Optimal Control
- Goal: find a sequence of control inputs (and corresponding sequence of states) that solves a problem of the form:

  $$\min_{u,x} \sum_{t=0}^{T} g(x_t, u_t) \quad \text{s.t.} \quad x_{t+1} = f(x_t, u_t), \;\; u_t \in \mathcal{U}_t, \;\; x_t \in \mathcal{X}_t$$

- Generally hard to do. We will cover methods that find a local minimum of this optimization problem.
- Note: iteratively applying LQR is one way to solve this problem if there were no constraints on the control inputs and state.

Outline
- Unconstrained minimization
  - Gradient Descent
  - Newton’s Method
- Equality constrained minimization
- Inequality and equality constrained minimization

Unconstrained Minimization

  $$\min_x f(x) \qquad (1)$$

- If x* satisfies:

  $$\nabla_x f(x^*) = 0 \qquad (2)$$
  $$\nabla_x^2 f(x^*) \succ 0 \qquad (3)$$

  then x* is a local minimum of f.
- In simple cases we can directly solve the system of equations given by (2) to find candidate local minima, and then verify (3) for these candidates.
- In general, however, solving (2) is a difficult problem. Going forward we will consider this more general setting and cover numerical solution methods for (1).

Steepest Descent
- Idea:
  - Start somewhere
  - Repeat: take a small step in the steepest descent direction

[Figure source: Mathworks]
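As a small worked instance of the "simple case", the sketch below takes a one-dimensional function chosen purely for illustration, f(x) = x^3 - 3x, whose stationarity condition (2) can be solved by hand (x = -1 and x = 1), and then checks the second-order condition (3) for each candidate.

```python
def f_prime(x: float) -> float:
    """First derivative of the illustrative function f(x) = x**3 - 3*x."""
    return 3 * x * x - 3


def f_second(x: float) -> float:
    """Second derivative of f."""
    return 6 * x


# Candidates obtained by solving f'(x) = 0 by hand: x = -1 and x = 1.
candidates = [-1.0, 1.0]

# Keep only candidates that satisfy both the first- and second-order conditions.
local_minima = [x for x in candidates if f_prime(x) == 0 and f_second(x) > 0]
print(local_minima)
```

Only x = 1 passes the second-derivative test; x = -1 is a local maximum.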

Steep Descent n Another example, visualized with contours: Figure source: yihui.name Steepest Descent Algorithm 1. Initialize x 2. Repeat 1. Determine the steepest descent direction ¢ x 2. Line search. Choose a step size t > 0. 3. Update. x := x + t ¢ x. 3. Until stopping criterion is satisfied Page 4 �

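What is the Steepest Descent Direction?
- With respect to the Euclidean norm, the steepest descent direction is the negative gradient: $\Delta x = -\nabla f(x)$.

Stepsize Selection: Exact Line Search

  $$t = \arg\min_{s \ge 0} f(x + s\,\Delta x)$$

- Used when the cost of solving the minimization problem with one variable is low compared to the cost of computing the search direction itself.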
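One case where exact line search is cheap is a quadratic objective. Assuming f(x) = 0.5 x^T Q x - b^T x with Q symmetric positive definite (an assumption made for this sketch), the gradient is g = Q x - b, and setting the derivative of f(x - t g) with respect to t to zero gives t = (g^T g) / (g^T Q g). The helper names below are hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def matvec(Q: Matrix, v: Vector) -> Vector:
    return [dot(row, v) for row in Q]


def exact_step_quadratic(Q: Matrix, g: Vector) -> float:
    """Exact minimizer of t -> f(x - t*g) for f(x) = 0.5 x^T Q x - b^T x."""
    return dot(g, g) / dot(g, matvec(Q, g))


# Illustrative data: Q = diag(1, 10), b = 0, starting point x = (10, 1).
Q = [[1.0, 0.0], [0.0, 10.0]]
x = [10.0, 1.0]
g = matvec(Q, x)                      # gradient at x, since b = 0
t = exact_step_quadratic(Q, g)
x_new = [xi - t * gi for xi, gi in zip(x, g)]
g_new = matvec(Q, x_new)

# At the exact minimizer along the ray, the new gradient is orthogonal
# to the search direction.
print(t, dot(g_new, g))
```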

Stepsize Selection: Backtracking Line Search
- Inexact: step length is chosen to approximately minimize f along the ray $\{x + t\,\Delta x \mid t \ge 0\}$

[Figure: backtracking line search. Source: Boyd and Vandenberghe]
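The slide text does not spell out the backtracking rule, so the sketch below follows the standard form given in Boyd and Vandenberghe: start at t = 1 and shrink t by a factor beta until the sufficient-decrease condition f(x + t dx) <= f(x) + alpha t grad^T dx holds. The parameter values alpha = 0.3 and beta = 0.8 and the test function are assumptions chosen for illustration.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def backtracking_step_size(f: Callable[[Vector], float],
                           x: Vector,
                           grad: Vector,
                           direction: Vector,
                           alpha: float = 0.3,
                           beta: float = 0.8) -> float:
    """Shrink t from 1 until the sufficient-decrease condition is met.

    Assumes `direction` is a descent direction, i.e. dot(grad, direction) < 0.
    """
    t = 1.0
    fx = f(x)
    slope = dot(grad, direction)
    while f([xi + t * di for xi, di in zip(x, direction)]) > fx + alpha * t * slope:
        t *= beta
    return t


# Illustrative objective: f(x) = x1^2 + 10 * x2^2.
def f(x: Vector) -> float:
    return x[0] ** 2 + 10 * x[1] ** 2


x = [1.0, 1.0]
g = [2 * x[0], 20 * x[1]]
dx = [-gi for gi in g]                # gradient descent direction
t = backtracking_step_size(f, x, g, dx)
x_new = [xi + t * di for xi, di in zip(x, dx)]
print(t, f(x), f(x_new))
```

The full step t = 1 overshoots badly on this function, so the loop shrinks it several times before accepting.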

Gradient Descent Method

[Figure source: Boyd and Vandenberghe]

Gradient Descent: Example 1

[Figure source: Boyd and Vandenberghe]

Gradient Descent: Example 2

[Figure source: Boyd and Vandenberghe]

Gradient Descent: Example 3

[Figure source: Boyd and Vandenberghe]

Gradient Descent Convergence

[Figure panels: Condition number = 10; Condition number = 1]

- For a quadratic function, convergence speed depends on the ratio of the highest second derivative to the lowest second derivative (“condition number”)
- In high dimensions, a high (= bad) condition number is almost guaranteed
- Rescaling coordinates (as could happen by simply expressing quantities in different measurement units) results in a different condition number

Outline
- Unconstrained minimization
  - Gradient Descent
  - Newton’s Method
- Equality constrained minimization
- Inequality and equality constrained minimization
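The effect of the condition number can be checked numerically. The sketch below assumes the quadratic f(x) = 0.5 (x1^2 + kappa x2^2), whose condition number is kappa, runs gradient descent with exact line search from the starting point (kappa, 1), and counts iterations until the gradient norm drops below a tolerance. The function, starting point, and tolerance are illustrative choices, not taken from the slides.

```python
import math


def iterations_to_converge(kappa: float, tol: float = 1e-6,
                           max_iter: int = 10_000) -> int:
    """Gradient descent with exact line search on f(x) = 0.5*(x1^2 + kappa*x2^2)."""
    x = [kappa, 1.0]
    for k in range(max_iter):
        g = [x[0], kappa * x[1]]                      # gradient of f
        if math.hypot(g[0], g[1]) < tol:
            return k
        # Exact step for a quadratic: t = (g^T g) / (g^T Q g), Q = diag(1, kappa).
        t = (g[0] ** 2 + g[1] ** 2) / (g[0] ** 2 + kappa * g[1] ** 2)
        x = [x[0] - t * g[0], x[1] - t * g[1]]
    return max_iter


for kappa in (1.0, 10.0, 100.0):
    print(f"condition number {kappa:>5}: {iterations_to_converge(kappa)} iterations")
```

With condition number 1 the contours are circles and a single exact step reaches the minimum; larger condition numbers produce the zig-zag behaviour and need many more iterations.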

Newton’s Method n 2 nd order Taylor Approximation rather than 1 st order: assuming , the minimum of the 2 nd order approximation is achieved at: Figure source: Boyd and Vandenberghe Newton’s Method Figure source: Boyd and Vandenberghe Page 10 �

Affine Invariance n Consider the coordinate transformation y = A x n If running Newton’s method starting from x (0) on f(x) results in x (0) , x (1) , x (2) , … n Then running Newton’s method starting from y (0) = A x (0) on g (y) = f(A -1 y), will result in the sequence y (0) = A x (0) , y (1) = A x (1) , y (2) = A x (2) , … n Exercise: try to prove this. Newton’s method when we don’t have n Issue: now ¢ x nt does not lead to the local minimum of the quadratic approximation --- it simply leads to the point where the gradient of the quadratic approximation is zero, this could be a maximum or a saddle point n Three possible fixes, let be the eigenvalue decomposition. n Fix 1: n Fix 2: n Fix 3: In my experience Fix 2 works best. Page 11 �

Example 1

[Figure panels: gradient descent with backtracking line search; Newton’s method with backtracking line search. Source: Boyd and Vandenberghe]

Example 2

[Figure panels: gradient descent; Newton’s method. Source: Boyd and Vandenberghe]

Larger Version of Example 2

Gradient Descent: Example 3

[Figure source: Boyd and Vandenberghe]

Example 3
- Gradient descent
- Newton’s method (converges in one step if f is a convex quadratic)

Quasi-Newton Methods
- Quasi-Newton methods use an approximation of the Hessian
  - Example 1: Only compute the diagonal entries of the Hessian and set the others to zero. Note that this also simplifies computations done with the Hessian.
  - Example 2: natural gradient (see next slide)
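The diagonal approximation from Example 1 can be sketched as follows. With only the diagonal kept, "inverting" the Hessian reduces to an element-wise division, which is the computational simplification the slide refers to. The test function is an assumed convex quadratic, and the sketch further assumes all diagonal entries are positive.

```python
from typing import Callable, List

Vector = List[float]


def diagonal_newton(grad: Callable[[Vector], Vector],
                    hess_diag: Callable[[Vector], Vector],
                    x0: Vector,
                    iters: int = 60) -> Vector:
    """Iterate x_i := x_i - g_i / H_ii, ignoring off-diagonal Hessian entries.

    Assumes every diagonal entry H_ii is positive.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        d = hess_diag(x)
        x = [xi - gi / di for xi, gi, di in zip(x, g, d)]
    return x


# Illustrative convex quadratic: f(x) = 0.5*(x1^2 + 10*x2^2) + x1*x2 - x1,
# with full Hessian [[1, 1], [1, 10]] and minimizer (10/9, -1/9).
def grad_f(x: Vector) -> Vector:
    return [x[0] + x[1] - 1.0, x[0] + 10.0 * x[1]]


def hess_diag_f(x: Vector) -> Vector:
    return [1.0, 10.0]


x_star = diagonal_newton(grad_f, hess_diag_f, [5.0, -3.0])
print(x_star)
```

Unlike the full Newton step, this no longer converges in one step on a quadratic, because the off-diagonal coupling between x1 and x2 is ignored. On this example it still converges, just over several iterations.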

Natural Gradient n Consider a standard maximum likelihood problem: n Gradient: n Hessian: n Natural gradient only keeps the 2 nd term 1: faster to compute (only gradients needed); 2: guaranteed to be negative definite; 3: found to be superior in some experiments Outline n Unconstrained minimization n Gradient Descent n Newton’s Method n Equality constrained minimization n Inequality and equality constrained minimization Page 15 �

Outline n Unconstrained minimization n Equality constrained minimization n Inequality and equality constrained minimization Page 16 �
