

Optimization-Based Control: Direct Collocation Methods for Trajectory and Policy Optimization. CS 287: Advanced Robotics, Fall 2019. Guest lecture by Igor Mordatch.


  1. Optimization-Based Control: Direct Collocation Methods for Trajectory and Policy Optimization. CS 287: Advanced Robotics, Fall 2019. Guest Lecture: Igor Mordatch

  2. Overview • Previously: locally optimal control (shooting vs. collocation); forward dynamics models and shooting (LQR, DDP) • Today: direct collocation in detail (open-loop and policies); inverse dynamics models; solution methods for collocation problems; optimization with contacts

  3. Outline • Trajectory optimization and direct collocation • Inverse dynamics model • Numerical optimization for collocation • Optimizing dynamics with contact • Collocation methods for policy learning

  4-6. [Figures: shooting vs. collocation]
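
The contrast between shooting and collocation can be made concrete with a small sketch. It is not taken from the lecture: the double-integrator dynamics, the time step `DT`, and the names `step`, `shooting_rollout`, and `collocation_defects` are all assumptions chosen for illustration. In shooting, only the controls are decision variables and the states follow from simulation. In collocation, states and controls are both decision variables and the dynamics appear as equality constraints (defects) that the optimizer must drive to zero.

```python
from typing import List, Tuple

State = Tuple[float, float]  # (position, velocity); hypothetical toy system
DT = 0.1                     # assumed time step


def step(x: State, u: float) -> State:
    """Assumed dynamics: Euler-discretized double integrator."""
    pos, vel = x
    return (pos + DT * vel, vel + DT * u)


def shooting_rollout(x0: State, us: List[float]) -> List[State]:
    """Shooting: controls are the only unknowns; states come from simulation."""
    xs = [x0]
    for u in us:
        xs.append(step(xs[-1], u))
    return xs


def collocation_defects(xs: List[State], us: List[float]) -> List[State]:
    """Collocation: xs and us are both unknowns.

    Returns the residuals x[t+1] - step(x[t], u[t]) that a solver would
    constrain to zero.
    """
    defects = []
    for t, u in enumerate(us):
        pred = step(xs[t], u)
        defects.append((xs[t + 1][0] - pred[0], xs[t + 1][1] - pred[1]))
    return defects


def max_abs(defects: List[State]) -> float:
    return max(abs(d) for pair in defects for d in pair)


us = [1.0, 1.0, -1.0, -1.0]

# A rollout is dynamically consistent by construction, so its defects vanish.
xs_rollout = shooting_rollout((0.0, 0.0), us)
max_defect_rollout = max_abs(collocation_defects(xs_rollout, us))

# A collocation solver may start from any state guess (here a straight line
# in position); the guess violates the dynamics until the solver fixes it.
xs_guess = [(0.25 * t, 0.0) for t in range(len(us) + 1)]
max_defect_guess = max_abs(collocation_defects(xs_guess, us))
```

In a full collocation method these defects would be passed as equality constraints to a nonlinear programming solver together with the cost; the sketch only evaluates them.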


  9. (Recall Natural Gradient from Lec. 6.)

  10. Recall Natural Gradient (Lec. 6). Can you see the commonalities? Natural Gradient: consider a standard maximum likelihood problem, $\max_\theta f(\theta) = \max_\theta \sum_i \log p(x^{(i)}; \theta)$. Gradient: $\nabla f(\theta) = \sum_i \frac{\nabla p(x^{(i)}; \theta)}{p(x^{(i)}; \theta)}$. Hessian: $\nabla^2 f(\theta) = \sum_i \left[ \frac{\nabla^2 p(x^{(i)}; \theta)}{p(x^{(i)}; \theta)} - \left( \nabla \log p(x^{(i)}; \theta) \right) \left( \nabla \log p(x^{(i)}; \theta) \right)^\top \right]$. Natural gradient: only keeps the 2nd term in the Hessian. Benefits: (1) faster to compute (only gradients needed); (2) guaranteed to be negative semidefinite; (3) found to be superior in some experiments; (4) invariant to re-parameterization
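
The two-term split of the Hessian can be checked numerically on a toy model. The sketch below is an illustration, not lecture material: it assumes a one-dimensional Gaussian with unknown mean `theta` and unit variance, and the data and function names are hypothetical. For this model the first term's summand is $(x - \theta)^2 - 1$, and the second term is the negated squared per-sample gradient (the scalar case of the outer product).

```python
from typing import List


def grad_f(xs: List[float], theta: float) -> float:
    """Gradient of f(theta) = sum_i log p(x_i; theta), unit-variance Gaussian."""
    return sum(x - theta for x in xs)


def hessian_first_term(xs: List[float], theta: float) -> float:
    """sum_i (d^2 p / d theta^2) / p; each summand is (x - theta)^2 - 1 here."""
    return sum((x - theta) ** 2 - 1.0 for x in xs)


def hessian_second_term(xs: List[float], theta: float) -> float:
    """-sum_i (d log p / d theta)^2: needs only gradients and is never positive."""
    return -sum((x - theta) ** 2 for x in xs)


xs = [0.5, 1.5, 2.0, 4.0]  # made-up data
theta = 1.0                # current parameter estimate

grad = grad_f(xs, theta)
first = hessian_first_term(xs, theta)
second = hessian_second_term(xs, theta)
full_hessian = first + second  # for this model the exact Hessian is -len(xs)

# Ascent steps: Newton uses the full Hessian, the approximation keeps
# only the second term. Both point toward the maximum-likelihood mean.
newton_step = -grad / full_hessian
approx_step = -grad / second
```

Here the Newton step reaches the sample mean in one move because the log-likelihood is quadratic in `theta`; the second-term-only step is shorter but has the same sign, and it never required a second derivative.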


  12. Direct Trajectory Optimization of Rigid Body Dynamical Systems Through Contact. Posa and Tedrake, 2012
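
For orientation, contact in this style of trajectory optimization is commonly expressed through complementarity constraints, with the contact forces added to the decision variables. The block below is a sketch of that standard form rather than a transcription of the slide; the symbols are assumptions: $q$ is the configuration, $\phi(q)$ the vector of signed gap distances to contact, and $\lambda$ the vector of normal contact forces.

```latex
\begin{aligned}
\phi(q) &\ge 0 && \text{(no interpenetration)} \\
\lambda &\ge 0 && \text{(contact forces can only push)} \\
\phi(q)^\top \lambda &= 0 && \text{(force is nonzero only when the gap is closed)}
\end{aligned}
```

Because the three conditions hold at every time step, the optimizer does not need a contact schedule fixed in advance; which contacts are active is decided as part of the solution.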


  14. Recall from Last Lecture: Optimal Control Approaches • Return open-loop controls $u_0, u_1, \ldots, u_H$, or return a feedback policy (e.g. linear or neural net) • Either can be computed by shooting or by collocation
