SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
Marvin Zhang*, Sharad Vikram*, Laura Smith, Pieter Abbeel, Matthew J Johnson, Sergey Levine
UC Berkeley, UC San Diego, Google

Efficient reinforcement learning from images
Model-free RL: 20 hours for an image-based robotic task, 2 hours for block stacking from states (https://sites.google.com/view/sac-and-applications)
Model-based RL from images: relies on accurate forward prediction, which is difficult
Key idea: structured representation learning to enable accurate modeling with simple models; a model-based method that does not use forward prediction

Preliminary: LQR-FLM
LQR-FLM fits local models for policy improvement, not forward prediction
LQR-FLM has worked on complex robotic systems from states
Levine and Abbeel, “Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics”. NIPS 2014.
Levine*, Finn*, Darrell, and Abbeel, “End-to-End Training of Deep Visuomotor Policies”. JMLR 2016.
Chebotar*, Hausman*, Zhang*, Sukhatme, Schaal, and Levine, “Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning”. ICML 2017.
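To make "fitting a local model" concrete, here is a minimal sketch, assuming a scalar state and action and noise-free data. It fits x_next ≈ a*x + b*u + c by ordinary least squares from transitions collected near the current policy. The function names and the synthetic data are hypothetical and are not taken from the slides or the authors' code; the cited papers fit one such model per time step and use additional machinery that is not shown here.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs], copied so the inputs are not modified.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_local_linear_dynamics(transitions):
    """Least-squares fit of x_next ~ a*x + b*u + c from (x, u, x_next) tuples."""
    # Normal equations (Phi^T Phi) theta = Phi^T y with features phi = [x, u, 1].
    gram = [[0.0] * 3 for _ in range(3)]
    moment = [0.0] * 3
    for x, u, x_next in transitions:
        phi = (x, u, 1.0)
        for i in range(3):
            moment[i] += phi[i] * x_next
            for j in range(3):
                gram[i][j] += phi[i] * phi[j]
    a, b, c = solve_linear_system(gram, moment)
    return a, b, c


def make_transitions(num_samples, seed=0):
    """Synthetic transitions from a hypothetical system x_next = 0.9x + 0.5u + 0.1."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        u = rng.uniform(-1.0, 1.0)
        data.append((x, u, 0.9 * x + 0.5 * u + 0.1))
    return data


a, b, c = fit_local_linear_dynamics(make_transitions(50))
```

The fitted model only has to be accurate around the data it was fit to, which is why it serves policy improvement rather than long-horizon forward prediction.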

Preliminary: LQR-FLM
LQR-FLM fits linear dynamics and quadratic cost models for policy improvement
This works well, even for complex systems, when the state is relatively simple, but it does not work when the state is complex, such as images
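As an illustration of why linear dynamics and a quadratic cost are convenient for policy improvement, the sketch below runs a plain LQR backward pass on a scalar model. It assumes dynamics x_next = a*x + b*u, a stage cost of 0.5*(q*x^2 + r*u^2) and a terminal cost of 0.5*q*x^2; all names and numbers are hypothetical. It is not the full LQR-FLM update: in particular, it omits the constraint that keeps the new policy close to the previous one in the cited work.

```python
def lqr_backward_pass(a, b, q, r, horizon):
    """Return feedback gains k_t for the controller u_t = k_t * x_t, t = 0..horizon-1."""
    value = q  # curvature of the value function at the final state
    gains = []
    for _ in range(horizon):
        # Quadratic expansion of the cost-to-go around the current step.
        q_xx = q + a * value * a
        q_uu = r + b * value * b
        q_ux = b * value * a
        gain = -q_ux / q_uu          # minimizer of the quadratic in u
        value = q_xx + q_ux * gain   # value curvature one step earlier
        gains.append(gain)
    gains.reverse()  # the loop runs backward in time; return gains in time order
    return gains


def rollout_cost(a, b, q, r, gains, x0):
    """Total cost of running u_t = gains[t] * x_t on the same model from x0."""
    x, total = x0, 0.0
    for gain in gains:
        u = gain * x
        total += 0.5 * (q * x * x + r * u * u)
        x = a * x + b * u
    return total + 0.5 * q * x * x


gains = lqr_backward_pass(a=1.0, b=1.0, q=1.0, r=1.0, horizon=30)
lqr_cost = rollout_cost(1.0, 1.0, 1.0, 1.0, gains, x0=1.0)
passive_cost = rollout_cost(1.0, 1.0, 1.0, 1.0, [0.0] * 30, x0=1.0)
```

Because the model is linear and the cost quadratic, each step of the backward pass has a closed-form solution, so the controller is obtained without any iterative optimization over actions.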

Our method: SOLAR
In this work, we enable LQR-FLM for images using structured representation learning

Real robot results
Our method is more efficient than both prior model-free and model-based methods
Block stacking: we can transfer a representation and model to multiple initial arm positions
Mug pushing: we can solve this task from a sparse reward provided by human key presses

Thank you
Poster #34
Paper: https://arxiv.org/abs/1808.09105
Website: https://sites.google.com/view/icml19solar
Blog post: https://bair.berkeley.edu/blog/2019/05/20/solar
Code: https://github.com/sharadmv/parasol
