Learning Symmetric and Low-energy Locomotion, Wenhao Yu, Greg Turk, C. Karen Liu (PowerPoint PPT Presentation)

SLIDE 1

Learning Symmetric and Low-energy Locomotion

Wenhao Yu, Greg Turk, C. Karen Liu, Georgia Institute of Technology

SLIDE 2

SLIDE 3

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3591461/
https://www.youtube.com/watch?v=Wz3beWec7D4

SLIDE 4

[Robotis OP2]

slide-5
SLIDE 5

/45 4

[Yin et al. 2009]

slide-6
SLIDE 6

/45 5

[OptiTrack] [Liu et al. 2015]

slide-7
SLIDE 7

/45 6

No FSM No Mocap

slide-8
SLIDE 8

/45 7

[Tan et al. 2011] [Tan et al. 2014]

slide-9
SLIDE 9

/45 8

[Heess et al. 2017] [Peng et al. 2018]

SLIDE 10

Deep Reinforcement Learning

SLIDE 11

Deep Reinforcement Learning

1.0 m/s to 3.0 m/s

SLIDE 12

Deep Reinforcement Learning / Energy Minimization / Gait Symmetry

SLIDE 13

Deep Reinforcement Learning / Energy Minimization / Gait Symmetry

SLIDE 14

Deep Reinforcement Learning

State:

SLIDE 15

Deep Reinforcement Learning

State:

SLIDE 16

State:

SLIDE 17

State:

SLIDE 18

State: / Action:

SLIDE 19

State: / Action:

SLIDE 20

State: / Action: / Transition:

SLIDE 21

State: / Action: / Transition: / Reward:

SLIDE 22

Markov Decision Process: {State, Action, Transition, Reward}
Control Policy: π
Rollout: [s0, a0, s1, a1, ..., sT]
Reinforcement Learning / Deep Reinforcement Learning: learn the policy π from rollouts
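The rollout [s0, a0, s1, a1, ..., sT] on this slide can be sketched as a simple loop; `env` and `policy` are hypothetical stand-ins for a simulator and a control policy, not the paper's actual API.

```python
# Minimal sketch of collecting a rollout [s0, a0, s1, a1, ..., sT].
# `env` and `policy` are hypothetical stand-ins, not the paper's code.
def collect_rollout(env, policy, max_steps=1000):
    trajectory = []                      # alternating states and actions
    state = env.reset()                  # s0
    for _ in range(max_steps):
        action = policy(state)           # a_t = pi(s_t)
        trajectory += [state, action]
        state, done = env.step(action)   # transition s_{t+1} = T(s_t, a_t)
        if done:
            break
    trajectory.append(state)             # final state s_T
    return trajectory
```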

SLIDE 23

Policy Gradient: update the policy using ∂L_RL(π)/∂π to obtain π_new
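The update on this slide amounts to gradient ascent on the policy parameters. A minimal sketch, where `grad_L_RL` and the learning rate are illustrative placeholders since the slide does not specify the gradient estimator:

```python
import numpy as np

# Sketch of the policy-gradient step: move the policy parameters theta
# along the gradient of the RL objective L_RL to obtain pi_new.
# `grad_L_RL` is a hypothetical placeholder for the estimated gradient.
def policy_gradient_step(theta, grad_L_RL, learning_rate=0.01):
    return theta + learning_rate * grad_L_RL(theta)  # ascent on L_RL
```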

slide-24
SLIDE 24

/45 19

T(·) = R(s, a) = −||v − ¯ v|| −||a||1

−LaternalDeviation

+AliveBonus

−TorsoRotation

∗wv

∗wa

∗wL

∗wT
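The reward on this slide combines velocity tracking, an L1 torque penalty, lateral-deviation and torso-rotation penalties, and an alive bonus. A sketch with hypothetical weights (the slide leaves the weight values unspecified):

```python
import numpy as np

# Sketch of the reward on this slide, with hypothetical weights:
# R(s, a) = -w_v * ||v - v_bar|| - w_a * ||a||_1
#           - w_L * lateral_deviation - w_T * torso_rotation + alive_bonus
def reward(v, v_bar, a, lateral_deviation, torso_rotation,
           w_v=1.0, w_a=0.01, w_L=0.5, w_T=0.5, alive_bonus=1.0):
    velocity_term = -w_v * np.linalg.norm(v - v_bar)  # track target velocity
    energy_term = -w_a * np.abs(a).sum()              # L1 action penalty
    return (velocity_term + energy_term
            - w_L * lateral_deviation
            - w_T * torso_rotation
            + alive_bonus)
```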

SLIDE 25

Deep Reinforcement Learning

SLIDE 26

Deep Reinforcement Learning / Energy Minimization

  • Curriculum Learning
SLIDE 27

Keep Balance, Go Forward, Energy Efficient: all possible gaits
(what we want)

SLIDE 28

Keep Balance, Go Forward, Energy Efficient: all possible gaits
(what we want) vs. Somewhat Energy Efficient (typical RL result)

SLIDE 29

Keep Balance, Go Forward, Energy Efficient
(what we want, within the search space)

SLIDE 30

Keep Balance, Go Forward, Energy Efficient

https://www.youtube.com/watch?v=5BiesFPqYWE
https://www.youtube.com/watch?v=5a21xnaqCzk
https://www.youtube.com/watch?v=LjnZHbFpDxc

SLIDE 31

lateral push walls, waist support, propel forward, treadmill

SLIDE 32

lateral push walls, waist support, propel forward, treadmill

SLIDE 33

With Assistance

SLIDE 34

Iteration i

SLIDE 35

[Plot: propel strength vs. time, starting at (0,0), Iteration i]

SLIDE 36

[Plot: propel strength vs. time, starting at (0,0), Iteration i]

SLIDE 37

[Plot: propel strength vs. time, starting at (0,0), Iteration i+1]

SLIDE 38
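The plots on slides 35 through 37 show the assistive propel force being reduced from one training iteration to the next. A minimal sketch of such a curriculum schedule; the linear decay is an illustrative assumption, not the paper's exact schedule:

```python
# Sketch of a curriculum schedule for the assistive "propel" force:
# assistance starts strong and is reduced as training iterations progress,
# so later policies rely less on the virtual assistant. The linear decay
# here is an assumption for illustration only.
def propel_strength(iteration, num_iterations, max_strength=100.0):
    progress = min(iteration / num_iterations, 1.0)
    return max_strength * (1.0 - progress)  # full help at start, none at end
```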

SLIDE 39

Deep Reinforcement Learning / Energy Minimization / Gait Symmetry

  • Curriculum Learning
SLIDE 40

Flip

SLIDE 41

Flip

SLIDE 42

Flip

[Plot: left hand height and right hand height vs. time]

SLIDE 43

[Plot: left hand height and right hand height vs. time]

SLIDE 44

Asymmetry = ( )

[Plot: left hand height and right hand height vs. time]

SLIDE 45

Reward −= ( )

SLIDE 46

Reward −= ( )
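Slides 44 through 46 measure gait asymmetry from the left and right hand height curves and subtract it from the reward. A sketch of one such measure; the half-cycle shift and the mean-squared mismatch are illustrative assumptions, since the slide's formula is not shown:

```python
import numpy as np

# Sketch of a gait-asymmetry penalty suggested by these slides: compare the
# left-hand height trajectory with the right-hand trajectory shifted by half
# a gait cycle, and subtract the mismatch from the reward. The half-cycle
# shift and mean-squared error are illustrative assumptions.
def asymmetry(left_height, right_height):
    half = len(left_height) // 2
    shifted_right = np.roll(right_height, half)  # offset by half a cycle
    return float(np.mean((left_height - shifted_right) ** 2))
```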

SLIDE 47

SLIDE 48

SLIDE 49

SLIDE 50

symmetric / symmetric

SLIDE 51

Mirror(action_mirror) = action, where action = π(state) and action_mirror = π(Mirror(state))

SLIDE 52

minimize: L_RL
subject to: action = Mirror(action_mirror)

*weights omitted for clarity

SLIDE 53

minimize: L_RL + ||action − Mirror(action_mirror)||_2^2

*weights omitted for clarity
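The penalty form on slide 53 replaces the hard constraint with a squared-error term added to L_RL. A minimal sketch; `policy`, `mirror_state`, and `mirror_action` are hypothetical stand-ins for the policy network and the state/action mirroring operators:

```python
import numpy as np

# Sketch of the mirror-symmetry penalty on this slide: for each state, the
# action on the mirrored state, mirrored back, should match the original
# action. `policy`, `mirror_state`, and `mirror_action` are hypothetical
# stand-ins for the paper's policy network and mirroring operators.
def symmetry_loss(policy, states, mirror_state, mirror_action):
    total = 0.0
    for s in states:
        action = policy(s)                          # pi(state)
        action_mirror = policy(mirror_state(s))     # pi(Mirror(state))
        total += np.sum((action - mirror_action(action_mirror)) ** 2)
    return total  # add this to L_RL (weights omitted for clarity)
```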

SLIDE 54

SLIDE 55

Deep Reinforcement Learning / Energy Minimization / Gait Symmetry / Curriculum Learning

SLIDE 56

Running, Stylized

SLIDE 57

Walking back, Running, Walking

SLIDE 58

Trotting, Galloping

SLIDE 59

Slow, Fast

SLIDE 60

Ball Walking

SLIDE 61

What's Next?

  • Biomechanics-based models
  • Extend to more agile motions such as gymnastics
  • Running on real hardware
SLIDE 62

Thank you!

  • paper link: https://arxiv.org/abs/1801.08093
  • code link: https://github.com/VincentYu68/SymmetryCurriculumLocomotion
  • my website: wenhaoyu.weebly.com
  • email: wenhaoyu@gatech.edu