Generalization via Modularity Deepak Chris Trevor Phillip - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Learning to Control Self-Assembling Morphologies

Generalization via Modularity

Deepak Pathak* Chris Lu* Trevor Darrell Phillip Isola Alyosha Efros

* equal contribution

slide-2
SLIDE 2

How do we train a robot?

slide-3
SLIDE 3
slide-4
SLIDE 4
  • Multiple tasks
  • Expert demonstrations
  • Rewards, labels
slide-5
SLIDE 5
  • Multiple tasks
  • Expert demonstrations
  • Rewards, labels
  • Self-supervision
  • Curious exploration
  • Learning “common sense”
slide-6
SLIDE 6

. . .

slide-7
SLIDE 7

. . .

… even earlier?

slide-8
SLIDE 8

Single to Multicellular

slide-9
SLIDE 9

Single to Multicellular

competition → collaboration

slide-10
SLIDE 10

Single to Multicellular

shared objective

competition → collaboration

slide-11
SLIDE 11

Compositionality has been useful in language …

[Andreas et al. 2016]

slide-12
SLIDE 12

How to implement compositionality in hardware?

slide-13
SLIDE 13

Modular Co-evolution of Control and Morphology

slide-14
SLIDE 14

Modular Co-evolution of Control and Morphology

Cylindrical Limb

slide-15
SLIDE 15

Modular Co-evolution of Control and Morphology

Cylindrical Limb

Configurable Motor Joint

slide-16
SLIDE 16

Modular Co-evolution of Control and Morphology

slide-17
SLIDE 17

Modular Co-evolution of Control and Morphology

slide-18
SLIDE 18

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

slide-19
SLIDE 19

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

slide-20
SLIDE 20

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

Acts as a single agent upon joining

Rewards are shared!

slide-21
SLIDE 21

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

  • Input = Local Sensory State
  • Output = Torques, Link, Unlink

Acts as a single agent upon joining

Rewards are shared!

slide-22
SLIDE 22

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

  • Input = Local Sensory State
  • Output = Torques, Link, Unlink

Acts as a single agent upon joining

Rewards are shared!
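The per-limb interface on this slide (local sensory state in; torques, link, and unlink out; rewards shared once limbs join) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `LimbAction`, `decode_action`, and `shared_rewards`, the thresholding of link/unlink scores at zero, and the choice to share reward by summing over a connected group are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LimbAction:
    torques: List[float]  # torques for the limb's motor joint
    link: bool            # request to attach at the magnetic joint
    unlink: bool          # request to detach


def decode_action(raw: Sequence[float], n_torques: int) -> LimbAction:
    """Split a raw policy output into torques plus link/unlink decisions.

    Assumption: the last two entries are link/unlink scores, thresholded at 0.
    """
    if len(raw) != n_torques + 2:
        raise ValueError("expected n_torques + 2 outputs")
    torques = [max(-1.0, min(1.0, x)) for x in raw[:n_torques]]  # clip to [-1, 1]
    return LimbAction(torques, raw[n_torques] > 0.0, raw[n_torques + 1] > 0.0)


def shared_rewards(limb_rewards: Sequence[float],
                   groups: Sequence[Sequence[int]]) -> List[float]:
    """Give every limb in a connected group the group's summed reward.

    Assumption: sharing is modeled as a sum; a mean would work equally well here.
    """
    out = list(limb_rewards)
    for group in groups:
        total = sum(limb_rewards[i] for i in group)
        for i in group:
            out[i] = total
    return out
```

For example, with limbs 0 and 1 joined and limb 2 on its own, `shared_rewards([1.0, 2.0, 5.0], [[0, 1], [2]])` gives the joined pair the same reward while limb 2 keeps its own.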

slide-23
SLIDE 23

Consider the task of “standing up” …

slide-24
SLIDE 24
slide-25
SLIDE 25

How to learn compositional controllers?

slide-26
SLIDE 26

Idea: Shared policy network across limbs

[Diagram: the limbs drawn as a set of nodes]

slide-27
SLIDE 27

Idea: Shared policy network across limbs

[Diagram: the limbs drawn as a set of nodes; shared policy π_θ; labels: input, output]
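The idea of one policy shared across limbs can be illustrated with a short sketch: a single set of parameters is applied to every limb's local observation, so the number of limbs can change without changing the number of parameters. The linear-plus-tanh policy and the names `make_params`, `policy`, and `act_all` are assumptions for illustration only; the slides do not specify the network architecture.

```python
import math
import random


def make_params(obs_dim, act_dim, seed=0):
    """Create one set of weights (hypothetical stand-in for the shared policy)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
               for _ in range(act_dim)]
    return {"W": weights, "b": [0.0] * act_dim}


def policy(params, obs):
    """Map one limb's local observation to that limb's action."""
    return [math.tanh(sum(w * x for w, x in zip(row, obs)) + b)
            for row, b in zip(params["W"], params["b"])]


def act_all(params, observations):
    """Apply the same parameters to every limb, however many there are."""
    return [policy(params, obs) for obs in observations]
```

Because `act_all` only loops over limbs, the same `params` can drive 4 limbs or 7 limbs; limbs with identical observations receive identical actions.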

slide-28
SLIDE 28

How to adapt when morphology changes?

slide-29
SLIDE 29

How to adapt when morphology changes?

slide-30
SLIDE 30
slide-31
SLIDE 31

Network as reusable LEGO Blocks

slide-32
SLIDE 32

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output]

slide-33
SLIDE 33

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output]

slide-34
SLIDE 34

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output]

Message input and message output have the same dimension
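Why the matching message dimension matters can be shown with a small sketch: if each module consumes a message of the same size it emits, modules can be plugged end to end like LEGO blocks. The single linear-plus-tanh layer and the names `make_module`, `module`, and `run_chain` are hypothetical and chosen only for this example; the zero initial message is also an assumption.

```python
import math
import random


def make_module(obs_dim, msg_dim, act_dim, seed=0):
    """Random weights for one linear+tanh layer (stand-in for the shared policy)."""
    rng = random.Random(seed)
    in_dim, out_dim = obs_dim + msg_dim, act_dim + msg_dim
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return {"W": weights, "obs_dim": obs_dim,
            "act_dim": act_dim, "msg_dim": msg_dim}


def module(params, obs, msg_in):
    """One block: (input, message input) -> (output, message output)."""
    if len(obs) != params["obs_dim"] or len(msg_in) != params["msg_dim"]:
        raise ValueError("unexpected input or message size")
    x = list(obs) + list(msg_in)
    y = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in params["W"]]
    k = params["act_dim"]
    return y[:k], y[k:]  # action, outgoing message (same size as msg_in)


def run_chain(params, observations):
    """Feed each block's outgoing message into the next block."""
    msg = [0.0] * params["msg_dim"]  # assumption: the first block gets zeros
    actions = []
    for obs in observations:
        action, msg = module(params, obs, msg)
        actions.append(action)
    return actions, msg
```

The same `params` work for a chain of 3 blocks or 6 blocks, since the outgoing message always fits the next block's message input.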

slide-35
SLIDE 35

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output]

slide-36
SLIDE 36

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output]

slide-37
SLIDE 37

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output; three copies of π_θ]

slide-38
SLIDE 38

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output; three copies of π_θ]

slide-39
SLIDE 39

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output; three copies of π_θ]

slide-40
SLIDE 40

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output; three copies of π_θ]

cut

slide-41
SLIDE 41

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output; six copies of π_θ]

cut and paste

slide-42
SLIDE 42

Network as reusable LEGO Blocks

[Diagram: shared policy π_θ; labels: input, output, message input, message output; six copies of π_θ]

cut and paste

adaptation by conditioning

slide-43
SLIDE 43

Dynamic Graph Networks
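One way to picture a graph network whose structure changes as limbs link and unlink is the sketch below. It is an assumption-laden illustration, not the method from the talk: `module` is a toy stand-in for the shared policy, messages flow only from children to parents, and child messages are aggregated by summing. The morphology is given as a `parent` map, so linking or unlinking just edits that map while the module stays unchanged.

```python
import math
from typing import Dict, List, Optional, Tuple

MSG_DIM = 2  # hypothetical message size


def module(obs: List[float], msg_in: List[float]) -> Tuple[float, List[float]]:
    """Toy stand-in for the shared policy: returns (torque, message output)."""
    s = sum(obs) + sum(msg_in)
    return math.tanh(s), [math.tanh(s), math.tanh(0.5 * s)]


def bottom_up(parent: Dict[int, Optional[int]],
              obs: Dict[int, List[float]]) -> Dict[int, float]:
    """Run the same module on every limb, passing messages from leaves to roots.

    parent[n] is the limb that n is attached to, or None if n is a root.
    """
    children: Dict[int, List[int]] = {n: [] for n in parent}
    for n, p in parent.items():
        if p is not None:
            children[p].append(n)

    torques: Dict[int, float] = {}

    def visit(n: int) -> List[float]:
        msg_in = [0.0] * MSG_DIM
        for c in children[n]:
            child_msg = visit(c)
            msg_in = [a + b for a, b in zip(msg_in, child_msg)]  # sum child messages
        torques[n], msg_out = module(obs[n], msg_in)
        return msg_out

    for n, p in parent.items():
        if p is None:  # each root starts one traversal
            visit(n)
    return torques
```

With three unattached limbs (`{0: None, 1: None, 2: None}`) each limb acts on its own observation only; after linking them into a chain (`{0: None, 1: 0, 2: 1}`) the root's torque also depends on the messages coming up from limbs 1 and 2.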

slide-44
SLIDE 44
slide-45
SLIDE 45

BTW, basically curriculum learning but in hardware

slide-46
SLIDE 46

How well does it generalize?

slide-47
SLIDE 47
slide-48
SLIDE 48

. . .

slide-49
SLIDE 49

a bit crazy… and totally useless!

. . .

slide-50
SLIDE 50

Self-Assembling Robots in the Real World

[Mark Yim’s Lab at UPenn] [Daniela Rus's Lab at MIT]

Also: [Modular Snake Robot – Howie Choset’s Lab at CMU]

slide-51
SLIDE 51

Thank You!

Code & data at https://people.eecs.berkeley.edu/~pathak/

Poster # 197 …today!! (Multi-agent RL)