Generalization via Modularity



slide-1
SLIDE 1

Learning to Control Self-Assembling Morphologies

Generalization via Modularity

Deepak Pathak* Chris Lu* Trevor Darrell Phillip Isola Alyosha Efros

* equal contribution

NeurIPS 2019

slide-2
SLIDE 2

How do we train a robot?

slide-3
SLIDE 3
slide-4
SLIDE 4
  • Multiple tasks
  • Expert demonstrations
  • Rewards, labels
slide-5
SLIDE 5
  • Multiple tasks
  • Expert demonstrations
  • Rewards, labels
  • Self-supervision
  • Curious exploration
  • Learning “common sense”
slide-6
SLIDE 6

. . .

slide-7
SLIDE 7

. . .

… even earlier?

slide-8
SLIDE 8

Single to Multicellular

slide-9
SLIDE 9

Single to Multicellular

competition

collaboration

slide-10
SLIDE 10

Single to Multicellular

shared objective

competition

collaboration

slide-11
SLIDE 11

Compositionality has been useful in language …

[Andreas et al. 2016]

slide-12
SLIDE 12

How to implement compositionality in hardware?

slide-13
SLIDE 13

Modular Co-evolution of Control and Morphology

slide-14
SLIDE 14

Modular Co-evolution of Control and Morphology

Cylindrical Limb

slide-15
SLIDE 15

Modular Co-evolution of Control and Morphology

Cylindrical Limb

Configurable Motor Joint

slide-16
SLIDE 16

Modular Co-evolution of Control and Morphology

slide-17
SLIDE 17

Modular Co-evolution of Control and Morphology

slide-18
SLIDE 18

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

slide-19
SLIDE 19

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

slide-20
SLIDE 20

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

  • Acts as single agent upon joining
  • Rewards are shared!

slide-21
SLIDE 21

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

  • Input = Local Sensory State
  • Output = Torques, Link, Unlink
  • Acts as single agent upon joining
  • Rewards are shared!

slide-22
SLIDE 22

Modular Co-evolution of Control and Morphology

Potential Magnetic Joint

  • Input = Local Sensory State
  • Output = Torques, Link, Unlink
  • Acts as single agent upon joining
  • Rewards are shared!
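
A minimal sketch of the per-limb interface listed above, in Python. The names `LimbAction` and `shared_rewards` are hypothetical, and summing rewards within a linked group is an assumption: the slide only says that rewards are shared once limbs join.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LimbAction:
    """Output of one limb's controller (hypothetical container)."""
    torques: List[float]  # torques for the limb's motor joint
    link: bool = False    # ask the magnetic joint to attach
    unlink: bool = False  # ask the magnetic joint to detach


def shared_rewards(limb_rewards: List[float], groups: List[List[int]]) -> List[float]:
    """Give every limb in a linked group the same reward.

    Assumption: the shared value is the sum over the group.
    """
    shared = [0.0] * len(limb_rewards)
    for group in groups:
        total = sum(limb_rewards[i] for i in group)
        for i in group:
            shared[i] = total
    return shared
```

Here `groups` lists the indices of limbs that are currently linked, so an unlinked limb appears as a group of one and keeps its own reward.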

slide-23
SLIDE 23

Consider the task of “standing up” …

slide-24
SLIDE 24
slide-25
SLIDE 25

How to learn compositional controllers?

slide-26
SLIDE 26

Idea: Shared policy network across limbs

[Diagram: limb nodes labeled "Node"]

slide-27
SLIDE 27

Idea: Shared policy network across limbs

[Diagram: limb nodes labeled "Node"]

input → π_θ (shared policy) → output
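
A minimal sketch of parameter sharing across limbs, in plain Python. It assumes a single linear layer with a tanh output; the sizes `OBS_DIM` and `ACT_DIM` and the helper names are hypothetical and not taken from the talk. The point is only that one parameter set θ serves every limb, so the number of limbs can change without changing the policy.

```python
import math
import random

OBS_DIM, ACT_DIM = 4, 2  # hypothetical sizes


def init_theta(seed=0):
    """Create one parameter set that every limb will reuse."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(OBS_DIM)]
               for _ in range(ACT_DIM)]
    bias = [0.0] * ACT_DIM
    return weights, bias


def policy(theta, obs):
    """Map one limb's local observation to its action."""
    weights, bias = theta
    return [math.tanh(sum(w * x for w, x in zip(row, obs)) + b)
            for row, b in zip(weights, bias)]


def act_all(theta, limb_observations):
    """Apply the same policy, with the same theta, to every limb."""
    return [policy(theta, obs) for obs in limb_observations]
```

Two limbs with identical local observations receive identical actions under this scheme, which is why the later slides add messages between limbs.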

slide-28
SLIDE 28

How to adapt when morphology changes?

slide-29
SLIDE 29

How to adapt when morphology changes?

slide-30
SLIDE 30

Network as reusable LEGO Blocks

slide-31
SLIDE 31

Network as reusable LEGO Blocks

input → π_θ (shared policy) → output

slide-32
SLIDE 32

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

slide-33
SLIDE 33

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

same dimension

slide-34
SLIDE 34

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

slide-35
SLIDE 35

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

slide-36
SLIDE 36

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

π_θ π_θ π_θ

slide-37
SLIDE 37

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

π_θ π_θ π_θ

slide-38
SLIDE 38

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

π_θ π_θ π_θ

slide-39
SLIDE 39

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

π_θ π_θ π_θ

cut

slide-40
SLIDE 40

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

π_θ π_θ π_θ π_θ π_θ π_θ

cut and paste

slide-41
SLIDE 41

Network as reusable LEGO Blocks

(input, message input) → π_θ (shared policy) → (output, message output)

π_θ π_θ π_θ π_θ π_θ π_θ

cut and paste

adaptation by conditioning

slide-42
SLIDE 42

Dynamic Graph Networks
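
A minimal sketch of how a shared module with a message input and a message output could be wired along a changing limb graph, in plain Python. This is an illustration under stated assumptions, not the implementation from the talk: the graph is taken to be a tree, messages flow from children to parent, child messages are summed, and the module is a single linear layer with tanh. All names and sizes (`init_theta`, `module`, `run`, `OBS_DIM`, `MSG_DIM`, `ACT_DIM`) are hypothetical.

```python
import math
import random

OBS_DIM, MSG_DIM, ACT_DIM = 3, 2, 1  # hypothetical sizes


def init_theta(seed=0):
    """One weight matrix shared by every limb module."""
    rng = random.Random(seed)
    in_dim, out_dim = OBS_DIM + MSG_DIM, ACT_DIM + MSG_DIM
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
            for _ in range(out_dim)]


def module(theta, obs, msg_in):
    """(input, message input) -> (output, message output)."""
    x = obs + msg_in  # list concatenation
    out = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in theta]
    return out[:ACT_DIM], out[ACT_DIM:]


def run(theta, node, children, obs, actions):
    """Evaluate the subtree rooted at `node`; return its message output."""
    msg_in = [0.0] * MSG_DIM  # a leaf receives a zero message
    for child in children.get(node, []):
        child_msg = run(theta, child, children, obs, actions)
        # Assumption: messages from several children are summed.
        msg_in = [a + b for a, b in zip(msg_in, child_msg)]
    action, msg_out = module(theta, obs[node], msg_in)
    actions[node] = action
    return msg_out
```

Because the message input and message output have the same size, changing the `children` map is enough to handle a new morphology: the same `theta` runs on `{0: [1, 2], 2: [3]}` and on `{0: [1]}` without any change to the module.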

slide-43
SLIDE 43

BTW, basically curriculum learning but in hardware

slide-44
SLIDE 44

How well does it generalize?

slide-45
SLIDE 45
slide-46
SLIDE 46

. . .

slide-47
SLIDE 47

A bit crazy… is it even possible in the real world?

. . .

slide-48
SLIDE 48

Self-Assembling Robots in the Real World

[Mark Yim’s Lab at UPenn] [Daniela Rus’s Lab at MIT]

Also: [Modular Snake Robot – Howie Choset’s Lab at CMU]

slide-49
SLIDE 49

Thank You!

Poster #197 Today (Tues)!!

Code & data at https://people.eecs.berkeley.edu/~pathak/