Patch to the Future: Unsupervised Visual Prediction

Feb 15, 2023

SLIDE 1

Patch to the Future: Unsupervised Visual Prediction

Jacob Walker, Abhinav Gupta, Martial Hebert
The Robotics Institute, Carnegie Mellon University

SLIDE 2

Visual Prediction

SLIDE 3

Goal

Both the what and the how

SLIDE 4

Goal

Both the what and the how

SLIDE 5

Goal

Both the what and the how

SLIDE 6

Goal

Both the what and the how

SLIDE 7

Goal

Both the what and the how

SLIDE 8

Goal

Both the what and the how

SLIDE 9

Background

Data-Driven

Yuen et al. 2010

SLIDE 10

Background

Data-Driven

Yuen et al. 2010

SLIDE 11

Background

Data-Driven

Yuen et al. 2010

SLIDE 12

Background

Agent-Centric

Kitani et al. 2012, Koppula et al. 2013, etc.

SLIDE 13

Background

Agent-Centric

Kitani et al. 2012, Koppula et al. 2013, etc.

SLIDE 14

Our Approach

Data-Driven

SLIDE 15

Our Approach

Data-Driven + Agent-Centric

SLIDE 16

Our Approach

Unsupervised

SLIDE 17

Limitations

Domain-Dependent

Train Test

SLIDE 18

Limitations

Goal-Driven

SLIDE 19

Limitations

No Inter-Element Prediction

SLIDE 20

Overview

SLIDE 21

Representation

Singh et al. 2012

SLIDE 22

Action Space

SLIDE 23

Scene Interaction

SLIDE 24

Scene Interaction

SLIDE 25

Scene Interaction

High Low

SLIDE 26

Expected Reward

P(Transition)
Reward(X, Y, C)
E(Reward) = P(Transition) * Reward(X, Y, C)
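
The product on this slide can be sketched in a few lines. This is only an illustration, assuming the reward is stored as a 2D grid over scene locations for a single element type and that one transition probability applies to the whole grid; the function names and the numbers are hypothetical and do not come from the talk.

```python
def expected_reward(p_transition, reward):
    """E(Reward) = P(Transition) * Reward for a single scene location."""
    if not 0.0 <= p_transition <= 1.0:
        raise ValueError("p_transition must lie in [0, 1]")
    return p_transition * reward


def expected_reward_map(p_transition, reward_map):
    """Apply the same product to every cell of a 2D reward grid."""
    return [[expected_reward(p_transition, r) for r in row] for row in reward_map]


# Hypothetical values: one transition with probability 0.5 and a 2x2 reward grid.
reward_map = [[0.5, 1.0],
              [0.0, 0.25]]
result = expected_reward_map(0.5, reward_map)
print(result)
```

Cells with a high reward but an unlikely transition end up with a low expected reward, which is the point of weighting the two terms together.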

SLIDE 27

Planning

SLIDE 28

Planning

SLIDE 29

Planning

SLIDE 30

Planning

SLIDE 31

Planning

SLIDE 32

Planning

SLIDE 33

Planning

SLIDE 34

Planning

SLIDE 35

Planning

SLIDE 36

Planning

SLIDE 37

Planning

SLIDE 38

Planning

SLIDE 39

Planning

SLIDE 40

Planning
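
The transcript names the planning step without describing the planner. As a rough sketch only, the code below assumes a shortest-path search (Dijkstra) over a 4-connected grid, with the cost of entering a cell taken as one minus a hypothetical expected-reward value. The planner choice, the cost definition, the function name `plan_path`, and the grid values are all assumptions made for illustration.

```python
import heapq


def plan_path(cost, start, goal):
    """Dijkstra on a 4-connected grid; cost[r][c] is the non-negative cost of entering a cell."""
    rows, cols = len(cost), len(cost[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return None
    # Walk the predecessor links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path


# Hypothetical expected-reward grid: high along the top row and right column.
expected = [[0.9, 0.9, 0.9],
            [0.1, 0.1, 0.9],
            [0.1, 0.1, 0.9]]
cost = [[1.0 - e for e in row] for row in expected]
path = plan_path(cost, (0, 0), (2, 2))
print(path)
```

With these values the cheapest route follows the high-reward cells along the top and right edges instead of cutting through the low-reward interior.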

SLIDE 41

Training

Transitions Scene Interaction

SLIDE 42

Training

Transitions Scene Interaction

SLIDE 43

Training

Transitions

SLIDE 44

Training

Patch Transitions

SLIDE 45

Training

Transitions Scene Interaction

SLIDE 46

Training

Scene Interaction

SLIDE 47

Training

Scene Interaction

SLIDE 48

Training

Scene Interaction

SLIDE 49

Training

Scene Interaction

SLIDE 50

Training

Scene Interaction

SLIDE 51

Datasets

183 Videos
139 Training
44 Testing
~300 Minutes

SLIDE 52

Qualitative Results

SLIDE 53

Qualitative Results

SLIDE 54

Quantitative Results

SLIDE 55

Quantitative Results

Data-Driven Active Entity

Error (Top 6)   NN + Sift-Flow   Ours
Mean            22.34            14.38
Median          16.68            10.91

SLIDE 56

Quantitative Results

Human-Chosen Active Entity

Error (Top 1)   NN + Sift-Flow   Kitani et al.   Ours
Mean            27.55            37.94           21.55
Median          23.77            30.23           14.98

SLIDE 57

Second Dataset

VIRAT

SLIDE 58

Conclusion

Unsupervised method for prediction
No explicit modeling of semantics
Models appearance changes
Code will be available!

SLIDE 59