SLIDE 1

Learning Image Representations Tied to Ego‐motion

Jayaraman and Grauman, ICCV 2015

Presentation (paper review) by Hilgad Montelo

University of Texas at Austin, Visual Recognition, March 2016

SLIDE 2

Outline

  • The "Kitten Carousel" Experiment
  • Problem
  • Objective
  • Main Idea
  • Related Work
  • Approach
  • Experiments and Results
  • Conclusions
SLIDE 3

The "Kitten Carousel" Experiment (Held & Hein, 1963)

[Figure: active kitten vs. passive kitten]

Key to perceptual development: self‐generated motion + visual feedback

[Slide credit: Dinesh Jayaraman]

SLIDE 4

Problem

  • Today’s visual recognition algorithms learn from a “disembodied” bag of labeled snapshots.

SLIDE 5

Objective

  • Provide a visual recognition algorithm that learns in the context of acting and moving in the world.

SLIDE 6
Main Idea

  • Associate ego‐motion and vision by teaching the computer vision system the connection:

“how I move” + “how my visual surroundings change”

SLIDE 7

Ego‐motion ↔ vision: view prediction

[Figure: given the current view and an ego‐motion, predict the view after moving]

[Slide credit: Dinesh Jayaraman]

SLIDE 8

Ego‐motion ↔ vision for recognition

  • Learning this connection requires:
  • Depth, 3D geometry
  • Semantics
  • Context
  • These are also key to recognition!
  • Can be learned without manual labels!

Approach: unsupervised feature learning using egocentric video + motor signals

SLIDE 9

Related Works

Visual prediction:
  • Doersch, Gupta, Efros, “… context prediction”, ICCV 2015
  • Oh, Guo, Lee, Lewis, Singh, “Action-conditional video …”, NIPS 2015
  • Kulkarni, Whitney, Kohli, Tenenbaum, “… inverse graphics …”, NIPS 2015
  • Vondrick, Pirsiavash, Torralba, “Anticipating the future …”, arXiv 2015

Video for unsupervised image features:
  • Wang, Gupta, “Unsupervised learning of visual …”, ICCV 2015
  • Goroshin, Bruna, Tompson, Eigen, LeCun, “Unsupervised …”, ICCV 2015

Integrating vision and motion:
  • Agrawal, Carreira, Malik, “Learning to see by moving”, ICCV 2015
  • Watter, Springenberg, Boedecker, Riedmiller, “Embed to control …”, NIPS 2015
  • Levine, Finn, Darrell, Abbeel, “… visuomotor policies”, arXiv 2015
  • Konda, Memisevic, “Learning visual odometry …”, VISAPP 2015

SLIDE 10

Ego‐motion equivariance

  • Invariant features: unresponsive to some classes of transformations
  • Equivariant features: predictably responsive to some classes of transformations, through simple mappings (e.g., linear), called “equivariance maps”
  • Invariance discards information; equivariance organizes it
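In symbols (notation mine, following the paper's setup): with z(·) the learned feature map, g an ego‐motion, and M_g a simple (e.g., linear) equivariance map, the goal is

```latex
% Equivariance: features respond predictably to ego-motion
% z(.) : learned feature map, M_g : "equivariance map" for motion g
z(g \circ x) \approx M_g \, z(x)
\quad \text{for all images } x \text{ and ego-motions } g
```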


SLIDE 11

Equivariant embedding organized by ego‐motions

Pairs of frames related by similar ego‐motion should be related by the same feature transformation.

Training data: unlabeled video + motor signals

[Figure: frames over time (time →), with motor signals grouping pairs as left turn, right turn, or forward, from which the feature transformations are learned]

Source: “Learning image representations equivariant to ego motion”, Jayaraman and Grauman, ICCV 2015

SLIDE 12

Approach

  1. Extract training frame pairs from video
  2. Learn ego‐motion‐equivariant image features
  3. Train on the target recognition task in parallel (a runnable sketch of steps 2–3 follows below)
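A minimal runnable sketch of steps 2–3 (my illustration with toy shapes and data, not the authors' code): a shared feature network z, one linear equivariance map M_g per discovered motion cluster, and a softmax classifier, all trained jointly.

```python
import torch
import torch.nn as nn

# Toy sketch: shared feature net z, one linear equivariance map M_g per
# ego-motion cluster, and a softmax classifier, trained jointly.
D, G, C = 64, 3, 10                       # feature dim, motion clusters, classes
z = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, D), nn.ReLU())
M = nn.ModuleList(nn.Linear(D, D, bias=False) for _ in range(G))
clf = nn.Linear(D, C)
opt = torch.optim.Adam(
    [*z.parameters(), *M.parameters(), *clf.parameters()], lr=1e-3)
ce = nn.CrossEntropyLoss()
lam = 1.0                                 # supervised/unsupervised trade-off

# Step 1 output (faked here): frame pairs (xi, xj) with motion labels g,
# plus labeled images (x, y) for the target recognition task.
xi, xj = torch.randn(8, 1, 32, 32), torch.randn(8, 1, 32, 32)
g = torch.randint(0, G, (8,))
x, y = torch.randn(8, 1, 32, 32), torch.randint(0, C, (8,))

for step in range(100):
    zi, zj = z(xi), z(xj)
    # Step 2: equivariance -- M_g z(xi) should predict z(xj).
    mapped = torch.stack([M[int(gi)](zi[k]) for k, gi in enumerate(g)])
    loss = ((mapped - zj) ** 2).sum(dim=1).mean()
    # Step 3: recognition on the same feature space, trained in parallel.
    loss = loss + lam * ce(clf(z(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```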


SLIDE 13

Training frame pair mining

Discovery of ego‐motion clusters

[Figure: motor signals plotted as yaw change vs. forward distance; discovered ego‐motion clusters: forward, right turn, left turn]

[Slide credit: Dinesh Jayaraman]
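One plausible way to discover such clusters (my sketch with toy numbers, not the authors' code): run k-means on each frame pair's motor signal, here (yaw change, forward distance); pairs falling in the same cluster then share a discrete ego‐motion label g.

```python
import numpy as np
from sklearn.cluster import KMeans

# Motor signal for each temporally close frame pair:
# (yaw change in radians, forward distance in meters). Toy values.
motor_signals = np.array([
    [ 0.30, 0.5],   # turning right
    [-0.28, 0.4],   # turning left
    [ 0.01, 2.0],   # driving forward
    [ 0.02, 1.8],
    [ 0.31, 0.6],
    [-0.25, 0.5],
])

# Discover ego-motion clusters (e.g., forward / right turn / left turn).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(motor_signals)

# Each frame pair now carries a discrete motion label g, which selects
# the equivariance map M_g that the pair will train.
print(kmeans.labels_)
```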

SLIDE 14

Ego‐motion equivariant feature learning

  • Desired: for all motions g and all images x, z(g ∘ x) ≈ M_g z(x) (unsupervised training)
  • Given: softmax loss L(z(x), y) for labeled class y (supervised training)
  • Feature space z and equivariance maps M_g are jointly trained

[Slide credit: Dinesh Jayaraman]
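A schematic form of the joint objective (notation mine, simplified; the paper uses a contrastive variant of the equivariance term): an unsupervised term over mined frame pairs (x_i, x_j) related by motion g, plus the supervised softmax loss over labeled images (x, y), traded off by a weight λ.

```latex
% Joint training objective (schematic; notation mine)
% (x_i, x_j, g): frame pair related by ego-motion cluster g
% (x, y): labeled image for the target recognition task
\min_{\theta,\,\{M_g\}}\;
\sum_{(x_i,\,x_j,\,g)} \bigl\| M_g\, z_\theta(x_i) - z_\theta(x_j) \bigr\|^2
\;+\; \lambda \sum_{(x,\,y)} L_{\mathrm{softmax}}\bigl( z_\theta(x),\, y \bigr)
```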

SLIDE 15

Experiments

  • Validation using 3 public datasets: NORB, KITTI, SUN
  • Comparison with different methods: CLSNET, TEMPORAL, DRLIM

SLIDE 16

Results: Recognition

  • Learn from unlabeled car video (KITTI) [Geiger et al., IJRR 2013]
  • Exploit the features for static scene classification (SUN, 397 classes) [Xiao et al., CVPR 2010]

SLIDE 17

Results: Recognition

KITTI ⟶ SUN: do ego‐motion equivariant features improve recognition? (Also evaluated: KITTI ⟶ KITTI and NORB ⟶ NORB.)

[Chart: 397‐class recognition accuracy (%) with 6 labeled training examples per class; reported values 0.25, 0.70, 1.02, 1.21, 1.58, with the proposed equivariant method highest]

Up to 30% accuracy increase over the state of the art!

*Hadsell et al., “Dimensionality Reduction by Learning an Invariant Mapping” (invariance), CVPR 2006
**Mobahi et al., “Deep Learning from Temporal Coherence in Video”, ICML 2009

SLIDE 18
Results: Active recognition

  • Leverage the proposed equivariant embedding to select the next best view for object recognition (a toy sketch follows below)

[Figure: ambiguous views labeled “cup/bowl/pan?”, resolved after moving: cup vs. frying pan]

[Chart: accuracy (%) on NORB data]

[Slide credit: Dinesh Jayaraman]
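A toy sketch of one way the learned embedding could drive view selection (my illustration; the slide states only the idea): use each equivariance map M_g to predict the feature after candidate motion g, then move where the classifier expects to be most confident.

```python
import torch
import torch.nn as nn

D, G, C = 64, 3, 10                  # feature dim, candidate motions, classes
M = nn.ModuleList(nn.Linear(D, D, bias=False) for _ in range(G))  # learned maps
clf = nn.Linear(D, C)                # learned classifier on the embedding

def next_best_view(z_x: torch.Tensor) -> int:
    """Pick the motion whose predicted post-move view looks most confident."""
    confidences = [clf(M_g(z_x)).softmax(-1).max().item() for M_g in M]
    return max(range(G), key=lambda i: confidences[i])

print(next_best_view(torch.randn(D)))  # index of the chosen ego-motion
```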

SLIDE 19

Conclusion and Future Work

  • The paper introduced a new embodied visual feature learning paradigm.
  • Ego‐motion equivariance boosts performance across multiple challenging recognition tasks.

SLIDE 20

Questions

  • Why train on KITTI rather than on some other domain?
  • Why does incorporating DRLIM improve EQUIV? Are there still temporal coherence properties left to be learned?
  • Is it meaningful to compare EQUIV or EQUIV + DRLIM with the other methods with respect to equivariance error?
