Learning from Unlabeled Video. Carl Vondrick, Columbia University


  1. Learning from Unlabeled Video. Carl Vondrick, Columbia University

  2. Survivor Bias of Video Data. Large-scale Video Classification with Convolutional Neural Networks, CVPR 2014.

  5. Felix Warneken, Max Planck Institute

  6. The Oops! dataset

  7. Oops! Predicting Unintentional Action. Epstein, Chen, Vondrick. CVPR 2020. oops.cs.columbia.edu

  9. Learning from unlabeled video

  10. Example Videos

  11. Perceptual Clues: 1) Predictability (Ranzato 2014, Han 2019, …); 2) Temporal Order (Misra 2016, Wei 2018, …)

  12. 3) Video speed as a self-supervised clue. Epstein, Chen, Vondrick. CVPR 2020.

  13. Speed of Action Alters Perceptual Judgement

  15. Visualizing Features. Epstein, Chen, Vondrick. CVPR 2020.

  16. Fit a linear model to classify intentionality. [Figure: points marked + and -.]

  17. What’s missing? [Bar chart: error (lower is better; axis 0 to 25) by failure category, comparing Human, Ours (self-supervised), and Kinetics (supervised). Category labels as extracted: Environmental, Unexpected, Multi-agent, Limited Skill, Planning Error, Single-agent, Execution Error, Limited Visibility, Limited Knowledge.]

  18. oops.cs.columbia.edu. Tuesday 10am PST, Poster 93. Epstein, Chen, Vondrick. CVPR 2020.

  19. Natural Synchronization: Vision, Speech

  20. “I’m going to go in with the actual ackee I rinsed off earlier.” Ackee seems to be: edible; white/yellow; washable; sticky; larger than a cherry tomato.

  21. Word Learning from Vision. VisualBERT, ViLBERT, VideoBERT, LXMERT, …: a Transformer stack predicts “stir” for the masked word in “I turn on the fire and then I [???] the pasta”. Contrast: learn what “stir” means versus learn how to learn what “stir” means.

  22. Learning to Learn Words. Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  23. Transformers as Meta-Learners. Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  24. Transformers as Meta-Learners. Implement with cross-entropy loss. Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  25. Meta-Learning Episodes: New Words Episode, …, Composition Episode. Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  26. Mode 1: Language Modeling. Mode 2: Word Acquisition. Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  27. Language Modeling. [Bar chart: accuracy (axis 0 to 75) for BERT pretrained, BERT + vision, and Meta-Learned. Bar labels: Seen, New, Seen Composition, New Composition. Annotated drops: 18%, 19%.] Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  28. Language Modeling. [Same bar chart with one more annotation. Annotated drops: 11%, 18%, 19%.] Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  29. Word Acquisition. [Figure: a training set of captioned clips and a test example containing a new word. Caption text as extracted: “get avocado still taking skin off stir rice into pan avocado fish with a knife”.] Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  30. Word Acquisition. [Figure: a training set of captioned clips and a test example containing a new word. Caption text as extracted: “open the wash plates switch off oven on oven close oven cupboard with rag the bottom right”.] Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  31. Novel word acquisition. Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  32. Visualizing Learned Process. Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  33. Visualizing Attention. Green boxes impact the green prediction the most. [Figure: training-set and test captions. Training-set text as extracted: “cut cherry tomatoes put spoon close food container … … …”. Test text as extracted: “chop sun-dried rinse container spoon container put spoon tomatoes tomatoes”.] Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  34. expert.cs.columbia.edu. Suris, Epstein, Ji, Chang, Vondrick. arXiv.

  35. Learning from Unlabeled Video. Carl Vondrick, Columbia University
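
The slides name video speed as a self-supervised clue but do not spell out how training examples are built. One plausible reading is that clips are sub-sampled from unlabeled video at different frame strides, and the stride index serves as a free label for a network to predict. The sketch below shows only that sampling step. The stride set `SPEED_STRIDES`, the function name `sample_speed_clip`, and the clip length are all hypothetical choices for illustration, not the authors' implementation.

```python
import random

# Hypothetical playback-speed classes: each class index maps to a frame stride.
SPEED_STRIDES = [1, 2, 4, 8]


def sample_speed_clip(num_frames, clip_len, rng):
    """Pick a random speed class and return (frame_indices, speed_label).

    frame_indices selects clip_len frames from a video with num_frames frames,
    spaced `stride` frames apart, so a larger stride looks like faster playback.
    The label comes from the sampling itself, so no human annotation is needed.
    """
    # Keep only the speed classes whose clip fits inside the video.
    valid = [i for i, stride in enumerate(SPEED_STRIDES)
             if (clip_len - 1) * stride < num_frames]
    if not valid:
        raise ValueError("video too short for any speed class")

    label = rng.choice(valid)
    stride = SPEED_STRIDES[label]
    span = (clip_len - 1) * stride  # distance from first to last sampled frame

    # Random start such that the last sampled frame stays inside the video.
    start = rng.randint(0, num_frames - 1 - span)
    indices = [start + k * stride for k in range(clip_len)]
    return indices, label


rng = random.Random(0)
indices, label = sample_speed_clip(num_frames=300, clip_len=16, rng=rng)
print("speed class:", label, "first frames:", indices[:4])
```

A model trained to predict `label` from the frames at `indices` would yield features that could then be probed with a linear classifier, as in the intentionality slide.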
