
CS 4803 / 7643: Deep Learning


  1. CS 4803 / 7643: Deep Learning
     Website: https://www.cc.gatech.edu/classes/AY2020/cs7643_fall/
     Piazza: https://piazza.com/gatech/fall2019/cs48037643
     Canvas: https://gatech.instructure.com/courses/60374 (4803), https://gatech.instructure.com/courses/60364 (7643)
     Gradescope: https://www.gradescope.com/courses/56799 (4803), https://www.gradescope.com/courses/53817 (7643)
     Dhruv Batra, School of Interactive Computing, Georgia Tech

  2. What are we here to discuss? Some of the most exciting developments in Machine Learning, Vision, NLP, Speech, Robotics & AI in general in the last decade! (C) Dhruv Batra 2

  3. Proxy for public interest

  4. AlphaGo vs Lee Sedol

  5. Outline
     • What is Deep Learning, the field, about?
       – Highlight of some recent projects from my lab
     • What is this class about?
       – What to expect?
       – Logistics
     • FAQ

  6. Outline
     • What is Deep Learning, the field, about?
       – Highlight of some recent projects from my lab
     • What is this class about?
       – What to expect?
       – Logistics
     • FAQ

  7. Demo time: vqa.cloudcv.org, demo.visualdialog.org

  8. Concepts
     Image Credit: https://www.sumologic.com/blog/machine-learning-deep-learning/

  9. What is (general) intelligence?
     • Boring textbook answer: “The ability to acquire and apply knowledge and skills” – Dictionary
     • My favorite: “The ability to navigate in problem space” – Siddhartha Mukherjee, Columbia

  10. What is artificial intelligence?
      • Boring textbook answer: “Intelligence demonstrated by machines” – Wikipedia
      • My favorite: “The science and engineering of making computers behave in ways that, until recently, we thought required human intelligence.” – Andrew Moore, CMU

  11. What is machine learning?
      • My favorite: “Study of algorithms that improve their performance (P) at some task (T) with experience (E)” – Tom Mitchell, CMU

  12. Image Classification
      ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
      • 1000 object classes
      • 1.4M/50k/100k images
      • Person, Dalmatian
      http://image-net.org/challenges/LSVRC/{2010,…,2015}

  13. Image Classification

  14. Tasks are getting bolder
      “A group of young people playing a game of Frisbee”
      Vinyals et al., 2015; Antol et al., 2015; Das et al., 2017

  15. (C) Dhruv Batra 23

  16. Embodied Question Answering [CVPR ’18]
      Georgia Gkioxari (FAIR), Abhishek Das (Georgia Tech), Samyak Datta (Georgia Tech),
      Devi Parikh (Georgia Tech / FAIR), Dhruv Batra (Georgia Tech / FAIR), Stefan Lee (Georgia Tech)

  17. (C) Dhruv Batra 26

  18. Q: What is to the left of the shower? A: Cabinet

  19. PACMAN-RL

  20. PACMAN-RL

  21. So what is Deep (Machine) Learning?
      • Representation Learning
      • Neural Networks
      • Deep Unsupervised/Reinforcement/Structured/<insert-qualifier-here> Learning
      • Simply: Deep Learning

  22. So what is Deep (Machine) Learning?
      A few different ideas:
      • (Hierarchical) Compositionality
        – Cascade of non-linear transformations
        – Multiple layers of representations
      • End-to-End Learning
        – Learning (goal-driven) representations
        – Learning feature extraction
      • Distributed Representations
        – No single neuron “encodes” everything
        – Groups of neurons work together

  23. Traditional Machine Learning
      • VISION: hand-crafted features (SIFT/HOG; fixed) → your favorite classifier (learned) → “car”
      • SPEECH: hand-crafted features (MFCC; fixed) → your favorite classifier (learned) → \ˈd ē p\
      • NLP: “This burrito place is yummy and fun!” → hand-crafted features (Bag-of-words; fixed) → your favorite classifier (learned) → “+”
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
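The NLP row of the traditional pipeline can be sketched in plain Python: a fixed bag-of-words feature extractor followed by a learned classifier. The vocabulary, the four training sentences, and the choice of a perceptron as the "favorite classifier" are all assumptions made for illustration; none of them come from the slides.

```python
# Hypothetical vocabulary; in a real system this would be much larger.
VOCAB = ["yummy", "fun", "bland", "slow"]

def bag_of_words(text):
    """Fixed, hand-crafted feature extractor: word counts over VOCAB."""
    words = text.lower().replace("!", "").split()
    return [words.count(v) for v in VOCAB]

def score(model, text):
    weights, bias = model
    return sum(w * x for w, x in zip(weights, bag_of_words(text))) + bias

def train_perceptron(examples, epochs=10):
    """Learned part of the pipeline: only the classifier weights change."""
    model = ([0.0] * len(VOCAB), 0.0)
    for _ in range(epochs):
        for text, label in examples:  # label is +1 or -1
            if label * score(model, text) <= 0:  # misclassified: update
                weights, bias = model
                x = bag_of_words(text)
                model = ([w + label * xi for w, xi in zip(weights, x)],
                         bias + label)
    return model

def predict(model, text):
    return "+" if score(model, text) > 0 else "-"

# Toy training set (made up for this sketch).
examples = [
    ("yummy food", +1),
    ("so much fun", +1),
    ("bland and slow", -1),
    ("slow service", -1),
]
model = train_perceptron(examples)
print(predict(model, "This burrito place is yummy and fun!"))
```

Note that `bag_of_words` never changes during training; only the classifier on top of it is learned, which is the point the slide makes with its "fixed" and "learned" labels.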

  24. Hierarchical Compositionality
      • VISION: pixels → edge → texton → motif → part → object
      • SPEECH: sample → spectral band → formant → motif → phone → word
      • NLP: character → word → NP/VP/.. → clause → sentence → story
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  25. Building A Complicated Function
      Given a library of simple functions, compose them into a complicated function.
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  26. Building A Complicated Function
      Given a library of simple functions, compose them into a complicated function.
      Idea 1: Linear Combinations
      • Boosting
      • Kernels
      • …
      $f(x) = \sum_i \alpha_i g_i(x)$
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
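As a minimal sketch of Idea 1, the snippet below builds a function as a weighted sum of simple functions. The particular library (`sin`, `cos`, `tanh`) and the weights are hypothetical values picked for illustration; in boosting or kernel methods the weights would be learned from data.

```python
import math

# Hypothetical library of simple functions g_i (illustrative choice).
library = [math.sin, math.cos, math.tanh]

# Hypothetical weights alpha_i; a real method would fit these to data.
alphas = [0.5, -1.0, 2.0]

def linear_combination(x):
    """f(x) = sum_i alpha_i * g_i(x)."""
    return sum(a * g(x) for a, g in zip(alphas, library))

print(linear_combination(0.0))  # 0.5*sin(0) - 1.0*cos(0) + 2.0*tanh(0) = -1.0
```

However many functions are added to the library, the result stays a single weighted sum of them; the functions are never fed into one another.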

  27. Building A Complicated Function
      Given a library of simple functions, compose them into a complicated function.
      Idea 2: Compositions
      • Deep Learning
      • Grammar models
      • Scattering transforms…
      $f(x) = g_1(g_2(\ldots(g_n(x))\ldots))$
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  28. Building A Complicated Function
      Given a library of simple functions, compose them into a complicated function.
      Idea 2: Compositions
      • Deep Learning
      • Grammar models
      • Scattering transforms…
      $f(x) = \log(\cos(\exp(\sin^3(x))))$
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
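The composition example on this slide can be written directly in Python. The `compose` helper and the name `sin_cubed` are introduced here for illustration only; the nested function itself is the one from the slide.

```python
import math
from functools import reduce

def compose(*funcs):
    """Return f(x) = funcs[0](funcs[1](...funcs[-1](x)...))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)

def sin_cubed(x):
    """sin^3(x), the innermost function on the slide."""
    return math.sin(x) ** 3

# f(x) = log(cos(exp(sin^3(x))))
f = compose(math.log, math.cos, math.exp, sin_cubed)

# At x = 0: sin^3(0) = 0, exp(0) = 1, so f(0) = log(cos(1)).
# Caveat: math.log raises ValueError wherever cos(...) is not positive.
print(f(0.0))
```

Each function's output is the next function's input, so adding one more simple function makes the chain one level deeper.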

  29. Deep Learning = Hierarchical Compositionality
      “car”
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  30. Deep Learning = Hierarchical Compositionality
      Low-Level Feature → Mid-Level Feature → High-Level Feature → Trainable Classifier → “car”
      Feature visualization of convolutional net trained on ImageNet from [Zeiler & Fergus 2013]
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  31. So what is Deep (Machine) Learning?
      A few different ideas:
      • (Hierarchical) Compositionality
        – Cascade of non-linear transformations
        – Multiple layers of representations
      • End-to-End Learning
        – Learning (goal-driven) representations
        – Learning feature extraction
      • Distributed Representations
        – No single neuron “encodes” everything
        – Groups of neurons work together

  32. Traditional Machine Learning
      • VISION: hand-crafted features (SIFT/HOG; fixed) → your favorite classifier (learned) → “car”
      • SPEECH: hand-crafted features (MFCC; fixed) → your favorite classifier (learned) → \ˈd ē p\
      • NLP: “This burrito place is yummy and fun!” → hand-crafted features (Bag-of-words; fixed) → your favorite classifier (learned) → “+”
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  33. Feature Engineering
      • SIFT
      • Spin Images
      • HoG
      • Textons
      • and many many more…

  34. Traditional Machine Learning (more accurately)
      The middle stage is “Learned”:
      • VISION: SIFT/HOG (fixed) → K-Means/pooling (unsupervised) → classifier (supervised) → “car”
      • SPEECH: MFCC (fixed) → Mixture of Gaussians (unsupervised) → classifier (supervised) → \ˈd ē p\
      • NLP: “This burrito place is yummy and fun!” → Parse Tree Syntactic (fixed) → n-grams (unsupervised) → classifier (supervised) → “+”
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  35. Deep Learning = End-to-End Learning
      The middle stage is “Learned”:
      • VISION: SIFT/HOG (fixed) → K-Means/pooling (unsupervised) → classifier (supervised) → “car”
      • SPEECH: MFCC (fixed) → Mixture of Gaussians (unsupervised) → classifier (supervised) → \ˈd ē p\
      • NLP: “This burrito place is yummy and fun!” → Parse Tree Syntactic (fixed) → n-grams (unsupervised) → classifier (supervised) → “+”
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  36. “Shallow” vs Deep Learning
      • “Shallow” models: hand-crafted Feature Extractor (fixed) → “Simple” Trainable Classifier (learned)
      • Deep models: Trainable Feature-Transform / Classifier → Trainable Feature-Transform / Classifier → Trainable Feature-Transform / Classifier, with Learned Internal Representations between the stages
      Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
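The structural difference between the two kinds of model can be sketched as a forward pass in plain Python. This is only an illustration of which parts hold trainable parameters: no training loop is shown, and the layer sizes, the `tanh` non-linearity, and the stand-in `handcrafted_features` function are assumptions, not details from the slides.

```python
import math
import random

random.seed(0)

def make_layer(n_in, n_out):
    """A trainable transform: weights and biases that learning would adjust."""
    weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def apply_layer(layer, x):
    """Affine map followed by a non-linearity."""
    weights, biases = layer
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def handcrafted_features(x):
    """Fixed extractor standing in for SIFT/HOG/MFCC; it has no parameters."""
    return [sum(x) / len(x), max(x) - min(x)]

def shallow_model(classifier, x):
    # Only `classifier` is trainable; the features are fixed.
    return apply_layer(classifier, handcrafted_features(x))

def deep_model(layers, x):
    # Every stage is trainable; intermediate values of x are the
    # learned internal representations.
    for layer in layers:
        x = apply_layer(layer, x)
    return x

x = [0.2, -0.4, 0.9, 0.1]
shallow = make_layer(2, 1)
deep = [make_layer(4, 3), make_layer(3, 3), make_layer(3, 1)]
print(shallow_model(shallow, x), deep_model(deep, x))
```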

  37. So what is Deep (Machine) Learning?
      A few different ideas:
      • (Hierarchical) Compositionality
        – Cascade of non-linear transformations
        – Multiple layers of representations
      • End-to-End Learning
        – Learning (goal-driven) representations
        – Learning feature extraction
      • Distributed Representations
        – No single neuron “encodes” everything
        – Groups of neurons work together

  38. Distributed Representations Toy Example
      • Local vs Distributed
      Slide Credit: Moontae Lee

  39. Distributed Representations Toy Example
      • Can we interpret each dimension?
      Slide Credit: Moontae Lee

  40. Power of distributed representations!
      Local vs Distributed
      Slide Credit: Moontae Lee
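The figure for this slide is not reproduced in the text, so the following is one common way to illustrate the contrast rather than a reconstruction of the slide's own example. With the same number of binary units, a local (one-hot) code dedicates one unit to each concept, while a distributed code lets any pattern of active units stand for a concept. The choice of three units is arbitrary.

```python
from itertools import product

n_units = 3  # arbitrary size chosen for this sketch

# Local (one-hot): exactly one unit is active per concept.
local_codes = [tuple(1 if i == j else 0 for i in range(n_units))
               for j in range(n_units)]

# Distributed: every on/off pattern over the units is a usable code.
distributed_codes = list(product([0, 1], repeat=n_units))

print(len(local_codes), len(distributed_codes))  # 3 8
```

The local code grows linearly with the number of units, while the number of distinct distributed patterns grows as 2 to the power of the number of units.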

  41. Power of distributed representations!
      • United States:Dollar :: Mexico:?
      Slide Credit: Moontae Lee
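Analogy queries like this one are often answered with vector arithmetic on word embeddings: take the offset from the first word to the second, add it to the third, and return the nearest remaining word. The sketch below uses hand-picked 2-D vectors that are purely hypothetical; real embeddings are learned from text and have far more dimensions.

```python
import math

# Hypothetical 2-D embeddings, hand-picked for illustration only.
# Dimension 0 loosely encodes "which country", dimension 1 "is a currency".
vectors = {
    "united_states": [1.0, 0.0],
    "dollar": [1.0, 1.0],
    "mexico": [-1.0, 0.0],
    "peso": [-1.0, 1.0],
    "euro": [0.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def analogy(a, b, c):
    """Solve a:b :: c:? by finding the word closest to b - a + c."""
    target = [bi - ai + ci
              for ai, bi, ci in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("united_states", "dollar", "mexico"))  # peso
```

The query works only because the "country to currency" offset is the same direction for both pairs, which is what a distributed representation makes possible.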

  42. ThisPlusThat.me
      Image Credit: http://insightdatascience.com/blog/thisplusthat_a_search_engine_that_lets_you_add_words_as_vectors.html

  43. So what is Deep (Machine) Learning?
      A few different ideas:
      • (Hierarchical) Compositionality
        – Cascade of non-linear transformations
        – Multiple layers of representations
      • End-to-End Learning
        – Learning (goal-driven) representations
        – Learning feature extraction
      • Distributed Representations
        – No single neuron “encodes” everything
        – Groups of neurons work together
