SLIDE 1

DATA ANALYTICS USING DEEP LEARNING

GT 8803 // FALL 2019 // JOY ARULRAJ

Lecture #02: Image Classification

SLIDE 2

Python + NumPy Tutorial

http://cs231n.github.io/python-numpy-tutorial/

SLIDE 3

ASSIGNMENT #0

  • Hand in one page with the following details:

    – Digital picture (ideally 2x2 inches, of your face)
    – Name (last name, first name)
    – Year in school
    – Major field
    – Final degree goal (e.g., B.S., M.S., Ph.D.)
    – Previous education (degrees, institutions)
    – Previous courses
    – More details on Gradescope

SLIDE 4

ASSIGNMENT #0

  • The purpose is to help us:

    – know more about your background, for tailoring the course, and
    – recognize you in class

  • Due Monday, Aug 26

SLIDE 5

LAST CLASS

  • History of Computer Vision
  • Overview of Visual Recognition problems

    – Focus on the Image Classification problem

SLIDE 6

TODAY’S AGENDA

  • Image Classification
  • Nearest Neighbor Classifier
  • Linear Classifier

SLIDE 7

IMAGE CLASSIFICATION

SLIDE 8

Image Classification: A Core CV Task

Given an image and a fixed set of discrete labels, e.g. {dog, cat, truck, plane, ...}, assign one label to the image. For the example image on the slide, the correct output is CAT.

SLIDE 9

The Problem: The “Semantic” Gap

An image is just a big grid of numbers in [0, 255], e.g. 800 x 600 x 3 (3 RGB channels). This grid of numbers is what the computer sees.

SLIDE 10

Challenges: Viewpoint Variation

All pixels change when the camera moves!

SLIDE 11

Challenges: Illumination

SLIDE 12

Challenges: Deformation

SLIDE 13

Challenges: Occlusion

SLIDE 14

Challenges: Background Clutter

SLIDE 15

Challenges: Intraclass Variation

SLIDE 16

Challenges: IMAGE CLASSIFICATION

  • It is hard to appreciate the complexity of this task

    – Because our brains are tuned for dealing with it
    – But it is a fantastically challenging problem for computer programs

  • It is miraculous that a program works at all in practice

    – In fact, it works very close to human accuracy (with certain constraints)

SLIDE 17

An Image Classifier

  • Unlike, e.g., sorting a list of numbers,

    – there is no obvious way to hard-code an algorithm for recognizing a cat or other classes

SLIDE 18

RULE-BASED APPROACH

(Figure: a hand-crafted pipeline: find edges, find corners, then ... ?)

SLIDE 19

CHALLENGES: RULE-BASED APPROACH

  • Challenges

    – Not robust enough to handle different image transformations
    – Does not generalize to other classes (e.g., dogs)

  • We need a robust, scalable approach

SLIDE 20

Data-Driven Approach: MACHINE LEARNING

  1. Collect a dataset of images and labels
  2. Use machine learning to train a classifier
  3. Evaluate the classifier on new images

(Figure: an example training set.)

SLIDE 21

NEAREST NEIGHBOR CLASSIFIER

SLIDE 22

NEAREST NEIGHBOR CLASSIFIER

  • This class is primarily about neural networks

    – But the data-driven approach is more general
    – We will start with a simpler classifier

SLIDE 23

First Classifier: Nearest Neighbor

Train: memorize all data and labels.
Predict: output the label of the most similar training image.

SLIDE 24

Example Dataset: CIFAR-10

  • 10 classes; 50K training and 10K test images

SLIDE 25

Example Dataset: CIFAR-10

(Figure: test images and their nearest neighbors in the training set.)

SLIDE 26

Distance Metric to Compare Images

L1 DISTANCE: compare images pixel by pixel and add up the absolute differences:

    d1(I1, I2) = sum over pixels p of |I1^p - I2^p|
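As a minimal NumPy sketch (array shapes and dtypes are assumptions), the L1 distance between two images can be computed like this:

    import numpy as np

    def l1_distance(img1, img2):
        # Cast to a signed type first: uint8 subtraction would wrap around.
        diff = img1.astype(np.int64) - img2.astype(np.int64)
        # Sum of absolute pixel-wise differences (L1 / Manhattan distance).
        return np.sum(np.abs(diff))

    # Example on two tiny 2x2 "images":
    a = np.array([[56, 32], [10, 18]], dtype=np.uint8)
    b = np.array([[10, 20], [24, 17]], dtype=np.uint8)
    print(l1_distance(a, b))  # 46 + 12 + 14 + 1 = 73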

SLIDE 27

Nearest Neighbor Classifier

SLIDE 28

Nearest Neighbor Classifier

Train step: memorize the training data.

SLIDE 29

Nearest Neighbor Classifier

Predict step: for each test image, find the closest training image and predict the label of that nearest image. (A sketch of the full classifier follows below.)
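A minimal NumPy sketch of the whole classifier, in the spirit of the cs231n tutorial linked on Slide 2 (array shapes are assumptions):

    import numpy as np

    class NearestNeighbor:
        def train(self, X, y):
            # X is N x D, one flattened training image per row;
            # y is a length-N vector of labels. Training is O(1):
            # the classifier simply memorizes the data.
            self.Xtr = X
            self.ytr = y

        def predict(self, X):
            # X is M x D, one flattened test image per row.
            num_test = X.shape[0]
            y_pred = np.zeros(num_test, dtype=self.ytr.dtype)
            for i in range(num_test):
                # L1 distance from test image i to every training image: O(N).
                distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
                nearest = np.argmin(distances)  # index of the closest training image
                y_pred[i] = self.ytr[nearest]   # predict its label
            return y_pred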

SLIDE 30

Nearest Neighbor Classifier

Q: With N examples, how fast are training and prediction?

SLIDE 31

Nearest Neighbor Classifier

Q: With N examples, how fast are training and prediction?
A: Training is O(1); prediction is O(N).

SLIDE 32

Nearest Neighbor Classifier

Q: With N examples, how fast are training and prediction?
A: Training is O(1); prediction is O(N).

This is bad: we want classifiers that are fast at prediction; slow training is OK.

SLIDE 33

Nearest Neighbor Classifier

Many methods exist for fast / approximate nearest neighbor search (beyond the scope of this course!). A good implementation:

https://github.com/facebookresearch/faiss

Johnson et al., “Billion-scale similarity search with GPUs”, arXiv 2017

SLIDE 34

What Do the Decision Regions Look Like?

SLIDE 35

LIMITATIONS

  • Islands

    – e.g., the yellow island within the green cluster

  • Fingers

    – Green region pushing into the blue region
    – Caused by noisy or spurious points

  • Generalization

    – Instead of copying the label from the single nearest neighbor, take a majority vote from the K nearest neighbors (i.e., the K closest points)

SLIDE 36

K-Nearest Neighbors

(Figure: decision regions for K = 1, K = 3, and K = 5. A voting sketch follows below.)
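A minimal sketch of the majority-vote prediction for a single test point (the function name and the use of L1 distance are assumptions; integer labels are assumed so that np.bincount applies):

    import numpy as np

    def knn_predict_one(Xtr, ytr, x, k=5):
        # Xtr is N x D; ytr is a length-N vector of integer labels;
        # x is a single flattened test point of length D.
        distances = np.sum(np.abs(Xtr - x), axis=1)  # L1 distance to every training point
        nearest_k = np.argsort(distances)[:k]        # indices of the k closest points
        votes = ytr[nearest_k]
        return np.bincount(votes).argmax()           # label with the most votes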

SLIDE 37

COMPUTER VISION VIEWPOINTS

  • Whenever we think about computer vision, it is useful to flip between different viewpoints:

    – Geometric viewpoint: points in a high-dimensional space
    – Visual viewpoint: concrete pixels in images
    – Algebraic viewpoint: vectors and matrices

  • Images are high-dimensional vectors

SLIDE 38

What Does It Look Like? (VISUAL VIEWPOINT)

SLIDE 39

What Does It Look Like? (VISUAL VIEWPOINT)

SLIDE 40

K-Nearest Neighbors: Distance Metric

L1 (MANHATTAN) DISTANCE: d1(I1, I2) = sum over pixels p of |I1^p - I2^p|
L2 (EUCLIDEAN) DISTANCE:  d2(I1, I2) = sqrt( sum over pixels p of (I1^p - I2^p)^2 )

SLIDE 41

K-Nearest Neighbors: Distance Metric

(Figure: K = 1 decision boundaries under the L1 (Manhattan) distance vs. the L2 (Euclidean) distance.)

SLIDE 42

K-Nearest Neighbors: DEMO

  • All examples are from an interactive demo:

    – http://vision.stanford.edu/teaching/cs231n-demos/knn/

SLIDE 43

Hyperparameters

  • What is the best value of K to use?
  • What is the best distance metric to use?
  • These are hyperparameters

    – Choices about the algorithm that we set rather than learn directly from the data

SLIDE 44

Hyperparameters

  • What is the best value of K to use?
  • What is the best distance metric to use?
  • These are hyperparameters

    – Choices about the algorithm that we set rather than learn directly from the data
    – Very problem-dependent
    – Must try them all out and see what works best

SLIDE 45

Setting Hyperparameters

Idea #1: Choose hyperparameters that work best on the full dataset.

SLIDE 46

Setting Hyperparameters

Idea #1: Choose hyperparameters that work best on the full dataset.
BAD: K = 1 always works perfectly on training data.

SLIDE 47

Setting Hyperparameters

Idea #1: Choose hyperparameters that work best on the full dataset.
BAD: K = 1 always works perfectly on training data.

Idea #2: Split the data into train and test; choose hyperparameters that work best on the test data.

SLIDE 48

Setting Hyperparameters

Idea #1: Choose hyperparameters that work best on the full dataset.
BAD: K = 1 always works perfectly on training data.

Idea #2: Split the data into train and test; choose hyperparameters that work best on the test data.
BAD: No idea how the algorithm will perform on new data.

SLIDE 49

Setting Hyperparameters

Idea #1: Choose hyperparameters that work best on the full dataset.
BAD: K = 1 always works perfectly on training data.

Idea #2: Split the data into train and test; choose hyperparameters that work best on the test data.
BAD: No idea how the algorithm will perform on new data.

Idea #3: Split the data into train, validation, and test; choose hyperparameters on validation and evaluate on test. Better! (A split sketch follows below.)
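A minimal sketch of Idea #3's split (the function name, split fractions, and shuffling scheme are all assumptions):

    import numpy as np

    def train_val_test_split(X, y, val_frac=0.1, test_frac=0.1, seed=0):
        # Shuffle once, then carve off validation and test sets.
        # The test set should be touched only once, at the very end.
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(X))
        n_test = int(len(X) * test_frac)
        n_val = int(len(X) * val_frac)
        test_idx = idx[:n_test]
        val_idx = idx[n_test:n_test + n_val]
        train_idx = idx[n_test + n_val:]
        return ((X[train_idx], y[train_idx]),
                (X[val_idx], y[val_idx]),
                (X[test_idx], y[test_idx]))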

SLIDE 50

Setting Hyperparameters

Idea #4: Cross-Validation. Split the data into folds, try each fold as the validation set, and average the results.

(Figure: the dataset split into fold 1 through fold 5 plus a held-out test set; each fold takes a turn as the validation set.)

Useful for small datasets, but not used too frequently in deep learning. (A sketch follows below.)
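A minimal sketch of k-fold cross-validation for choosing K (reusing the hypothetical knn_predict_one helper sketched above; the fold handling is an assumption):

    import numpy as np

    def cross_validate_k(X, y, k_choices, num_folds=5):
        X_folds = np.array_split(X, num_folds)
        y_folds = np.array_split(y, num_folds)
        mean_accuracy = {}
        for k in k_choices:
            accs = []
            for i in range(num_folds):
                # Fold i is the validation set; the rest is training data.
                X_val, y_val = X_folds[i], y_folds[i]
                X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
                y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
                preds = np.array([knn_predict_one(X_tr, y_tr, x, k) for x in X_val])
                accs.append(np.mean(preds == y_val))
            mean_accuracy[k] = float(np.mean(accs))  # average accuracy over folds
        return mean_accuracy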

SLIDE 51

Setting Hyperparameters

(Figure: 5-fold cross-validation accuracy as a function of K. Each point is a single outcome; the line goes through the mean, and the bars indicate the standard deviation. It seems that K ≈ 7 works best for this data.)

SLIDE 52

K-NEAREST NEIGHBOR: LIMITATIONS

  • K-Nearest Neighbor is never used on images

    – Very slow at test time
    – Distance metrics on pixels are not informative

(Figure: an original image next to boxed, shifted, and tinted versions; all three modified images have the same L2 distance to the original.)

SLIDE 53

K-NEAREST NEIGHBOR: LIMITATIONS

  • Curse of dimensionality

    – Covering the space densely needs a number of points that is exponential in the dimension: Dimensions = 1, Points = 4; Dimensions = 2, Points = 4^2; Dimensions = 3, Points = 4^3


SLIDE 55

K-NEAREST NEIGHBOR: SUMMARY

  • In image classification, we start with a training set of images and labels, and must predict labels on the test set
  • The K-Nearest Neighbor classifier predicts labels based on the nearest training examples

    – The distance metric and K are hyperparameters
    – Choose hyperparameters using the validation set; only run on the test set once, at the very end!

SLIDE 56

LINEAR CLASSIFIER

SLIDE 57

NEURAL NETWORK: LEGO BLOCKS

Linear classifiers are the Lego-like building blocks from which neural networks are assembled.

SLIDE 58

NEURAL NETWORK: LEGO BLOCKS

Example: CNN + RNN image captioning, producing captions such as “Two young girls are playing with a Lego toy.”

SLIDE 59

Recall: CIFAR-10 DATASET

10 classes; 50K training images; 10K test images. Each image is 32x32x3.

SLIDE 60

Parametric Approach

Image x: an array of 32x32x3 numbers (3072 numbers total)
f(x, W): outputs 10 numbers giving class scores; the highest score maps to the predicted class
W: the parameters, or weights

SLIDE 61

PARAMETRIC APPROACH

  • In the K-Nearest Neighbor classifier,

    – we use the training data during prediction

  • With a parametric approach,

    – we summarize our knowledge of the training data in the parameters
    – at test time, we can discard the training data, since only the parameters are needed
    – deep learning is all about coming up with the right structure for the parametric function f()

SLIDE 62

Parametric Approach: Linear Classifier

f(x, W) = Wx

Image x: an array of 32x32x3 numbers (3072 numbers total). W: the parameters, or weights. Output: 10 numbers giving class scores.

SLIDE 63

Parametric Approach: Linear Classifier

f(x, W) = Wx

Shapes: the score vector is 10x1, W is 10x3072, and x is 3072x1.

SLIDE 64

Parametric Approach: Linear Classifier

f(x, W) = Wx + b

Shapes: the score vector is 10x1, W is 10x3072, x is 3072x1, and the bias b is 10x1. (A sketch follows below.)
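A minimal NumPy sketch of this score function (the random initialization is purely illustrative):

    import numpy as np

    def linear_scores(x, W, b):
        # f(x, W) = Wx + b: one score per class.
        # x: flattened image, shape (3072,); W: (10, 3072); b: (10,).
        return W.dot(x) + b

    rng = np.random.default_rng(0)
    x = rng.random(3072)               # a flattened 32x32x3 image
    W = rng.random((10, 3072)) * 0.01  # illustrative weights
    b = np.zeros(10)                   # illustrative biases
    scores = linear_scores(x, W, b)
    print(scores.shape)    # (10,)
    print(scores.argmax()) # index of the highest-scoring (predicted) class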

SLIDE 65

INTERPRETATION: ALGEBRAIC VIEWPOINT

Example: an image with 4 pixels and 3 classes (cat/dog/ship). The input image's pixel values (56, 231, 24, 2) are stretched into a single column vector.

SLIDE 66

INTERPRETATION: ALGEBRAIC VIEWPOINT

Image with 4 pixels and 3 classes (cat/dog/ship). Stretching the input pixels (56, 231, 24, 2) into a column x, the scores are Wx + b:

    W = [ 0.2  -0.5   0.1   2.0 ]   b = [  1.1 ]
        [ 1.5   1.3   2.1   0.0 ]       [  3.2 ]
        [ 0.0   0.25  0.2  -0.3 ]       [ -1.2 ]

    Scores: cat = -96.8, dog = 437.9, ship = 61.95

SLIDE 67

INTERPRETATION: ALGEBRAIC VIEWPOINT

The same example as a single equation, f(x, W) = Wx + b: the 3x4 weight matrix W times the 4x1 input, plus the 3x1 bias b, yields the score vector (-96.8, 437.9, 61.95). (A worked sketch follows below.)
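The same toy computation in NumPy (the leading 0.0 in the ship row of W is an assumption; that entry is not legible in the source):

    import numpy as np

    W = np.array([[0.2, -0.5,  0.1,  2.0],   # cat template
                  [1.5,  1.3,  2.1,  0.0],   # dog template
                  [0.0,  0.25, 0.2, -0.3]])  # ship template (leading 0.0 assumed)
    b = np.array([1.1, 3.2, -1.2])
    x = np.array([56, 231, 24, 2])           # 4 pixel values, stretched into a column

    print(W.dot(x) + b)                      # one score per class: cat, dog, ship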

SLIDE 68

INTERPRETATION: VISUAL VIEWPOINT

SLIDE 69

VISUAL VIEWPOINT

SLIDE 70

VISUAL VIEWPOINT

  • Each row of the weight matrix can be unraveled into an image that serves as a template for that class

    – The problem is that the linear classifier learns only one template per class
    – It averages out variations within the class
    – Example: a two-headed horse template

SLIDE 71

VISUAL VIEWPOINT

  • Neural networks can achieve better accuracy than a linear classifier because they can learn multiple templates per class

SLIDE 72

Geometric Viewpoint

f(x, W) = Wx + b

The input is an array of 32x32x3 numbers (3072 numbers total), i.e., a point in a 3072-dimensional space.

SLIDE 73

HARD CASES FOR A LINEAR CLASSIFIER

  • Class 1: first and third quadrants; Class 2: second and fourth quadrants
  • Class 1: 1 <= L2 norm <= 2; Class 2: everything else
  • Class 1: three modes; Class 2: everything else

SLIDE 74

Linear Classifier: Three Viewpoints

f(x, W) = Wx

  • Algebraic viewpoint: a matrix-vector product
  • Visual viewpoint: one template per class
  • Geometric viewpoint: hyperplanes cutting up space

SLIDE 75

So Far: Defined a (Linear) Score Function

f(x, W) = Wx + b

Example class scores for 3 images for some W (one line of 10 class scores per image):

    Image 1: -3.45, -8.87, 0.09, 2.9, 4.48, 8.02, 3.78, 1.06, -0.36, -0.72
    Image 2: -0.51, 6.04, 5.31, -4.22, -4.19, 3.58, 4.49, -4.37, -2.09, -2.93
    Image 3: 3.42, 4.64, 2.65, 5.1, 2.64, 5.55, -4.34, -1.5, -4.79, 6.14

How can we tell whether this W is good or bad?

SLIDE 76

PARTING THOUGHTS

  • Image classification is a core vision task

    – Nearest neighbor is a non-parametric approach that works well for non-visual data
    – The linear classifier is a parametric approach that works well for visual data
    – It is useful to flip between different viewpoints to interpret a given classifier

SLIDE 77

NEXT WEEK

f(x, W) = Wx + b

Coming up:

  • Loss function (quantifying what it means to have a “good” W)
  • Optimization (start with a random W and find a W that minimizes the loss)
  • ConvNets! (tweak the functional form of f)