DATA ANALYTICS USING DEEP LEARNING GT 8803 // FALL 2019 // JOY - - PowerPoint PPT Presentation

SLIDE 1

DATA ANALYTICS USING DEEP LEARNING

GT 8803 // FALL 2019 // JOY ARULRAJ

LECTURE #01: COURSE INTRODUCTION

SLIDE 2

GT 8803 // Fall 2019

WELCOME TO 8803-DDL

  • This is a cross-cutting course!
  • Gain a holistic understanding of three areas
    – Data Analytics
    – Machine Learning
    – Computer Vision
  • Bridge the gap between systems and machine learning

SLIDE 3

CREDITS

  • This course is derived from two courses
  • Convolutional Neural Networks for Visual Recognition
    – Fei-Fei Li, Andrej Karpathy, and Justin Johnson
    – http://cs231n.stanford.edu/
  • Advanced Database Systems
    – Andy Pavlo
    – https://15721.courses.cs.cmu.edu/

SLIDE 4

TODAY’S AGENDA

  • Course Overview
  • Course Objectives
  • Course Logistics
  • History of Computer Vision
  • Visual Recognition Overview

SLIDE 5

COURSE OVERVIEW

SLIDE 6

BIG DATA & DATA SCIENCE ERA

  • Visual data is the biggest Big Data out there
    – Millions of images uploaded every day
    – Hours of video uploaded every minute

SLIDE 7

NEXT-GENERATION APPS

  • Apps will focus on visual data

[Figure panels: SELF-DRIVING CARS, SPORTS ANALYTICS]

SLIDE 8

CHALLENGES: TRADITIONAL DATABASE SYSTEMS

  • Traditional database systems only support structured data

EMPLOYEE

ID    NAME    AGE   SALARY
101   PETER   25    100K
102   JOHN    20    80K
103   MARK    30    120K

SLIDE 9

WHY IS THIS IMPORTANT NOW?

  • Modern computer vision techniques have made great strides
    – Near-human levels of accuracy for several visual data analytics tasks

SLIDE 10

EXAMPLE: IMAGE CLASSIFICATION

22K categories and 15M images

www.image-net.org

Deng, Dong, Socher, Li, Li, & Fei-Fei, 2009

Animals

  • Bird
  • Fish
  • Mammal
  • Invertebrate

Plants

  • Tree
  • Flower
  • Food
  • Materials
  • Structures
  • Artifact
  • Tools
  • Appliances
  • Structures
  • Person
  • Scenes
  • Indoor
  • Geological Formations
  • Sport Activities

SLIDE 11

EXAMPLE: IMAGE CLASSIFICATION

www.image-net.org

  • OUTPUT: Scale, T-shirt, Steel drum, Drumstick, Mud turtle
  • OUTPUT: Scale, T-shirt, Giant Panda, Drumstick, Mud turtle

SLIDE 12

EXAMPLE: IMAGE CLASSIFICATION

www.image-net.org

Russakovsky et al., IJCV 2015

SLIDE 13

CHALLENGES: DEEP LEARNING MODELS

  • Computational Efficiency
  • Usability

SLIDE 14

CHALLENGES: COMPUTER VISION PIPELINES

  • Computational Efficiency
    – These pipelines are computationally infeasible at scale
    – Example: State-of-the-art object detection models (e.g., Mask R-CNN) run at 3 frames per second (fps)
    – It would take 8 decades of GPU time to process a month of video from 100 cameras
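The "8 decades" figure can be sanity-checked with a short calculation. The sketch below assumes the cameras record at 30 fps and that a month is 30 days; neither number appears on the slide, so treat them as illustrative assumptions.

```python
# Back-of-the-envelope check of the GPU-time estimate.
CAMERAS = 100
DAYS = 30            # assumption: one month = 30 days
CAPTURE_FPS = 30     # assumption: recording frame rate, not stated on the slide
MODEL_FPS = 3        # detection throughput quoted on the slide

SECONDS_PER_DAY = 24 * 60 * 60

# Total number of frames recorded by all cameras over the month.
total_frames = CAMERAS * DAYS * SECONDS_PER_DAY * CAPTURE_FPS

# GPU time needed if a single GPU processes MODEL_FPS frames per second.
gpu_seconds = total_frames / MODEL_FPS
gpu_years = gpu_seconds / (365 * SECONDS_PER_DAY)

print(f"{total_frames:,} frames -> {gpu_years:.1f} GPU-years")
```

Under these assumptions the result is roughly 82 GPU-years, which is consistent with the slide's estimate of about eight decades.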

SLIDE 15

CHALLENGES: COMPUTER VISION PIPELINES

  • Usability
    – These techniques require complex, imperative programming across many low-level libraries (e.g., PyTorch and OpenCV)
    – This is an ad-hoc, tedious process that ignores the opportunity for cross-operator optimization
    – Traditional database systems were successful due to their ease of use (i.e., SQL is declarative)
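To make the imperative versus declarative contrast concrete, here is a minimal sketch that answers the same question both ways, using the employee rows from the structured-data slide. The table name `employee`, the column names, the reading of "100K" as 100,000, and the 90,000 salary threshold are all assumptions chosen for illustration.

```python
import sqlite3

# Rows from the EMPLOYEE example; salaries such as "100K" are read as 100,000.
rows = [
    (101, "PETER", 25, 100_000),
    (102, "JOHN", 20, 80_000),
    (103, "MARK", 30, 120_000),
]

# Imperative style: we spell out HOW to scan, filter, and sort.
imperative = []
for emp_id, name, age, salary in rows:
    if salary > 90_000:
        imperative.append(name)
imperative.sort()

# Declarative style: we state WHAT we want and let the database decide how.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, salary INTEGER)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)
cursor = conn.execute("SELECT name FROM employee WHERE salary > 90000 ORDER BY name")
declarative = [name for (name,) in cursor]
conn.close()

print(imperative, declarative)
```

Both versions return the same names, but only the SQL version leaves the execution strategy (scan order, index use, and so on) to the system, which is where a query optimizer gets its room to work.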

SLIDE 16

GOAL: DECLARATIVE VIDEO ANALYTICS SYSTEM

SLIDE 17

COURSE OBJECTIVES

SLIDE 18

WHY SHOULD YOU TAKE THIS COURSE?

  • There are many challenging problems in database systems & machine learning
  • Systems + ML developers are in demand
  • If you are good enough to write code for an ML-driven data analytics system, then you can write code for almost anything else

SLIDE 19

COURSE OBJECTIVES

  • Learn about cutting-edge research topics in data analytics and deep learning
  • Learn about modern practices in systems programming and deep learning
  • We will cover state-of-the-art topics
  • This is not a course on classical database systems

SLIDE 20

PRE-REQUISITES

  • Proficiency in Python and some high-level familiarity with C++
    – All assignments will be in Python, but some of the deep learning libraries we may look at later in the class are written in C++
    – A Python tutorial is available on the course website
  • Calculus, Linear Algebra
  • Basic Probability and Statistics

SLIDE 21

PRE-REQUISITES

  • Fundamentals of Machine Learning
    – We will be formulating cost functions, taking derivatives, and performing optimization with gradient descent
  • I am happy to have people from different backgrounds
    – But talk to me if you’re not sure
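As a gauge of the expected machine learning background, the sketch below formulates a cost function, derives its gradient by hand, and minimizes it with gradient descent. The toy data (points on y = 2x + 1), the learning rate, and the step count are illustrative assumptions, not course material.

```python
# Fit y = w*x + b to toy data by minimizing mean squared error (MSE).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1


def cost(w, b):
    """Cost function: mean squared error of the linear model."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b):
    """Partial derivatives of the MSE with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


w, b = 0.0, 0.0
learning_rate = 0.05  # assumption: small enough for this data to converge

for step in range(2000):
    dw, db = gradients(w, b)
    # Step against the gradient to reduce the cost.
    w -= learning_rate * dw
    b -= learning_rate * db

print(f"w={w:.4f}, b={b:.4f}, cost={cost(w, b):.2e}")
```

If each of these three pieces (the cost, its derivatives, and the update rule) looks familiar, the machine learning prerequisite is probably covered.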

SLIDE 22

COURSE LOGISTICS

SLIDE 23

COURSE LOGISTICS

  • Office: Klaus 3324
  • Online discussion through Piazza:
    – https://piazza.com/gatech/fall2019/cs8803ddl/home
  • For all technical questions, please use Piazza
    – Don’t email me directly
    – All non-technical questions should be sent to me

SLIDE 24

COURSE LOGISTICS

  • Course Schedule
    – https://www.cc.gatech.edu/~jarulraj/courses/8803-f19/pages/schedule.html
    – We will post lecture slides and course materials on this page
  • Course Policies
    – Students are expected to abide by the Georgia Tech Honor Code
    – If you are not sure, ask me

SLIDE 25

COURSE LOGISTICS

  • Grading Tool: Gradescope
    – https://www.gradescope.com/courses/54455
    – You will get immediate feedback on your programming assignments
    – You can iteratively improve your score over time

SLIDE 26

GRADE BREAKDOWN

  • The final grade for the course will be tentatively based on the following weights:
    – 30% Assignments
    – 30% Midterm Exam
    – 40% Group Project
  • Emphasis on learning rather than testing you
    – If your project is truly amazing, you get an automatic A!

SLIDE 27

TEACHING ASSISTANTS

  • TA #1: Jaeho Bang
    – Ph.D. student in Computer Science
    – B.S. from Carnegie Mellon
  • TA #2: TBD

SLIDE 28

OFFICE HOURS

  • Immediately before class
    – Me: Mon/Wed 3:30 – 4:30 PM
    – Jaeho: Tue/Thu 3:30 – 4:30 PM
    – Near my office (Klaus 3324)
  • Things we can talk about
    – Questions related to lectures and assignments
    – Project ideas
    – Can’t give relationship advice

SLIDE 29

HISTORY OF COMPUTER VISION

SLIDE 30

EVOLUTION’S BIG BANG

  • ~543 million years ago
    – What was life like back then?
    – The onset of vision triggered evolution’s Big Bang
    – Vision is now the biggest sensory system in most animals

SLIDE 31

CAMERA OBSCURA

[Figure panels: DA VINCI (~1500), GEMMA FRISIUS (1545), ENCYCLOPEDIE (~1800)]

SLIDE 32

ELECTROPHYSIOLOGY (1959)

  • Visual processing mechanism in mammals
    – Simple cells: Response to light orientation
    – Complex cells: Response to light orientation and movement
    – Hypercomplex cells: Response to movement with an end point

[Figure labels: Stimulus, Electrical signal from brain, Response, No response, Response (end point)]

SLIDE 33

BLOCK WORLD (1961)

  • Visual world simplified into geometric shapes

[Figure panels: (a) Original picture, (b) Differentiated picture, (c) Feature points selected]

SLIDE 34

PROJECT MAC (1966)

SLIDE 35

STAGES OF VISUAL REPRESENTATION (1970s)

  • INPUT IMAGE: Perceived intensities
  • EDGE IMAGE: Zero crossings, edges, bars, ends, virtual lines, groups, curves, boundaries
  • 2.5-D MODEL: Local surface orientation and discontinuities in depth and surface orientation
  • 3-D MODEL: 3-D models hierarchically organized in terms of surface and volumetric primitives

SLIDE 36

BETTER REPRESENTATIONS (1970s)

[Figure panels: GENERALIZED CYLINDER (1979), PICTORIAL STRUCTURE (1973)]

SLIDE 37

OBJECT RECOGNITION (1987)

SLIDE 38

IMAGE SEGMENTATION (1987)

SLIDE 39

FACE DETECTION (2001)

SLIDE 40

FEATURE-BASED OBJECT RECOGNITION (1999)

  • Certain features are invariant to perspective

SLIDE 41

FEATURE MATCHING (2006)

[Figure labels: LEVEL 0, LEVEL 1, SPATIAL PYRAMID]

SLIDE 42

HUMAN POSE DETECTION (2005)

[Figure axis label: frequency]

SLIDE 43

PASCAL VISUAL OBJECT CHALLENGE (2006~12)

20 OBJECT CATEGORIES

[Figure labels: Airplane, Train, Person]

SLIDE 44

IMAGENET CHALLENGE (2009~17)

22K categories and 15M images

www.image-net.org

Deng, Dong, Socher, Li, Li, & Fei-Fei, 2009

Animals

  • Bird
  • Fish
  • Mammal
  • Invertebrate

Plants

  • Tree
  • Flower
  • Food
  • Materials
  • Structures
  • Artifact
  • Tools
  • Appliances
  • Structures
  • Person
  • Scenes
  • Indoor
  • Geological Formations
  • Sport Activities

SLIDE 45

IMAGENET CHALLENGE (2009~17)

www.image-net.org

  • OUTPUT: Scale, T-shirt, Steel drum, Drumstick, Mud turtle
  • OUTPUT: Scale, T-shirt, Giant Panda, Drumstick, Mud turtle

SLIDE 46

IMAGENET CHALLENGE (2009~17)

www.image-net.org

Russakovsky et al., IJCV 2015

SLIDE 47

VISUAL RECOGNITION OVERVIEW

SLIDE 48

IMAGE CLASSIFICATION

  • This course will focus on one of the most fundamental problems of visual recognition
    – Image classification
  • This technique can be applied in many ways

SLIDE 49

IMAGE CLASSIFICATION

SLIDE 50

OTHER VISUAL RECOGNITION PROBLEMS

  • There are many visual recognition problems related to image classification
    – Action classification
    – Image captioning
    – Object detection
  • Tools developed for image classification can be reused for these other problems as well

SLIDE 51

OTHER VISUAL RECOGNITION PROBLEMS

[Figure panels: ACTION CLASSIFICATION, IMAGE CAPTIONING, OBJECT DETECTION; labels: Person, Hammer, Person, Bike, Person on Bike]

SLIDE 52

CONVOLUTIONAL NEURAL NETWORKS

  • Convolutional Neural Networks (CNNs) have become an important tool for object recognition

SLIDE 53

IMAGENET CHALLENGE

  • Year 2010: NEC-UIUC [Lin CVPR 2011]
    – Dense descriptor grid: HOG, LBP
    – Coding: local coordinate, super-vector
    – Pooling, SPM
    – Linear SVM
  • Year 2012: AlexNet [Krizhevsky NIPS 2012]
  • Year 2014: GoogLeNet [Szegedy arXiv 2014] and VGG [Simonyan arXiv 2014]
  • Year 2015: MSR Asia [He ICCV 2015]

[Figure: VGG layer stack: Image, conv-64, conv-64, maxpool, conv-128, conv-128, maxpool, conv-256, conv-256, maxpool, conv-512, conv-512, maxpool, conv-512, conv-512, maxpool, fc-4096, fc-4096, fc-1000, softmax]

[Figure legend: Pooling, Convolution, Softmax, Other]
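To get a feel for the size of the VGG layer stack listed on this slide, the following sketch counts its learnable parameters. It assumes 3x3 convolution kernels, a 224x224 RGB input, 2x2 max pooling that halves the spatial size, and bias terms on every layer; it also assumes the final conv-512 block sits before the fully connected layers. None of these details are stated on the slide.

```python
# Layer stack as listed on the slide (order of the last conv block is assumed).
LAYERS = [
    "conv-64", "conv-64", "maxpool",
    "conv-128", "conv-128", "maxpool",
    "conv-256", "conv-256", "maxpool",
    "conv-512", "conv-512", "maxpool",
    "conv-512", "conv-512", "maxpool",
    "fc-4096", "fc-4096", "fc-1000", "softmax",
]


def count_parameters(layers, in_channels=3, size=224, kernel=3):
    """Count weights and biases; pooling and softmax have no parameters."""
    total = 0
    channels = in_channels
    features = None  # flattened feature count once the fc layers start
    for layer in layers:
        if layer == "maxpool":
            size //= 2  # assumed 2x2 pooling with stride 2
        elif layer.startswith("conv-"):
            out = int(layer.split("-")[1])
            total += kernel * kernel * channels * out + out
            channels = out
        elif layer.startswith("fc-"):
            out = int(layer.split("-")[1])
            if features is None:
                features = channels * size * size  # flatten the conv output
            total += features * out + out
            features = out
    return total


total_params = count_parameters(LAYERS)
print(f"{total_params:,} parameters")
```

Under these assumptions the count comes to about 133 million parameters, and the first fully connected layer alone accounts for roughly 103 million of them.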

SLIDE 54

CONVOLUTIONAL NEURAL NETWORKS

  • They were not invented overnight

SLIDE 55

[Figure: 1998 (LeCun et al.) versus 2012 (Krizhevsky et al., GPUs), comparing # of transistors and # of pixels used in training; values shown: 10^7, 10^14, 10^6, 10^9. Network diagram labels: Input, Image Maps, Convolutions, Subsampling, Fully Connected, Output]

SLIDE 56

INGREDIENTS FOR DEEP LEARNING

  • Algorithms
  • Data
  • Computation

SLIDE 57

GIGAFLOPS PER DOLLAR

[Figure: GigaFlops per dollar over time (1/2004 to 9/2017) for CPU, GPU, and TPU; labeled points: GeForce 8800 GTX, GeForce GTX 580 (AlexNet), GTX 1080 Ti, TITAN V (Tensor Cores); annotation: Deep Learning Explosion]

SLIDE 58

QUEST FOR VISUAL INTELLIGENCE

  • The quest for visual intelligence goes far beyond object recognition

SLIDE 59

QUEST FOR VISUAL INTELLIGENCE

[Figure panels: SEMANTIC SEGMENTATION, VIRTUAL REALITY; labels: Laptop, Glass, Desk, Wall, Wire; image is GFDL]

SLIDE 60

QUEST FOR VISUAL INTELLIGENCE

[Figure panel: SCENE GRAPHS]

SLIDE 61

QUEST FOR VISUAL INTELLIGENCE

Some kind of game or fight. Two groups of two men? The man on the left is throwing something. Outdoors seemed like because i have an impression of grass and maybe lines on the grass? That would be why I think perhaps a game, rough game though, more like rugby than football because they pairs weren't in pads and helmets, though I did get the impression of similar clothing. maybe some trees? in the background. (Subject: SM)

PT = 500ms

SLIDE 62

QUEST FOR VISUAL INTELLIGENCE

SLIDE 63

Computer Vision Technology can better our lives

SLIDE 64

OPTIONAL TEXTBOOK

  • Deep Learning
    – Ian Goodfellow et al.
    – Free online

SLIDE 65

COURSE PHILOSOPHY

  • Thorough and detailed
    – Understand how to develop, train, and debug convolutional neural networks from scratch.
  • Practical
    – Focus on practical techniques for training these networks at scale, and on GPUs. Cover deep learning frameworks.
  • State of the art
    – Most materials are new from the research world.

SLIDE 66

COURSE PHILOSOPHY

  • Fun
    – We will cover some fun topics
    – Image Captioning (using RNN), NeuralStyle, etc.

SLIDE 67

NEXT CLASS: IMAGE CLASSIFICATION

[Figure panels: K-NEAREST NEIGHBOURS, LINEAR CLASSIFIER]