Neural Networks Many slides attributable to: Prof. Mike Hughes - - PowerPoint PPT Presentation

Tufts COMP 135: Introduction to Machine Learning https://www.cs.tufts.edu/comp/135/2019s/ Neural Networks Many slides attributable to: Prof. Mike Hughes Erik Sudderth (UCI), Emily Fox (UW), Finale Doshi-Velez (Harvard) James, Witten, Hastie,


SLIDE 1

Neural Networks

Tufts COMP 135: Introduction to Machine Learning
https://www.cs.tufts.edu/comp/135/2019s/

Prof. Mike Hughes

Many slides attributable to: Erik Sudderth (UCI), Emily Fox (UW), Finale Doshi-Velez (Harvard), James, Witten, Hastie, Tibshirani (ISL/ESL books)
SLIDE 2

Objectives Today: Neural Networks day 10

  • How to learn feature representations
  • Feed-forward neural nets
  • Single neuron = linear function + activation
  • Multi-layer perceptrons (MLPs)
  • Universal approximation
  • The Rise of Deep Learning:
  • Success stories on Images and Language


Mike Hughes - Tufts COMP 135 - Fall 2020

SLIDE 3

What will we learn?

Supervised Learning | Unsupervised Learning | Reinforcement Learning

Data, label pairs: $\{x_n, y_n\}_{n=1}^{N}$
Task: data x, label y
Performance measure

Training, Prediction, Evaluation

SLIDE 4

Task: Binary Classification

Supervised Learning: binary classification
Unsupervised Learning
Reinforcement Learning

y is a binary variable (red or blue)

[Figure: data plotted on axes x1 and x2, colored by y]

SLIDE 5

Example: Hotdog or Not

https://www.theverge.com/tldr/2017/5/14/15639784/hbo-silicon-valley-not-hotdog-app-download

SLIDE 6

Text Sentiment Classification

SLIDE 7

Image Classification

SLIDE 8

Feature Transform Pipeline

Task: data x, label y
Data, label pairs: $\{x_n, y_n\}_{n=1}^{N}$
Feature transform: $\phi(x)$
Feature, label pairs: $\{\phi(x_n), y_n\}_{n=1}^{N}$
Performance measure
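A minimal sketch of the feature transform step, assuming a hypothetical hand-designed quadratic transform phi (the slide does not specify one). It turns each data, label pair into a feature, label pair and leaves the labels unchanged.

```python
def phi(x):
    """Hypothetical feature transform: raw inputs plus squares and a cross term."""
    x1, x2 = x
    return [x1, x2, x1 * x1, x2 * x2, x1 * x2]

def transform_pairs(pairs):
    """Map {(x_n, y_n)} to {(phi(x_n), y_n)}; labels pass through untouched."""
    return [(phi(x), y) for x, y in pairs]

# Made-up data, label pairs for illustration only.
data_label_pairs = [((2.0, 3.0), 1), ((0.0, -1.0), 0)]
feature_label_pairs = transform_pairs(data_label_pairs)
```

Any classifier can then be trained on the transformed pairs instead of the raw ones.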

SLIDE 9

Predicted Probabilities vs Binary Labels

SLIDE 10

Decision Boundary is Linear

$\{x \in \mathbb{R}^2 : \sigma(w^T \tilde{x}) = 0.5\} \longleftrightarrow \{x \in \mathbb{R}^2 : w^T \tilde{x} = 0\}$
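A quick numerical check of this equivalence, as a sketch under two assumptions: the weights are made up for illustration, and the augmented input (x with a tilde) is taken to be x with a constant 1 appended for the bias.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Return sigmoid(w^T x_tilde), where x_tilde = [x1, x2, 1] (assumed convention)."""
    x_tilde = list(x) + [1.0]
    z = sum(w_i * x_i for w_i, x_i in zip(w, x_tilde))
    return sigmoid(z)

# Hypothetical weights [w1, w2, bias]; the boundary is the line 2*x1 - x2 + 1 = 0.
w = [2.0, -1.0, 1.0]

on_boundary = (1.0, 3.0)     # 2*1 - 3 + 1 = 0
positive_side = (2.0, 1.0)   # 2*2 - 1 + 1 = 4 > 0
negative_side = (-2.0, 1.0)  # 2*(-2) - 1 + 1 = -4 < 0
```

Points with a zero linear score get probability exactly 0.5, and the sign of the linear score decides which side of 0.5 the probability falls on.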

SLIDE 11

Logistic Regression Network Diagram

Credit: Emily Fox (UW) https://courses.cs.washington.edu/courses/cse416/18sp/slides/

0 or 1

SLIDE 12

A “Neuron” or “Perceptron” Unit

Linear function with weights w, followed by a non-linear activation function

Credit: Emily Fox (UW)
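A minimal sketch of a single unit, assuming a separate bias term b and two example activations (hard step and sigmoid). The input and weight values are made up for illustration.

```python
import math

def hard_step(z):
    """Hard threshold activation: 1 if z is positive, else 0."""
    return 1 if z > 0 else 0

def sigmoid(z):
    """Smooth activation with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b, activation):
    """Linear function of the inputs, followed by a non-linear activation."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return activation(z)

# Hypothetical inputs and weights: z = 1.0*1.0 + (-1.0)*2.0 + 0.5 = -0.5
x = [1.0, 2.0]
w = [1.0, -1.0]
b = 0.5

step_output = neuron(x, w, b, hard_step)
sigmoid_output = neuron(x, w, b, sigmoid)
```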

SLIDE 13

“Inspired” by brain biology

Slide Credit: Bhiksha Raj (CMU)

SLIDE 14

Challenge: Find w for these functions

OR:
X_1  X_2  y
0    0    0
0    1    1
1    0    1
1    1    1

AND:
X_1  X_2  y
0    0    0
0    1    0
1    0    0
1    1    1

Credit: Emily Fox (UW)
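One possible answer to the challenge, as a sketch that assumes the two target functions are logical OR and AND and that the unit uses a hard step activation with a bias term. The weights are one valid choice among many, not the lecture's official solution.

```python
def step(z):
    """Hard threshold: 1 if z is positive, else 0."""
    return 1 if z > 0 else 0

def unit(x1, x2, w1, w2, b):
    """Single neuron: step(w1*x1 + w2*x2 + b)."""
    return step(w1 * x1 + w2 * x2 + b)

# Candidate weights (w1, w2, b); assumptions for illustration.
OR_W = (1.0, 1.0, -0.5)   # fires when at least one input is 1
AND_W = (1.0, 1.0, -1.5)  # fires only when both inputs are 1

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
or_outputs = [unit(x1, x2, *OR_W) for x1, x2 in inputs]
and_outputs = [unit(x1, x2, *AND_W) for x1, x2 in inputs]
```

Only the bias differs between the two: it sets how many inputs must be on before the unit fires.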

SLIDE 15

Challenge: Find w for these functions

OR:
X_1  X_2  y
0    0    0
0    1    1
1    0    1
1    1    1

AND:
X_1  X_2  y
0    0    0
0    1    0
1    0    0
1    1    1

Credit: Emily Fox (UW)

SLIDE 16

What we can’t do with linear decision boundary classifiers

XOR:
X_1  X_2  y
0    0    0
0    1    1
1    0    1
1    1    0
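A small brute-force illustration, assuming the function on this slide is XOR. It searches a hypothetical grid of weights for a single hard-step unit. The search is not a proof, but it is consistent with the known result that no linear decision boundary separates XOR, while OR is easy to fit.

```python
from itertools import product

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
OR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

def step(z):
    """Hard threshold: 1 if z is positive, else 0."""
    return 1 if z > 0 else 0

def find_linear_weights(table, grid):
    """Return the first (w1, w2, b) on the grid whose unit matches every row, else None."""
    for w1, w2, b in product(grid, repeat=3):
        if all(step(w1 * x1 + w2 * x2 + b) == y for (x1, x2), y in table):
            return (w1, w2, b)
    return None

# Hypothetical search grid: -3.0 to 3.0 in steps of 0.5.
grid = [k / 2 for k in range(-6, 7)]

or_weights = find_linear_weights(OR, grid)    # a valid triple is found
xor_weights = find_linear_weights(XOR, grid)  # None: nothing on the grid fits XOR
```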

SLIDE 17

Idea: Compose Neurons together!

x → transform → φ(x) → classify → y

SLIDE 18

Can you find w to solve XOR?

[Figure: network diagram with unknown weights marked “?”; annotation: AND / OR]

SLIDE 19

Can you find w to solve XOR?

[Figure: network diagram with unknown weights marked “?”]

SLIDE 20

Can you find w to solve XOR?
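One way to answer the question, sketched under the assumption that the hidden layer holds an OR unit and an AND unit with hard step activations. The specific weights are illustrative choices, not necessarily those shown on the original slide.

```python
def step(z):
    """Hard threshold: 1 if z is positive, else 0."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    """Two-layer network: hidden OR and AND units, then an output unit."""
    h_or = step(1.0 * x1 + 1.0 * x2 - 0.5)   # 1 if at least one input is on
    h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)  # 1 only if both inputs are on
    # Output fires when OR is on but AND is off, which is exactly XOR.
    return step(1.0 * h_or - 1.0 * h_and - 0.5)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor_net(x1, x2) for x1, x2 in inputs]
```

The hidden units act as a learned feature transform: in the (OR, AND) feature space the classes become linearly separable.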

SLIDE 21

1D Input + 3 hidden units

SLIDE 22

1D Input + 3 hidden units

Example functions f(x1) (before final threshold)

Intuition: piece-wise step function, partitioning input space into regions
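A sketch of this intuition, assuming three hard-step hidden units on a 1D input. The thresholds and output weights are hypothetical values chosen so that each region of the input line gets its own constant output.

```python
def step(z):
    """Hard threshold: 1 if z is non-negative, else 0."""
    return 1.0 if z >= 0 else 0.0

# Hypothetical parameters: each hidden unit switches on at one threshold.
thresholds = [-1.0, 0.0, 1.0]
output_weights = [1.0, -2.0, 1.5]

def f(x1):
    """Network output before the final threshold: weighted sum of 3 step units."""
    hidden = [step(x1 - t) for t in thresholds]
    return sum(v * h for v, h in zip(output_weights, hidden))

# The 3 thresholds split the line into 4 regions with constant output:
#   x1 < -1        -> 0.0
#   -1 <= x1 < 0   -> 1.0
#   0 <= x1 < 1    -> 1.0 - 2.0 = -1.0
#   x1 >= 1        -> 1.0 - 2.0 + 1.5 = 0.5
region_values = [f(-2.0), f(-0.5), f(0.5), f(2.0)]
```

Adding more hidden units adds more breakpoints, so the piece-wise constant function can follow a target curve more closely.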

SLIDE 23

MLPs can approximate any function with enough hidden units!

SLIDE 24

Neuron Design

Linear function with weights w, followed by a non-linear activation function

Credit: Emily Fox (UW)

What’s wrong with hard step activation function?

SLIDE 25

Neuron Design

Linear function with weights w, followed by a non-linear activation function

Credit: Emily Fox (UW)

What’s wrong with hard step activation function?

Not smooth! Gradient is zero almost everywhere, so hard to train weights!
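A small numerical illustration of this point, assuming a central finite-difference estimate of the derivative and using the sigmoid as one example of a smooth alternative. The test points are arbitrary.

```python
import math

def hard_step(z):
    """Hard threshold activation."""
    return 1.0 if z > 0 else 0.0

def sigmoid(z):
    """Smooth activation with a non-zero slope everywhere."""
    return 1.0 / (1.0 + math.exp(-z))

def numerical_gradient(f, z, eps=1e-4):
    """Central finite-difference estimate of df/dz at z."""
    return (f(z + eps) - f(z - eps)) / (2 * eps)

# Arbitrary test points away from the jump at z = 0.
test_points = [-2.0, -0.5, 0.5, 2.0]
step_grads = [numerical_gradient(hard_step, z) for z in test_points]
sigmoid_grads = [numerical_gradient(sigmoid, z) for z in test_points]
```

With the hard step, every estimate is zero, so a gradient-based update would leave the weights unchanged. The sigmoid gives a usable positive slope at each point.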

SLIDE 26

Which Activation Function?

Linear function with weights w, followed by a non-linear activation function

Credit: Emily Fox (UW)

SLIDE 27

Activation Functions Advice

Credit: Emily Fox (UW)

SLIDE 28

Exciting Applications: Computer Vision

SLIDE 29

Object Recognition from Images

SLIDE 30

Deep Neural Networks for Object Recognition

Scores for each possible object category

Decision: “leopard”
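A minimal sketch of how per-category scores could become a decision, assuming the decision is simply the highest-scoring category. The category names and score values are made up, and the softmax step is an optional, commonly used way to turn scores into probabilities.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - m) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

def decide(scores):
    """Return the category with the highest score."""
    return max(scores, key=scores.get)

# Hypothetical scores for a handful of categories.
scores = {"leopard": 4.2, "jaguar": 3.1, "cheetah": 2.7, "mushroom": -1.0}
probs = softmax(scores)
decision = decide(scores)
```

Softmax preserves the ordering of the scores, so the decision is the same whether it is taken on raw scores or on probabilities.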

SLIDE 31

Deep Neural Networks for Object Recognition

Scores for each possible object category

Decision: “mushroom”

SLIDE 32

Each Layer Extracts “Higher Level” Features

SLIDE 33

More layers = less error!

[Figure: top-5 classification error (%) by year, 2010 to 2015; axis labels include “shallow”]

ImageNet challenge: 1000 categories, 1.2 million images in training set

Credit: KDD Tutorial by Sun, Xiao, & Choi: http://dl4health.org/ Figure idea originally from He et al., CVPR 2016

SLIDE 34

2012 ImageNet Challenge Winner

SLIDE 35

State of the art Results

SLIDE 36

Semantic Segmentation

SLIDE 37

Object Detection

SLIDE 38

Exciting Applications: Natural Language (Spoken and Written)

SLIDE 39

Reaching Human Performance in Speech-to-Text

https://arxiv.org/pdf/1610.05256.pdf

SLIDE 40

Gains in Translation Quality

https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html

SLIDE 41

Any Disadvantages?

SLIDE 42

Deep Neural Networks can overfit!

Many layers, many units / layer: overfitting
1 layer, few units / layer: underfitting

SLIDE 43

Ways to avoid overfitting

  • More training data!
  • L2 / L1 penalties on weights
  • More tricks (next week):
      • Early stopping
      • Dropout
      • Data augmentation
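Two of these ideas can be sketched in a few lines. This is an illustration under stated assumptions, not the course's implementation: the penalty strength alpha, the patience rule, and the validation losses are all hypothetical.

```python
def l2_penalized_loss(data_loss, weights, alpha):
    """Training objective: data loss plus alpha times the sum of squared weights."""
    return data_loss + alpha * sum(w * w for w in weights)

def early_stopping_epoch(val_losses, patience):
    """Return the epoch whose weights would be kept under a simple patience rule.

    Training stops once the validation loss has gone `patience` epochs
    without improving on the best value seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch

# Made-up numbers for illustration.
penalized = l2_penalized_loss(data_loss=1.0, weights=[1.0, -2.0], alpha=0.1)
val_losses = [1.0, 0.8, 0.7, 0.75, 0.9, 0.6]
kept_epoch = early_stopping_epoch(val_losses, patience=2)
```

With patience 2, training stops at epoch 4 and keeps the weights from epoch 2, so the later loss of 0.6 is never seen. That is the trade-off a small patience value makes.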

SLIDE 44

Objectives Today: Neural Networks Unit 1/2


  • How to learn feature representations
  • Feed-forward neural nets
  • Single neuron = linear function + activation
  • Multi-layer perceptrons (MLPs)
  • Universal approximation
  • The Rise of Deep Learning:
  • Success stories on Images and Language