Basis of CNN and RNN - School of Data Science, Fudan University - PowerPoint PPT Presentation

Slide 1

School of Data Science, Fudan University

DATA130006 Text Management and Analysis

Basis of CNN and RNN

魏忠钰 (Zhongyu Wei)

  • Dec. 27th, 2017
Slide 2: Linear score function

Slide 3: Neural networks: Architectures

Slide 4: Neuron

Slide 5: Activation Functions

Slide 6: Formal Definition of Neural Network

§ Definition:
§ L: number of layers;
§ n_m: number of neurons in the m-th layer; the size of the hidden state
§ g_m(·): activation function in the m-th layer;
§ W^(m) ∈ R^{n_m × n_{m−1}}: weight matrix between the (m−1)-th layer and the m-th layer
§ c^(m) ∈ R^{n_m}: bias vector between the (m−1)-th layer and the m-th layer
§ A^(m) ∈ R^{n_m}: state vector of the neurons in the m-th layer
§ a^(m) ∈ R^{n_m}: activation vector of the neurons in the m-th layer

A^(m) = W^(m) a^(m−1) + c^(m)
a^(m) = g_m(A^(m))

Slide 7: Example feed-forward computation of a neural network
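As a minimal sketch of this feed-forward computation, the recurrence A^(m) = W^(m) a^(m−1) + c^(m), a^(m) = g_m(A^(m)) can be coded directly; the weights, sizes, and activation choices below are toy values, not the slide's figures:

```python
import math

def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def forward(x, layers):
    # layers: list of (W, c, g) triples, one per layer m = 1..L.
    a = x
    for W, c, g in layers:
        A = [s + b for s, b in zip(matvec(W, a), c)]  # A^(m) = W^(m) a^(m-1) + c^(m)
        a = [g(z) for z in A]                         # a^(m) = g_m(A^(m))
    return a

sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
layers = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1], math.tanh),  # 2 -> 2 hidden layer
    ([[1.0, -1.0]],             [0.0],      sigmoid),    # 2 -> 1 output layer
]
print(forward([1.0, 2.0], layers))
```

Each layer reuses the same two lines, so adding depth only means appending more (W, c, g) triples.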

Slide 8: Outline

§ Forward Neural Networks
§ Convolutional Neural Networks

Slide 9: Fully Connected Layer

Slide 10: Convolutional Neural Networks

Slide 11: Convolution Layer

Slide 12: Convolution Layer

Slide 13: Convolution Layer

Slide 14: Convolution Layer

Slide 15: Convolution Layer

Slide 16: Convolution Layer

§ Consider a second, green filter
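The sliding-filter operation these slides illustrate can be sketched for a single channel (stride 1, no padding); the 4×4 image and 3×3 filter below are made-up values for illustration:

```python
def conv2d(image, kernel):
    # Valid convolution (really cross-correlation, as in ConvNets):
    # slide the kernel over every position where it fits, stride 1, no padding.
    H, W = len(image), len(image[0])
    kH, kW = len(kernel), len(kernel[0])
    out = []
    for i in range(H - kH + 1):
        row = []
        for j in range(W - kW + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kH) for dj in range(kW))
            row.append(s)
        out.append(row)
    return out

# A 4x4 image and a 3x3 vertical-edge-style filter:
# the output is (4-3+1) x (4-3+1) = 2x2.
image  = [[1, 2, 0, 1],
          [0, 1, 3, 1],
          [2, 1, 0, 0],
          [1, 0, 2, 1]]
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]
print(conv2d(image, kernel))  # [[0, 2], [-2, 0]]
```

A second filter (the "green" one on the slide) would run the same loop with its own kernel, producing a second 2×2 activation map; stacking the maps gives the output volume's depth.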

Slide 17: Convolution Layer

Slide 18: Convolutional Neural Network

§ ConvNet is a sequence of Convolution Layers, interspersed with activation functions

Slide 19: Convolutional Neural Network

§ ConvNet is a sequence of Convolution Layers, interspersed with activation functions

Slide 20: VGG Net Visualization

Slide 21: Example of Spatial dimensions

Slide 22: Example - Convolution

Slide 23: Example - Convolution

Slide 24: Example - Convolution

Slide 25: Example - Convolution

Slide 26: Example - Convolution

Slide 27: Example - Convolution

Slide 28: Example - Convolution

Slide 29: Example - Convolution

Slide 30: Example - Convolution

Slide 31: Example - Convolution

Slide 32: Example - Convolution

Slide 33: Padding

Slide 34: Padding

Slide 35: Convolutional Neural Networks

Slide 36: Examples

§ Input volume: 32×32×3
§ 10 filters of size 5×5, with stride 1, pad 2
§ What is the volume size of the output?
§ (32 + 2×2 − 5)/1 + 1 = 32 spatially, so the output is 32×32×10
§ How about the number of parameters?
§ Each filter has 5×5×3 + 1 = 76 params (+1 for the bias) → 76×10 = 760 params in total
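The arithmetic above follows the standard output-size formula (N + 2P − F)/S + 1; a quick sketch to check both numbers:

```python
def conv_output_size(n, f, stride, pad):
    # Spatial output size of a conv layer: (N + 2P - F) / S + 1.
    return (n + 2 * pad - f) // stride + 1

def conv_params(f, in_depth, num_filters):
    # Each filter has F*F*depth weights plus one bias.
    return (f * f * in_depth + 1) * num_filters

# The slide's example: 32x32x3 input, 10 filters of 5x5, stride 1, pad 2.
print(conv_output_size(32, 5, 1, 2))  # 32
print(conv_params(5, 3, 10))          # 760
```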

Slide 37: Fully Connected Layers vs. Convolutional Layer

Slide 38: Pooling Layer

§ Makes the representations smaller and more manageable
§ Operates over each activation map independently:

Slide 39: Pooling Layer

§ It is common to periodically insert a pooling layer in-between successive convolutional layers

§ Progressively reduce the spatial size of the representation
§ Reduce the amount of parameters and computation in the network
§ Avoid overfitting

Slide 40: Max Pooling

Slide 41: AlphaGo

Slide 42: AlphaGo

Slide 43: General Neural Architectures for NLP

  • 1. Represent the words/features with dense vectors (embeddings) via a lookup table
  • 2. Concatenate the vectors
  • 3. Multi-layer neural networks
§ Classification
§ Matching
§ Ranking
  • R. Collobert et al., “Natural language processing (almost) from scratch”
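The three steps can be sketched end to end. Everything below is a toy stand-in: the embeddings, sizes, and weights are invented, and the scalar score stands in for whatever classification/matching/ranking head would follow:

```python
import math

# Step 1: lookup table mapping word ids to dense vectors (toy 3-d embeddings).
embeddings = {
    0: [0.1, 0.0, 0.2],   # e.g. "the"
    1: [0.3, 0.1, 0.0],   # e.g. "cat"
    2: [0.0, 0.2, 0.1],   # e.g. "sat"
}

def represent(word_ids):
    # Step 2: concatenate the looked-up vectors into one input vector.
    x = []
    for w in word_ids:
        x.extend(embeddings[w])
    return x

def mlp_score(x, W, c):
    # Step 3: a multi-layer network; here one tanh hidden layer,
    # summed into a single toy score.
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + ci)
              for row, ci in zip(W, c)]
    return sum(hidden)

x = represent([0, 1, 2])        # 3 words -> 9-d concatenated input
W = [[0.1] * 9, [-0.1] * 9]     # 9 -> 2 hidden units
print(mlp_score(x, W, [0.0, 0.0]))
```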
Slide 44: CNN for Sentence Modeling

§ Input: a sentence of length n
§ After the lookup layer, Y = [y_1, y_2, …, y_n] ∈ R^{d×n}
§ Variable-length input
§ Convolution
§ Pooling
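A sketch of convolution over word windows followed by max-over-time pooling, which is what lets the model accept variable-length sentences; the dimensions and weights below are toy values:

```python
def conv_max_over_time(Y, filt, bias=0.0):
    # Y: list of n word vectors (each d-dim). filt: flat weights for a
    # window of k consecutive words (length k*d). Convolve over every
    # window, then max-pool over time, so any sentence length n >= k
    # yields exactly one number per filter.
    d = len(Y[0])
    k = len(filt) // d
    scores = []
    for i in range(len(Y) - k + 1):
        window = [v for vec in Y[i:i + k] for v in vec]  # concat k vectors
        scores.append(sum(w * x for w, x in zip(filt, window)) + bias)
    return max(scores)

# 4-word sentence, 2-d embeddings, one filter over windows of k=2 words.
Y = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
filt = [0.5, -0.5, 0.5, -0.5]
print(conv_max_over_time(Y, filt))
```

Running the same filter on a longer sentence just produces more window scores before the max, so the pooled feature vector has a fixed size either way.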

Slide 45: CNN for Sentence Modeling

Slide 46: Sentiment Analysis using CNN

Slide 47: Outline

§ Forward Neural Networks
§ Convolutional Neural Networks
§ Recurrent Neural Networks

Slide 48: Recurrent Neural Networks: Process Sequences

§ Vanilla classification
§ Image captioning
§ Machine translation
§ Sequence labeling

Slide 49: Recurrent Neural Network

Slide 50: Recurrent Neural Network

§ We can process a sequence of vectors x by applying a recurrence formula at every time step
§ Notice: the same function and the same set of parameters are used at every time step.

Slide 51: (Vanilla) Recurrent Neural Network

§ The state consists of a single “hidden” vector h:
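The vanilla RNN recurrence is h_t = tanh(W_hh h_{t−1} + W_xh x_t); a sketch with toy weights and sizes (2-d hidden state, 1-d input):

```python
import math

def rnn_step(h, x, W_hh, W_xh):
    # Vanilla RNN recurrence: h_t = tanh(W_hh h_{t-1} + W_xh x_t).
    # The SAME W_hh and W_xh are reused at every time step.
    def matvec(W, v):
        return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]
    a = matvec(W_hh, h)
    b = matvec(W_xh, x)
    return [math.tanh(ai + bi) for ai, bi in zip(a, b)]

W_hh = [[0.5, 0.0], [0.0, 0.5]]
W_xh = [[1.0], [-1.0]]
h = [0.0, 0.0]
for x in ([1.0], [0.5], [-1.0]):   # process a sequence of inputs
    h = rnn_step(h, x, W_hh, W_xh)
print(h)
```

Each input updates the single hidden vector in place; an output layer (e.g. y_t = W_hy h_t) can be read off at any or every step.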

Slide 52: Unfolded RNN: Computational Graph

Slide 53: Unfolded RNN: Computational Graph

§ Re-use the same weight matrix at every time-step

Slide 54: RNN Computational Graph

Slide 55: Sequence to Sequence

§ Many-to-one + one-to-many
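The many-to-one encoder plus one-to-many decoder idea can be sketched as below. This is a deliberate simplification with toy weights: the decoder here feeds its own state back as the next input, where a real decoder would feed the previously generated token:

```python
import math

def step(h, x, W_hh, W_xh):
    # One vanilla RNN step: h' = tanh(W_hh h + W_xh x).
    mv = lambda W, v: [sum(a * b for a, b in zip(row, v)) for row in W]
    return [math.tanh(a + b) for a, b in zip(mv(W_hh, h), mv(W_xh, x))]

def seq2seq(xs, n_out, enc_W, dec_W):
    # Encoder (many-to-one): fold the whole input into one final state.
    h = [0.0, 0.0]
    for x in xs:
        h = step(h, x, *enc_W)
    # Decoder (one-to-many): unroll from that state for n_out steps.
    outs = []
    for _ in range(n_out):
        h = step(h, h, *dec_W)   # simplification: state fed back as input
        outs.append(h)
    return outs

W = ([[0.5, 0.0], [0.0, 0.5]], [[1.0, 0.0], [0.0, 1.0]])  # toy weights
outs = seq2seq([[1.0, 0.0], [0.0, 1.0]], n_out=3, enc_W=W, dec_W=W)
print(len(outs))  # 3
```

The point of the split is that input and output lengths are decoupled: two input vectors in, three output vectors out.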

Slide 56: Sequence to Sequence

Slide 57: Attention Mechanism

Slide 58: Example: Character-level Language Model

Slide 59: Example: Character-level Language Model

Slide 60: Example: Character-level Language Model

Slide 61: Example: Character-level Language Model
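In the classic version of this example the model is trained on a short string such as "hello", with each character entering the network as a one-hot vector over the character vocabulary. The sketch below shows that encoding, with a simple bigram count table standing in for the RNN to illustrate the predict-the-next-character loop (a real model would sample from the RNN's softmax instead):

```python
def one_hot(ch, vocab):
    # Characters go into the network as one-hot vectors over the vocabulary.
    return [1.0 if ch == v else 0.0 for v in vocab]

def train_bigram(text):
    # Stand-in for the trained RNN: count which character follows which.
    table = {}
    for a, b in zip(text, text[1:]):
        table.setdefault(a, {}).setdefault(b, 0)
        table[a][b] += 1
    return table

def generate(table, start, length):
    # Feed each predicted character back in as the next input.
    out = start
    for _ in range(length):
        nxt = table.get(out[-1])
        if not nxt:
            break
        out += max(nxt, key=nxt.get)  # greedy: most frequent next char
    return out

vocab = sorted(set("hello"))
print(one_hot("h", vocab))   # one-hot over ['e', 'h', 'l', 'o']
print(generate(train_bigram("hello"), "h", 4))
```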

Slide 62: Example Image Captioning

Slide 63: Example Image Captioning

Slide 64: Example Image Captioning

Slide 65: Example Image Captioning

Slide 66: Example Image Captioning

Slide 67: Example Image Captioning

Slide 68: Example Image Captioning

Slide 69: Example Image Captioning

Slide 70: Example Image Captioning

Slide 71: Example Image Captioning

Slide 72: Example Image Captioning

Slide 73: Example Image Captioning