SLIDE 1
School of Data Science, Fudan University (复旦大学大数据学院)
DATA130006 Text Management and Analysis
Basis of CNN and RNN
魏忠钰 (Zhongyu Wei)
SLIDE 2
Linear score function
SLIDE 3
Neural networks: Architectures
SLIDE 4
Neuron
SLIDE 5
Activation Functions
SLIDE 6 Formal Definition of Neural Network
§ Definition:
§ L: number of layers
§ o_m: number of neurons in the m-th layer (the size of that hidden state)
§ g^(m): activation function of the m-th layer
§ W^(m) ∈ R^(o_m × o_(m−1)): weight matrix between the (m−1)-th layer and the m-th layer
§ c^(m) ∈ R^(o_m): bias vector between the (m−1)-th layer and the m-th layer
§ A^(m) ∈ R^(o_m): state vector of the neurons in the m-th layer
§ a^(m) ∈ R^(o_m): activation vector of the neurons in the m-th layer
A^(m) = W^(m) a^(m−1) + c^(m)
a^(m) = g^(m)(A^(m))
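The recursion above can be sketched in a few lines of numpy. This is a minimal illustration, not the lecture's code; the layer sizes and random weights are made-up assumptions.

```python
import numpy as np

def forward(x, weights, biases, g=np.tanh):
    """Iterate A^(m) = W^(m) a^(m-1) + c^(m), a^(m) = g(A^(m)) over all layers."""
    a = x
    for W, c in zip(weights, biases):
        A = W @ a + c   # state vector of layer m
        a = g(A)        # activation vector of layer m
    return a

rng = np.random.default_rng(0)
sizes = [4, 8, 3]  # o_0 (input), o_1, o_2 -- illustrative sizes
weights = [rng.standard_normal((sizes[m + 1], sizes[m])) * 0.1
           for m in range(len(sizes) - 1)]
biases = [np.zeros(sizes[m + 1]) for m in range(len(sizes) - 1)]

out = forward(rng.standard_normal(4), weights, biases)
print(out.shape)  # (3,)
```

Note that W^(m) has shape (o_m, o_(m−1)), so each matrix-vector product maps the previous layer's activation vector into the next layer's state space.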
SLIDE 7
Example feed-forward computation of a neural network
SLIDE 8
Outline
§ Feed-forward Neural Networks
§ Convolutional Neural Networks
SLIDE 9
Fully Connected Layer
SLIDE 10
Convolutional Neural Networks
SLIDE 11
Convolution Layer
SLIDE 12
Convolution Layer
SLIDE 13
Convolution Layer
SLIDE 14
Convolution Layer
SLIDE 15
Convolution Layer
SLIDE 16
Convolution Layer
§ Consider a second, green filter
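Each filter slides over the input volume and produces its own activation map; stacking the maps gives the output volume. A naive numpy sketch (sizes follow the slide's 32×32×3 input with 5×5 filters; the implementation is illustrative, not optimized):

```python
import numpy as np

def conv2d(x, filters, stride=1):
    """Naive convolution: x is (H, W, C); filters is (K, F, F, C); returns (H', W', K)."""
    H, W, C = x.shape
    K, F, _, _ = filters.shape
    Ho = (H - F) // stride + 1
    Wo = (W - F) // stride + 1
    out = np.zeros((Ho, Wo, K))
    for k in range(K):              # one activation map per filter
        for i in range(Ho):
            for j in range(Wo):
                patch = x[i*stride:i*stride+F, j*stride:j*stride+F, :]
                out[i, j, k] = np.sum(patch * filters[k])  # dot product over the window
    return out

x = np.random.default_rng(1).standard_normal((32, 32, 3))
filters = np.random.default_rng(2).standard_normal((6, 5, 5, 3))
maps = conv2d(x, filters)
print(maps.shape)  # (28, 28, 6): six 28x28 activation maps
```

With 6 filters of size 5×5×3 and stride 1, the 32×32×3 input becomes a 28×28×6 output volume.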
SLIDE 17
Convolution Layer
SLIDE 18
Convolutional Neural Network
§ ConvNet is a sequence of Convolution Layers, interspersed with activation functions
SLIDE 19
Convolutional Neural Network
§ ConvNet is a sequence of Convolution Layers, interspersed with activation functions
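Since the activation functions leave the volume shape unchanged, the shape of the data flowing through such a stack is determined entirely by the conv layers. A small sketch that just tracks the shapes (the filter counts and sizes below are illustrative assumptions):

```python
# Track volume shapes through a stack of conv layers (ReLU between them keeps shape)
def conv_shape(shape, num_filters, f, stride=1, pad=0):
    """Apply the output-size rule (N + 2*pad - F)//stride + 1 to both spatial dims."""
    h, w, _ = shape
    ho = (h + 2 * pad - f) // stride + 1
    wo = (w + 2 * pad - f) // stride + 1
    return (ho, wo, num_filters)

shape = (32, 32, 3)
for num_filters, f in [(6, 5), (10, 5)]:   # conv+ReLU, conv+ReLU
    shape = conv_shape(shape, num_filters, f)
print(shape)  # (24, 24, 10)
```

So a 32×32×3 input passed through 6 filters of 5×5 and then 10 filters of 5×5 (stride 1, no padding) yields 28×28×6 and then 24×24×10.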
SLIDE 20
VGG Net Visualization
SLIDE 21
Example of Spatial dimensions
SLIDE 22
Example - Convolution
SLIDE 23
Example - Convolution
SLIDE 24
Example - Convolution
SLIDE 25
Example - Convolution
SLIDE 26
Example - Convolution
SLIDE 27
Example - Convolution
SLIDE 28
Example - Convolution
SLIDE 29
Example - Convolution
SLIDE 30
Example - Convolution
SLIDE 31
Example - Convolution
SLIDE 32
Example - Convolution
SLIDE 33
Padding
SLIDE 34
Padding
SLIDE 35
Convolutional Neural Networks
SLIDE 36
Examples:
§ Input volume: 32×32×3
§ 10 filters of size 5×5, with stride 1, pad 2
§ What is the volume size of the output?
§ (32 + 2·2 − 5)/1 + 1 = 32 spatially, so 32×32×10
§ How about the number of parameters?
§ Each filter has 5·5·3 + 1 = 76 params (+1 for the bias)
§ → 76·10 = 760
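The arithmetic on this slide can be packaged into two small helpers (the function names are made up for illustration):

```python
def conv_output_size(n, f, stride, pad):
    """Spatial output size of a conv layer: (N + 2*pad - F) / stride + 1."""
    return (n + 2 * pad - f) // stride + 1

def conv_params(f, in_channels, num_filters):
    """Each filter has F*F*C weights plus one bias."""
    return (f * f * in_channels + 1) * num_filters

side = conv_output_size(32, 5, stride=1, pad=2)  # 32: padding 2 preserves the size
print(side, conv_params(5, 3, 10))               # 32 760
```

Pad 2 with a 5×5 filter and stride 1 is exactly the "same" padding case: the spatial size is preserved, and only the depth changes (3 → 10).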
SLIDE 37
Fully Connected Layer vs. Convolutional Layer
SLIDE 38
Pooling layer
§ Makes the representations smaller and more manageable
§ Operates over each activation map independently
SLIDE 39
Pooling Layer
§ It is common to periodically insert a pooling layer in-between successive convolutional layers
§ Progressively reduce the spatial size of the representation
§ Reduce the amount of parameters and computation in the network
§ Avoid overfitting
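A minimal numpy sketch of the standard 2×2 max pooling with stride 2 (a hypothetical helper, operating on each map independently):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Max pooling over an (H, W, C) activation volume."""
    H, W, C = x.shape
    Ho, Wo = (H - size) // stride + 1, (W - size) // stride + 1
    out = np.zeros((Ho, Wo, C))
    for i in range(Ho):
        for j in range(Wo):
            patch = x[i*stride:i*stride+size, j*stride:j*stride+size, :]
            out[i, j] = patch.max(axis=(0, 1))  # pool each map independently
    return out

x = np.arange(4 * 4 * 1, dtype=float).reshape(4, 4, 1)
pooled = max_pool(x)
print(pooled[:, :, 0])  # [[ 5.  7.] [13. 15.]]
```

The 4×4 map shrinks to 2×2: each output entry is the maximum of its 2×2 window, and the depth C is unchanged, so pooling adds no parameters at all.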
SLIDE 40
Max Pooling
SLIDE 41
Alpha Go
SLIDE 42
Alpha Go
SLIDE 43 General Neural Architectures for NLP
- 1. Represent words/features with dense vectors (embeddings) via a lookup table
- 2. Concatenate the vectors
- 3. Multi-layer neural networks
§ Classification
§ Matching
§ Ranking
- R. Collobert et al. "Natural language processing (almost) from scratch"
SLIDE 44
CNN for Sentence Modeling
§ Input: a sentence of length n
§ After the lookup layer, Y = [y_1, y_2, …, y_n] ∈ R^(d×n)
§ Variable-length input
§ Convolution
§ Pooling
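The key point is that convolution plus max-over-time pooling turns a variable-length input into a fixed-size feature vector. A sketch with assumed sizes (embedding dim d, sentence length n, k filters spanning h consecutive words):

```python
import numpy as np

d, n = 8, 10   # embedding dimension, sentence length
k, h = 6, 3    # number of filters, filter width (words per window)
rng = np.random.default_rng(0)
Y = rng.standard_normal((d, n))     # embeddings after the lookup layer
W = rng.standard_normal((k, d, h))  # each filter covers h consecutive words

# Convolution over word windows: one feature per window per filter
conv = np.zeros((k, n - h + 1))
for t in range(n - h + 1):
    window = Y[:, t:t + h]
    conv[:, t] = np.tensordot(W, window, axes=([1, 2], [0, 1]))

# Max-over-time pooling gives a fixed-size vector regardless of n
features = conv.max(axis=1)
print(features.shape)  # (6,)
```

However long the sentence, the pooled feature vector always has one entry per filter, which is what lets a downstream classifier use a fixed input size.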
SLIDE 45
CNN for Sentence Modeling
SLIDE 46
Sentiment Analysis using CNN
SLIDE 47
Outline
§ Feed-forward Neural Networks
§ Convolutional Neural Networks
§ Recurrent Neural Networks
SLIDE 48
Recurrent Neural Networks: Process Sequences
§ one-to-one: Vanilla neural network
§ one-to-many: Image Captioning
§ many-to-one: Classification
§ many-to-many: Machine Translation
§ many-to-many (one output per step): Sequence Labeling
SLIDE 49
Recurrent Neural Network
SLIDE 50
Recurrent Neural Network
§ We can process a sequence of vectors x by applying a recurrence formula at every time step
§ Notice: the same function and the same set of parameters are used at every time step.
SLIDE 51
(Vanilla) Recurrent Neural Network
§ The state consists of a single “hidden” vector h:
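The vanilla recurrence h_t = tanh(W_hh h_{t−1} + W_xh x_t), with an output y_t = W_hy h_t, can be sketched directly (weight names follow the common convention; sizes and random weights are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 5, 4  # input dimension, hidden dimension
W_xh = rng.standard_normal((H, D)) * 0.1
W_hh = rng.standard_normal((H, H)) * 0.1
W_hy = rng.standard_normal((3, H)) * 0.1

def step(h_prev, x):
    """One recurrence step: the same function and parameters at every time step."""
    return np.tanh(W_hh @ h_prev + W_xh @ x)

h = np.zeros(H)
xs = rng.standard_normal((7, D))  # a sequence of 7 input vectors
for x in xs:                      # re-use the same weights at each step
    h = step(h, x)
y = W_hy @ h                      # read out from the final hidden state
print(h.shape, y.shape)  # (4,) (3,)
```

The loop makes the weight sharing explicit: W_xh, W_hh, and W_hy are created once and applied at every position in the sequence.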
SLIDE 52
Unfolded RNN: Computational Graph
SLIDE 53
Unfolded RNN: Computational Graph
§ Re-use the same weight matrix at every time-step
SLIDE 54
RNN Computational Graph
SLIDE 55
Sequence to Sequence
§ Many-to-one + one-to-many
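This composition can be sketched as an encoder that compresses the input sequence into one context vector, followed by a decoder that unrolls from it. All sizes and weights below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 4, 3  # input/output dimension, hidden dimension
enc_Wxh = rng.standard_normal((H, D)) * 0.1
enc_Whh = rng.standard_normal((H, H)) * 0.1
dec_Whh = rng.standard_normal((H, H)) * 0.1
dec_Why = rng.standard_normal((D, H)) * 0.1

# Many-to-one encoder: consume the whole input sequence into one context vector
h = np.zeros(H)
for x in rng.standard_normal((5, D)):
    h = np.tanh(enc_Whh @ h + enc_Wxh @ x)

# One-to-many decoder: unroll from the context, emitting one output per step
outputs = []
for _ in range(3):
    h = np.tanh(dec_Whh @ h)
    outputs.append(dec_Why @ h)
outputs = np.array(outputs)
print(outputs.shape)  # (3, 4)
```

In practice the decoder would also feed its previous output (or a start token) back in at each step; this sketch keeps only the many-to-one + one-to-many structure.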
SLIDE 56
Sequence to Sequence
SLIDE 57
Attention Mechanism
SLIDE 58
Example: Character-level Language Model
SLIDE 59
Example: Character-level Language Model
SLIDE 60
Example: Character-level Language Model
SLIDE 61
Example: Character-level Language Model
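A character-level language model feeds one-hot character vectors through the RNN and applies a softmax over the vocabulary to score the next character. A sketch using the classic four-character vocabulary {h, e, l, o}; the weights here are random and untrained, so the probabilities themselves are not meaningful:

```python
import numpy as np

vocab = ['h', 'e', 'l', 'o']
idx = {c: i for i, c in enumerate(vocab)}

def one_hot(c):
    v = np.zeros(len(vocab))
    v[idx[c]] = 1.0
    return v

rng = np.random.default_rng(0)
H = 3
W_xh, W_hh, W_hy = (rng.standard_normal(s) * 0.1
                    for s in [(H, 4), (H, H), (4, H)])

h = np.zeros(H)
for c in "hell":                  # feed the characters one at a time
    h = np.tanh(W_hh @ h + W_xh @ one_hot(c))
scores = W_hy @ h                 # one score per character in the vocab
probs = np.exp(scores) / np.exp(scores).sum()  # softmax over next-char scores
print(probs.round(3))
```

After training, the model would put high probability on 'o' here, completing "hello"; sampling from `probs` at each step and feeding the sample back in generates text character by character.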
SLIDE 62
Example Image Captioning
SLIDE 63
Example Image Captioning
SLIDE 64
Example Image Captioning
SLIDE 65
Example Image Captioning
SLIDE 66
Example Image Captioning
SLIDE 67
Example Image Captioning
SLIDE 68
Example Image Captioning
SLIDE 69
Example Image Captioning
SLIDE 70
Example Image Captioning
SLIDE 71
Example Image Captioning
SLIDE 72
Example Image Captioning
SLIDE 73
Example Image Captioning