
Neural Networks (Continued). Instructor: Bhiksha Raj. Slides by Najim Dehak.

Machine Learning for Signal Processing: Neural Networks (Continued). Instructor: Bhiksha Raj. Slides by Najim Dehak. 1 Dec 2016. So what are neural networks? [Figure: a neural network maps a voice signal to a transcription, an image to a text caption, and a game state to the next move.]


  1. ImageNet • 1.2 million high-resolution images from the ImageNet LSVRC-2010 contest • 1000 different classes (softmax output layer) • NN configuration: the network contains 60 million parameters and 650,000 neurons • 5 convolutional layers, some of which are followed by max-pooling layers • 3 fully-connected layers. Krizhevsky, A., Sutskever, I. and Hinton, G. E., “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada.
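The configuration on this slide (5 convolutional layers, some followed by max pooling, then 3 fully-connected layers ending in a 1000-way softmax) can be sketched as follows. This is my own hedged PyTorch sketch, not code from the lecture; the channel sizes follow the common single-GPU AlexNet variant rather than the paper's exact two-GPU split.

```python
# Hedged sketch of an AlexNet-style network: 5 conv layers (some with max
# pooling) followed by 3 fully-connected layers and a 1000-way output.
import torch
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(64, 192, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),          # 1000 classes; softmax is applied in the loss
)

x = torch.randn(1, 3, 224, 224)     # one 224x224x3 input image
print(alexnet_like(x).shape)        # torch.Size([1, 1000])
```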

  2. ImageNet Figure 3: 96 convolutional kernels of size 11×11×3 learned by the first convolutional layer on the 224×224×3 input images. The top 48 kernels were learned on GPU 1 while the bottom 48 kernels were learned on GPU 2. See Section 6.1 for details. Krizhevsky, A., Sutskever, I. and Hinton, G. E., “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada.

  3. ImageNet • Eight ILSVRC-2010 test images and the five labels considered most probable by our model. The correct label is written under each image, and the probability assigned to the correct label is also shown with a red bar (if it happens to be in the top 5). • Five ILSVRC-2010 test images in the first column; the remaining columns show the six training images that produce feature vectors in the last hidden layer with the smallest Euclidean distance from the feature vector for the test image. Krizhevsky, A., Sutskever, I. and Hinton, G. E., “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada.

  4. CNN for Automatic Speech Recognition • Convolution over frequencies • Convolution over time

  5. CNN-Recap • Neural network with specialized connectivity structure • Feed-forward: convolve input, non-linearity (rectified linear), pooling (local max) • Supervised training: train convolutional filters by back-propagating error • Convolution over time • Adding memory to the classical MLP network • Recurrent neural network. [Figure: input image → convolution (learned) → non-linearity → pooling → feature maps]
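As a concrete illustration of the feed-forward steps listed above (convolve the input, apply a rectified-linear non-linearity, pool locally), here is a minimal NumPy sketch. It is my own illustrative code with toy shapes, not material from the slides.

```python
# Minimal sketch of one CNN feature map: convolution -> ReLU -> local max pool.
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation of a single-channel image."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping local max pooling."""
    H, W = x.shape
    H2, W2 = H // size, W // size
    x = x[:H2 * size, :W2 * size].reshape(H2, size, W2, size)
    return x.max(axis=(1, 3))

image = np.random.randn(28, 28)   # toy single-channel input
kernel = np.random.randn(5, 5)    # a "learned" convolutional filter
feature_map = max_pool(relu(conv2d_valid(image, kernel)))
print(feature_map.shape)          # (12, 12)
```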

  6. Recurrent Neural Networks (RNNs) • Recurrent networks introduce cycles and a notion of time. [Figure: RNN cell with input x_t, output y_t, and hidden state h_t fed back as h_{t-1} through a one-step delay.] • They are designed to process sequences of data x_1, …, x_n and can produce sequences of outputs y_1, …, y_m.

  7. Elman Nets (1990) – Simple Recurrent Neural Networks • Elman nets are feed-forward networks with partial recurrence • Unlike feed-forward nets, Elman nets have a memory, or sense of time • Can also be viewed as a “Markovian” NN

  8. (Vanilla) Recurrent Neural Network • The state consists of a single “hidden” vector h. [Figure: simple recurrent neural network with input x_t, output y_t, and hidden state h_t fed back as h_{t-1} through a one-step delay.]

  9. Unrolling RNNs • RNNs can be unrolled across multiple time steps. [Figure: the recurrent cell (x_t, h_t, y_t with a one-step delay) unrolled into a chain of states h_0, h_1, h_2 mapping inputs x_0, x_1, x_2 to outputs y_0, y_1, y_2.] • This produces a DAG which supports backpropagation. • But its size depends on the input sequence length.
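The unrolling can be made concrete with a short NumPy sketch of a vanilla RNN forward pass; this is my own illustration with assumed names (Wxh, Whh, Why) and toy dimensions, not code from the lecture. The same three weight matrices are reused at every time step, so the unrolled graph grows with the sequence length.

```python
# Unrolled forward pass of a vanilla RNN: the same weights apply at every step.
import numpy as np

def rnn_forward(xs, h0, Wxh, Whh, Why, bh, by):
    """Run one vanilla RNN over a sequence xs = [x_0, x_1, ...]."""
    h, hs, ys = h0, [], []
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h + bh)   # new hidden state h_t
        y = Why @ h + by                      # output y_t (pre-softmax)
        hs.append(h)
        ys.append(y)
    return hs, ys

# Toy dimensions: 4-d inputs, 8-d hidden state, 3-d outputs, 5 time steps.
rng = np.random.default_rng(0)
D_in, D_h, D_out, T = 4, 8, 3, 5
params = dict(
    Wxh=rng.standard_normal((D_h, D_in)) * 0.1,
    Whh=rng.standard_normal((D_h, D_h)) * 0.1,
    Why=rng.standard_normal((D_out, D_h)) * 0.1,
    bh=np.zeros(D_h), by=np.zeros(D_out),
)
xs = [rng.standard_normal(D_in) for _ in range(T)]
hs, ys = rnn_forward(xs, np.zeros(D_h), **params)
print(len(ys), ys[0].shape)   # 5 outputs, each of shape (3,)
```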

  10. Learning time sequences • Recurrent networks have one or more feedback loops • Many tasks require learning a temporal sequence of events: speech, video, text, market data • These problems can be broken into 3 distinct types of task: 1. Sequence recognition: produce a particular output pattern when a specific input sequence is seen. Application: speech recognition. 2. Sequence reproduction: generate the rest of a sequence when the network sees only part of it. Applications: time-series prediction (stock market, sun spots, etc.). 3. Temporal association: produce a particular output sequence in response to a specific input sequence. Application: speech generation.

  11. RNN structure • Often layers are stacked vertically (deep RNNs). [Figure: two unrolled RNN layers over time steps 0–2; the lower layer maps inputs x_0, x_1, x_2 to hidden states h_00, h_01, h_02 and outputs y_00, y_01, y_02, which feed an upper layer with hidden states h_10, h_11, h_12 and outputs y_10, y_11, y_12. The same parameters are shared across time within each level; higher levels represent higher-level, more abstract features.]

  12-18. RNN structure • Backprop still works: it is called Backpropagation Through Time (BPTT). [Animation across these slides: activations flow forward through the unrolled, stacked network, from the inputs x_0, x_1, x_2 through the hidden states to the outputs at every level and time step.]

  19-25. RNN structure • Backprop still works. [Animation across these slides: gradients flow backward through the same unrolled network, from the outputs back through the hidden states, across layers and time steps.]

  26. The memory problem with RNNs • An RNN models signal context • If a very long context is needed, standard RNNs become unable to learn the context information: the gradients propagated across many time steps vanish (or explode), so long-range dependencies are effectively not learned.
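A quick numerical illustration (my own, not from the slides) of why this happens: the gradient reaching a step k time steps back is a product of k per-step Jacobians of the form diag(tanh'(a)) Whh, and the norm of that product tends to shrink or grow exponentially with k.

```python
# Back-propagating a gradient through many vanilla-RNN steps: the norm of the
# accumulated gradient typically vanishes (or explodes) exponentially.
import numpy as np

rng = np.random.default_rng(1)
D_h = 8
Whh = rng.standard_normal((D_h, D_h)) * 0.3   # recurrent weights (small scale)
grad = np.ones(D_h)                           # gradient arriving at the last step

for k in range(1, 51):
    a = rng.standard_normal(D_h)              # toy pre-activation at this step
    jacobian = np.diag(1.0 - np.tanh(a) ** 2) @ Whh
    grad = jacobian.T @ grad                  # back-propagate one step in time
    if k % 10 == 0:
        print(f"after {k:2d} steps: ||grad|| = {np.linalg.norm(grad):.2e}")
# With small recurrent weights the norm typically shrinks toward 0; scaling Whh
# up makes it explode instead. LSTMs (next slides) are designed to avoid this.
```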

  27. Standard RNNs to LSTM [Figure: side-by-side comparison of a standard RNN cell and an LSTM cell.]

  28. LSTM illustrated: input and forming new memory • The LSTM cell takes the following inputs (all vectors): the current input x_t, the past memory output h_{t-1}, and the past cell state C_{t-1}. [Figure: the cell state runs through the cell and is updated via a forget gate, an input gate, and the new candidate memory.]

  29. LSTM illustrated: output • The output of the cell is formed by applying the output gate to the cell state. [Figure: overall picture of the LSTM cell.]

  30. LSTM Equations
• i = σ(x_t U_i + h_{t-1} W_i)
• f = σ(x_t U_f + h_{t-1} W_f)
• o = σ(x_t U_o + h_{t-1} W_o)
• g = tanh(x_t U_g + h_{t-1} W_g)
• c_t = c_{t-1} ∘ f + g ∘ i
• h_t = tanh(c_t) ∘ o
• y = softmax(V h_t)
• i: input gate, how much of the new information will be let through to the memory cell.
• f: forget gate, responsible for deciding what information should be thrown away from the memory cell.
• o: output gate, how much of the information will be exposed to the next time step.
• g: self-recurrent update, equal to that of a standard RNN.
• c_t: internal memory of the memory cell.
• h_t: hidden state.
• y: final output. [Figure: LSTM memory cell]
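The equations above translate directly into code. The following NumPy sketch of a single LSTM step is my own illustration with assumed weight names and toy dimensions, not the lecture's code.

```python
# One LSTM step: gates i, f, o, candidate g, cell state c_t, hidden state h_t,
# and a softmax output y, following the equations on slide 30.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def lstm_step(x_t, h_prev, c_prev, U, W, V):
    """U[k], W[k] hold the input/recurrent weights for k in {'i','f','o','g'}."""
    i = sigmoid(x_t @ U['i'] + h_prev @ W['i'])   # input gate
    f = sigmoid(x_t @ U['f'] + h_prev @ W['f'])   # forget gate
    o = sigmoid(x_t @ U['o'] + h_prev @ W['o'])   # output gate
    g = np.tanh(x_t @ U['g'] + h_prev @ W['g'])   # candidate memory
    c_t = c_prev * f + g * i                      # new cell state
    h_t = np.tanh(c_t) * o                        # new hidden state
    y = softmax(h_t @ V)                          # final output
    return h_t, c_t, y

# Toy dimensions: 4-d input, 6-d hidden state, 3-way output.
rng = np.random.default_rng(2)
D_in, D_h, D_out = 4, 6, 3
U = {k: rng.standard_normal((D_in, D_h)) * 0.1 for k in 'ifog'}
W = {k: rng.standard_normal((D_h, D_h)) * 0.1 for k in 'ifog'}
V = rng.standard_normal((D_h, D_out)) * 0.1
h, c = np.zeros(D_h), np.zeros(D_h)
h, c, y = lstm_step(rng.standard_normal(D_in), h, c, U, W, V)
print(y.sum())   # probabilities sum to 1
```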

  31. LSTM output synchronization

  32. (NLP) Applications of RNNs • Section overview – Language Model – Sentiment analysis / text classification – Machine translation and conversation modeling – Sentence skip-thought vectors

  33. RNN for language modeling

  34. Sentiment analysis / text classification • A quick example, to see the idea • Given text collections and their labels, predict labels for unseen texts.
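One common recipe for this task, sketched below with my own toy NumPy code (assumed sizes and names, not from the slides): embed the tokens, run an RNN over the sequence, and feed the final hidden state to a softmax classifier.

```python
# RNN text classification sketch: embeddings -> vanilla RNN -> softmax on the
# final hidden state.
import numpy as np

rng = np.random.default_rng(3)
V_size, D_emb, D_h, n_classes = 1000, 16, 32, 2   # vocab, embedding, hidden, labels

E = rng.standard_normal((V_size, D_emb)) * 0.1    # word embeddings
Wxh = rng.standard_normal((D_h, D_emb)) * 0.1
Whh = rng.standard_normal((D_h, D_h)) * 0.1
Wc = rng.standard_normal((n_classes, D_h)) * 0.1  # classifier on last hidden state

def classify(token_ids):
    h = np.zeros(D_h)
    for t in token_ids:                           # read the text left to right
        h = np.tanh(Wxh @ E[t] + Whh @ h)
    logits = Wc @ h                               # final state summarizes the text
    p = np.exp(logits - logits.max())
    return p / p.sum()

print(classify([12, 7, 301, 45]))                 # e.g. P(negative), P(positive)
```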

  35. Translating Videos to Natural Language Using Deep Recurrent Neural Networks • Subhashini Venugopalan, Huijuan Xu, Jeff Donahue, Marcus Rohrbach, Raymond Mooney, Kate Saenko. North American Chapter of the Association for Computational Linguistics (NAACL), Denver, Colorado, June 2015.

  36. Composing music with RNN http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/

  37. CNN-LSTM-DNN for speech recognition • Ensembles of RNN/LSTM, DNN, and ConvNets (CNN) give huge gains (state of the art): T. Sainath, O. Vinyals, A. Senior, H. Sak, “Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks,” ICASSP 2015.
