Multi-Dimensional LSTM Networks for Video Prediction

Wonmin Byeon, NVIDIA Research, March 29, 2018


  1. Multi-Dimensional LSTM Networks for Video Prediction. Wonmin Byeon, NVIDIA Research, March 29, 2018.

  2. Convolutional Neural Networks
     • Semantic segmentation: SegNet [Badrinarayanan16]
     • Biomedical image segmentation: U-Net [Ronneberger15]
     • Page segmentation: Fully-CNN [Wick18]
     • Fluid simulation / pressure solve: CNN [Tompson17]

  3. Convolutional Neural Networks
     Examples: semantic segmentation with SegNet [Badrinarayanan16]; biomedical image segmentation with U-Net [Ronneberger15].
     • Needs a lot of computation. Each window can be computed in parallel, so an efficient GPU implementation is possible.
     • Has a fixed-size receptive field.
     • Perceives only a small local context around each pixel.
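The fixed receptive field mentioned above can be made concrete with a little arithmetic. The sketch below is not from the slides: it applies the standard recurrence for stacked convolution and pooling layers, and the function name `receptive_field` and the example layer lists are hypothetical.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, per axis) of a stack of conv/pool layers.

    `layers` is a list of (kernel_size, stride) pairs ordered from input to output.
    """
    rf = 1      # a single unit initially sees one input pixel
    jump = 1    # spacing, in input pixels, between adjacent units of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump   # each extra tap reaches `jump` pixels further
        jump *= stride                   # striding spreads later taps further apart
    return rf


# Hypothetical example stacks (not taken from the talk).
three_convs = receptive_field([(3, 1)] * 3)                 # three 3x3 convs, stride 1
with_pooling = receptive_field([(3, 1), (2, 2), (3, 1)])    # conv, 2x2 pool, conv
```

Under these assumptions, three stacked 3x3 convolutions see only 7 pixels per axis, and the value is fixed by the architecture regardless of image size, which is the limitation the slide points at.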

  7. Convolutional Neural Networks (images from Zheng’s ECCV16 tutorial)
     • Needs a lot of computation. Each window can be computed in parallel, so an efficient GPU implementation is possible.
     • Has a fixed-size receptive field.
     • Perceives only a small local context around each pixel.
     Solutions?

  9. Convolutional Neural Networks: solutions?
     • Up-pooling (deconvolution): DeconvNet [Noh16]
     • Adding a Conditional Random Field (CRF): DeepLab [Chen16]
     • Using dilated/atrous convolutions: Dilated Convolutions [Yu15], DeepLab V2 [Chen16]
     Animation from https://github.com/vdumoulin/conv_arithmetic
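To illustrate what a dilated (atrous) convolution does, here is a minimal 1D sketch in plain Python. It is an illustration only, not code from the talk or from any of the cited systems; the function name `dilated_conv1d` and the toy signal are assumptions.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with `dilation - 1` skipped samples between taps."""
    span = (len(kernel) - 1) * dilation + 1   # input samples covered by one output value
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


signal = list(range(10))    # toy input: 0, 1, ..., 9
kernel = [1, 1, 1]          # three taps, so the parameter count is the same in both calls

dense = dilated_conv1d(signal, kernel, dilation=1)    # each output covers 3 adjacent samples
atrous = dilated_conv1d(signal, kernel, dilation=3)   # each output spans 7 samples
```

With the same three weights, the dilated version covers a span of 7 input samples instead of 3. The context is wider, but it is still a fixed window.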

  11. Convolutional Neural Networks: solutions?
      • Using dilated convolutions and going deeper: DeepLab V3 [Chen17]
      • Fusing multiple resolutions: RefineNet [Lin16]
      • Adopting large kernels: Global-CNN [Peng17]

  13. How can we efficiently capture global/long-range context? (Image from http://staffwww.dcs.shef.ac.uk/people/H.Lu/feeler.html)

  17. Long Short-Term Memory Recurrent Networks

  18. LSTM Networks for Sequence Learning
      • Speech [Graves05, Graves06]
      • Handwriting [Liwicki07, Graves09]

  19. Sequence Classification Task with Dependencies: mapping x_1 x_2 ... to y_1 y_2 ..., that is, F : x ∈ X* → y ∈ Y*.
      [Figure: inputs x_1 ... x_4 paired with outputs y_1 ... y_4.]

  20. Sequence Classification Task with Dependencies: mapping x_1 x_2 ... to y_1 y_2 ....
      [Figure: input sequence x* (x_1 ... x_4, ...) and output sequence y* (y_1 ... y_4, ...), with a hidden state labelled h_1.]

  21. 1-Dimensional LSTM Networks
      • Standard LSTM [Hochreiter97, Gers99]
      • Bidirectional LSTM [Graves05, Chen05]
      [Figure: unrolled networks with Input (..., x_{t-1}, x_t, x_{t+1}, ...), Hidden Layer (LSTM) and Output (y_t). The standard LSTM carries the state h_{t-1}; the bidirectional variant adds a second LSTM with state h_{t+1}.]
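The two 1D variants can be sketched in a few lines. This is a minimal scalar illustration using the standard LSTM gate equations with a forget gate. The slides give no equations, so the function names, the parameter layout (`p` maps a gate name to an input weight, a recurrent weight and a bias) and the choice of scalar states are all assumptions made for brevity.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell; p[gate] = (w_x, w_h, bias)."""
    def pre(gate):
        w_x, w_h, b = p[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))     # how much new content to write
    f = sigmoid(pre("forget"))    # how much of the old cell state to keep
    o = sigmoid(pre("output"))    # how much of the cell state to expose
    g = math.tanh(pre("cell"))    # candidate cell content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


def run_lstm(xs, p):
    """Standard (unidirectional) LSTM: h_t depends on x_1 ... x_t only."""
    h, c, hs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        hs.append(h)
    return hs


def run_bidirectional(xs, p_fwd, p_bwd):
    """Bidirectional LSTM: pair a left-to-right pass with a right-to-left pass."""
    fwd = run_lstm(xs, p_fwd)
    bwd = run_lstm(xs[::-1], p_bwd)[::-1]   # reverse back so index t lines up
    return list(zip(fwd, bwd))
```

In the bidirectional case, each position receives one state summarising everything before it and one summarising everything after it, so the output at step t can use the whole sequence.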

  23. Multi-Dimensional LSTM Networks: Scene Labeling with LSTM Recurrent Neural Networks [Byeon15]

  24. 2-Dimensional LSTM Networks for images
      [Animation over several slides; red marks the current pixel.]

  33. 2-Dimensional LSTM Networks for images
      [Figure from "Scene Labeling with LSTM Recurrent Neural Networks" [Byeon15]: network diagram with the labels Input I_k, 3x1x1, four LSTM blocks (LSTM Layer), Hidden Layer and Output.]
      • Perceives the entire spatio-temporal context of each pixel in a few sweeps through all pixels.
      • Requires fewer parameters to take both local and global context into account.
      • End-to-end learning; no pre- or post-processing.
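The idea of reaching every pixel's full context "in a few sweeps" can be sketched as follows. This is not the [Byeon15] implementation. It assumes a common multi-dimensional LSTM formulation with one forget gate per spatial direction, uses scalar states, and shares one hypothetical parameter set `p` across all four sweep directions purely to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm2d_sweep(image, p):
    """Scan a 2D grid from top-left to bottom-right with a scalar 2D LSTM cell.

    p[gate] = (w_x, w_left, w_up, bias); all weights are hypothetical.
    """
    rows, cols = len(image), len(image[0])
    h = [[0.0] * cols for _ in range(rows)]
    c = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for q in range(cols):
            # States of the already-visited left and upper neighbours (zero at borders).
            h_left = h[r][q - 1] if q > 0 else 0.0
            c_left = c[r][q - 1] if q > 0 else 0.0
            h_up = h[r - 1][q] if r > 0 else 0.0
            c_up = c[r - 1][q] if r > 0 else 0.0

            def pre(gate):
                w_x, w_l, w_u, b = p[gate]
                return w_x * image[r][q] + w_l * h_left + w_u * h_up + b

            i = sigmoid(pre("input"))
            f_l = sigmoid(pre("forget_left"))   # one forget gate per direction
            f_u = sigmoid(pre("forget_up"))
            o = sigmoid(pre("output"))
            g = math.tanh(pre("cell"))
            c[r][q] = f_l * c_left + f_u * c_up + i * g
            h[r][q] = o * math.tanh(c[r][q])
    return h


def flip(grid, vertical, horizontal):
    """Return a copy of the grid mirrored along the requested axes."""
    rows = grid[::-1] if vertical else grid
    return [row[::-1] if horizontal else row[:] for row in rows]


def lstm2d_four_directions(image, p):
    """Sweep from all four corners; return a 4-tuple of hidden states per pixel."""
    outs = []
    for v in (False, True):
        for hz in (False, True):
            swept = lstm2d_sweep(flip(image, v, hz), p)
            outs.append(flip(swept, v, hz))   # undo the flip to realign pixels
    rows, cols = len(image), len(image[0])
    return [[tuple(o[r][q] for o in outs) for q in range(cols)] for r in range(rows)]
```

A single sweep gives each pixel the context above and to the left of it. Combining the four corner-to-corner sweeps gives every pixel a path to every other pixel, with the same small set of weights reused at each position.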
