Bidirectional Recurrent Convolutional Networks for Video Super-Resolution

Bidirectional Recurrent Convolutional Networks for Video Super-Resolution - PowerPoint PPT Presentation

Bidirectional Recurrent Convolutional Networks for Video Super-Resolution. Qi Zhang & Yan Huang, Center for Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA).


  1. Bidirectional Recurrent Convolutional Networks for Video Super-Resolution Qi Zhang & Yan Huang Center for Research on Intelligent Perception and Computing (CRIPAC) National Laboratory of Pattern Recognition (NLPR) Institute of Automation, Chinese Academy of Sciences (CASIA) May 10, 2017

  2. CRIPAC CRIPAC mainly focuses on the following research topics related to national public security: • Biometrics • Image and Video Analysis • Big Data and Multi-modal Computing • Content Security and Authentication • Sensing and Information Acquisition. CRIPAC receives regular funding from various government departments and agencies, and is also supported by R&D project funds from many other national and international sources. CRIPAC members publish widely in leading national and international journals and conferences such as IEEE Transactions on PAMI, IEEE Transactions on Image Processing, International Journal of Computer Vision, Pattern Recognition, Pattern Recognition Letters, ICCV, ECCV, CVPR, ACCV, ICPR, ICIP, etc. http://cripac.ia.ac.cn/en/EN/volumn/home.shtml 2

  3. NVAIL Artificial Intelligence Laboratory: research on artificial intelligence and deep learning 3

  4. Outline 1 Deep Learning 2 Recurrent Convolutional Networks 3 Application to Video Super-Resolution 4 Future Work 4

  5. Outline 1 Deep Learning 2 Recurrent Convolutional Networks 3 Application to Video Super-Resolution 4 Future Work 5

  6. Deep Neural Networks (DNN) • Originates from: - 1962 – simple/complex cells, Hubel and Wiesel - 1970 – efficient error backpropagation, Linnainmaa - 1979 – deep neocognitron, convolution, Fukushima - 1987 – autoencoder, Ballard - 1989 – backpropagation for CNN, LeCun - 1991 – fundamental deep learning problem, Hochreiter - 1991 – deep recurrent neural network, Schmidhuber - 1997 – supervised LSTM RNN, Hochreiter & Schmidhuber • Two drawbacks: large numbers of parameters → high computational cost; small training sets → over-fitting 6

  7. Two Recent Developments • Big Data: [bar chart of video surveillance data size (PB), 2009–2014, growing from roughly 10,000 PB to 87,360 PB] • Cheap Computation → DNN can thus be fitted efficiently 7

  8. Deep Learning: The Resurgence of DNN • 2006 – breakthrough, representation learning • 2012 – CNN for visual tasks (ImageNet: 74% vs. 85%) • 2014 – DeepFace, CVPR2014; RCNN for detection, CVPR2014 • RNN for sequence analysis: activity recognition, CVPR2015; video captioning, CVPR2015 • Deep Learning promotes the fast development of various visual computing areas 8

  9. Outline 1 Deep Learning 2 Recurrent Convolutional Networks 3 Application to Video Super-Resolution 4 Future Work 9

  10. Deep Neural Networks (DNN) • $\mathbf{x} \in \mathbb{R}^{d}$, $\mathbf{h} \in \mathbb{R}^{n}$, $\mathbf{W} \in \mathbb{R}^{d \times n}$ • $\mathbf{h} = \sigma(\mathbf{x}\mathbf{W})$, where $\sigma(t) = \frac{1}{1 + e^{-t}}$ is the sigmoid function 10
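To make the layer concrete, here is a minimal NumPy sketch of the slide's single fully connected layer $\mathbf{h} = \sigma(\mathbf{x}\mathbf{W})$; the sizes and random data are illustrative assumptions, not values from the presentation.

```python
import numpy as np

def sigmoid(t):
    # sigma(t) = 1 / (1 + e^(-t)), squashes activations into (0, 1)
    return 1.0 / (1.0 + np.exp(-t))

d, n = 8, 4                      # illustrative input/hidden sizes
rng = np.random.default_rng(0)
x = rng.normal(size=(1, d))      # input row vector x in R^d
W = rng.normal(size=(d, n))      # weight matrix W in R^(d x n)
h = sigmoid(x @ W)               # hidden activations h in R^n
print(h.shape)                   # (1, 4)
```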

  11. Recurrent Neural Networks (RNN) • Temporal dependency modeling (RNN vs. DNN) • $\mathbf{x}_t \in \mathbb{R}^{d}$, $\mathbf{h}_t \in \mathbb{R}^{n}$, $\mathbf{W} \in \mathbb{R}^{d \times n}$, $\mathbf{V} \in \mathbb{R}^{n \times n}$ • $\mathbf{h}_t = \sigma(\mathbf{x}_t \mathbf{W} + \mathbf{h}_{t-1} \mathbf{V})$ 11
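The recurrence can be sketched the same way: the hidden state is carried across time steps and mixed with each new input. Sizes, data, and the zero initial state below are illustrative assumptions.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

d, n, T = 8, 4, 5                 # input size, hidden size, sequence length
rng = np.random.default_rng(0)
W = rng.normal(size=(d, n))       # input-to-hidden weights
V = rng.normal(size=(n, n))       # hidden-to-hidden (recurrent) weights
xs = rng.normal(size=(T, d))      # toy input sequence x_1 ... x_T

h = np.zeros(n)                   # h_0 = 0
for x_t in xs:
    h = sigmoid(x_t @ W + h @ V)  # h_t = sigma(x_t W + h_{t-1} V)
print(h)
```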

  12. Recurrent Convolutional Networks (RCN) DNN: Deep Neural Networks; RNN: Recurrent Neural Networks; CNN: Convolutional Neural Networks. [Diagram: making a DNN's connections convolutional gives a CNN, making them sequential (recurrent) gives an RNN, and combining the convolutional and sequential schemes gives an RCN] 12
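A minimal sketch of the combination, assuming a single feature map and 3×3 kernels for brevity: the RNN's matrix products are replaced by convolutions, so the same recurrence runs over whole frames of any spatial size.

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(0)
frames = rng.normal(size=(5, 32, 32))   # toy video: 5 frames of 32 x 32
Wv = 0.1 * rng.normal(size=(3, 3))      # feedforward (input) kernel
Wr = 0.1 * rng.normal(size=(3, 3))      # recurrent (hidden-to-hidden) kernel

H = np.zeros((32, 32))                  # hidden feature map, H_0 = 0
for X_t in frames:
    # H_t = sigma(X_t * Wv + H_{t-1} * Wr), with * denoting 2-D convolution
    H = sigmoid(convolve2d(X_t, Wv, mode="same")
                + convolve2d(H, Wr, mode="same"))
print(H.shape)                          # (32, 32): works for any frame size
```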

  13. Applications of RCN: Video SR, NIPS15 & TPAMI17; Scene Labeling, NIPS15; Weather Nowcasting, NIPS15; Action Recognition, ICLR15; Person ReID, CVPR16; Object Recognition, CVPR15 13

  14. Outline 1 Deep Learning 2 Recurrent Convolutional Networks 3 Application to Video Super-Resolution 4 Future Work 14

  15. Video Super-Resolution • High-resolution display devices need high-resolution videos to display • Super-resolution: denoising, deblurring, upscaling of low-resolution videos • There is a great need for super resolving low-resolution videos 15

  16. Two Main Approaches (1/2) 1. Single-image super-resolution [1-6] • One-to-one scheme: super resolve each video frame independently • Ignores the intrinsic temporal dependency relation of video frames • Low computational complexity, fast. [1] Dong et al., Learning a deep convolutional network for image super-resolution. ECCV, 2014. [2] Timofte et al., Anchored neighborhood regression for fast example-based super-resolution. ICCV, 2013. [3] Zeyde et al., On single image scale-up using sparse-representations. Curves and Surfaces, 2012. [4] Yang et al., Image super-resolution via sparse representation. IEEE TIP, 2010. [5] Bevilacqua et al., Low-complexity single-image super-resolution. BMVC, 2012. [6] Chang et al., Super-resolution through neighbor embedding. CVPR, 2004. 16
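The one-to-one scheme amounts to a plain loop over frames; in this sketch `sr_model` is a placeholder for any of the single-image methods cited above, not code from those papers.

```python
def upscale_video_frame_by_frame(frames, sr_model):
    """Apply a single-image SR model to each frame independently."""
    # No temporal information is shared between frames.
    return [sr_model(frame) for frame in frames]
```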

  17. Two Main Approaches (2/2) 2. Multi-frame super-resolution [7-11] • Many-to-one scheme: use multiple adjacent frames to super resolve a frame • Models the temporal dependency relation by motion estimation • High computational complexity, slow. [7] Liu and Sun, On Bayesian adaptive video super-resolution. IEEE PAMI, 2014. [8] Takeda et al., Super-resolution without explicit subpixel motion estimation. IEEE TIP, 2009. [9] Mitzel et al., Video super-resolution using duality based TV-L1 optical flow. PR, 2009. [10] Protter et al., Generalizing the nonlocal-means to super-resolution reconstruction. IEEE TIP, 2009. [11] Fransens et al., Optical flow based super-resolution: A probabilistic approach. CVIU, 2007. 17
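By contrast, the many-to-one scheme gathers a window of adjacent frames, aligns them to the target frame, and fuses them. `estimate_motion` and `fuse` below are hypothetical placeholders for the motion-estimation and reconstruction steps that make these methods slow.

```python
def upscale_frame_multi_frame(frames, t, radius, estimate_motion, fuse):
    """Super resolve frame t from its neighbors within +/- radius."""
    window = frames[max(0, t - radius): t + radius + 1]         # adjacent frames
    aligned = [estimate_motion(f, frames[t]) for f in window]   # warp onto frame t
    return fuse(aligned)                                        # reconstruct frame t
```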

  18. Motivation RNN: Recurrent Neural Networks; SR: Super-Resolution • RNN can model long-term contextual information of temporal sequences well • Convolutional operations can scale to full videos of any spatial size and temporal step ➢ Propose bidirectional recurrent convolutional networks, different from the vanilla RNN: 1. Commonly-used full connections are replaced with weight-sharing convolutions 2. Conditional convolutions are added for learning the visual-temporal dependency relation 18

  19. Bidirectional Recurrent Convolutional Networks • Feedforward convolutions learn the spatial dependency between a low-resolution frame and its high-resolution result • Recurrent convolutions model the long-term temporal dependency relation across video frames • Conditional convolutions enhance visual-temporal dependency relation modeling 19
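A minimal single-channel sketch of one sub-network's hidden update, assuming one feature map, 3×3 kernels, and placeholder weights (the actual model stacks multiple layers and filters plus an output layer): the feedforward convolution sees the current frame, the recurrent convolution sees the previous hidden map, and the conditional convolution sees the previous input frame. The backward sub-network runs the same update over the reversed frame order.

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def brcn_pass(frames, Wv, Wr, Wt):
    """One directional pass; frames is a sequence of 2-D arrays."""
    H = np.zeros_like(frames[0])            # hidden map, H_0 = 0
    prev_frame = np.zeros_like(frames[0])   # previous input frame, X_0 = 0
    hidden = []
    for X_t in frames:
        H = sigmoid(convolve2d(X_t, Wv, mode="same")            # feedforward
                    + convolve2d(H, Wr, mode="same")            # recurrent
                    + convolve2d(prev_frame, Wt, mode="same"))  # conditional
        hidden.append(H)
        prev_frame = X_t
    return hidden

# Bidirectional: run a forward pass and a backward pass over reversed frames,
# then combine the two sets of hidden maps, e.g.
#   forward  = brcn_pass(frames, Wv_f, Wr_f, Wt_f)
#   backward = brcn_pass(frames[::-1], Wv_b, Wr_b, Wt_b)[::-1]
```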

  20. Learning • Define an end-to-end mapping $P(\cdot)$ from low-resolution frames $\mathcal{Y}$ to high-resolution frames $\mathcal{Z}$ • Learning proceeds by optimizing the Mean Square Error (MSE) between the predicted frames $P(\mathcal{Y})$ and $\mathcal{Z}$: $M = \lVert P(\mathcal{Y}) - \mathcal{Z} \rVert^2$ – stochastic gradient descent – small learning rate in the output layer: 1e-4 20
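Schematically, the objective and update look as follows; `predict`-style forward and backward passes are abstracted into the `grads` argument, and the 1e-3 rate for the earlier layers is an assumed value (only the 1e-4 output-layer rate is stated on the slide).

```python
import numpy as np

def mse_loss(pred, target):
    # Mean Square Error between predicted frames P(Y) and ground truth Z
    return np.mean((pred - target) ** 2)

def sgd_step(params, grads, lrs):
    # params, grads: dicts of layer name -> array; lrs: per-layer learning rates,
    # e.g. {"hidden": 1e-3, "output": 1e-4} (1e-4 for the output layer per the slide)
    for name in params:
        params[name] = params[name] - lrs[name] * grads[name]
    return params
```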

  21. Experiments • Train the model on 25 YUV-format video sequences (training videos) – volume-based training – number of volumes: roughly 41,000 – volume size: 32 × 32 × 10 • Test on a variety of real-world videos (testing videos) with – severe motion blur – motion aliasing – complex motions 21
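The volume-based training data can be sketched as cutting each training video into 32 × 32 × 10 spatio-temporal blocks; the sampling stride and the non-overlapping temporal split below are assumptions for illustration, not the paper's exact settings.

```python
import numpy as np

def extract_volumes(video, size=32, depth=10, stride=16):
    """video: array of shape (T, H, W) holding luminance frames."""
    T, H, W = video.shape
    volumes = []
    for t in range(0, T - depth + 1, depth):          # temporal blocks of 10 frames
        for y in range(0, H - size + 1, stride):      # spatial 32 x 32 patches
            for x in range(0, W - size + 1, stride):
                volumes.append(video[t:t + depth, y:y + size, x:x + size])
    return np.stack(volumes)                          # (N, 10, 32, 32)
```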

  22. PSNR Comparison PSNR: peak signal-to-noise ratio. Table 1: The results of PSNR (dB) and test time (sec) on the test video sequences. Our method surpasses state-of-the-art methods in PSNR, due to the effective temporal dependency modeling. [1] Video Enhancer. http://www.infognition.com/videoenhancer/, version 1.9.10, 2014. [4] Bevilacqua et al., Low-complexity single-image super-resolution. BMVC, 2012. [5] Chang et al., Super-resolution through neighbor embedding. CVPR, 2004. [6] Dong et al., Learning a deep convolutional network for image super-resolution. ECCV, 2014. [20] Takeda et al., Super-resolution without explicit subpixel motion estimation. IEEE TIP, 2009. [22] Timofte et al., Anchored neighborhood regression for fast example-based super-resolution. ICCV, 2013. [24] Yang et al., Image super-resolution via sparse representation. IEEE TIP, 2010. [25] Zeyde et al., On single image scale-up using sparse-representations. Curves and Surfaces, 2012. 22
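For reference, the PSNR reported in the table can be computed per frame as below, assuming 8-bit images with a peak value of 255.

```python
import numpy as np

def psnr(pred, target, peak=255.0):
    """Peak signal-to-noise ratio in dB between a result and its ground truth."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                 # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```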

  23. Model Architecture • Investigate the impact of the model architecture on performance • Take a simplified network containing only the feedforward (v) convolution as a benchmark • Study its variants by successively adding the bidirectional (b), recurrent (r) and conditional (t) schemes. Table: The results of PSNR (dB) by variants of BRCN on the testing video sequences. 23

  24. Running Time Figure: Speed vs. PSNR for all the compared methods. Our method outperforms both single-image and multi-frame SR methods in PSNR, and achieves speed comparable to the fastest single-image SR methods. 24

  25. Closeup Comparison Figure: Comparison among original frames (the 2nd, 3rd and 4th frames, from the top row to the bottom) of the Dancing video and the super resolved results by Bicubic, 3DSKR, ANR and BRCN, respectively. Our method is able to recover more image details than the others under severe motion conditions. 25

  26. Example Upscaling factor: 4 (87 × 157 → 348 × 628). Comparison: Bicubic (top) vs. Ours (bottom) 26
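The bicubic baseline in this comparison can be reproduced with a standard resize call, here sketched with OpenCV; upscaling an 87 × 157 frame by a factor of 4 yields 348 × 628.

```python
import cv2

def bicubic_upscale(frame, factor=4):
    h, w = frame.shape[:2]                                # e.g. 87 x 157
    return cv2.resize(frame, (w * factor, h * factor),    # dsize is (width, height)
                      interpolation=cv2.INTER_CUBIC)      # e.g. 348 x 628 result
```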
