Efficient Neural Networks for Image Restoration - Yulun Zhang (PowerPoint presentation)



SLIDE 1

Efficient Neural Networks for Image Restoration

Yulun Zhang Supervisor: Prof. Yun Fu SMILE lab, Northeastern University, Boston, US

SLIDE 2

Research summary

Deep convolutional neural networks for image restoration:
1. Residual dense network [CVPR-2018]: comparable state-of-the-art performance with far fewer parameters.
2. Residual channel attention network [ECCV-2018]: very deep network with channel attention.
3. Residual non-local attention network for image restoration [ICLR-2019]: mixed channel and spatial attention.

SLIDE 3

Research status

Prior work: SRCNN, FSRCNN, VDSR, SRResNet, MemNet, EDSR.

Feature extraction in HR space. Limitations: increases computational complexity; blurs the original LR inputs.

Feature extraction in LR space. Limitations: neglects to use hierarchical features; very deep and wide networks are hard to train.

Challenges:
 Hard to recover lost details.
 Hierarchical features in the LR feature space.
 Local feature extraction only.
 Hard to train very deep and wide networks.
Objects have various scales, angles of view, and aspect ratios.

Residual Dense Network for Image Super-Resolution (CVPR-2018)

SLIDE 4

Method

[Architecture diagram] RDN pipeline: shallow feature extraction (Conv layers) on the LR input → residual dense blocks (Block 1 … Block D) → global feature fusion (concat + 1×1 Conv + Conv) with global residual learning → upscale → HR output.

[Block diagram] Residual dense block (Block d): contiguous memory passes the output of Block d−1 into densely connected Conv+ReLU layers; local feature fusion (concat + 1×1 Conv) and local residual learning produce the output fed to Block d+1.
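The wiring of the residual dense block can be sketched in a few lines. A minimal NumPy sketch, not the paper's implementation: pointwise (1×1) channel-mixing layers with random weights stand in for the 3×3 convolutions, so only the dense connections, local feature fusion, and local residual learning are shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    # Pointwise convolution: mixes channels independently at each position.
    # x: (H, W, C_in), w: (C_in, C_out) -> (H, W, C_out)
    return x @ w

def rdb(x, growth_rate=4, num_layers=3):
    """Residual dense block sketch: dense connections, local feature
    fusion (1x1 conv), and local residual learning."""
    c0 = x.shape[-1]
    feats = [x]
    for _ in range(num_layers):
        cat = np.concatenate(feats, axis=-1)          # dense connections
        w = rng.standard_normal((cat.shape[-1], growth_rate)) * 0.1
        feats.append(np.maximum(conv1x1(cat, w), 0))  # Conv + ReLU
    cat = np.concatenate(feats, axis=-1)              # contiguous memory
    w_fuse = rng.standard_normal((cat.shape[-1], c0)) * 0.1
    fused = conv1x1(cat, w_fuse)                      # local feature fusion
    return x + fused                                  # local residual learning

x = rng.standard_normal((8, 8, 16))
y = rdb(x)
print(y.shape)  # (8, 8, 16)
```

Because local feature fusion maps the concatenated features back to the input channel count, the output shape matches the input, so blocks can be stacked D times.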


SLIDE 5

Study of D, C, and G.

The number of RDBs (denoted D), the number of Conv layers per RDB (denoted C), and the growth rate (denoted G). Analyses: our RDN allows deeper and wider networks, from which more hierarchical features are extracted for higher performance.
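As a concrete reading of the growth rate: with dense connections, the c-th Conv layer inside an RDB receives the block input plus the outputs of all c−1 preceding layers, so its input channel count grows linearly with G. The numbers below (G0 = 64 input channels, G = 32) are illustrative, not taken from the slide's tables.

```python
# With dense connections, the c-th Conv layer of an RDB sees the block input
# (G0 channels) plus the G-channel output of each of the c-1 preceding layers.
def rdb_input_channels(G0, G, c):
    return G0 + (c - 1) * G

# Illustrative setting: G0 = 64, G = 32.
print(rdb_input_channels(64, 32, 1))  # 64  (first layer sees only the block input)
print(rdb_input_channels(64, 32, 6))  # 224 (64 + 5 * 32)
```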

SLIDE 6

Ablation Investigation.

Ablation investigation on the effects of contiguous memory (CM), local residual learning (LRL), and global feature fusion (GFF). Analyses: these quantitative and visual analyses demonstrate the effectiveness and benefits of our proposed CM, LRL, and GFF.

SLIDE 7


SLIDE 8

Visual Results with BI Degradation Model.


SLIDE 9

Visual Results with BD Degradation Model.


SLIDE 10

Visual Results with DN Degradation Model.


More results about image restoration

arXiv-2018: Residual Dense Network for Image Restoration. https://arxiv.org/abs/1812.10477

SLIDE 11

Motivations for our next work (ECCV-2018-RCAN)

 Less GPU memory. A wide network can consume too much GPU memory (4 GPUs, or 1 GPU with the batch split).
 Smaller model size. To further decrease the number of network parameters (CVPRW-17-EDSR: 43M, CVPR-18-RDN: 22M).
 Better performance. A very deep network should achieve better performance.

SLIDE 12

Image super-resolution using very deep residual channel attention networks (ECCV-2018)

Limitations of previous methods

 Whether deeper networks can further contribute to image SR, and how to construct very deep trainable networks, remains to be explored. Deepest networks for image SR: ICCV-2017-MemNet_M10R10_212C64, CVPRW-2017-EDSR.
 Previous networks lack the ability to discriminate across feature channels, which ultimately hinders the representational power of deep networks.

SLIDE 13

Network architecture

[Architecture diagram] RCAN: a Conv layer extracts shallow features from the LR input; the residual in residual (RIR) structure stacks G residual groups (RG-1 … RG-G) under a long skip connection to produce deep features FDF; each residual group maps Fg−1 to Fg through B residual channel attention blocks (RCAB-1 … RCAB-B) with a short skip connection; an upscale module and a final Conv give the HR output.

Contributions

 We propose the very deep residual channel attention networks (RCAN) for highly accurate image SR.
 We propose the residual in residual (RIR) structure to construct very deep trainable networks. The long and short skip connections in RIR help to bypass abundant low-frequency information and make the main network learn more effective information.
 We propose a channel attention (CA) mechanism to adaptively rescale features by considering interdependencies among feature channels.
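The RIR wiring amounts to nested skip connections. A minimal NumPy sketch under simplifying assumptions: each RCAB is reduced to a generic residual channel-mixing block with random weights (the channel attention itself is omitted here), so only the short and long skip connections are shown.

```python
import numpy as np

rng = np.random.default_rng(1)

def rcab_stub(x):
    # Stand-in for a residual channel attention block: two channel-mixing
    # layers with a block-level residual (channel attention omitted).
    c = x.shape[-1]
    w1 = rng.standard_normal((c, c)) * 0.05
    w2 = rng.standard_normal((c, c)) * 0.05
    return x + np.maximum(x @ w1, 0) @ w2

def residual_group(x, num_blocks=2):
    res = x
    for _ in range(num_blocks):
        res = rcab_stub(res)       # RCAB-1 ... RCAB-B
    return x + res                 # short skip connection

def rir(x, num_groups=3):
    res = x
    for _ in range(num_groups):
        res = residual_group(res)  # RG-1 ... RG-G
    return x + res                 # long skip: bypasses low-frequency info

x = rng.standard_normal((8, 8, 16))
y = rir(x)
print(y.shape)  # skip connections preserve the feature shape
```

The identity paths mean every group only has to model a residual, which is what makes depths of hundreds of layers trainable.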

SLIDE 14

Convergence analyses with RIR

SLIDE 15

Channel attention; residual channel attention block

[Block diagram] Channel attention: input features Xg,b (H×W×C) → global pooling HGP (1×1×C) → channel-downscaling Conv WD (1×1×C/r) → ReLU → channel-upscaling Conv WU (1×1×C) → sigmoid function f → element-wise product with Xg,b; an element-wise sum with Fg,b−1 gives Fg,b.

Global pooling computes zc = HGP(xc) = (1/(H·W)) Σi=1..H Σj=1..W xc(i, j), where xc(i, j) is the value at position (i, j) of the c-th feature map xc.
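The channel attention pipeline (global pooling → WD → ReLU → WU → sigmoid → rescale) in a minimal NumPy sketch; the weights are random, so this illustrates the data flow rather than a trained network.

```python
import numpy as np

rng = np.random.default_rng(2)

def channel_attention(x, reduction=4):
    """Channel attention sketch: squeeze spatial information with global
    average pooling, then gate each channel with a learned sigmoid weight."""
    h, w, c = x.shape
    z = x.mean(axis=(0, 1))                                # global pooling: (C,)
    w_d = rng.standard_normal((c, c // reduction)) * 0.1   # channel-downscaling W_D
    w_u = rng.standard_normal((c // reduction, c)) * 0.1   # channel-upscaling W_U
    s = 1.0 / (1.0 + np.exp(-(np.maximum(z @ w_d, 0) @ w_u)))  # sigmoid gate f
    return x * s                                           # rescale each channel

x = rng.standard_normal((8, 8, 16))
y = channel_attention(x)
print(y.shape)  # (8, 8, 16)
```

Since each gate lies in (0, 1), attention can only attenuate channels, which is how uninformative channels get suppressed.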

SLIDE 16

Channel attention visualization

[Figure] Low-level CA: c=48, s=0.0009; c=8, s=0.0016; c=28, s=0.0017; c=12, s=0.9578; c=51, s=0.9732; c=23, s=0.9998. High-level CA: c=29, s=0.2244; c=1, s=0.2397; c=54, s=0.2699; c=56, s=0.5334; c=33, s=0.5457; c=13, s=0.5603.

Figure. Channel attention visualization: low-/high-level CAs and their feature maps, where c and s denote the channel index and channel weight. In each row, we show the 3 feature maps (indexed by c) with the smallest channel weights (s) and the 3 feature maps with the largest channel weights.

SLIDE 17

Ablation study

Investigations of RIR and CA. We report the best PSNR (dB) values on Set5 (2×) within 5×10^4 iterations.

SLIDE 18

Quantitative results

SLIDE 19

Quantitative results

SLIDE 20

Visual results

SLIDE 21

Visual results

SLIDE 22

Visual results

SLIDE 23

Object recognition performance and model size

SLIDE 24

Motivations for our next work (ICLR-2019-RNAN)

 Effective attention mechanism. From channel attention to spatial attention, mixed attention, …; tell noise apart from the noisy input better.
 Model generalization. Generalize our model to different image restoration tasks.
 Larger receptive field size. Make use of the input in a more global way.

SLIDE 25

Residual Non-local Attention Networks for Image Restoration (ICLR-2019)

Network architecture

Limitations of previous methods

 The receptive field size of these networks is relatively small.
 The discriminative ability of these networks is also limited.
 All channel-wise features are treated equally in these networks.

Contributions

 We propose the very deep residual non-local attention networks (RNAN) for high-quality image restoration.
 We propose residual non-local attention learning to train very deep networks by preserving more low-level features, which is more suitable for image restoration.
 We demonstrate with extensive experiments that our RNAN is powerful for various image restoration tasks.

[Figures] Framework; non-local block; residual (non-)local attention block.
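The non-local operation that gives RNAN its large receptive field can be sketched as attention over all spatial positions. A minimal NumPy sketch of an embedded-Gaussian non-local block with random weights (illustrative only; batch-norm and other details of the actual block are omitted):

```python
import numpy as np

rng = np.random.default_rng(3)

def non_local_block(x, embed_dim=8):
    """Non-local block sketch (embedded-Gaussian form): every position
    attends to every other position, giving a full-image receptive field."""
    h, w, c = x.shape
    flat = x.reshape(h * w, c)                       # N = H*W positions
    w_theta = rng.standard_normal((c, embed_dim)) * 0.1
    w_phi = rng.standard_normal((c, embed_dim)) * 0.1
    w_g = rng.standard_normal((c, c)) * 0.1
    theta, phi, g = flat @ w_theta, flat @ w_phi, flat @ w_g
    logits = theta @ phi.T                           # (N, N) pairwise similarity
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)          # softmax over positions
    out = attn @ g                                   # aggregate global context
    return x + out.reshape(h, w, c)                  # residual connection

x = rng.standard_normal((6, 6, 16))
y = non_local_block(x)
print(y.shape)  # (6, 6, 16)
```

The (N, N) attention matrix is what makes the receptive field global, and also why the full non-local operation is costly at high resolution.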

SLIDE 26

Quantitative results: color and gray-scale image denoising

SLIDE 27

Visual results: color image denoising

SLIDE 28

Visual results: gray-scale image denoising

SLIDE 29

Visual results: image demosaicing

SLIDE 30

Visual results: image compression artifact reduction

SLIDE 31

Visual results: image super-resolution

SLIDE 32

Thank you

More works are available at: http://yulunzhang.com

SLIDE 33

Potential future works:

0. Efficient networks under common settings.
1. Video enhancement (image/video super-resolution, interpolation, denoising, compression artifact removal, etc.).
2. Lightweight networks for mobile devices.
3. Very large scaling factors (e.g., 16, 32, 64, …).
4. Reference-based IR.
5. NAS-based IR.
6. Very high-resolution style transfer.
7. Image/video 'compression' (including downscaling).
8. IR under complex conditions (e.g., low-light, blind IR).
9. Understanding: bridge low-level and high-level vision tasks.
10. Unsupervised IR.