Multi-Input Cardiac Image Super-Resolution using Convolutional Neural Networks



SLIDE 1

Multi-Input Cardiac Image Super-Resolution using Convolutional Neural Networks

Ozan Oktay, Wenjia Bai, Matthew Lee, Ricardo Guerrero, Konstantinos Kamnitsas, Jose Caballero, Antonio de Marvao, Stuart Cook, Declan O’Regan, and Daniel Rueckert

19th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2016), October 2016, Athens

SLIDE 2

Clinical Motivation

SAX Cardiac MR Image Acquisition

  • Large slice thickness (8-10 mm)
  • Due to constraints on SNR, acquisition and breath-hold time
  • It hampers subsequent image analysis and quantitative measurements.

Figure panels: Slice I, Slice II, Slice III, Slice IV

SLIDE 3

Clinical Motivation

SAX Cardiac MR Image Acquisition

  • Large slice thickness (8-10 mm)
  • Due to constraints on SNR, acquisition and breath-hold time
  • It hampers subsequent image analysis and quantitative measurements.
  • LAX image acquisitions are performed to complement SAX images

Figure panels: 4-Chamber LAX Slice, 2-Chamber LAX Slice

SLIDE 4

Low and High Resolution Images

Diagram labels (image acquisition): 3D HR Image, PSF kernel and patient motion, Down-sample, Sinc Filter, Sub-sampling grid, 3D LR Image

Diagram labels (Image Super Resolution Model): Input (3D LR Image, 2D LAX Images), Output

SLIDE 5

Related Work on Super Resolution

External Example & Model Based SR

I. Coupled Dictionary Learning and Sparse Coding [Yang et al. TIP’12, Bhatia K. ISBI’14]

Figure labels: LR Image, HR Image, LR Dictionary, HR Dictionary

SLIDE 6

Related Work on Super Resolution

External Example & Model Based SR

I. Coupled Dictionary Learning and Sparse Coding [Yang et al. TIP’12, Bhatia K. ISBI’14]

II. Multi-Atlas Based SR Techniques [Shi et al. MICCAI’13]

Figure labels: LR Image, HR Image, HR atlases

SLIDE 7

Related Work on Super Resolution

External Example & Model Based SR

I. Coupled Dictionary Learning and Sparse Coding [Yang et al. TIP’12, Bhatia K. ISBI’14]

II. Multi-Atlas Based SR Techniques [Shi et al. MICCAI’13]

III. Decision Forest based Regression [Alexander et al. MICCAI’14, Schulter S. CVPR’15]

Figure labels: LR Image, HR Image, Regression Tree

SLIDE 8

Related Work on Super Resolution

External Example & Model Based SR

I. Coupled Dictionary Learning and Sparse Coding [Yang et al. TIP’12, Bhatia K. ISBI’14]

II. Multi-Atlas Based SR Techniques [Shi et al. MICCAI’13]

III. Decision Forest based Regression [Alexander et al. MICCAI’14, Schulter S. CVPR’15]

IV. Neural Network based Regression

  • i. CNNs [Dong et al. ECCV’14, Shi et al. CVPR’16]
  • ii. CNNs + GANs [Ledig et al. arXiv Sept’16]

Figure labels: LR Image, HR Image, Convolution and Non-Linear Units

SLIDE 9

Proposed 3D-SR Model (Single-Image)

Components of the model

  • 3D Convolution and Deconvolution (transposed convolution) Kernels
  • Rectified Linear Units (ReLUs)
  • Regression Based Cost Function (Smooth L1-Norm)
  • Input (2D Stack-LR) and Output (3D-HR) Images
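
The slide names a smooth L1-norm cost but does not give its formula. The following is a minimal sketch assuming the common Huber-style definition (quadratic below a threshold, linear above it). The threshold `delta` and the function names are illustrative assumptions, not taken from the slides.

```python
def smooth_l1(residual, delta=1.0):
    """Huber-style smooth L1 penalty for a single residual value.

    Assumed form: quadratic for |r| < delta, linear otherwise.
    """
    a = abs(residual)
    if a < delta:
        return 0.5 * a * a / delta
    return a - 0.5 * delta


def smooth_l1_loss(pred, target, delta=1.0):
    """Mean smooth L1 penalty over two equal-length sequences of intensities."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(smooth_l1(p - t, delta) for p, t in zip(pred, target)) / len(pred)
```

Because the penalty grows only linearly for large residuals, a few badly matched voxels contribute less to the gradient than they would under a squared-error cost.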

SLIDE 10

Proposed 3D-SR Model (Single-Image)

Proposed improvements on SR-CNN model:

I. Residual Learning

  • An easier regression problem to solve
  • Robust and faster model convergence
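
Residual learning can be illustrated on a 1D intensity profile along the slice direction. This is a sketch under the assumption that the residual is defined relative to a linearly interpolated copy of the LR input; the helper names are hypothetical, and a real model would predict the residual with a CNN rather than compute it from the ground truth.

```python
def linear_upsample(signal, factor):
    """Linearly interpolate a 1D signal onto a grid `factor` times denser."""
    n = len(signal)
    out = []
    for i in range((n - 1) * factor + 1):
        pos = i / factor
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append((1 - w) * signal[lo] + w * signal[hi])
    return out


def residual_target(hr, lr, factor):
    """Training target: what must be added to the interpolated LR to get HR."""
    up = linear_upsample(lr, factor)
    if len(up) != len(hr):
        raise ValueError("HR length must match the upsampled LR length")
    return [h - u for h, u in zip(hr, up)]


def reconstruct(lr, predicted_residual, factor):
    """Inference: interpolated LR plus the residual predicted by the model."""
    up = linear_upsample(lr, factor)
    return [u + r for u, r in zip(up, predicted_residual)]
```

The residual is zero wherever interpolation is already correct, so the regression only has to explain the missing detail.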

SLIDE 11

Proposed 3D-SR Model (Single-Image)

Proposed improvements on SR-CNN model:

II. Learning Upsampling Layers

  • End-to-end training of convolution and upsampling kernels
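
An upsampling (deconvolution) layer is commonly implemented as a transposed convolution. The 1D sketch below is an illustration with hypothetical names, not the authors' code. It shows that a fixed triangular kernel reproduces linear interpolation; a learned layer would instead treat the kernel weights as trainable parameters.

```python
def transposed_conv1d(signal, kernel, stride):
    """1D transposed convolution.

    Each input sample scatters a scaled copy of `kernel` into the output,
    with consecutive copies placed `stride` samples apart.
    """
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, s in enumerate(signal):
        for k, w in enumerate(kernel):
            out[i * stride + k] += s * w
    return out


# Fixed triangular kernel: equivalent to x2 linear interpolation.
fixed_kernel = [0.5, 1.0, 0.5]
upsampled = transposed_conv1d([2.0, 4.0], fixed_kernel, stride=2)
# Crop one border sample on each side (half the kernel width).
cropped = upsampled[1:-1]
```

Replacing `fixed_kernel` with weights optimised jointly with the rest of the network is what end-to-end training of the upsampling kernels refers to.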

SLIDE 12

Proposed 3D-SR Model (Single-Image)

Proposed improvements on SR-CNN model:

III. Multi-Input model extension

  • Constrains the regression task with more input data
  • In cardiac imaging, multiple image stacks are usually acquired.

SLIDE 13

Proposed 3D-SR Model (Multi-Image)

  • A Siamese model is used to combine information from multiple stacks
  • The learned kernels can be easily integrated into this multi-input model.

SLIDE 14

Method Evaluation Strategy

I. Image Quality Analysis

  • Peak Signal-to-Noise Ratio (PSNR) (Images from 300 Subjects)
  • Structural Similarity Index Measure (SSIM) [Wang et al. IEEE TIP’04]

II. Subsequent Image Analysis (SR is used for pre-processing)

  • Cardiac Image Segmentation (Images from 18 Subjects)
  • Cardiac Motion Tracking (Images from 10 Subjects)

III. Our method is compared against:

  • Linear, C-Spline, MAPM [Shi MICCAI’13], CNN [Dong TPAMI’15]
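
For reference, PSNR can be computed as below. This is a minimal sketch using the standard definition; the choice of `peak` (here passed in explicitly, e.g. the maximum possible intensity) is an assumption, since the slides do not state how the peak value was set.

```python
import math


def psnr(reference, estimate, peak):
    """Peak signal-to-noise ratio in dB between two equal-length intensity lists."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher PSNR means a lower mean squared error relative to the HR ground truth. SSIM complements it by comparing local structure rather than voxel-wise error.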

SLIDE 15

Image Quality Assessment

Table 1: Quantitative comparison of different image upsampling methods.

Method     PSNR (dB)     SSIM
Linear     20.83±1.10    .70±.03
CSpline    22.38±1.13    .73±.03
MAPM       22.75±1.22    .73±.03
sh-CNN     23.67±1.18    .74±.02
CNN        24.12±1.18    .76±.02
de-CNN     24.45±1.20    .77±.02

  • MAPM: Multi-Atlas Patch Match [Shi et al. MICCAI’13]
  • sh-CNN: 4-Layer Network without Deconvolution Layer [Dong TPAMI’15]
  • CNN: 7-Layer Network without Deconvolution Layer
  • de-CNN: 7-Layer Network with Deconvolution Layer

SLIDE 16

Image Quality Assessment

Upsampling x5

Inference Time: 6-8 Seconds for image size (140x140x10)

Figure panels: Low Resolution Input Image, Linear Interpolation, The Proposed Method, High Resolution Ground-truth

SLIDE 17

Image Quality Assessment

  • nr-CNN: 7-Layer Network without Residual Learning
  • de-CNN: 7-Layer Network with Residual Learning

Plot: PSNR (dB) and Structural Similarity Index (dashed lines) versus Number of Training Epochs, for deCNN, CSpline, MAPM, and nrCNN.

SLIDE 18

Experiments with Multi-Image Model

  • MC (SAX/4CH): Multi-Channel input (SAX and 4-Chamber LAX Images)
  • MC (SAX/2/4CH): Multi-Channel input (SAX and 2/4-Chamber LAX Images)

Table 2: Image quality results obtained with three different models: single-image de-CNN, Siamese, and multi-channel (MC) CNN.

Method              PSNR (dB)     SSIM
de-CNN(SAX)         24.76±0.48    .807±.009
Siamese(SAX/4CH)    25.13±0.48    .814±.013
MC(SAX/4CH)         25.15±0.47    .814±.012
MC(SAX/2/4CH)       25.26±0.37    .818±.012

SLIDE 19

Motion Tracking Experiments (SR is used as a preprocessing method)

  • Surface to Surface Distance (Proposed vs HR): 4.73 mm
  • Surface to Surface Distance (Linear vs HR): 5.50 mm

SLIDE 20

Motion Tracking Experiments (SR is used as a preprocessing method)

SLIDE 21

LV Segmentation Experiments (SR is used as a preprocessing method)

Table 3: Segmentation results for different upsampling methods, CSpline (p = .007) and MAPM (p = .009). They are compared in terms of mean and Hausdorff distances (MYO) and LV cavity volume differences (w.r.t. manual annotations).

Exp (c)             Linear        CSpline       MAPM         de-CNN       High Res
LV Vol Diff (ml)    11.72±6.96    10.80±6.46    9.55±5.42    9.09±5.36    8.24±5.47
Mean Dist (mm)      1.49±0.30     1.45±0.29     1.40±0.29    1.38±0.29    1.38±0.28
Haus Dist (mm)      7.74±1.73     7.29±1.63     6.83±1.61    6.67±1.77    6.70±1.85

  • Multi-Atlas patch based label fusion [Coupe NeuroImage’11] is used to segment images (20 Atlases)
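
Table 3 reports mean and Hausdorff distances between segmentation surfaces. The sketch below shows how such distances are typically computed between two point sets sampled from the surfaces. The symmetric averaging used for the mean distance is an assumption, as the slides do not define it, and the function names are illustrative.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest neighbour in B."""
    return max(min(math.dist(a, b) for b in points_b) for a in points_a)


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


def mean_surface_distance(points_a, points_b):
    """Average nearest-neighbour distance, pooled over both directions."""
    d_ab = [min(math.dist(a, b) for b in points_b) for a in points_a]
    d_ba = [min(math.dist(b, a) for a in points_a) for b in points_b]
    return (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))
```

The Hausdorff distance is driven by the single worst-matching point, so it is more sensitive to local segmentation failures than the mean distance.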

SLIDE 22

Difference Between Trained and Fixed Deconvolution Kernels

SLIDE 23

Take Home Messages

I. SR as a preprocessing step: could it replace standard interpolation techniques?

II. Importance of learning upsampling filters and residual connections in SR models.

III. Models could be trained with combined images and stacks acquired from different directions.

IV. Future work

  • a. Other imaging modalities or applications (DTI or MR Image Reconstruction)
  • b. Perceptual loss function: could it be applicable to medical images?

SLIDE 24

Multi-Input Cardiac Image Super-Resolution using Convolutional Neural Networks

Acknowledgments:

Poster Session 1 – Cardiac Image Analysis (CARD) – PS1.40

SLIDE 25

Some Additional Slides

Additional Details about the SR-CNN Model

SLIDE 26

Model Training Strategy

I. Batch Normalization [Ioffe and Szegedy ICML’15]

  • Faster Model Convergence.
  • Reduces the dependency of the model on filter coefficient initialization.

II. Data Augmentation

  • Training data, LR-HR pairs, are generated from 3D-HR Images based on the following model [Shi et al. MICCAI’13]:

x = DBSMy + η

  • Trained with cine cardiac HR-MR images acquired from 930 healthy adult subjects.

III. Smooth L1-Norm Function

  • Improves the convergence when outliers are observed in training data.
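
The acquisition model x = DBSMy + η on this slide can be sketched for a single 1D intensity profile along the slice direction. The slide does not define the operators; the sketch assumes the reading common in the SR literature (M motion, S slice selection, B point-spread-function blur, D decimation, η noise) and keeps only B and D. The Gaussian PSF, its width, and all function names are assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised 1D Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, kernel):
    """Operator B: convolve with `kernel`, replicating the edge samples."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def degrade(hr_profile, factor, sigma):
    """Simulate an LR profile: blur (B), then keep every `factor`-th sample (D).

    Motion (M), slice selection (S) and noise (eta) are omitted in this sketch.
    """
    kernel = gaussian_kernel(sigma, radius=max(1, int(3 * sigma)))
    return blur(hr_profile, kernel)[::factor]
```

Applying `degrade` to each HR training volume yields the matching LR input, which is how LR-HR training pairs can be produced without acquiring both resolutions.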

SLIDE 27

Number of Feature Maps / Atlases

Table 1: Quantitative comparison of different image upsampling methods.

Exp (a)    PSNR (dB)     SSIM       # Filters/Atlases
Linear     20.83±1.10    .70±.03    -
CSpline    22.38±1.13    .73±.03    -
MAPM       22.75±1.22    .73±.03    350
sh-CNN     23.67±1.18    .74±.02    64,64,32,1
CNN        24.12±1.18    .76±.02    64,64,32,16,8,4,1
de-CNN     24.45±1.20    .77±.02    64,64,32,16,8,4,1

  • MAPM: Multi-Atlas Patch Match [Shi et al. MICCAI’13]
  • sh-CNN: 4-Layer Network without Deconvolution Layer [Dong TPAMI’15]
  • CNN: 7-Layer Network without Deconvolution Layer
  • de-CNN: 7-Layer Network with Deconvolution Layer

SLIDE 28

SR-CNN (9-5-5) - ImageNet

Upsampling x4

A 3-Layer model is trained with the ImageNet Dataset

Figure panels: Low Resolution Input Image, Cubic Spline Interpolation, SR-CNN (9-5-5) Output Image

SLIDE 29

Image Quality Assessment

Upsampling x5

Inference Time: 6-8 Seconds for image size (140x140x10)

Figure panels: Low Resolution Input Image, Linear Interpolation, SR-CNN Output Image, High Resolution Ground-truth