

SLIDE 1

Cascaded 3D Fully Convolutional Networks for Medical Image Segmentation

2018/03/26 · www.holgerroth.com

Holger Roth

Assistant Professor (Research) Nagoya University, Japan

Contributors: Hirohisa Oda [a], Xiangrong Zhou [b], Natsuki Shimizu [a], Ying Yang [a], Chen Shen [a], Yuichiro Hayashi [a], Masahiro Oda [a], Michitaka Fujiwara [c], Kazunari Misawa [d], Kensaku Mori [a]

[a] Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Japan; [b] Gifu University, Yanagido, Gifu, Japan; [c] Nagoya University Graduate School of Medicine, Nagoya, Japan; [d] Aichi Cancer Center, Kanokoden, Chikusa-ku, Nagoya, Japan

SLIDE 2

Motivation

  • Multi-organ segmentation is an important prerequisite for many CADx systems in medical imaging.
  • Segmentation can provide quantitative analysis important for diagnosis & treatment:
    – measure organ volumes, discover shape irregularities, 3D printing, and surgical navigation
  • Challenging because of the high anatomical variability of organs’ appearance, especially in the abdomen.

(Figure: contrast-enhanced CT scans of the lower abdomen)

SLIDE 3

Surgical navigation system for gastric cancer treatment


Hayashi, Yuichiro, et al. "Clinical application of a surgical navigation system based on virtual laparoscopy in laparoscopic gastrectomy for gastric cancer." International Journal of Computer Assisted Radiology and Surgery 11.5 (2016): 827-836.

SLIDE 4

Deep learning for image segmentation

  • Recent advances in deep learning, such as fully convolutional networks (FCNs), have made it feasible to train deep models for dense semantic segmentation tasks [Long et al., CVPR 2015].
  • Extensions to 3D have been shown to work well for biomedical images (3D U-Net) [Çiçek et al., MICCAI 2016].
  • In this work we present a cascaded 3D FCN approach trained on manually labelled data of several abdominal organs and vessels.
  • We achieve competitive segmentation results on clinical CT images used in gastric surgery.


SLIDE 5

Background: Convolutional Neural Network (CNN)

(Figure: image ∗ 3D filter kernel → output image)

[Krizhevsky et al., NIPS 2012] [Roth et al. TMI 2015]

Kernel elements are trained from the data!


First layer 3D kernels

https://github.com/vdumoulin/conv_arithmetic
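The filtering operation in the figure can be sketched in plain Python. `conv2d_valid` is an illustrative name, and the kernel here is fixed by hand, whereas in a CNN its elements are learned from data; 2D is shown for brevity (the 3D case adds one more loop).

```python
def conv2d_valid(image, kernel):
    """'Valid' cross-correlation: the output shrinks by (kernel size - 1)
    along each axis, with no padding applied."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            # Slide the kernel over the image and accumulate the products.
            out[i][j] = sum(image[i + u][j + v] * kernel[u][v]
                            for u in range(kh) for v in range(kw))
    return out
```

For example, a 3×3 all-ones kernel over a 4×4 all-ones image yields a 2×2 output of 9.0s, matching the "valid" size arithmetic visualized at the link above.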

SLIDE 6

Classification CNN

“abdomen” e.g. LeNet, AlexNet, VGG-Net, etc…


Figure from [Roth et al., JAMIT 2018, arxiv:1803.08691]

SLIDE 7

Fully convolutional networks (FCN)

[Long et al., “Fully convolutional networks for semantic segmentation”, CVPR 2015]


Figure from [Roth et al., JAMIT 2018, arxiv:1803.08691]

SLIDE 8

Fully convolutional architectures (3D U-Net)

Figure after 3D U-Net [Çiçek et al., “3D U-Net: learning dense volumetric segmentation from sparse annotation”, MICCAI 2016]*

  • 19 million learnable parameters
  • Fits on one 12 GB GPU (NVIDIA TITAN X) for training; ~6 GB needed for inference

(Figure: analysis path (encoder) and synthesis path (decoder) built from Conv + BatchNorm + ReLU blocks, max pooling, and de-convolution, with concatenation (skip) connections and cropping. Feature channels grow 1 → 32 → 64 → 128 → 256 → 512 along the encoder and shrink back along the decoder; a 132×132×116 input tile yields a 44×44×28 output tile.)

Input: CT. Output: 3D probability maps for each class.
*Implementation in Caffe.
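The tile sizes above follow from valid-padding convolution arithmetic. A small sketch, assuming the 4-level, two-convs-per-level layout of Çiçek et al., traces one axis through the network (`unet3d_output_size` is an illustrative helper, not code from the paper):

```python
def unet3d_output_size(in_size, levels=4, convs_per_level=2, k=3):
    """Trace one axis of a valid-padding 3D U-Net: each kxkxk conv shrinks
    the axis by k-1, 2x max-pooling halves it, 2x up-conv doubles it."""
    shrink = convs_per_level * (k - 1)
    s = in_size
    for _ in range(levels - 1):   # encoder: convs, then 2x max-pool
        s = (s - shrink) // 2
    s -= shrink                   # bottom-level convs
    for _ in range(levels - 1):   # decoder: 2x up-conv, then convs
        s = 2 * s - shrink
    return s
```

With these assumptions, an input axis of 132 comes out as 44 and an axis of 116 comes out as 28, matching the 132×132×116 → 44×44×28 input/output tiles on the slide.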

SLIDE 9

A cascaded approach

H. Roth et al., “An application of cascaded 3D fully convolutional networks for medical image segmentation”, Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

(Figure: pipeline. 1st stage: images & labels → detect patient’s body → train 3D FCN → multi-class prediction. 2nd stage: dilate foreground → train 3D FCN → final prediction.)

SLIDE 10


A cascaded approach

The 1st-stage 3D FCN sees ~40% of the voxels in the image; the 2nd-stage 3D FCN sees ~10% → this approach encourages better segmentation around the boundary of organs.

  • H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)
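The "dilate foreground" step that narrows the 2nd-stage input can be prototyped as a binary dilation. This is a minimal sketch on nested lists; `dilate3d` and the cubic structuring element are illustrative choices, not the paper's exact implementation:

```python
def dilate3d(mask, radius=1):
    """Binary dilation of a 3D 0/1 mask (nested lists [z][y][x]) by a cubic
    structuring element of the given radius: a voxel becomes 1 if any voxel
    in its neighborhood is 1, growing the stage-1 foreground outward."""
    Z, Y, X = len(mask), len(mask[0]), len(mask[0][0])
    out = [[[0] * X for _ in range(Y)] for _ in range(Z)]
    for z in range(Z):
        for y in range(Y):
            for x in range(X):
                if any(mask[zz][yy][xx]
                       for zz in range(max(0, z - radius), min(Z, z + radius + 1))
                       for yy in range(max(0, y - radius), min(Y, y + radius + 1))
                       for xx in range(max(0, x - radius), min(X, x + radius + 1))):
                    out[z][y][x] = 1
    return out
```

Growing the predicted foreground before cropping gives the 2nd-stage FCN some context beyond the stage-1 boundary, which is what makes the boundary-focused refinement above possible.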

SLIDE 11

Subvolume tiling approach in training & testing

(Figure: what the network sees vs. what the network learns)

All CT volumes are downsampled 2x

  • Down-sampling and sub-volume size largely depend on the amount of available GPU memory.

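A tiling schedule like the one above can be generated as follows. `tile_origins` is a hypothetical helper: the stride equals the tile size for non-overlapping tiles, and a smaller stride gives overlapping ones.

```python
def tile_origins(vol, tile, stride=None):
    """Origins (z, y, x) of sub-volume tiles covering a volume of shape
    `vol`. With stride == tile the tiles are non-overlapping; a smaller
    stride overlaps them. The last tile on each axis is shifted back so it
    ends flush with the volume border."""
    stride = stride or tile
    starts = []
    for v, t, s in zip(vol, tile, stride):
        axis = list(range(0, max(v - t, 0) + 1, s))
        if axis[-1] != v - t:
            axis.append(v - t)  # flush-fit the final tile
        starts.append(axis)
    return [(z, y, x) for z in starts[0] for y in starts[1] for x in starts[2]]
```

For an 8×8×8 volume and 4×4×4 tiles this yields 8 non-overlapping tiles, or 27 tiles at stride 2: the same trade-off as the non-overlapping vs. overlapping inference timings reported later in the deck.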

SLIDE 12

Dataset creation

http://pluto.newves.org

  • semi-automated segmentation tools
  • graph-cuts
  • region growing
  • data collected from 2009–2017 (331 cases)


CTs acquired at Aichi Cancer Center, Nagoya, Japan

SLIDE 13

Candidate region

  • Mask of patient’s body:
    – Thresholding
    – Morphological opening
    – Largest connected component
  • Removes ~60% of voxels in the image

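The final of the three morphology steps can be prototyped in plain Python. Shown here in 2D with 4-connectivity for brevity (the 3D case would typically use 6-connectivity); thresholding and opening are omitted, and `largest_component` is an illustrative name:

```python
from collections import deque

def largest_component(mask):
    """Largest 4-connected component of a binary 2D mask, found by BFS
    flood fill; the same idea in 3D keeps the patient's body and drops
    disconnected structures like the scanner table."""
    H, W = len(mask), len(mask[0])
    seen = [[False] * W for _ in range(H)]
    best = set()
    for sy in range(H):
        for sx in range(W):
            if mask[sy][sx] and not seen[sy][sx]:
                comp, q = set(), deque([(sy, sx)])
                seen[sy][sx] = True
                while q:
                    y, x = q.popleft()
                    comp.add((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    return best
```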

SLIDE 14

Dealing with data imbalance

Balancing weight (smallest organ gets the highest weight), used in a weighted voxel-wise cross-entropy loss:

L = −(1/N) Σ_i w_{l(i)} · log p_{l(i)}(x_i)

where N is the total number of voxels in the candidate region, C the number of classes, p the softmax likelihood, and N_c the number of voxels for class c; w_c increases as N_c decreases.

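The slide only states that the smallest organ gets the highest weight; the sketch below assumes weights inversely proportional to the per-class voxel counts N_c, normalized to sum to 1, plugged into the weighted voxel-wise cross-entropy. Both function names are illustrative:

```python
import math

def class_weights(voxel_counts):
    """Balancing weights w_c proportional to 1/N_c (an assumed scheme),
    normalized so they sum to 1; rare classes get large weights."""
    inv = [1.0 / n for n in voxel_counts]
    s = sum(inv)
    return [w / s for w in inv]

def weighted_cross_entropy(probs, labels, weights):
    """Weighted voxel-wise cross-entropy: -(1/N) * sum_i w_{l(i)} log p_{l(i)}.
    `probs[i]` is the softmax likelihood vector at voxel i, `labels[i]` its
    true class index."""
    N = len(labels)
    return -sum(weights[l] * math.log(p[l]) for p, l in zip(probs, labels)) / N
```

With voxel counts of 10 vs. 90, the rare class receives weight 0.9 and the common class 0.1, so errors on a small organ such as the gallbladder are penalized far more than errors on the liver.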

SLIDE 15

Data augmentation

  • random cropping
  • random rotations
  • elastic B-spline deformations

Elastic deformations use the ‘CreateDeformation’ and ‘ApplyDeformation’ layers from the 3D U-Net code [Çiçek et al., MICCAI 2016].

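Random cropping, the first augmentation listed, can be sketched as below; `random_crop` is an illustrative helper (rotations and elastic B-spline deformations follow the same pattern but require interpolation):

```python
import random

def random_crop(volume, size, seed=None):
    """Extract a random sub-volume of shape `size` = (dz, dy, dx) from a
    nested-list volume indexed [z][y][x]; a fixed seed makes it repeatable."""
    rng = random.Random(seed)
    Z, Y, X = len(volume), len(volume[0]), len(volume[0][0])
    dz, dy, dx = size
    z0 = rng.randrange(Z - dz + 1)  # random but fully in-bounds corner
    y0 = rng.randrange(Y - dy + 1)
    x0 = rng.randrange(X - dx + 1)
    return [[row[x0:x0 + dx] for row in plane[y0:y0 + dy]]
            for plane in volume[z0:z0 + dz]]
```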

SLIDE 16

Feature maps


Learned feature kernels

(Figure: CT image → learned feature maps → segmentation)

  • 3×3×3 kernels throughout the network
  • concatenation (skip) connections at each level
  • output: 3D probability maps

SLIDE 17
Experiments

  • Network trained on abdominal contrast-enhanced CT images:
    – 281/50 training/validation split
    – 8 classes manually labeled: artery, vein, liver, spleen, stomach, gallbladder, pancreas + background
  • Training on 281 cases can take 2-3 days for 200k iterations; inference takes 1.4-3.3 minutes (NVIDIA TITAN X)
  • Compared approaches:
    – Single 3D U-Net FCN
    – Cascade (train one FCN to define the candidate region for a second FCN)


SLIDE 18
Inference (test case) – 1st stage

Segmented classes: artery, vein, liver, spleen, stomach, gallbladder, pancreas

  • 5-6 minutes using non-overlapping tiles
  • 15-20 minutes using overlapping tiles
  • NVIDIA GeForce GTX TITAN X with 12 GB memory


SLIDE 19

Ground truth · Stage 2 tiling · Stage 2 result

Example from the validation set showing (a) ground truth, (b) the tiling approach on the 2nd-stage candidate region, and (c) the resulting segmentation. The grid shows the output tiles of size 44×44×28 (x, y, z directions); each predicted tile is based on a larger 132×132×116 input that the network processes, as dictated by GPU memory (12 GB).

  • H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

SLIDE 20

Validation: Dice scores (1st & 2nd stages)

             artery   vein   liver   spleen   stomach   gallbladder   pancreas
1st stage     0.60    0.67    0.90     0.85      0.80          0.72       0.56
2nd stage     0.80    0.73    0.93     0.91      0.84          0.71       0.63

Note: a marked improvement from the 1st to the 2nd stage can be observed.

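The scores in the table are the standard Dice overlap, 2|A∩B| / (|A| + |B|). A minimal version over voxel index sets (an illustrative formulation, not the evaluation code used for the table):

```python
def dice(a, b):
    """Dice score between two binary segmentations represented as sets of
    voxel indices: 1.0 for identical sets, 0.0 for disjoint ones."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))
```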

SLIDE 21

Test on unseen dataset

150 abdominal contrast-enhanced CTs from a different hospital and scanner


  • H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

(Figure: 1st stage vs. 2nd stage results)

SLIDE 22


Comparison to other methods

  • H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)

SLIDE 23

Visualization for surgical navigation

(Figures: anatomical name display on the blood-vessel surface in 3D views; anatomical name display in the surgical navigation view; segmentation result overlaid with anatomical labeling of blood vessels)

Matsuzaki, T., Oda, M., Kitasaka, T., Hayashi, Y., Misawa, K., & Mori, K. (2015). Automated anatomical labeling of abdominal arteries and hepatic portal system extracted from abdominal CT volumes. Medical Image Analysis.


SLIDE 24

Conclusions

  • 3D fully convolutional architectures (3D U-Net) can achieve competitive results for multi-organ segmentation.
  • They can be efficiently deployed on a single GPU.
    – Larger GPU memory or multi-GPU processing is helpful; see Roth et al., SPIE 2018 (arXiv 1711.06439).
  • A cascaded approach for training & testing was able to markedly improve the results.

Close to 90% Dice on average for pancreas! (Same input/output size.)

SLIDE 25

Thank you!

www.newves.org

This work was supported by MEXT KAKENHI (26108006, 26560255, 17H00867, 17K20099) and a JSPS International Bilateral Collaboration Grant.


Code & trained models: https://github.com/holgerroth/3Dunet_abdomen_cascade
Reference: H. Roth et al., Computerized Medical Imaging and Graphics, 2018 (arXiv 1803.05431)
