

SLIDE 1

Generative Image Inpainting for Person Pose Generation

Anubha Pandey, Vismay Patel

Indian Institute of Technology Madras cs16s023@cse.iitm.ac.in

19th September 2018

Anubha Pandey, Vismay Patel (IITM) Track1: Image Inpainting 19th September 2018 1 / 20

SLIDE 2

Overview

1. Problem Statement
2. Introduction
3. Related Works
4. Proposed Solution
5. Network Architecture
6. Training
7. Results
8. Conclusion
9. Future Work

SLIDE 4

Problem Statement

Chalearn LAP Inpainting Competition, Track 1: inpainting of still images of humans.

Objective: to restore the masked parts of the image so that they resemble the original content and look plausible to a human.

Dataset:

The dataset consists of images with multiple square blocks of black pixels placed randomly, occluding at most 70% of the original image. The images are taken from multiple sources: MPII Human Pose, Leeds Sports Pose, Synchronic Activities Stickmen V, Short BBC Pose, and Frames Labelled in Cinema. There are 28755 training samples, 6160 validation samples, and 6160 test samples.

SLIDE 6

Introduction

Image inpainting is the task of filling in the missing pixels of an image. The main challenge is to generate realistic, semantically plausible pixels for the missing regions that blend properly with the existing image pixels.

SLIDE 7

Related Works

Early works [1] [2] [3] use patch-based methods to solve the problem. They copy matching background patches into the holes. These approaches work well for background inpainting but cannot synthesize novel structures.

SLIDE 11

Related Works

Newer deep methods use CNNs and GANs to formulate the solution and have produced promising results for image inpainting.

These methods train an encoder-decoder network jointly with adversarial networks to produce pixels that are coherent with the existing ones. However, they cannot model long-term correlations between distant contextual information and the hole regions, and they produce boundary artifacts, distorted structures, and blurry textures inconsistent with the surroundings.

SLIDE 12

Related Works

More recently, Globally and Locally Consistent Image Completion [4] (SIGGRAPH 2017) improved the results by introducing local and global discriminators. In addition, it uses dilated convolutions to increase the receptive fields and replaces the fully connected layers adopted in the context encoders.
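The receptive-field effect of dilation can be illustrated with a small calculation (a sketch with illustrative kernel sizes and dilation rates, not the exact configuration of [4]):

```python
def receptive_field(layers):
    """Compute the receptive field of a stack of conv layers.

    Each layer is (kernel_size, dilation); stride is assumed 1 throughout.
    A dilated conv has effective kernel size d*(k-1)+1, and each layer
    adds (effective_kernel - 1) to the receptive field.
    """
    rf = 1
    for k, d in layers:
        rf += d * (k - 1)
    return rf

# Four plain 3x3 convs vs. four 3x3 convs with dilations 1, 2, 4, 8.
plain = receptive_field([(3, 1)] * 4)
dilated = receptive_field([(3, 1), (3, 2), (3, 4), (3, 8)])
print(plain, dilated)  # 9 31
```

With the same number of parameters, the dilated stack sees a 31-pixel context instead of 9, which is why dilation helps connect distant context to the hole region.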

SLIDE 15

Proposed Solution: Image Inpainting generator with skip-connections

We use an encoder-decoder CNN with a combination of regular and dilated convolutions, each followed by batch normalization and ReLU, to encode the partial image. The decoder uses skip connections from the encoder and a combination of deconvolutions and convolutions to generate the full image.
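A generator of this shape can be sketched in PyTorch as below. This is a minimal illustration of the described structure, not the exact network: channel counts, layer depths, and the final activation are assumptions.

```python
import torch
import torch.nn as nn

def conv_bn_relu(c_in, c_out, stride=1, dilation=1):
    # Conv -> BatchNorm -> ReLU; padding preserves spatial size at stride 1.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=stride,
                  padding=dilation, dilation=dilation),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class InpaintGenerator(nn.Module):
    """Encoder-decoder with skip connections (illustrative channel counts)."""
    def __init__(self):
        super().__init__()
        # Encoder: image (3 ch) + mask (1 ch) = 4 input channels.
        self.enc1 = conv_bn_relu(4, 64)
        self.down1 = conv_bn_relu(64, 128, stride=2)
        self.down2 = conv_bn_relu(128, 256, stride=2)
        # Dilated convolutions enlarge the receptive field at the bottleneck.
        self.dil = nn.Sequential(
            conv_bn_relu(256, 256, dilation=2),
            conv_bn_relu(256, 256, dilation=4),
        )
        # Decoder: deconvolution (transposed conv) + conv, with skips.
        self.up1 = nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1)
        self.dec1 = conv_bn_relu(256, 128)   # 128 (up) + 128 (skip)
        self.up2 = nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1)
        self.dec2 = conv_bn_relu(128, 64)    # 64 (up) + 64 (skip)
        self.out = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)              # 128x128, 64 ch
        e2 = self.down1(e1)            # 64x64, 128 ch
        b = self.dil(self.down2(e2))   # 32x32, 256 ch
        d1 = self.dec1(torch.cat([self.up1(b), e2], dim=1))
        d2 = self.dec2(torch.cat([self.up2(d1), e1], dim=1))
        return torch.sigmoid(self.out(d2))  # RGB image in [0, 1]

g = InpaintGenerator()
y = g(torch.randn(2, 4, 128, 128))
print(y.shape)  # torch.Size([2, 3, 128, 128])
```

The skip connections concatenate encoder features into the decoder, which helps the decoder reproduce the unmasked pixels exactly while synthesizing only the holes.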

Inputs

The input to the model is a 128×128×4 tensor: the concatenation of the input image and the mask. We use the data available in the 'maskdata.json' file to generate binary mask images. The masks contain ones at the holes and zeros everywhere else.
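Assembling the 4-channel input can be sketched as follows. The `mask_blocks` rectangle format is a hypothetical stand-in for the contents of 'maskdata.json', whose exact schema is not shown in the slides.

```python
import numpy as np

def build_input(image, mask_blocks):
    """Concatenate an RGB image with a binary hole mask along channels.

    image: HxWx3 float array in [0, 1].
    mask_blocks: list of (top, left, height, width) hole rectangles
                 (a hypothetical format standing in for maskdata.json).
    Returns an HxWx4 array; the mask channel is 1 inside holes, 0 elsewhere.
    """
    h, w, _ = image.shape
    mask = np.zeros((h, w, 1), dtype=image.dtype)
    for top, left, bh, bw in mask_blocks:
        mask[top:top + bh, left:left + bw, 0] = 1.0
    # Zero out the hole pixels (harmless if the holes are already black)
    # and append the mask as a fourth channel.
    masked_image = image * (1.0 - mask)
    return np.concatenate([masked_image, mask], axis=-1)

x = build_input(np.random.rand(128, 128, 3), [(10, 10, 32, 32)])
print(x.shape)  # (128, 128, 4)
```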

SLIDE 16

Network Architecture

Figure: Architecture of the discriminator module of the inpainting network (each building block is described in Figure 9).

Figure: Building blocks of the network.

SLIDE 17

Network Architecture

Figure: Architecture of the generator module of the inpainting network (building block shown in Fig. 9).

Figure: Architecture of the discriminator module of the inpainting network (building block shown in Fig. 9).

SLIDE 20

Loss Functions

The following loss functions have been used to train the network.

Reconstruction Loss [5]:

L_r = (1/K) Σ_{i=1..K} |I_x^i − I_imitation^i| + α · (1/K) Σ_{i=1..K} (I_Mask^i − Î_Mask^i)^2

where K is the batch size, α = 0.000001, and I_imitation^i is the output of the decoder.

Adversarial Loss [5]:

L_real = −log(p), L_fake = −log(1 − p), L_d = L_real + β · L_fake

where p is the output probability of the discriminator module and β = 0.01 (a hyperparameter).

Perceptual Loss [6]:

L_p = (1/K) Σ_{i=1..K} (φ(I_y^i) − φ(I_imitation^i))^2

where φ denotes features from a VGG16 network pretrained on the Microsoft COCO dataset.
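Under this reading of the formulas (masked-region L2 weighted by α, generated image denoted `imitation`), the three losses could be written as below. `phi` stands for the pretrained VGG16 feature extractor, which is not constructed here, and the EPS stability constant is an addition not present in the slides.

```python
import torch

ALPHA, BETA = 0.000001, 0.01  # weights from the slides
EPS = 1e-8                    # numerical-stability constant (assumption)

def reconstruction_loss(ix, imitation, mask):
    """L1 over the full image plus a small L2 term over the masked region."""
    l1 = torch.mean(torch.abs(ix - imitation))
    l2_mask = torch.mean(((ix - imitation) * mask) ** 2)
    return l1 + ALPHA * l2_mask

def discriminator_loss(p_real, p_fake):
    """L_d = L_real + beta * L_fake, with L_real = -log(p), L_fake = -log(1-p)."""
    l_real = -torch.log(p_real + EPS).mean()
    l_fake = -torch.log(1.0 - p_fake + EPS).mean()
    return l_real + BETA * l_fake

def perceptual_loss(phi, iy, imitation):
    """Mean squared distance between feature maps phi(.), e.g. VGG16 features."""
    return torch.mean((phi(iy) - phi(imitation)) ** 2)
```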

SLIDE 23

Training

The network is trained using the Adam optimizer with learning rate 0.001 and batch size 12. For the first 5 epochs, only the generator module of the network is trained, minimizing the reconstruction and perceptual losses. For the next 15 epochs, the entire GAN [5] is trained end-to-end, minimizing the adversarial and perceptual losses.
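The two-phase schedule can be sketched as follows. The names `generator`, `discriminator`, `loader`, and the loss callables are assumptions, and the generator's phase-2 adversarial term uses the standard non-saturating −log p form, which the slides do not spell out.

```python
import torch

def train(generator, discriminator, loader, recon_loss, perc_loss, disc_loss):
    # Adam with learning rate 0.001, as in the slides
    # (batch size is set by the loader; the slides use 12).
    opt_g = torch.optim.Adam(generator.parameters(), lr=0.001)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=0.001)

    for epoch in range(20):
        for x, target in loader:  # x: masked image concatenated with mask
            fake = generator(x)
            if epoch < 5:
                # Phase 1 (epochs 0-4): generator only,
                # reconstruction + perceptual loss.
                loss_g = recon_loss(target, fake) + perc_loss(target, fake)
            else:
                # Phase 2 (epochs 5-19): full GAN training.
                d_loss = disc_loss(discriminator(target),
                                   discriminator(fake.detach()))
                opt_d.zero_grad(); d_loss.backward(); opt_d.step()
                # Non-saturating adversarial term for the generator.
                loss_g = -torch.log(discriminator(fake) + 1e-8).mean() \
                         + perc_loss(target, fake)
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

The warm-up phase gives the generator a reasonable reconstruction before the discriminator starts pushing it, a common trick for stabilizing GAN training.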

SLIDE 25

Results

With our proposed solution we secured 2nd position in the competition. To evaluate the quality of the reconstruction, the metrics listed on the competition's website are used.

Evaluation Metric   Training Phase   Testing Phase
PSNR                20.4314          21.5118
MSE                 0.0176           0.0158
DSSIM               0.2089           0.2048
WNJD                0.1488           0.1495
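For reference, PSNR and MSE (two of the four metrics) can be computed as below for images scaled to [0, 1]; DSSIM and WNJD follow the competition's own definitions and are not reproduced here.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images."""
    return float(np.mean((a - b) ** 2))

def psnr(a, b, max_val=1.0):
    """PSNR = 10 * log10(MAX^2 / MSE); higher is better."""
    return 10.0 * np.log10(max_val ** 2 / mse(a, b))

a = np.zeros((4, 4)); b = np.full((4, 4), 0.1)
print(round(psnr(a, b), 2))  # 20.0
```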

SLIDE 26

Results

Figure: Input Image
Figure: Generated Image
Figure: Ground Truth

SLIDE 27

Results

Figure: Input Image
Figure: Generated Image
Figure: Ground Truth

SLIDE 28

Results

Figure: Input Image
Figure: Generated Image
Figure: Ground Truth

SLIDE 31

Conclusion

We propose a generative solution for the image inpainting task. We have trained our model to generate patches that do not appear anywhere else in the scene. It has also learned to inpaint images with randomly placed masks of variable size.

SLIDE 33

Future Work

We aim to improve the resolution of the inpainted image by using multi-stage GANs at different resolutions. Moreover, techniques that handle the multiple modalities of the image and loss functions related to pose estimation could further improve the results.

SLIDE 34

Thank You

SLIDE 35

References

[1] Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S. Huang. Generative image inpainting with contextual attention. arXiv preprint, 2018.
[2] Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B. Goldman. PatchMatch: A randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics (TOG), 28(3):24, 2009.
[3] James Hays and Alexei A. Efros. Scene completion using millions of photographs. ACM Transactions on Graphics (TOG), 26(3):4, 2007.
[4] Satoshi Iizuka, Edgar Simo-Serra, and Hiroshi Ishikawa. Globally and locally consistent image completion. ACM Transactions on Graphics (TOG), 36(4):107, 2017.
