Semantic Image Analogy with a Conditional Single-Image GAN

Mar 17, 2024

SLIDE 1

Semantic Image Analogy with a Conditional Single-Image GAN

Jiacheng Li, Zhiwei Xiong, Dong Liu, Xuejin Chen, Zheng-Jun Zha

ACM MM 2020

I ⇒ I′ is analogous to P ⇒ P′

SLIDE 2

A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph.

Image Analogy

A : A′ :: B : B′

SLIDE 3

Image Analogy

A : A′ :: B : B′

SLIDE 4

Semantic Image Analogy

P ⇒ P′ :: I ⇒ I′

Segmentation Domain: P ⇒ P′
Image Domain: I ⇒ I′

SLIDE 5

Semantic Image Analogy

P ⇒ P′ :: I ⇒ I′

I ⇒ I′ is analogous to P ⇒ P′

SLIDE 6

T. Park, et al. 2019. Semantic Image Synthesis With Spatially-Adaptive Normalization. In CVPR.
T. Shaham, et al. 2019. SinGAN: Learning a Generative Model From a Single Natural Image. In ICCV.

Conditional GANs
  Tasks: Semantic Image Synthesis
  Data: ADE20k, Cityscapes, COCO, CelebA, …

Single-Image GANs
  Tasks: Retargeting, Super-Resolution, Unconditional Sampling, …
  Data: In-the-wild Images

SLIDE 7

Can we get the best of both worlds?

SLIDE 8

Conditional Single-Image GAN

Can we get the best of both worlds?

Self-Supervised Training
Semantic Feature Translation (SFT)
Loss Terms

SLIDE 9

Self-Supervised Learning: Alternating Optimization

Sampling Mode: Psource ⇒ Paug :: Isource ⇒ Iaug
Reconstruction Mode: Psource ⇒ Psource :: Isource ⇒ Isource
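
The two modes on this slide can be read as two kinds of training pairs built from the single source image. The sketch below is only an illustration of that idea: the strict every-other-step schedule, the horizontal flip used as the augmentation, and the names `hflip` and `training_pair` are all assumptions, not details taken from the slides.

```python
def hflip(grid):
    """Mirror a 2-D list left to right (a stand-in augmentation, assumed)."""
    return [row[::-1] for row in grid]


def training_pair(p_source, i_source, step):
    """Pick the analogy to train on at this step (hypothetical schedule)."""
    if step % 2 == 0:
        # Sampling mode: P_source => P_aug :: I_source => I_aug.
        # The same augmentation is applied to the segmentation map and
        # the image, so the augmented pair stays aligned.
        return ("sampling",
                (p_source, hflip(p_source)),
                (i_source, hflip(i_source)))
    # Reconstruction mode: P_source => P_source :: I_source => I_source.
    return "reconstruction", (p_source, p_source), (i_source, i_source)


if __name__ == "__main__":
    seg = [[0, 1]]   # toy segmentation labels
    img = [[5, 9]]   # toy pixel values
    for step in range(2):
        print(training_pair(seg, img, step))
```

In sampling mode the target of the image-domain analogy is known (it is the augmented image), which is what makes the training self-supervised.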

SLIDE 10

Self-Supervised Learning: Reconstruction Mode

Psource ⇒ Psource :: Isource ⇒ Isource

[Diagram labels: Psource, Isource, Eseg (share weights), Faug, Fsource, SFT, (γseg, βseg), (γimg, βimg), G]

SLIDE 11

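Semantic Feature Translation (SFT) Module

[Diagram labels: SFT block, SFT block]

Segmentation Features: $F^l_{\text{source}}$, $F^l_{\text{aug}}$

$F^l_{\text{scale}} = F^l_{\text{aug}} / F^l_{\text{source}}$
$F^l_{\text{shift}} = F^l_{\text{aug}} - F^l_{\text{source}}$

Transformation Parameters (segmentation): $\gamma^l_{\text{seg}} \approx F^l_{\text{scale}}$, $\beta^l_{\text{seg}} \approx F^l_{\text{shift}}$

Transformation Parameters (image): $\gamma^l_{\text{img}}$, $\beta^l_{\text{img}}$, each obtained through a Linear mapping

Image Features: $\gamma^l_{\text{img}} \odot F^l_{\text{img}} \oplus \beta^l_{\text{img}}$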
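
To make the scale-and-shift idea concrete, here is a minimal sketch on flat lists of numbers. It assumes the Linear mappings reduce to a scalar weight and bias (identity by default) and adds a small `EPS` to the division; both choices, and the function names, are illustrative and not taken from the slides.

```python
EPS = 1e-8  # assumption: guards the division; not stated on the slide


def sft_parameters(f_source, f_aug):
    """Scale and shift between source and augmented segmentation features."""
    scale = [a / (s + EPS) for a, s in zip(f_aug, f_source)]
    shift = [a - s for a, s in zip(f_aug, f_source)]
    return scale, shift


def linear(values, weight=1.0, bias=0.0):
    """Stand-in for a learned Linear mapping (hypothetical scalar weights)."""
    return [weight * v + bias for v in values]


def sft_block(f_img, f_source, f_aug):
    """Modulate image features: gamma_img * F_img + beta_img, elementwise."""
    gamma_seg, beta_seg = sft_parameters(f_source, f_aug)
    gamma_img, beta_img = linear(gamma_seg), linear(beta_seg)
    return [g * x + b for g, x, b in zip(gamma_img, f_img, beta_img)]


if __name__ == "__main__":
    f_img = [3.0, 5.0, 7.0]
    f_source = [1.0, 2.0, 4.0]
    # Unchanged segmentation features leave the image features (almost) as is.
    print(sft_block(f_img, f_source, f_source))
    # Changing the first segmentation feature changes only the first output.
    print(sft_block(f_img, f_source, [2.0, 2.0, 4.0]))
```

With identical source and augmented features the scale is about 1 and the shift is 0, so the block acts as an identity on the image features.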

SLIDE 12

Loss Terms

[Diagram labels: Paug, Psource, Isource, Itarget, Eseg (share weights), Faug, Fsource, SFT, (γseg, βseg), (γimg, βimg), G]

homogeneous appearance

SLIDE 13

Loss Terms

[Diagram labels: Paug, Psource, Isource, Itarget, Eseg (share weights), Faug, Fsource, SFT, (γseg, βseg), (γimg, βimg), G]

aligned semantic layout
homogeneous appearance

SLIDE 14

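Loss Terms

[Diagram labels: Paug, Psource, Isource, Itarget, Eseg (share weights), Faug, Fsource, SFT, (γseg, βseg), (γimg, βimg), G]

Patch Coherence Loss (between Isource and Itarget)

$\frac{1}{N} \sum_{V \subset I_{\text{target}}} \min_{U \subset I_{\text{source}}} d(V, U)$

V: a patch in Itarget; U: a patch in Isource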
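
The patch coherence formula can be sketched directly: for every patch V of the target, find the closest patch U of the source and average those distances. The version below is a toy under stated assumptions: images are 2-D lists of numbers, d is the squared Euclidean distance, patches are k x k with stride 1, and N is the number of target patches. None of these choices is specified on the slide.

```python
def patches(image, k):
    """All k x k patches of a 2-D list, each flattened to a tuple."""
    h, w = len(image), len(image[0])
    return [
        tuple(image[r + dr][c + dc] for dr in range(k) for dc in range(k))
        for r in range(h - k + 1)
        for c in range(w - k + 1)
    ]


def sq_dist(v, u):
    """Squared Euclidean distance between two flattened patches (assumed d)."""
    return sum((a - b) ** 2 for a, b in zip(v, u))


def patch_coherence_loss(i_target, i_source, k=2):
    """Average, over target patches V, of the distance to the nearest source patch U."""
    target_patches = patches(i_target, k)
    source_patches = patches(i_source, k)
    total = sum(min(sq_dist(v, u) for u in source_patches)
                for v in target_patches)
    return total / len(target_patches)


if __name__ == "__main__":
    source = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    print(patch_coherence_loss(source, source))            # identical images
    print(patch_coherence_loss([[0, 1], [3, 5]], source))  # one altered pixel
```

The loss is zero whenever every target patch already occurs somewhere in the source, regardless of where it occurs, so it constrains appearance without fixing layout.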

SLIDE 15

Loss Terms

[Diagram labels: Paug, Psource, Isource, Itarget, Iaug, Eseg (share weights), Faug, Fsource, SFT, (γseg, βseg), (γimg, βimg), G, Segmentation Network, Ppredict, Real/Fake (Fake, Real)]

Patch Coherence Loss
Semantic Alignment Loss
GAN Loss
Feature Matching Loss

SLIDE 16

Loss Terms

[Diagram labels: Psource, Paug, Isource, Itarget, Eseg (share weights), Faug, Fsource, SFT, (γseg, βseg), (γimg, βimg), G]

Reconstruction Loss
Fixed-Point Loss: γimg → 1, βimg → 0
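
The fixed-point condition γimg → 1, βimg → 0 says that the transformation should become an identity. One simple way to penalize departures from it is sketched below. The L1 form, the equal weighting of the two terms, and the name `fixed_point_loss` are assumptions made for illustration; the slide gives only the two limits.

```python
def fixed_point_loss(gamma_img, beta_img):
    """Mean |gamma - 1| plus mean |beta - 0| (assumed L1 form)."""
    gamma_term = sum(abs(g - 1.0) for g in gamma_img) / len(gamma_img)
    beta_term = sum(abs(b) for b in beta_img) / len(beta_img)
    return gamma_term + beta_term


if __name__ == "__main__":
    # Identity transformation: no penalty.
    print(fixed_point_loss([1.0, 1.0], [0.0, 0.0]))
    # Parameters away from the identity are penalized.
    print(fixed_point_loss([1.5, 0.5], [0.5, -0.5]))
```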

SLIDE 17

Loss Terms

[Diagram labels: Psource, Paug, Isource, Itarget, Eseg (share weights), Faug, Fsource, SFT, (γseg, βseg), (γimg, βimg), G, Real/Fake (Fake, Real)]

Reconstruction Loss
GAN Loss
Fixed-Point Loss: γimg → 1, βimg → 0

SLIDE 18

Evaluation

SLIDE 19

User Study Interface

Please rank A, B, and C by appearance similarity to the image on the left.

A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph.
J. Liao, et al. 2017. Visual attribute transfer through deep image analogy. ACM Trans. Graph.

SLIDE 20

Quantitative Comparisons

[Charts: Mean IoU and Pixel-wise Accuracy for IA, DIA, and Ours; share of Rank #1, Rank #2, and Rank #3 votes (0% to 100%) for IA, DIA, and Ours]

A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph.
J. Liao, et al. 2017. Visual attribute transfer through deep image analogy. ACM Trans. Graph.
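
For readers unfamiliar with the two metrics named on this slide, the sketch below computes them for a pair of label maps using their standard definitions. How the predicted map is obtained in the evaluation is not stated on the slide; here `pred` and `target` are simply two 2-D lists of class labels, and averaging IoU over the classes present in either map is one common convention, assumed for illustration.

```python
def _pairs(pred, target):
    """Flatten two equally sized 2-D label maps into (pred, target) pairs."""
    return [(p, t) for p_row, t_row in zip(pred, target)
            for p, t in zip(p_row, t_row)]


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the target label."""
    pairs = _pairs(pred, target)
    return sum(p == t for p, t in pairs) / len(pairs)


def mean_iou(pred, target):
    """Intersection over union per class, averaged over classes present."""
    pairs = _pairs(pred, target)
    classes = {p for p, _ in pairs} | {t for _, t in pairs}
    ious = []
    for c in sorted(classes):
        inter = sum(p == c and t == c for p, t in pairs)
        union = sum(p == c or t == c for p, t in pairs)
        ious.append(inter / union)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    pred = [[0, 0], [1, 1]]
    target = [[0, 1], [1, 1]]
    print(pixel_accuracy(pred, target))
    print(mean_iou(pred, target))
```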

SLIDE 21

[Figure panels: Source, Target, DIA, IA, Target Layout, Ours]

Comparisons with Previous Image Analogies

A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph.
J. Liao, et al. 2017. Visual attribute transfer through deep image analogy. ACM Trans. Graph.

SLIDE 22

[Figure panels: Source, Target Layout, Edited Source, IA, SinGAN, Ours]

A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph.
T. Shaham, et al. 2019. SinGAN: Learning a Generative Model From a Single Natural Image. In ICCV.

Comparisons with Single-Image GANs

SLIDE 23

[Figure panels: Source, Target Layout, IA, SPADE, Ours]

A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph.
T. Park, et al. 2019. Semantic Image Synthesis With Spatially-Adaptive Normalization. In CVPR.

Comparisons with Conditional GANs

SLIDE 24

[Figure panels: Source, Target #1, Target #2, Target #3]

Semantic Manipulation Results

SLIDE 25

Applications

SLIDE 26

[Figure panels: Isource, Psource, Ptarget, Itarget]

Object Removal Results

SLIDE 27

[Figure panels: Source, Target #1, Target #2, Target #3]

Face Editing Results

SLIDE 28

[Figure panels: Isource, Psource, Ptarget, Itarget]

Sketch-to-Image Synthesis Results

SLIDE 29

[Figure panels: Isource, Psource, Ptarget, Itarget]

Failure Cases

SLIDE 30

Thank you!

I ⇒ I′ is analogous to P ⇒ P′