SLIDE 1

Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild

Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi
Visual Geometry Group, University of Oxford

SLIDE 2

Agenda

  • Problem Introduction
  • Method Overview
  • Results
  • Discussions
  • Conclusions

SLIDE 3

What is 3D Reconstruction?

[Diagram: vision (reconstruction) goes from 2D observations to a 3D representation; graphics (rendering) goes from the 3D representation back to 2D observations.]

SLIDE 4

Multi-view 3D Reconstruction

Building Rome in a Day. Agarwal et al. ICCV’09

Static scene, but the world is dynamic.

SLIDE 5

Multi-view 3D Reconstruction

  • Building Rome in a Day. Agarwal et al. ICCV’09: static scene
  • The Relightables: Volumetric Performance Capture of Humans with Realistic Relighting. Guo et al. SIGGRAPH Asia’19: 100 cameras, too expensive for me :(

SLIDE 6

Learning-based Single-view 3D Reconstruction

  • Neural network with a 3D prior learned during training
  • Need supervision!

SLIDE 7

Supervision during Training

  • 3D ground truth or shape models
  • keypoints
  • silhouettes
  • multi-views
  • camera viewpoint
  • depth maps

SLIDE 8

Unsupervised Learning of 3D Objects

No supervision from:

  • 3D ground truth or shape models
  • keypoints
  • silhouettes
  • multi-views
  • camera viewpoint
  • depth maps

SLIDE 9

Unsupervised Learning of 3D Objects

  • Training data: single-view images of a category. NO other supervision!
  • Method: Unsup3D
  • Output: instance-specific 3D shapes

SLIDE 10

[Figure: input images and their reconstructions.]

SLIDE 12

Symmetries in the World

SLIDE 13

Training Pipeline: Photo-Geometric Autoencoding

SLIDE 14

Photo-Geometric Autoencoding

[Diagram: from the input I, encoder-decoder networks predict depth d and texture, and an encoder predicts view w; a renderer combines them into the reconstruction Î, which is compared with I by a reconstruction loss.]

SLIDE 21

Photo-Geometric Autoencoding

Q1: How to avoid degenerate solutions?
A1: Enforce symmetry by flipping.

[Diagram: from the input I, the networks predict view w, texture, and depth d; d′ is the horizontal flip of d. A flip switch passes either the original or the flipped prediction to the renderer, which produces the reconstruction Î, compared with I by a reconstruction loss.]
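
The flipping step can be illustrated with a minimal sketch, assuming a depth map stored as a plain list of rows. The helper names `hflip` and `asymmetry` are hypothetical and are not taken from the Unsup3D code, which works on image tensors; the sketch only shows what "horizontal flip" means for a 2D map and how a perfectly symmetric map is left unchanged by it.

```python
def hflip(grid):
    """Return a horizontally flipped copy of a 2D map (list of rows)."""
    return [list(reversed(row)) for row in grid]


def asymmetry(grid):
    """Mean absolute difference between a map and its horizontal flip."""
    flipped = hflip(grid)
    total = sum(
        abs(a - b)
        for row, flipped_row in zip(grid, flipped)
        for a, b in zip(row, flipped_row)
    )
    return total / (len(grid) * len(grid[0]))


# Toy depth maps (values are made up for illustration).
depth = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
symmetric_depth = [[1.0, 2.0, 1.0],
                   [3.0, 0.0, 3.0]]

depth_flipped = hflip(depth)                        # d' in the slides' notation
symmetric_is_unchanged = hflip(symmetric_depth) == symmetric_depth
```

A map that is symmetric about its vertical centre line has zero asymmetry, so rendering from either the original or the flipped map gives the same result; an asymmetric map does not, which is what makes the flipped branch a useful training signal.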

SLIDE 23

Photo-Geometric Autoencoding

Q2: What about non-symmetric lighting?
A2: Enforce symmetry on albedo.

[Diagram: from the input I, encoder-decoders predict depth d and albedo a, and encoders predict view w and light l; d′ and a′ are the horizontal flips of d and a, selected by a flip switch. Shading the albedo gives the canonical view J, which the renderer turns into the reconstruction Î, compared with I by a reconstruction loss.]

SLIDE 25

Photo-Geometric Autoencoding

Q3: Non-symmetric albedo, deformation, etc.?
A3: Predict uncertainty.

[Diagram: the pipeline with depth d, albedo a, light l, view w, canonical view J, and the flip switch, extended with an encoder-decoder that predicts confidence maps σ and σ′; these feed into the reconstruction loss between the reconstruction Î and the input I.]
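
One common way to train a predicted confidence map is a Laplacian negative log-likelihood: each pixel's absolute error is divided by its predicted σ, and a log σ term penalizes claiming high uncertainty everywhere. The slides do not give the exact loss, so the sketch below is an assumption, and `confidence_weighted_l1` is a hypothetical name, not a function from the Unsup3D code.

```python
import math


def confidence_weighted_l1(recon, target, sigma):
    """Mean Laplacian negative log-likelihood over all pixels.

    Per pixel: sqrt(2) * |recon - target| / sigma + log(sqrt(2) * sigma).
    All three arguments are 2D maps (lists of rows) of the same size,
    and every sigma value must be positive.
    """
    root2 = math.sqrt(2.0)
    total = 0.0
    count = 0
    for r_row, t_row, s_row in zip(recon, target, sigma):
        for r, t, s in zip(r_row, t_row, s_row):
            total += root2 * abs(r - t) / s + math.log(root2 * s)
            count += 1
    return total / count


# Toy example: the reconstruction matches the target except at one pixel,
# standing in for an asymmetric region the symmetric model cannot explain.
recon = [[0.5, 0.5],
         [0.5, 0.5]]
target = [[0.5, 0.5],
          [0.5, 1.0]]
sigma_uniform = [[0.1, 0.1],
                 [0.1, 0.1]]
sigma_adapted = [[0.1, 0.1],
                 [0.1, 0.7]]   # larger uncertainty only at the mismatched pixel

loss_uniform = confidence_weighted_l1(recon, target, sigma_uniform)
loss_adapted = confidence_weighted_l1(recon, target, sigma_adapted)
```

In this toy case `loss_adapted` is lower than `loss_uniform`: raising σ only where the error is large reduces the loss, which is how a confidence map can absorb asymmetries without giving up the symmetry constraint elsewhere.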

SLIDE 32

Results on human faces

Images taken from CelebA, 3DFAW

SLIDE 33

[Figure: input images and their reconstructions.]

SLIDE 34

Results on face paintings

Images taken from [1]

[1] Elliot J. Crowley, Omkar M. Parkhi, and Andrew Zisserman. Face painting: querying art with photos. In Proc. BMVC, 2015.

SLIDE 35

[Figure: input images and their reconstructions.]

SLIDE 36

Results on abstract faces

Images taken from [1] and the Internet

[1] Elliot J. Crowley, Omkar M. Parkhi, and Andrew Zisserman. Face painting: querying art with photos. In Proc. BMVC, 2015.

SLIDE 37

[Figure: input images and their reconstructions.]

SLIDE 38

Results on video frames

Video clips taken from VoxCeleb2

We do not use videos for training or fine-tuning. These results are obtained by applying our model, trained on CelebA, frame by frame.

SLIDE 39

[Figure: four input frames, each shown with its reconstruction, a new view, and a rotated view.]

SLIDE 40

Relighting effects

Images taken from CelebA

SLIDE 41

[Figure: input images and their reconstructions.]

SLIDE 42

Results on cat faces

Images taken from [2] and [3]

[2] Weiwei Zhang, Jian Sun, and Xiaoou Tang. Cat head detection - how to effectively exploit shape and texture features. In Proc. ECCV, 2008.
[3] Omkar M. Parkhi, Andrea Vedaldi, Andrew Zisserman, and C. V. Jawahar. Cats and dogs. In Proc. CVPR, 2012.

SLIDE 43

[Figure: input images and their reconstructions.]

SLIDE 44

Results on synthetic cars

Images rendered using ShapeNet

SLIDE 45

[Figure: input images and their reconstructions.]

SLIDE 46

Symmetry Plane Visualization

SLIDE 47

Asymmetry Visualization

SLIDE 48

Discussion: Ablation Studies

SLIDE 49

Ablation – Symmetry

Insight #1: Symmetry avoids degeneracy

[Figure: depth, normal, shading, albedo, and shaded reconstruction for one input, comparing the full model with variants without albedo flip and without depth flip.]

SLIDE 50

Ablation – Lighting (Shape from Shading)

Insight #2: Lighting avoids bumpy shapes and provides cues for shape

[Figure: depth, normal, shading, albedo, and shaded reconstruction for one input, comparing the full model with a variant without lighting.]

SLIDE 51

Ablation – Confidence Maps

Insight #3: Confidence maps allow for asymmetry modelling

[Figure: an input with an asymmetric perturbation, shown with the following panels.]

  • recon. w/ conf.
  • recon. w/o conf.
  • conf. σ
  • conf. σ′

SLIDE 52

Discussion: Limitations

SLIDE 53

Limitation #1: Poor side reconstruction

Why?

  • The canonical depth map cannot represent the shape of the side.

[Figure: input images and their reconstructions.]

SLIDE 54

Limitation #2: Side pose input

Why?

  • There are few or no symmetric correspondences present in the image, which voids the symmetry regularization.
  • There are not enough side-view images in the training set.

[Figure: input image and its reconstructions.]

SLIDE 55

Limitation #3: Lambertian shading

Why?

  • We assume Lambertian shading with one dominant directional light, and do not model specularities or shadows.

[Figure: input image and its reconstructions.]
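
The Lambertian assumption can be made concrete with a minimal sketch: shading at a surface point depends only on the angle between the surface normal and a single light direction, plus a constant ambient term. The function names and the `ambient` and `diffuse` weights below are hypothetical values chosen for illustration; they are not taken from the slides or the Unsup3D code.

```python
import math


def unit(vector):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in vector))
    return tuple(c / length for c in vector)


def lambertian_shading(normal, light_dir, ambient=0.5, diffuse=0.5):
    """Shading = ambient + diffuse * max(0, <normal, light_dir>).

    There is no specular term and no cast-shadow test, which is exactly
    what this limitation refers to.
    """
    n = unit(normal)
    l = unit(light_dir)
    cosine = sum(a * b for a, b in zip(n, l))
    return ambient + diffuse * max(0.0, cosine)


def shade_pixel(albedo_rgb, shading):
    """Multiply an RGB albedo value by a scalar shading value."""
    return tuple(channel * shading for channel in albedo_rgb)


facing_light = lambertian_shading((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
facing_away = lambertian_shading((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
tilted = lambertian_shading((0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
pixel = shade_pixel((0.8, 0.6, 0.4), facing_away)
```

A surface facing the light receives the full ambient plus diffuse contribution, while one facing away receives only the ambient term. A glossy highlight or a shadow cast by another part of the object cannot be expressed by this formula, so the model has to explain such effects through albedo or shape instead.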

SLIDE 56

Limitation #4: Dark texture vs. dark shading

Why?

  • Disentangling dark texture and dark shading is hard. The model may produce dark shading with a spiky shape to reconstruct dark texture.

[Figure: input image and its reconstructions.]

SLIDE 57

Conclusions

  • We present an unsupervised method for learning deformable 3D objects from only raw single-view images.
  • Bilateral symmetry in common objects provides a powerful constraint for learning 3D shapes.
  • Shading provides important geometric cues and helps regularize shape.
  • By modeling symmetric albedo and non-symmetric shading separately, our model automatically learns intrinsic image decomposition without supervision.
  • Confidence maps can be used to model asymmetries.

SLIDE 58

Key Takeaways

  • 3D understanding is possible from only single-view 2D observations; image recognition should go beyond 2D.
  • Reducing supervision in training is important for 3D understanding in the wild.
  • Physical cues and visual patterns are useful, such as symmetry, shading, planes, repetitions, etc.

SLIDE 59

Thank you!

Demo: bit.ly/2zBNjXx
Code: github.com/elliottwu/unsup3d