SLIDE 1 Single Image Portrait Relighting
Tiancheng Sun¹, Jonathan T. Barron², Yun-Ta Tsai², Zexiang Xu¹, Xueming Yu³, Graham Fyffe³, Christoph Rhemann³, Jay Busch³, Paul Debevec³, Ravi Ramamoorthi¹
¹University of California, San Diego; ²Google Research; ³Google
SLIDES 3-7 Overview
In this example portrait, the light comes from behind the subject, casting a shadow on the face. We want to add dramatic lighting after the fact.
Goal: change the lighting of any portrait after capture, using a post-processing algorithm.
SLIDES 8-16 Overview
Demo: our Portrait Relighting System takes the portrait and re-renders it under another lighting, e.g., "I want to rotate the lighting a little bit."
SLIDES 17-21 Previous work
Debevec, Paul, et al. "Acquiring the reflectance field of a human face." SIGGRAPH 2000: capture ~100 images and perform image-based relighting.
SLIDES 22-26 Previous work
- Deep image-based relighting
Xu, Zexiang, et al. "Deep image-based relighting from optimal sparse samples." SIGGRAPH 2018: capture 5 images and relight via a neural network.
SLIDES 27-30 Previous work
- Portrait lighting transfer
Shu, Zhixin, et al. "Portrait lighting transfer using a mass transport approach." SIGGRAPH 2018: transfer lighting from one portrait to another.
SLIDES 31-34 Previous work (Normal / Shading / Albedo / Relit)
Sengupta, Soumyadip, et al. "SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the Wild'." CVPR 2018: deep intrinsic decomposition, mostly trained on synthetic faces.
SLIDES 35-38 Overview
- Goal: practical relighting on a single portrait image
- "Practical" in detail:
  - Robust to pose and camera view
  - Works well under natural lighting
  - Adapts to high-resolution images
  - Runs at an interactive rate
- Solution: deep neural network + real face data
SLIDES 39-48 Method
The Portrait Relighting System is a neural network that takes a portrait under lighting A together with a desired lighting B, and outputs the portrait under lighting B; it also estimates the source lighting A from the input.
How can we get the portrait pairs for training?
SLIDES 49-61 Method: Data
Light Stage: One-Light-At-a-Time (OLAT) scans [Debevec, Paul, et al. "Acquiring the reflectance field of a human face." SIGGRAPH 2000].
Lighting is stored in a latitude-longitude representation. Because light transport is linear, scaling each captured OLAT image by the corresponding pixel of an HDR lighting environment and summing yields a relit image (background removed following Wadhwa, Neal, et al. "Synthetic depth-of-field with a single-camera mobile phone." SIGGRAPH 2018):

    relit image = Σ_i ( light_i × OLAT_i )

(see the sketch below)
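The weighted sum above is simple enough to sketch directly. A minimal NumPy sketch (illustrative, not the authors' code; the array shapes and light count are stand-ins):

```python
# A minimal NumPy sketch of OLAT image-based relighting, as in the slide's
# weighted sum; this is illustrative, not the authors' code, and the shapes
# (including the light count) are stand-ins.
import numpy as np

def relight(olat: np.ndarray, env: np.ndarray) -> np.ndarray:
    """olat: (n_lights, H, W, 3) OLAT scan; env: (n_lights, 3) RGB weight of
    the lat-long environment map sampled at each Light Stage light direction."""
    # Light transport is linear, so the relit image is just the sum of each
    # OLAT image scaled by its light's RGB intensity in the environment.
    return np.einsum('lhwc,lc->hwc', olat, env)

# Usage with random stand-in data (n_lights chosen arbitrarily):
olat = np.random.rand(300, 256, 256, 3).astype(np.float32)
env = np.random.rand(300, 3).astype(np.float32)
relit = relight(olat, env)  # (256, 256, 3) portrait under the new lighting
```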
SLIDES 62-68 Method: Data
- OLAT images
  - 22 people (18 training, 4 validation), each with 3-5 facial expressions
  - Each OLAT captured with 7 cameras in 6 seconds
- HDR lighting environments
  - ~2,000 indoor HDR lighting environments from the Laval dataset, plus lighting environments collected from the web
- Total: 226,800 portrait and lighting pairs for training
SLIDES 69-76 Method: Training
- Task 1: Complete relighting
  An encoder-decoder network with a bottleneck takes the source image, predicts the source light at the bottleneck, receives the target light, and outputs the target image. Training uses an L1 loss and a log-L1 loss (see the loss sketch below).
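For concreteness, a minimal sketch of the two losses named on the slide. Which loss attaches to which output is an assumption on my part: HDR lighting spans orders of magnitude, so log-L1 plausibly supervises the predicted light, while plain L1 supervises the relit image.

```python
# A minimal sketch of the two losses named on the slide (not the authors'
# exact formulation). Assumption: plain L1 supervises the relit image and
# log-L1 supervises the HDR light, which spans orders of magnitude.
import torch

def image_l1(pred_image: torch.Tensor, true_image: torch.Tensor) -> torch.Tensor:
    # Straight L1 distance between predicted and ground-truth relit images.
    return (pred_image - true_image).abs().mean()

def light_log_l1(pred_light: torch.Tensor, true_light: torch.Tensor) -> torch.Tensor:
    # Compare HDR lights in log space so bright sources don't dominate.
    return (torch.log1p(pred_light) - torch.log1p(true_light)).abs().mean()
```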
SLIDES 77-84 Method: Training
- Task 2: Illumination retargeting
  The source light predicted at the bottleneck is rotated and fed back in as the target light, and the output is supervised against the corresponding target image with the same L1 / log-L1 losses. (A lat-long rotation sketch follows.)
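The "Rotate" step operates on the latitude-longitude map. A minimal sketch, assuming the rotation is about the vertical axis, in which case it reduces to a circular shift of the longitude columns:

```python
# A minimal sketch of rotating a lat-long environment map in azimuth
# (an assumption for how "Rotate" is realized; a rotation about the
# vertical axis is a circular shift of the longitude columns).
import numpy as np

def rotate_latlong(env: np.ndarray, degrees: float) -> np.ndarray:
    """env: (16, 32, 3) latitude-longitude lighting map."""
    width = env.shape[1]
    shift = int(round(width * degrees / 360.0))
    return np.roll(env, shift, axis=1)  # axis 1 is longitude

rotated = rotate_latlong(np.random.rand(16, 32, 3), degrees=45.0)
```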
SLIDES 85-90 Method: Training
- Network structure
  - U-Net: the encoder downsamples the 256x256 source image through 128x128, 64x64, and 32x32 to a 16x16 bottleneck; skip connections concatenate encoder activations into the decoder, which upsamples back to the 256x256 output target image.
  - Predict and feed in light at the bottleneck: the predicted source light, a 16x32x3 latitude-longitude map, is supervised against the true source light, and the 16x32x3 target light is fed into the bottleneck.
  - A confidence learning module produces the light prediction (detailed on the following slides; a topology sketch follows).
[Architecture diagram: k x k conv layers, concatenations, weighted averages, and tiling; encoder/decoder channel widths ranging from 64 to 512; losses on the 16x32x3 source light (with a 16x32x1 confidence map) and on the output target image.]
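To make the diagram concrete, here is a minimal PyTorch sketch of the topology described above: an encoder from 256x256 down to a 16x16 bottleneck, a source-light head, target-light injection at the bottleneck, and a skip-connected decoder. Channel widths, layer counts, and the light-injection details are simplified assumptions, not the paper's exact architecture.

```python
# A minimal PyTorch sketch of the U-Net topology on the slide. Channel
# widths, layer counts, and the light head are simplified assumptions,
# not the paper's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelightUNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: 256 -> 128 -> 64 -> 32 -> 16 spatial resolution.
        self.enc1 = nn.Conv2d(3, 32, 3, stride=2, padding=1)
        self.enc2 = nn.Conv2d(32, 64, 3, stride=2, padding=1)
        self.enc3 = nn.Conv2d(64, 128, 3, stride=2, padding=1)
        self.enc4 = nn.Conv2d(128, 256, 3, stride=2, padding=1)
        # Source-light head: 16x16 bottleneck -> 16x32x3 lat-long map
        # (a crude stand-in for the paper's confidence-weighted prediction).
        self.light_head = nn.Conv2d(256, 6, 1)
        # Decoder with skip connections; the resampled target light
        # (3 channels) joins the bottleneck features before decoding.
        self.dec4 = nn.ConvTranspose2d(256 + 3, 128, 4, stride=2, padding=1)
        self.dec3 = nn.ConvTranspose2d(128 + 128, 64, 4, stride=2, padding=1)
        self.dec2 = nn.ConvTranspose2d(64 + 64, 32, 4, stride=2, padding=1)
        self.dec1 = nn.ConvTranspose2d(32 + 32, 3, 4, stride=2, padding=1)

    def forward(self, image, target_light):
        e1 = F.relu(self.enc1(image))   # (B, 32, 128, 128)
        e2 = F.relu(self.enc2(e1))      # (B, 64, 64, 64)
        e3 = F.relu(self.enc3(e2))      # (B, 128, 32, 32)
        e4 = F.relu(self.enc4(e3))      # (B, 256, 16, 16)
        # Predict the source light at the bottleneck.
        src_light = self.light_head(e4).reshape(-1, 3, 16, 32)
        # Resample the 16x32 target light to 16x16 and inject it.
        tgt = F.interpolate(target_light, size=(16, 16), mode='bilinear',
                            align_corners=False)
        d4 = F.relu(self.dec4(torch.cat([e4, tgt], dim=1)))
        d3 = F.relu(self.dec3(torch.cat([d4, e3], dim=1)))
        d2 = F.relu(self.dec2(torch.cat([d3, e2], dim=1)))
        out = self.dec1(torch.cat([d2, e1], dim=1))  # (B, 3, 256, 256)
        return out, src_light

net = RelightUNet()
relit, predicted_light = net(torch.rand(1, 3, 256, 256),
                             torch.rand(1, 3, 16, 32))
```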
SLIDES 91-100 Method: Training
- Confidence learning
  - Several conv layers reduce the spatial resolution, producing a light prediction for each image patch together with a confidence in that prediction.
  - The per-patch predictions are multiplied by their confidences, reshaped, and averaged into the final light estimate (see the sketch below).
  - This allows the network to say "I don't know" for uninformative patches.
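A minimal sketch of the confidence-weighted fusion described above. The 16x32x3 per-patch light and 16x32x1 confidence shapes follow the slide's diagram; the patch count and everything else are illustrative assumptions.

```python
# A minimal sketch of confidence-weighted light fusion (illustrative, not
# the authors' code). Each image patch contributes a full 16x32x3 light
# guess plus a 16x32x1 confidence; the final light is the confidence-
# weighted average over patches, so low-confidence patches ("I don't
# know") contribute little.
import numpy as np

def fuse_light(per_patch_light: np.ndarray, confidence: np.ndarray) -> np.ndarray:
    """per_patch_light: (P, 16, 32, 3); confidence: (P, 16, 32, 1), non-negative."""
    weighted = (per_patch_light * confidence).sum(axis=0)
    return weighted / (confidence.sum(axis=0) + 1e-8)  # guard against zeros

light = fuse_light(np.random.rand(64, 16, 32, 3),  # 64 patches, e.g. an 8x8 grid
                   np.random.rand(64, 16, 32, 1))
```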
SLIDES 101-108 Results: Validation
➤ Relit images for complete relighting: source image and target (ground-truth) image, with our result compared against [Barron & Malik 2015], [Sengupta et al. 2018], and [Li et al. 2018].
SLIDES 109-114 Results: Validation
➤ Relighting images with the predicted light as the target light: source image, result with self-supervision, result without self-supervision.
SLIDES 115-124 Results: Validation
➤ Comparison with portrait lighting transfer: extract the light from a reference image, then apply it to the source image (see the sketch below). Results shown against the ground truth, [Shih et al. 2014], and [Shu et al. 2018].
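Using the RelightUNet sketch from the architecture section (hypothetical names from that sketch, not the authors' code), the lighting-transfer protocol is two forward passes:

```python
# Lighting transfer with the RelightUNet sketch above (hypothetical names,
# not the authors' code): extract the reference portrait's lighting, then
# relight the source portrait with it.
import torch

net = RelightUNet()
source = torch.rand(1, 3, 256, 256)
reference = torch.rand(1, 3, 256, 256)
placeholder = torch.zeros(1, 3, 16, 32)           # target light unused here
_, reference_light = net(reference, placeholder)  # 1) extract light from reference
transferred, _ = net(source, reference_light)     # 2) apply it to the source
```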
SLIDES 125-129 Results: Validation
➤ Evaluation of lighting prediction: source image and ground-truth light, with our confidence-learning prediction compared against [Barron & Malik 2015] and [Sengupta et al. 2018].
SLIDES 130-138 Results: Images in the wild
Input images and their relit counterparts.
SLIDES 139-142 Limitations
Failure cases (input image vs. relit image):
- Complex shadows
- Specular highlights
- Overexposed pixels
- Over-smoothing
- Unseen high-saturation colors
SLIDES 143-147 Conclusion
- Learn the relighting function on portraits using Light Stage data
- Take-home messages:
  - For human faces, use real data.
  - Prefer end-to-end training over assuming models.
SLIDE 148 Acknowledgements
- This work was funded in part by a Jacobs Fellowship, the Ronald L. Graham Chair, and the UC San Diego Center for Visual Computing.
- Thanks to Zhixin Shu and Yichang Shih for their help with the baseline algorithms.
- Thanks to Jean-François Lalonde for providing the indoor lighting dataset.
- Thanks to Peter Denny for coordinating the dataset capture.
- Thanks to all the anonymous volunteers in the dataset from Google, UCSD, and UCLA.
SLIDES 149-151 Single Image Portrait Relighting
Also presenting in the poster session today at 12:15-1:15.
Input image and generated portraits under new illuminations.