
SLIDE 1

Synthesizing Normalized Faces from Facial Identity Features

Forrester Cole, David Belanger, Dilip Krishnan, Aaron Sarna, Inbar Mosseri, William T. Freeman, Google, Inc. University of Massachusetts Amherst, MIT CSAIL CVPR 2017 Presented by: Kapil Krishnakumar

SLIDE 2

Problem

  • Goal: synthesize a frontal, neutral-expression image of a person’s face given an input face photograph
  • Learn a one-to-one mapping from identity to image
  • Serves as a pre-processing method for removing irregularities from face images

Image Credit: Cole et al.

SLIDE 3

Related Work

  • Zhmoginov and Sandler. Inverting Face Embeddings with Convolutional Neural Networks
  • Blanz and Vetter. A Morphable Model for the Synthesis of 3D Faces
  • Cootes et al. Active Appearance Models
  • Hassner et al. Effective Face Frontalization in Unconstrained Images

Image Credit: Zhmoginov and Sandler; Blanz and Vetter; Cootes et al.; Hassner et al.

SLIDE 4

Approach

  • Morphing of images (data augmentation)
  • Encoder (image to feature vector)
  • Decoder (feature vector to normalized image)

○ Landmarks ○ Texture

Image Credit: Cole et al.

SLIDE 5

Architecture

Image Credit: Cole et al.

[Diagram: FaceNet -> MLP -> Landmarks; FC/CNN -> Textures; Differentiable Warping]

SLIDE 6

FaceNet (Background) (Schroff et al. 2015)

  • Maps face images to 128-D embedding vectors
  • Trained using triplet loss: the embeddings of two pictures of person A should be more similar to each other than to the embedding of a picture of person B
  • Uses GoogLeNet’s NN2 architecture

Image Credit: Cole CVPR 2017 Talk (https://www.youtube.com/watch?v=jVAClXpHgAI) | Szegedy et al. Going deeper with convolutions
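The triplet loss mentioned above can be sketched in a few lines. This is a minimal NumPy illustration, not FaceNet's actual training code; the margin value and the toy embeddings are assumptions for demonstration.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss on embeddings: pull the anchor toward the positive
    (same identity) and push it away from the negative (different
    identity) by at least `margin`. Margin value is illustrative."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance, same person
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance, other person
    return max(d_pos - d_neg + margin, 0.0)

# Toy 128-D unit vectors standing in for FaceNet embeddings.
rng = np.random.default_rng(0)
def unit(v): return v / np.linalg.norm(v)
a = unit(rng.normal(size=128))                # anchor: person A
p = unit(a + 0.05 * rng.normal(size=128))     # positive: near the anchor
n = unit(rng.normal(size=128))                # negative: unrelated person B
assert triplet_loss(a, p, n) < triplet_loss(a, n, p)
```

An easy triplet (positive close, negative far) yields a smaller loss than the same triplet with the roles swapped, which is exactly the ordering the loss enforces.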

SLIDE 7

Encoder

  • Use pretrained FaceNet
  • Extract the 1024-D “avgpool” layer of the “NN2” architecture
  • Append and train a fully connected layer from 1024 to F dimensions on top of this layer

Image Credit: Szegedy et al. Going deeper with convolutions
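The encoder's only trainable part is that single fully connected layer on top of the frozen FaceNet activations. A minimal NumPy sketch, where the value of F and the weight initialization are illustrative assumptions:

```python
import numpy as np

F = 64  # target feature dimension (illustrative; the paper tunes this)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(1024, F))  # trainable FC weights
b = np.zeros(F)                             # trainable FC bias

def encode(avgpool_1024):
    """Project the frozen FaceNet 1024-D 'avgpool' activation down to
    an F-dimensional identity feature vector with one linear layer."""
    return avgpool_1024 @ W + b

avgpool = rng.normal(size=1024)  # stand-in for a FaceNet activation
feature = encode(avgpool)
assert feature.shape == (F,)
```

Keeping FaceNet frozen means only `W` and `b` receive gradients during training, so the identity representation itself is inherited from FaceNet.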

SLIDE 8

Encoder

[Diagram: FaceNet “avgpool” -> Fully Connected -> Feature Vector]

Image Credit: Szegedy et al. Going deeper with convolutions

SLIDE 9

Decoder

  • Separating landmarks and textures is more effective than predicting the image directly
  • Landmarks are estimated by a shallow MLP with ReLUs applied to the feature vector

○ FV -> [(x,y), ...]

  • Textures are estimated by a fully connected network or a CNN

○ FV -> Image

[Diagram: Feature Vector -> Shallow MLP -> Predicted Landmarks; Feature Vector -> CNN/FC -> Predicted Texture]

Image Credit: Cole et al.
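The two decoder heads described above can be sketched as follows. This is a NumPy illustration only; the dimensions F, H, L, the landmark count, and the texture size are assumptions, and the paper's texture head is a real CNN rather than this single linear map.

```python
import numpy as np

rng = np.random.default_rng(0)
F, H, L = 64, 256, 68  # feature dim, hidden units, landmark count (illustrative)

# Shallow MLP with ReLU: feature vector -> L (x, y) landmark coordinates.
W1, b1 = rng.normal(scale=0.01, size=(F, H)), np.zeros(H)
W2, b2 = rng.normal(scale=0.01, size=(H, 2 * L)), np.zeros(2 * L)

def predict_landmarks(fv):
    h = np.maximum(fv @ W1 + b1, 0.0)   # ReLU hidden layer
    return (h @ W2 + b2).reshape(L, 2)  # [(x, y), ...]

# Fully connected texture head: feature vector -> small texture image.
T = 32  # texture side length (illustrative; the paper uses larger images)
Wt, bt = rng.normal(scale=0.01, size=(F, T * T * 3)), np.zeros(T * T * 3)

def predict_texture(fv):
    return (fv @ Wt + bt).reshape(T, T, 3)

fv = rng.normal(size=F)
assert predict_landmarks(fv).shape == (L, 2)
assert predict_texture(fv).shape == (T, T, 3)
```

The key design point survives even in this sketch: landmarks (geometry) and texture (appearance) come from separate heads reading the same feature vector, and are only combined later by warping.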

SLIDE 10

Decoder

  • Use differentiable image warping to combine landmarks and textures

[Diagram: Predicted Landmarks + Predicted Texture -> Differentiable Image Warping]

Image Credit: Cole et al.

SLIDE 11

Decoder

Image Credit: Cole et al

SLIDE 12

Differentiable Image Warping

[Diagram: Input Image with Landmarks; Final Landmarks; Texture with Landmarks; Mean Landmarks of training data -> Dense Flow Field via Spline Interpolation -> Final Output]

Image Credit: Cole et al.

SLIDE 13

Differentiable Spline Interpolation

[Diagram: Input Landmarks + Final Landmarks -> Distance Matrix -> Polyharmonic Interpolation -> Displacement -> Flow Field (X, Y, Magnitude)]

Image Credit: Cole et al.
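The interpolation step can be sketched with a thin-plate spline, the 2-D polyharmonic kernel phi(r) = r² log r: fit radial-basis weights so the spline reproduces the landmark displacements, then evaluate it anywhere to get a dense flow field. This is a NumPy sketch under that kernel assumption, not the paper's implementation; the regularization constant is illustrative.

```python
import numpy as np

def tps_kernel(r):
    """Thin-plate spline radial basis: phi(r) = r^2 log(r), phi(0) = 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0, r ** 2 * np.log(r), 0.0)

def fit_spline(src_pts, displacements, reg=1e-6):
    """Solve for RBF weights (plus an affine part) so the spline
    reproduces the displacement at every source landmark."""
    n = len(src_pts)
    r = np.linalg.norm(src_pts[:, None] - src_pts[None, :], axis=-1)
    K = tps_kernel(r) + reg * np.eye(n)                # pairwise distance matrix
    P = np.hstack([np.ones((n, 1)), src_pts])          # affine terms: 1, x, y
    A = np.block([[K, P], [P.T, np.zeros((3, 3))]])
    rhs = np.vstack([displacements, np.zeros((3, 2))])
    return np.linalg.solve(A, rhs)

def evaluate_spline(params, src_pts, query_pts):
    """Evaluate the fitted spline at arbitrary points -> dense flow."""
    r = np.linalg.norm(query_pts[:, None] - src_pts[None, :], axis=-1)
    K = tps_kernel(r)
    P = np.hstack([np.ones((len(query_pts), 1)), query_pts])
    return K @ params[:len(src_pts)] + P @ params[len(src_pts):]

rng = np.random.default_rng(0)
src = rng.uniform(0, 64, size=(10, 2))            # input landmarks
dst = src + rng.normal(scale=2.0, size=(10, 2))   # final landmarks
params = fit_spline(src, dst - src)
flow_at_src = evaluate_spline(params, src, src)
assert np.allclose(flow_at_src, dst - src, atol=1e-3)
```

Because fitting and evaluation are just distance computations and a linear solve, every step is differentiable, which is what lets the warp sit inside the network as a module.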

SLIDE 14

Training

Image Credit: Cole et al.

[Diagram: FaceNet -> MLP -> Landmarks; FC/CNN -> Textures]

SLIDE 15

Training

Image Credit: Cole et al.

[Diagram: FaceNet -> MLP -> Landmarks; FC/CNN -> Textures; compared against Ground Truth Landmarks and Ground Truth Textures]

SLIDE 16

Training with FaceNet Loss

Image Credit: Cole et al.

[Diagram: FaceNet -> MLP -> Landmarks; FC/CNN -> Textures; compared against Ground Truth Landmarks and Ground Truth Textures; rendered output passed back through FaceNet]

SLIDE 17

Training Loss

  • Separately penalize predicted landmarks and textures using mean squared error
  • Penalize differences between the encodings of the input image and the rendered image when passed through FaceNet

○ Highly expensive to train

Image Credit: Cole et al.
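The combined loss can be sketched as the sum of those three terms. This is a NumPy illustration: the loss weights are assumptions (not the paper's values) and `fake_facenet` is a hypothetical stand-in for the real embedding network.

```python
import numpy as np

def mse(pred, target):
    return np.mean((pred - target) ** 2)

def total_loss(pred_lm, gt_lm, pred_tex, gt_tex,
               facenet, input_img, rendered_img,
               w_lm=1.0, w_tex=1.0, w_id=1.0):
    """Combined training loss: landmark MSE + texture MSE + an identity
    term comparing FaceNet embeddings of the input and rendered images.
    Weights w_* are illustrative, not the paper's values."""
    identity = np.sum((facenet(input_img) - facenet(rendered_img)) ** 2)
    return w_lm * mse(pred_lm, gt_lm) + w_tex * mse(pred_tex, gt_tex) + w_id * identity

# Hypothetical stand-in "FaceNet": average-pools the image to a short vector.
def fake_facenet(img):
    return img.mean(axis=(0, 1))

rng = np.random.default_rng(0)
lm, tex = rng.normal(size=(68, 2)), rng.normal(size=(8, 8, 3))
img = rng.normal(size=(8, 8, 3))
# Perfect predictions and an identical rendering give zero loss.
assert total_loss(lm, lm, tex, tex, fake_facenet, img, img) == 0.0
```

The identity term is what makes training expensive: every rendered image must be pushed through the full FaceNet forward pass (and backward, for gradients) at each step.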

SLIDE 18

Data Augmentation: Random Morphs

  • Problem: no database of normalized face photos on which to train the decoder network
  • Solution: morphing data augmentation

[Diagram: select one of k = 200 nearest neighbors, using a distance defined by landmarks and textures -> linear interpolation (landmarks & textures)]

Image Credit: Cole et al.
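The morph step can be sketched as follows: pick one of a sample's k nearest neighbors under a landmark-plus-texture distance, then linearly interpolate both representations. A NumPy sketch; here k, the distance weighting, and the uniform interpolation weight are illustrative assumptions (the slide states k = 200).

```python
import numpy as np

def random_morph(landmarks, textures, idx, k=5, rng=None):
    """Create an augmented sample by interpolating sample `idx` with one
    of its k nearest neighbors. The distance here simply adds landmark
    and texture squared differences; the paper's weighting may differ."""
    rng = rng or np.random.default_rng()
    d_lm = np.sum((landmarks - landmarks[idx]) ** 2, axis=(1, 2))
    d_tex = np.sum((textures - textures[idx]) ** 2, axis=(1, 2, 3))
    dist = d_lm + d_tex
    dist[idx] = np.inf                   # exclude the sample itself
    neighbors = np.argsort(dist)[:k]     # k nearest neighbors
    j = rng.choice(neighbors)            # pick one at random
    t = rng.uniform(0.0, 1.0)            # interpolation weight
    morph_lm = (1 - t) * landmarks[idx] + t * landmarks[j]
    morph_tex = (1 - t) * textures[idx] + t * textures[j]
    return morph_lm, morph_tex

rng = np.random.default_rng(0)
lms = rng.normal(size=(20, 68, 2))       # 20 samples of 68 landmarks each
texs = rng.normal(size=(20, 16, 16, 3))  # 20 small textures
m_lm, m_tex = random_morph(lms, texs, idx=0, rng=rng)
assert m_lm.shape == (68, 2) and m_tex.shape == (16, 16, 3)
```

Because landmarks and textures interpolate independently, each morph yields a plausible new "identity" in the decoder's output space rather than a ghosted pixel average.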

SLIDE 19

Data Augmentation: Gradient Domain Compositing

  • Morphing cannot capture hair and background detail
  • Composite the morphed face onto an original background using gradient domain compositing

Image Credit: Cole et al.

SLIDE 20

Data Augmentation

[Figure: Input vs. Augmented examples]

Image Credit: Cole et al.

SLIDE 21

Data Augmentation

Image Credit: Cole et al

SLIDE 22

Training Data

  • Dataset used to train the VGG-Face network: 2.6M photos
  • Processing:

○ Average all images of each individual by morphing ○ Each image is warped to the individual’s average landmarks ○ Pixel values are averaged to form the individual’s average image

  • Yields 1K unique identity images
  • Use Kazemi and Sullivan’s method to extract ground-truth landmarks
  • Augmentation produces 1M images

Image Credit: Cole et al

SLIDE 23

Experiments: Labeled Faces in the Wild

  • Identities are mutually exclusive from the VGG-Face dataset

[Figure: comparison with Hassner et al.]

Image Credit: Cole et al.

SLIDE 24

Experiments: Labeled Faces in the Wild

  • Histograms of FaceNet L2 error between input and synthesized images
  • 1.242 is the threshold for clustering identities in FaceNet feature space
  • Blue: with FaceNet training loss
  • Green: without FaceNet training loss

Image Credit: Cole et al

SLIDE 25

Robustness to Occlusions

Image Credit: Cole CVPR 2017 Talk (https://www.youtube.com/watch?v=jVAClXpHgAI)

SLIDE 26

Extensions: 3-D Model Fitting

  • A normalized face image is easier to fit to a 3D morphable model

Image Credit: Cole et al

SLIDE 27

Extensions: Automatic Photo Adjustment

Image Credit: Cole et al

SLIDE 28

Extensions: Automatic Photo Adjustment

Image Credit: Cole et al

SLIDE 29

Advantages

  • Splitting the generative task (landmarks and textures) can work better than directly outputting the result
  • Fresh use of spline interpolation as a differentiable module in a neural network
  • The augmentation technique lets a decoder trained on only 1K identity images perform extremely well
  • Difficult features like hair and eyes are well defined in the normalized images
  • Robust to occlusions
SLIDE 30

Disadvantages

  • No “ground truth” against which to compare normalized images

○ Though performance can be measured as FaceNet closeness between the input image and the normalized image ○ Cannot obtain human-annotated ground truth

  • Dependent on off-the-shelf methods for generating landmark and texture labels

○ The paper shows no experiments with techniques other than Kazemi and Sullivan’s ○ Unclear how texture labels are generated

  • Backgrounds are unrealistic and blurry