DEEP UNCONSTRAINED GAZE ESTIMATION WITH SYNTHETIC DATA
Shalini De Mello, Rajeev Ranjan, Jan Kautz
2
NVIDIA AI CO-PILOT
3
APPLICATIONS
INTERFACE DESIGN AR/VR ACCESSIBILITY
4
TRADITIONAL GAZE TRACKERS
[Diagram: eye model with a light source; labeled parts: pupil, cornea, sclera, fovea, Cc, CE; gaze angles Θ, Φ; point of regard; X, Y, and Z axes]
5
TRADITIONAL GAZE TRACKER
6
REMOTE, CHEAP, UNCONSTRAINED GAZE TRACKING
7
CHALLENGES
Unconstrained gaze tracking
RESOLUTION, LIGHTING, HEAD ROTATION, SUBJECT VARIABILITY
11
APPEARANCE-BASED GAZE ESTIMATION*
Gaze tracking
*Zhang et al., IEEE CVPR 2015. *Krafka et al., IEEE CVPR 2016.
12
LABELED DATA COLLECTION
Gaze tracking
Indoors only Occlusion Cumbersome
14
LABELED DATA COLLECTION
Gaze tracking
Indoors only Limited head rotations and gazes
16
GPU TO THE RESCUE
Gaze tracking Computer Graphics Deep Learning
18
SYNTHETIC DATA
19
SYNTHETIC IMAGES
Head models
20
COMPUTER GRAPHICS EYE MODEL*
*Wood et al., IEEE ICCV 2015.
21
1 MILLION SYNTHETIC IMAGES
22
GAZE CNN ARCHITECTURE
Zhang et al., 2015 (5 layers)
Gaze pitch Gaze yaw Head pitch Head yaw
23
SYNTHETIC IMAGES
AUTHOR              DATA              ERROR (°)
Wood et al., 2015   UT Multiview 1M   9.68
Wood et al., 2016   UnityEyes 1M      9.95
Wood et al., 2015   SynthesEyes 12K   8.94
Ours                SynthesEyes 1M    7.74
Results on MPII data
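The errors in this deck are reported in degrees, and the network outputs are gaze pitch and gaze yaw. The slides do not spell out how the error is computed, so the following is only a minimal sketch of one common choice: convert each (pitch, yaw) pair to a 3D unit vector and measure the angle between prediction and ground truth. The axis convention in `angles_to_vector` and all function names are assumptions made for illustration.

```python
import math

def angles_to_vector(pitch, yaw):
    """Convert pitch and yaw (radians) to a 3D unit gaze vector.

    Assumed convention (not stated in the slides): zero pitch and yaw
    point along -z, toward the camera.
    """
    x = -math.cos(pitch) * math.sin(yaw)
    y = -math.sin(pitch)
    z = -math.cos(pitch) * math.cos(yaw)
    return (x, y, z)

def angular_error_deg(pred, truth):
    """Angle in degrees between two gaze directions given as (pitch, yaw)."""
    a = angles_to_vector(*pred)
    b = angles_to_vector(*truth)
    dot = sum(p * q for p, q in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    dot = max(-1.0, min(1.0, dot))
    return math.degrees(math.acos(dot))

def mean_angular_error_deg(preds, truths):
    """Mean angular error in degrees over paired predictions and labels."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, truths)]
    return sum(errors) / len(errors)
```

Because both vectors use the same convention, the resulting angle does not depend on which axis convention is chosen.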
24
EYE POINTS CNN ARCHITECTURE
Trained with 1M synthetic data
x1, y1, …, xn, yn
25
EYE FIDUCIAL POINTS
Results on MPII gaze data
26
GAZE ESTIMATION NETWORK
27
GAZE CNN ARCHITECTURE
Render for CNN (Su et al., 2015)
28
GAZE CNN ARCHITECTURE
Zhang et al., 2015 (5 layers)
Gaze pitch Gaze yaw Head pitch Head yaw
29
GAZE CNN ARCHITECTURE
Ours (8 layers)
Gaze pitch Gaze yaw Head pitch Head yaw
Render for CNN
30
GAZE CNN ARCHITECTURE
NETWORK     INITIALIZATION                          ERROR (°)
LeNet       Random                                  5.57
AlexNet     ImageNet (object recognition)           5.03
ResNet-50   ImageNet (object recognition)           5.07
Ours        Render for CNN (viewpoint estimation)   4.4
Results on 1M synthetic data
31
GAZE CNN ARCHITECTURE
Inputs and outputs
Gaze pitch Gaze yaw Head pitch Head yaw
Render for CNN
32
GAZE CNN ARCHITECTURE
INPUT            OUTPUT        ERROR (°)
Eye              Eye-in-head   5.05
Eye              Gaze          5.66
Eye, head pose   Eye-in-head   4.4
Eye, head pose   Gaze          4.4
Results on 1M synthetic data
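The two output choices compared here, eye-in-head direction and gaze direction, differ by the head rotation. The slides do not define the coordinate frames, so the sketch below assumes a simple model: the head rotation is a yaw about the y-axis composed with a pitch about the x-axis, and gaze in camera coordinates is that rotation applied to the eye-in-head direction. All function names and the rotation order are illustrative assumptions.

```python
import math

def rot_x(angle):
    """Rotation matrix about the x-axis (pitch), angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(angle):
    """Rotation matrix about the y-axis (yaw), angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    """Transpose of a 3x3 matrix (the inverse, for a rotation)."""
    return [list(row) for row in zip(*m)]

def apply(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def head_rotation(head_pitch, head_yaw):
    """Assumed head rotation: yaw about y composed with pitch about x."""
    return matmul(rot_y(head_yaw), rot_x(head_pitch))

def eye_in_head_to_gaze(eye_dir, head_pitch, head_yaw):
    """Rotate an eye-in-head direction into camera coordinates."""
    return apply(head_rotation(head_pitch, head_yaw), eye_dir)

def gaze_to_eye_in_head(gaze_dir, head_pitch, head_yaw):
    """Undo the head rotation to recover the eye-in-head direction."""
    return apply(transpose(head_rotation(head_pitch, head_yaw)), gaze_dir)
```

Under this model the two targets carry the same information once head pose is known, which is at least consistent with both rows that use head pose as an input reporting the same error.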
33
HEAD ROTATION
Positive head yaw
Zero head yaw Negative head yaw
Eye appearance
34
HEAD ROTATION
Gaze distribution (1M synthetic data)
[Figure: gaze yaw vs. gaze pitch, with regions numbered 1 to 7]
35
HEAD ROTATION
[Figure: pose yaw vs. pose pitch (axes from -1 to 1) and gaze yaw vs. gaze pitch, with regions numbered 1 to 5]
Gaze distribution (45k MPII data)
36
HEAD ROTATION
Head pose separation
[Diagram: Render for CNN with head pitch and head yaw, branching into cluster 1 … cluster n, each with its own gaze pitch and gaze yaw outputs; inset axes: head yaw, head pitch]
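The head pose separation idea can be illustrated with a small routing sketch: group training samples by head pose, then send each sample to the branch for its nearest cluster. The slides do not say how the clusters are formed, so plain k-means over (head yaw, head pitch) pairs is an assumption here, and `nearest_cluster`, `fit_head_pose_clusters`, and `select_branch` are hypothetical names.

```python
import math

def nearest_cluster(head_pose, centroids):
    """Index of the centroid closest to a (yaw, pitch) head pose."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(head_pose, centroids[i]))

def fit_head_pose_clusters(head_poses, initial_centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) over (yaw, pitch) pairs.

    Assumption: the deck does not state the clustering method.
    """
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for pose in head_poses:
            groups[nearest_cluster(pose, centroids)].append(pose)
        # Move each centroid to the mean of its group; keep empty ones fixed.
        centroids = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group else old
            for group, old in zip(groups, centroids)
        ]
    return centroids

def select_branch(head_pose, centroids, branches):
    """Pick the per-cluster branch (for example, fc7-8 layers) for a sample."""
    return branches[nearest_cluster(head_pose, centroids)]
```

For example, with two fitted centroids, `select_branch((0.3, 0.0), centroids, branches)` returns the branch whose cluster centre is closest to a head yaw of 0.3 and a head pitch of 0.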
37
HEAD ROTATION
INPUT            CNN              ERROR (°)
Eye              single fc7-8     5.66
Eye              branched fc7-8   5.18
Eye, head pose   single fc7-8     4.4
Eye, head pose   branched fc7-8   4.26
Results on 1M synthetic data
38
GAZE CNN ARCHITECTURE
[Chart: error (°), axis from 3.5 to 5, vs. number of head pose clusters (1 to 7), for Single and Branched CNNs; inset axes: head yaw, head pitch]
Results on 1M synthetic data
39
CNN ARCHITECTURE
Skip connections
Gaze pitch Gaze yaw Gaze pitch Gaze yaw … Head pitch Head yaw cluster 1 cluster n +
Render for CNN
40
GAZE CNN ARCHITECTURE
INPUT            CNN                                ERROR (°)
Eye, head pose   single fc7-8                       4.4
Eye, head pose   branched fc7-8                     4.26
Eye, head pose   branched fc7-8, skip connections   4.15
Results on 1M synthetic data
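To put the three rows on a common scale, the errors can be expressed as relative reductions against the single fc7-8 baseline. This is simple arithmetic on the reported numbers; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline_error, new_error):
    """Fractional error reduction relative to a baseline."""
    return (baseline_error - new_error) / baseline_error

# Errors in degrees, copied from the table (eye and head pose input).
ablation = [
    ("single fc7-8", 4.4),
    ("branched fc7-8", 4.26),
    ("branched fc7-8, skip connections", 4.15),
]

baseline = ablation[0][1]
for name, error in ablation:
    percent = 100 * relative_reduction(baseline, error)
    print(f"{name}: {error:.2f} deg, {percent:.1f}% below single fc7-8")
```

On these numbers, branching reduces the error by about 3.2% and adding skip connections brings the total reduction to about 5.7%.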
41
REAL DATA
Columbia
Gaze pitch Gaze yaw Gaze pitch Gaze yaw … Head pitch Head yaw cluster 1 cluster n +
Render for CNN
42
GAZE ERROR
Columbia
[Bar chart: error (°), axis from 4 to 8, for Wood et al., 2015, Our CNN, and Our CNN with Synthetic data, on the All and No Glasses subsets; bar values: 6.68, 5.65, 7.54, 6.26, 5.58]
43
REAL DATA
MPII gaze
Gaze pitch Gaze yaw Gaze pitch Gaze yaw … Head pitch Head yaw cluster 1 cluster n +
Render for CNN
44
GAZE ERROR
MPII gaze
METHOD                        ERROR (°)
Zhang et al., 2015            6.3
Our CNN                       5.85
Our CNN with Synthetic data   5.58
45
CONCLUSION
46
CHALLENGES
Unconstrained gaze tracking
RESOLUTION, LIGHTING, HEAD ROTATION, SUBJECT VARIABILITY
47