Michael Stengel, Alexander Majercik
NVGAZE: ANATOMY-AWARE AUGMENTATION FOR LOW-LATENCY, NEAR-EYE GAZE ESTIMATION
Part I (Michael), 25 min: eye tracking for near-eye displays, synthetic dataset generation, network training and results
Part II (Alexander), 15 min
3
Michael Stengel, Alexander Majercik, Joohwan Kim, Morgan McGuire, Samuli Laine (New Experiences Group); Shalini De Mello (Perception & Learning); David Luebke (VP of Graphics Research)
4
Michael Stengel
5
Applications of eye tracking: avatars, foveated rendering, dynamic streaming, attention studies, computational displays, perception, user state evaluation, health care, gaze interaction, periphery
[Eisko.com] [arpost.co] [Vedamurthy et al.] [Sitzmann et al.] [Patney et al.] [Sun et al.] [eyegaze.com] [Padmanaban et al.]
6
SUBTLE GAZE GUIDANCE Enlarging virtual spaces through redirected walking
[Sun et al., Siggraph‘18]
7
FOVEATED RENDERING Accelerating Real-time Computer Graphics
8
ACCOMMODATION SIMULATION Enhancing Depth Perception
9
GAZE-AS-INPUT
10
LABELED REALITY
11
WORKING PRINCIPLE
Pipeline: eye capture → pupil localization → domain mapping using calibration parameters → 3D gaze vector or 2D point of regard. (Diagram labels: eye, camera, display, lens, face, (x, y).)
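The two-stage working principle can be sketched in a few lines. Everything here is an illustrative stand-in: the darkest-region localizer replaces the actual CNN, and the affine calibration matrix is a hypothetical set of calibration parameters.

```python
import numpy as np

def localize_pupil(eye_image: np.ndarray) -> tuple[float, float]:
    """Toy pupil localizer: centroid of the darkest pixels.

    Stands in for the CNN stage; in IR dark-pupil imaging the pupil
    is the darkest region of the eye image.
    """
    threshold = np.percentile(eye_image, 1)        # darkest 1% of pixels
    ys, xs = np.nonzero(eye_image <= threshold)
    return float(xs.mean()), float(ys.mean())

def map_to_point_of_regard(pupil_xy, calib_matrix):
    """Domain mapping: apply calibration to get a 2D point of regard."""
    x, y = pupil_xy
    return calib_matrix @ np.array([x, y, 1.0])

# Synthetic eye image with a dark "pupil" blob centered near (29.5, 19.5)
img = np.full((48, 64), 200.0)
img[15:25, 25:35] = 10.0
pupil = localize_pupil(img)

calib = np.array([[10.0, 0.0, 0.0],    # hypothetical calibration parameters
                  [0.0, 10.0, 0.0]])
por = map_to_point_of_regard(pupil, calib)
```

The real system replaces the localizer with a trained network and learns or fits the mapping during a calibration procedure.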
12
Camera view: off-axis vs. on-axis
13
Modded GearVR with integrated gaze tracking
Eye tracking prototype for Virtual Reality headsets
Components for on-axis eye tracking integration: eye tracking cameras, dichroic mirrors, infrared illumination, VR glasses frame
14
Eye tracking prototype for VR headsets
15
16
Eye tracking prototype for VR headsets
(Diagram: eye, camera, display, lens.)
17
Eye tracking prototype for VR headsets
18
CHALLENGES FOR MOBILE VIDEO-BASED EYE TRACKERS
→ Reaching low latency and high robustness simultaneously is hard!
19
(dark-pupil tracking only, glint-free tracking)
20
Pupil localization as a detection task: like detecting cars independent of view and lighting condition.
21
22
1: Eye Model
We adopted the eye model from Wood et al. 2015* and modified it to more accurately represent human eyes.
* Wood, E., Baltrušaitis, T., Zhang, X., Sugano, Y., Robinson, P., & Bulling, A. “Rendering of eyes for eye-shape registration and gaze estimation”, ICCV 2015.
(Diagram: the visual axis is offset from the optical axis by roughly 5 deg.)
23
2: Pupil Center Shift
The pupil center is offset from the iris center, and it moves as the pupil changes size. Average displacements: 8 mm pupil: 0.1 mm nasal and 0.07 mm up; 6 mm pupil: 0.15 mm nasal and 0.08 mm up; 4 mm pupil: 0.2 mm nasal and 0.09 mm up. This is known to cause gaze tracking errors of up to 5 deg in pupil-glint tracking methods.
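The measured displacements can be turned into a continuous model of the shift. Linear interpolation between the three measured diameters is an assumption of this sketch, not part of the published eye model:

```python
import numpy as np

# Average pupil-center displacement from the iris center, from the
# measurements above, indexed by pupil diameter in mm.
diameters = np.array([4.0, 6.0, 8.0])
nasal_mm  = np.array([0.20, 0.15, 0.10])
up_mm     = np.array([0.09, 0.08, 0.07])

def pupil_center_shift(diameter_mm: float) -> tuple[float, float]:
    """Interpolate (nasal, up) pupil-center shift for a pupil diameter."""
    nasal = float(np.interp(diameter_mm, diameters, nasal_mm))
    up = float(np.interp(diameter_mm, diameters, up_mm))
    return nasal, up

shift = pupil_center_shift(5.0)   # midway between the 4 mm and 6 mm rows
```

Note the trend: the smaller the pupil, the larger the shift, which is why tracking error from this effect varies with pupil dilation.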
24
2: Scanned faces
25
2: Combining Eye and Head Models
26
2: Synthetic Model
27
3: Dataset
Rendered with Blender on a multi-GPU cluster.
28
3: Dataset
29
30
4: Region Labels
Pupil, iris, sclera, skin, sclera occluded by skin, glint
31
(Figure: original synthetic image vs. augmented synthetic image, with region-wise and global augmentations; samples of real images shown for comparison.)
32
33
(Diagram: IR camera image is fed to the network, which outputs a gaze vector.)
34
Fully convolutional network. In the reference design, each layer has a stride of 2, no padding, and a 3x3 convolution kernel; a fully connected layer outputs (x, y). Camera image: 640x480.

Layer | Resolution | Channels
Input | 255 x 191 | 1
Conv1 | 127 x 95 | 16
Conv2 | 63 x 47 | 24
Conv3 | 31 x 23 | 36
Conv4 | 15 x 11 | 54
Conv5 | 7 x 5 | 81
Conv6 | 3 x 2 | 122
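The per-layer resolutions (255x191 input down to 3x2 after Conv6) follow directly from the stated design: 3x3 kernels, stride 2, no padding. A quick shape calculation reproduces them:

```python
def conv_out(size: int, kernel: int = 3, stride: int = 2, pad: int = 0) -> int:
    """Output extent of a convolution: floor((size + 2*pad - kernel)/stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

# Walk the 255 x 191 input through Conv1..Conv6 of the reference design.
w, h = 255, 191
shapes = []
for _ in range(6):
    w, h = conv_out(w), conv_out(h)
    shapes.append((w, h))
# shapes == [(127, 95), (63, 47), (31, 23), (15, 11), (7, 5), (3, 2)]
```

Each stride-2 layer roughly halves both dimensions, which is what makes the network cheap enough for sub-millisecond inference.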
35
36
Loss function
37
Accuracy, near-eye display: 2.1 degrees of error on average across real subjects; error is almost evenly distributed across the entire tested visual field; 1.7 degrees best-case accuracy when trained for a single subject.
Accuracy, remote gaze tracking: 8.4 degrees average accuracy (matching the state of the art by Park et al., 2018) but 100x faster.
Latency for gaze estimation: under 1 millisecond for inference and data transfer between CPU and GPU; cuDNN implementation running on a Titan V or Jetson TX2; the bottleneck is camera transfer at 120 Hz.
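A small arithmetic check makes the latency claim concrete: against a 120 Hz camera, the reported sub-millisecond inference bound is a small fraction of the frame period, so the camera, not the network, dominates end-to-end latency.

```python
# Latency budget using the numbers reported above.
camera_hz = 120
frame_period_ms = 1000.0 / camera_hz     # ~8.33 ms between camera frames
inference_budget_ms = 1.0                # reported upper bound for inference + transfer
fraction_of_frame = inference_budget_ms / frame_period_ms   # ~0.12
```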
Gaze Estimation
38
39
Pupil Location Estimation
40
41
Our network is more accurate, more robust, and requires less memory than competing methods.
Pupil Location Estimation
42
Alexander Majercik
43
44
45
46
Esports Research at NVIDIA: 60 ms to get it right
Gaze-contingent rendering, human perception, and esports
48
49
(Network diagram: per-layer channel counts 24, 52, 80, 124, 256, 512, 36.)
50
Key Design Decisions
51
Data-directed approach
56
57
58
59
60
Optimizing the pipeline
61
Merging the input images
Convolution kernel
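One plausible way to merge the input images is to stack the two eye images along the batch dimension, so the same convolution kernels process both eyes in a single kernel launch instead of two. This is a sketch of the idea, not necessarily how the cuDNN implementation concatenates its input:

```python
import numpy as np

# Two single-eye images in NCHW layout (batch, channels, height, width),
# matching the 255 x 191 network input resolution.
left_eye = np.random.rand(1, 1, 191, 255).astype(np.float32)
right_eye = np.random.rand(1, 1, 191, 255).astype(np.float32)

# Merge along the batch axis: one inference pass over a batch of 2
# replaces two separate single-image passes.
merged = np.concatenate([left_eye, right_eye], axis=0)
```

With the timings reported later in the deck, the merged pass (1.022 ms) beats two separate cuDNN passes (2 x 0.748 = 1.496 ms) because the fixed per-launch overhead is paid once.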
66
72
Results

Method | Time (ms)
Single image (Python-based DL framework) | ~6
Single image (cuDNN) | 0.748
Concatenated input (cuDNN) | 1.022
76
77
VR Theater, SJCC Expo Hall 3, Concourse Level
Tuesday: 12:00pm - 7:00pm; Wednesday: 12:00pm - 7:00pm; Thursday: 11:00am - 2:00pm
78
REFERENCES
NVGaze: An Anatomically-Informed Dataset for Low-Latency, Near-Eye Gaze Estimation [Kim '19]
Adaptive Image-Space Sampling for Gaze-Contingent Real-time Rendering [Stengel '16]
Perception-driven Accelerated Rendering [Weier '17]
Visualization and Analysis of Head Movement and Gaze Data for Immersive Video in Head-mounted Displays [Loewe '15]
Subtle Gaze Guidance for Immersive Environments [Grogorick '17]
Towards Virtual Reality Infinite Walking: Dynamic Saccadic Redirection [Sun '18]
79
Michael Stengel (New Experiences Group, mstengel@nvidia.com)
Alexander Majercik (New Experiences Group, amajercik@nvidia.com)
Try out our demo in the Exhibitor Hall!
Dataset and model available at sites.google.com/nvidia.com/nvgaze
80
81
Eye tracking prototype for Augmented Reality glasses
Gaze tracking glasses with vertical/horizontal waveguides: vertical beam splitter, horizontal beam splitter, infrared illumination units
82
3D Reconstruction Result
83
Calibration Method A: using a calibration network layer
Calibration Method B: mapping the 2D pupil center to a 2D screen position
Ring target pattern
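Method B can be sketched as a least-squares fit from pupil coordinates to screen coordinates. A second-order polynomial is a common choice for such mappings; the talk does not specify the exact model, so treat this as illustrative:

```python
import numpy as np

def poly_features(pupil_xy: np.ndarray) -> np.ndarray:
    """Design matrix with terms 1, x, y, xy, x^2, y^2 per sample."""
    x, y = pupil_xy[:, 0], pupil_xy[:, 1]
    return np.stack([np.ones_like(x), x, y, x * y, x**2, y**2], axis=1)

def fit_calibration(pupil_xy: np.ndarray, screen_xy: np.ndarray) -> np.ndarray:
    """Least-squares fit of a quadratic map from pupil to screen space."""
    coeffs, *_ = np.linalg.lstsq(poly_features(pupil_xy), screen_xy, rcond=None)
    return coeffs  # shape (6, 2)

def apply_calibration(coeffs: np.ndarray, pupil_xy: np.ndarray) -> np.ndarray:
    return poly_features(pupil_xy) @ coeffs

# Synthetic calibration data, e.g. pupil centers recorded while the user
# fixates points of a ring target pattern.
rng = np.random.default_rng(0)
pupil = rng.uniform(-1, 1, size=(9, 2))
screen = 100.0 * pupil + 5.0            # synthetic ground-truth mapping
coeffs = fit_calibration(pupil, screen)
pred = apply_calibration(coeffs, pupil)
```

Since the synthetic ground truth is affine, which lies inside the quadratic basis, the fit recovers it exactly up to numerical precision.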
84
Retinal Cone Distribution
[Goldstein,2007]
FOVEATED RENDERING Accelerating Real-time Computer Graphics
85
86
APPLICATION EXAMPLE FOVEATED RENDERING
87
88
ATTENTION ANALYSIS Generating 3D Saliency Information
[Loewe and Stengel et al. ETVIS‘15]