DEEP UNCONSTRAINED GAZE ESTIMATION WITH SYNTHETIC DATA

SLIDE 1

Shalini De Mello, Rajeev Ranjan, Jan Kautz

DEEP UNCONSTRAINED GAZE ESTIMATION WITH SYNTHETIC DATA

SLIDE 2

NVIDIA AI CO-PILOT

SLIDE 3

APPLICATIONS

INTERFACE DESIGN, AR/VR, ACCESSIBILITY

SLIDE 4

TRADITIONAL GAZE TRACKERS

Diagram of the eye and tracker geometry. Labels: light source; cornea; pupil; sclera; fovea; Cc; CE; angles Θ, Φ; point of regard; X-, Y- and Z-axes.

SLIDE 5

TRADITIONAL GAZE TRACKER

SLIDE 6

REMOTE, CHEAP, UNCONSTRAINED GAZE TRACKING

SLIDE 7

RESOLUTION

CHALLENGES

Unconstrained gaze tracking

SLIDE 8

RESOLUTION

CHALLENGES

Unconstrained gaze tracking

LIGHTING

SLIDE 9

RESOLUTION

CHALLENGES

Unconstrained gaze tracking

LIGHTING, SUBJECT VARIABILITY

SLIDE 10

RESOLUTION

CHALLENGES

Unconstrained gaze tracking

LIGHTING, HEAD ROTATION, SUBJECT VARIABILITY

SLIDE 11

APPEARANCE-BASED GAZE ESTIMATION*

Gaze tracking

*Zhang et al., IEEE CVPR 2015. *Krafka et al., IEEE CVPR 2016.

SLIDE 12

LABELED DATA COLLECTION

Gaze tracking

SLIDE 13

LABELED DATA COLLECTION

Gaze tracking

Indoors only, occlusion, cumbersome

SLIDE 14

LABELED DATA COLLECTION

Gaze tracking

SLIDE 15

LABELED DATA COLLECTION

Gaze tracking

Indoors only, limited head rotations and gazes

SLIDE 16

GPU TO THE RESCUE

Gaze tracking, Computer Graphics, Deep Learning

SLIDE 17

GPU TO THE RESCUE

Gaze tracking, Computer Graphics, Deep Learning

SLIDE 18

SYNTHETIC DATA

SLIDE 19

SYNTHETIC IMAGES

Head models

SLIDE 20

COMPUTER GRAPHICS EYE MODEL*

*Wood et al., IEEE ICCV 2015.

SLIDE 21

1 MILLION SYNTHETIC IMAGES

SLIDE 22

GAZE CNN ARCHITECTURE

Zhang et al., 2015 (5 layers)

Gaze pitch, Gaze yaw, Head pitch, Head yaw

SLIDE 23

SYNTHETIC IMAGES

AUTHOR             DATA              ERROR (°)
Wood et al., 2015  UT Multiview 1M   9.68
Wood et al., 2016  UnityEyes 1M      9.95
Wood et al., 2015  SynthesEyes 12K   8.94
Ours               SynthesEyes 1M    7.74

Results on MPII data
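
The errors on this slide are angular errors in degrees. A minimal sketch of how such an error could be computed from predicted and ground-truth (pitch, yaw) pairs is given below. The angle-to-vector convention and all function names are assumptions made for illustration; the slides do not state which convention was used.

```python
import math

def angles_to_vector(pitch, yaw):
    """Convert (pitch, yaw) in radians to a 3D unit gaze vector.

    Assumed convention (not from the slides): zero pitch and yaw
    points along -z, i.e. straight back toward the camera.
    """
    return (
        -math.cos(pitch) * math.sin(yaw),
        -math.sin(pitch),
        -math.cos(pitch) * math.cos(yaw),
    )

def angular_error_deg(pred, true):
    """Angle in degrees between two gaze directions given as (pitch, yaw)."""
    a = angles_to_vector(*pred)
    b = angles_to_vector(*true)
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # clamp to guard against rounding error
    return math.degrees(math.acos(dot))

def mean_angular_error_deg(preds, trues):
    """Average angular error over paired lists of (pitch, yaw) tuples."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, trues)]
    return sum(errors) / len(errors)
```

Averaging `angular_error_deg` over a test set, as `mean_angular_error_deg` does, yields a single number comparable in kind to the ERROR column, provided the same angle convention is used for predictions and labels.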

SLIDE 24

EYE POINTS CNN ARCHITECTURE

Trained with 1M synthetic data

x1, y1, …, xn, yn

SLIDE 25

EYE FIDUCIAL POINTS

Results on MPII gaze data

SLIDE 26

GAZE ESTIMATION NETWORK

SLIDE 27

GAZE CNN ARCHITECTURE

Render for CNN (Su et al., 2015)

SLIDE 28

GAZE CNN ARCHITECTURE

Zhang et al., 2015 (5 layers)

Gaze pitch, Gaze yaw, Head pitch, Head yaw

SLIDE 29

GAZE CNN ARCHITECTURE

Ours (8 layers)

Gaze pitch, Gaze yaw, Head pitch, Head yaw

Render for CNN

SLIDE 30

GAZE CNN ARCHITECTURE

NETWORK    INITIALIZATION                         ERROR (°)
LeNet      Random                                 5.57
AlexNet    ImageNet (object recognition)          5.03
ResNet-50  ImageNet (object recognition)          5.07
Ours       Render for CNN (viewpoint estimation)  4.4

Results on 1M synthetic data

SLIDE 31

GAZE CNN ARCHITECTURE

Inputs and outputs

Gaze pitch, Gaze yaw, Head pitch, Head yaw

Render for CNN

SLIDE 32

GAZE CNN ARCHITECTURE

INPUT           OUTPUT       ERROR (°)
Eye             Eye-in-head  5.05
Eye             Gaze         5.66
Eye, head pose  Eye-in-head  4.4
Eye, head pose  Gaze         4.4

Results on 1M synthetic data
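
The table distinguishes two output targets, eye-in-head rotation and gaze. One plausible reading, sketched below, is that gaze is the eye-in-head direction rotated by the head pose. This relation, the angle convention, the rotation order (pitch, then yaw) and every function name here are assumptions for illustration; the slides do not define them.

```python
import math

def angles_to_vector(pitch, yaw):
    """(pitch, yaw) in radians to a unit vector; zero angles map to -z (assumed)."""
    return (-math.cos(pitch) * math.sin(yaw),
            -math.sin(pitch),
            -math.cos(pitch) * math.cos(yaw))

def vector_to_angles(v):
    """Inverse of angles_to_vector for a unit vector."""
    x, y, z = v
    pitch = math.asin(max(-1.0, min(1.0, -y)))
    yaw = math.atan2(-x, -z)
    return pitch, yaw

def rotate_by_head_pose(v, head_pitch, head_yaw):
    """Rotate v by head pitch (about x), then head yaw (about y)."""
    x, y, z = v
    y1 = math.cos(head_pitch) * y + math.sin(head_pitch) * z
    z1 = -math.sin(head_pitch) * y + math.cos(head_pitch) * z
    x2 = math.cos(head_yaw) * x + math.sin(head_yaw) * z1
    z2 = -math.sin(head_yaw) * x + math.cos(head_yaw) * z1
    return (x2, y1, z2)

def gaze_from_eye_in_head(eye_pitch, eye_yaw, head_pitch, head_yaw):
    """Gaze (pitch, yaw) from eye-in-head angles and head pose."""
    eye_dir = angles_to_vector(eye_pitch, eye_yaw)
    return vector_to_angles(rotate_by_head_pose(eye_dir, head_pitch, head_yaw))
```

Under these assumptions, a zero head rotation leaves the eye-in-head angles unchanged, and a pure head yaw adds directly to a pure eye yaw.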

SLIDE 33

HEAD ROTATION

Positive head yaw, zero head yaw, negative head yaw

Eye appearance

SLIDE 34

HEAD ROTATION

Gaze distribution (1M synthetic data)

Plot axes: gaze yaw, gaze pitch; clusters labeled 1 to 7

SLIDE 35

HEAD ROTATION

Plots: head pose (pose yaw vs. pose pitch, axis ticks at 0.5 and 1) and gaze (gaze yaw vs. gaze pitch); clusters labeled 1 to 5

Gaze distribution (45k MPII data)

SLIDE 36

HEAD ROTATION

Head pose separation

Gaze pitch Gaze yaw Gaze pitch Gaze yaw … Head pitch Head yaw cluster 1 cluster n

Render for CNN

Head yaw Head pitch

SLIDE 37

HEAD ROTATION

INPUT           CNN             ERROR (°)
Eye             single fc7-8    5.66
Eye             branched fc7-8  5.18
Eye, head pose  single fc7-8    4.4
Eye, head pose  branched fc7-8  4.26

Results on 1M synthetic data
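
The branched variant sends each sample to a head-pose-specific set of final layers. The sketch below shows one way such routing could work, using nearest-centroid assignment in (head yaw, head pitch) space. The centroid values, the Euclidean nearest-centroid rule, and the names `CENTROIDS`, `nearest_cluster` and `predict_gaze` are hypothetical; the slides do not say how clusters were defined or assigned.

```python
import math

# Hypothetical cluster centres as (head yaw, head pitch) in degrees.
# The actual centres used in the talk are not given on the slides.
CENTROIDS = [(-20.0, 0.0), (0.0, 0.0), (20.0, 0.0)]

def nearest_cluster(head_yaw, head_pitch, centroids=CENTROIDS):
    """Return the index of the centroid closest to the given head pose."""
    return min(
        range(len(centroids)),
        key=lambda i: math.hypot(head_yaw - centroids[i][0],
                                 head_pitch - centroids[i][1]),
    )

def predict_gaze(shared_features, head_yaw, head_pitch, branches):
    """Route shared features to the branch of the matching head-pose cluster.

    `branches` is a list of callables, one per cluster, standing in for
    the per-cluster fc7-8 layers named in the table above.
    """
    k = nearest_cluster(head_yaw, head_pitch)
    return branches[k](shared_features)

# Usage with placeholder branches that only report which one was chosen.
branches = [lambda f, k=k: ("branch", k, f) for k in range(len(CENTROIDS))]
result = predict_gaze([0.1, 0.2], head_yaw=18.0, head_pitch=-3.0,
                      branches=branches)
```

Here the sample with head yaw 18° and head pitch -3° lies closest to the third centroid, so the shared features go to the branch with index 2.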

SLIDE 38

GAZE CNN ARCHITECTURE

Plot: error (°, axis from 3.5 to 5) vs. head pose clusters (1 to 7); series: Single, Branched. Additional axes: head yaw, head pitch.

Results on 1M synthetic data

SLIDE 39

CNN ARCHITECTURE

Skip connections

Gaze pitch Gaze yaw Gaze pitch Gaze yaw … Head pitch Head yaw cluster 1 cluster n +

Render for CNN

SLIDE 40

GAZE CNN ARCHITECTURE

INPUT           CNN                               ERROR (°)
Eye, head pose  single fc7-8                      4.4
Eye, head pose  branched fc7-8                    4.26
Eye, head pose  branched fc7-8, skip connections  4.15

Results on 1M synthetic data

SLIDE 41

REAL DATA

Columbia

Gaze pitch Gaze yaw Gaze pitch Gaze yaw … Head pitch Head yaw cluster 1 cluster n +

Render for CNN

SLIDE 42

GAZE ERROR

Columbia

Bar chart of error (°) for Wood et al., 2015, Our CNN, and Our CNN with Synthetic data, grouped into All and No Glasses. Bar values as extracted: 6.68, 5.65, 7.54, 6.26, 5.58.

SLIDE 43

REAL DATA

MPII gaze

Gaze pitch Gaze yaw Gaze pitch Gaze yaw … Head pitch Head yaw cluster 1 cluster n +

Render for CNN

SLIDE 44

GAZE ERROR

MPII gaze

METHOD                       ERROR (°)
Zhang et al., 2015           6.3
Our CNN                      5.85
Our CNN with Synthetic data  5.58
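
The relative improvement implied by the three MPII gaze errors on this slide can be checked with a few lines of arithmetic. The sketch below assumes that the values 6.3, 5.85 and 5.58 correspond, in legend order, to Zhang et al., 2015, Our CNN, and Our CNN with Synthetic data; the function name is illustrative.

```python
def relative_reduction(baseline_error, new_error):
    """Fractional reduction in error relative to a baseline."""
    return (baseline_error - new_error) / baseline_error

# Errors in degrees, read off the slide in legend order (assumed mapping).
zhang_2015 = 6.3
our_cnn = 5.85
our_cnn_synthetic = 5.58

cnn_vs_zhang = relative_reduction(zhang_2015, our_cnn)
synthetic_vs_zhang = relative_reduction(zhang_2015, our_cnn_synthetic)
synthetic_vs_cnn = relative_reduction(our_cnn, our_cnn_synthetic)
```

Under that mapping, the CNN alone reduces the error by roughly 7.1% relative to Zhang et al., 2015, and the version trained with synthetic data reduces it by roughly 11.4%.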

SLIDE 45

CONCLUSION

SLIDE 46

RESOLUTION

CHALLENGES

Unconstrained gaze tracking

LIGHTING, HEAD ROTATION, SUBJECT VARIABILITY

SLIDE 47

GPU TO THE RESCUE

Unconstrained gaze tracking, Computer Graphics, Deep Learning