SLIDE 1

Eye Gaze Tracking Using an RGBD Camera: A Comparison with an RGB Solution

Xuehan Xiong (CMU), Qin Cai, Zicheng Liu, Zhengyou Zhang

Microsoft Research, Redmond, WA, USA

zhang@microsoft.com http://research.microsoft.com/~zhang/

The 4th International Workshop on Pervasive Eye Tracking and Mobile Eye-Based Interaction (PETMEI 2014)

SLIDE 2

Outline

  • Goal and motivation
  • Challenges
  • Approach
  • Results
SLIDE 3

Goals and motivations

  • 1. Kinect-based eye tracking
  • 2. Comparison between RGBD and RGB alone
SLIDE 4

Goals and motivations

  • Most commercial eye trackers are IR-based
    • Short range
    • Do not work outdoors
  • Non-IR-based systems
    • Work outdoors
    • Cheaper
    • Easier to integrate
    • Less accurate
SLIDE 5

Outline

  • Motivation
  • Challenges
  • Approach
  • Results
SLIDE 6

Challenges

  • Eye images from IR-based approaches
  • Eye images from Kinect
SLIDE 7

Outline

  • Motivation
  • Challenges
  • Approach
  • Results
SLIDE 8

Approach

  • What is gaze (in our model)?

[Figure: eye model, with optical-axis directions $\mathbf{t}_1, \mathbf{t}_2, \mathbf{t}_3, \mathbf{t}_4$ toward four calibration targets; labels e, p, a, t, v, r as below]

Notation:
  • $\mathbf{p}$ -- pupil
  • $\mathbf{v}$ -- visual axis
  • $\mathbf{t}$ -- optical axis
  • $\mathbf{S}_{vo}$ -- rotation compensation between $\mathbf{v}$ and $\mathbf{t}$: $\mathbf{v} = \mathbf{S}_{vo}\mathbf{t}$
  • $\mathbf{a}$ -- head center
  • $\mathbf{b}_f$ -- offset from head center to eyeball center
  • $\mathbf{S}_{hp}$ -- head rotation
  • $r$ -- eyeball radius
  • Eyeball center: $\mathbf{e} = \mathbf{a} + \mathbf{S}_{hp}\mathbf{b}_f$
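To make the model concrete, here is a minimal sketch (our illustration, not the authors' code; the function name and screen parameterization are assumptions) tracing a gaze point through the notation above: the eyeball center from the head pose, the visual axis from the optical axis, and a ray-plane intersection with the monitor.

```python
import numpy as np

def gaze_point(a, S_hp, b_f, t, S_vo, screen_origin, screen_normal):
    """All quantities in the camera frame; t is the unit optical axis."""
    e = a + S_hp @ b_f            # eyeball center (notation above)
    v = S_vo @ t                  # visual axis from the optical axis
    # Ray-plane intersection: find s with (e + s*v) on the monitor plane
    s = (screen_origin - e) @ screen_normal / (v @ screen_normal)
    return e + s * v
```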
SLIDE 9

Approach

  • What is fixed (in our model)?

(Same eye-model diagram and notation as Slide 8; the fixed parameters are $\mathbf{S}_{vo}$, $\mathbf{b}_f$, and $r$.)
SLIDE 10

Approach

  • What is measured (in our model)?

(Same eye-model diagram and notation as Slide 8; the measured quantities are the head pose, the head center, and the pupil.)
SLIDE 11

Approach

  • System calibration
  • Head pose
  • Head center
  • Pupil
  • User calibration
SLIDE 12

System calibration

  • World = color camera
    • Intrinsic parameters, centered at [0,0,0]
  • Depth camera
    • Intrinsic and extrinsic parameters
  • Monitor screen
    • Screen-camera calibration
SLIDE 13

Screen-camera calibration

  • 4 images capturing the screen + calibration pattern
  • 1 image from the Kinect camera capturing the pattern (a sketch of the pattern-pose step follows)
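As a rough sketch of one step in this procedure (not the authors' implementation), the code below recovers the pattern's pose in the Kinect color camera with OpenCV; chaining this pose with the pattern-to-screen geometry recovered from the four auxiliary images yields the screen-to-camera transform. The pattern size, square size, and intrinsics K are placeholder values.

```python
import cv2
import numpy as np

PATTERN = (9, 6)            # inner chessboard corners (assumed layout)
SQUARE_MM = 25.0            # printed square size in mm (assumed)
K = np.array([[525.0, 0.0, 320.0],    # placeholder Kinect color intrinsics
              [0.0, 525.0, 240.0],
              [0.0, 0.0, 1.0]])

def pattern_pose(image_bgr):
    """Return (R, t) of the chessboard in the camera frame, or None."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if not found:
        return None
    # 3D corner coordinates on the pattern plane (z = 0), in mm
    obj = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    obj[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM
    ok, rvec, tvec = cv2.solvePnP(obj, corners, K, None)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec
```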
SLIDE 14

Calibration results

[Figure: recovered 3D geometry of the cameras and screen, with x, y, z axes]
SLIDE 15

Head pose estimation

  • Build a person-specific 3D face model from the rigid facial points
  • Average over 10 frames
SLIDE 16

Head pose estimation

  • For each frame t: align the de-noised face points to the reference model via Procrustes analysis, recovering the head rotation R and translation T (see the sketch below)

[Figure: red – noisy points; blue – de-noised points, aligned to the reference model]
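The Procrustes alignment has a standard closed-form solution (the Kabsch algorithm); a minimal sketch of that formulation, not taken from the paper:

```python
import numpy as np

def procrustes_rigid(reference, landmarks):
    """Kabsch: find R, T with R @ landmarks_i + T ~= reference_i.
    Both inputs are (N, 3) arrays of corresponding 3D points."""
    mu_ref = reference.mean(axis=0)
    mu_lm = landmarks.mean(axis=0)
    # Cross-covariance of the centered point sets
    H = (landmarks - mu_lm).T @ (reference - mu_ref)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    T = mu_ref - R @ mu_lm
    return R, T
```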
SLIDE 17

Head center

  • The average of 13 landmarks
SLIDE 18

2D Iris detection

SLIDE 19
3D pupil estimation

  • Camera center at $[0, 0, 0]^T$
  • Back-project the detected 2D iris center to $\mathbf{v} = [v, w, g]^T$ using the camera intrinsic parameters
  • Ray direction: $\mathbf{m} = \mathbf{v} / \|\mathbf{v}\|$
  • The 3D pupil $\mathbf{p}$ lies where this ray meets the eyeball sphere (center $\mathbf{e}$, radius $r$)

[Figure: ray from the camera center through the pupil, intersecting the eyeball sphere]
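Geometrically this is a ray-sphere intersection; the following small sketch (our illustration; names are assumptions) takes the nearer root, which is the side of the eyeball facing the camera.

```python
import numpy as np

def pupil_3d(m, e, r):
    """m: unit ray direction from the camera center at the origin;
    e: eyeball center (3,); r: eyeball radius. Returns the 3D pupil or None."""
    # ||s*m - e||^2 = r^2 with ||m|| = 1 gives s^2 - 2*b*s + c = 0
    b = m @ e
    c = e @ e - r * r
    disc = b * b - c
    if disc < 0:
        return None               # ray misses the eyeball sphere
    s = b - np.sqrt(disc)         # nearer intersection: camera-facing side
    return s * m
```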
SLIDE 20

User calibration

  • The user calibration estimates the fixed, person-specific parameters: $\mathbf{S}_{vo}$, $\mathbf{b}_f$, and $r$

(Same eye-model diagram and notation as Slide 8.)

$$\min_{\mathbf{S}_{vo},\,\mathbf{b}_f,\,r} \; \sum_j \left( 1 - (\mathbf{S}_{vo}\mathbf{u}_j)^T \mathbf{w}_j \right)^2$$
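A hedged sketch of this fit follows. We read $\mathbf{u}_j$ as the measured optical-axis direction and $\mathbf{w}_j$ as the unit direction to calibration target $j$; these readings, and the Euler-angle parameterization, are our assumptions. For brevity only $\mathbf{S}_{vo}$ is optimized; the full problem estimates $\mathbf{b}_f$ and $r$ jointly, since they determine the $\mathbf{u}_j$.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def fit_svo(u, w):
    """u, w: (N, 3) arrays of unit vectors. Returns the 3x3 rotation S_vo."""
    def residuals(angles):
        S = Rotation.from_euler("xyz", angles).as_matrix()
        # One residual per calibration target: 1 - (S_vo u_j) . w_j
        return 1.0 - np.einsum("ij,ij->i", u @ S.T, w)
    sol = least_squares(residuals, x0=np.zeros(3))
    return Rotation.from_euler("xyz", sol.x).as_matrix()
```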
SLIDE 21

Outline

  • Motivation
  • Challenges
  • Approach
  • Results
SLIDE 22

Results

  • Simulation
SLIDE 23

Error modeling

  • Assume perfect calibration (system and user)
  • 3 sources of error, each assumed zero-mean Gaussian (see the sketch below):
    • Head pose (degrees)
    • Head center (mm)
    • Pupil (pixels)
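A toy Monte Carlo sketch of how such a simulation might propagate the three noise sources into angular gaze error. The geometry, the pixel-to-mm conversion, and the single-axis pose perturbation are placeholder assumptions, not the paper's simulator.

```python
import numpy as np

rng = np.random.default_rng(0)
var = 0.1                                  # per-source variance (cf. Slide 24)
r = 12.0                                   # eyeball radius in mm (assumed)
e = np.array([0.0, 0.0, 700.0])            # eyeball center in camera frame, mm
p = e + r * np.array([0.0, 0.0, -1.0])     # pupil facing the camera
t_true = (p - e) / r                       # noise-free optical axis

errors = []
for _ in range(10_000):
    e_n = e + rng.normal(0, np.sqrt(var), 3)          # head-center noise (mm)
    p_n = p + rng.normal(0, np.sqrt(var), 3) * 1.0    # pupil noise, ~1 mm/px (assumed)
    t = (p_n - e_n) / np.linalg.norm(p_n - e_n)       # noisy optical axis
    ang = np.deg2rad(rng.normal(0, np.sqrt(var)))     # head-pose noise (deg), about y
    Ry = np.array([[np.cos(ang), 0, np.sin(ang)],
                   [0, 1, 0],
                   [-np.sin(ang), 0, np.cos(ang)]])
    t = Ry @ t
    errors.append(np.degrees(np.arccos(np.clip(t @ t_true, -1.0, 1.0))))

print(f"mean simulated gaze error: {np.mean(errors):.2f} degrees")
```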
SLIDE 24

Simulation results with low variances

  • All three noise variances set to 0.1
SLIDE 25

Back to reality

[Plots: simulated gaze error with variance 0.25 and with variance 0.5]
SLIDE 26

Real Data: Free head movement

[Images: the 9 calibration points on the screen; a subject wearing colored stickers]
SLIDE 27

Experimental setup

  • The monitor measures 520 mm × 320 mm.
  • The distance between a test subject and the Kinect is between 600 mm and 800 mm.
  • Nine subjects participated in the data collection.
  • We collected three training sessions and two test sessions per subject.
SLIDE 28

Best case scenario

SLIDE 29

Training error

[Plots: training error, left eye and right eye]
SLIDE 30

Testing error

[Plots: testing error, left eye and right eye]
SLIDE 31

Testing error (session 2)

[Plots: testing error, left eye and right eye]
SLIDE 32

Sample Results Without Stickers

SLIDE 33

Qin

SLIDE 34

Qin – training error

[Plots: training error, left eye and right eye]
SLIDE 35

Qin – testing error

[Plots: testing error, left eye and right eye]
SLIDE 36

Qin – testing error (session 2)

[Plots: testing error, left eye and right eye]
SLIDE 37

No (little) head movement

SLIDE 38

Best case scenario

SLIDE 39

Training error

[Plots: training error, left eye and right eye]
SLIDE 40

Sample Results Without Stickers

SLIDE 41

Qin

SLIDE 42

Qin – training error

[Plots: training error, left eye and right eye]
SLIDE 43

Qin – testing error

[Plots: testing error, left eye and right eye]
SLIDE 44

Gaze errors on real-world data

Gaze errors in degrees

Average errors: 4.6 degrees with RGBD, and 5.6 degrees with RGB
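For reference, the per-sample gaze error in degrees is the angle between the estimated and ground-truth gaze directions; a small helper of our own:

```python
import numpy as np

def gaze_error_deg(est, gt):
    """est, gt: 3D gaze direction vectors (any length). Angle in degrees."""
    cos = est @ gt / (np.linalg.norm(est) * np.linalg.norm(gt))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```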

SLIDE 45

Lower bound of gaze errors

Gaze errors in degrees

Average errors: 2.1 degrees with RGBD, and 3.2 degrees with RGB

With colored stickers

SLIDE 46

Conclusions

  • Using depth information directly from Kinect provides more accurate gaze estimation than using RGB images alone.
  • The lower bound for gaze error is around 2 degrees with RGBD and 4 degrees with RGB.

  • Future work
  • Better RGBD sensor -> lower gaze error
  • Leverage two eyes

Zhengyou Zhang, Qin Cai, Improving Cross-Ratio-Based Eye Tracking Techniques by Leveraging the Binocular Fixation Constraint, in ETRA 2014.

SLIDE 47

Thank You