Gesture Recognition, Adrian Kündig, adkuendi@student.ethz.ch



slide-1
SLIDE 1

Informatik II

Gesture Recognition

Adrian Kündig

adkuendi@student.ethz.ch

Saturday, 27 April 2013

slide-2
SLIDE 2

The beginning of gesture-based interfaces

slide-3
SLIDE 3

Gesture Recognition

§ 1970 Myron W. Krueger and VideoPlace

http://www.inventinginteractive.com/2010/03/22/myron-krueger/ http://sofa23.net/index.php?m=1&sm=&t=23&sp=18&spic=43&me=show%20all&s=

One of the first VR prototypes
Uses cameras for recognition
Simple ideas

slide-4
SLIDE 4

Gesture Recognition

(Baudel and Beaudouin-Lafon, 1993)

§ 1970 Myron W. Krueger and VideoPlace § 1993 Charade

4 Samstag, 27. April 13

First formal definition of gestures Control PowerPoint dataglove 4 line = fingers, 1 line = thumb

slide-5
SLIDE 5

Gesture Recognition

§ 1970 Myron W. Krueger and VideoPlace § 1993 Charade

(Baudel and Beaudouin-Lafon, 1993)

5 Samstag, 27. April 13

Selection of gestures

slide-6
SLIDE 6

Gesture Recognition

§ 1970 Myron W. Krueger and VideoPlace § 1993 Charade § 2002 Minority Report

http://7thperbmmrblog.blogspot.ch/2011/01/william-bermudez.html http://thomaspmbarnett.com/globlogization/2013/2/5/times-battleland-terrorism-minority-report-has-finally-arriv.html

6 Samstag, 27. April 13

Hollywood movie from Steven Spielberg Rooted in Research from John Underkoffmer “like conducting an orchestra” tom cruise

slide-7
SLIDE 7

Gesture Recognition

§ 1970 Myron W. Krueger and VideoPlace § 1993 Charade § 2002 Minority Report § 2009 Oblong Industries

7 Samstag, 27. April 13

Last step in our history of gesture based interfaces Commercial company founded by John Underkoffmer developed g-speak Intended for big data analysis Requires specialized applications

slide-8
SLIDE 8

Oblong Industries - Demo

http://oblong.com/g-speak/

Orientation in 3D
Selection
Segmentation

slide-10
SLIDE 10

Common Factor

http://www.5dt.com/DataGloveImages.html

Most of the systems shown have one thing in common: the data glove
Hand tracking
Hand reconstruction
Feedback

slide-11
SLIDE 11

How can we get rid of the Data Glove?

Free up the hands
Remove instrumentation

slide-12
SLIDE 12

Muscle Computer Interface

§ Hands-free gestures while holding an object
§ Arm-band-like design
§ Sensing muscle activity

(Saponas et al, 2009)

Hands free
Muscle sensing

slide-13
SLIDE 13

Muscle Computer Interface - Technology

http://painmd.tv/wp-content/uploads/2011/04/emg-muscle-configuration.gif

(Saponas et al, 2009)

EMG (electromyography) is used primarily in medical therapy (muscle function assessment, controlling prosthetics)
An action potential is generated by the muscle when a signal arrives from a motor neuron
Measured invasively by inserting a needle into the muscle
Measured non-invasively by sensing on the skin

slide-14
SLIDE 14

Muscle Computer Interface - Technology

http://www.emgsrus.com/graphics/emg_trial_rect_page.png

(Saponas et al, 2009)

Measured activity shown here
6 different muscles
Peaks of action potentials

slide-15
SLIDE 15

Muscle Computer Interface - Technology

http://www.nature.com/gimo/contents/pt1/fig_tab/gimo32_F2.html

Support Vector Machine

(Saponas et al, 2009)

§ Root mean square
§ Frequency energy
§ Phase coherence

6 sensors and 2 ground electrodes
Features extracted from a 31 ms sample:

  • Root mean square (RMS) of the amplitude per channel, and the RMS ratio of each pair of channels: sqrt((x1^2 + x2^2 + ... + xn^2) / n)
  • Frequency energy via FFT
  • Relationship between channels (phase coherence)

Classified by an SVM into gestures
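
The RMS features from the notes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from Saponas et al.: the function names `rms` and `rms_features`, the channel layout, and the sample values are made up for the example.

```python
import math
from itertools import combinations


def rms(samples):
    """Root mean square of one channel: sqrt((x1^2 + ... + xn^2) / n)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_features(window):
    """Per-channel RMS plus the RMS ratio of every pair of channels.

    `window` is a list of channels, each a list of amplitude samples
    (hypothetical layout, chosen for this sketch). A silent channel
    (RMS of zero) would need special handling before dividing.
    """
    per_channel = [rms(channel) for channel in window]
    ratios = [per_channel[i] / per_channel[j]
              for i, j in combinations(range(len(per_channel)), 2)]
    return per_channel + ratios


# Two made-up channels with four samples each.
window = [[3.0, -3.0, 3.0, -3.0], [1.0, -1.0, 1.0, -1.0]]
features = rms_features(window)  # two RMS values followed by one ratio
```

With six channels, as in the notes, this layout would yield 6 RMS values and 15 pairwise ratios per window.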

slide-16
SLIDE 16

Support Vector Machines

§ Binary linear classifier
§ Extended to multiple classes

https://en.wikipedia.org/wiki/File:Kernel_Machine.png

A function phi transforms the feature space so that a hyperplane can be placed between two classes
The separator is placed so that the separation between the classes is as clear as possible
Multiple classes are handled one-vs-rest or pairwise (one-vs-one)
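
The role of phi can be illustrated with one-dimensional points that no single threshold separates. In this toy sketch the mapping x -> (x, x^2) makes them separable by a straight line. The data, the name `phi`, and the hand-picked separator are assumptions for illustration; no SVM is trained here.

```python
def phi(x):
    """Map a 1-D input into a 2-D feature space: x -> (x, x^2)."""
    return (x, x * x)


def classify(x, w=(0.0, 1.0), b=-2.0):
    """Linear decision in the transformed space: sign(w . phi(x) + b).

    The weights are hand-picked for this toy data set, not learned.
    """
    fx = phi(x)
    score = w[0] * fx[0] + w[1] * fx[1] + b
    return 1 if score > 0 else -1


# Class -1 sits between the two groups of class +1, so no single
# threshold on x can separate the classes in the original space.
points = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, -1, -1, -1, 1, 1]
predictions = [classify(x) for x in points]
```

For k gesture classes, one-vs-rest needs k such binary classifiers, while one-vs-one needs k(k-1)/2.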

slide-17
SLIDE 17

Muscle Computer Interface - Demo

(Saponas et al, 2009)

Guitar Hero: input is sent as soon as the user's two fingers touch

slide-19
SLIDE 19

Muscle Computer Interface

§ Pro
  § No instrumentation of the hand
  § Hidden near the elbow

§ Contra
  § Inaccurate compared to some of the following papers
  § Muscle activity required

(Saponas et al, 2009)

79% accuracy

slide-20
SLIDE 20

Gesture Wrist

§ Hands free gestures § Embed sensing device in wrist watch § Feedback on gesture

(Rekimoto, 2001)

18 Samstag, 27. April 13

slide-21
SLIDE 21

Gesture Wrist - Technology

[Figure: Gesture Wrist hardware. Labels: original wristwatch dial, receiver electrodes, transmitter electrode, tilt sensor (ADXL202), wrist, piezo-actuator, LPF, AD converter, analog switch, transmitter, receiver, wave signal]

§ Wave signal is transmitted
§ The receivers are synchronized
§ The received strength is proportional to the distance

(Rekimoto, 2001)

The actuator vibrates
The capacitance between the wrist and the receiver electrodes is measured, which reflects the distance between wristband and wrist

slide-22
SLIDE 22

Gesture Wrist

§ Distinguish ‘Point’ and ‘Fist’ pose

(Rekimoto, 2001)

Gesture Wrist - Technology

Clear difference between point and fist
Only these two poses are used to differentiate gestures

slide-23
SLIDE 23

Gesture Wrist - Examples

§ Distinguish ‘Point’ and ‘Fist’ pose § Combined with an accelerometer § Rotation also recognizable

(Rekimoto, 2001)

21 Samstag, 27. April 13

Only two gestures used to difgerentiate gestures Use rotation to control slider or knob
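
One way rotation could drive a slider is to turn the gravity reading of a two-axis accelerometer into an angle and normalize it. The sketch below is an assumption-laden illustration, not Rekimoto's implementation: the function name `tilt_to_slider`, the axis convention, and the angle range are hypothetical.

```python
import math


def tilt_to_slider(ax, ay, min_deg=-90.0, max_deg=90.0):
    """Map a 2-axis gravity reading to a slider position in [0, 1].

    ax, ay: acceleration along two sensor axes, in units of g
    (hypothetical convention: ay points along gravity when the wrist is level).
    """
    angle = math.degrees(math.atan2(ax, ay))   # rotation angle of the wrist
    angle = max(min_deg, min(max_deg, angle))  # clamp to the usable range
    return (angle - min_deg) / (max_deg - min_deg)


rest = tilt_to_slider(0.0, 1.0)    # wrist level: middle of the slider
rolled = tilt_to_slider(1.0, 0.0)  # rotated a quarter turn: end of the slider
```

Clamping keeps the slider at its end stops when the wrist is rotated beyond the chosen range.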

slide-24
SLIDE 24

Gesture Wrist

§ Pro
  § Small, watch-like design
  § Sensor embedded inside an accessory
  § Simple recognition method

§ Contra
  § Only a small set of gestures can be recognized

(Rekimoto, 2001)

slide-25
SLIDE 25

Hand Shape with Wrist Contour

§ Hands free gestures § Wrist watch like design

(Fukui et al, 2011)

23 Samstag, 27. April 13

slide-26
SLIDE 26

Hand Shape with Wrist Contour - Technology

§ Static wrist band
§ Photo reflectors
§ Senses the distance between band and skin

[Figure labels: infrared signal, 2.5 mm]

(Fukui et al, 2011)

150 sensors

slide-27
SLIDE 27

Hand Shape with Wrist Contour - Demo

(Fukui et al, 2011)

Static image representing the gesture

slide-29
SLIDE 29

Hand Shape with Wrist Contour - Examples

(Fukui et al, 2011)

The recognized gesture set
Some gestures are quite similar

slide-30
SLIDE 30

Hand Shape with Wrist Contour - Accuracy

(Fukui et al, 2011)

Confusion matrix: the entries are widely spread
Boosting and k-NN methods, both rather simple
The diagonal holds the correctly recognized gestures
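
Reading the diagonal as "correctly recognized" gives a direct way to compute the overall accuracy from a confusion matrix. The counts below are made up for illustration and are not data from Fukui et al.; the function name `accuracy` is likewise an assumption.

```python
def accuracy(confusion):
    """Overall accuracy: correctly recognized samples (diagonal) / all samples.

    confusion[i][j] counts samples of true gesture i recognized as gesture j.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Made-up counts for three gestures.
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [3, 3, 4],
]
overall = accuracy(confusion)  # 18 correct out of 30 samples
```

The more the counts spread away from the diagonal, the lower this value becomes.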

slide-31
SLIDE 31

Hand Shape with Wrist Contour

§ Pro
  § Small, watch-like design
  § Can be hidden inside an accessory
  § New approach to gesture recognition

§ Contra
  § Bad recognition rate
  § Limited set of gestures

(Fukui et al, 2011)

slide-32
SLIDE 32

Digits

§ Recover full 3D hand model
§ Cheap hardware
§ Low power

(Kim et al, 2012)

Already partly presented by Professor Hilliges in the introduction of the seminar
More sophisticated
Imitates a data glove

slide-33
SLIDE 33

Digits - Technology

3D Laser Triangulation, Background Subtraction, CCL & Tracking, Hand Pose Recovery

(Kim et al, 2012)

We use a number of image processing techniques to segment and track five discrete points on the fingers.
Knowing the camera and laser position, we can triangulate 3D positions from this information.
And finally we use a kinematics model to recover the full hand configuration.
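
The triangulation step can be pictured as intersecting a camera ray with the known laser plane. The sketch below assumes a pinhole camera at the origin and a laser plane given in camera coordinates; the function name `triangulate` and all calibration values are hypothetical and not taken from Kim et al.

```python
def _dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def triangulate(u, v, f, cx, cy, plane_normal, plane_point):
    """Intersect the camera ray through pixel (u, v) with the laser plane.

    Pinhole camera at the origin looking along +z, focal length f in pixels,
    principal point (cx, cy). The plane is given by a normal vector and one
    point on it, both in camera coordinates (assumed calibration values).
    """
    ray = ((u - cx) / f, (v - cy) / f, 1.0)
    denom = _dot(plane_normal, ray)
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the laser plane")
    t = _dot(plane_normal, plane_point) / denom
    return tuple(t * c for c in ray)


# Laser plane y = 0.02 + 0.2 * z, i.e. emitted 2 cm below the camera and
# tilted upward (made-up geometry, in metres).
point = triangulate(u=320, v=365, f=500.0, cx=320.0, cy=240.0,
                    plane_normal=(0.0, 1.0, -0.2),
                    plane_point=(0.0, 0.02, 0.0))
```

Each tracked laser point in the image yields one such 3D position, which a kinematic hand model can then be fitted to.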

slide-38
SLIDE 38

Digits - Examples

(Kim et al, 2012)

Accurate

slide-39
SLIDE 39

Digits - Demo

(Kim et al, 2012)

Shooting, grabbing, pulling

slide-41
SLIDE 41

Digits

§ Pro
  § Portable
  § Internal processing
  § Accurate replacement for a data glove

§ Contra
  § As obtrusive as a data glove
  § Occlusion is a major problem

(Kim et al, 2012)

slide-42
SLIDE 42

Towards bimanual gestures

The previous papers all tried to reconstruct a model of the hand, more or less accurately.
In the next paper we see a move away from reconstruction: the second hand is used for input and the first hand as a trigger.

slide-43
SLIDE 43

Gesture Watch

§ Contact-free interface
§ Unobtrusive

(Kim et al, 2007)

The device recognizes gestures of the other hand
The arm wearing the device is used to initiate the gesture

slide-44
SLIDE 44

Gesture Watch - Technology

Sensor signal Recognized gesture

(Kim et al, 2007)

4 proximity sensors arranged in a cross, plus 1 facing the hand for initiating a gesture
Binary (0/1) sensors
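
With binary sensors in a cross, a swipe shows up as sensors firing one after another while the trigger sensor is active. The notes do not describe the recognizer, so the rule below is only a simple stand-in; the sensor names, the frame layout, and the function `swipe_direction` are assumptions.

```python
SENSORS = ("up", "down", "left", "right")  # hypothetical sensor names


def frame(trigger, up, down, left, right):
    """Bundle one sample of the five binary (0/1) sensors."""
    return {"trigger": trigger, "up": up, "down": down,
            "left": left, "right": right}


def swipe_direction(frames):
    """Guess a swipe from a sequence of binary sensor frames.

    Frames without an active trigger are ignored. The swipe is read off
    the first and the last cross sensor that fired, e.g. "left->right".
    Returns None if fewer than two different sensors fired.
    """
    order = []
    for sample in frames:
        if not sample["trigger"]:
            continue
        for name in SENSORS:
            if sample[name] and name not in order:
                order.append(name)
    if len(order) < 2:
        return None
    return f"{order[0]}->{order[-1]}"


frames = [
    frame(0, 1, 0, 0, 0),  # trigger inactive: ignored
    frame(1, 0, 0, 1, 0),  # hand enters over the left sensor
    frame(1, 0, 0, 0, 0),
    frame(1, 0, 0, 0, 1),  # hand leaves over the right sensor
]
gesture = swipe_direction(frames)
```

Gating on the trigger sensor is what lets the wearing arm start a gesture and keeps accidental activations from being interpreted.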

slide-45
SLIDE 45

Gesture Watch - Examples

(Kim et al, 2007)

Proposed gestures

slide-46
SLIDE 46

Gesture Watch

§ Pro
  § Unobtrusive design
  § Sensors embedded
  § Contact free
  § Private

§ Contra
  § Requires action from the second hand to start a gesture

(Kim et al, 2007)

Private: the gesture is hidden from other people

slide-47
SLIDE 47

What if we could eliminate all instrumentation?

So far, instrumentation of the user is still required
Eliminating it would free the hands
It would also be cheaper

slide-48
SLIDE 48

Sound Wave

§ No instrumentation of user § Reusing existing hardware

(Gupta et al, 2012)

40 Samstag, 27. April 13

Reuses speakers and microphone from an existing laptop

slide-49
SLIDE 49

Sound Wave - Technology

  • Figure 2 shows the frequency of the signal (a)

(Gupta et al, 2012)

Doppler effect
Emitted sound: 18 - 22 kHz
Input is sampled, then transformed with an FFT
22.05 kHz spectrum divided into 33 bins
Bins are scanned until the amplitude drops below 10%
A second scan continues until 30% away from the pilot tone
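
The first scan in the notes can be sketched as walking away from the pilot-tone bin until the amplitude falls below 10% of the pilot peak. This is an illustrative reading of the notes, not the code from Gupta et al.: the function name `shift_extent`, the use of 33 as a scan limit, and the spectrum values are assumptions.

```python
def shift_extent(spectrum, pilot_bin, threshold=0.10, max_bins=33):
    """Count how far the pilot peak is broadened on each side.

    Scans away from the pilot bin until the amplitude drops below
    `threshold` times the pilot amplitude, or until `max_bins` bins
    have been scanned. Returns (bins_below, bins_above).
    """
    limit = threshold * spectrum[pilot_bin]

    def scan(step):
        count = 0
        i = pilot_bin + step
        while (0 <= i < len(spectrum) and count < max_bins
               and spectrum[i] >= limit):
            count += 1
            i += step
        return count

    return scan(-1), scan(+1)


# Made-up magnitude spectrum: pilot tone at index 5, energy smeared towards
# higher frequencies, as when a hand moves towards the microphone.
spectrum = [0.0, 0.0, 0.01, 0.02, 0.05, 1.0, 0.6, 0.3, 0.15, 0.04, 0.0]
below, above = shift_extent(spectrum, pilot_bin=5)
```

A larger count above the pilot tone than below it suggests motion towards the device, and the reverse suggests motion away from it.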

slide-51
SLIDE 51

Sound Wave - Demo

(Gupta et al, 2012)

Wake up and sleep automatically
Control a media player

slide-53
SLIDE 53

Sound Wave

§ Pro
  § No instrumentation of the user
  § Accurate results
  § Even in noisy environments

§ Contra
  § Base tone may be audible

(Gupta et al, 2012)

slide-54
SLIDE 54

All sensors need a network

To conclude, we have a look at a completely different paper that discusses how the body itself can be used as a network for communication

slide-55
SLIDE 55

Gesture Pad

§ The body as touch interface
§ The body as network
§ The body as transceiver

(Rekimoto, 2001)

Taken from the Gesture Wrist paper (the capacitance-sensing wrist sensor)
Devices communicate among themselves
Send data to the (touched) outside world
Humantenna, inverted

slide-56
SLIDE 56

Gesture Pad

[Figure: Gesture Pad layer configurations A, B and B', each built from transmitter, receiver, shield layer and fabric layers next to the body]

(Rekimoto, 2001)

A: Transmitter/receiver multiplexed
B: Shield layer separates transmitter from receiver

slide-57
SLIDE 57

Gesture Pad

§ Further ideas
  § Use NFC transceivers inside pads
  § Identify the person touching by their signal

(Rekimoto, 2001)

slide-58
SLIDE 58

Comparison

Each system is compared by mobility, accuracy, instrumentation and main application.

Muscle Computer Interface
  § Mobility: designed for mobile use, data sent via WiFi/BT
  § Accuracy: 65% (busy hand, no feedback, 4 fingers); 91% (busy hand, feedback, 3 fingers)
  § Instrumentation: an arm band at the upper forearm
  § Main application: gesture recognition with busy hands

Gesture Wrist (capacitive sensing)
  § Mobility: designed for mobile use, data sent via body network
  § Accuracy: N/A
  § Instrumentation: wrist-watch-like utility
  § Main application: hand shape recognition, authentication

Wrist Shape (photosensors)
  § Mobility: designed for mobile use, offline processing at the moment
  § Accuracy: 45-48%
  § Instrumentation: wrist-watch-like utility
  § Main application: hand shape recognition

Digits (3D reconstruction)
  § Mobility: designed for mobile use, data sent via WiFi/BT
  § Accuracy: 91%, varying from finger to finger
  § Instrumentation: small camera worn on a wrist band
  § Main application: reconstructing a 3D model of the hand

Gesture Watch (in air over hand)
  § Mobility: designed for mobile use, data sent via WiFi/BT
  § Accuracy: 95%
  § Instrumentation: wrist-watch-like utility
  § Main application: simple gesture recognition using one hand

Sound Wave (in air over laptop)
  § Mobility: bound to the laptop
  § Accuracy: 90-95%
  § Instrumentation: none, uses existing hardware
  § Main application: add simple gesture recognition to a laptop

Different aspects that might be required of a gesture-based interface

slide-59
SLIDE 59

Summary and Future Technology

§ Today
  § Gesture recognition is feasible
  § Accuracy varies widely
  § Integration is still complicated

§ In the future we ...
  § need to control unobtrusively
  § can authenticate with an accessory
  § wear touchable clothing
  § use the body as a network

slide-60
SLIDE 60

Vaporware!?
Commercial from Myo
A preview of what gesture interaction could look like

slide-62
SLIDE 62

“Any sufficiently advanced technology is indistinguishable from magic.”

Arthur C. Clarke