Kinect Device How the Kinect Works T2 Subhransu Maji Slides - PowerPoint PPT Presentation

4/27/16

How the Kinect Works
T2
Subhransu Maji
Slides credit: Derek Hoiem, University of Illinois
[Photo: Kinect Device. Frame-grabbed from: http://www.blisteredthumbs.net/2010/11/dance-central-angry-review]

What the Kinect does
[Figure: Kinect Device]
• Get Depth Image
• Estimate Body Pose
• Application (e.g., game)
Illustration source: primesense.com

How Kinect Works: Overview
[Diagram labels: IR Projector, IR Sensor, Projected Light Pattern, Stereo Algorithm, Depth Image, Segmentation and Part Prediction, Body Pose]

Part 1: Stereo from projected dots
1. Overview of depth from stereo
2. How it works for a projector/sensor pair
3. Stereo algorithm used by Primesense (Kinect)

Depth from Stereo Images
[Figure: image 1, image 2, dense depth map]
Some of the following slides are adapted from Steve Seitz and Lana Lazebnik.

Depth from Stereo Images
• Goal: recover depth by finding image coordinate x’ that corresponds to x
[Figure: scene point X at depth z, image points x and x’, focal length f, camera centers C and C’, baseline B]

Stereo and the Epipolar constraint
• Potential matches for x have to lie on the corresponding line l’.
• Potential matches for x’ have to lie on the corresponding line l.

Simplest Case: Parallel images
• Image planes of cameras are parallel to each other and to the baseline
• Camera centers are at same height
• Focal lengths are the same
• Then, epipolar lines fall along the horizontal scan lines of the images

Basic stereo matching algorithm
• For each pixel in the first image
  – Find corresponding epipolar line in the right image
  – Examine all pixels on the epipolar line and pick the best match
  – Triangulate the matches to get depth information

Depth from disparity
[Figure: scene point X at depth z, image points x and x’, focal length f, camera centers O and O’, baseline B]

$$\frac{x - x'}{O - O'} = \frac{f}{z}$$

$$\text{disparity} = x - x' = \frac{B \cdot f}{z}$$

Disparity is inversely proportional to depth.

Basic stereo matching algorithm
• If necessary, rectify the two stereo images to transform epipolar lines into scanlines
• For each pixel x in the first image
  – Find corresponding epipolar scanline in the right image
  – Examine all pixels on the scanline and pick the best match x’
  – Compute disparity x-x’ and set depth(x) = fB/(x-x’)

Correspondence search
[Figure: left and right scanlines; matching cost plotted against disparity]
• Slide a window along the right scanline and compare contents of that window with the reference window in the left image
• Matching cost: SSD or normalized correlation
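The window search and the depth formula above can be sketched for a single rectified scanline. This is a minimal illustration, not the code behind the slides: the function names, the toy intensity values, the focal length of 500 (in pixels) and the baseline of 0.1 (in meters) are all assumptions chosen for the example.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length windows."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def match_scanline(left, right, half_window, max_disparity):
    """Return one disparity per pixel of `left` (None near the borders).

    For each reference window in the left scanline, slide a window along the
    right scanline over shifts d = 0..max_disparity and keep the lowest SSD.
    """
    n = len(left)
    disparities = [None] * n
    for x in range(half_window, n - half_window):
        ref = left[x - half_window : x + half_window + 1]
        best_d, best_cost = None, float("inf")
        for d in range(max_disparity + 1):
            xr = x - d  # candidate position x' in the right scanline
            if xr - half_window < 0:
                break  # window would leave the image
            cand = right[xr - half_window : xr + half_window + 1]
            cost = ssd(ref, cand)
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities[x] = best_d
    return disparities


def depth_from_disparity(d, focal_length, baseline):
    """depth(x) = f * B / (x - x'); undefined for zero or missing disparity."""
    if d is None or d == 0:
        return None
    return focal_length * baseline / d


# Toy rectified pair (assumed values): the right scanline is the left one
# shifted by 3 pixels, so textured pixels should get disparity 3.
left = [0, 0, 0, 0, 5, 9, 2, 7, 1, 8, 3, 6, 0, 0, 0, 0]
right = left[3:] + [0, 0, 0]
disparities = match_scanline(left, right, half_window=2, max_disparity=5)
depths = [depth_from_disparity(d, focal_length=500.0, baseline=0.1) for d in disparities]
```

With these toy values the textured pixels recover a disparity of 3, and halving the disparity would double the computed depth, which is the inverse proportionality stated above.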

Correspondence search
[Figure: left and right scanlines; normalized correlation plotted against disparity]

Results with window search
[Figure: data, window-based matching, ground truth]

Failures of correspondence search
• Textureless surfaces
• Occlusions, repetition
• Non-Lambertian surfaces, specularities

Add constraints and solve with graph cuts
[Figure: before, graph cuts, ground truth]
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001
For the latest and greatest: http://www.middlebury.edu/stereo/
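Two of the failure cases listed above, textureless surfaces and repetition, show up directly in the matching cost curve. The sketch below is an illustration under assumed toy data (the scanline values, the window size and the helper names are not from the slides): it lists which disparities tie for the lowest SSD cost.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length windows."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def cost_curve(left, right, x, half_window, max_disparity):
    """SSD cost for each candidate disparity d at left pixel x."""
    ref = left[x - half_window : x + half_window + 1]
    costs = []
    for d in range(max_disparity + 1):
        xr = x - d
        cand = right[xr - half_window : xr + half_window + 1]
        costs.append(ssd(ref, cand))
    return costs


def best_disparities(left, shift, x=6, half_window=1, max_disparity=4):
    """All disparities that tie for the minimum cost (more than one = ambiguous)."""
    # Hypothetical right scanline: the left one shifted, padded with its last value.
    right = left[shift:] + left[-1:] * shift
    costs = cost_curve(left, right, x, half_window, max_disparity)
    lowest = min(costs)
    return [d for d, c in enumerate(costs) if c == lowest]


textured = [3, 8, 1, 9, 4, 7, 2, 6, 5, 0, 9, 3]
textureless = [5] * 12
repetitive = [1, 9] * 6

unique_match = best_disparities(textured, shift=2)      # one clear minimum
flat_match = best_disparities(textureless, shift=2)     # every shift ties
periodic_match = best_disparities(repetitive, shift=2)  # ties at each period
```

On the textured scanline only the true shift of 2 minimizes the cost. The flat scanline ties at every disparity, and the periodic one ties at every multiple of its period, so a window search alone cannot pick the right match.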

Dot Projections
http://www.youtube.com/watch?v=28JwgxbQx8w

Depth from Projector-Sensor
• Only one image: How is it possible to get depth?
[Figure: projector, sensor, scene surface]

Same stereo algorithms apply
[Figure: projector, sensor]

Example: Book vs. No Book
Source: http://www.futurepicture.org/?p=97

Example: Book vs. No Book
Source: http://www.futurepicture.org/?p=97

Region-growing Random Dot Matching
1. Detect dots (“speckles”) and label them unknown
2. Randomly select a region anchor, a dot with unknown depth
   a. Windowed search via normalized cross correlation along scanline
      – Check that best match score is greater than threshold; if not, mark as “invalid” and go to 2
   b. Region growing
      1. Neighboring pixels are added to a queue
      2. For each pixel in queue, initialize by anchor’s shift; then search small local neighborhood; if matched, add neighbors to queue
      3. Stop when no pixels are left in the queue
3. Stop when all dots have known depth or are marked “invalid”
http://www.wipo.int/patentscope/search/en/WO2007043036

Projected IR vs. Natural Light Stereo
• What are the advantages of IR?
  – Works in low light conditions
  – Does not rely on having textured objects
  – Not confused by repeated scene textures
  – Can tailor algorithm to produced pattern
• What are advantages of natural light?
  – Works outside, anywhere with sufficient light
  – Uses less energy
  – Resolution limited only by sensors, not projector
• Difficulties with both
  – Very dark surfaces may not reflect enough light
  – Specular reflection in mirrors or metal causes trouble

Part 2: Pose from depth
[Diagram labels: IR Projector, IR Sensor, Projected Light Pattern, Stereo Algorithm, Depth Image, Segmentation and Part Prediction, Body Pose]
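The region-growing random dot matching steps listed earlier on this page can be sketched on a toy image pair. This is a simplified illustration, not the patented implementation. It makes several assumptions: every pixel is treated as a "dot", the pattern is random noise, a queued pixel starts from the shift of the pixel that queued it, and the window size, threshold and search range are arbitrary example values.

```python
import random
from collections import deque
from math import sqrt

UNKNOWN, INVALID = None, "invalid"


def window(img, y, x, hw):
    """Flattened (2*hw+1)^2 window centered at (y, x), or None if it leaves the image."""
    h, w = len(img), len(img[0])
    if y - hw < 0 or y + hw >= h or x - hw < 0 or x + hw >= w:
        return None
    return [img[j][i] for j in range(y - hw, y + hw + 1) for i in range(x - hw, x + hw + 1)]


def ncc(a, b):
    """Normalized cross correlation of two windows (0.0 if either is flat)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da, db = [v - ma for v in a], [v - mb for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(p * q for p, q in zip(da, db)) / denom if denom > 0 else 0.0


def best_shift(obs, ref, y, x, hw, candidates):
    """Best (score, shift) matching the observed window at x to the reference at x - d."""
    target = window(obs, y, x, hw)
    best_score, best_d = -1.0, None
    if target is None:
        return best_score, best_d
    for d in candidates:
        cand = window(ref, y, x - d, hw)
        if cand is not None:
            score = ncc(target, cand)
            if score > best_score:
                best_score, best_d = score, d
    return best_score, best_d


def region_grow_match(obs, ref, hw=2, max_shift=6, threshold=0.9, seed=0):
    h, w = len(obs), len(obs[0])
    shift = [[UNKNOWN] * w for _ in range(h)]           # step 1: all unknown
    order = [(y, x) for y in range(h) for x in range(w)]
    random.Random(seed).shuffle(order)                   # random anchor order
    for ay, ax in order:                                 # step 2: pick an anchor
        if shift[ay][ax] is not UNKNOWN:
            continue
        score, d = best_shift(obs, ref, ay, ax, hw, range(max_shift + 1))  # 2a
        if d is None or score <= threshold:
            shift[ay][ax] = INVALID
            continue
        shift[ay][ax] = d
        queue = deque([(ay, ax)])                        # 2b: region growing
        while queue:
            y, x = queue.popleft()
            d0 = shift[y][x]
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if not (0 <= ny < h and 0 <= nx < w) or shift[ny][nx] is not UNKNOWN:
                    continue
                local = range(max(0, d0 - 1), min(max_shift, d0 + 1) + 1)
                score, d = best_shift(obs, ref, ny, nx, hw, local)
                if d is not None and score > threshold:
                    shift[ny][nx] = d
                    queue.append((ny, nx))
    return shift                                         # step 3: none left unknown


# Toy data (assumed): the observed image is the reference pattern moved
# TRUE_SHIFT pixels to the right, with random fill on the left edge.
rng = random.Random(1)
H, W, TRUE_SHIFT = 12, 24, 3
ref = [[rng.random() for _ in range(W)] for _ in range(H)]
obs = [[ref[y][x - TRUE_SHIFT] if x >= TRUE_SHIFT else rng.random() for x in range(W)]
       for y in range(H)]
shifts = region_grow_match(obs, ref)
```

The full scanline search runs only for anchors. Pixels reached by region growing test just three candidate shifts, which is where the speed-up over an exhaustive window search comes from.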

Goal: estimate pose from depth image
[Figure: RGB, depth, part label map, joint positions]
Real-Time Human Pose Recognition in Parts from a Single Depth Image
Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, and Andrew Blake
CVPR 2011
http://research.microsoft.com/apps/video/default.aspx?id=144455

Challenges
• Lots of variation in bodies, orientation, poses
• Needs to be very fast (their algorithm runs at 200 FPS on the Xbox 360 GPU)
[Figures: pose examples; examples of one part]

Extract body pixels by thresholding depth
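Extracting body pixels by thresholding depth can be sketched as a per-pixel range test. The slides do not give the thresholds, so the depth values (assumed to be in millimeters), the near and far limits, and the convention that 0 means "no reading" are all assumptions for illustration.

```python
def body_mask(depth, near, far):
    """True where a pixel's depth lies in [near, far]; 0 is treated as 'no reading'."""
    return [[d != 0 and near <= d <= far for d in row] for row in depth]


# Toy depth image in millimeters (assumed values): a wall at 3 m and a
# person at about 1.5 m, with one missing reading.
depth_mm = [
    [3000, 3000, 3000, 3000],
    [3000, 1500, 1520, 3000],
    [3000, 1490,    0, 3000],
]
mask = body_mask(depth_mm, near=1000, far=2000)
```

A fixed range like this only works when the person is the sole object in that depth band; it is a starting point rather than a full segmentation.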

Basic learning approach
• Very simple features
• Lots of data
• Flexible classifier

Get lots of training data
• Capture and sample 500K mocap frames of people kicking, driving, dancing, etc.
• Get 3D models for 15 bodies with a variety of weight, height, etc.
• Synthesize mocap data for all 15 body types
[Figure: body models]

Features
• Difference of depth at two offsets
  – Offset is scaled by depth at center
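The depth-difference feature described above can be sketched as follows. The form used here, a difference of depths at two probe positions whose pixel offsets are divided by the depth at the center pixel, follows the description on the slide. The rounding of probe positions, the large constant returned for probes that fall off the image, and the toy depth values are assumptions for illustration.

```python
BACKGROUND = 1e6  # assumed large depth returned for probes that leave the image


def probe(depth, y, x):
    """Depth at (y, x), or BACKGROUND when the position is outside the image."""
    h, w = len(depth), len(depth[0])
    if 0 <= y < h and 0 <= x < w:
        return depth[y][x]
    return BACKGROUND


def depth_feature(depth, y, x, u, v):
    """Difference of depth at two probes around (y, x).

    u and v are (dy, dx) offsets; dividing them by the center depth makes the
    probes cover the same physical extent whether the body is near or far.
    Assumes the center depth is positive.
    """
    d = depth[y][x]
    y1, x1 = y + round(u[0] / d), x + round(u[1] / d)
    y2, x2 = y + round(v[0] / d), x + round(v[1] / d)
    return probe(depth, y1, x1) - probe(depth, y2, x2)


# Toy depth image (assumed units of meters): a 3x3 body at depth 2.0 in
# front of a background at depth 4.0.
depth = [
    [4.0, 4.0, 4.0, 4.0, 4.0],
    [4.0, 2.0, 2.0, 2.0, 4.0],
    [4.0, 2.0, 2.0, 2.0, 4.0],
    [4.0, 2.0, 2.0, 2.0, 4.0],
    [4.0, 4.0, 4.0, 4.0, 4.0],
]
edge_response = depth_feature(depth, 2, 2, u=(0, 4), v=(0, -2))
off_image_response = depth_feature(depth, 2, 2, u=(0, 8), v=(0, -2))
```

Here the first probe lands on the background and the second on the body, so the feature responds to the body's right edge. Each feature is only a couple of lookups and one subtraction, which is consistent with the "very simple features" requirement.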

Part prediction with random forests
• Randomized decision forests: collection of independently trained trees
• Each tree is a classifier that predicts the likelihood of a pixel belonging to each part
  – Node corresponds to a thresholded feature
  – The leaf node that an example falls into corresponds to a conjunction of several features
  – In training, at each node, a subset of features is chosen randomly, and the most discriminative is selected

Joint estimation
• Joints are estimated using mean-shift (a fast mode-finding algorithm)
• Observed part center is offset by pre-estimated value

Results
[Figure]

More results
[Figure: ground truth]
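Mean-shift mode finding, as used for joint estimation above, can be sketched with a Gaussian kernel on a handful of 2D points. This is a generic textbook version under assumed toy data and bandwidth, not the weighted 3D variant a real pose pipeline would need.

```python
from math import exp


def mean_shift_mode(points, start, bandwidth, iters=50, tol=1e-6):
    """Move `start` uphill to a nearby density mode of `points`.

    Each step replaces the estimate with the Gaussian-weighted mean of all
    points, so nearby points pull strongly and far outliers barely count.
    """
    x = list(start)
    for _ in range(iters):
        weights = [
            exp(-sum((p - q) ** 2 for p, q in zip(pt, x)) / (2 * bandwidth ** 2))
            for pt in points
        ]
        total = sum(weights)
        new_x = [sum(w * pt[k] for w, pt in zip(weights, points)) / total
                 for k in range(len(x))]
        moved = sum((a - b) ** 2 for a, b in zip(new_x, x)) ** 0.5
        x = new_x
        if moved < tol:
            break
    return x


# Toy data (assumed): pixels voting for one body part cluster around
# (10, 10), plus one misclassified outlier far away.
votes = [(9, 10), (11, 10), (10, 9), (10, 11), (10, 10), (30, 30)]
mode = mean_shift_mode(votes, start=(12, 12), bandwidth=2.0)
plain_mean = [sum(p[k] for p in votes) / len(votes) for k in range(2)]
```

The plain mean is dragged toward the outlier, while the mean-shift estimate settles on the cluster center. That robustness is why a mode finder is preferred when some pixels carry the wrong part label.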

Accuracy vs. Number of Training Examples
[Figure]

Uses of Kinect
• Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
• Robot Control: http://www.youtube.com/watch?v=w8BmgtMKFbY
• Capture for holography: http://www.youtube.com/watch?v=4LW8wgmfpTE
• Virtual dressing room: http://www.youtube.com/watch?v=1jbvnk1T4vQ
• Fly wall: http://vimeo.com/user3445108/kiwibankinteractivewall
• 3D Scanner: http://www.youtube.com/watch?v=V7LthXRoESw

To learn more
• Warning: lots of wrong info on web
• Great site by Daniel Reetz: http://www.futurepicture.org/?p=97
• Kinect patents:
  http://www.faqs.org/patents/app/20100118123
  http://www.faqs.org/patents/app/20100020078
  http://www.faqs.org/patents/app/20100007717

Next week
• Tues
  – ICES forms (important!)
  – Wrap-up, proj 5 results
• Normal office hours + feel free to stop by other times on Tues, Thurs
  – Try to stop by instead of e-mail except for one-line answer kind of things
• Final project reports due Thursday at midnight
• Friday
  – Final project presentations at 1:30pm
  – If you’re in a jam for final project, let me know early
