
Human-Robot Interaction

Elective in Artificial Intelligence Lecture 7 – RGBD Perception

Luca Iocchi DIAG, Sapienza University of Rome, Italy

With contributions from A. Youssef and M.T. Lazaro


Outline

  • RGBD sensors and applications
  • Detectors for people detection
  • RGB processing / OpenCV
  • Example 1: RGB face/body detection
  • ROS
  • Depth processing
  • Example 2: RGBD face detection / depth segmentation / virtual buttons
  • Conclusions

Depth cameras

Color (RGB) + Depth (D) information improves the efficiency and robustness of image processing.

Mostly used in video games, but also useful in HCI and HRI.

Stereo vision


Stereo Triangulation


Z = b f / (uL – uR)

where Z is the depth of the point, b the baseline between the two cameras, f the focal length, and (uL – uR) the disparity between the projections of the point in the left and right images.

Active RGBD cameras

Capture color and depth. An active infra-red light pattern allows them to work with poor or no texture. Depth computation:

  • Stereo triangulation
  • Time of flight

Work in indoor (dark) environments.


Active infra-red pattern

http://wiki.ros.org/kinect_calibration/technical


RGBD Vision

Kinect and RGBD Images: Challenges and Applications. Luiz Velho. IMPA


RGBD Vision

http://robotics.ait.kyushu-u.ac.jp/~kurazume/


Point Clouds

http://pointclouds.org/documentation/tutorials/ground_based_rgbd_people_detection.php


Depth resolution

Krystof Litomisky, Consumer RGB-D Cameras and their Applications


Application: 3D mapping

Krystof Litomisky, Consumer RGB-D Cameras and their Applications


Application: skeleton tracking

SkeletalViewer, Microsoft Kinect SDK


Application: augmented reality

SkeletalViewer, Microsoft Kinect SDK


RGBD Image processing


Efficiency and robustness

Pipeline: depth segmentation → image processing → depth filtering → output.

  • Depth segmentation selects the region of interest before image processing
  • Depth filtering removes false positives from the output


Software Libraries

ROS + drivers + libraries (OpenCV) handle image acquisition; your app runs on top.

Platform: Linux, C++


Installation

  • ROS (includes OpenCV): www.ros.org
  • OpenNI2: https://github.com/occipital/openni2
  • thin_drivers: https://bitbucket.org/ggrisetti/thin_drivers

Note: a complete image for Raspberry Pi 3 is available!

Introduction to OpenCV

http://opencv.org/

  • OpenCV (Open Source Computer Vision) is a library of programming functions for real-time computer vision.
  • BSD licensed: free for commercial use
  • C++, C, Python and Java (Android) interfaces
  • Supports Windows, Linux, Android, iOS and Mac OS
  • More than 2500 optimized algorithms

Modules for Image Processing

  • core - a compact module defining basic data structures, including the dense multi-dimensional array Mat and basic functions used by all other modules.
  • imgproc - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.
  • features2d - salient feature detectors, descriptors, and descriptor matchers.
  • highgui - an easy-to-use interface to video capturing, image and video codecs, as well as simple UI capabilities.


#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/features2d/features2d.hpp>

Data types

Set of primitive data types the library can operate on

  • uchar: 8-bit unsigned integer
  • schar: 8-bit signed integer
  • ushort: 16-bit unsigned integer
  • short: 16-bit signed integer
  • int: 32-bit signed integer
  • float: 32-bit floating-point number
  • double: 64-bit floating-point number


Introduction to OpenCV

Image representation in OpenCV: cv::Mat is an n-dimensional array.
http://docs.opencv.org/modules/core/doc/basic_structures.html#mat

gray scale vs color image


Mat (Header vs Data)

Mat A, C;          // creates just the header parts
A = imread(argv[1], CV_LOAD_IMAGE_COLOR); // allocates the matrix (method known here)
Mat B(A);          // copy constructor (copy by reference)
C = A;             // assignment operator (copy by reference)
Mat D = A.clone(); // creates a new matrix D with data copied from A
Mat E;             // creates the header for E with no data
A.copyTo(E);       // sets the data for E (copied from A)


MATLAB style initializer

Mat E = Mat::eye(4, 4, CV_64F);
cout << "E = " << endl << " " << E << endl << endl;
Mat Z = Mat::zeros(3, 3, CV_8UC1);
cout << "Z = " << endl << " " << Z << endl << endl;
Mat O = Mat::ones(2, 2, CV_32F);
cout << "O = " << endl << " " << O << endl << endl;


How to scan gray scale images

cv::Mat I = ...;
for (int i = 0; i < I.rows; ++i) {
  for (int j = 0; j < I.cols; ++j) {
    uchar g = I.at<uchar>(i, j);
    ...
  }
}

How to scan RGB images

cv::Mat I = ...;
for (int i = 0; i < I.rows; ++i) {
  for (int j = 0; j < I.cols; ++j) {
    uchar blue  = I.at<cv::Vec3b>(i, j)[0];
    uchar green = I.at<cv::Vec3b>(i, j)[1];
    uchar red   = I.at<cv::Vec3b>(i, j)[2];
  }
}


Full body detection in images

Histogram of Oriented Gradient (HOG)

  • It was introduced by Navneet Dalal and Bill Triggs in 2005 [1]
  • Sliding window technique for people detection in images
  • Captures both shape and appearance
  • HOG is a feature descriptor:

➢ Dense feature extraction ➢ Local overlapping blocks ➢ Trained classifier (Support Vector Machine, SVM)


Full body detection in images

Histogram of Oriented Gradient (HOG)

  • Gradient computation
  • Orientation binning
  • Descriptor block
  • Normalization

C++

void HOGDescriptor::detectMultiScale(const Mat& img,
    vector<Rect>& found_locations, double hit_threshold=0,
    Size win_stride=Size(), Size padding=Size(),
    double scale0=1.05, int group_threshold=2)


Full body detection in images

HOG in OpenCV

#include <opencv2/objdetect/objdetect.hpp>

HOGDescriptor hog; // standard descriptor
hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());
vector<Rect> found; // where to save the detected persons
hog.detectMultiScale(img, found, 0, Size(8,8), Size(32,32), 1.05, 2);

http://mccormickml.com/2013/05/09/hog-person-detector-tutorial/


Face detection in images

Viola-Jones implementation of the OpenCV library (using Haar-like cascades). OpenCV comes with a trainer as well as a detector, and already contains many pre-trained classifiers for face, eyes, smile, etc.

Cascade Classifier in OpenCV

C++

void CascadeClassifier::detectMultiScale(const Mat& image,
    vector<Rect>& objects, double scaleFactor=1.1,
    int minNeighbors=3, int flags=0,
    Size minSize=Size(), Size maxSize=Size())


Face detection in images

Cascade Classifier in OpenCV

#include <opencv2/objdetect/objdetect.hpp>

String face_cascade_trained = "haarcascade_frontalface_alt.xml";
CascadeClassifier face_cascade;
face_cascade.load(face_cascade_trained);
vector<Rect> faces;
face_cascade.detectMultiScale(frame_gray, faces, 1.1, 2,
    0|CV_HAAR_SCALE_IMAGE, Size(30, 30));


Image-to-world Conversion

u = (u, v, 1): homogeneous vector of a pixel in image coordinates.
P: perspective projection matrix.
M = (X, Y, Z, 1): homogeneous vector of real-world coordinates.

Camera model: u ∝ P M, with intrinsic matrix

K = | fx  0  cx |
    |  0  fy cy |
    |  0  0   1 |

The camera intrinsics are published on the ROS topics /camera/depth/camera_info and /camera/rgb/camera_info.


Image-to-world Conversion

Using the depth camera intrinsics, each pixel (x_d, y_d) of the depth camera can be projected to metric 3D space using the following formula:

P3D.x = (x_d - cx_d) * depth(x_d, y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d, y_d) / fy_d
P3D.z = depth(x_d, y_d)

with fx_d, fy_d, cx_d and cy_d the intrinsics of the depth camera. We can then re-project each 3D point on the color image and get its color:

P3D' = R.P3D + T
P2D_rgb.x = (P3D'.x * fx_rgb / P3D'.z) + cx_rgb
P2D_rgb.y = (P3D'.y * fy_rgb / P3D'.z) + cy_rgb

with R and T the rotation and translation parameters estimated during the stereo calibration.


Robot Operating System - ROS

  • ROS is a middleware for efficient data exchange
  • Based on publish/subscribe and event-based paradigms
  • Topics are identified by name and type of message
  • Each node can publish topics and subscribe to topics
  • All nodes subscribed to a topic are notified when a node publishes data on that topic

www.ros.org


ROS publish/subscribe

In our examples, thin_drivers nodes publish data (i.e., images) on topics whenever they are captured from the devices. Application nodes subscribe to these topics and are notified when images are ready. The application logic is implemented as callback functions activated upon the arrival of data on a topic.


ROS subscribers

int main(int argc, char **argv) {
  ros::init(argc, argv, "rgbd_viewer");
  ros::NodeHandle nh;
  ros::Subscriber rgb_sub =
      nh.subscribe("…/rgb/image_raw", 1, &rgbCB);     // 1 = buffer size, rgbCB = callback
  ros::Subscriber depth_sub =
      nh.subscribe("…/depth/image_raw", 1, &depthCB);
  ros::spin();
  return 0;
}


ROS callbacks

void rgbCB(const sensor_msgs::ImageConstPtr& msg) { … }
void depthCB(const sensor_msgs::ImageConstPtr& msg) { … }


Depth Processing

void depthCB(const sensor_msgs::ImageConstPtr& msg) {
  cv_bridge::CvImagePtr cv_ptr = cv_bridge::toCvCopy(msg, msg->encoding);
  cv::Mat depthImage = cv_ptr->image;
  for (int i = 0; i < depthImage.rows; ++i) {
    for (int j = 0; j < depthImage.cols; ++j) {
      short z = depthImage.at<short>(i, j); // depth in mm
      …
    }
  }
}


ROS Synchronization

ros::spin(); // runs until ROS node shutdown

Callback functions are called asynchronously whenever data (images) are published by the publisher node. For example, if images are acquired at 30 frames per second, callbacks are called at 30 Hz.

If time-consuming computation is needed, the callback should just copy the data to memory and the time-consuming method must be invoked by another thread (e.g., the main thread).


ROS Synchronization

ros::spinOnce(); // runs only one ROS step

Callback functions for data available at this time are called only once.

Example: ROS callbacks and processing functions called at a frame rate of at most 10 Hz:

ros::Rate r(10.0);
while (ros::ok()) {
  ros::spinOnce();  // runs the callbacks
  proc.runOnce();   // runs the processing
  r.sleep();
}


ROS Synchronization

int main(int argc, char **argv) {
  // Set subscribers
  ros::Rate r(10.0);
  while (ros::ok()) { // true until node shutdown
    ros::spinOnce();
    proc.runOnce();
    r.sleep();
  }
  …
}


ROS Synchronization

Processor proc;

void rgbCB(const sensor_msgs::ImageConstPtr& msg) {
  proc.cv_rgb_ptr = cv_bridge::toCvCopy(msg, sensor_msgs::image_encodings::BGR8);
}

void depthCB(const sensor_msgs::ImageConstPtr& msg) {
  proc.cv_depth_ptr = cv_bridge::toCvCopy(msg, msg->encoding);
}


ROS Synchronization

class Processor {
public:
  cv_bridge::CvImagePtr cv_rgb_ptr, cv_depth_ptr;

  void runOnce() {
    // process cv_rgb_ptr->image … and cv_depth_ptr->image …
  }
};


Example 1

RGBD Viewer https://bitbucket.org/iocchi/rgbd_viewer

  • Image display
  • Normalization for depth image


Example 2

RGBD person detection, depth segmentation, virtual buttons https://bitbucket.org/iocchi/rgbd_person_detection

  • Processor
  • display
  • faceDetection
  • depthSegmentation
  • virtualButtons

Conclusions

  • RGBD computation improves image processing tasks, especially for person detection
  • Many applications relevant for HRI and HCI
  • Many ideas for projects
  • Advanced user input through person feature detection and recognition
  • Augmented reality