Detecting Object Manipulations in an Assembly Task - Jacob Rosensköld - PowerPoint PPT Presentation



SLIDE 1

Detecting Object Manipulations in an Assembly Task

Jacob Rosensköld mas15jro@student.lu.se

SLIDE 2

Agenda

  • Introduction
  • Methodology
  • YOLO – Object detection
  • OpenPose – Pose estimation
  • Hand activity recognition
  • Determine which object is being manipulated
  • Discussion
  • Future work
SLIDE 3

Introduction

  • Detecting object manipulations
  • Extract information from an assembly video
SLIDE 4

Methodology

  • 1. Object detection (input: video frame)
  • 2. Pose estimation (input: video frame)
  • 3. Hand activity detection (input: a sliding window of hand key points from the 32 latest frames)
  • 4. Use a depth camera to determine which object is being manipulated
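The 32-frame sliding window in step 3 can be sketched with a bounded buffer. This is a minimal illustration, not the author's code; the keypoint format ((x, y) pairs for 5 fingertips) is an assumption.

```python
from collections import deque

WINDOW = 32          # number of latest frames fed to the classifier
N_KEYPOINTS = 5      # only the 5 fingertips are used

# Bounded buffer: appending frame 33 automatically drops frame 1.
window = deque(maxlen=WINDOW)

def push_frame(fingertips):
    """Add one frame of (x, y) fingertip coordinates to the window."""
    assert len(fingertips) == N_KEYPOINTS
    window.append(fingertips)
    return len(window) == WINDOW  # True once the window is full

# Simulate 40 incoming frames of dummy key points.
for t in range(40):
    ready = push_frame([(t, t)] * N_KEYPOINTS)

print(len(window))   # 32: only the latest frames are kept
print(window[0][0])  # (8, 8): frame 8 is now the oldest
```

Once `ready` is true, the window's contents can be classified on every new frame without re-buffering.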

SLIDE 5
  • 1. YOLO – Object detection
  • "You only look once"
SLIDE 6
  • 1. YOLO - training
  • About 1000 annotated images.
  • Image augmentation, increased to about 6000 images.
  • Mean average precision of 99%.
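The roughly six-fold dataset growth from augmentation can be illustrated with simple flips and rotations. The specific transforms below are assumptions, since the slides do not say which augmentations were used; images are stand-in 2D lists.

```python
def hflip(img):
    """Mirror each row (horizontal flip)."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Produce the original plus 5 variants (6 images total)."""
    variants = [img, hflip(img)]
    r = img
    for _ in range(3):
        r = rot90(r)
        variants.append(r)          # 90, 180, 270 degree rotations
    variants.append(hflip(rot90(img)))
    return variants

dataset = [[[1, 2], [3, 4]] for _ in range(1000)]  # stand-in for ~1000 annotated images
augmented = [v for img in dataset for v in augment(img)]
print(len(augmented))  # 6000
```

Each annotated bounding box would of course have to be transformed alongside its image.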
SLIDE 7
  • 2. OpenPose – Pose estimation
  • Estimate body, foot, face and hand key points.
  • 21 key points per hand.
SLIDE 8
  • 3. Hand activity recognition
  • Classify sequences of 32 frames.
  • Each frame contains hand key points from OpenPose.
  • Only the 5 fingertips are used.
  • Two classes: grip and drop.
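In the OpenPose hand model the 21 key points are indexed 0 (wrist) through 20, with the fingertips at indices 4, 8, 12, 16 and 20. A sketch of reducing a full hand to the 5 fingertips used here (coordinates are dummy values):

```python
# Fingertip indices in the 21-point OpenPose hand model:
# 4 = thumb, 8 = index, 12 = middle, 16 = ring, 20 = little finger.
FINGERTIPS = [4, 8, 12, 16, 20]

def fingertips_only(hand_keypoints):
    """Reduce 21 (x, y) hand key points to the 5 fingertips."""
    assert len(hand_keypoints) == 21
    return [hand_keypoints[i] for i in FINGERTIPS]

# Dummy hand: key point i sits at (i, i).
hand = [(i, i) for i in range(21)]
print(fingertips_only(hand))  # [(4, 4), (8, 8), (12, 12), (16, 16), (20, 20)]
```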
SLIDE 9
  • 3. Hand activity recognition - architecture

  • LSTM: 10 units, input dropout 0.3
  • Dropout: 0.5
  • LSTM: 10 units
  • Dense layer with softmax
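The layer stack on this slide could be expressed in Keras roughly as follows. This is an architecture sketch under assumed dimensions (32 frames per sequence, 5 fingertips × (x, y) = 10 features per frame, two output classes); `dropout=0.3` on the first LSTM plays the role of the input dropout.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of the slide's architecture, under assumed input dimensions.
model = keras.Sequential([
    layers.Input(shape=(32, 10)),                         # 32 frames x 10 features
    layers.LSTM(10, dropout=0.3, return_sequences=True),  # input dropout 0.3
    layers.Dropout(0.5),
    layers.LSTM(10),
    layers.Dense(2, activation="softmax"),                # grip vs. drop
])
```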

SLIDE 10
  • 3. Hand activity recognition - training
  • About 1300 sequences, 70% training, 15% validation, 15% testing
  • 2000 epochs, batch size of 256
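With roughly 1300 sequences, the 70/15/15 split works out to about 910 training, 195 validation and 195 test sequences. A quick check (the split code is illustrative, not the author's):

```python
n = 1300
train = n * 70 // 100   # integer arithmetic avoids float rounding
val = n * 15 // 100
test = n - train - val  # remainder goes to the test set
print(train, val, test)  # 910 195 195
```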
SLIDE 11
  • 3. Hand activity recognition - results

Class     Precision   Recall   F1-score
Grip      0.89        0.95     0.92
Drop      0.94        0.87     0.91
Average   0.92        0.91     0.91

SLIDE 12

Demo 1

SLIDE 13
  • 4. Determine which object is being manipulated
  • Depth camera.
  • Look for sufficiently close objects.
  • Could add more cues, e.g. checking whether the object moves in the same direction as the hand.
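Step 4's proximity test can be sketched as a 3D distance check between a hand key point and detected object centers, with z taken from the depth camera. The threshold, coordinate format and labels below are assumptions for illustration.

```python
import math

# Hypothetical threshold (in meters) for "sufficiently close".
MAX_DISTANCE = 0.10

def closest_object(hand_xyz, objects):
    """Return the label of the nearest object within MAX_DISTANCE, else None.

    hand_xyz: (x, y, z) of a hand key point, z from the depth camera.
    objects:  list of (label, (x, y, z)) detections.
    """
    best = None
    best_d = MAX_DISTANCE
    for label, obj_xyz in objects:
        d = math.dist(hand_xyz, obj_xyz)
        if d <= best_d:
            best, best_d = label, d
    return best

objects = [("screwdriver", (0.50, 0.20, 0.80)),
           ("bolt",        (0.52, 0.21, 0.82))]
print(closest_object((0.51, 0.20, 0.81), objects))  # screwdriver
```

Combining this with the hand-activity classifier yields "grip/drop of object X" events.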

SLIDE 14

Demo 2

SLIDE 15

Discussion

  • Does not work that well when tested on videos containing many activities and movements.
  • Detects activities with high confidence even when the person is just moving his hands around.
  • Grip and drop are very subtle activities.
SLIDE 16

Discussion

  • Incorrect classifications.
  • Grip and drop are similar.
  • Many of the frames in the sequences for both grip and drop are just the hands moving.

SLIDE 17

Discussion

  • The system is made up of a number of chained subsystems.
  • Every subsystem adds some uncertainty to the final result.
SLIDE 18

Discussion

  • It’s possible to detect subtle movements in 2D, even though improvements are required.

SLIDE 19

Future work

  • Train the neural network without using fixed-size frame sequences.
  • Stateful LSTM.
  • Include depth of the hand key points in the data.
  • Would make it easier to detect subtle movements.
SLIDE 20

Future work

  • Add more classes.
  • E.g. a screwing motion.
  • Would give a hint of how much the similarity between grip and drop affects the results.

SLIDE 21

Thank you!

Questions?