Embedded Multi-Person Pedestrian Tracking and Detection MSCV19

slide-1
SLIDE 1

Embedded Multi-Person Pedestrian Tracking and Detection

MSCV19 Capstone Project, Internal(CMU)

Team Member: Yongxin Wang, Chunhui Liu Advisor: Dr. Kris Kitani 05/03/2019

slide-2
SLIDE 2

Introduction

  • Motivation

○ Multi-person pedestrian tracking ○ Real-time performance on embedded systems ○ Visual analysis, autonomous driving, robotics

  • Problem

○ Detect and track multiple people ○ Deal with new objects, out-of-view objects, occlusion, large appearance changes

  • Solution

○ Track by detection - SiameseRPN (Single Object) ○ Multiple object extension

slide-3
SLIDE 3

Past, Present, Future

Past:

January:

  • Start from Single Object SiamRPN

March:

  • Train/Finetune Single Object SiamRPN on VOT dataset

April:

  • Single Object SiamRPN with RoI Align
  • Multi Object SiamRPN Baseline

Present:

  • Single-Object SiameseRPN with Region of Interest (RoI) Align
  • Multi Object SiamRPN Distractor

Future:

September 15:

  • Finish RoI Align verification
  • Merge Multi-Object SiamRPN with RoI Align

October 15:

  • Data Association & NMS

October 31:

  • Integrate object detection to handle new objects

December 15:

  • Optimize and deploy algorithm on NVIDIA Jetson Machine

slide-4
SLIDE 4
  • Single Object SiamRPN

○ Implement Training Code & Verify ○ Finetune on VOT

  • RoI Align For Single Object SiamRPN

○ Implement Code ○ Train and Verify on VOT

  • Multi Object SiamRPN

○ Baseline Model ○ Multi Object Evaluation Code

Past Present Future

slide-5
SLIDE 5

Past: Single Object SiamRPN

[Figure: SiamRPN architecture. The template and the image each pass through AlexNet, giving a Template Feature (6, 6, 256) and an Image Feature (22, 22, 256). Conv layers then produce, for the classification branch, Template Features (4, 4, 2k × 256) and Image Features (20, 20, 256), and for the regression branch, Template Features (4, 4, 4k × 256) and Image Features (20, 20, 256). Outputs: CLS Score (FG/BG) (17, 17, 2k) and Bounding Box (x, y, w, h) (17, 17, 4k).]

Li, Bo et al. "High Performance Visual Tracking with Siamese Region Proposal Network." 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.
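The feature sizes on this slide are consistent with unpadded ("valid") sliding-window arithmetic: 6 to 4 and 22 to 20 for the conv layers, and 20 to 17 when a 4×4 template kernel is slid over the image features. The sketch below is a minimal NumPy illustration, assuming 3×3 unpadded convolutions with stride 1 (the kernel size is inferred from the shapes, not stated on the slide). The names `valid_size` and `xcorr_valid` are illustrative only.

```python
import numpy as np

def valid_size(n, kernel, stride=1):
    """Output length of an unpadded (valid) sliding-window operation."""
    return (n - kernel) // stride + 1

def xcorr_valid(feature, kernel):
    """Valid cross-correlation of an (H, W, C) map with an (h, w, C) kernel."""
    H, W, _ = feature.shape
    h, w, _ = kernel.shape
    out = np.empty((valid_size(H, h), valid_size(W, w)))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Dot product between the kernel and the window at (i, j).
            out[i, j] = np.sum(feature[i:i + h, j:j + w, :] * kernel)
    return out

# Assumed 3x3 unpadded convs: template 6 -> 4, image 22 -> 20.
template_side = valid_size(6, 3)
image_side = valid_size(22, 3)

rng = np.random.default_rng(0)
response = xcorr_valid(
    rng.standard_normal((image_side, image_side, 256)),
    rng.standard_normal((template_side, template_side, 256)),
)
print(response.shape)
```

In the network itself the template branch yields 2k (or 4k) such kernels, so the same operation produces the (17, 17, 2k) and (17, 17, 4k) outputs.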

slide-6
SLIDE 6
  • Re-implementing training code for Siamese RPN (training & testing)

○ Official repository only has testing code ○ Sanity check of training process ■ Finetuned from pretrained model (trained with VID) on VOT dataset

  • RoI Align for Single Object SiamRPN - Need for SPEED

Past: Single Object SiamRPN

slide-7
SLIDE 7

Past: Single Object SiamRPN

Model | Pretrained | Finetune | Test Data | EAO ↑
DaSiamRPN (Official, SOTA) | YoutubeBB + ImageNet VID | - | VOT 2015 | 0.446
SiamRPN | ImageNet VID | VOT 2015 (First 40 sequences) | VOT 2015 (First 40 sequences) | 0.5240
SiamRPN RoI | ImageNet VID | VOT 2015 (First 40 sequences) | VOT 2015 (First 40 sequences) | 0.6045
SiamRPN (with location & size penalty) | ImageNet VID | - | VOT 2015 | 0.3426
SiamRPN | ImageNet VID | - | VOT 2015 | 0.2647
SiamRPN | | - | VOT 2015 | IP
SiamRPN RoI | | - | VOT 2015 | IP

slide-8
SLIDE 8

Past: Single Object SiamRPN

[Figure: qualitative tracking results. Legend: Red - SiamRPN (finetuned); Blue - SiamRPN RoI (finetuned); Black - DaSiamRPN; Green - Ground Truth]

slide-9
SLIDE 9

Past: Multi Object Tracking

[Figure: multi-object tracking pipeline. Templates and Frame T (255, 255, 3) each pass through a CNN, giving a Template Feature (6, 6, 256) and a Frame Feature (22, 22, 256). Conv layers then produce Template Features (4, 4, 2k×256) and (4, 4, 4k×256) and Frame Features (20, 20, 256) for each branch. Outputs: Cls Score (FG/BG) (17, 17, 2k) and Bounding Box (x, y, w, h) (17, 17, 4k), followed by NMS + Data Association. A Template Adapter decides how to update the templates for the next frame.]

  • From Single Object Tracking to Multiple Object Tracking:

○ A network that can handle several templates ○ NMS & Data Association for matching labels ○ Decide when to add and delete templates
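The NMS step in this pipeline can be illustrated with the classical greedy algorithm. This is a minimal NumPy sketch, not the project's implementation: it assumes boxes are given as corner coordinates (x1, y1, x2, y2) rather than the (x, y, w, h) layout on the slide, and the 0.5 IoU threshold is an arbitrary example value.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it, repeat."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Only boxes that do not overlap the kept box too much survive.
        order = rest[iou(boxes[best], boxes[rest]) <= iou_threshold]
    return keep

# Two near-duplicate proposals for one pedestrian plus one separate pedestrian.
boxes = np.array([[10, 10, 50, 90], [12, 12, 52, 92], [100, 20, 140, 100]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
print(kept)
```

The second box overlaps the first with an IoU of roughly 0.86, so it is suppressed, while the third box does not overlap and is kept.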

slide-11
SLIDE 11

  • Baseline Idea:

○ Pre-compute correlation filters for each template ○ All templates share the RPN network to do tracking independently

  • Introduce Communication among templates (1)

○ Concatenate all correlation filters as a bigger filter ○ Re-train RPN network to perform multi-object classification

  • Introduce Communication among templates (2)

○ Add Distractor-aware loss and fine-tune RPN

Past: Multi Object Extension

slide-12
SLIDE 12

Network 0: Baseline (Pretrained Weight)

[Figure: baseline network. Templates and Frame T (255, 255, 3) each pass through a CNN, giving Template Feature (n, 6, 6, 256) → Template Feature (6, 6, 256) and Frame Feature (22, 22, 256). Conv layers then produce Template Features (4, 4, 2k×256) and (4, 4, 4k×256) and Frame Features (20, 20, 256) for each branch. Outputs: Cls Score (FG/BG) (17, 17, 2k) and Bounding Box (x, y, w, h) (17, 17, 4k).]

n: number of templates
k: number of anchors for each spatial pixel
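The baseline idea, in which every template keeps its own pre-computed correlation filter and all of them are applied independently to one frame feature map, can be sketched as a batched valid cross-correlation. This is a simplified NumPy illustration under stated assumptions: each template contributes a single (4, 4, 256) kernel instead of the 2k or 4k kernels of the real heads, and `track_independently` is a hypothetical helper name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def track_independently(kernels, frame_feat):
    """Correlate n pre-computed template kernels with one frame feature map.

    kernels:    (n, h, w, C), one correlation filter per template
    frame_feat: (H, W, C), features of the current frame, computed once
    returns:    (n, H-h+1, W-w+1), one response map per template
    """
    h, w = kernels.shape[1:3]
    # windows has shape (H-h+1, W-w+1, C, h, w)
    windows = sliding_window_view(frame_feat, (h, w), axis=(0, 1))
    return np.einsum("ijcab,nabc->nij", windows, kernels)

rng = np.random.default_rng(0)
n = 3  # number of templates (tracked people)
kernels = rng.standard_normal((n, 4, 4, 256))
frame_feat = rng.standard_normal((20, 20, 256))
responses = track_independently(kernels, frame_feat)
print(responses.shape)
```

Because the frame features are computed once and reused, the per-template cost is only the correlation itself; the templates do not interact, which is the limitation the later networks try to address.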

slide-13
SLIDE 13

Visualization Results (MOT Dataset)

slide-14
SLIDE 14

Visualization Response


slide-15
SLIDE 15

  • Baseline Idea:

○ Pre-compute correlation filters for each template ○ All templates share the RPN network to do tracking independently

  • Introduce Communication among templates (1)

○ Concatenate all correlation filters as a bigger filter ○ Re-train RPN network to perform multi-object classification

  • Introduce Communication among templates (2)

○ Add Distractor-aware loss and fine-tune RPN

Past: Multi Object SiamRPN

slide-16
SLIDE 16

Network 1: Abandoned

[Figure: Network 1. Templates and Frame T (255, 255, 3) each pass through a CNN, giving Template Feature (n, 6, 6, 256) → Template Feature (6, 6, 256n) and Frame Feature (22, 22, 256). Conv layers then produce Template Features (4, 4, nk×256) and (4, 4, 4nk×256) and Frame Features (20, 20, 256) for each branch. Outputs: Cls Score (FG/BG) (17, 17, (n+1)k) and Bounding Box (x, y, w, h) (17, 17, 4k).]

n: number of templates
k: number of anchors for each spatial pixel

slide-17
SLIDE 17
  • Single Object SiamRPN

○ Training from scratch ○ Verifying Effect of RoI

  • Multi Object SiamRPN

○ Try to fix Distractor Issue

Past Present Future

slide-18
SLIDE 18

  • Baseline Idea:

○ Pre-compute correlation filters for each template ○ All templates share the RPN network to do tracking independently

  • Introduce Communication among templates (1)

○ Concatenate all correlation filters as a bigger filter ○ Re-train RPN network to perform multi-object classification

  • Introduce Communication among templates (2)

○ Add Distractor-aware loss and fine-tune RPN

Present: Multi Object SiamRPN

slide-19
SLIDE 19

Network 2: Softmax (Pretrained Weight)

[Figure: Network 2. Templates and Frame T (255, 255, 3) each pass through a CNN, giving Template Feature (n, 6, 6, 256) and Frame Feature (22, 22, 256). The RPN produces Cls Scores (FG/BG) (17, 17, 2k), one per template, and a Bounding Box (x, y, w, h) (17, 17, 4k). A SoftMax combines the Cls Scores into a single Cls Score (FG/BG) (17, 17, nk).]
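One way to read the SoftMax stage is that the templates compete for each anchor: the foreground logits of all n templates are normalized against each other. The slide does not say which axis the softmax runs over or whether a background class is included, so the sketch below is an assumption: it takes only the foreground logit per template and normalizes across the template axis, with no background class.

```python
import numpy as np

def softmax_over_templates(fg_logits):
    """Normalize foreground logits across templates.

    fg_logits: (n, 17, 17, k), one slice of foreground logits per template
    returns:   same shape, summing to 1 across the n templates at every anchor
    """
    shifted = fg_logits - fg_logits.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
n, k = 3, 5  # example values: 3 templates, 5 anchors per spatial position
fg_logits = rng.standard_normal((n, 17, 17, k))
probs = softmax_over_templates(fg_logits)

# Index of the template with the highest probability at each anchor.
owner = probs.argmax(axis=0)
print(probs.shape, owner.shape)
```

Under this reading, an anchor that responds strongly to two similar-looking people is pushed toward the template with the larger logit, which is one simple way to suppress distractors.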

slide-20
SLIDE 20

Present: Deal with Distractor

  • Add a Layer to handle distractor-aware labelling

○ Freeze the SiamRPN, only train the Association Network ○ E.g. a fully connected network

[Figure: Templates and Frame T (255, 255, 3) each pass through a CNN, giving Template Feature (n, 6, 6, 256) and Frame Feature (22, 22, 256). The RPN produces Cls Scores (FG/BG) (17, 17, 2k), one per template, and a Bounding Box (x, y, w, h) (17, 17, 4k). A Neural Network combines the Cls Scores into a single Cls Score (FG/BG) (17, 17, nk).]

slide-21
SLIDE 21

Present: Single Object SiamRPN

  • RoI Align: Quantitative and Qualitative Verification

[Figure panels: "Whole Image as Input, Cropped Feature" and "Cropped Image as Input, Whole Feature"]
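The "Whole Image as Input, Cropped Feature" variant relies on RoI Align to cut a fixed-size feature patch out of the whole-image feature map at fractional coordinates. The sketch below is a simplified NumPy version, not the project's code: it takes one bilinear sample at the centre of each output bin (standard RoI Align averages several samples per bin), and it assumes the box is already expressed in feature-map coordinates as (x1, y1, x2, y2), with integer coordinates at cell centres.

```python
import numpy as np

def bilinear(feat, y, x):
    """Bilinearly interpolate an (H, W, C) map at a fractional (y, x) location."""
    H, W, _ = feat.shape
    y = float(np.clip(y, 0, H - 1))
    x = float(np.clip(x, 0, W - 1))
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * feat[y0, x0] + dx * feat[y0, x1]
    bottom = (1 - dx) * feat[y1, x0] + dx * feat[y1, x1]
    return (1 - dy) * top + dy * bottom

def roi_align(feat, box, out_size):
    """Crop box = (x1, y1, x2, y2) from feat into an (out_size, out_size, C) patch."""
    x1, y1, x2, y2 = box
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    out = np.empty((out_size, out_size, feat.shape[2]))
    for i in range(out_size):
        for j in range(out_size):
            # Sample at the centre of bin (i, j); no rounding of coordinates.
            out[i, j] = bilinear(feat, y1 + (i + 0.5) * bin_h, x1 + (j + 0.5) * bin_w)
    return out

# Toy 22x22 feature map whose two channels store the x and y coordinates,
# so the interpolated values can be checked by eye.
ys, xs = np.mgrid[0:22, 0:22]
feat = np.stack([xs, ys], axis=-1).astype(float)
crop = roi_align(feat, box=(2.5, 3.0, 8.5, 9.0), out_size=6)
print(crop.shape)
```

With a (22, 22, 256) frame feature and `out_size=6`, the same call would yield a (6, 6, 256) template feature without running the backbone again on a cropped image, which is the speed argument for this variant.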

slide-22
SLIDE 22

Past Present Future

  • Finish RoI Align Verification for Single Object SiamRPN (September 15)

○ Achieve similar EAO as in SiamRPN paper

  • Merge Multi Object SiamRPN with RoI Align (September 15)

○ Achieve similar performance as without RoI Align

  • Data Association and NMS Network (October 15)

○ Assign correct ID to correct person

  • Integrate Object Detection (October 31)

○ Learn a universal template that has high response on all pedestrians

  • Test Speed and Deploy (December 15)

slide-26
SLIDE 26

  • Finish RoI Align Verification for Single Object SiamRPN (September 15)

○ Achieve similar EAO as in SiamRPN paper

  • Merge Multi Object SiamRPN with RoI Align (September 15)

○ Achieve similar performance as without RoI Align

  • Data Association and NMS Network (October 15)

○ Assign correct ID to correct person

  • Integrate Object Detection (October 31)

○ Learn a universal template that has high response on all pedestrians

  • Test Speed and Deploy (December 15)

○ Real-time performance on NVIDIA Jetson TX2

Future: Detect New Objects

[Timeline: Sep 15, Oct 15, Oct 31, Nov 15, Dec 15]
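The goal "assign correct ID to correct person" is planned here as a learned Data Association and NMS Network. For comparison, a classical non-learned alternative is greedy IoU matching between the previous frame's tracks and the current detections. The sketch below shows that swapped-in technique, not the planned network; the (x1, y1, x2, y2) box layout, the 0.3 threshold, and the function names are assumptions made for illustration.

```python
import numpy as np

def iou_matrix(a, b):
    """Pairwise IoU between (N, 4) and (M, 4) boxes given as (x1, y1, x2, y2)."""
    x1 = np.maximum(a[:, None, 0], b[None, :, 0])
    y1 = np.maximum(a[:, None, 1], b[None, :, 1])
    x2 = np.minimum(a[:, None, 2], b[None, :, 2])
    y2 = np.minimum(a[:, None, 3], b[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def associate(track_boxes, track_ids, det_boxes, min_iou=0.3):
    """Greedy matching: repeatedly pair the highest-IoU (track, detection) couple.

    Returns {detection index: track id}. Unmatched detections are left out
    and could be used to start new tracks.
    """
    iou = iou_matrix(track_boxes, det_boxes)
    assigned = {}
    while iou.size and iou.max() >= min_iou:
        t, d = np.unravel_index(iou.argmax(), iou.shape)
        assigned[int(d)] = track_ids[t]
        iou[t, :] = -1.0  # this track is taken
        iou[:, d] = -1.0  # this detection is taken
    return assigned

track_boxes = np.array([[10, 10, 50, 90], [100, 20, 140, 100]], dtype=float)
track_ids = [7, 9]
det_boxes = np.array(
    [[102, 22, 142, 102], [11, 12, 51, 92], [300, 300, 340, 380]], dtype=float
)
matches = associate(track_boxes, track_ids, det_boxes)
print(matches)
```

Detection 2 overlaps no existing track and stays unmatched, which is the case the "Integrate Object Detection" milestone is meant to handle by starting a new track.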

slide-27
SLIDE 27

Thanks
