
Embedded Multi-Person Pedestrian Tracking and Detection (MSCV19)



  1. Embedded Multi-Person Pedestrian Tracking and Detection
     MSCV19 Capstone Project, Internal (CMU)
     Team Members: Yongxin Wang, Chunhui Liu
     Advisor: Dr. Kris Kitani
     05/03/2019

  2. Introduction
     ● Motivation
       ○ Multi-person pedestrian tracking
       ○ Real-time performance on embedded systems
       ○ Visual analysis, automatic driving, robotics
     ● Problem
       ○ Detect and track multiple people
       ○ Deal with new objects, out-of-view objects, occlusion, large appearance changes
     ● Solution
       ○ Track by detection - SiameseRPN (Single Object)
       ○ Multiple object extension

  3. Past, Present, Future
     Past:
     ● January: Start from Single Object SiamRPN
     ● March: Train/Finetune Single Object SiamRPN on VOT dataset
     ● April: Single Object SiamRPN with RoI Align; Multi Object SiamRPN Baseline
     Present:
     ● Single-Object SiameseRPN with Region of Interest (RoI) Align
     ● Multi Object SiamRPN Distractor
     Future:
     ● September 15: Finish RoI Align verification; Merge Multi-Object SiamRPN with RoI Align
     ● October 15: Data Association & NMS
     ● October 31: Integrate object detection to handle new objects
     ● December 15: Optimize and deploy algorithm on NVIDIA Jetson Machine

  4. Past / Present / Future
     ● Single Object SiamRPN
       ○ Implement Train Code & Verify
       ○ Finetune on VOT
     ● RoI Align for Single Object SiamRPN
       ○ Implement Code
       ○ Train and Verify on VOT
     ● Multi Object SiamRPN
       ○ Baseline Model
       ○ Multi Object Evaluation Code

  5. Past: Single Object SiamRPN
     [Architecture diagram; recovered labels]
     ● Template branch: AlexNet → Template Feature (6, 6, 256) → Conv → Template Features (4, 4, 2k × 256) and (4, 4, 4k × 256)
     ● Image branch: AlexNet → Image Feature (22, 22, 256) → Conv → Image Features (20, 20, 256), one per output head
     ● Outputs: CLS Score (FG/BG) (17, 17, 2k); Bounding Box (x, y, w, h) (17, 17, 4k)
     Reference: Li, Bo et al. "High Performance Visual Tracking with Siamese Region Proposal Network." 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.
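
The spatial sizes on this slide follow from unpadded ("valid") convolution arithmetic. The sketch below reproduces them; the 3×3 kernel size of the adaptation convs and the value of `k` are assumptions chosen to match the numbers shown, not details stated on the slide.

```python
def valid_out(size: int, kernel: int, stride: int = 1) -> int:
    """Side length of the output of an unpadded convolution or correlation."""
    return (size - kernel) // stride + 1


k = 5  # anchors per spatial location (assumed; the slide keeps k symbolic)

template_feat = 6   # side of the AlexNet template feature (6, 6, 256)
frame_feat = 22     # side of the AlexNet image feature (22, 22, 256)

# Assumed 3x3 adaptation convs: 6 -> 4 and 22 -> 20, matching the slide.
template_kernel = valid_out(template_feat, 3)
frame_map = valid_out(frame_feat, 3)

# The 4x4 template features act as a correlation kernel on the 20x20 map.
response = valid_out(frame_map, template_kernel)

cls_shape = (response, response, 2 * k)  # FG/BG score per anchor
box_shape = (response, response, 4 * k)  # (x, y, w, h) per anchor

print(template_kernel, frame_map, response, cls_shape, box_shape)
```

With these assumptions the response map is 17×17, which agrees with the (17, 17, 2k) and (17, 17, 4k) outputs on the slide.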

  6. Past: Single Object SiamRPN
     ● Re-implementing the Siamese RPN training code (training & testing)
       ○ Official repository only has testing code
       ○ Sanity check of training process
         ■ Finetuned from pretrained model (trained with VID) on VOT dataset
     ● RoI Align for Single Object SiamRPN - Need for SPEED
     [Diagram: Image Features (20, 20, 256) → Image Features (20, 20, 256)]

  7. Past: Single Object SiamRPN
     | Model | Pretrained | Finetune | Test Data | EAO ↑ |
     |---|---|---|---|---|
     | DaSiamRPN (Official, SOTA) | YoutubeBB + ImageNet VID | - | VOT 2015 | 0.446 |
     | SiamRPN | ImageNet VID | VOT 2015 (First 40 sequences) | VOT 2015 (First 40 sequences) | 0.5240 |
     | SiamRPN RoI | ImageNet VID | VOT 2015 (First 40 sequences) | VOT 2015 (First 40 sequences) | 0.6045 |
     | SiamRPN (with location & size penalty) | ImageNet VID | - | VOT 2015 | 0.3426 |
     | SiamRPN | ImageNet VID | - | VOT 2015 | 0.2647 |
     | SiamRPN | - | - | VOT 2015 | IP |
     | SiamRPN RoI | - | - | VOT 2015 | IP |

  8. Past: Single Object SiamRPN
     [Figure: qualitative tracking results; legend]
     ● Red - SiamRPN (finetuned)
     ● Black - DaSiameseRPN
     ● Blue - SiamRPN RoI (finetuned)
     ● Green - Ground Truth

  9. Past: Multi Object Tracking
     ● From Single Object Tracking to Multiple Object Tracking:
       ○ A network that can handle several templates.
       ○ NMS & Data Association for matching labels.
       ○ Decide when to add and delete templates.
       ○ Template Adapter. (Decide how to update the templates for the next frame)
     [Architecture diagram; recovered labels]
     ● Templates → CNN → Template Feature (6, 6, 256) → Conv → Template Features (4, 4, 2k × 256) and (4, 4, 4k × 256)
     ● Frame T (255, 255, 3) → CNN → Frame Feature (22, 22, 256) → Conv → Frame Features (20, 20, 256)
     ● Outputs: Cls Score (FG/BG) (17, 17, 2k); Bounding Box (x, y, w, h) (17, 17, 4k) → NMS + Data Association
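
The NMS step named on this slide can be illustrated with a minimal greedy, IoU-based sketch. The slide does not say which NMS variant the project used; the box convention (`(x, y)` as the top-left corner), the threshold, and the sample boxes are all assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes, (x, y) = top-left (assumed)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical proposals: the first two cover the same pedestrian.
boxes = [(10, 10, 50, 100), (12, 12, 50, 100), (200, 40, 40, 90)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # indices of the surviving boxes
```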

  10. Past: Multi Object SiamRPN
      ● From Single Object Tracking to Multiple Object Tracking:
        ○ A network that can handle several templates.
        ○ NMS & Data Association for matching labels.
        ○ Decide when to add and delete templates.
        ○ Template Adapter. (Decide how to update the templates for the next frame)
      [Architecture diagram; recovered labels]
      ● Templates → CNN → Template Feature (6, 6, 256) → Conv → Template Features (4, 4, 2k × 256) and (4, 4, 4k × 256)
      ● Frame T (255, 255, 3) → CNN → Frame Feature (22, 22, 256) → Conv → Frame Features (20, 20, 256)
      ● Outputs: Cls Score (FG/BG) (17, 17, 2k); Bounding Box (x, y, w, h) (17, 17, 4k) → NMS + Data Association
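
One simple way to realize "data association for matching labels" and "decide when to add and delete templates" is greedy IoU matching between existing tracks and new boxes. This is a generic baseline sketch, not the method the slides commit to; `associate`, `min_iou`, and the sample boxes are hypothetical names and values.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes, (x, y) = top-left (assumed)."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Greedily match track ids to detection indices by descending IoU.

    Returns (matches, new, lost): matched {track_id: detection_index},
    unmatched detections (candidates for new templates), and unmatched
    track ids (candidates for template deletion).
    """
    pairs = sorted(
        ((iou(box, det), tid, di)
         for tid, box in tracks.items()
         for di, det in enumerate(detections)),
        reverse=True,
    )
    matches, used = {}, set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same person
        if tid in matches or di in used:
            continue
        matches[tid] = di
        used.add(di)
    new = [di for di in range(len(detections)) if di not in used]
    lost = [tid for tid in tracks if tid not in matches]
    return matches, new, lost


tracks = {1: (10, 10, 50, 100), 2: (200, 40, 40, 90)}
detections = [(205, 42, 40, 90), (14, 11, 50, 100), (400, 60, 45, 95)]
matches, new, lost = associate(tracks, detections)
print(matches, new, lost)
```

Greedy matching is easy to follow but not optimal; an assignment solver such as the Hungarian algorithm is a common replacement when many people overlap.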

  11. Past: Multi Object Extension
      ● Baseline Idea:
        ○ Pre-compute correlation filters for each template
        ○ All templates share the RPN network to do tracking independently
      ● Introduce Communication among templates (1)
        ○ Concatenate all correlation filters as a bigger filter
        ○ Re-train RPN network to perform multi-object classification
      ● Introduce Communication among templates (2)
        ○ Add Distractor-aware loss and fine-tune RPN

  12. Network 0: Baseline (Pretrained Weight)
      n: number of templates; k: number of anchors for each spatial pixel
      [Architecture diagram; recovered labels]
      ● Templates → CNN → Template Feature (n, 6, 6, 256); each Template Feature (6, 6, 256) → Conv → Template Features (4, 4, 2k × 256) and (4, 4, 4k × 256)
      ● Frame T (255, 255, 3) → CNN → Frame Feature (22, 22, 256) → Conv → Frame Features (20, 20, 256)
      ● Outputs: Cls Score (FG/BG) (17, 17, 2k); Bounding Box (x, y, w, h) (17, 17, 4k)
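
The baseline can be pictured as one frame feature map that is computed once and then correlated with each template's pre-computed filter independently. The single-channel, pure-Python sketch below illustrates only that idea; the real network works on 256-channel features, and the patterns and function names here are made up for the example.

```python
def xcorr2d(frame, kernel):
    """Unpadded 2D cross-correlation of a frame map with one template kernel."""
    fh, fw = len(frame), len(frame[0])
    kh, kw = len(kernel), len(kernel[0])
    return [
        [sum(frame[i + u][j + v] * kernel[u][v]
             for u in range(kh) for v in range(kw))
         for j in range(fw - kw + 1)]
        for i in range(fh - kh + 1)
    ]


def track_all(frame, kernels):
    """One response map per template; the frame features are shared."""
    return [xcorr2d(frame, kern) for kern in kernels]


def argmax2d(m):
    """(row, col) of the largest entry in a 2D list."""
    return max(((i, j) for i in range(len(m)) for j in range(len(m[0]))),
               key=lambda p: m[p[0]][p[1]])


# Two hypothetical 4x4 template filters with different appearance.
kernel_a = [[1.0] * 4 for _ in range(4)]
kernel_b = [[1.0, 1.0, -1.0, -1.0] for _ in range(4)]

# A 20x20 frame map containing one instance of each pattern.
frame = [[0.0] * 20 for _ in range(20)]
for u in range(4):
    for v in range(4):
        frame[3 + u][5 + v] = kernel_a[u][v]
        frame[12 + u][9 + v] = kernel_b[u][v]

responses = track_all(frame, [kernel_a, kernel_b])
peaks = [argmax2d(r) for r in responses]
print(len(responses[0]), len(responses[0][0]), peaks)
```

Each 4×4 filter on the 20×20 map yields a 17×17 response, matching the spatial size on the slide. Because the templates never see each other's responses, nothing in this baseline prevents two templates from locking onto the same person.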

  13. Visualization Results (MOT Dataset)

  14. Visualization: Response
      [Figure: response maps for two templates, each panel labeled "Template:"]

  15. Past: Multi Object SiamRPN
      ● Baseline Idea:
        ○ Pre-compute correlation filters for each template
        ○ All templates share the RPN network to do tracking independently
      ● Introduce Communication among templates (1)
        ○ Concatenate all correlation filters as a bigger filter
        ○ Re-train RPN network to perform multi-object classification
      ● Introduce Communication among templates (2)
        ○ Add Distractor-aware loss and fine-tune RPN

  16. Network 1: Abandoned
      n: number of templates; k: number of anchors for each spatial pixel
      [Architecture diagram; recovered labels]
      ● Templates → CNN → Template Feature (n, 6, 6, 256) → Template Feature (6, 6, 256n) → Conv → Template Features (4, 4, nk × 256) and (4, 4, 4nk × 256)
      ● Frame T (255, 255, 3) → CNN → Frame Feature (22, 22, 256) → Conv → Frame Features (20, 20, 256)
      ● Outputs: Cls Score (FG/BG) (17, 17, (n+1)k); Bounding Box (x, y, w, h) (17, 17, 4k)

  17. Past / Present / Future
      ● Single Object SiamRPN
        ○ Training from scratch
        ○ Verifying Effect of RoI
      ● Multi Object SiamRPN
        ○ Try to fix Distractor Issue

  18. Present: Multi Object SiamRPN
      ● Baseline Idea:
        ○ Pre-compute correlation filters for each template
        ○ All templates share the RPN network to do tracking independently
      ● Introduce Communication among templates (1)
        ○ Concatenate all correlation filters as a bigger filter
        ○ Re-train RPN network to perform multi-object classification
      ● Introduce Communication among templates (2)
        ○ Add Distractor-aware loss and fine-tune RPN

  19. Network 2: Softmax (Pretrained Weight)
      [Architecture diagram; recovered labels]
      ● Templates → CNN → Template Feature (n, 6, 6, 256)
      ● Frame T (255, 255, 3) → CNN → Frame Feature (22, 22, 256)
      ● RPN → per-template Cls Score (FG/BG) (17, 17, 2k) → SoftMax → Cls Score (FG/BG) (17, 17, nk)
      ● Bounding Box (x, y, w, h) (17, 17, 4k)
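
One reading of this diagram is that the per-template scores are normalized against each other, so that every anchor position is claimed mostly by one template. The sketch below shows that reading only; taking the foreground logit of each template and applying a softmax across templates is an assumption, and `assign_templates` is a hypothetical name.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def assign_templates(fg_logits):
    """fg_logits[t][p] is template t's foreground logit at anchor position p.

    Returns, for each position, a probability distribution over templates.
    """
    n = len(fg_logits)
    positions = len(fg_logits[0])
    return [softmax([fg_logits[t][p] for t in range(n)])
            for p in range(positions)]


# Two templates, three flattened anchor positions (made-up logits).
fg_logits = [
    [4.0, 0.5, 1.0],  # template 0
    [1.0, 0.5, 3.0],  # template 1
]
probs = assign_templates(fg_logits)
owners = [max(range(len(p)), key=p.__getitem__) for p in probs]
print(probs, owners)
```

Because the pretrained per-template heads are reused unchanged, this normalization needs no retraining, which fits the "Pretrained Weight" label on the slide.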

  20. Present: Deal with Distractor
      ● Add a layer to handle distractor-aware labelling
        ○ Freeze the SiamRPN, only train the Association Network
        ○ E.g. a fully connected network
      [Architecture diagram; recovered labels]
      ● Templates → CNN → Template Feature (n, 6, 6, 256)
      ● Frame T (255, 255, 3) → CNN → Frame Feature (22, 22, 256)
      ● RPN → per-template Cls Score (FG/BG) (17, 17, 2k) → Neural Network → Cls Score (FG/BG) (17, 17, nk)
      ● Bounding Box (x, y, w, h) (17, 17, 4k)

  21. Present: Single Object SiamRPN
      ● RoI Align: Quantitative and Qualitative Verification
      [Figure panels: "Whole Image as Input → Cropped Feature" and "Cropped Image as Input → Whole Feature"]
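
The "whole image as input, cropped feature" path relies on RoI Align: the search region is cut out of a feature map computed once for the whole frame, using bilinear interpolation rather than rounding to integer cells. The sketch below is a simplified single-channel version with one sample per output bin; the coordinate convention (integer coordinates fall on cell centers) and the function names are assumptions, and library implementations differ in these details.

```python
def bilinear(feat, y, x):
    """Bilinearly interpolate a 2D feature map at a fractional (y, x)."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feat, roi, out_h, out_w):
    """Resample roi = (y_start, x_start, y_end, x_end) to an out_h x out_w grid.

    One sample is taken at the center of each output bin.
    """
    y_start, x_start, y_end, x_end = roi
    bin_h = (y_end - y_start) / out_h
    bin_w = (x_end - x_start) / out_w
    return [
        [bilinear(feat, y_start + (i + 0.5) * bin_h, x_start + (j + 0.5) * bin_w)
         for j in range(out_w)]
        for i in range(out_h)
    ]


# A linear ramp (value = 10 * row + col) makes the interpolation easy to check.
feat = [[float(10 * r + c) for c in range(8)] for r in range(8)]
crop = roi_align(feat, (1.5, 2.25, 5.5, 6.25), 2, 2)
print(crop)
```

Since the backbone runs once per frame instead of once per cropped search region, this is presumably where the speed benefit mentioned on the earlier RoI Align slide comes from.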

  22. Past / Present / Future
      ● Finish RoI Align Verification for Single Object SiamRPN (September 15)
        ○ Achieve similar EAO as in SiamRPN paper
      ● Merge Multi Object SiamRPN with RoI Align (September 15)
        ○ Achieve similar performance as without RoI Align
      ● Data Association and NMS Network (October 15)
        ○ Assign correct ID to correct person
      ● Integrate Object Detection (October 31)
        ○ Learn a universal template that has high response on all pedestrians
      ● Test Speed and Deploy (December 15)

  23. Future: Detect New Objects
      Timeline: Sep 15 | Oct 15 | Oct 31 | Nov 15 | Dec 15
      ● Finish RoI Align Verification for Single Object SiamRPN (September 15)
        ○ Achieve similar EAO as in SiamRPN paper
      ● Merge Multi Object SiamRPN with RoI Align (September 15)
        ○ Achieve similar performance as without RoI Align

  24. Future: Detect New Objects
      Timeline: Sep 15 | Oct 15 | Oct 31 | Nov 15 | Dec 15
      ● Finish RoI Align Verification for Single Object SiamRPN (September 15)
        ○ Achieve similar EAO as in SiamRPN paper
      ● Merge Multi Object SiamRPN with RoI Align (September 15)
        ○ Achieve similar performance as without RoI Align
      ● Data Association and NMS Network (October 15)
        ○ Assign correct ID to correct person
