
Detection and Segmentation (CS60010: Deep Learning, Abir Das, IIT Kharagpur)



  1. Detection and Segmentation CS60010: Deep Learning Abir Das IIT Kharagpur Feb 28, 2020

  2. Agenda. Sections: Introduction, Datasets, Localization. Goal: to introduce two important computer vision tasks, detection and segmentation, along with the application of deep neural networks to these areas in recent years.

  3. From Classification to Detection. Figure panels: Classification, Detection.

  4. Challenges of Object Detection
     - Simultaneous recognition and localization
     - Images may contain objects from more than one class and multiple instances of the same class
     - Evaluation

  5. Localization and Detection (figure)

  6. Evaluation
     - At test time, three things are predicted: bounding box coordinates, bounding box class label, and a confidence score.
     - Performance is measured in terms of IoU (Intersection over Union).
     - According to the PASCAL criterion:
       - a detection is correct if IoU > 0.5;
       - for multiple detections, only one is considered a true positive.
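The IoU test in the PASCAL criterion can be sketched in a few lines. This is a minimal illustration, not code from the lecture: it assumes boxes are given as (x, y, w, h) with (x, y) the top-left corner, and the function names `iou` and `is_correct_detection` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x, y, w, h).

    Assumption: (x, y) is the top-left corner and w, h are positive.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap extent along each axis (zero if the boxes do not intersect).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(pred_box, gt_box, threshold=0.5):
    """PASCAL-style check: a detection counts as correct if IoU > threshold."""
    return iou(pred_box, gt_box) > threshold
```

A full evaluation would also compare class labels and ensure that each ground-truth box is matched by at most one detection; that bookkeeping is left out here.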

  7. Evaluation: Precision-Recall
     - $\text{precision} = \frac{tp}{tp + fp}$
     - $\text{recall} = \frac{tp}{tp + fn}$
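As a quick check on the two ratios, here is a small sketch in which tp, fp, and fn are the counts of true positives, false positives, and false negatives. The function name and the example counts are hypothetical, chosen only for illustration.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true/false positive and false negative counts."""
    # Guard against empty denominators (no detections, or no ground-truth objects).
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Hypothetical counts: 10 detections of which 4 are correct,
# and 5 ground-truth objects of which 1 is missed.
p, r = precision_recall(tp=4, fp=6, fn=1)
print(p, r)
```

Precision falls as false positives accumulate, while recall depends only on how many ground-truth objects are found.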

  8. Evaluation: Average Precision. Let's consider an image with 5 apples where our detector provides 10 detections. (Source: a Medium post.)

  9. Evaluation: Average Precision. The area under the precision-recall curve is a measure of performance; it gives the average precision of the detector. (Source: a Medium post.)

  10. Evaluation: mean Average Precision. A little more detail:
     - The zigzag precision-recall curve is smoothed by replacing the precision at each recall value with the highest precision found at or to the right of that recall value.
     - The average of the smoothed precision is then taken over 11 recall values (0, 0.1, 0.2, ..., 1.0). This is the Average Precision (AP).
     - The mean average precision (mAP) is the mean of the average precisions (AP) over all object classes.
     (Source: a Medium post.)
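The 11-point procedure can be sketched as follows. This is an illustrative implementation under stated assumptions, not the lecture's code: detections are assumed to be already sorted by descending confidence and already marked as true or false positives, and the function names and the example sequence are hypothetical.

```python
def average_precision_11pt(is_tp, num_gt):
    """11-point interpolated AP.

    is_tp:  list of booleans, one per detection, sorted by descending confidence
            (True if the detection is a true positive).
    num_gt: number of ground-truth objects of this class.
    """
    precisions, recalls = [], []
    tp = 0
    for rank, hit in enumerate(is_tp, start=1):
        tp += int(hit)
        precisions.append(tp / rank)   # precision over the top `rank` detections
        recalls.append(tp / num_gt)    # recall over the top `rank` detections
    total = 0.0
    for i in range(11):
        level = i / 10
        # Highest precision at or to the right of this recall level (0 if none).
        candidates = [p for p, r in zip(precisions, recalls) if r >= level - 1e-9]
        total += max(candidates, default=0.0)
    return total / 11


def mean_average_precision(ap_per_class):
    """mAP: mean of the per-class AP values (dict: class name -> AP)."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical ranking: 10 detections for an image with 5 ground-truth objects.
ranking = [True, True, False, False, False, True, True, False, False, True]
print(average_precision_11pt(ranking, num_gt=5))
```

Taking the maximum precision to the right is what removes the zigzag: the interpolated curve can only stay flat or drop as recall increases.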

  11. Non-max Suppression. What should be done when there are multiple detections of the same object? Can you think of its effect on precision and recall? (Figure: overlapping detections with confidence scores 0.6, 0.8, 0.9, 0.7, 0.7. Source: deeplearning.ai)

  12. Non-max Suppression
     - Sort the predictions by confidence score.
     - Starting with the top-scoring prediction, discard any other prediction of the same class that has high overlap (e.g., IoU > 0.5) with it.
     - Repeat the above step with the next remaining prediction until all predictions have been checked.
     (Source: deeplearning.ai)
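The steps above can be sketched as a greedy loop. This is a minimal illustration under assumptions not stated on the slide: each detection is a tuple (box, class_label, score), boxes are (x, y, w, h) with (x, y) the top-left corner, and the names `iou` and `non_max_suppression` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x, y, w, h) boxes (top-left origin assumed)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (box, class_label, score) tuples."""
    # Step 1: sort by confidence, highest first.
    remaining = sorted(detections, key=lambda d: d[2], reverse=True)
    kept = []
    while remaining:
        # Step 2: keep the top-scoring prediction ...
        top = remaining.pop(0)
        kept.append(top)
        # ... and drop same-class predictions that overlap it too much.
        remaining = [
            d for d in remaining
            if d[1] != top[1] or iou(d[0], top[0]) <= iou_threshold
        ]
        # Step 3: the loop repeats with the next surviving prediction.
    return kept
```

Because suppressed duplicates would otherwise count as false positives, removing them tends to raise precision without lowering recall, as long as the threshold does not merge genuinely distinct objects.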

  13. Other Computer Vision Tasks (figure, four panels)
     - Semantic Segmentation: GRASS, CAT, TREE, SKY (no objects, just pixels)
     - Classification + Localization: CAT (single object)
     - Object Detection: DOG, DOG, CAT (multiple objects)
     - Instance Segmentation: DOG, DOG, CAT (multiple objects)
     Source: cs231n course, Stanford University (Fei-Fei Li, Justin Johnson, Serena Yeung; Lecture 11, May 10, 2018).

  14. PASCAL VOC. Dataset size (by 2012): 11.5K training/val images, 27K bounding boxes, 7K segmentations.

  15. PASCAL VOC: Object detection renaissance (2013-present). Figure: mean Average Precision (mAP) on PASCAL VOC by year, 2006 to 2016 (y-axis 0% to 80%), with results before deep convnets and results using deep convnets (starting at R-CNN v1) marked separately. Source: ICCV ’15, Fast R-CNN.

  16. COCO Dataset. Source: http://cocodataset.org

  17. COCO Tasks (figure)

  18. Classification + Localization: Task
     - Classification: C classes. Input: image. Output: class label (e.g., CAT). Evaluation metric: accuracy.
     - Localization: Input: image. Output: box in the image (x, y, w, h). Evaluation metric: Intersection over Union.
     - Classification + Localization: do both.
     Source: cs231n course, Stanford University (Fei-Fei Li, Andrej Karpathy, Justin Johnson; Lecture 8, 1 Feb 2016).

  19. Idea #1: Localization as Regression. Input: image -> neural net -> output: box coordinates (4 numbers). Correct output: box coordinates (4 numbers). Loss: L2 distance between the two. Only one object, so simpler than detection. Source: cs231n course, Stanford University.
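For reference, the L2 loss on box coordinates can be written out as below. The notation is an assumption made here for illustration: $\hat{b} = (\hat{x}, \hat{y}, \hat{w}, \hat{h})$ is the predicted box and $b = (x, y, w, h)$ the correct one. The slide does not say whether the distance is squared; the squared form is shown because it is the common choice for training.

```latex
L(\hat{b}, b) = \lVert \hat{b} - b \rVert_2^2
             = (\hat{x} - x)^2 + (\hat{y} - y)^2 + (\hat{w} - w)^2 + (\hat{h} - h)^2
```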

  20. Simple Recipe for Classification + Localization. Step 1: Train (or download) a classification model (AlexNet, VGG, GoogLeNet). Diagram: image -> convolution and pooling -> final conv feature map -> fully-connected layers -> class scores -> softmax loss. Source: cs231n course, Stanford University.

  21. Simple Recipe for Classification + Localization. Step 2: Attach a new fully-connected “regression head” to the network. Diagram: image -> convolution and pooling -> final conv feature map, which feeds two branches: fully-connected layers (“classification head”) -> class scores, and fully-connected layers (“regression head”) -> box coordinates. Source: cs231n course, Stanford University.

  22. Simple Recipe for Classification + Localization. Step 3: Train only the regression head, using SGD and an L2 loss. Diagram: same two-branch network, with the box coordinates from the regression head fed to the L2 loss. Source: cs231n course, Stanford University.
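A toy version of this step is sketched below in plain Python. It is a heavily simplified illustration, not the lecture's setup: a single linear layer stands in for the fully-connected regression head, the backbone is treated as frozen so its features are assumed to be precomputed, and all names (`train_regression_head`, `predict_box`, `l2_loss`) are hypothetical.

```python
import random


def predict_box(weights, bias, feature):
    """Linear head: 4 outputs (x, y, w, h) from one feature vector."""
    return [sum(w * f for w, f in zip(row, feature)) + b
            for row, b in zip(weights, bias)]


def l2_loss(pred, target):
    """Squared L2 distance between predicted and ground-truth box."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def train_regression_head(features, boxes, lr=0.05, epochs=200, seed=0):
    """Fit the head with per-sample SGD on the L2 loss.

    features: feature vectors from the frozen backbone (assumed precomputed).
    boxes:    ground-truth (x, y, w, h) for each feature vector.
    Only the head's weights and bias are updated; the backbone is untouched.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(4)]
    bias = [0.0] * 4
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            feature, target = features[i], boxes[i]
            pred = predict_box(weights, bias, feature)
            for k in range(4):
                grad = 2.0 * (pred[k] - target[k])  # derivative of (pred - target)^2
                for j in range(dim):
                    weights[k][j] -= lr * grad * feature[j]
                bias[k] -= lr * grad
    return weights, bias
```

In a real network the same idea is usually expressed by excluding the backbone and classification-head parameters from the optimizer, so that gradients from the L2 loss only change the regression head.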

  23. Simple Recipe for Classification + Localization. Step 4: At test time, use both heads to obtain class scores and box coordinates. Source: cs231n course, Stanford University.

  24. Aside: Localizing multiple objects. Want to localize exactly K objects in each image (e.g. whole cat, cat head, cat left ear, cat right ear for K=4). The regression head outputs K x 4 numbers as box coordinates (one box per object). Source: cs231n course, Stanford University.

  25. Aside: Human Pose Estimation. Represent a person by K joints. Regress (x, y) for each joint from the last fully-connected layer of AlexNet. (Details: normalized coordinates, iterative refinement.) Toshev and Szegedy, “DeepPose: Human Pose Estimation via Deep Neural Networks”, CVPR 2014. Source: cs231n course, Stanford University.
