
Detection and Segmentation CS60010: Deep Learning Abir Das IIT Kharagpur Feb 28, 2020

Agenda
To get introduced to two important tasks of computer vision, detection and segmentation, along with the application of deep neural networks to these areas in recent years.

From Classification to Detection
(Figure panels: Classification, Detection)

Challenges of Object Detection
§ Simultaneous recognition and localization
§ Images may contain objects from more than one class and multiple instances of the same class
§ Evaluation

Localization and Detection

Evaluation
§ At test time, three things are predicted: bounding box coordinates, bounding box class label, and confidence score
§ Performance is measured in terms of IoU (Intersection over Union)
§ According to the PASCAL criterion,
◮ a detection is correct if IoU > 0.5
◮ for multiple detections of the same object, only one is considered a true positive
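A minimal sketch of the IoU computation and the PASCAL correctness check described above. The corner format `(x1, y1, x2, y2)` and the function names `iou` and `is_correct_detection` are assumptions made for illustration; the slide does not fix a box format.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(pred_box, gt_box, threshold=0.5):
    """PASCAL criterion from the slide: correct if IoU > 0.5."""
    return iou(pred_box, gt_box) > threshold
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` is 1/7: the overlap has area 1 and the union has area 4 + 4 - 1 = 7, so this pair would not count as a correct detection.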

Evaluation: Precision-Recall
§ $\text{precision} = \frac{tp}{tp + fp}$
§ $\text{recall} = \frac{tp}{tp + fn}$
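As a small illustration of the two formulas, here tp, fp and fn are the counts of true positives, false positives and false negatives. The function name `precision_recall` and the convention of returning 0.0 for an empty denominator are assumptions, not part of the slide.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from detection counts."""
    # Assumed convention: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Hypothetical counts: 3 correct detections, 1 false alarm, 2 missed objects.
precision, recall = precision_recall(tp=3, fp=1, fn=2)
```

With these hypothetical counts, precision is 3/4 and recall is 3/5.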

Evaluation: Average Precision
Let's consider an image with 5 apples where our detector provides 10 detections.
Source: This medium post

Evaluation: Average Precision
The area under the precision-recall curve is a measure of performance. This gives the average precision of the detector.
Source: This medium post

Evaluation: mean Average Precision
A little more detail:
§ The zigzag precision-recall curve is smoothed by replacing the precision at each recall value with the highest precision found at or to the right of that recall value.
§ The Average Precision (AP) is then the average of the smoothed precision at 11 recall values (0, 0.1, 0.2, ..., 1.0).
§ The mean Average Precision (mAP) is the mean of the APs over all object classes.
Source: This medium post
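The 11-point procedure above can be sketched as follows. The ranked true/false list is a hypothetical outcome for the 5-apple, 10-detection setting (the actual table from the cited post is not reproduced in these slides), and the names `average_precision_11pt` and `mean_average_precision` are illustrative.

```python
def average_precision_11pt(is_tp, num_gt):
    """11-point interpolated AP.

    is_tp:  booleans for detections sorted by descending confidence
            (True = true positive, False = false positive).
    num_gt: number of ground-truth objects of this class.
    """
    precisions, recalls = [], []
    tp = 0
    for rank, hit in enumerate(is_tp, start=1):
        if hit:
            tp += 1
        precisions.append(tp / rank)   # precision after `rank` detections
        recalls.append(tp / num_gt)    # recall after `rank` detections
    total = 0.0
    for i in range(11):
        r = i / 10
        # Smoothed precision: highest precision at recall >= r (0 if unreachable).
        candidates = [p for p, rec in zip(precisions, recalls) if rec >= r - 1e-12]
        total += max(candidates) if candidates else 0.0
    return total / 11


def mean_average_precision(ap_per_class):
    """mAP: mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical ranking: 10 detections, 5 apples in the image.
ranked_hits = [True, True, False, False, False, True, True, False, False, True]
ap_apple = average_precision_11pt(ranked_hits, num_gt=5)
```

For this hypothetical ranking the smoothed precision is 1.0 for recall 0 to 0.4, 4/7 for recall 0.5 to 0.8, and 0.5 for recall 0.9 and 1.0, giving an AP of about 0.753.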

Non-max Suppression
What to do if there are multiple detections of the same object? Can you think of its effect on precision and recall?
(Figure: overlapping detections with confidence scores 0.6, 0.8, 0.9, 0.7, 0.7)
Source: deeplearning.ai

Non-max Suppression
§ Sort the predictions by their confidence scores
§ Starting with the top-scoring prediction, discard any other prediction of the same class that has high overlap (e.g., IoU > 0.5) with it
§ Repeat the above step with the next remaining prediction until all predictions are checked
(Figure: overlapping detections with confidence scores 0.6, 0.8, 0.9, 0.7, 0.7)
Source: deeplearning.ai
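A minimal sketch of the greedy procedure above, not an optimized implementation. The detection tuple layout `(box, class_label, score)`, the corner box format, and the example boxes are assumptions; only the confidence scores echo the figure.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS over detections given as (box, class_label, score)."""
    # Step 1: sort by confidence, highest first.
    remaining = sorted(detections, key=lambda d: d[2], reverse=True)
    kept = []
    while remaining:
        # Step 2: keep the current top prediction ...
        top = remaining.pop(0)
        kept.append(top)
        # ... and drop same-class predictions that overlap it too much.
        remaining = [
            d for d in remaining
            if d[1] != top[1] or iou(d[0], top[0]) <= iou_threshold
        ]
        # Step 3: the loop repeats with the next surviving prediction.
    return kept


# Hypothetical boxes for two cars, with the scores shown in the figure.
detections = [
    ((1, 1, 11, 11), "car", 0.6),
    ((50, 50, 60, 60), "car", 0.8),
    ((0, 0, 10, 10), "car", 0.9),
    ((0, 1, 10, 11), "car", 0.7),
    ((51, 50, 61, 60), "car", 0.7),
]
kept = non_max_suppression(detections)
```

On this hypothetical input only the 0.9 and 0.8 boxes survive, one per car. Without suppression, the three discarded boxes would count as false positives and lower precision.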

Other Computer Vision Tasks
§ Semantic Segmentation: GRASS, CAT, TREE, SKY (no objects, just pixels)
§ Classification + Localization: CAT (single object)
§ Object Detection: DOG, DOG, CAT (multiple objects)
§ Instance Segmentation: DOG, DOG, CAT (multiple objects)
Source: cs231n course, Stanford University (Fei-Fei Li & Justin Johnson & Serena Yeung, Lecture 11, May 10, 2018)

PASCAL VOC
§ Dataset size (by 2012): 11.5K training/val images, 27K bounding boxes, 7K segmentations

PASCAL VOC
Object detection renaissance (2013 to present)
(Figure: mean Average Precision (mAP) on PASCAL VOC by year, 2006 to 2016, from 0% to 80%; curves labelled "Before deep convnets" and "Using deep convnets", with R-CNN v1 marked)
Source: ICCV ’15, Fast R-CNN

COCO Dataset
Source: http://cocodataset.org

COCO Tasks

Classification + Localization: Task
§ Classification: C classes. Input: image. Output: class label (e.g., CAT). Evaluation metric: accuracy
§ Localization: Input: image. Output: box in the image (x, y, w, h). Evaluation metric: Intersection over Union
§ Classification + Localization: do both
Source: cs231n course, Stanford University (Fei-Fei Li & Andrej Karpathy & Justin Johnson, Lecture 8, 1 Feb 2016)

Classification + Localization
Idea #1: Localization as Regression
§ Input: image → Neural Net → Output: box coordinates (4 numbers)
§ Correct output: box coordinates (4 numbers)
§ Loss: L2 distance between output and correct output
§ Only one object, simpler than detection
Source: cs231n course, Stanford University
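Written out, one common reading of the "L2 distance" loss is the squared difference between predicted and correct box coordinates. The slide does not say whether the distance is squared, and the $(x, y, w, h)$ parametrization is taken from the task definition, so both are assumptions here; hats denote network outputs.

```latex
L(\hat{b}, b) = \lVert \hat{b} - b \rVert_2^2
             = (\hat{x} - x)^2 + (\hat{y} - y)^2 + (\hat{w} - w)^2 + (\hat{h} - h)^2,
\qquad b = (x, y, w, h)
```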

Classification + Localization
Simple Recipe for Classification + Localization
Step 1: Train (or download) a classification model (AlexNet, VGG, GoogLeNet)
(Figure: Image → Convolution and Pooling → Final conv feature map → Fully-connected layers → Class scores → Softmax loss)
Source: cs231n course, Stanford University

Classification + Localization
Simple Recipe for Classification + Localization
Step 2: Attach new fully-connected “regression head” to the network
(Figure: Image → Convolution and Pooling → Final conv feature map, which feeds two branches: “Classification head”: Fully-connected layers → Class scores; “Regression head”: Fully-connected layers → Box coordinates)
Source: cs231n course, Stanford University

Classification + Localization
Simple Recipe for Classification + Localization
Step 3: Train the regression head only with SGD and L2 loss
(Figure: Image → Convolution and Pooling → Final conv feature map; classification branch: Fully-connected layers → Class scores; regression branch: Fully-connected layers → Box coordinates → L2 loss)
Source: cs231n course, Stanford University
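A toy sketch of Step 3, under strong simplifying assumptions: the convolutional backbone is treated as frozen and replaced by precomputed feature vectors, the regression head is a single linear layer rather than a stack of fully-connected layers, and the data are made up. The names `predict_box`, `l2_loss` and `train_regression_head` are illustrative only.

```python
import random


def predict_box(W, b, x):
    """Linear regression head: maps a feature vector to 4 box numbers."""
    return [sum(w * xi for w, xi in zip(row, x)) + bias for row, bias in zip(W, b)]


def l2_loss(pred, target):
    """Squared L2 distance between predicted and ground-truth box."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def train_regression_head(features, boxes, lr=0.1, epochs=500, seed=0):
    """Train only the head with plain SGD; the features are never updated."""
    rng = random.Random(seed)
    dim = len(features[0])
    W = [[0.0] * dim for _ in range(4)]
    b = [0.0] * 4
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)                      # one sample per SGD step
        for i in order:
            x, y = features[i], boxes[i]
            pred = predict_box(W, b, x)
            for k in range(4):
                grad = 2.0 * (pred[k] - y[k])   # d(loss) / d(pred_k)
                for j in range(dim):
                    W[k][j] -= lr * grad * x[j]
                b[k] -= lr * grad
    return W, b


# Made-up "frozen backbone" features and normalized (x, y, w, h) targets.
features = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.2]]
boxes = [[0.1 + 0.2 * f[0], 0.2 + 0.1 * f[1], 0.1 + 0.3 * f[0], 0.2 + 0.4 * f[1]]
         for f in features]

W, b = train_regression_head(features, boxes)
mean_loss = sum(l2_loss(predict_box(W, b, x), y)
                for x, y in zip(features, boxes)) / len(features)
```

Because the toy targets are an exact linear function of the features, the mean L2 loss should shrink towards zero. Real box regression on conv features would not fit this cleanly.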

Classification + Localization
Simple Recipe for Classification + Localization
Step 4: At test time use both heads
(Figure: Image → Convolution and Pooling → Final conv feature map; classification branch: Fully-connected layers → Class scores; regression branch: Fully-connected layers → Box coordinates)
Source: cs231n course, Stanford University

Classification + Localization
Aside: Localizing multiple objects
§ Want to localize exactly K objects in each image (e.g. whole cat, cat head, cat left ear, cat right ear for K=4)
§ The regression branch outputs K x 4 numbers (one box per object)
(Figure: Image → Convolution and Pooling → Final conv feature map; classification branch: Fully-connected layers → Class scores; regression branch: Fully-connected layers → Box coordinates)
Source: cs231n course, Stanford University

Classification + Localization
Aside: Human Pose Estimation
§ Represent a person by K joints
§ Regress (x, y) for each joint from last fully-connected layer of AlexNet
§ (Details: Normalized coordinates, iterative refinement)
Toshev and Szegedy, “DeepPose: Human Pose Estimation via Deep Neural Networks”, CVPR 2014
Source: cs231n course, Stanford University
