Lecture 13: Segmentation and Attention
Fei-Fei Li & Andrej Karpathy & Justin Johnson
24 Feb 2016

Administrative: Assignment 3 due.
Recap from last lecture: Caffe, Torch, Theano, Lasagne, Keras, TensorFlow.
Overview:
○ Segmentation: Semantic Segmentation, Instance Segmentation
○ Attention: Discrete locations, Continuous locations (Spatial Transformers)
New ImageNet record today! Szegedy et al, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning", arXiv 2016
Inception-v4 stem: V = valid convolutions (no padding); 1x7 and 7x1 filters; 9 layers; both strided convolution AND max pooling.
(Architecture walkthrough, built up slide by slide: after the 9-layer stem come 4 x Inception-A modules (3 layers each), 7 x Inception-B modules, and 3 x Inception-C modules, with reduction blocks in between; 75 layers in total.)
(The Inception-ResNet variant, same slide-by-slide build: a 9-layer stem, then 5 x Inception-ResNet-A modules, 10 x Inception-ResNet-B modules, and 5 x Inception-ResNet-C modules; also 75 layers total.)
Residual and non-residual versions converge to a similar value, but the residual network learns faster.
Computer vision tasks:
○ Classification: single object (CAT)
○ Classification + Localization: single object (CAT)
○ Object Detection: multiple objects (CAT, DOG, DUCK)
○ Segmentation: multiple objects (CAT, DOG, DUCK)
Classification + Localization and Object Detection were covered in Lecture 8; Segmentation is today's topic.
Semantic Segmentation: label every pixel! Don't differentiate instances (cows). A classic computer vision problem.
Figure credit: Shotton et al, “TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context”, IJCV 2007
Figure credit: Dai et al, "Instance-aware Semantic Segmentation via Multi-task Network Cascades", arXiv 2015
Instance Segmentation: detect instances, give each a category, and label its pixels: "simultaneous detection and segmentation" (SDS). Lots of recent work (MS-COCO).
Semantic segmentation, naive pipeline:
○ Extract a patch
○ Run it through a CNN
○ Classify the center pixel
○ Repeat for every pixel

Faster: run a "fully convolutional" network to get scores for all pixels at once. The output is smaller than the input due to pooling.
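The patchwise/fully-convolutional equivalence can be sketched numerically (toy sizes; a single 3x3 filter stands in for the whole CNN, which is an assumption for illustration only):

```python
import numpy as np

# Sketch (toy sizes, one-filter "network"): classifying a patch around
# every pixel gives the same score map as one fully convolutional pass
# of the same weights over the whole image.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
filt = rng.standard_normal((3, 3))     # the "CNN": a single 3x3 filter

# Naive pipeline: extract a patch per location, classify its center
patchwise = np.array([[np.sum(image[i:i+3, j:j+3] * filt)
                       for j in range(6)] for i in range(6)])

# Fully convolutional: slide the same weights over the image in one go
windows = np.lib.stride_tricks.sliding_window_view(image, (3, 3))
fully_conv = np.einsum('ijkl,kl->ij', windows, filt)

assert np.allclose(patchwise, fully_conv)   # same scores, one pass
assert fully_conv.shape == (6, 6)           # smaller output (no padding)
```

Note the 6 x 6 output from an 8 x 8 input: sharing the computation does not remove the shrinking effect of valid convolutions and pooling.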
Multi-scale approach (Farabet et al, "Learning Hierarchical Features for Scene Labeling", TPAMI 2013):
○ Resize the image to multiple scales
○ Run one CNN per scale
○ Upscale the outputs and concatenate
○ External "bottom-up" segmentation
○ Combine everything for the final outputs
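The multi-scale recipe can be sketched as follows (a hypothetical one-filter "CNN", average-pool downscaling, and nearest-neighbor upsampling are all stand-in assumptions):

```python
import numpy as np

# Sketch: run the same weights at several scales, upsample each score
# map back to full resolution, and stack the maps as channels.
rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
filt = rng.standard_normal((3, 3))

def run_cnn(x):
    # stand-in for a CNN: one 3x3 cross-correlation with zero padding
    xp = np.pad(x, 1)
    win = np.lib.stride_tricks.sliding_window_view(xp, (3, 3))
    return np.einsum('ijkl,kl->ij', win, filt)

def downscale(x, k):
    # average-pool by factor k (assumes the shape is divisible by k)
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def upscale(x, k):
    # nearest-neighbor upsampling by factor k
    return np.repeat(np.repeat(x, k, axis=0), k, axis=1)

maps = [upscale(run_cnn(downscale(image, k)), k) for k in (1, 2, 4)]
features = np.stack(maps)        # three "channels" of 16x16 scores
assert features.shape == (3, 16, 16)
```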
Recurrent refinement (Pinheiro and Collobert, "Recurrent Convolutional Neural Networks for Scene Labeling", ICML 2014):
○ Apply the CNN once to get labels
○ Apply it AGAIN to refine the labels. And again!
○ Same CNN weights each time: a recurrent convolutional network
○ More iterations improve results
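The weight-sharing idea can be sketched in a few lines (a toy one-filter "CNN" is an assumption; the real model also re-feeds the image at each step):

```python
import numpy as np

# Sketch of recurrent refinement: the SAME weights are applied
# repeatedly, each pass refining the previous label map.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
shared_filt = rng.standard_normal((3, 3)) * 0.1   # one set of weights

def cnn(x):
    # one 3x3 pass with zero padding, squashed to keep values bounded
    xp = np.pad(x, 1)
    win = np.lib.stride_tricks.sliding_window_view(xp, (3, 3))
    return np.tanh(np.einsum('ijkl,kl->ij', win, shared_filt))

labels = cnn(image)        # first pass: coarse labels
for _ in range(2):         # refine twice, reusing the SAME weights
    labels = cnn(labels)
assert labels.shape == image.shape
```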
Fully convolutional networks (Long, Shelhamer, and Darrell, "Fully Convolutional Networks for Semantic Segmentation", CVPR 2015):
○ Learnable upsampling!
○ "Skip connections" from earlier layers
○ Skip connections = better results
Learnable upsampling: "deconvolution"

Typical 3 x 3 convolution, stride 1, pad 1: input 4 x 4, output 4 x 4. Each output is a dot product between the filter and a window of the input.

Typical 3 x 3 convolution, stride 2, pad 1: input 4 x 4, output 2 x 2. Same dot products, but the window moves 2 pixels per output.

3 x 3 "deconvolution", stride 2, pad 1: input 2 x 2, output 4 x 4. Now each input value gives the weight for a copy of the filter stamped into the output; where the stamped copies overlap, the contributions are summed. This is the same as the backward pass for a normal convolution!

"Deconvolution" is a bad name: it is already defined as the inverse of convolution. Better names: convolution transpose, backward strided convolution, 1/2 strided convolution, upconvolution.
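The "convolution transpose" name can be made literal in a small sketch (toy 4 x 4 / 2 x 2 sizes as on the slide; the helper `conv_matrix` is an illustrative construction, not a library call):

```python
import numpy as np

# Sketch: "deconvolution" really is the transpose of a convolution.
# Build the matrix of a 3x3 conv, stride 2, pad 1 (4x4 input -> 2x2
# output), then apply its transpose to a 2x2 input to get a 4x4 output.
rng = np.random.default_rng(0)
filt = rng.standard_normal((3, 3))

def conv_matrix(filt, in_size=4, stride=2, pad=1):
    # One row per output pixel; columns index flattened input pixels.
    out_size = (in_size + 2 * pad - 3) // stride + 1
    W = np.zeros((out_size * out_size, in_size * in_size))
    for oi in range(out_size):
        for oj in range(out_size):
            for fi in range(3):
                for fj in range(3):
                    ii = oi * stride + fi - pad
                    jj = oj * stride + fj - pad
                    if 0 <= ii < in_size and 0 <= jj < in_size:
                        W[oi * out_size + oj, ii * in_size + jj] = filt[fi, fj]
    return W

W = conv_matrix(filt)                 # shape (4, 16): the 4x4 -> 2x2 conv
x = rng.standard_normal((2, 2))       # small input to upsample
y = (W.T @ x.ravel()).reshape(4, 4)   # convolution transpose: 2x2 -> 4x4

# Same thing by "stamping": each input value weights a copy of the
# filter placed in the output; overlapping copies are summed.
y_stamp = np.zeros((4, 4))
for i in range(2):
    for j in range(2):
        for fi in range(3):
            for fj in range(3):
                ii, jj = i * 2 + fi - 1, j * 2 + fj - 1
                if 0 <= ii < 4 and 0 <= jj < 4:
                    y_stamp[ii, jj] += x[i, j] * filt[fi, fj]
assert np.allclose(y, y_stamp)
```

Multiplying by `W.T` is also exactly how gradients flow backward through the forward convolution, which is why the slide calls it the backward pass of a normal convolution.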
Upconvolutions also show up in image generation:
Im et al, "Generating images with recurrent adversarial networks", arXiv 2016 (great explanation in the appendix)
Radford et al, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", ICLR 2016
Noh et al, "Learning Deconvolution Network for Semantic Segmentation", ICCV 2015: a normal VGG followed by an "upside down" VGG of upconvolutions. 6 days of training on a Titan X...
Instance Segmentation (figure credit: Dai et al, "Instance-aware Semantic Segmentation via Multi-task Network Cascades", arXiv 2015)
SDS (Hariharan et al, "Simultaneous Detection and Segmentation", ECCV 2014): similar to R-CNN, but with segments:
○ External segment proposals
○ Mask out the background with the mean image
Hariharan et al, "Hypercolumns for Object Segmentation and Fine-grained Localization", CVPR 2015
Multi-task Network Cascades (Dai et al, "Instance-aware Semantic Segmentation via Multi-task Network Cascades", arXiv 2015): similar to Faster R-CNN; won the COCO 2015 challenge (with ResNet).
○ Region proposal network (RPN)
○ Reshape boxes to a fixed size; figure / ground logistic regression
○ Mask out the background, predict the object class
○ Learn the entire model end-to-end!
(Figure: predictions vs. ground truth; Dai et al, arXiv 2015.)
Segmentation recap:

Semantic segmentation
○ Classify all pixels
○ Fully convolutional models; downsample, then upsample
○ Learnable upsampling: fractionally strided convolution
○ Skip connections can help

Instance segmentation
○ Detect instances, generate a mask
○ Similar pipelines to object detection

Attention
Soft attention for captioning. Recall RNN captioning without attention:
○ A CNN maps the image (H x W x 3) to a single feature vector (D-dimensional)
○ The features set up the initial hidden state h0 (H-dimensional)
○ h1 produces a distribution d1 over words; the first word y1 is fed back in, giving h2, d2, y2, and so on
The RNN only looks at the whole image, once. What if the RNN looks at different parts of the image at each timestep?
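The no-attention loop can be sketched with toy numbers (all sizes, the weight matrices `Wfh`/`Whh`/`Wyh`/`Why`, and the greedy argmax decoding are assumptions for illustration):

```python
import numpy as np

# Sketch: one image feature vector initializes h once; after that, each
# step emits a word distribution and feeds the chosen word back in.
rng = np.random.default_rng(0)
D, H, V = 8, 6, 5                  # feature, hidden, vocab sizes (toy)
feats = rng.standard_normal(D)     # CNN(image) -> D-dim features
Wfh = rng.standard_normal((H, D)) * 0.1
Whh = rng.standard_normal((H, H)) * 0.1
Wyh = rng.standard_normal((H, V)) * 0.1
Why = rng.standard_normal((V, H)) * 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

h = np.tanh(Wfh @ feats)           # h0 from image features (used once!)
caption = []
for _ in range(4):
    d = softmax(Why @ h)           # distribution over the vocabulary
    word = int(d.argmax())         # greedy decoding for the sketch
    caption.append(word)
    onehot = np.zeros(V); onehot[word] = 1.0
    h = np.tanh(Whh @ h + Wyh @ onehot)  # feed the word back in
assert len(caption) == 4
```

Note the image enters only through h0; nothing lets a later word re-inspect part of the image, which is exactly what attention adds.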
Show, Attend and Tell (Xu et al, "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention", ICML 2015): the CNN now produces a grid of features (L locations x D dimensions) instead of a single vector.
○ From h0, compute a1, a distribution over the L locations
○ z1: a weighted combination of the features (D-dimensional), weighted by a1
○ h1 is computed from h0 and z1; it produces the first word y1 (via word distribution d1) and the next attention distribution a2
○ Repeat: each new hidden state takes the previous word and the newly attended features, and emits a word plus the next attention distribution
Guess which framework was used to implement this? Crazy RNN = Theano.
Soft vs. hard attention (Xu et al, ICML 2015). The CNN gives a grid of features a, b, c, d (each D-dimensional); from the RNN comes a distribution over grid locations: pa + pb + pc + pd = 1. The context vector z is D-dimensional.

Soft attention: summarize ALL locations, z = pa·a + pb·b + pc·c + pd·d. The derivative dz/dp is nice! Train with gradient descent.

Hard attention: sample ONE location according to p; z = that feature vector. With argmax, dz/dp is zero almost everywhere... can't use gradient descent; need reinforcement learning.
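The soft-attention readout is tiny in code (toy sizes; the scores would really come from the RNN state):

```python
import numpy as np

# Soft attention: the context vector z is a weighted sum of the grid
# features, so it is differentiable in the attention weights p and
# trainable by plain gradient descent.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 3))   # rows = features a, b, c, d (D=3)

scores = rng.standard_normal(4)       # in the real model: from the RNN
p = np.exp(scores) / np.exp(scores).sum()   # pa + pb + pc + pd = 1
z = p @ feats                         # z = pa*a + pb*b + pc*c + pd*d

# Hard attention would instead pick ONE row, e.g. feats[p.argmax()];
# that selection is not differentiable in p.
assert np.isclose(p.sum(), 1.0)
assert z.shape == (3,)
```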
(Figure: soft attention vs. hard attention maps; Xu et al, ICML 2015.)

The attention here is constrained to a fixed grid! We'll come back to this...
Attention also works for machine translation (Bahdanau et al, "Neural Machine Translation by Jointly Learning to Align and Translate", ICLR 2015): at each step of decoding, the model computes a distribution over input words.
More applications of soft attention:
○ Speech recognition, attention over input sounds: "...Speech Recognition", NIPS 2015
○ Image + question to answer, attention over the image: "...Question-Guided Spatial Attention for Visual Question Answering", arXiv 2015; "...Images", arXiv 2015
○ Video captioning, attention over input frames: "...Structure", ICCV 2015
○ Machine translation, attention over the input: "...based Neural Machine Translation", EMNLP 2015
The attention mechanism from Show, Attend and Tell (image: H x W x 3, features: L x D) only lets us softly attend to fixed grid positions... can we do better?
Attending to arbitrary regions: Graves, "Generating Sequences with Recurrent Neural Networks", arXiv 2013. Generates handwriting by predicting the parameters of a mixture model. Which samples are real and which are generated?
DRAW (Gregor et al, "DRAW: A Recurrent Neural Network For Image Generation", ICML 2015): classify images by attending to arbitrary regions of the input; generate images by attending to arbitrary regions of the output.
Spatial Transformer Networks (Jaderberg et al, "Spatial Transformer Networks", NIPS 2015): an attention mechanism similar to DRAW, but easier to explain.

Cropping as a function: input image (H x W x 3) plus box coordinates (xc, yc, w, h) -> cropped and rescaled image (X x Y x 3). Can we make this function differentiable?

Idea: a function maps pixel coordinates (xt, yt) of the output to pixel coordinates (xs, ys) of the input (in the paper, an affine transform with parameters θ). Repeat for all pixels in the output to get a sampling grid, then use bilinear interpolation to compute the output. The network attends to the input by predicting θ.
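The sampling step can be sketched directly (all sizes and the particular `theta` below are made-up assumptions; a zoom-into-the-center transform keeps the sampled coordinates in bounds):

```python
import numpy as np

# Sketch of spatial-transformer sampling: an affine map sends each
# output pixel (xt, yt) to input coordinates (xs, ys), and bilinear
# interpolation reads off the value. Every operation is differentiable
# in theta, so gradients can flow into the predicted transform.
rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))      # input image (grayscale, H x W)
theta = np.array([[0.5, 0.0, 3.5],     # 2x3 affine parameters: here a
                  [0.0, 0.5, 3.5]])    # zoom into the center of img

def bilinear(img, xs, ys):
    # sample img at real-valued (xs, ys); assumes in-bounds coordinates
    x0, y0 = int(np.floor(xs)), int(np.floor(ys))
    x1, y1 = min(x0 + 1, img.shape[1] - 1), min(y0 + 1, img.shape[0] - 1)
    dx, dy = xs - x0, ys - y0
    return ((1 - dx) * (1 - dy) * img[y0, x0] + dx * (1 - dy) * img[y0, x1]
            + (1 - dx) * dy * img[y1, x0] + dx * dy * img[y1, x1])

out = np.zeros((4, 4))                 # output crop, X x Y
for yt in range(4):
    for xt in range(4):
        xs, ys = theta @ np.array([xt, yt, 1.0])  # sampling grid point
        out[yt, xt] = bilinear(img, xs, ys)
assert out.shape == (4, 4)
```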
The module:
○ Input: full image; Output: region of interest from the input
○ A small localization network predicts the transform θ
○ A grid generator uses θ to compute the sampling grid
○ A sampler uses bilinear interpolation to produce the output
A differentiable "attention / transformation" module: insert spatial transformers into a classification network and it learns to attend to and transform the input.