Visual Parsing with Weak Supervision Jia Xu Department of Computer Sciences University of Wisconsin-Madison 2015-07-30

Outline: Introduction, Object Segmentation, Scene Parsing, Video Parsing, Discussion

Research Goal
Teach computers to see at or beyond human level
  Interpret, summarize, and organize visual data on the Internet
  Help the disabled population (e.g., the blind)

Visual Parsing
Fundamental task: semantically parse every pixel in images and videos
First step towards high-level applications: self-driving cars, unmanned aerial vehicles, wearable glasses

Visual Parsing
Fundamental task: turning visual data into knowledge
Every day: > 3.5 million, > 300 million, > 150,000 hours [figure: daily volume of visual data per source; source logos not preserved]
Never Ending Language Learning (Mitchell et al., 2009)
Never Ending Image Learner (Chen et al., 2013)

Challenges
Modern image datasets [figure: log(dataset size) versus annotation information]
  Noisy labels: > 6 billion
  Image-level labels: > 14 million
  Bounding boxes: ~ 1 million
  Segmentations: ~ 5000
Far fewer segmentations are annotated for videos!

Motivation
Bottleneck of fully supervised methods
  Full annotation is expensive to collect and limited in size
Why weakly supervised learning
  Weak supervision is easier to obtain, e.g., gaze
  Large datasets with side/weak annotations are readily available: metadata, tags, text
  Visual data reflects the physical world: shape, geometry, context

My Thesis Research
How can we utilize weakly labeled data effectively for the visual parsing task?
When a human enters the visual parsing loop, how can we minimize user effort while still achieving satisfactory parsing results?

Roadmap
Chapter   Parsing Task          Weak Supervision                                    Publication
Ch. 2     Object Segmentation   User Indication                                     CVPR 2013
Ch. 3     Scene Parsing         Image-level Tags                                    CVPR 2014
Ch. 4     Scene Parsing         Image-level Tags, Bounding Boxes, Partial Labels    CVPR 2015a
Ch. 5     Video Segmentation    Side Knowledge                                      ICCV 2013
Ch. 6     Video Summarization   Human Gaze                                          CVPR 2015b

Object Segmentation

Interactive Object Segmentation
Main challenges
  1. Semantic gap: what is an object?
  2. Ambiguity of user intention: which object do you want?
A few user scribbles can make segmentation much easier!

Related Work
Region-based: Graphcut (Boykov and Jolly, 2001), Grabcut (Rother et al., 2004), Random Walks (Grady, 2006), Geodesic Shortest Path (Bai and Sapiro, 2009), Geodesic Star Convexity (Gulshan et al., 2010)
Edge-based: Intelligent Scissors (Mortensen and Barrett, 1998), LabelMe (Russell et al., 2008)
[Figure panels: GraphCut, GrabCut, Intelligent Scissors, LabelMe]

Our Ideas (EulerSeg)
Objective: model topological constraints while concurrently finding one or more minimum-energy closed contours that satisfy:
  Foreground seeds must be "inside"
  Background seeds must be "outside"
[X., Collins, Singh, CVPR 2013]

Our Ideas (EulerSeg)
Main advantages
  1. Basic primitives are edgelets (little dependence on the number of pixels)
  2. Dense strokes are not needed to learn an appearance model. Results do NOT vary with seed location (interaction constraints are completely geometric in form)
  3. Incorporating connectedness priors and specifying the number of closures are easy (Euler characteristic)

Graph Representation
  $x$: face indicator vector
  $y$: edge indicator vector
  $z$: vertex indicator vector
  $w$: indicator vector for foreground boundary edges
Internal edges ($y_i \neq w_i = 0$) are black, while boundary edges ($y_i = w_i = 1$) are red.

Discrete Calculus
[Figure: cell orientation for vertex, edge, and face; coherent versus anti-coherent]
Vertex-edge incidence matrix: $A_1 = A$, $A_2 = A_1 \mathbin{./} D$
$$A_{v_k, e_{ij}} = \begin{cases} 1 & k = i, j \\ 0 & \text{otherwise} \end{cases}$$
[Grady and Polimeni, 2010]
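
The vertex-edge incidence definition can be illustrated with a minimal sketch that builds the unsigned matrix from an edge list. The edge list is a hypothetical five-vertex graph chosen for illustration, and the helper name `vertex_edge_incidence` is an assumption, not something from the slides. The matrix $D$ and the derived $A_2$ are left out because the slide does not define $D$.

```python
import numpy as np

def vertex_edge_incidence(num_vertices, edges):
    """Unsigned incidence: A[k, e] = 1 if vertex k is an endpoint of edge e, else 0."""
    A = np.zeros((num_vertices, len(edges)), dtype=int)
    for e, (i, j) in enumerate(edges):
        A[i, e] = 1
        A[j, e] = 1
    return A

# Hypothetical edge list (0-based vertex indices), for illustration only.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
A = vertex_edge_incidence(5, edges)

# Every edge has exactly two endpoints, so each column sums to 2;
# the row sums give the vertex degrees.
column_sums = A.sum(axis=0)
degrees = A.sum(axis=1)
```

Rows index vertices and columns index edges, matching the $A_{v_k, e_{ij}}$ subscript order.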

Discrete Calculus
Edge-face incidence matrix: $C_1 = C$, $C_2 = |C|$
$$C_{e,f} = \begin{cases} +1 & e \text{ is incident to } f \text{ and coherently oriented} \\ -1 & e \text{ is incident to } f \text{ and anti-coherently oriented} \\ 0 & \text{otherwise} \end{cases}$$
[Grady and Polimeni, 2010]

An Example
[Figure: planar graph with vertices $v_1, \dots, v_5$, edges $e_1, \dots, e_7$, and faces $f_1, f_2, f_3$]
$$C = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \\ 0 & 0 & 1 \end{bmatrix}, \quad x = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad b = Cx = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 1 \\ -1 \\ 0 \\ 0 \end{bmatrix}$$
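
The product $b = Cx$ in this example can be checked numerically. The sketch below assumes the edge-face incidence matrix as read from the slide (rows $e_1$ to $e_7$, columns $f_1$ to $f_3$), so the individual entries are a reconstruction and not a verified copy. The names `boundary_edges` and `internal_edges` are illustrative.

```python
import numpy as np

# Edge-face incidence matrix as read from the example slide
# (rows e1..e7, columns f1..f3); the entries are a reconstruction.
C = np.array([
    [ 1,  0,  0],
    [-1,  0,  0],
    [-1,  1,  0],
    [ 0,  1,  0],
    [ 0, -1,  1],
    [ 0,  0, -1],
    [ 0,  0,  1],
])
x = np.array([1, 1, 0])  # select faces f1 and f2

# Signed contributions cancel on edges shared by two selected faces,
# so the nonzero entries of b mark the boundary of the selected region.
b = C @ x
boundary_edges = [f"e{k + 1}" for k in np.flatnonzero(b)]

# With the unsigned matrix |C|, an edge shared by two selected faces counts twice.
internal_edges = [f"e{k + 1}" for k in np.flatnonzero((np.abs(C) @ x) == 2)]
```

Under these assumptions, the edge shared by the two selected faces drops out of $b$ while the outer edges of the selected region remain nonzero.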

Euler Characteristic
[Figure: example graph with vertices $v_1, \dots, v_5$, edges $e_1, \dots, e_7$, and faces $f_1, f_2, f_3$]
Number of faces ($\mathbf{1}^T x$): 2
Number of nodes ($\mathbf{1}^T z$): 4
Number of edges ($\mathbf{1}^T y$): 5
Number of connected components ($\mathbf{1}^T x + \mathbf{1}^T z - \mathbf{1}^T y$):
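
The counts on this slide can be reproduced with a short sketch. The face-to-edge and edge-to-vertex memberships below are assumptions inferred from the figure labels (they are not stated on the slide), and `euler_characteristic` is a hypothetical helper name. With the slide's counts of 2 faces, 4 nodes, and 5 edges, the expression $\mathbf{1}^T x + \mathbf{1}^T z - \mathbf{1}^T y$ evaluates to $2 + 4 - 5 = 1$.

```python
import numpy as np

# Assumed memberships for the example graph (0-based indices);
# inferred from the figure labels, not given explicitly on the slide.
face_edges = {0: [0, 1, 2], 1: [2, 3, 4], 2: [4, 5, 6]}
edge_vertices = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]

def euler_characteristic(x):
    """Return (faces, vertices, edges, chi) for the face selection x."""
    y = np.zeros(len(edge_vertices), dtype=int)  # edge indicator
    z = np.zeros(5, dtype=int)                   # vertex indicator
    for f, selected in enumerate(x):
        if selected:
            for e in face_edges[f]:
                y[e] = 1
                for v in edge_vertices[e]:
                    z[v] = 1
    n_faces, n_vertices, n_edges = int(np.sum(x)), int(z.sum()), int(y.sum())
    return n_faces, n_vertices, n_edges, n_faces + n_vertices - n_edges

counts = euler_characteristic([1, 1, 0])    # faces f1 and f2 selected
all_faces = euler_characteristic([1, 1, 1])
```

Here `y` and `z` are derived from `x` by marking every edge and vertex that touches a selected face, which is one simple way to keep the three indicator vectors consistent.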
