CS6501: Deep Learning for Visual Recognition Recognizing People in - PowerPoint PPT Presentation

Mar 12, 2023

CS6501: Deep Learning for Visual Recognition Recognizing People in Images
Today's Class
• Face Detection
• Face Matching, and any type of matching
• Pose Estimation
Face Detection
Face Detection: Viola-Jones Face Detector (circa 2001)
1. Compute Haar-like rectangle features across the image
2. Use a shallow classifier, e.g. AdaBoost
3. Non-Max Suppression
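Step 3 can be illustrated with a minimal sketch of greedy non-max suppression. This is not the Viola-Jones implementation itself; the box format `(x1, y1, x2, y2)`, the function names, and the 0.5 overlap threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    # Visit detections from most to least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes on one face, one elsewhere.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

Here the second box overlaps the first heavily and has a lower score, so only the first and third detections survive.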
Face Detection: Any Object Detector https://towardsdatascience.com/faced-cpu-real-time-face-detection-using-deep-learning-1488681c1602
Face Detection can be Hard WIDER FACE dataset.
Person Identification: Simplest Case. Classify among the k people in your database.
Face Matching, and Just Matching Things. Are these pairs of images instances of the same thing?
Matching Things: Siamese Networks. Find a neural network such that if two instances of the same thing are fed into the network, the outputs are similar under some simple distance metric. This is also called the embedding problem. Reference: Chopra, Hadsell, and LeCun, "Learning a Similarity Metric Discriminatively, with Application to Face Verification."
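As a rough sketch of how such embeddings could be used at test time: once a network produces an embedding per image, matching reduces to thresholding a distance. The embedding values, the Euclidean metric, and the threshold of 1.0 below are illustrative assumptions, not values from the slides.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def same_identity(emb_a, emb_b, threshold=1.0):
    """Declare a match when the embeddings are closer than the threshold."""
    return euclidean(emb_a, emb_b) < threshold


# Hypothetical embeddings, standing in for network outputs f(x).
face_1 = [0.1, 0.2]
face_2 = [0.1, 0.25]   # close to face_1
face_3 = [3.1, 4.2]    # far from face_1
match_12 = same_identity(face_1, face_2)
match_13 = same_identity(face_1, face_3)
```

In practice the threshold would be tuned on held-out pairs rather than fixed by hand.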
Matching Things: Siamese Networks. $x_1 \to f(x_1)$, $x_2 \to f(x_2)$. FaceNet: A Unified Embedding for Face Recognition and Clustering https://arxiv.org/pdf/1503.03832v1.pdf
Matching Things: Siamese Networks. If $x_1$ and $x_2$ are the same person, then minimize: $\lvert f(x_1) - f(x_2) \rvert$. Beware of Trivial Solutions! FaceNet: A Unified Embedding for Face Recognition and Clustering https://arxiv.org/pdf/1503.03832v1.pdf
Matching Things: Siamese Networks. If $x_1$ and $x_3$ are not the same person, then minimize: $-\lvert f(x_1) - f(x_3) \rvert$. FaceNet: A Unified Embedding for Face Recognition and Clustering https://arxiv.org/pdf/1503.03832v1.pdf
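The two pairwise objectives can be combined into one loss. The sketch below uses the margin-based contrastive loss commonly associated with Hadsell, Chopra, and LeCun rather than the raw negative distance on the slide; the margin value and function names are assumptions for illustration. The margin keeps the "different" term bounded, and that term is also what penalizes the trivial solution of mapping every input to the same point.

```python
import math


def distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Pull same-identity pairs together, push different pairs apart.

    Different pairs stop contributing once they are at least `margin` apart.
    """
    d = distance(emb_a, emb_b)
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# A collapsed embedding (everything maps to the origin) gets zero loss on
# same pairs but is penalized on different pairs.
collapsed_same = contrastive_loss([0.0, 0.0], [0.0, 0.0], same=True)
collapsed_diff = contrastive_loss([0.0, 0.0], [0.0, 0.0], same=False)
far_diff = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False)
```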
Better Idea: Triplet Loss, e.g. FaceNet. Given embeddings $f(x_1)$, $f(x_2)$, $f(x_3)$, minimize the following loss over every possible triplet: $\sum \left( \lVert f(x_1) - f(x_2) \rVert - \lVert f(x_1) - f(x_3) \rVert + \alpha \right)$. FaceNet: A Unified Embedding for Face Recognition and Clustering https://arxiv.org/pdf/1503.03832v1.pdf
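A minimal sketch of the triplet loss for a single triplet follows. It uses the FaceNet paper's form, with squared Euclidean distances and the result clamped at zero; the margin value 0.2 and the function names are assumptions for illustration.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least the margin alpha.
    """
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + alpha)


# Hypothetical embeddings: an easy triplet (already satisfied) and a violated one.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Summing this quantity over a set of triplets gives the training objective; easy triplets contribute nothing, which motivates choosing triplets carefully.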
Better Idea: Select Triplets that are Hard. Given embeddings $f(x_1)$, $f(x_2)$, $f(x_3)$, minimize the following loss over the selected triplets: $\sum \left( \lVert f(x_1) - f(x_2) \rVert - \lVert f(x_1) - f(x_3) \rVert + \alpha \right)$. FaceNet: A Unified Embedding for Face Recognition and Clustering https://arxiv.org/pdf/1503.03832v1.pdf
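One simple way to pick hard triplets within a batch is to choose, for each anchor, the closest embedding that carries a different identity label. The sketch below shows only that selection step; the function name, labels, and embedding values are hypothetical. The FaceNet paper itself discusses a "semi-hard" variant, since always taking the very hardest negative can destabilize training.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def hardest_negative(anchor_idx, embeddings, labels):
    """Index of the closest embedding whose label differs from the anchor's.

    Returns None if the batch contains no other identity.
    """
    best_idx, best_dist = None, float("inf")
    for j, (emb, label) in enumerate(zip(embeddings, labels)):
        if label == labels[anchor_idx]:
            continue  # same identity (or the anchor itself): not a negative
        d = sq_dist(embeddings[anchor_idx], emb)
        if d < best_dist:
            best_idx, best_dist = j, d
    return best_idx


# Hypothetical batch: two images of person "a", one each of "b" and "c".
embeddings = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [5.0, 5.0]]
labels = ["a", "a", "b", "c"]
neg = hardest_negative(0, embeddings, labels)
```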
Pose Estimation http://www.stat.ucla.edu/~xianjie.chen/projects/pose_estimation/pose_estimation.html
Deep Pose https://arxiv.org/pdf/1312.4659.pdf
Results
Pose Model II: HourGlass Network Hourglass Module
Pose Model II: HourGlass Network Hourglass Network
Dense Pose http://densepose.org/
Questions?
