Fan Yang
National Engineering Laboratory for Video Technology School of EE & CS, Peking University
Multi-task Learning for Precise Object Search from Massive Images/Videos
Outline
Introduction
Motivation
Challenge
Multi-task learning for precise object search
Summary
IEEE Fellow ACM Fellow
Video Coding Lab System Lab Testing Lab New Media Lab SoC Lab
National Engineering Laboratory for Video Technology
Video coding algorithm: Wen Gao, Siwei Ma, Ruiqin Xiong
Video coding standards. Cooperation: CCTV, Huawei, AVS Industry Alliance
Intelligent video analysis: Tiejun Huang, Yonghong Tian, Wei Zeng, Yaowei Wang
Analyzing and mining surveillance videos; recognition-friendly video coding. Cooperation: China Security & Protection, Hisense
Mobile Visual Search: Linyu Duan, Shiliang Zhang
CDVS international standard. Cooperation: Baidu, Singapore media bureau
Media content analysis: Yizhou Wang, Tingting Jiang
Computer vision. Cooperation: Machine Intelligence Lab, Institute of Computing Technology, Chinese Academy of Sciences
Image/Video Chip: Xiaodong Xie, Huizhu Jia
Industrial production. Application: national defense, cameras, consumer electronics
Accelerating Video Encoding
Investigate methods for accelerating video encoding on Graphics Processing Units (GPUs).
Video Classification/Recognition for CDN Surveillance
Extend current state-of-the-art methods and further improve their performance, especially for CDN surveillance purposes.
Accelerating Compact Descriptors for Visual Search
Use GPUs to accelerate the CDVS extraction process.
Image Super-Resolution via Convolutional Neural Networks
Extend current state-of-the-art CNN-based super-resolution approaches and accelerate CNN inference time.
Introduction Motivation Challenge Multi-task learning for precise object search
Summary
The Big Data Era
Big Data collected/collecting by societies
More data has been created in the past two years than in the entire previous history of the human race. Data is growing faster than ever before and by the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.
Figure: The growth trend of Internet data (data size in EB), estimated by IDC; the share contributed by images and videos rises from 48% (at 1 EB) to about 90% (at 78.5 EB).
Surveillance Video: The Biggest Big Data
City Operation
Social Life, Public Security, Traffic, Healthcare, Surveillance Video Network, Data Center
Surveillance Video Network:
The key infrastructure of the intelligent city: >100K cameras for a middle-sized city in China
Surveillance Videos:
More than half of all big data
[online]; http://www.computer.org/web/computingnow/archive/february2014.
BUT, data is far from being analyzed and used
“Target-rich” data, i.e., data with special value, takes about 1.5% of the digital universe. To obtain such “target-rich” data, we need to analyze and mine all the data.
At the moment, less than 0.5% of all data is ever analyzed and used
Have eyes (i.e., cameras) but cannot see (i.e., recognize and search)
The Status of Current Systems: Less Smart
Boston Paris London Moscow
Surveillance Video Analysis
To develop intelligent algorithms, technologies and systems that can detect/recognize/search specific objects (e.g., pedestrian, vehicle), behavior, or events.
Enabling Technologies
Background modeling Object detection/tracking (e.g., pedestrian, vehicle) Object recognition (e.g., face) Object re-identification and search Action/Behavior detection/recognition (Abnormal) Event detection Crowd analysis Cross-camera tracking …
A Challenging Problem
How can we search for a specific object in massive images/videos?
NOT for visually similar objects BUT for exactly the same object
Detection and classification Precise object search
Gallery Query
ID=1 ID=2 ID=3 … …
Precise Object Search
Task: to search for a specific object in a large-scale dataset that contains many visually similar objects captured from different camera networks.
Search as Similarity Ranking (SaS) Search as Recognition (SaR)
Precise person search Precise vehicle search
Car Monitoring 2 Car Monitoring 3 Car Monitoring N Tollgate
Example: Detecting a Fake License Plate
Car Registry Database
Peugeot 206
Honda Accord: fake plate
Search Engine
Search Engine
Example: Tracing a Suspicious Vehicle
2014.10.19 10:12:11
2014.10.19 10:22:32 2014.10.19 10:36:33 2014.10.19 10:42:15 2014.10.19 12:42:11 2014.10.19 13:02:18
From Search to Recognition
Precise object recognition: The ultimate goal
To date, no recognition technology (including vehicle license plate recognition and face recognition) can achieve sufficiently high precision in an unconstrained environment
The success stories of Google and Baidu tell us: search can help, and in some cases even substitute for, recognition.
The task aims to find visually similar objects in a large database through visual similarity measurement and ranking.
In most cases, the returned objects that are visually similar (e.g., within the same (sub-)category, or having the same attributes such as color) are treated as correct.
Query Returned List
...
Recent Work: Deep Learning for Visual Search
Three Schemes
Direct Representation Refining by Similarity Learning Refining by Model Retraining
Wan J., Wang D., Hoi S.C.H., et al. Deep learning for content-based image retrieval: A comprehensive study. ACM MM 2014.
Refine with class labels (classification loss) Refine with side information (similarity rank loss)
Recent Work: Large-scale Clothes Image Retrieval
Cross-domain Image Retrieval
Given a user photo depicting a clothing image, the goal is to retrieve the same or attribute-similar clothing items from online shopping stores
Dual Attribute-aware Ranking Network: feature learning is driven by semantic attribute learning and a triplet visual similarity constraint.
Huang, Junshi, et al. "Cross-domain image retrieval with a dual attribute-aware ranking network." ICCV 2015.
Introduction Motivation Challenge Multi-task learning for precise object search
Summary
Challenge 1: Hard to Retrieve
Figure: Dataset sizes (number of images vs. number of classes):
CIFAR-100: 60K images, 100 classes
Caltech-256: 30K images, 256 classes
ImageNet-ILSVRC'12: 1.2M images, 1,000 classes
ImageNet: 14M images, 220K classes
Vehicle images in a province: 2.2B images, ~15M classes
Datasize-Recognition Gap
The exponentially increasing size of images and videos presents a grand challenge to pattern recognition!
Using a unified framework for analysis, recognition and search over images/videos captured in an unconstrained environment
1) Huge amounts of videos; 2) Different imaging views, illuminations, environmental conditions and image quality; 3) Visual appearance changes of the suspicious person/vehicle; 4) Other factors (e.g., lack of training data)
Zhou Kehua Case London Underground bombings Changchun Car Theft Case
Challenge 2: Hard to Identify
Difficult to distinguish different objects with similar appearance (e.g., vehicles of the same color and model)
Camera view, distance, illumination variations
Different Same
Challenge 2: Hard to Identify
Cannot depend on strong identification information such as faces or vehicle license plate numbers
Faces are unavailable in most real-world surveillance cameras; vehicle license plates may be faked. Face Image Retrieval Scenario [Li, ICCV2015]: How to search given these pictures?
✓ No frontal face image is available ✓ With some facial makeup ✓ Don't know who he is
ID Face Database Surveillance Face Database
It is also challenging because…
Introduction Motivation Challenge Multi-task learning for precise object search
Summary
Multi-task learning
Definition in Wikipedia
Multi-task learning (MTL) is an approach to machine learning that learns a problem together with other related problems at the same time, using a shared representation
Motivation
Address multiple tasks with a unified model; utilize the intrinsic relatedness between different tasks
Multi-task learning
The main question: how to learn?
1) Combine features in different tasks together 2) Share hidden nodes or model parameters across different tasks
Mixing different features together: color, shape, texture and edge features feed one image classification model that serves Tasks 1, 2 and 3.
Sharing hidden nodes in a deep neural network: per-task inputs (Input 1-3) and outputs (Output 1-3), with shared hidden layers between the input and output layers.
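The hidden-node-sharing strategy above can be sketched in a few lines (a minimal pure-Python illustration; the layer sizes and task names are hypothetical):

```python
import random

random.seed(0)

def linear(x, w):
    """One fully connected layer without bias: matrix-vector product."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def relu(v):
    return [max(0.0, a) for a in v]

# Shared hidden layer (hypothetical sizes): 4 input features -> 3 hidden units.
W_shared = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
# One task-specific output head per task: 3 hidden units -> 1 output each.
W_task = {t: [[random.uniform(-1, 1) for _ in range(3)]]
          for t in ("task1", "task2", "task3")}

def forward(x, task):
    """All tasks reuse the same hidden representation; only the head differs."""
    h = relu(linear(x, W_shared))
    return linear(h, W_task[task])[0]

x = [0.5, -1.0, 2.0, 0.1]
outputs = {t: forward(x, t) for t in W_task}
```

Because the hidden layer is shared, gradients from any task would update W_shared, so related tasks regularize one another.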
Multi-classes Classification
AlexNet
Classify 1,000 classes within a unified model
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, Imagenet classification with deep convolutional neural networks, NIPS 2012
Object Detection
Fast R-CNN
Two tasks:
Image classification: Softmax over RoI features. Region detection: bounding-box regression.
Motorbike 0.9 Person 0.6 Ross Girshick, Fast R-CNN, ICCV 2015
Multi-task
Introduction Motivation Challenge Multi-task learning for precise object search
Summary
What is Person Re-ID?
Definition
Person re-identification (Re-ID) is the problem of matching people across non-overlapping camera views.
Challenges
A person's appearance often changes dramatically across camera views due to changes in body pose, view angle, etc.
More variations for non-rigid objects
Key challenge for precise person search
The drawbacks of person re-identification
Unsupervised methods: weak performance
Without labelled matching pairs across camera views, existing unsupervised models are unable to learn what makes a person recognizable under remarkable appearance changes.
Supervised methods: poor scalability
Existing supervised models need labelled data for each dataset. Eye-balling two views to annotate correctly matching pairs among hundreds of images is a tough job even for humans. For a camera network, the labelling cost would be prohibitively high.
300 camera pairs need to be labelled for a campus surveillance system (25 cameras)!!!
Deep Re-ID (Pair-wise)
Similarity estimation as a binary classification task: process two images at once; no explicit feature representation for each sample; different architectures across different methods
Framework of most pair-wise networks Deep Architecture 1
Same individual Different individuals
Single-task
Deep Re-ID (Pair-wise)
Siamese Network
Jointly learn color features, texture features and the metric in a unified framework; two sub-networks for feature extraction
Framework Different distance or similarity functions
Dong Yi, Zhen Lei, Stan Z. Li, Deep Metric Learning for Practical Person Re-Identification, ICPR 2014
Deep Re-ID (Pair-wise)
DeepReID
Filter pairing neural network (FPNN)
Distance measurement in the middle (patch) level Patch matching (maxout pooling)
Wei Li, Rui Zhao, Tong Xiao, Xiaogang Wang, DeepReID: Deep Filter Pairing Neural Network for Person Re-Identification, CVPR 2014
Figure: Responses of filter pairs (indicated by the colors yellow, purple, green and white) after passing the patch-matching layer. Without maxout grouping, each matrix has only one patch with a large response. Right: grouping four channels together and taking the maximum value forms a single-channel output, in which a line structure is formed.

Deep Re-ID (Pair-wise)
Cross-Input Neighborhood Differences Network
Capture local relationships in mid-level features A new layer to handle viewpoint variation across different camera views
Ejaz Ahmed, Michael Jones, Tim K. Marks, An Improved Deep Learning Architecture for Person Re-Identification, CVPR 2015
Deep Re-ID (Triplet)
Learn a feature representation explicitly via CNN
Raw image X -> Feature vector F(X)
Triplet units in training phase
Reference sample 𝑃1 Positive sample 𝑃2 Negative sample 𝑃3
Shengyong Ding, Liang Lin, Guangrun Wang, Hongyang Chao, Deep feature learning with relative distance comparison for person re-identification, Pattern Recognition 2015
Figure: Triplet unit for training: 𝑃1, 𝑃2 and 𝑃3 each pass through convolution, max pooling and fully connected layers.
Deep Re-ID (Triplet)
Relative distance constraint over F(X):
Pull images of the same individual closer; push images of different individuals further apart:
‖𝐺(𝑃1) − 𝐺(𝑃2)‖₂² < ‖𝐺(𝑃1) − 𝐺(𝑃3)‖₂²  (triplet loss)
Shengyong Ding, Liang Lin, Guangrun Wang, Hongyang Chao, Deep feature learning with relative distance comparison for person re-identification, Pattern Recognition 2015
Single-task Positive pair Negative pair
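The relative-distance constraint is usually trained with a hinge-style triplet loss; a minimal sketch (pure Python on toy 2-D embeddings; the margin value is an assumption):

```python
def l2_sq(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on ||G(P1)-G(P2)||^2 + margin <= ||G(P1)-G(P3)||^2:
    zero once the negative is far enough from the anchor."""
    return max(0.0, l2_sq(anchor, positive) - l2_sq(anchor, negative) + margin)

# P2 (same person) lies close to P1, P3 (different person) lies far away,
# so this triplet is already satisfied and contributes no loss.
loss = triplet_loss([0.0, 0.0], [0.1, 0.0], [2.0, 2.0])
```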
Deep Person Re-Identification
Person re-identification with deep learning must battle against data volume: lack of training data
It is hard to annotate a person Re-ID dataset Transfer learning between datasets is important
Multi-task learning
Our method 1/2
Supervised deep Re-ID via transfer learning
The network
Multi-task framework with classification and verification losses. The base network uses GoogLeNet to transfer knowledge learned from ImageNet. A task-specific dropout is applied.
Mengyue Geng, Yaowei Wang, Tao Xiang, Yonghong Tian, Deep Transfer Learning for Person Re-identification, arXiv 2016
Multi-task
Our method 1/2
Supervised deep Re-ID via transfer learning
Transfer learning via two stepped fine-tuning strategy
First train only the ID classifier layer on the target data, then fine-tune the whole network on the target data
Our method 1/2
Experimental Results
Our method 2/2
Unsupervised deep Re-ID
Iterative co-training of a deep network and a dictionary: the deep network is trained with generated pseudo labels; the dictionary is trained using deep features
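The pseudo-label loop can be sketched as follows (pure Python; nearest-centroid assignment stands in for the dictionary/clustering step, and recomputing centroids stands in for retraining the deep network):

```python
def l2_sq(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(x, centers):
    """Index of the closest centroid: this yields the pseudo label."""
    return min(range(len(centers)), key=lambda k: l2_sq(x, centers[k]))

def pseudo_label_training(features, centers, iters=5):
    """Alternate between (1) generating pseudo labels by nearest-centroid
    assignment and (2) re-estimating the model from those labels
    (here: recomputing centroids from the assigned members)."""
    labels = []
    for _ in range(iters):
        labels = [nearest(x, centers) for x in features]
        for k in range(len(centers)):
            members = [x for x, l in zip(features, labels) if l == k]
            if members:
                dim = len(members[0])
                centers[k] = [sum(m[d] for m in members) / len(members)
                              for d in range(dim)]
    return labels, centers

feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels, centers = pseudo_label_training(feats, [[1.0, 1.0], [4.0, 4.0]])
```

With well-separated features the pseudo labels stabilize after one round; in the full method the "retraining" step is a deep network update rather than a centroid mean.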
Our method 2/2
Experimental Results
Pedestrain Search by Behaviors Features (e.g., Gait)
Multi-feature bipartite ranking model: to reduce the effects of multiple factors such as viewing angles, carrying objects and wearing different coat Swiss multi-round competition mechanism: Through multi-round competition, the effectiveness and efficiency of cascade ranking model can be improved remarkably.
How to do when visual appearance is unreliable?
Probe Gallery Ranking
Grouping Final Ranking
Indoor Gait-based Person Search Outdoor Gait-based Person Search
Lan Wei, Yonghong Tian, Yaowei Wang, Tiejun Huang, Swiss-System based Cascade Ranking for Gait-based Person Re-identification, Proc. 29th AAAI Conf., January 25-30, 2015, Austin, Texas, USA.
Introduction Motivation Challenge Multi-task learning for precise object search
Summary
Precise Vehicle Search
Precise Vehicle Search is not an easy task
The Twin Problem: It is very difficult to distinguish two cars from the same model and with the same color
Precise Vehicle Search
Is it really possible to distinguish two vehicles of the same model and color?
Yes, if we can find some discriminative features Attributes help precise vehicle search
Recent Work: Fine-Grained Visual Recognition
The Comprehensive Cars (CompCars)
Two scenarios: web-nature and surveillance-nature. The web-nature data contains 163 car makes with 1,716 car models: 136,726 images capturing entire cars and 27,618 images capturing car parts, with five attributes (maximum speed, displacement, number of doors, number of seats, and type of car). The surveillance-nature data contains 50,000 car images captured from the front view.
Yang, Linjie, et al. "A large-scale car dataset for fine-grained categorization and verification." CVPR 2015.
Recent Work: Vehicle re-identification
Appearance-based coarse filtering: low-level hand-crafted features and high-level semantic attributes. Plate-based accurate search: a Siamese neural network is trained for license plate verification instead of recognizing the characters. Spatiotemporal relation model: used to re-rank vehicles.
Liu, Xinchen, et al. "A Deep Learning-Based Approach to Progressive Vehicle Re-identification for Urban Surveillance." ECCV 2016.
Framework
Use a deep convolutional network for feature extraction: map the raw image data into a Euclidean feature space and use L2 distance to measure image similarity.
Hongye Liu, Yonghong Tian, Yaowei Wang, Lu Pang, Tiejun Huang, Deep Relative Distance Learning: Tell the Difference Between Similar Vehicles, Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2016.
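Once images are mapped into a Euclidean feature space, search reduces to sorting the gallery by L2 distance to the query; a minimal sketch (pure Python, toy features):

```python
def l2_sq(a, b):
    """Squared Euclidean distance (monotone in L2, so the ranking is identical)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted by ascending distance to the query."""
    return sorted(range(len(gallery_feats)),
                  key=lambda i: l2_sq(query_feat, gallery_feats[i]))

# Gallery item 1 is nearest to the query, item 0 is farthest.
order = rank_gallery([0.0, 0.0], [[3.0, 3.0], [0.1, 0.0], [1.0, 1.0]])
```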
Our Method 1: Deep Relative Distance Learning
Deep Relative Distance Learning
Drawbacks of triplet loss
Slow convergence; fails to handle some special cases
An enhanced version:
Coupled Cluster Loss (CCL)
Estimate the cluster center, then compute the cluster loss
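These two steps can be sketched as follows (pure Python, squared L2 distances; the margin and the exact form of the published CCL are simplified here for illustration):

```python
def l2_sq(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def coupled_cluster_loss(positives, negatives, margin=1.0):
    """Sketch of the coupled-cluster idea: (1) estimate the positive cluster
    center, (2) hinge-penalize any positive that is not closer to the center
    than the nearest negative by at least `margin`."""
    dim = len(positives[0])
    center = [sum(p[d] for p in positives) / len(positives) for d in range(dim)]
    d_neg = min(l2_sq(n, center) for n in negatives)
    return sum(max(0.0, l2_sq(p, center) - d_neg + margin) for p in positives)

# Tight positive cluster, distant negative: the constraint holds, loss is zero.
loss = coupled_cluster_loss([[0.0, 0.0], [0.2, 0.0]], [[3.0, 3.0]])
```

Comparing against a cluster center rather than a single anchor is what stabilizes training relative to the plain triplet loss.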
Multi-task Deep Learning
Determine whether two images show the same vehicle:
Are they of the same color and vehicle model? Do they share any common marks?
Mixed Difference Network (Multi-task learning)
One branch for attribute recognition (model, color, …) One branch for discriminative features learning
Training
Network training
Step 1: training two branches separately
Branch 1: vehicle model and color classification; batch data are selected across different vehicle models. Branch 2: coupled clusters loss (conv1-3 fixed); batch data are selected within a specific vehicle model.
Step 2: training the entire network
Set the learning rate of fc8 to 10 times that of the other layers; the weights of losses 1, 2 and 3 are set to 0.5, 0.5 and 1.0
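The two settings above can be sketched directly (pure Python; the base learning rate is a hypothetical value, not one stated in the slides):

```python
def combined_loss(losses, weights=(0.5, 0.5, 1.0)):
    """Weighted sum of the three task losses, as in training step 2."""
    return sum(w * l for w, l in zip(weights, losses))

BASE_LR = 0.001  # hypothetical base learning rate

def layer_lr(layer_name, base_lr=BASE_LR):
    """fc8 gets a learning rate 10x larger than every other layer."""
    return base_lr * 10 if layer_name == "fc8" else base_lr

total = combined_loss((1.0, 1.0, 1.0))  # 0.5 + 0.5 + 1.0
```

A larger rate on the freshly initialized fc8 layer lets the new head adapt quickly while the pretrained layers change only slowly.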
VehicleID Dataset
Dataset
221,763+ images of 26,267 vehicles (8.44 images/vehicle on average). Each vehicle has a unique ID (labeled by its license plate). 111,585 images of 13,133 vehicles have model labels (250 models).
Experimental Results
By MAP By match rate
Experimental Results
Results of precise vehicle search
Multi-grain Relationship
Given multiple attributes, the relationship between vehicle images is abstracted into multiple grains (levels). It is difficult to optimize directly under such strong constraint conditions.
generalized pairwise ranking multi-grain list ranking
Our Method 2: Multi-grain Constraints based Ranking
Ke Yan, Yonghong Tian, Yaowei Wang and Wei Zeng, "Exploiting Multi-Grain Ranking Constraints for Precisely Searching Visually-similar Vehicles." Submitted to IEEE International Conference on Computer Vision (ICCV), 2017.
Multi-grain Constraints based Ranking
Generalized pairwise ranking
Generalize conventional pairwise ranking, which considers only binary similar/dissimilar relations, to multiple relations. Jointly optimize multi-attribute classification and generalized pairwise ranking.
Notation: n is the number of image pairs; 𝑞(𝑗, 𝑟) is the prediction value on the 𝑟-th grain of the 𝑗-th pair; 𝑏𝑗 = 𝑛 means the ground-truth grain of the 𝑗-th pair is 𝑛; y indexes the attribute types (ID, model and color); 𝑏𝑧 𝑦 = 𝑛 means the ground-truth category on the 𝑧-th attribute of the 𝑦-th image is 𝑛; 𝑞(𝑦, 𝑧, 𝑘) is the prediction value on the 𝑘-th category of the 𝑦-th image. 𝜇 is a weight controlling the balance of the two tasks.
Datasets
Two high-quality, well-annotated vehicle datasets
Each image is labeled with ID, precise vehicle model and color. VD1 and VD2 are the largest high-quality annotated vehicle datasets published so far.
Multi-grain Constraints based Ranking
Experiment results
Example: Precise Vehicle Search
Does not rely on the license plate (useful for detecting fake license plates); insensitive to blurred images
query Rank no.1 Rank no.2 Rank no.3
Example: Precise Vehicle Search
Insensitive to occlusion Insensitive to car pose
A Practical System in Wendeng City
System architecture: toll-gate cameras, switches, data center, users
Input examples: Type 1 — image frames from video toll gates, resolution 1920x1144; Type 2 — images from image toll gates, resolution 1536x2048
Data volume: 1.5M images per day
Vehicle detection: 3 GPU servers (2 cores, 8 NVIDIA Tesla K40 GPUs)
Search sub-system: 6 CPU servers (Hadoop platform), 1 storage node (32 TB)
DEMO for vehicle search
CNN vs. SIFT-like Features
Experimental results
Database: 611,944 images from two cities. Query images: 1,000 images (1,000 randomly chosen vehicles, one random image each). Evaluation criterion: mean average precision (mAP)
Method               Feature size   mAP
SIFT                 4~5K (Bpi)     0.3512
Our deep feature     4K (Bpi)       0.4206
Our compact feature  1K (bpi)       0.4191
Ke Yan, Yaowei Wang, Dawei Liang, Tiejun Huang, Yonghong Tian, CNN vs. SIFT for Image Retrieval: Alternative or Complementary? Proc. ACM International Conference on Multimedia, Amsterdam, The Netherlands, Oct 2016.
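The mAP criterion used above is computed from per-query ranked relevance lists; a minimal sketch (pure Python):

```python
def average_precision(ranked_relevance):
    """AP for one query: ranked_relevance is a 0/1 list in ranked order,
    1 meaning the returned image is the correct vehicle."""
    hits, total = 0, 0.0
    for i, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / i  # precision at each relevant position
    return total / hits if hits else 0.0

def mean_average_precision(all_queries):
    """mAP: mean of the per-query average precisions."""
    return sum(average_precision(r) for r in all_queries) / len(all_queries)

# Two toy queries: a perfect first hit, and a correct result at rank 2.
m = mean_average_precision([[1], [0, 1]])
```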
CNN vs. SIFT-like Features
Complementary between CNN and SIFT-like features
UKBench database: 10,200 images of 2,550 objects. Evaluation criterion: mean average precision (mAP)
Introduction Motivation Challenge Multi-task learning for precise object search
Summary
Summary
Beyond visual search: (traditional) image search, fine-grained image search, precise object search
Precise Person Search
Multi-task learning handles the challenge of a person's appearance changes; transfer learning handles the problem of small datasets
Precise Vehicle Search
Multi-task learning & deep relative distance learning: find discriminative features to distinguish different vehicles with similar appearance
Summary
Future Directions
Benchmarking: billion-scale benchmark datasets. Multi-task features: more discriminative global and local deep features, for both fine-grained categorization and search. Unified framework: one framework for detection, recognition and search. Efficiency: compact descriptors for multiple tasks via learning to hash.
Acknowledgement
We gratefully acknowledge the support from NVIDIA NVAIL program.
Yonghong Tian: yhtian@pku.edu.cn Fan Yang: fyang.eecs@pku.edu.cn