
slide-1
SLIDE 1

Advances in Visual Tracking

Machine Learning Study Group

Presented by Yaochen Xie Dec 7, 2017

slide-2
SLIDE 2

Contents

❖ Visual Tracking Overview

❖ Dataset & Evaluation Methodology

❖ Traditional Approach (before 2010)

➢ Mean-Shift, Particle Filter, Optical Flow

❖ The State-of-the-Art (after 2010)

➢ Correlation Filter, Deep Learning

❖ A Summary: Generative models and Discriminative models

slide-3
SLIDE 3

What is tracking in Computer Vision?

✴ Understanding geometric correspondences over time
✴ A fundamental problem in computer vision
✴ A challenging and difficult task
✴ Numerous applications

slide-4
SLIDE 4

Applications

  • Motion Analysis
  • Surveillance
  • Autonomous Robots
  • Image Guided Surgery
  • Biomedical Image Analysis
  • Human Computer Interaction

slide-5
SLIDE 5

Challenges

  • Deformation
  • Illumination variation
  • Blur & Fast Motion
  • Background Clutter

slide-6
SLIDE 6

Challenges

  • Out-of-plane rotation
  • In-plane rotation
  • Scale Variation
  • Occlusion
  • Out-of-view

slide-7
SLIDE 7

Dataset

OTB (Object Tracking Benchmark)

http://cvlab.hanyang.ac.kr/tracker_benchmark/index.html

The full benchmark contains 100 sequences from the recent literature.

  • The sequence names are in CamelCase without any blanks or underscores.
  • When there are multiple targets, each target is identified as dot+id_number (e.g. Jogging.1 and Jogging.2).
  • Each row in the ground-truth files represents the bounding box of the target in that frame: (x, y, box-width, box-height).
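The ground-truth row format described above can be read with a few lines of Python. This is a minimal sketch, not the benchmark's own tooling: the function names are illustrative, the separator handling (commas, tabs, or spaces) is an assumption, and (x, y) is assumed to be the top-left corner of the box.

```python
import re
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def parse_groundtruth(text: str) -> List[Box]:
    """Parse ground-truth rows, one bounding box per frame."""
    boxes: List[Box] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank rows
        # Assumption: fields may be separated by commas, tabs, or spaces.
        fields = [f for f in re.split(r"[,\s]+", line) if f]
        if len(fields) != 4:
            raise ValueError(f"expected 4 values per row, got {len(fields)}: {line!r}")
        x, y, w, h = map(float, fields)
        boxes.append((x, y, w, h))
    return boxes


def box_center(box: Box) -> Tuple[float, float]:
    """Center of a box, assuming (x, y) is its top-left corner."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)
```

The box centers are what the center-location-error metric compares, which is why a helper for them is included.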

slide-9
SLIDE 9

Dataset

VOT Challenge (Visual Object Tracking)

http://www.votchallenge.net/

VOT 2015

  • 60 short sequences
  • Chosen from a large pool of sequences including the ALOV dataset, OTB2 dataset, non-tracking datasets, etc.
  • Rotated bounding boxes in order to provide highly accurate ground truth values for comparing results

slide-10
SLIDE 10

Evaluation

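✴ Precision plot: center location error (average Euclidean distance between the predicted and ground-truth center locations) / percentage of frames whose error is within a threshold
✴ Success plot: bounding-box overlap ratio (intersection over union of the predicted and ground-truth boxes) / percentage of frames whose overlap exceeds a threshold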
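Both curves can be computed from per-frame boxes. The following is a minimal NumPy sketch under stated assumptions: boxes are (x, y, width, height) with (x, y) the top-left corner, every box has positive area, and the success plot is based on the intersection-over-union overlap of predicted and ground-truth boxes. All function names are illustrative.

```python
import numpy as np


def center_error(pred, gt):
    """Per-frame Euclidean distance between box centers."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    pc = pred[:, :2] + pred[:, 2:] / 2.0
    gc = gt[:, :2] + gt[:, 2:] / 2.0
    return np.linalg.norm(pc - gc, axis=1)


def overlap_ratio(pred, gt):
    """Per-frame intersection over union of two sets of boxes."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    return inter / union


def precision_curve(errors, thresholds):
    """Fraction of frames whose center error is within each threshold."""
    return np.array([(errors <= t).mean() for t in thresholds])


def success_curve(overlaps, thresholds):
    """Fraction of frames whose overlap exceeds each threshold."""
    return np.array([(overlaps > t).mean() for t in thresholds])
```

Plotting `precision_curve` over pixel thresholds and `success_curve` over overlap thresholds in [0, 1] gives the two plots.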

slide-11
SLIDE 11

Evaluation

✴ Temporal Robustness Evaluation (TRE)
✴ Spatial Robustness Evaluation (SRE)

slide-12
SLIDE 12

Traditional Approaches - Mean-shift

Intuitive Description:

slide-19
SLIDE 19

Traditional Approaches - Mean-shift

Assumption: The data points are sampled from an underlying PDF

Figure panels: Assumed Underlying PDF; Real Data Samples
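Under this assumption, mean-shift climbs toward a mode of the underlying density by repeatedly moving to the kernel-weighted mean of the nearby samples. The sketch below is illustrative only: the Gaussian kernel, the bandwidth value, and the function name are assumptions, not taken from the slides.

```python
import numpy as np


def mean_shift_mode(points, start, bandwidth=1.0, max_iter=100, tol=1e-5):
    """Follow the mean-shift vector from `start` until it stops moving."""
    points = np.asarray(points, dtype=float)
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Gaussian kernel weight of every sample relative to the current position.
        d2 = np.sum((points - x) ** 2, axis=1)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        # The new position is the weighted mean of the samples.
        new_x = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x
```

For samples drawn around a single peak, the returned point lies close to that peak, regardless of where in its neighborhood the search starts.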

slide-20
SLIDE 20

Traditional Approaches - Mean-shift

Histogram and Back Projection

Raw Image → Histogram of ROI (or other representations) → Back Projection

slide-21
SLIDE 21

Traditional Approaches - Mean-shift

Steps of tracking an object

  • 1. Select the Region of Interest (ROI) in frame t0
  • 2. Calculate the histogram of the ROI
  • 3. Generate the back projection of the ROI histogram in frame t1
  • 4. Iterate with Mean-Shift

Alternatively, introduce a similarity function to select the target candidate…
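The four steps can be sketched in plain NumPy. This is a simplified illustration, not a production tracker: it assumes a single 8-bit channel (for example hue), a window given as (x, y, w, h), and integer window shifts; the function names and the bin count are hypothetical. OpenCV offers comparable routines (`cv2.calcBackProject`, `cv2.meanShift`), which are not used here.

```python
import numpy as np


def roi_histogram(channel, window, bins=16):
    """Step 2: normalized histogram of the ROI in the first frame."""
    x, y, w, h = window
    patch = channel[y:y + h, x:x + w]
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return hist / max(hist.max(), 1)


def back_project(channel, hist):
    """Step 3: replace each pixel by the histogram value of its bin."""
    idx = (channel.astype(np.int64) * len(hist)) // 256
    return hist[idx]


def mean_shift_window(prob, window, max_iter=20):
    """Step 4: move the window to the centroid of the back projection."""
    x, y, w, h = window
    rows, cols = prob.shape
    ys, xs = np.mgrid[0:h, 0:w]
    for _ in range(max_iter):
        patch = prob[y:y + h, x:x + w]
        total = patch.sum()
        if total <= 0:
            break  # nothing resembling the target inside the window
        dx = int(round((xs * patch).sum() / total - (w - 1) / 2.0))
        dy = int(round((ys * patch).sum() / total - (h - 1) / 2.0))
        nx = min(max(x + dx, 0), cols - w)
        ny = min(max(y + dy, 0), rows - h)
        if nx == x and ny == y:
            break  # converged
        x, y = nx, ny
    return (x, y, w, h)
```

Because the window size never changes in this loop, the sketch also makes the scale-variation limitation of plain mean-shift visible.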

slide-22
SLIDE 22

Traditional Approaches - Mean-shift

Advantages:

  • 1. Low computational complexity
  • 2. Robust to partial occlusion, deformation, rotation and background movement

Shortcomings:

  • 1. Unable to deal with scale variation
  • 2. Poor performance when the object moves fast
  • 3. Histogram is deficient in describing color features
slide-23
SLIDE 23

Traditional Approaches - Particle Filtering

Particle

slide-24
SLIDE 24

Traditional Approaches - Particle Filtering

Filter

slide-25
SLIDE 25

Traditional Approaches - Particle Filtering

Particle Filtering

slide-26
SLIDE 26

Traditional Approaches - Particle Filtering

  • Initialization
  • Sampling
  • Decision
  • Resampling
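The four stages can be illustrated with a bootstrap particle filter on a one-dimensional position. This is a minimal sketch under assumptions not stated on the slide: a random-walk motion model, a Gaussian observation likelihood, and "Decision" read as weighting the particles and taking their weighted mean as the estimate. All names and default values are illustrative.

```python
import numpy as np


def particle_filter_1d(observations, n_particles=500, motion_std=1.0,
                       obs_std=2.0, seed=0):
    """Track a 1-D position from noisy observations."""
    rng = np.random.default_rng(seed)
    observations = np.asarray(observations, dtype=float)
    # Initialization: spread particles around the first observation.
    particles = observations[0] + rng.normal(0.0, obs_std, n_particles)
    estimates = []
    for z in observations:
        # Sampling: propagate every particle with the motion model.
        particles = particles + rng.normal(0.0, motion_std, n_particles)
        # Decision: weight particles by the observation likelihood
        # and report the weighted mean as the state estimate.
        weights = np.exp(-0.5 * ((z - particles) / obs_std) ** 2) + 1e-300
        weights /= weights.sum()
        estimates.append(float(np.sum(weights * particles)))
        # Resampling: draw particles in proportion to their weights.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
    return np.array(estimates)
```

In a visual tracker the state would typically hold position and scale, and the likelihood would compare an appearance model (such as a histogram) against the image patch at each particle.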
slide-27
SLIDE 27

Traditional Approaches - Particle Filtering

Tracking

slide-31
SLIDE 31

Traditional Approaches - Particle Filtering

Strengths:

  • 1. Markov model reduces complexity of calculations
  • 2. Good description of methods
  • 3. Able to deal with scale variation

Weaknesses:

  • 1. Poor performance under occlusion
  • 2. Histogram is deficient in describing color features
slide-32
SLIDE 32

Traditional Approaches - Optical Flow

Optical Flow: The pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and a scene.

slide-33
SLIDE 33

Traditional Approaches - Optical Flow

Assumptions:

  • Constant brightness across frames
  • Small motion between frames
  • Frames are sampled consecutively in the temporal domain
  • Spatial consistency (neighboring pixels move similarly)
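These assumptions lead to the brightness-constancy constraint Ix·u + Iy·v + It = 0, which a Lucas-Kanade style estimator solves in the least-squares sense over a window. The slides do not name a specific method, so the following is only one possible sketch: it estimates a single displacement (u, v) for the whole window, and the function name is illustrative.

```python
import numpy as np


def lucas_kanade_window(frame0, frame1):
    """Estimate one (u, v) displacement shared by all pixels in the window."""
    f0 = np.asarray(frame0, dtype=float)
    f1 = np.asarray(frame1, dtype=float)
    # Spatial gradients; np.gradient returns the row (y) derivative first.
    Iy, Ix = np.gradient(f0)
    # Temporal derivative between the two consecutive frames.
    It = f1 - f0
    # Solve Ix*u + Iy*v = -It for all pixels at once (spatial consistency).
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(u), float(v)
```

The linearization only holds for small motion, which is why practical implementations usually add image pyramids for larger displacements.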
slide-34
SLIDE 34

The State-of-the-Art

Correlation Filter based

  • MOSSE (Minimum Output Sum of Squared Error)
  • CSK (Circulant Structure of Tracking-by-detection)
  • CN (Adaptive Color Attributes)

Deep ConvNet based

  • GOTURN (Generic Object Tracking Using Regression Networks)
  • MDNet (Multi-Domain Convolutional Neural Networks)
  • TCNN (Modeling and Propagating CNNs in a Tree Structure)

slide-35
SLIDE 35

Correlation Filter based
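The core idea of a correlation-filter tracker such as MOSSE can be sketched in the Fourier domain: learn a filter whose response to the training patch is a sharp Gaussian peak at the target, then locate the target in a new patch at the peak of the response. The sketch below is a simplified illustration: it omits the preprocessing (cosine window, normalization) and the online running-average update used in practice, and the function names and the regularization value are assumptions.

```python
import numpy as np


def gaussian_response(shape, center, sigma=2.0):
    """Desired output: a Gaussian peak at the target center (row, col)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    cy, cx = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def train_mosse(patches, target, reg=1e-2):
    """Closed-form filter: H* = sum(G . conj(F)) / (sum(F . conj(F)) + reg)."""
    G = np.fft.fft2(target)
    num = np.zeros_like(G)
    den = np.zeros_like(G)
    for patch in patches:
        F = np.fft.fft2(patch)
        num += G * np.conj(F)
        den += F * np.conj(F)
    return num / (den + reg)


def locate(h_conj, patch):
    """Correlate the patch with the filter and return the peak (row, col)."""
    response = np.real(np.fft.ifft2(np.fft.fft2(patch) * h_conj))
    return tuple(int(i) for i in np.unravel_index(np.argmax(response), response.shape))
```

All operations are element-wise products of FFTs, which is the source of the speed that made correlation filters attractive for tracking.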

slide-36
SLIDE 36

GOTURN

Strengths

  • Offline Training
  • Generic Object Tracking
  • Avoids Online Fine-tuning
  • Regression-based Approach

Generic Object Tracking Using Regression Networks

slide-37
SLIDE 37

GOTURN

Generic Object Tracking Using Regression Networks

slide-38
SLIDE 38

TCNN

  • The width of a black arrow indicates the weight of a CNN for target state estimation, while the width of a red edge denotes the affinity between two CNNs.
  • The width of a box outline indicates the reliability of the CNN associated with the box.

Modeling and Propagating CNNs in a Tree Structure for Visual Tracking

slide-39
SLIDE 39

TCNN

  • CNN Architecture

Modeling and Propagating CNNs in a Tree Structure for Visual Tracking

slide-40
SLIDE 40

TCNN

  • Tree Construction
  • Tree structure: each vertex corresponds to a CNN, and each directed edge defines the relationship between CNNs.
  • The score of an edge is the affinity between its two end vertices.

Modeling and Propagating CNNs in a Tree Structure for Visual Tracking

slide-41
SLIDE 41

TCNN

  • Target state estimation
  • Candidate generation: sample from a normal distribution in (x, y, s) space, centered at the target location in the last frame

  • Target score:
  • Select target:
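The estimation loop outlined in these bullets can be sketched as follows. The formulas on the slide are not available in the text, so this is a hedged illustration only: the target score is assumed to be a weighted average of per-CNN scores, the CNNs are replaced by hypothetical scoring callables, and the sampling standard deviations and function name are made up for the example.

```python
import numpy as np


def estimate_target_state(last_state, scorers, weights, n_candidates=256,
                          std=(5.0, 5.0, 0.05), seed=0):
    """Sample candidates around the last state and pick the best-scoring one."""
    rng = np.random.default_rng(seed)
    last_state = np.asarray(last_state, dtype=float)
    # Candidate generation: normal distribution in (x, y, s) space,
    # centered at the target location in the last frame.
    candidates = rng.normal(loc=last_state, scale=std, size=(n_candidates, 3))
    # Target score (assumption): weighted average of the per-model scores.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    per_model = np.stack([np.array([f(c) for c in candidates]) for f in scorers])
    scores = w @ per_model
    # Select target: the candidate with the highest combined score.
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])
```

Each entry of `scorers` stands in for one CNN in the tree and must map a candidate (x, y, s) to a scalar score; `weights` stands in for the per-CNN weights used for target state estimation.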

Modeling and Propagating CNNs in a Tree Structure for Visual Tracking

To define :

slide-42
SLIDE 42

TCNN

  • Bounding Box regression

Modeling and Propagating CNNs in a Tree Structure for Visual Tracking

slide-43
SLIDE 43

TCNN

  • Update Model
  • Create a node for a new CNN, associated with a parent node, every 10 consecutive frames
  • The CNN in vertex z is fine-tuned from the CNN in its parent node using the training samples collected from two sets of frames.

Modeling and Propagating CNNs in a Tree Structure for Visual Tracking

slide-44
SLIDE 44

TCNN

Modeling and Propagating CNNs in a Tree Structure for Visual Tracking

slide-45
SLIDE 45

Generative models and Discriminative models

✴ Generative models
✴ Discriminative models - Tracking-by-detection