Computer Vision
Tracking
Many thanks to: H. Bischof, B. Leibe, V. Ferrari, K. Graumann, Y. Ukrainitz, D. Wagner, V. Lepetit, M. Breitenstein, P. Sabzmeydani, Z. Kalal from whom I borrowed many slides and videos.
[Grabner et al., VideoProc CVPR’06]
[Figure: actual object position at time t and time t+1]
(i) Model-based tracking: application-specific
– human body, faces, space shuttle, …
(ii) Feature tracking: more generic
– corner tracking
– blob/contour tracking
– intensity profile tracking
– region tracking
[Diagram labels: Saliency, Object, Model/History, Scene, Tracking]
[Pollefeys et al. IJCV 2004]
Single-Agent Level, Multi-Agent Level, Scene Level, Detail Level
[Ess et al. CVPR’08]
– Object classes
– specific object
[Figure labels: (x, y); (x, y, z)]
predict → correct → predict → correct
– Tracking can be seen as the process of propagating the posterior distribution of state given measurements across time.
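The predict/correct cycle can be illustrated with a linear Kalman filter. This is a minimal sketch, assuming a 1D constant-velocity state (position, velocity) and Gaussian noise; the function names and the matrices F, H, Q, R are illustrative choices, not taken from the slides.

```python
import numpy as np

def predict(x, P, F, Q):
    """Propagate the state estimate and its covariance one time step."""
    return F @ x, F @ P @ F.T + Q

def correct(x, P, z, H, R):
    """Fold the measurement z into the predicted state."""
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Assumed model: state = [position, velocity], only the position is measured.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 1e-4 * np.eye(2)
R = np.array([[0.25]])

x, P = np.zeros(2), 10.0 * np.eye(2)     # vague initial belief
for t in range(1, 31):
    z = np.array([2.0 * t])              # noise-free measurement of an object moving at 2 units/frame
    x, P = predict(x, P, F, Q)
    x, P = correct(x, P, z, H, R)
```

After a few frames the posterior mean `x` settles near the true position and velocity, and `P` shrinks accordingly.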
CONDENSATION (Particle Filter)
– prediction: $p(p_{t-1}, \dot{p}_{t-1} \mid z_{t-1}) \rightarrow p(p_t, \dot{p}_t \mid z_{t-1})$
– weighting with $p(z_t \mid p_t)$
– update: $p(p_t, \dot{p}_t \mid z_t)$
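The prediction, weighting, and update steps of a particle filter can be sketched as follows. This is a simplified illustration, assuming a 1D state of position and velocity, Gaussian process and measurement noise, and multinomial resampling; the function name and all noise parameters are assumptions, not the settings of the original CONDENSATION algorithm.

```python
import numpy as np

def condensation_step(particles, z, rng, pos_noise=0.1, vel_noise=0.05, meas_sigma=0.5):
    """One cycle: predict with the dynamics, weight by the likelihood, resample."""
    n = len(particles)
    particles = particles.copy()
    # Prediction: constant-velocity dynamics plus diffusion.
    particles[:, 0] += particles[:, 1] + rng.normal(0.0, pos_noise, n)
    particles[:, 1] += rng.normal(0.0, vel_noise, n)
    # Weighting: Gaussian likelihood of the measurement given the position.
    weights = np.exp(-0.5 * ((z - particles[:, 0]) / meas_sigma) ** 2) + 1e-300
    weights /= weights.sum()
    # Update: resample particles in proportion to their weights.
    idx = rng.choice(n, size=n, p=weights)
    return particles[idx]

rng = np.random.default_rng(0)
n = 500
particles = np.column_stack([rng.normal(0.0, 1.0, n), rng.normal(1.0, 0.5, n)])
for t in range(1, 21):
    # Hypothetical measurements: the object moves one unit per frame.
    particles = condensation_step(particles, z=float(t), rng=rng)
estimate = particles.mean(axis=0)        # posterior mean of (position, velocity)
```

Because the posterior is represented by samples, the same loop also works with non-Gaussian or multi-modal likelihoods, which is the main reason to prefer it over a Kalman filter.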
time t: predict to t+1, measure at t+1, update location, update model
[Figure: 1D registration of I0(x) and I1(x+h) with unknown shift h; labels: I0(x) – I1(x), I1’(x)]
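The 1D case can be worked through numerically. The sketch below assumes I0(x) = I1(x+h) and the first-order approximation I1(x+h) ≈ I1(x) + h·I1’(x), which gives h ≈ (I0(x) – I1(x)) / I1’(x) at each x; combining all samples in a least-squares sense is one common choice. The function name and the test signal are hypothetical.

```python
import numpy as np

def estimate_shift_1d(i0, i1, dx=1.0):
    """Least-squares estimate of h, assuming i0(x) = i1(x + h) with small h."""
    d1 = np.gradient(i1, dx)                     # numerical derivative I1'(x)
    return np.sum(d1 * (i0 - i1)) / np.sum(d1 ** 2)

x = np.linspace(0.0, 2.0 * np.pi, 200)
h_true = 0.05
i1 = np.sin(x)
i0 = np.sin(x + h_true)                          # i0(x) = i1(x + h_true)
h_est = estimate_shift_1d(i0, i1, dx=x[1] - x[0])
```

The linearization only holds for small shifts; for larger ones the estimate is usually iterated, warping I1 by the current h each time.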
No gradient along
pixel?
neighbors have the same movement
[Figure labels: I0(x), I1(x+h)]
$I_x = \frac{\partial I}{\partial x}, \quad I_y = \frac{\partial I}{\partial y}, \quad I_t = \frac{\partial I}{\partial t}$
$I_x u + I_y v + I_t = 0$, with flow $(u, v)$
1 equation in 2 unknowns
Overdetermined system of equations: solve with the pseudo-inverse
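Stacking one optical-flow constraint per pixel of a patch gives an overdetermined linear system that the pseudo-inverse solves in the least-squares sense. The sketch below assumes the constraint form I_x u + I_y v + I_t = 0, a single flow vector for the whole patch, and a synthetic smooth image; the function name and the test image are hypothetical.

```python
import numpy as np

def lucas_kanade_patch(i0, i1):
    """Estimate one flow vector (u, v) for the whole patch by least squares."""
    iy, ix = np.gradient(i0)                     # np.gradient returns d/drow, d/dcol
    it = i1 - i0                                 # temporal derivative
    A = np.column_stack([ix.ravel(), iy.ravel()])
    b = -it.ravel()
    return np.linalg.pinv(A) @ b                 # (u, v)

def image(x, y):
    # Hypothetical smooth test image with gradients in both directions.
    return np.sin(0.3 * x) + np.cos(0.2 * y)

y, x = np.mgrid[0:21, 0:21].astype(float)
u_true, v_true = 0.3, -0.2
i0 = image(x, y)
i1 = image(x - u_true, y - v_true)               # content moved by (u_true, v_true)
flow = lucas_kanade_patch(i0, i1)
```

The solution is only stable when A has gradients in two independent directions, which is the same condition that distinguishes corners from edges and flat regions.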
before?
[Figure: classification by the eigenvalues λ1, λ2 (axes λ1, λ2)]
– “Corner”: λ1 ~ λ2 and large
– “Edge”: λ2 >> λ1
– “Edge”: λ1 >> λ2
– “Flat” region
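The eigenvalue-based classification can be tried on small synthetic patches. This is a simplified sketch: it assumes the eigenvalues come from the 2x2 matrix of summed gradient products over the patch, and it uses a single hypothetical threshold instead of comparing the ratio of λ1 and λ2.

```python
import numpy as np

def classify_patch(patch, thresh=0.5):
    """Label a patch as flat, edge, or corner from its gradient-matrix eigenvalues."""
    iy, ix = np.gradient(patch.astype(float))
    m = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    lam_small, lam_large = np.linalg.eigvalsh(m)  # ascending order
    if lam_large < thresh:
        return "flat"                             # both eigenvalues small
    if lam_small < thresh:
        return "edge"                             # one large, one small
    return "corner"                               # both large

flat = np.zeros((9, 9))
edge = np.zeros((9, 9))
edge[:, 5:] = 1.0                                 # vertical step edge
corner = np.zeros((9, 9))
corner[5:, 5:] = 1.0                              # one bright quadrant
labels = [classify_patch(p) for p in (flat, edge, corner)]
```

The threshold depends on patch size and image contrast, so in practice it would have to be tuned or replaced by a ratio test.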
[Baker & Matthews, Lucas-Kanade 20 Years On: A Unifying Framework, IJCV’04]
Reference image(s) of the object to detect; test image
– invariant to scale, rotation, or perspective
Query Database
Keypoint Detection → Keypoint Recognition → Robust 3D Pose Calculation (RANSAC)
– Keypoint Recognition: search in the database
– Database pre-processing: make the actual classification easier
– Robust 3D Pose Calculation (RANSAC): geometric verification
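The role of RANSAC in geometric verification can be shown on a much simpler model than 3D pose. The sketch below swaps in a 2D translation between matched keypoints: it repeatedly hypothesizes a translation from one random match and keeps the hypothesis with the most inliers. The function name, tolerance, and synthetic matches are assumptions for illustration.

```python
import numpy as np

def ransac_translation(src, dst, n_iters=100, inlier_tol=1.0, seed=0):
    """Robustly estimate a 2D translation from matches that contain outliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        i = rng.integers(len(src))                # minimal sample: one match
        t = dst[i] - src[i]                       # hypothesized translation
        residuals = np.linalg.norm(src + t - dst, axis=1)
        inliers = residuals < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best hypothesis.
    t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return t, best_inliers

rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(20, 2))
dst = src + np.array([3.0, -2.0])                 # true motion
for k in range(5):                                # five wrong matches
    dst[k] = src[k] + np.array([50.0 + 10 * k, -40.0 - 10 * k])
t_est, inliers = ransac_translation(src, dst)
```

For a full pose, the minimal sample grows to several matches and the residual becomes a reprojection error, but the hypothesize-and-verify loop stays the same.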
[Wagner et al. ISMAR’08]
[Wagner et al. ’09]
Input, Background Model, Moving Foreground Blobs (Objects)
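Background subtraction can be sketched with a running-average background model. This is one simple choice among many, assuming grayscale frames with values in [0, 1]; the function names, the learning rate, and the threshold are hypothetical.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (running average)."""
    return (1.0 - alpha) * background + alpha * frame

def foreground_mask(background, frame, thresh=0.2):
    """Mark pixels that differ strongly from the background model."""
    return np.abs(frame - background) > thresh

background = np.full((6, 6), 0.5)                 # static gray scene
frame = background.copy()
frame[2:4, 2:4] = 1.0                             # a bright 2x2 object enters
mask = foreground_mask(background, frame)
background = update_background(background, frame)
```

Connected regions of `mask` would then form the moving foreground blobs; a small `alpha` keeps slow objects from being absorbed into the background too quickly.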
… with a prescribed (color) distribution
… region and the target region is maximized, through evolution towards higher density in a parameter space
… iterations
[Comaniciu and Meer, ICCV’99]
[Figure labels: region of interest (kernel), center of mass, mean shift vector, measurements]
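The mean shift iteration itself is short: move the region of interest to the center of mass of the measurements inside it and repeat until the shift vanishes. The sketch below assumes a flat circular kernel over 2D points rather than a color distribution; the function name and the sample points are hypothetical.

```python
import numpy as np

def mean_shift(points, start, radius, max_iters=50, tol=1e-6):
    """Follow the mean shift vector until the window stops moving."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iters):
        in_window = np.linalg.norm(points - center, axis=1) <= radius
        if not in_window.any():
            break                                 # empty window: nothing to follow
        new_center = points[in_window].mean(axis=0)   # center of mass
        if np.linalg.norm(new_center - center) < tol:
            break                                 # mean shift vector is (almost) zero
        center = new_center
    return center

points = np.array([[4.0, 4.0], [4.0, 6.0], [6.0, 4.0], [6.0, 6.0], [5.0, 5.0]])
mode = mean_shift(points, start=(3.5, 3.5), radius=3.0)
```

In a tracker, the measurements would be pixel weights derived from the similarity between the candidate and target color histograms instead of raw point positions.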
– Recover a person’s body articulation
– Detailed parameterization in terms of joint locations or joint angles
– Articulated tracking as high-dimensional inference
– Part-based models
[Ramanan et al. CVPR’05]
… background.
[Figure labels: current appearance, current background]
[Grabner et al. CVPR’06]
background vs.
From time t to t+1:
– actual object position, search region
– evaluate classifier on sub-patches
– analyze map and set new object position
– update classifier (tracker)
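The loop of evaluating sub-patches in a search region, picking the best response, and updating the model can be sketched as follows. The sketch replaces the online classifier with a much simpler stand-in: a template scored by negative sum of squared differences and updated by a running average. The function name and all parameters are hypothetical.

```python
import numpy as np

def track_step(frame, template, pos, search_radius=4, learn_rate=0.1):
    """Search around pos, move to the best-scoring sub-patch, update the model."""
    h, w = template.shape
    best_score, best_pos = -np.inf, pos
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            y, x = pos[0] + dy, pos[1] + dx
            if y < 0 or x < 0 or y + h > frame.shape[0] or x + w > frame.shape[1]:
                continue                          # sub-patch leaves the image
            # Stand-in for the classifier response at this sub-patch.
            score = -np.sum((frame[y:y + h, x:x + w] - template) ** 2)
            if score > best_score:
                best_score, best_pos = score, (y, x)
    y, x = best_pos
    # Update the appearance model with the patch at the new position.
    template = (1.0 - learn_rate) * template + learn_rate * frame[y:y + h, x:x + w]
    return best_pos, template

rng = np.random.default_rng(0)
obj = rng.random((5, 5))                          # hypothetical object appearance
frame = np.zeros((30, 30))
frame[12:17, 13:18] = obj                         # object now sits at (12, 13)
new_pos, new_template = track_step(frame, obj.copy(), pos=(10, 10))
```

Every update uses the tracker's own output as training data, so a slightly wrong position contaminates the model; this self-learning loop is where drift comes from.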
Tracked Patches Confidence
t=1: initialization
t=2: position in prev. frame → candidate new positions (e.g., dynamics) → best new position (e.g., max color similarity)
…
– detect object(s) independently in each frame
– associate detections over time into tracks
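The association step can be sketched with greedy nearest-neighbor linking, which is one of the simplest possible strategies. The sketch assumes detections are 2D points, matches each track to the closest unused detection within a distance gate, and starts a new track for every unmatched detection; the function name and the gate value are hypothetical.

```python
import numpy as np

def link_detections(frames, max_dist=2.0):
    """Greedily link per-frame detections into tracks by nearest neighbor."""
    tracks = [[tuple(d)] for d in frames[0]]
    for dets in frames[1:]:
        dets = [tuple(d) for d in dets]
        # All (distance, track index, detection index) pairs, closest first.
        pairs = sorted(
            (np.hypot(tr[-1][0] - d[0], tr[-1][1] - d[1]), ti, di)
            for ti, tr in enumerate(tracks)
            for di, d in enumerate(dets)
        )
        used_tracks, used_dets = set(), set()
        for dist, ti, di in pairs:
            if dist > max_dist:
                break                             # remaining pairs are too far apart
            if ti in used_tracks or di in used_dets:
                continue
            tracks[ti].append(dets[di])
            used_tracks.add(ti)
            used_dets.add(di)
        # Unmatched detections start new tracks.
        tracks.extend([d] for di, d in enumerate(dets) if di not in used_dets)
    return tracks

# Two objects moving right; detection order differs from frame to frame.
frames = [[(0, 0), (0, 5)], [(1, 5), (1, 0)], [(2, 0), (2, 5)]]
tracks = link_detections(frames)
```

Greedy linking fails when objects cross or detections are missing for several frames, which motivates methods that reason over whole trajectories.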
Frame 1, Frame 5, Frame 9
Persons Background
Supervised Learning
(a) collect detections
(b) trajectory growing and selection
[Figure: detections in the space-time volume (axes x, z, t); H1, H2]
[Leibe et al. CVPR’07]
Input (Object Detections) → “Tracking” Result
– “walking” people
Ground plane, depth verification
[Gammeter et al. ECCV’08]
Current model vs. fixed (initial) model
[Grabner et al. ECCV’08]
to explore
detector on the fly.
[Kalal et al. CVPR’10]
[Grabner et al. CVPR’10]
change their appearance very quickly.
low-textured
“virtual points”.
Supporters
… and the magician knows that.
Robust, Accurate, Fast,…
Information about the object, dynamics, environment,…
Time t = 0
– If the dynamics model is too strong, the tracker will end up ignoring the data
– If the observation model is too strong, tracking is reduced to repeated detection
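This trade-off can be made concrete with the gain of a scalar Kalman filter, which decides how much each measurement moves the estimate. The sketch assumes a 1D random-walk state with process noise variance q (trust in the dynamics) and measurement noise variance r (trust in the observations); the function name and the values are hypothetical.

```python
def kalman_gain_after(q, r, n_steps=200):
    """Iterate the scalar covariance recursion and return the final gain."""
    p = 1.0                                       # initial state variance
    k = 0.0
    for _ in range(n_steps):
        p_pred = p + q                            # predict: uncertainty grows by q
        k = p_pred / (p_pred + r)                 # gain: weight given to the measurement
        p = (1.0 - k) * p_pred                    # correct: uncertainty shrinks
    return k

gain_strong_dynamics = kalman_gain_after(q=1e-6, r=1.0)     # close to 0: data is ignored
gain_strong_observation = kalman_gain_after(q=1.0, r=1e-6)  # close to 1: repeated detection
```

A gain near 0 means the estimate follows the prediction almost exclusively; a gain near 1 means each new estimate is essentially the latest measurement.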
http://www.ethlife.ethz.ch/archive_articles/091008_kalman_per/index
– Generative: “render” the state on top of the image and compare
– Discriminative: classifier or detector score

– specify using domain knowledge
– learn (very difficult)
dynamics
– Sometimes needed to keep multiple trackers in parallel
– E.g., for abrupt direction changes (“Persons”)
[Figure labels: wrong prediction, correct prediction]
Tracking
– What if we don’t know which measurements to associate with which tracks?
Appearance Change
– Cluttered Background
– Changes in shape, orientation, color, …
Occlusions
– Errors caused by dynamical model, … tend to accumulate over time