

slide-1
SLIDE 1

GPU-Accelerated Object Tracking Using Particle Filtering and Appearance-adaptive Models

International Conference on Image Processing & Communications 2010

Bogusław Rymut, Bogdan Kwolek

In this work we present an object tracking algorithm running on a GPU. Tracking is achieved by a particle filter using appearance-adaptive models. The main focus of our work is the parallel computation of the particle weights. The tracker yields a promising GPU/CPU speed-up: the GPU implementation running with 256 particles is about 30 times faster than the CPU implementation. Practical implementation issues in the CUDA framework are discussed. The algorithm has been tested on freely available test sequences.

Rzeszów University of Technology

slide-2
SLIDE 2

2

Agenda

• The problem
• CUDA programming model
• Particle filtering
• Problem decomposition
• Experiments

slide-3
SLIDE 3

3

The problem

• Appearance-based object tracking is time-consuming
• The tracking algorithm must run in real time
• GPU implementation of the PF algorithm
• Real-time tracking using PF and GPU
• How to decompose the algorithm on the GPU

slide-4
SLIDE 4

4

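Object appearance

• Appearance model components: initial intensity, previous intensity, slow changes

[Equation: fitness function f, summed over the components k = 1, 2, 3 and the pixels i = 1, ..., K, involving the terms m, M and the intensity I]

[Figure: object templates of K pixels at times t-1 and t]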

slide-5
SLIDE 5

5

CPU vs. GPU

1. www.nvidia.com

SIMD Architecture

slide-6
SLIDE 6

6

CUDA programming model

• Highly multithreaded coprocessor
• Small set of extensions to the C language
• Low-level programming
• Focus on parallel algorithms

slide-7
SLIDE 7

7

CUDA programming model

• Highly scalable heterogeneous system
• CPU and GPU are separate devices with separate DRAMs
• The GPU executes thousands of extremely lightweight threads to achieve high performance

[Figure: CPU device and GPU device]

slide-8
SLIDE 8

8

The problem of object tracking

• The goal is to find the same object across a sequence of images
• In the simplest approach, this can be achieved by brute-force search
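
As a rough illustration of brute-force search, the sketch below slides a template over every position of a frame and keeps the position with the smallest sum of absolute differences (SAD). The SAD score, the toy 5x5 frame and the function names are assumptions made for this example; the slides do not say which matching score a brute-force tracker would use.

```python
def sad(frame, template, top, left):
    """Sum of absolute differences between the template and one frame window."""
    return sum(
        abs(frame[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def brute_force_search(frame, template):
    """Try every window position and return (top, left) of the best match."""
    rows = len(frame) - len(template) + 1
    cols = len(frame[0]) - len(template[0]) + 1
    candidates = ((sad(frame, template, r, c), (r, c))
                  for r in range(rows) for c in range(cols))
    return min(candidates)[1]


# Toy 5x5 grayscale frame containing a 2x2 object at row 2, column 1.
frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 9]]
best_position = brute_force_search(frame, template)
print(best_position)  # (2, 1)
```

The cost grows with the number of window positions times the template size, which is why an exhaustive search over position (and possibly scale) quickly becomes expensive.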

slide-9
SLIDE 9

9

Tracking - Probabilistic Approach

One of the goals of visual tracking is to estimate the states of the objects of interest from image sequences.

The problem of tracking can be formulated as Bayesian filtering, where $x_t$ and $z_t$ denote the hidden state of the object of interest and the observation vector at discrete time $t$, respectively, whereas $z_{1:t} = \{z_1, \ldots, z_t\}$ (also written $Z_t$) denotes all the observations up to the current time step:

$$p(x_t \mid z_{1:t}) = \frac{p(z_t \mid x_t)\, p(x_t \mid z_{1:t-1})}{p(z_t \mid z_{1:t-1})}, \qquad p(x_t \mid z_{1:t-1}) = \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid z_{1:t-1})\, dx_{t-1}$$

[Figure: hidden state and observation]
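
Bayesian filtering alternates a prediction step (apply the motion model) and an update step (weight by the observation likelihood, then normalize). The sketch below runs one such cycle on a small discrete state space, where the integral becomes a sum. The four-cell world, the transition matrix and the likelihood values are invented for illustration and are not taken from the slides.

```python
def predict(belief, transition):
    """Prediction: sum over x_{t-1} of p(x_t | x_{t-1}) * p(x_{t-1} | z_{1:t-1})."""
    n = len(belief)
    return [sum(transition[prev][cur] * belief[prev] for prev in range(n))
            for cur in range(n)]


def update(predicted, likelihood):
    """Update: multiply by p(z_t | x_t), then divide by the evidence p(z_t | z_{1:t-1})."""
    unnormalized = [l * p for l, p in zip(likelihood, predicted)]
    evidence = sum(unnormalized)
    return [u / evidence for u in unnormalized]


# Hypothetical 1-D world with 4 cells: the object stays or moves one cell right.
transition = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
belief = [1.0, 0.0, 0.0, 0.0]        # object known to start in cell 0
likelihood = [0.1, 0.8, 0.1, 0.1]    # assumed p(z_t | x_t) for the current frame

predicted = predict(belief, transition)
posterior = update(predicted, likelihood)
print(predicted)
print(posterior)
```

For continuous, high-dimensional states the sum cannot be evaluated exhaustively, which motivates the sample-based approximation used by the particle filter.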

slide-10
SLIDE 10

10

Particle Filtering

Each sample represents a hypothetical state of the object.

$$p(x_{t-1} \mid Z_{t-1}) \xrightarrow{\text{motion}} p(x_t \mid Z_{t-1}) \xrightarrow{\text{observation}} p(x_t \mid Z_t)$$

Starting with a weighted particle set $S_{t-1} = \{(x_{t-1}^{(i)}, w_{t-1}^{(i)})\}_{i=1}^{M}$ approximately distributed according to $p(x_{t-1} \mid Z_{t-1})$, the particle filter operates by predicting new samples from a proposal distribution $q$. To give a new particle representation $S_t = \{(x_t^{(i)}, w_t^{(i)})\}_{i=1}^{M}$ of the posterior density $p(x_t \mid Z_t)$, the weights are set to:

$$w_t^{(i)} \propto w_{t-1}^{(i)}\, \frac{p(z_t \mid x_t^{(i)})\, p(x_t^{(i)} \mid x_{t-1}^{(i)})}{q(x_t^{(i)} \mid x_{t-1}^{(i)}, z_t)}$$

The observation likelihood $p(z_t \mid x_t)$ is obtained from the fitness function $f$ through an exponential (Gaussian-type) expression.

slide-11
SLIDE 11

11

Particle Filtering

1. For i = 1, 2, ..., M, sample (propose) particles $x_t^{(i)}$ using $p(x_t \mid x_{t-1}^{(i)})$
2. For i = 1, 2, ..., M, calculate the weights $w_t^{(i)} = w_{t-1}^{(i)}\, p(z_t \mid x_t^{(i)})$
3. Normalize the weights using $w_t^{(i)} \leftarrow w_t^{(i)} / \sum_{j=1}^{M} w_t^{(j)}$
4. Calculate the state estimate $\hat{x}_t = \sum_{i=1}^{M} w_t^{(i)} x_t^{(i)}$
5. Resample $\{x_t^{(i)}, w_t^{(i)}\}$ to get a new set of particles $\{x_t^{(i)}, 1/M\}$
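
The five steps can be sketched for a scalar state as follows. This is only a minimal illustration under stated assumptions: a Gaussian random-walk motion model, a Gaussian observation likelihood in place of the appearance-based fitness function, and multinomial resampling. The function name, parameters and synthetic observations are hypothetical and do not come from the slides.

```python
import math
import random


def particle_filter_step(particles, weights, observation, rng,
                         motion_std=1.0, obs_std=1.0):
    """One iteration of steps 1-5 for a scalar state (motion model as proposal)."""
    m = len(particles)
    # 1. Propose: sample each x_t from the assumed motion model p(x_t | x_{t-1}).
    particles = [x + rng.gauss(0.0, motion_std) for x in particles]
    # 2. Weight: w_t = w_{t-1} * p(z_t | x_t), here an assumed Gaussian likelihood.
    weights = [w * math.exp(-0.5 * ((observation - x) / obs_std) ** 2)
               for w, x in zip(weights, particles)]
    # 3. Normalize so that the weights sum to one.
    total = sum(weights)
    weights = [w / total for w in weights]
    # 4. State estimate: weighted mean of the particles.
    estimate = sum(w * x for w, x in zip(weights, particles))
    # 5. Resample (multinomial) and reset all weights to 1/M.
    particles = rng.choices(particles, weights=weights, k=m)
    weights = [1.0 / m] * m
    return particles, weights, estimate


rng = random.Random(0)
M = 256
particles = [rng.uniform(-5.0, 5.0) for _ in range(M)]
weights = [1.0 / M] * M
for z in [0.0, 1.0, 2.0, 3.0]:  # synthetic observations of an object drifting right
    particles, weights, estimate = particle_filter_step(particles, weights, z, rng)
print(round(estimate, 2))
```

Step 2 is independent for every particle, which is what makes the weight computation a natural target for parallel execution.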

slide-12
SLIDE 12

12

Particle Filtering

[Figure: alternating prediction and observation steps along the time axis]

slide-13
SLIDE 13

13

Approach to algorithm decomposition

• Each part of the algorithm has been implemented as a kernel function
• Each particle is handled by its own thread block
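
The slides do not show the kernel code, so the following is only a sequential Python emulation of the idea: one block per particle, with the threads of a block sharing the per-pixel work and combining their partial sums by a tree reduction. The block size, the squared-difference score (a stand-in for the fitness function) and all names are assumptions for this illustration.

```python
THREADS_PER_BLOCK = 8  # assumed block size (power of two), for illustration only


def weight_kernel(block_idx, patches, template, scores):
    """Emulates one thread block: all of its threads cooperate on one particle."""
    patch = patches[block_idx]
    shared = [0.0] * THREADS_PER_BLOCK  # stands in for per-block shared memory
    # Each "thread" accumulates squared differences over a strided pixel subset.
    for thread_idx in range(THREADS_PER_BLOCK):
        for j in range(thread_idx, len(template), THREADS_PER_BLOCK):
            shared[thread_idx] += (patch[j] - template[j]) ** 2
    # Tree reduction of the per-thread partial sums.
    stride = THREADS_PER_BLOCK // 2
    while stride > 0:
        for thread_idx in range(stride):
            shared[thread_idx] += shared[thread_idx + stride]
        stride //= 2
    scores[block_idx] = shared[0]  # thread 0 writes the block's result


def launch(kernel, num_blocks, *args):
    """Sequential stand-in for launching a kernel over a 1-D grid of blocks."""
    for block_idx in range(num_blocks):
        kernel(block_idx, *args)


template = [float(v) for v in range(20)]  # hypothetical 20-pixel template
patches = [[v + offset for v in template] for offset in (0.0, 1.0, 2.0)]
scores = [0.0] * len(patches)
launch(weight_kernel, len(patches), patches, template, scores)
print(scores)  # [0.0, 20.0, 80.0]
```

On a real GPU the blocks run concurrently and the two inner loops over `thread_idx` execute in parallel, with a synchronization barrier between reduction rounds.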

slide-14
SLIDE 14

14

Approach to algorithm decomposition

slide-15
SLIDE 15

15

Data decomposition

slide-16
SLIDE 16

16

Optimization of data access

• Access to GPU global memory is the bottleneck
• Correct data alignment is essential to overall performance
slide-17
SLIDE 17

17

Experiments

• CPU: PC with Intel Core 2 Quad 2.66 GHz, 1 GB RAM
• GPU: PC with NVIDIA GeForce 9800 GT, 14 multiprocessors at 1.5 GHz, 1024 MB RAM

slide-18
SLIDE 18

18

Face tracking

Real time Slow motion

slide-19
SLIDE 19

19

Experimental results: computation time [ms]

Particles   CPU [ms]   9800 GT [ms]   Speedup
32          16.53      1.30           x12.8
64          32.27      1.80           x18.3
128         62.65      2.70           x24.4
256         123.73     4.17           x29.5
512         243.19     7.51           x32.4
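
To relate these timings to the real-time requirement, the sketch below converts each time into a frame rate. It assumes the reported times are the per-frame cost of the tracker and uses 25 frames per second as the real-time threshold; neither assumption is stated in the slides.

```python
# Computation times [ms] from the results above: particles -> (CPU, GPU).
timings_ms = {
    32: (16.53, 1.30),
    64: (32.27, 1.80),
    128: (62.65, 2.70),
    256: (123.73, 4.17),
    512: (243.19, 7.51),
}
REAL_TIME_FPS = 25.0  # assumed real-time threshold, not stated in the slides


def frames_per_second(time_ms):
    """Convert a per-frame time in milliseconds to a frame rate."""
    return 1000.0 / time_ms


for n, (cpu_ms, gpu_ms) in timings_ms.items():
    cpu_fps = frames_per_second(cpu_ms)
    gpu_fps = frames_per_second(gpu_ms)
    print(f"{n:4d} particles: CPU {cpu_fps:6.1f} fps, GPU {gpu_fps:6.1f} fps, "
          f"CPU real-time: {cpu_fps >= REAL_TIME_FPS}")
```

Under these assumptions the CPU version falls below 25 fps from 128 particles upward, while the GPU version stays well above it for every particle count listed.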

slide-20
SLIDE 20

20

Conclusions

• A GPU implementation of the PF algorithm has been prepared
• Our GPU-based implementation is about 30 times faster than the CPU implementation