April 4-7, 2016 | Silicon Valley
Brian Cheung bcheung@berkeley.edu
Redwood Center for Theoretical Neuroscience, UC Berkeley
Visual Computing Research, NVIDIA
Neural Attention for Object Tracking
Motivation
Source: Wikipedia “School Bus”
Xu et al. 2015
h(t) x(t)
Parameters in the kernel control the layout of the attention window over the image, for example its translation and scale.
Jaderberg et al. 2015
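One way to picture how a few parameters lay out an attention window is the affine grid plus bilinear sampling used in spatial transformers. The sketch below is an illustration under assumptions, not the code behind the slides: the function names, the normalized [-1, 1] coordinate convention, the 8x8 window size, and the edge clamping are all choices made here for the example.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map a regular output grid through a 2x3 affine matrix theta.

    Coordinates are normalized to [-1, 1] on both axes (an assumed convention).
    Returns source x and y coordinates, each of shape (out_h, out_w).
    """
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, out_h),
                         np.linspace(-1.0, 1.0, out_w), indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    src = theta @ coords  # shape (2, out_h * out_w)
    return src[0].reshape(out_h, out_w), src[1].reshape(out_h, out_w)

def bilinear_sample(image, src_x, src_y):
    """Sample a 2-D image at normalized coordinates with bilinear interpolation.

    Coordinates that fall outside the image are clamped to the nearest edge.
    """
    h, w = image.shape
    # Convert normalized coordinates to pixel coordinates.
    px = (src_x + 1.0) * 0.5 * (w - 1)
    py = (src_y + 1.0) * 0.5 * (h - 1)
    x0 = np.clip(np.floor(px).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(py).astype(int), 0, h - 2)
    wx = np.clip(px - x0, 0.0, 1.0)
    wy = np.clip(py - y0, 0.0, 1.0)
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x0 + 1]
    bottom = (1 - wx) * image[y0 + 1, x0] + wx * image[y0 + 1, x0 + 1]
    return (1 - wy) * top + wy * bottom

def attention_window(image, scale, tx, ty, out_h=8, out_w=8):
    """Extract a window whose size is set by scale and whose center is (tx, ty)."""
    theta = np.array([[scale, 0.0, tx],
                      [0.0, scale, ty]])
    src_x, src_y = affine_grid(theta, out_h, out_w)
    return bilinear_sample(image, src_x, src_y)
```

With `scale=1.0, tx=0.0, ty=0.0` the window covers the whole image; shrinking `scale` zooms in, and changing `tx` or `ty` slides the window across the image.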
Cheung et al. 2015
Components: Image, Glimpse Network, Recurrent Network, Location Network, Classification Network
Classification output: ‘5’
Geiger et al. 2012
Components: Recurrent Network, Grid Generator, Localization Network, Convolutional Network, Tracking Network
Generate image glimpse: $\mathrm{Glimpse} = T_\theta(\mathrm{Image}(t), \theta_{loc}(t-1))$
Generate features from ConvNet: $h_{cnet}(t) = f_{cnet}(\mathrm{Glimpse})$
Generate features from Recurrent Network: $h_{rnn}(t) = f_{rnn}(h_{cnet}(t), \theta_{loc}(t-1), h_{rnn}(t-1))$
Generate parameters for next glimpse from Localization Network: $\theta_{loc}(t) = f_{loc}(h_{rnn}(t-1))$
Generate tracking prediction from Tracking Network: $\theta_{pred}(t), y_{pres}(t) = f_{tracking}(h_{rnn}(t-1))$
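The five steps above can be sketched as one loop over video frames. This is a minimal illustration under assumptions, not the trained model from the talk: every network is replaced by a randomly initialized single layer, the glimpse uses nearest-neighbour sampling for brevity, and the sizes, the three-number form of the glimpse parameters (scale, tx, ty), and the shape of the tracking prediction are all hypothetical. The time indices follow the slides as written.

```python
import numpy as np

rng = np.random.default_rng(0)
G = 8                      # glimpse side length (assumed)
N_FEAT, N_HID = 16, 32     # feature and hidden sizes (assumed)

# Randomly initialized stand-ins for the learned networks (hypothetical).
W_cnet = rng.normal(0.0, 0.1, (N_FEAT, G * G))
W_rnn = rng.normal(0.0, 0.1, (N_HID, N_FEAT + 3 + N_HID))
W_loc = rng.normal(0.0, 0.1, (3, N_HID))
W_trk = rng.normal(0.0, 0.1, (4, N_HID))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glimpse(image, theta_loc):
    """T_theta: sample a G x G window; theta_loc = (scale, tx, ty) in [-1, 1] coords."""
    scale, tx, ty = theta_loc
    h, w = image.shape
    lin = np.linspace(-1.0, 1.0, G)
    xs = np.clip(np.rint((scale * lin + tx + 1.0) * 0.5 * (w - 1)), 0, w - 1).astype(int)
    ys = np.clip(np.rint((scale * lin + ty + 1.0) * 0.5 * (h - 1)), 0, h - 1).astype(int)
    return image[np.ix_(ys, xs)]  # nearest-neighbour sampling, clamped at the edges

def f_cnet(g):
    return np.tanh(W_cnet @ g.ravel())

def f_rnn(h_cnet, theta_prev, h_prev):
    return np.tanh(W_rnn @ np.concatenate([h_cnet, theta_prev, h_prev]))

def f_loc(h):
    out = W_loc @ h
    # Scale kept in (0, 1), translation in (-1, 1): an assumed parameterization.
    return np.array([sigmoid(out[0]), np.tanh(out[1]), np.tanh(out[2])])

def f_tracking(h):
    out = W_trk @ h
    theta_pred = out[:3]        # predicted object parameters (assumed 3 numbers)
    y_pres = sigmoid(out[3])    # presence probability in (0, 1)
    return theta_pred, y_pres

def track(frames, theta0, h0):
    theta_prev, h_prev = theta0, h0
    outputs = []
    for image in frames:
        g = glimpse(image, theta_prev)              # Glimpse from theta_loc(t-1)
        h_c = f_cnet(g)                             # h_cnet(t)
        h_new = f_rnn(h_c, theta_prev, h_prev)      # h_rnn(t)
        theta_new = f_loc(h_prev)                   # theta_loc(t) from h_rnn(t-1)
        outputs.append(f_tracking(h_prev))          # theta_pred(t), y_pres(t)
        theta_prev, h_prev = theta_new, h_new
    return outputs
```

A call such as `track(frames, np.array([1.0, 0.0, 0.0]), np.zeros(N_HID))` starts with a glimpse covering the full frame and returns one `(theta_pred, y_pres)` pair per frame.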
Components: Grid Generator, Convolutional Network
Classes: {‘Car’, ‘Pedestrian’, ‘Truck’, ‘Tram’, ‘Cyclist’, ‘Misc’, ‘Van’, ‘Person Sitting’}
~3% classification error
Panels: Input Glimpse, Predicted Correction, Actual Correction
Conditions: with ConvNet pretraining; without pretraining (random initialization)
Plot titles: MNIST Position, Attention Position
Networks: Tracking Network, Localization Network
Axes: x, y
Legend: Prediction, Ground Truth