Wrap Up Lecture
Instructor - Simon Lucey
16-423 - Designing Computer Vision Apps
Today
Review - Project Presentations
Emerging Trends in Mobile Vision
Project Discussions
Reminder - Project Presentation
Each team will be
member to present (for example, a 2-member team will have 5 minutes allotted).
(must be shorter than your allotted time) YouTube clip describing your App in action.
http://goo.gl/forms/YoeQt0c1Hf.
receiving the best project prize.
Correlation Filters with Limited Boundaries
Hamed Kiani Galoogahi, Istituto Italiano di Tecnologia, Genova, Italy, hamed.kiani@iit.it
Terence Sim, National University of Singapore, Singapore, tsim@comp.nus.edu.sg
Simon Lucey, Carnegie Mellon University, Pittsburgh, USA, slucey@cs.cmu.edu
Abstract
Correlation filters take advantage of specific properties in the Fourier domain allowing them to be estimated efficiently: O(ND log D) in the frequency domain, versus O(D^3 + ND^2) spatially, where D is signal length and N is the number of signals. Recent extensions to correlation filters, such as MOSSE, have reignited interest of their use in the vision community due to their robustness and attractive computational properties. In this paper we demonstrate, however, that this computational efficiency comes at a cost. Specifically, we demonstrate that only 1/D proportion of shifted examples are unaffected by boundary effects which has a dramatic effect on detection/tracking
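As a rough illustration of the frequency-domain estimation mentioned in the correlation filter abstract, the sketch below trains a 1D MOSSE-style filter with NumPy FFTs. It is a minimal sketch under stated assumptions, not the paper's method: the function names `train_filter` and `respond`, the regularizer `reg`, and the toy data are all hypothetical choices made for this example.

```python
import numpy as np

def train_filter(signals, desired, reg=1e-3):
    """Closed-form MOSSE-style filter estimate in the frequency domain.

    signals: (N, D) real training signals.
    desired: (N, D) desired correlation outputs, one per signal.
    reg:     small regularizer (an assumption of this sketch).
    Returns the conjugate filter spectrum H*, shape (D,).
    """
    X = np.fft.fft(signals, axis=1)   # N FFTs of length D: O(N D log D)
    G = np.fft.fft(desired, axis=1)
    numerator = np.sum(G * np.conj(X), axis=0)
    denominator = np.sum(np.abs(X) ** 2, axis=0) + reg
    return numerator / denominator

def respond(filter_spectrum, signal):
    """Correlation response, computed as an element-wise product of spectra."""
    return np.real(np.fft.ifft(np.fft.fft(signal) * filter_spectrum))

# Toy data (hypothetical): three noisy copies of one random template.
rng = np.random.default_rng(0)
D = 64
template = rng.standard_normal(D)
signals = template + 0.01 * rng.standard_normal((3, D))
desired = np.zeros((3, D))
desired[:, 0] = 1.0                     # desired peak at zero shift
H = train_filter(signals, desired)

shifted = np.roll(template, 5)          # circular shift by 5 samples
peak = int(np.argmax(respond(H, shifted)))   # location of the response peak
```

Note that `np.roll` produces a circular shift. The FFT-based formulation implicitly treats every shifted example this way, which is where the boundary effects discussed in the abstract come from.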
SIMD (Single Instruction, Multiple Data)
Names: MMX, SSE, SSE2, …
[Figure: 4-way SIMD operation]
Optimize
OpenCV MATLAB
something quickly that works.
algorithmically.
Some insights taken from Markus Püschel’s lectures on “How to Write Fast Numerical Code”.
Source: http://www.slashgear.com/iphone-7-potential-wanes-as-android-n-starts-to-tango-20440932/
Ohad Fried, Eli Shechtman, Dan B Goldman, and Adam Finkelstein. Perspective-aware Manipulation of Portrait Photos. ACM Transactions on Graphics (Proc. SIGGRAPH), July 2016.
Taken from: http://lpirc.net/
iPhone 6 Samsung Galaxy S5
Kinect 2 Sensor Standard Basketball Hoop
[Plot: Spectral Irradiance (in W m^-2 nm^-1) versus Wavelength (in nm), 500 to 4000 nm. Curves: Extraterrestrial Radiation; Direct + Circumsolar Irradiance]
F - frames
“reference frame”
\[
\arg\min_{\lambda,\theta} \sum_{r=1}^{F} \sum_{x \in I_r} \sum_{f \in \mathrm{obs}(x)} \left\| I_r(x) - I_f\big(W\{x; \theta_f, \lambda_r(x)\}\big) \right\|_2^2
\]
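To make the photometric objective (the arg min over λ, θ) more concrete, here is a minimal sketch that evaluates the inner sum for one reference frame I_r and one other frame I_f. Several things are assumptions not stated on the slide: a pinhole camera with intrinsics `K`, the pose θ_f represented as a rotation and translation `(R, t)`, λ_r(x) interpreted as inverse depth, and nearest-neighbour sampling. The function names `warp` and `photometric_cost` are hypothetical.

```python
import numpy as np

def warp(x, theta_f, inv_depth, K):
    """W{x; theta_f, lambda_r(x)}: map pixel x of the reference frame into frame f.

    Assumes a pinhole camera K and a pose theta_f = (R, t); inv_depth is lambda_r(x).
    """
    R, t = theta_f
    ray = np.linalg.inv(K) @ np.array([x[0], x[1], 1.0])  # back-project the pixel
    point = R @ (ray / inv_depth) + t                     # 3D point seen from frame f
    proj = K @ point
    return proj[:2] / proj[2]                             # perspective division

def photometric_cost(I_r, I_f, theta_f, inv_depth_map, K):
    """Sum of squared intensity differences over pixels of I_r that land inside I_f."""
    h, w = I_r.shape
    cost = 0.0
    for v in range(h):
        for u in range(w):
            u_f, v_f = np.rint(warp((u, v), theta_f, inv_depth_map[v, u], K)).astype(int)
            if 0 <= u_f < w and 0 <= v_f < h:   # pixel x is observed in frame f
                cost += (I_r[v, u] - I_f[v_f, u_f]) ** 2
    return cost

# Toy example (hypothetical data): a camera translated along x in front of a flat scene.
rng = np.random.default_rng(1)
I_r = rng.random((8, 8))
K = np.array([[10.0, 0.0, 4.0], [0.0, 10.0, 4.0], [0.0, 0.0, 1.0]])
inv_depth_map = np.full((8, 8), 0.5)             # constant depth of 2 units
theta_f = (np.eye(3), np.array([0.4, 0.0, 0.0]))
I_f = np.roll(I_r, 2, axis=1)                    # this motion shifts pixels by 2 columns
cost = photometric_cost(I_r, I_f, theta_f, inv_depth_map, K)
```

The full objective would add outer loops over every reference frame r and every frame f in obs(x), and an optimizer over the poses and inverse depths; this sketch only evaluates the cost for fixed parameters.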
[Figure: Full evaluation result. All error values for the TUM-monoVO dataset. Panels: ORB-SLAM and DSO; sequences s_01 to s_50, run forward (Fwd) and backward (Bwd); colour scale 2 to 10]
[Figure: ImageNet Challenge results by year, split into BC (before ConvNets) and AD (after deep learning); 6.8%]
[Figure: Convolutional Pose Machines (T-stage). Panels: (a) Stage 1, (b) Stage ≥ 2, (c) Stage 1, (d) Stage ≥ 2, (e) Effective Receptive Field. Legend: C = Convolution, P = Pooling. Layers shown: 9×9 C, 2× P, 5×5 C, 1×1 C, 11×11 C, with a loss after each stage. Input image h×w×3; stage outputs h′×w′×(P + 1)]
Effective receptive field sizes: 9 × 9, 26 × 26, 60 × 60, 96 × 96, 160 × 160, 240 × 240, 320 × 320, 400 × 400
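The effective receptive field sizes listed for the Convolutional Pose Machines figure can be reproduced with the usual receptive-field recurrence. The sketch below is an illustration under assumptions: the layer ordering is one ordering consistent with the listed sizes, the pooling layers are taken to be 2×2 with stride 2, and the name `receptive_fields` is hypothetical.

```python
def receptive_fields(layers):
    """Return the receptive field after every convolution with kernel > 1.

    layers: list of (kind, kernel, stride), kind is "C" (conv) or "P" (pool).
    """
    rf, jump = 1, 1          # jump = distance in input pixels between adjacent outputs
    sizes = []
    for kind, kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        if kind == "C" and kernel > 1:
            sizes.append(rf)
    return sizes

# Assumed layer order (not confirmed by the slide text):
stage1 = [("C", 9, 1), ("P", 2, 2), ("C", 9, 1), ("P", 2, 2),
          ("C", 9, 1), ("P", 2, 2), ("C", 5, 1), ("C", 9, 1),
          ("C", 1, 1), ("C", 1, 1)]
stage2 = [("C", 11, 1), ("C", 11, 1), ("C", 11, 1)]

print(receptive_fields(stage1 + stage2))
# [9, 26, 60, 96, 160, 240, 320, 400]
```

Each pooling step doubles `jump`, so later convolutions widen the receptive field much faster than early ones. That is how three 11×11 convolutions after three poolings take the field from 160 to 400 pixels.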
CAD Selection (VGG-NET) Extrinsics Selection (VGG-NET)
image patch 3@ (224x224)
Images Using CNNs Trained with Rendered 3D Model Views”. In ICCV 2015.
via Surface Normal Prediction”. In CVPR 2016.
[Christopher Olah] Understanding LSTM Networks, http://colah.github.io/posts/2015-08-Understanding-LSTMs/
3D Convolutional LSTM
[Figure: MVS versus Ours]
Training
Testing
Reducing Precision
32-bit
8-bit
1-bit

{-1,+1}     {0,1}
MUL         XNOR
ADD, SUB    Bit-Count (popcount)

[Han et al. 2016]

Rastegari, Mohammad, et al. "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks." ECCV 2016

XNOR-Networks
[Figure: convolution inputs (I ∗ ...) labelled R and B]

Operations         Memory   Computation
+, −, ×            1x       1x
+, −               ~32x     ~2x
XNOR, Bit-count    ~32x     ~58x    (XNOR-Networks)
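The replacement of MUL and ADD/SUB by XNOR and bit-count can be illustrated with a small sketch. It encodes a {-1,+1} vector as bits {0,1}, packs the bits into one integer, and recovers the dot product from the number of matching bits. The function names `pack_signs` and `xnor_dot` are hypothetical; this is not the XNOR-Net implementation, only the arithmetic identity behind it.

```python
def pack_signs(values):
    """Encode a {-1,+1} vector as bits ({-1 -> 0, +1 -> 1}) packed into one int."""
    bits = 0
    for i, s in enumerate(values):
        if s > 0:
            bits |= 1 << i
    return bits

def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1,+1} vectors of length n.

    XNOR marks positions where the signs agree; each agreement contributes +1
    and each disagreement -1, so dot = matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                      # keep only the n valid bits
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")   # popcount of XNOR
    return 2 * matches - n

a = [+1, -1, +1, +1, -1, -1, +1, -1]
b = [+1, +1, -1, +1, -1, +1, +1, -1]
reference = sum(x * y for x, y in zip(a, b))            # MUL and ADD version
binary = xnor_dot(pack_signs(a), pack_signs(b), len(a))  # XNOR and popcount version
```

Storing one bit per value instead of a 32-bit float corresponds to the ~32x memory figure in the table. The sketch makes no attempt to reproduce the ~58x computation figure, which depends on hardware.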
specific application domain (e.g. graphics cards, ethernet cards, DSPs, etc.).
(Taken from a recent talk by Yann LeCun at the Hot Chips conference in May 2015).
Deep Learning
Check out this article.