Fast Convolution Algorithms for deep learning and computer vision
Sample slides only
Presenter: Prof. Ioannis Pitas Aristotle University of Thessaloniki pitas@csd.auth.gr
Outline
- 1D convolutions
- Linear & cyclic 1D convolutions
- Discrete Fourier Transform, Fast Fourier Transform
- Winograd algorithm
- Convolutional neural networks
- Image filtering
- Image feature calculation
- Template matching
- Correlation tracking
The 1D linear convolution of a signal $y(n)$ with a filter $h(n)$ of length $N$ is defined as:

$$z(n) = h(n) * y(n) = \sum_{k=0}^{N-1} h(k)\, y(n-k)$$
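As a sketch, the defining sum can be implemented directly (illustrative NumPy code; the function name `linear_conv1d` is ours, and `np.convolve` computes the same result in practice):

```python
import numpy as np

def linear_conv1d(h, y):
    """Direct 1D linear convolution: z(n) = sum_k h(k) y(n - k).

    The output has length len(h) + len(y) - 1 ("full" convolution).
    """
    h = np.asarray(h, dtype=float)
    y = np.asarray(y, dtype=float)
    N = len(h) + len(y) - 1
    z = np.zeros(N)
    for n in range(N):
        for k in range(len(h)):
            if 0 <= n - k < len(y):  # y is zero outside its support
                z[n] += h[k] * y[n - k]
    return z
```

For example, `linear_conv1d([1, 2, 3], [1, 1, 1])` matches `np.convolve([1, 2, 3], [1, 1, 1])`.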
If the filter is centered around $0$, with odd length $N = 2w + 1$, it takes the form:

$$z(n) = h(n) * y(n) = \sum_{k=-w}^{w} h(k)\, y(n-k)$$
Image source: http://electricalacademia.com/signals-and-systems/example-of-discrete-time-graphical-convolution/
The 1D cross-correlation of $h(n)$ and $y(n)$ is defined as:

$$r(n) = \sum_{k=0}^{N-1} h(k)\, y(n+k)$$

It is used, e.g., for 1D template matching and correlation tracking in audio and video.
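A minimal sketch of this correlation for the positions where the template fully overlaps the signal (the name `cross_corr1d` is ours; `np.correlate` is the NumPy equivalent):

```python
import numpy as np

def cross_corr1d(h, y):
    """Cross-correlation r(n) = sum_{k=0}^{N-1} h(k) y(n + k).

    h slides over y without the time reversal used in convolution;
    only positions where h fully overlaps y are returned.
    """
    h = np.asarray(h, dtype=float)
    y = np.asarray(y, dtype=float)
    n_out = len(y) - len(h) + 1
    return np.array([np.dot(h, y[n:n + len(h)]) for n in range(n_out)])
```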
The cyclic (circular) convolution of two length-$N$ sequences is defined as:

$$z(n) = y(n) \circledast h(n) = \sum_{k=0}^{N-1} h(k)\, y\big(((n-k))_N\big)$$
The linear convolution of a length-$L$ signal with a length-$M$ filter can be computed by zero-padding both sequences to length $N \geq L + M - 1$ and then performing a cyclic convolution of length $N$:

$$z(n) = y(n) \circledast h(n) = \sum_{k=0}^{N-1} y(k)\, h\big(((n-k))_N\big)$$
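The zero-padding construction can be sketched as follows (illustrative code; `cyclic_conv` and `linear_via_cyclic` are our names):

```python
import numpy as np

def cyclic_conv(h, y):
    """Cyclic convolution of two length-N sequences:
    z(n) = sum_{k=0}^{N-1} h(k) y((n - k) mod N)."""
    h = np.asarray(h, dtype=float)
    y = np.asarray(y, dtype=float)
    N = len(h)
    return np.array([sum(h[k] * y[(n - k) % N] for k in range(N))
                     for n in range(N)])

def linear_via_cyclic(h, y):
    """Linear convolution via zero-padding to N >= L + M - 1,
    followed by a length-N cyclic convolution."""
    N = len(h) + len(y) - 1
    hp = np.pad(np.asarray(h, dtype=float), (0, N - len(h)))
    yp = np.pad(np.asarray(y, dtype=float), (0, N - len(y)))
    return cyclic_conv(hp, yp)
```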
Cyclic convolution can also be calculated using the 1D DFT:

$$\mathbf{z} = \mathrm{IDFT}\big(\mathrm{DFT}(\mathbf{h}) \odot \mathrm{DFT}(\mathbf{y})\big)$$

where $\odot$ denotes the element-wise (Hadamard) product.
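A sketch of the DFT-based route using NumPy's FFT (the real part is taken because floating-point round-off leaves a tiny imaginary residue for real inputs):

```python
import numpy as np

def cyclic_conv_fft(h, y):
    """Cyclic convolution via z = IDFT(DFT(h) . DFT(y)),
    '.' being the element-wise product."""
    Z = np.fft.fft(h) * np.fft.fft(y)
    return np.real(np.fft.ifft(Z))
```

With the FFT, the cost drops from $O(N^2)$ for the direct sum to $O(N \log N)$.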
The DFT can be computed fast by the decimation-in-time (DIT) Fast Fourier Transform (FFT) (Cooley-Tukey).
The $N$-point DFT of a signal $y(n)$ is given by:

$$Y(k) = \sum_{n=0}^{N-1} y(n)\, e^{-2\pi j \frac{kn}{N}}$$

where $k$ is an integer ranging from $0$ to $N-1$.
"butterfly" operations.
The Z-transform of a signal (function) $y(n)$ having domain $[0, \dots, N-1]$ is given by:

$$Y(z) = \sum_{n=0}^{N-1} y(n)\, z^{-n}$$

The domain of the Z-transform is the complex plane, since $z$ is a complex number. The following relation holds for the Z-transform:

$$z(n) = y(n) * h(n) \Leftrightarrow Z(z) = Y(z)\, H(z)$$
where $(n)_N = n \bmod N$. In the Z-transform domain, cyclic convolution corresponds to polynomial multiplication modulo $z^N - 1$, i.e. the powers of $z$ wrap around in a cyclic fashion.
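Since multiplying Z-transforms multiplies polynomials, convolving two coefficient sequences and multiplying the corresponding polynomials agree; a quick numerical check (illustrative values):

```python
import numpy as np

# Z(z) = Y(z) H(z): polynomial multiplication of coefficient
# sequences is exactly linear convolution of those sequences.
y = [1.0, 2.0, 3.0]
h = [1.0, -1.0]
assert np.allclose(np.polymul(y, h), np.convolve(y, h))
```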
Fast convolution can be expressed in the form of bilinear filtering algorithms:

$$\mathbf{z} = \mathbf{C}(\mathbf{A}\mathbf{h} \odot \mathbf{B}\mathbf{y})$$

Winograd convolution algorithms minimize the number of multiplications in their middle vector product, thus having minimal multiplicative complexity. They are constructed from the factors of the polynomial $z^N - 1$ over the rational numbers $\mathbb{Q}$ (the cyclotomic polynomials).
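As a concrete instance, the Winograd minimal filtering algorithm F(2,3) computes two outputs of a 3-tap filter from four input samples with only 4 multiplications in the middle element-wise product. The transform matrices below are the ones popularized by Lavin and Gray for CNN inference; the matrix names match the bilinear form $\mathbf{z} = \mathbf{C}(\mathbf{A}\mathbf{h} \odot \mathbf{B}\mathbf{y})$:

```python
import numpy as np

# Winograd F(2,3): two outputs of a 3-tap filter from four inputs,
# using only 4 multiplications in the element-wise product.
C = np.array([[1, 1,  1,  0],
              [0, 1, -1, -1]], dtype=float)   # output transform
A = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])              # filter transform
B = np.array([[1, 0, -1,  0],
              [0, 1,  1,  0],
              [0, -1, 1,  0],
              [0, 1,  0, -1]], dtype=float)   # input transform

def winograd_f23(h, d):
    """z(i) = sum_k h(k) d(i + k), i = 0, 1 (CNN-style correlation)."""
    return C @ ((A @ h) * (B @ d))  # 4 elementwise multiplications
```

For example, `winograd_f23(np.array([1., 2., 3.]), np.array([4., 5., 6., 7.]))` returns `[32., 38.]`, matching the direct sliding dot product.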
The 2D linear convolution of an image $y(n_1, n_2)$ with a 2D filter $h(n_1, n_2)$ of size $N_1 \times N_2$ is given by:

$$z(n_1, n_2) = h(n_1, n_2) ** y(n_1, n_2) = \sum_{k_1=0}^{N_1-1} \sum_{k_2=0}^{N_2-1} h(k_1, k_2)\, y(n_1 - k_1, n_2 - k_2)$$
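A direct sketch of the 2D sum (illustrative only; deep learning frameworks and `scipy.signal.convolve2d` provide optimized versions):

```python
import numpy as np

def conv2d_linear(h, y):
    """Direct 2D linear convolution ("full" output):
    z(n1, n2) = sum_{k1} sum_{k2} h(k1, k2) y(n1 - k1, n2 - k2)."""
    h = np.asarray(h, dtype=float)
    y = np.asarray(y, dtype=float)
    H1, H2 = h.shape
    Y1, Y2 = y.shape
    z = np.zeros((H1 + Y1 - 1, H2 + Y2 - 1))
    for k1 in range(H1):
        for k2 in range(H2):
            # Each filter tap adds a shifted, scaled copy of y.
            z[k1:k1 + Y1, k2:k2 + Y2] += h[k1, k2] * y
    return z
```

For separable inputs, 2D convolution factorizes into two 1D convolutions, which gives a convenient correctness check.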
is defined as:
π§ π1, π2 = β π1, π2 ββ π¦ π1, π2 = ΰ·
π1 π1
ΰ·
π2 π2
β π1, π2 π¦( π1 β π1 π1, π2 β π2 π2)
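The DFT route carries over unchanged to 2D; a sketch using `np.fft.fft2` (both arrays must share the same $N_1 \times N_2$ shape):

```python
import numpy as np

def cyclic_conv2d_fft(h, y):
    """2D cyclic convolution via the 2D DFT:
    z = IDFT2(DFT2(h) . DFT2(y)), '.' element-wise."""
    return np.real(np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(y)))
```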
Applications:
- Signal filtering
- Signal restoration
- Signal deconvolution
- Time delay estimation
- Distance calculation (e.g., sonar)
- 1D template matching
Convergence of machine learning and signal processing
A convolutional layer employs an activation function $f(\cdot)$, multiple incoming features $c_{in}$ and one single output feature $p$; for RGB images $c_{in} = 3$. The multiple-input-features-to-single-feature-$p$ transformation is:

$$z^{(l)}(i, j, p) = b_p^{(l)} + \sum_{s=1}^{c_{in}} \sum_{k_1=-w_1^{(l)}}^{w_1^{(l)}} \sum_{k_2=-w_2^{(l)}}^{w_2^{(l)}} w^{(l)}(k_1, k_2, s, p)\, y^{(l)}(i - k_1, j - k_2, s)$$

[Figure: convolutional layer producing an activation volume (3D tensor) from $c_{in}$ input features.]
In compact, region-based notation:

$$z_p^{(l)}(\mathbf{i}) = f\Big(b_p^{(l)} + \sum_{s=1}^{c_{in}} \mathbf{W}^{(l)}(s, p) * \mathbf{REG}_\mathbf{i}^{(l)}(s)\Big)$$

$$\mathbf{A}^{(l)} = \big\{ z_p^{(l)}(\mathbf{i}) : i = 1, \dots, W^{(l)},\ j = 1, \dots, H^{(l)},\ p = 1, \dots, c_{out} \big\}$$

where $\mathbf{A}^{(l)}$ is the activation volume for the convolutional layer $l$, $\mathbf{W}^{(l)}(s, p)$ is a 2D slice of the convolutional kernel $\mathbf{W}^{(l)} \in \mathbb{R}^{h_1 \times h_2 \times c_{in} \times c_{out}}$ for input feature $s$ and output feature $p$, $b_p^{(l)}$ a scalar bias and $\mathbf{REG}_\mathbf{i}^{(l)}(s)$ a region of input feature $s$ centered at $\mathbf{i} = (i, j)$, e.g. $\mathbf{REG}^{(1)}(1)$ the R channel of an RGB image ($c_{in} = D = 3$).
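The layer equation can be sketched as a naive forward pass (illustrative only: `conv_layer` is our name, "valid" CNN-style correlation without padding is assumed, and ReLU stands in for $f$):

```python
import numpy as np

def conv_layer(y, W, b, f=lambda v: np.maximum(v, 0.0)):
    """Naive convolutional-layer forward pass.

    y: input activation volume, shape (H, Wd, c_in)
    W: kernel tensor, shape (h1, h2, c_in, c_out)
    b: biases, shape (c_out,); f: activation function (ReLU here).
    Returns the activation volume, shape (H - h1 + 1, Wd - h2 + 1, c_out).
    """
    h1, h2, c_in, c_out = W.shape
    H, Wd, _ = y.shape
    z = np.zeros((H - h1 + 1, Wd - h2 + 1, c_out))
    for p in range(c_out):            # each output feature map
        for s in range(c_in):         # sum over input features
            for k1 in range(h1):
                for k2 in range(h2):
                    z[:, :, p] += W[k1, k2, s, p] * y[k1:k1 + z.shape[0], k2:k2 + z.shape[1], s]
        z[:, :, p] += b[p]            # scalar bias per output feature
    return f(z)
```

Production frameworks replace these loops with im2col/GEMM, FFT, or Winograd kernels; the result is the same activation volume.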
Image Source: Heehoon Kim, Hyoungwook Nam, Wookeun Jung, and Jaejin Lee - Performance Analysis of CNN Frameworks for GPUs