SLIDE 19 4/27/20 | Department of Computer Science | Laboratory for Parallel Programming | Arya Mazaheri | 19
Winograd transformation optimization results
[Figure: arithmetic reduction ratio (axis range 0.0 to 0.6) for F(1,r) through F(9,r), in three panels for r = 3, 5, 7, each labeled α = 8. Series: Transformations, Whole Winograd.]
[Figure: runtime (ms) and speedup ratio for F(2,r) through F(9,r), in three panels: 3x3 conv, 5x5 conv, 7x7 conv. Series: Non-optimized, Optimized.]
- Overall arithmetic reduction ratios for the transformation steps and for the whole Winograd algorithm, for a single tile
- Runtime comparison on an Nvidia GTX 1080 Ti
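
To make "transformation steps" versus "whole Winograd algorithm for a single tile" concrete, the sketch below computes one 1D F(2,3) tile using the standard textbook transformation matrices. It is only an illustration under stated assumptions: the helper names (`matvec`, `winograd_f23`, `direct_f23`, `mult_counts`) are hypothetical, the matrices are the commonly published F(2,3) ones rather than the optimized transformations measured on this slide, and the multiplication count shown is the usual m*r versus m+r-1 comparison, which may differ from how the slide defines its arithmetic reduction ratio.

```python
# Minimal sketch of one 1D Winograd F(2,3) tile (standard matrices, assumed here).

# Input transform B^T, filter transform G, output transform A^T for F(2,3).
BT = [[1, 0, -1, 0],
      [0, 1, 1, 0],
      [0, -1, 1, 0],
      [0, 1, 0, -1]]
G = [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0, 0.0, 1.0]]
AT = [[1, 1, 1, 0],
      [0, 1, -1, -1]]


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def winograd_f23(d, g):
    """Compute 2 outputs from a 4-element input tile d and a 3-tap filter g."""
    U = matvec(G, g)                     # filter transformation
    V = matvec(BT, d)                    # input transformation
    M = [u * v for u, v in zip(U, V)]    # 4 element-wise multiplications
    return matvec(AT, M)                 # output transformation


def direct_f23(d, g):
    """Reference: direct sliding-window correlation, 6 multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def mult_counts(m, r):
    """Multiplications for direct vs. Winograd 1D F(m, r)."""
    return m * r, m + r - 1


if __name__ == "__main__":
    d, g = [1, 2, 3, 4], [1, 0, -1]
    print(winograd_f23(d, g), direct_f23(d, g))
    print(mult_counts(2, 3))
```

The three `matvec` calls are the transformation steps; together with the element-wise product they form the whole algorithm for one tile. The transformations consist only of additions and multiplications by constants, which is the part of the arithmetic that the optimization reported on this slide targets.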