Band-limited Training and Inference for Convolutional Neural Networks - PowerPoint PPT Presentation

  1. Band-limited Training and Inference for Convolutional Neural Networks

  3. FFT, IFFT

  5. Mathieu et al.: “Fast Training of Convolutional Networks through FFTs”; “Fast Convolutional Nets With fbfft: A GPU Performance Evaluation”. Data: x → FFT(x) → xfft. Filter: y → FFT(y) → yfft. offt = xfft ⊙ yfft. Out: o = IFFT(offt).

  9. cuDNN: Substantial memory workspace needed for intermediate results. Same references and FFT pipeline as slide 5.

  10. Band-limiting = masking out high frequencies. Data: x → FFT(x) → xfft → band-limited → xCfft. Filter: y → FFT(y) → yfft → band-limited → yCfft. offt = xCfft ⊙ yCfft. Out: o = IFFT(offt).

  11. Same pipeline as slide 10, annotated: less memory used.

  12. Same pipeline as slide 10, annotated: less memory used, faster computation.

  13. Same pipeline as slide 10, annotated: less memory used, faster computation. Preserve enough of the spectrum to retain high accuracy of models.

  15. 2. Conjugate symmetry; 1-j; 1+j

  16. 2. Conjugate symmetry; 1+j

  17. 2. Conjugate symmetry; DC; 1+j; 3. Real values

  18. 2. Conjugate symmetry; DC; 1+j; 3. Real values; 4. No constraints

  19. 2. Conjugate symmetry; DC; 3. Real values; 4. No constraints; 5. 1st compression

  21. 2. Conjugate symmetry; DC; 3. Real values; 4. No constraints; 5. 1st compression; 6. 2nd compression

  23. Plot: ResNet-18 on CIFAR-10. Test accuracy (%) vs. compression rate (%).

  24. Same plot, annotated with 93.5%.

  25. Same plot, annotated with 93.5% and 92%.

  26. Two plots of test accuracy (%) vs. compression rate (%): ResNet-18 on CIFAR-10 (annotated with 93.5% and 92%) and DenseNet-121 on CIFAR-100 (annotated with 75.3% and 71.2%).

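  29. Cross-correlate input data and filter: $x *_c y$
      $F_x[\omega] = F(x[n])$, $F_y[\omega] = F(y[n])$
      $x *_c y = F^{-1}(F_x[\omega] \odot F_y[\omega])$
      Spectrum of convolution: $S[\omega] = F_x[\omega] \odot F_y[\omega]$
      Mask: $M_c[\omega] = \begin{cases} 1, & \omega \le c \\ 0, & \omega > c \end{cases}$
      $x *_c y = F^{-1}[(F_x[\omega] \odot M_c[\omega]) \odot (F_y[\omega] \odot M_c[\omega])]$
      $x *_c y = F^{-1}[S[\omega] \odot M_c[\omega]]$
      Energy (Parseval’s theorem): $\sum_{n=0}^{N-1} |x[n]|^2 = \sum_{\omega=0}^{2\pi} |F_x[\omega]|^2$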

  31. Plot: DenseNet-121 on CIFAR-100. Test accuracy (%) vs. inference compression rate (%), with curves for C=50 and C=75.

  32. Same plot with curves for C=0, C=50, C=75, and C=85.

  33. Plots: ResNet-18 on CIFAR-10. Normalized performance (%) vs. compression rate (%), one panel for GPU memory allocated and one for epoch time.

  34. Plot: ResNet-18 on CIFAR-10. Test accuracy (%) vs. inference compression rate (%), one curve per train compression rate (%); curve shown: C=0.

  35. Same plot with curves for C=0 and C=85.

  36. Same plot with curves for C=0 and C=85, annotated: smooth degradation of accuracy during inference.

  37. Same plot with curves for C=0, C=30, C=50, and C=85, annotated: apply the same compression rate to training and inference.

  38. Plots vs. compression rate (%): test accuracy (%), GPU memory allocated, and epoch time.

  44. “Speaking of longer term, it would be nice if the community migrated to a fully open sourced implementation for all of this [convolution operations, etc.]. This stuff is just too important to the progress of the field for it to be locked away in proprietary implementations. The more people working together on this the better for everyone. There's plenty of room to compete on the hardware implementation side.” Scott Gray, https://github.com/soumith/convnet-benchmarks/issues/93
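
The pipeline on slides 5 to 13 (FFT of data and filter, optional band-limiting, element-wise product, inverse FFT) can be illustrated with a minimal 1D NumPy sketch. This is not the authors' implementation: the function name `fft_convolve` and the `compression` parameter are hypothetical names chosen here, and the sketch computes a plain linear convolution so that it can be checked against `np.convolve`. Truncating the spectra (rather than multiplying by a mask of zeros) is one assumed way to show why fewer coefficients need to be stored and multiplied.

```python
import numpy as np


def fft_convolve(x, y, compression=0.0):
    """1D linear convolution via FFT with optional band-limiting.

    `compression` (hypothetical name) is the fraction of the
    highest-frequency rfft coefficients that are discarded.
    """
    n = len(x) + len(y) - 1          # length of the full linear convolution
    xfft = np.fft.rfft(x, n)         # FFT(x), zero-padded to length n
    yfft = np.fft.rfft(y, n)         # FFT(y), zero-padded to length n

    # Band-limiting: keep only the lowest `keep` frequency coefficients.
    keep = max(1, int(round(len(xfft) * (1.0 - compression))))
    xcfft = xfft[:keep]
    ycfft = yfft[:keep]

    offt = xcfft * ycfft             # element-wise product of the spectra
    # irfft zero-pads the missing high frequencies back to n//2 + 1 bins.
    return np.fft.irfft(offt, n)     # Out: o = IFFT(offt)


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 0.0, -1.0])
exact = fft_convolve(x, y)                        # no compression
approx = fft_convolve(x, y, compression=0.5)      # half of the spectrum dropped
```

With `compression=0.0` the result matches direct convolution up to floating-point error; with a nonzero value it is an approximation. CNN layers usually compute cross-correlation rather than convolution, which in this setting would correspond to flipping the filter first; that step is left out of the sketch.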
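
Slides 15 to 22 label parts of the spectrum with "Conjugate symmetry", "DC", and "Real values". The following sketch shows the 1D analogue of those properties for a real-valued signal using NumPy; it is an illustration under that assumption only, and it does not reproduce the slides' 2D layout or their "1st compression" and "2nd compression" regions, which cannot be recovered from the transcript.

```python
import numpy as np

x = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5])  # real-valued signal, N = 6
N = len(x)

full = np.fft.fft(x)    # all N complex coefficients
half = np.fft.rfft(x)   # only the N // 2 + 1 non-redundant coefficients

# Conjugate symmetry: X[N - k] equals conj(X[k]) for k = 1 .. N - 1.
symmetric = all(np.isclose(full[N - k], np.conj(full[k])) for k in range(1, N))

# The DC term (k = 0) is real; for even N the Nyquist term (k = N / 2) is too.
dc_is_real = bool(np.isclose(full[0].imag, 0.0))
nyquist_is_real = bool(np.isclose(full[N // 2].imag, 0.0))

# The half spectrum is sufficient to recover the original signal.
recovered = np.fft.irfft(half, N)
```

Because of this symmetry, roughly half of the coefficients of a real signal's FFT are redundant, which is why `rfft` returns only `N // 2 + 1` values.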
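
Slide 29 defines a frequency mask and cites Parseval's theorem for the signal energy. A small NumPy check of both ideas might look like the sketch below. The helper `band_mask`, the cutoff `c=16`, and the random test signal are assumptions made for illustration; the unitary normalization (`norm="ortho"`) is chosen so that the two energy sums agree without an extra 1/N factor.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)          # arbitrary real test signal

# Unitary DFT, so time-domain and frequency-domain energies are equal.
fx = np.fft.fft(x, norm="ortho")
energy_time = np.sum(np.abs(x) ** 2)
energy_freq = np.sum(np.abs(fx) ** 2)


def band_mask(n, c):
    """Return a 0/1 mask that is 1 where the frequency index |w| <= c."""
    # With d = 1/n, fftfreq returns integer frequency indices per bin.
    freq = np.abs(np.fft.fftfreq(n, d=1.0 / n))
    return (freq <= c).astype(float)


mask = band_mask(len(x), c=16)

# Fraction of the signal energy that survives band-limiting.
retained = np.sum(np.abs(fx * mask) ** 2) / energy_freq

# The mask is symmetric in +/- frequency, so the filtered signal stays real.
x_band_limited = np.fft.ifft(fx * mask, norm="ortho")
```

The `retained` ratio gives one way to quantify how much of the spectrum a given cutoff preserves, in the spirit of the slide 13 remark about preserving enough of the spectrum to retain accuracy.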
