High-Performance Deep Learning: Issues, Trends, and Challenges CSE - PowerPoint PPT Presentation

High-Performance Deep Learning: Issues, Trends, and Challenges CSE 5194.01 Autumn ‘20 Dhabaleswar K. (DK) Panda Hari Subramoni Arpan Jain The Ohio State University The Ohio State University The Ohio State University E-mail: panda@cse.ohio-state.edu E-mail: subramon@cse.ohio-state.edu E-mail: jain.575@osu.edu http://www.cse.ohio-state.edu/~panda http://www.cse.ohio-state.edu/~subramon http://www.cse.ohio-state.edu/~jain.575

Outline • Introduction – The Past, Present, and Future of Deep Learning – What are Deep Neural Networks? – Diverse Applications of Deep Learning – Deep Learning Frameworks • Overview of Execution Environments • Parallel and Distributed DNN Training • Latest Trends in HPC Technologies • Challenges in Exploiting HPC Technologies for Deep Learning Network Based Computing Laboratory CSE 5194.01 2

What is Deep Learning? • Deep Learning (DL) – A subset of Machine Learning that uses Deep Neural Networks (DNNs) – Perhaps, the most revolutionary subset! • Based on learning data representation • Examples Convolutional Neural Networks, Recurrent Neural Networks, Hybrid Networks • Data Scientist or Developer Perspective 1. Identify DL as solution to a problem 2. Determine Data Set 3. Select Deep Learning Algorithm to Use 4. Use a large data set to train an Courtesy: https://hackernoon.com/difference-between-artificial-intelligence-machine-learning- algorithm and-deep-learning-1pcv3zeg, https://blog.dataiku.com/ai-vs.-machine-learning-vs.-deep-learning Network Based Computing Laboratory CSE 5194.01 3

Brief History of Deep Learning (DL) Courtesy: http://www.zdnet.com/article/caffe2-deep-learning-wide-ambitions-flexibility-scalability-and-advocacy/ Network Based Computing Laboratory CSE 5194.01 4

Milestones in the Development of Neural Networks Courtesy: https://beamandrew.github.io/deeplearning/2017/02/23/deep_learning_101_part1.html Network Based Computing Laboratory CSE 5194.01 5

The Deep Learning Revolution • Deep Learning (DL) is a sub-set of Machine Learning (ML) – Perhaps, the most revolutionary subset! Machine Learning Deep – Feature extraction vs. hand-crafted features Learning – Availability of datasets! AI Examples: Logistic Examples: Regression • Deep Learning MLPs, DNNs, – A renewed interest and a lot of hype! – Key success: Deep Neural Networks (DNNs) – Everything was there since the late 80s except the “ computability of DNNs” Adopted from: http://www.deeplearningbook.org/contents/intro.html Network Based Computing Laboratory CSE 5194.01 6

AlexNet Three key pieces in the DL Resurgence 10000 9000 8000 Minutes to Train • Modern and efficient hardware enabled 7000 6000 5000 ~500 00X i X in 5 years – Computability of DNNs – impossible in the past! 4000 3000 – GPUs – at the core of DNN training 2000 1000 – CPUs – catching up fast 0 2 GTX 580 DGX-2 • Availability of Datasets – MNIST, CIFAR10, ImageNet, and more… • Excellent Accuracy for many application areas – Vision, Machine Translation, and several others... Courtesy: A. Canziani et al., “An Analysis of Deep Neural Network Models for Practical Applications”, CoRR , 2016. Network Based Computing Laboratory CSE 5194.01 7

The Rise of GPU-based Deep Learning Courtesy: http://images.nvidia.com/content/technologies/deep-learning/pdf/NVIDIA-DeepLearning-Infographic-v11.pdf Network Based Computing Laboratory CSE 5194.01 8

Intel is committed to AI and Deep Learning as well! Courtesy: https://newsroom.intel.com/editorials/krzanich-ai-day/ Network Based Computing Laboratory CSE 5194.01 9

Deep Learning and High-Performance Architectures • NVIDIA GPUs are the main driving force for faster training of DL models – The ImageNet Challenge - (ILSVRC) -- 90% of the teams used GPUs (2014) * – Deep Neural Networks (DNNs) like ResNet(s) and Inception • However, High Performance Architectures for DL and HPC are evolving – 110/500 Top HPC systems use NVIDIA GPUs (Jun ’20) – DGX-1 (Pascal) and DGX-2 (Volta) • Dedicated DL supercomputers – Cascade-Lake Xeon CPUs have 28 cores/socket (TACC Frontera– #8 on Top500) – AMD EPYC (Rome) CPUs have 64 cores/socket (Upcoming DOE Clusters) – AMD GPUs will be powering Frontier – DOE’s Exascale System at ORNL – Domain Specific Accelerators for DNNs are also emerging Accelerator/CP *https://blogs.nvidia.com/blog/2014/09/07/imagenet/ Performance Share www.top500.org Network Based Computing Laboratory CSE 5194.01 10

The Bright Future of Deep Learning Courtesy: https://www.top500.org/news/market-for-artificial-intelligence-projected-to-hit-36-billion-by-2025/ Network Based Computing Laboratory CSE 5194.01 11

Current and Future Use Cases of Deep Learning Courtesy: https://www.top500.org/news/market-for-artificial-intelligence-projected-to-hit-36-billion-by-2025/ Network Based Computing Laboratory CSE 5194.01 12

So what is a Deep Neural Network? • Example of a 3-layer Deep Neural Network (DNN) – (input layer is not counted) Courtesy: http://cs231n.github.io/neural-networks-1/ Network Based Computing Laboratory CSE 5194.01 14

Graphical/Mathematical Intuitions for DNNs Drawing of a Biological Neuron The Mathematical Model Courtesy: http://cs231n.github.io/neural-networks-1/ Network Based Computing Laboratory CSE 5194.01 15

Key Phases: DNN Training and Inference • Training is compute intensive – Many passes over data – Can take days to weeks – Model adjustment is done • Inference – Single pass over the data – Should take seconds – No model adjustment • Challenge: How to make “Training” faster? Courtesy: https://devblogs.nvidia.com/ – Need Parallel and Distributed Training… Network Based Computing Laboratory CSE 5194.01 16

TensorFlow playground (Quick Demo) • To actually train a network, please visit: http://playground.tensorflow.org Network Based Computing Laboratory CSE 5194.01 17

Inference on trained ResNet50 (Quick Demo) • To try your own image, please visit: https://microsoft.github.io/onnxjs-demo/#/resnet50 Network Based Computing Laboratory CSE 5194.01 18

Diverse Application Areas for Deep Learning • Vision – Image Classification – Style Transfer – Caption Generation • Speech – Speech Recognition – Real-time Translation • Text – Sequence Recognition and Generation • Disease discovery – Cancer Detection • Autonomous Driving – Combination of multiple areas like Image/Object Detection, Speech Recognition, etc. Network Based Computing Laboratory CSE 5194.01 20

Style Transfer Courtesy: https://github.com/alexjc/neural-doodle Network Based Computing Laboratory CSE 5194.01 21

Style Transfer Courtesy: https://github.com/alexjc/neural-doodle Network Based Computing Laboratory CSE 5194.01 22

Caption Generation Courtesy: https://machinelearningmastery.com/inspirational-applications-deep-learning/ Network Based Computing Laboratory CSE 5194.01 23

Shakespeare’s Style Passage Generation Remember, all the RNN knows are characters, so in particular it samples both speaker’s names and the contents. Sometimes we also get relatively extended monologue passages, such as: • VIOLA: Why, Salisbury must find his flesh and thought That which I am not aps, not a man and in fire, To show the reining of the raven and the wars To grace my hand reproach within, and not a fair are hand, That Caesar and my goodly father's world; When I was heaven of presence and our fleets, We spare with hours, but cut thy council I am great, Murdered and by thy master's ready there My power to give thee but so much as hell: Some service in the noble bondman here, Would show him to her wine. • KING LEAR: O, if you were a feeble sight, the courtesy of your law, Your sight and several breath, will wear the gods With his heads, and my hands are wonder'd at the deeds, So drop upon your lordship's head, and your opinion Shall be against your honour. Courtesy: http://karpathy.github.io/2015/05/21/rnn-effectiveness/ Network Based Computing Laboratory CSE 5194.01 24

Machine Translation Some of the “dirty” letters we use for training. Dirt, highlights, and rotation, but not too much because we don’t want to confuse our neural net. Courtesy: https://research.googleblog.com/2015/07/how-google-translate-squeezes-deep.html Network Based Computing Laboratory CSE 5194.01 25

Google Translate Courtesy: https://www.theverge.com/2015/1/14/7544919/google-translate-update-real-time-signs-conversations Network Based Computing Laboratory CSE 5194.01 26

High-Performance Deep Learning: Issues, Trends, and Challenges CSE - PowerPoint PPT Presentation

High-Performance Deep Learning: Issues, Trends, and Challenges CSE 5194.01 Autumn 20 Dhabaleswar K. (DK) Panda Hari Subramoni Arpan Jain The Ohio State University The Ohio State University The Ohio State University E-mail:

Hao Su July 6, 2017 Outline Overview of 3D deep learning 3D deep learning algorithms

All You Want To Know About CNNs Yukun Zhu Deep Learning Deep Learning Image from

Deep Neural Networks and Deep Reinforcement Learning Deep Learning, Goodfellow, Bengio and

Trends in High Performance Trends in High Performance Computing and the Grid Computing and the

AGN deep multiwavelength AGN deep multiwavelength AGN deep multiwavelength surveys: surveys:

Deep Learning: Theory and Practice Deep Learning - Practical 02-04-2020 Considerations

Deep Learning on GPUs March 2016 What is Deep Learning? GPUs and DL AGENDA DL in practice

Presentation about Deep Learning --- Zhongwu xie Contents 1.Brief introduction of Deep learning.

Deep learning Deep reinforcement learning Hamid Beigy Sharif university of technology December

Differen'able Func'onal Programming Noel Welsh @noelwelsh underscore Goals Deep learning

DSC 102 Systems for Scalable Analytics Arun Kumar Topic 6: Deep Learning Systems 1 Outline

Trends in High Performance Trends in High Performance Computing and Using Numerical Computing

ACCELERATE DEEP LEARNING WITH NVIDIA'S DEEP LEARNING PLATFORM | STEPHEN JONES | GTC16 DEEP

Deep learning for natural language processing A short primer on deep learning Benoit Favre <

Relational Deep Learning: A Deep Latent Variable Model for Link Prediction Hao Wang, Xingjian

Medical Imaging Elisa Sayrol Medical Imaging Interest in this area in Deep Learning: DeepDeep

Keeping the Conversation Going Cambridge University Press Remote Teaching Series Starting Soon!

WiFuzz: Detecting and Exploiting Logical Flaws in the Wi-Fi Cryptographic Handshake Mathy

Outreach Session: Government-to-Government Training Initiative Hosted by: FTA and National RTAP

SOUND BASICS 35305 slide 0 Key Notes difference between

APAC & ATAC July Meeting July 20/July 21, 2020 Texas Education Agency | Governance &

Way Presented by Tamera Parsons, CPHQ July 24, 2017 1 Housekeeping Slides were sent this

Algorithms for Graphical Games Luis E. Ortiz MIT CSAIL February 1, 2005 DIMACS Workshop on

Improving Undergraduate PL Curriculum Kathleen Fisher AT&T Labs Research 1 ACM SIGPLAN

High-Performance Deep Learning: Issues, Trends, and Challenges CSE - PowerPoint PPT Presentation

High-Performance Deep Learning: Issues, Trends, and Challenges CSE 5194.01 Autumn 20 Dhabaleswar K. (DK) Panda Hari Subramoni Arpan Jain The Ohio State University The Ohio State University The Ohio State University E-mail:

Hao Su July 6, 2017 Outline Overview of 3D deep learning 3D deep learning algorithms

All You Want To Know About CNNs Yukun Zhu Deep Learning Deep Learning Image from

Deep Neural Networks and Deep Reinforcement Learning Deep Learning, Goodfellow, Bengio and

Trends in High Performance Trends in High Performance Computing and the Grid Computing and the

AGN deep multiwavelength AGN deep multiwavelength AGN deep multiwavelength surveys: surveys:

Deep Learning: Theory and Practice Deep Learning - Practical 02-04-2020 Considerations

Deep Learning on GPUs March 2016 What is Deep Learning? GPUs and DL AGENDA DL in practice

Presentation about Deep Learning --- Zhongwu xie Contents 1.Brief introduction of Deep learning.

Deep learning Deep reinforcement learning Hamid Beigy Sharif university of technology December

Differen'able Func'onal Programming Noel Welsh @noelwelsh underscore Goals Deep learning

DSC 102 Systems for Scalable Analytics Arun Kumar Topic 6: Deep Learning Systems 1 Outline

Trends in High Performance Trends in High Performance Computing and Using Numerical Computing

ACCELERATE DEEP LEARNING WITH NVIDIA'S DEEP LEARNING PLATFORM | STEPHEN JONES | GTC16 DEEP

Deep learning for natural language processing A short primer on deep learning Benoit Favre &lt;

Relational Deep Learning: A Deep Latent Variable Model for Link Prediction Hao Wang, Xingjian

Medical Imaging Elisa Sayrol Medical Imaging Interest in this area in Deep Learning: DeepDeep

Keeping the Conversation Going Cambridge University Press Remote Teaching Series Starting Soon!

WiFuzz: Detecting and Exploiting Logical Flaws in the Wi-Fi Cryptographic Handshake Mathy

Outreach Session: Government-to-Government Training Initiative Hosted by: FTA and National RTAP

SOUND BASICS 35305 slide 0 Key Notes difference between

APAC &amp; ATAC July Meeting July 20/July 21, 2020 Texas Education Agency | Governance &amp;

Way Presented by Tamera Parsons, CPHQ July 24, 2017 1 Housekeeping Slides were sent this

Algorithms for Graphical Games Luis E. Ortiz MIT CSAIL February 1, 2005 DIMACS Workshop on

Improving Undergraduate PL Curriculum Kathleen Fisher AT&amp;T Labs Research 1 ACM SIGPLAN

Deep learning for natural language processing A short primer on deep learning Benoit Favre <

APAC & ATAC July Meeting July 20/July 21, 2020 Texas Education Agency | Governance &

Improving Undergraduate PL Curriculum Kathleen Fisher AT&T Labs Research 1 ACM SIGPLAN