

SLIDE 1

Pluto

A Distributed Heterogeneous Deep Learning Framework

Jun Yang, Yan Chen Large Scale Learning, Alibaba Cloud

SLIDE 2

Outline


  • PAI (Platform of Artificial Intelligence)
    • PAI Overview
    • Deep Learning with PAI
  • Pluto
  • PAI DL Application
    • Chatbot Engine
  • Summary
SLIDE 3

Machine Learning Platforms

SLIDE 4

PAI Overview

[Architecture diagram]

  • Frontend: PAI Web Console, PAI IDE, PAI SDK
  • Data Storage: OSS; streaming data via DataHub/TT/Kafka; databases via ODPS/RDS
  • Distributed Computing: Fuxi scheduler on CPU/GPU/FPGA/ASIC/…; frameworks: MR/MPI/PS/Graph/Pluto/…
  • Algorithms: Feature Engineering, Statistical Methods, Machine Learning, Deep Learning, …
  • Serving

Tutorial: data.aliyun.com

SLIDE 5

PAI Project

[Screenshot of a PAI project: Search, Experiments, Data Sources, Components, Models, Serving]

SLIDE 6

Machine Learning with PAI

  • Data Preprocessing: Sampling & Filtering, Data Merge, Fill Missing Values, Normalization, …
  • Feature Engineering: Feature Transformation, Feature Selection, Feature Importance, Feature Generation
  • Statistics: Correlation Coefficients, Histogram, Hypothesis Test, Visualization, …
  • Modeling: Binary Classification, Multi-class Classification, Clustering, Regression, Prediction, Evaluation
  • Deep Learning (Pluto): DNN, CNN, RNN, à la carte
  • Application: NLP, Search & Recommendation, Image Processing, Network Analysis, Financial Sector, …

SLIDE 7

Deep Learning with PAI

SLIDE 8

PAI TensorFlow

  • Rich data IO
  • Distributed job optimization (multiple GPUs/CPUs)
  • Easy model serving
  • Hyper-parameter tuning
SLIDE 9

Pluto

SLIDE 10

Single-card Optimization

  • Compiler-oriented strategy
    • Fuse small ops into a bigger one
      • Reduces CUDA kernel-launch overhead
    • Prepare data layouts friendly to the low-level computation library
  • Memory optimization
    • Compiler-oriented tactics here again
      • Dependency analysis
      • Lifetime analysis
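The op-fusion point above can be illustrated with a toy graph rewrite. This is a sketch of the general compiler technique, not Pluto's actual pass: the op names, the `FUSABLE` set, and `fuse_elementwise` are all made up for illustration.

```python
# Toy fusion pass: collapse runs of small element-wise ops into one
# composite op, so a single fused kernel is launched instead of many.
# Illustrative only -- not Pluto's real compiler pass.
FUSABLE = {"add", "mul", "relu"}  # hypothetical fusable element-wise ops

def fuse_elementwise(ops):
    """ops: op names in execution order; returns the fused op list."""
    fused, run = [], []
    for op in ops:
        if op in FUSABLE:
            run.append(op)          # extend the current fusable run
            continue
        if run:                     # flush the run as one fused op
            fused.append("fused(" + "+".join(run) + ")")
            run = []
        fused.append(op)
    if run:
        fused.append("fused(" + "+".join(run) + ")")
    return fused

graph = ["matmul", "add", "relu", "matmul", "add", "mul", "relu"]
fused = fuse_elementwise(graph)    # 7 kernel launches become 4
```

Here each `add`/`mul`/`relu` tail collapses into a single fused op, which is exactly where the kernel-launch savings come from.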
SLIDE 11

Multi-card Optimization

  • Heuristic-based model parallelism
    • Takes both model weights and feature maps into consideration
    • Takes the memory allocator's strategy into consideration
    • A greedy allocation algorithm
    • With pre-run support
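A minimal sketch of what a greedy, memory-aware placement could look like. The function and its inputs are hypothetical stand-ins: per-layer costs would come from a pre-run that accounts for both weights and feature maps, as the bullets describe.

```python
def greedy_place(layer_costs, num_devices, capacity):
    """Greedily assign each layer to the card with the most free memory.

    layer_costs: per-layer memory cost (weights + feature maps), e.g.
    estimated by a pre-run. Returns a device index per layer. This is
    an illustrative stand-in for Pluto's allocation heuristic.
    """
    free = [capacity] * num_devices
    placement = []
    for cost in layer_costs:
        dev = max(range(num_devices), key=lambda d: free[d])
        if free[dev] < cost:
            raise MemoryError("no card has enough free memory")
        free[dev] -= cost
        placement.append(dev)
    return placement

# Four layers, two 8 GB cards: loads end up balanced at 5 + 5.
plan = greedy_place([4, 3, 2, 1], num_devices=2, capacity=8)
```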
SLIDE 12

Multi-card Optimization

  • Hybrid parallelism
    • A mixture of data parallelism and model parallelism
      • For communication-intensive parts, consider model parallelism
      • For computation-intensive parts, consider data parallelism
  • Tricks
    • Integrates seamlessly with the computation-graph style
    • Works best with pyramid-shaped networks
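The data-vs-model decision can be sketched as a per-layer heuristic on the ratio of parameters (what data parallelism must synchronize) to compute (what amortizes that cost). The threshold, layer names, and counts below are illustrative assumptions, not Pluto's actual rule.

```python
def choose_parallelism(layers, ratio_threshold=0.1):
    """layers: (name, param_count, flops_per_sample) tuples.

    Data parallelism all-reduces param_count gradient values per step,
    so parameter-heavy (communication-intensive) layers get model
    parallelism while compute-heavy layers get data parallelism.
    The threshold is an illustrative assumption.
    """
    return {name: ("model" if params / flops > ratio_threshold else "data")
            for name, params, flops in layers}

# Pyramid-style net: compute-heavy convolution at the bottom,
# parameter-heavy fully-connected layer at the top (rough counts).
net = [("conv1", 9_408, 118_000_000),
       ("fc6", 37_748_736, 75_497_472)]
plan = choose_parallelism(net)     # conv1 -> data, fc6 -> model
```

This split maps naturally onto pyramid-shaped networks: convolutional lower layers run data-parallel, fully-connected upper layers run model-parallel.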
SLIDE 13

Multi-card Optimization

  • Hybrid parallelism (cont.)

[Benchmark charts: M40 result, K40 result]

SLIDE 14

Multi-card Optimization

  • Late multiply
    • Customized for fully-connected layers
    • A trade-off between computation and communication

Wavg: [Nl, Nl+1], X: [M, Nl], E: [M, Nl+1], where Nl and Nl+1 are the layer sizes and M is the mini-batch size
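The shapes above make the trade-off concrete. A numeric sketch with illustrative sizes, assuming the usual fully-connected weight gradient dW = Xᵀ·E: communicating X and E and multiplying them after the transfer moves far less data than communicating the gradient itself, whenever the mini-batch M is much smaller than the layer sizes.

```python
import numpy as np

# Late-multiply volume comparison for one fully-connected layer.
# Shapes as on the slide: Wavg [Nl, Nl+1], X [M, Nl], E [M, Nl+1].
M, Nl, Nl1 = 32, 4096, 4096            # illustrative sizes
rng = np.random.default_rng(0)
X = rng.standard_normal((M, Nl))       # layer inputs
E = rng.standard_normal((M, Nl1))      # back-propagated errors

early_volume = Nl * Nl1                # send dW = X.T @ E directly
late_volume = M * (Nl + Nl1)           # send X and E, multiply late
ratio = early_volume // late_volume    # 64x less traffic here

dW = X.T @ E                           # the multiply, done "late"
```

With M = 32 and 4096-wide layers, late multiply cuts communication 64x; the cost is redoing one [Nl, M] × [M, Nl+1] matmul after the transfer, which is the computation side of the trade-off.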

SLIDE 15

Multi-card Optimization

  • Late multiply (cont.)
SLIDE 16

Multi-card Optimization

  • Heuristic-based MA (model averaging)
    • Automatic batch-size selection
    • Learning-rate auto-tuning
    • Works best with sequential models
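A minimal model-averaging sketch: workers run local SGD between synchronization points, then average their weights. The local gradients and learning rate here are placeholders; the slide's heuristics for choosing batch size and learning rate are not shown.

```python
import numpy as np

def local_sgd(w, grads, lr):
    """One worker's local SGD steps between MA synchronizations."""
    for g in grads:
        w = w - lr * g
    return w

def model_average(replicas):
    """MA synchronization point: average all workers' weight replicas."""
    return np.mean(replicas, axis=0)

# Two workers diverge on different local gradients, then re-sync.
w0 = np.zeros(3)
r1 = local_sgd(w0, [np.array([1.0, 0.0, 0.0])], lr=0.1)
r2 = local_sgd(w0, [np.array([0.0, 1.0, 0.0])], lr=0.1)
w1 = model_average([r1, r2])
```

Batch size and learning rate interact with how far replicas drift between averages, which is why the heuristics above tune them automatically.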
SLIDE 17

Multi-card Optimization

  • Heuristic-based MA (cont.)

[Charts: training time (wall-clock) and model metrics]

SLIDE 18

Inference Optimization

  • Quantization
    • Significantly reduces model size (4X)
    • Around 2X speed-up on average
  • Binarized Neural Networks
    • Binarize the model weights
    • Convert floating-point computation into bit manipulation
    • Both model size and computation speed improve significantly
    • The training process must be adapted to compensate for accuracy loss
    • Works well with CNNs, but for RNNs…
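The 4X size claim follows directly from the bit widths (32-bit floats down to 8-bit integers). Below is a sketch of symmetric post-training 8-bit quantization; the exact scheme PAI uses is not stated on the slide, and the ~2X speed-up depends on hardware integer kernels.

```python
import numpy as np

def quantize(w):
    """Symmetric per-tensor 8-bit quantization of float32 weights."""
    scale = float(np.abs(w).max()) / 127.0   # map max |w| onto int8 range
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal(1024).astype(np.float32)
q, scale = quantize(w)
# 32-bit -> 8-bit: exactly the 4X size reduction cited on the slide,
# with per-weight rounding error bounded by scale / 2.
```

Binarization pushes the same idea to 1 bit per weight (sign only), which is why its training process must compensate for the much larger rounding error.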
SLIDE 19

PAI DL Application

SLIDE 20

AliMe – Personal Assistant Bot in E-commerce

AliMe for Customers AliMe for Sellers AliMe for Enterprises

From 海青 @ 云栖大会 (the Computing Conference)

SLIDE 21

Open-Domain Conversations

  • Retrieval Model
  • Learning to rank
  • Generation Model
  • Sequence to Sequence (Seq2Seq) Model
  • Recurrent Neural Networks: LSTM, GRU (our choice)

[Diagram (Cho et al., 2014): the query is matched against a knowledge base of QA pairs Q1-A1: s1, Q2-A2: s2, …, Qn-An: sn; the top-scoring answer A1 is returned]


SLIDE 22

A Hybrid Conversation Model based on Seq2Seq

  • Overview

[Diagram: Query → Retrieval Module (IR over a QA-pair knowledge base built from chat logs and SNS data) → candidate answers → Seq2Seq-based rerank; if the top score > T, output that answer; otherwise output the Seq2Seq generation model's answer]


[AliMe Chat: Minghui Qiu et al., ACL 2017]
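The control flow of the hybrid model can be sketched in a few lines. `retrieve`, `score`, and `generate` are stand-ins for the real IR, Seq2Seq rerank, and Seq2Seq generation modules, and the threshold value is illustrative; see the cited paper for the actual system.

```python
def answer(query, retrieve, score, generate, T=0.5):
    """Hybrid flow: output the top reranked retrieved answer if its
    score clears the threshold T, otherwise generate an answer."""
    candidates = retrieve(query)
    if candidates:
        best = max(candidates, key=lambda c: score(query, c))
        if score(query, best) >= T:
            return best
    return generate(query)

# Toy stand-in modules: a tiny QA knowledge base with rerank scores.
kb = {"hi": [("hello!", 0.9)], "rare question": [("maybe?", 0.2)]}
retrieve = lambda q: [a for a, _ in kb.get(q, [])]
score = lambda q, a: {x: v for x, v in kb.get(q, [])}[a]
generate = lambda q: "<generated reply>"

confident = answer("hi", retrieve, score, generate)            # retrieved
fallback = answer("rare question", retrieve, score, generate)  # generated
```

The threshold T is what balances the retrieval model's precision against the generation model's coverage.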

SLIDE 23

PAI DL Support for AliMe

  • Both offline training and online serving are backed by PAI
  • With heuristic-based MA, the offline training task achieves a 2.8X convergence speed-up in a 4-card setting
  • With quantization, the online serving task achieves a 1.5X speed-up on commodity CPU servers


SLIDE 24

Conclusion

  • PAI DL
    • End-to-end machine learning platform
    • Supports big-data analytics
    • Optimized deep learning algorithms
    • Scheduling on CPU/GPU clouds
    • More data intelligence…
  • Pluto
    • The distributed optimization engine of PAI DL
  • PAI DL Application
    • PAI DL makes it easy to build DL methods for industrial applications

Scan the barcode to start your trial. 数据智能 触手可及 (Data intelligence, within reach)

SLIDE 25

We are hiring! :)

muzhuo.yj@alibaba-inc.com chenyan.cy@alibaba-inc.com

SLIDE 26

Reference

  • AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine, Minghui Qiu et al., ACL 2017.
  • Deep Learning with PAI: a Case Study of AliMe, Minghui Qiu et al., Deep Learning Summit 2017.
  • TensorFlow in AliMe, Jun Yang et al., Shanghai GDG, Mar. 2017.
SLIDE 27

Thanks!