
CS535 Big Data 3/25/2020 Week 8-B Sangmi Lee Pallickara http://www.cs.colostate.edu/~cs535 Spring 2020 Colorado State University, page 1

CS535 BIG DATA

PART B. GEAR SESSIONS

SESSION 2: MACHINE LEARNING FOR BIG DATA

Sangmi Lee Pallickara Computer Science, Colorado State University http://www.cs.colostate.edu/~cs535

FAQs

  • CS535 Online
  • Please read the announcements on Canvas
  • If you have any questions, please post on Piazza

CS535 Big Data | Computer Science | Colorado State University

Topics of Today's Class

  • Distributed PyTorch
  • Some common advanced optimizations
  • You will use it for your term project
  • Automatic Differentiation with Backpropagation
  • Computation Graph
  • Distributed PyTorch Application


GEAR Session 2. Machine Learning for Big Data

Lecture 4. Distributed Neural Networks-PyTorch

PyTorch: Introduction

CS535 Big Data | Computer Science | Colorado State University

This material is built based on

  • Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L. and Desmaison, A., 2019. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems (pp. 8024-8035).
  • Baydin, A.G., Pearlmutter, B.A., Radul, A.A. and Siskind, J.M., 2017. Automatic differentiation in machine learning: a survey. The Journal of Machine Learning Research, 18(1), pp. 5595-5637.
  • Writing Distributed Applications with PyTorch, https://pytorch.org/tutorials/intermediate/dist_tuto.html
  • PyTorch vs TensorFlow — spotting the difference, https://towardsdatascience.com/pytorch-vs-tensorflow-spotting-the-difference-25c75777377b

CS535 Big Data | Computer Science | Colorado State University

Observations

  • Array-based programming
  • Multidimensional arrays (A.K.A. tensors) became a critical mathematical data type
  • Automatic differentiation enabled fully automated computation of derivatives
  • Open-source Python ecosystem for numerical analysis
  • NumPy, SciPy, and Pandas
  • Availability and commoditization of general-purpose massively parallel hardware
  • GPUs
  • Specialized libraries, e.g., cuDNN
  • Caffe, Torch7, and TensorFlow take advantage of these hardware accelerators


Programming Environment

  • Coping with increased computational complexity
  • Easy implementation of new neural network architectures
  • Layers
  • Expressed as Python classes
  • Models
  • Classes that compose layers


Building Generative Adversarial Networks

  • Generator
  • Discriminator
  • Loss function for the discriminator
  • Loss function for the generator
  • Requires setting up two separate models at the same time


Training Networks

  • Gradient-based optimization is critical to deep learning
  • Automatically compute gradients of models specified by users
  • Challenge
  • Python is a dynamic programming language that allows changing most behaviors at runtime
  • PyTorch uses the operator-overloading approach
  • Builds up a representation of the computed function every time it is executed


GEAR Session 2. Machine Learning for Big Data

Lecture 4. Distributed Neural Networks-PyTorch

PyTorch: Automatic Differentiation


What is Automatic Differentiation (AD)?

  • A set of techniques to numerically evaluate the derivative of a function specified by a computer program
  • Automatic differentiation lets you compute exact derivatives at a constant factor of the cost of evaluating the original function


Is AD the same as Symbolic Differentiation?

  • No
  • Symbolic differentiation breaks a complex expression apart into a bunch of simpler expressions by applying rules
  • Examples
  • Sum rule: d/dx (f(x) + g(x)) = d/dx f(x) + d/dx g(x)
  • Constant rule: d/dx c = 0
  • Derivatives of powers rule: d/dx x^n = n·x^(n−1)
  • Disadvantages
  • For complicated functions, the resulting expression can be extremely large ("expression swell")
  • Wasteful to keep intermediate symbolic expressions around if we only need a numeric value of the gradient in the end
  • Prone to error


Is AD the same as Numeric Differentiation?

  • No
  • Numeric differentiation is an algorithm for estimating the derivative of a mathematical function or function subroutine
  • Example: a simple approximation of the first derivative
  • Two-point estimation
  • Slope of a nearby secant line through the points (x, f(x)) and (x+h, f(x+h)) for a small number h
  • f′(x) ≈ (f(x+h) − f(x)) / h
  • where we assume that h > 0

CS535 Big Data | Computer Science | Colorado State University

Numeric Differentiation

  • Pros
  • A powerful tool to check the correctness of an implementation; h = 1e-6 is a common choice
  • Cons
  • Subject to rounding error, and slow to compute
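The two-point estimate above fits in a few lines. A minimal sketch of a numeric gradient check, using h = 1e-6 as suggested (the test functions are my own choices, not from the slides):

```python
def numeric_derivative(f, x, h=1e-6):
    """Two-point estimate: slope of the secant line through (x, f(x)) and (x+h, f(x+h))."""
    return (f(x + h) - f(x)) / h

# Check an implementation's analytic gradient against the numeric estimate.
f = lambda x: x**2 + 1        # analytic derivative: 2x
analytic = 2 * 3.0
approx = numeric_derivative(f, 3.0)
# approx agrees with 6.0 only up to rounding/truncation error
```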


AD with a Simple Example

  • Dual numbers
  • Numbers of the form a + bε, where ε² = 0
  • Suppose that there are two dual numbers, a + bε and c + dε
  • (a + bε) + (c + dε) = (a + c) + (b + d)ε
  • (a + bε) × (c + dε) = ac + (ad + bc)ε + bdε² = ac + (ad + bc)ε
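The arithmetic rules above translate directly into code. A minimal sketch (the Dual class is my own illustration, not from the slides):

```python
class Dual:
    """Dual number a + b*eps, where eps**2 = 0."""
    def __init__(self, a, b):
        self.a = a   # real part (the value)
        self.b = b   # eps coefficient (carries the derivative)

    def __add__(self, other):
        # (a + b*eps) + (c + d*eps) = (a + c) + (b + d)*eps
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 = 0
        return Dual(self.a * other.a,
                    self.a * other.b + self.b * other.a)

# Seeding x with eps-coefficient 1 makes f(x + eps) = f(x) + f'(x)*eps:
x = Dual(3.0, 1.0)
y = x * x + Dual(1.0, 0.0)   # f(x) = x^2 + 1
# y.a = 10.0 (the value), y.b = 6.0 (the derivative 2x at x = 3)
```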


Taylor's series with a dual number

  • Plain Taylor's series
  • f(x) = Σ_{n=0}^∞ f⁽ⁿ⁾(a)/n! · (x − a)ⁿ = f(a) + f′(a)/1! (x − a) + f″(a)/2! (x − a)² + ⋯
  • Evaluate f at a + ε, for a real number a
  • f(a + ε) = f(a) + f′(a)/1! ε + f″(a)/2! ε² + ⋯
  • f(a + ε) = f(a) + ε f′(a), since ε² = 0
  • Example
  • f(x) = x² + 1
  • f(x + ε) = (x + ε)² + 1 = x² + 2xε + ε² + 1
  • f(x + ε) = x² + 1 + 2xε, since ε² = 0
  • Therefore the derivative of x² + 1 is 2x

GEAR Session 2. Machine Learning for Big Data

Lecture 4. Distributed Neural Networks-PyTorch

PyTorch: Automatic Differentiation

Backpropagation


Training Neural Networks

  • A forward pass to compute the value of the loss function
  • A backward pass to compute the gradients of the learnable parameters


Backpropagation

  • An operator f computes z = f(x, y) from its inputs x and y
  • Given the upstream gradient ∂L/∂z, the gradients of the inputs follow from the chain rule:
  • ∂L/∂x = ∂L/∂z · ∂z/∂x
  • ∂L/∂y = ∂L/∂z · ∂z/∂y
  • Computing gradients becomes a local computation

Simple Backpropagation Example

  • f = 1 / (1 + exp(−(w0 + w1x1 + w2x2)))
  • Computation graph: multiply gates form w1·x1 and w2·x2, add gates accumulate the sum with w0, followed by ×(−1), exp, +1, and 1/x gates
  • Inputs: w0, w1, x1, w2, x2

Simple Backpropagation Example

  • f = 1 / (1 + exp(−(w0 + w1x1 + w2x2)))
  • Forward pass, with the input values annotated on the slide: the weighted sum w0 + w1x1 + w2x2 evaluates to 1.0
  • ×(−1) gate: −1.0; exp gate: 0.37; +1 gate: 1.37; 1/x gate: 0.73 (the output)

Simple Backpropagation Example

  • f = 1 / (1 + exp(−(w0 + w1x1 + w2x2)))
  • The backward pass starts at the output with gradient 1
  • 1/x gate: f(x) = 1/x → df/dx = −1/x²
  • ∂L/∂x = ∂L/∂f · ∂f/∂x = 1 · (−1/1.37²) ≈ −0.53

Simple Backpropagation Example

  • f = 1 / (1 + exp(−(w0 + w1x1 + w2x2)))
  • +1 gate: f(x) = x + 1 → df/dx = 1
  • The gradient passes through unchanged: ≈ −0.53


Simple Backpropagation Example

  • f = 1 / (1 + exp(−(w0 + w1x1 + w2x2)))
  • exp gate: f(x) = eˣ → df/dx = eˣ
  • ∂L/∂x = ∂L/∂f · ∂f/∂x = (−0.53) · e^(−1.0) ≈ −0.20

Simple Backpropagation Example

  • f = 1 / (1 + exp(−(w0 + w1x1 + w2x2)))
  • ×(−1) gate flips the sign: the gradient becomes ≈ 0.20
  • Add gates distribute the gradient unchanged: each incoming branch receives 0.20
  • Multiply gate: f(x, y) = xy → ∂f/∂x = y, ∂f/∂y = x
  • Each multiplicand's gradient is 0.20 times the value of the other input (e.g., 0.20 × 3.0 = 0.6)
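The whole walkthrough fits in a few lines of plain Python. This is a sketch: the input values below are my own choice, picked so that the weighted sum equals 1.0 as on the slides.

```python
import math

# Forward pass through f = 1/(1 + exp(-(w0 + w1*x1 + w2*x2))).
w0, w1, x1, w2, x2 = -6.0, 2.0, 2.0, 1.0, 3.0
s = w0 + w1 * x1 + w2 * x2    # weighted sum = 1.0
n = -s                        # *(-1) gate: -1.0
e = math.exp(n)               # exp gate: ~0.37
p = e + 1.0                   # +1 gate: ~1.37
f = 1.0 / p                   # 1/x gate: ~0.73 (the output)

# Backward pass: start with df/df = 1 and apply each local rule.
g = 1.0
g *= -1.0 / p**2              # 1/x gate: d(1/x)/dx = -1/x^2 -> ~-0.53
g *= 1.0                      # +1 gate: gradient unchanged
g *= e                        # exp gate: d(e^x)/dx = e^x    -> ~-0.20
g *= -1.0                     # *(-1) gate flips the sign    -> ~0.20
dw2 = g * x2                  # multiply gate: ~0.6
dx2 = g * w2                  # ~0.20
# g equals the sigmoid derivative f*(1 - f): a handy cross-check.
```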

GEAR Session 2. Machine Learning for Big Data

Lecture 4. Distributed Neural Networks-PyTorch

PyTorch: Automatic Differentiation

Computation Graph


AD with Computation Graph

  • Similar to the graphs that we have used
  • However, the nodes in a computation graph are "operators"
  • Mathematical operators or user-defined variables (for specific cases)
  • Leaf nodes represent the leaf variables

Automatic Differentiation with Computation Graph

  • Create a computation graph for gradient computation

y = 1 / (1 + e^(−(w0 + w1x1 + w2x2)))

  • Forward graph: multiply gates (w1·x1, w2·x2), add gates (sum with w0), then ×(−1), exp, +1, and 1/x gates

Automatic Differentiation with Computation Graph

  • Create a computation graph for gradient computation

y = 1 / (1 + e^(−(w0 + w1x1 + w2x2)))

  • 1/x node: f(x) = 1/x → df/dx = −1/x²
  • The corresponding backward edge is labeled −1/x²


Automatic Differentiation with Computation Graph

  • Create a computation graph for gradient computation

y = 1 / (1 + e^(−(w0 + w1x1 + w2x2)))

  • +1 node: f(x) = x + 1 → df/dx = 1
  • The corresponding backward edge is labeled ×1

Automatic Differentiation with Computation Graph

  • Create a computation graph for gradient computation

y = 1 / (1 + e^(−(w0 + w1x1 + w2x2)))

  • exp node: f(x) = eˣ → df/dx = eˣ
  • The corresponding backward edge is labeled eˣ

Automatic Differentiation with Computation Graph

  • Create a computation graph for gradient computation

y = 1 / (1 + e^(−(w0 + w1x1 + w2x2)))

  • ×(−1) node: the backward edge is labeled ×(−1)
  • Multiply node: f(w, x) = wx → ∂f/∂w = x and ∂f/∂x = w
  • Its backward edges are labeled with these local derivatives

Automatic Differentiation with Computation Graph

  • Create a computation graph for gradient computation

y = 1 / (1 + e^(−(w0 + w1x1 + w2x2)))

  • The second multiply node (w2·x2) adds its backward edges in the same way: f(w, x) = wx → ∂f/∂w = x, ∂f/∂x = w
  • The backward graph now carries one local-derivative edge per input of every node

Example

  • What is ∂y/∂w2?
  • Step 1: Trace down all possible paths from the output node to w2
  • There is only one such path
  • Step 2: Multiply the local-derivative edge labels along this path (in this case)
  • With a = w2·x2: ∂y/∂w2 = ∂y/∂a · ∂a/∂w2 = ∂y/∂a · x2
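The path-product can be cross-checked with autograd. A sketch with arbitrary input values of my choosing: for the sigmoid output y, the product of edge labels along the single path to w2 collapses to y(1 − y)·x2.

```python
import torch

w0 = torch.tensor(-6.0, requires_grad=True)
w1 = torch.tensor(2.0, requires_grad=True)
x1 = torch.tensor(2.0)
w2 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(3.0)

y = 1.0 / (1.0 + torch.exp(-(w0 + w1 * x1 + w2 * x2)))
y.backward()                           # fills in .grad on the leaf tensors

# dy/dw2 = dy/da * da/dw2 = y*(1 - y) * x2, with a = w2*x2
expected = (y * (1.0 - y) * x2).detach()
```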

GEAR Session 2. Machine Learning for Big Data

Lecture 4. Distributed Neural Networks-PyTorch

PyTorch: Automatic Differentiation

PyTorch AutoGrad


PyTorch AutoGrad [1/2]

  • Implementation of the computational graph in PyTorch
  • Tensor
  • Data structure similar to NumPy arrays (ndarray)
  • Supports parallelism with GPUs

In [1]: import torch
In [2]: tsr = torch.Tensor(3,5)
In [3]: tsr
Out[3]: tensor([[ 0.0000e+00,  0.0000e+00,  8.4452e-29, -1.0842e-19,  1.2413e-35],
        [ 1.4013e-45,  1.2416e-35,  1.4013e-45,  2.3331e-35,  1.4013e-45],
        [ 1.0108e-36,  1.4013e-45,  8.3641e-37,  1.4013e-45,  1.0040e-36]])

PyTorch AutoGrad [2/2]

  • The requires_grad attribute of a Tensor should be set to True
  • requires_grad is propagated
  • If any of the tensors that the current tensor is computed from has requires_grad set to true, the current tensor will also be set to true

>> t1 = torch.randn((3,3), requires_grad = True)
>> t2 = torch.FloatTensor(3,3) # No way to specify requires_grad while initiating
>> t2.requires_grad = True
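A quick check of the propagation rule (a sketch; the tensor names are mine):

```python
import torch

t1 = torch.randn((3, 3), requires_grad=True)
t2 = torch.randn((3, 3))   # requires_grad defaults to False
t3 = t1 * t2               # one operand requires grad -> the result does too
t4 = t2 + t2               # no operand requires grad
```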

Simple example with PyTorch

  • Consider a very simple network with 5 neurons
  • b = w1 × a
  • c = w2 × a
  • d = w3 × b + w4 × c
  • L = 10 − d

Simple example with PyTorch

  • Gradients for each of the learnable parameters
  • ∂L/∂w3 = ∂L/∂d × ∂d/∂w3
  • ∂L/∂w4 = ∂L/∂d × ∂d/∂w4
  • ∂L/∂w1 = ∂L/∂d × ∂d/∂b × ∂b/∂w1
  • ∂L/∂w2 = ∂L/∂d × ∂d/∂c × ∂c/∂w2
  • All of these gradients are computed by applying the chain rule
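With scalar values, the chain-rule products can be written out and spot-checked against a numeric difference. A sketch; the values are arbitrary choices of mine:

```python
a, w1, w2, w3, w4 = 2.0, 0.5, -1.0, 3.0, 0.25

def loss(w1, w2, w3, w4):
    b = w1 * a
    c = w2 * a
    d = w3 * b + w4 * c
    return 10 - d              # L

# Chain rule: dL/dd = -1, dd/db = w3, db/dw1 = a, so:
dL_dw1 = -1.0 * w3 * a         # dL/dw1 = dL/dd * dd/db * db/dw1
dL_dw3 = -1.0 * (w1 * a)       # dL/dw3 = dL/dd * dd/dw3 = -b

# Numeric spot check on w1:
h = 1e-6
numeric = (loss(w1 + h, w2, w3, w4) - loss(w1, w2, w3, w4)) / h
```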

Implementing with PyTorch

  • grad_fn attribute
  • The mathematical operator that creates the variable
  • If requires_grad is false, grad_fn is None

import torch
a = torch.randn((3,3), requires_grad = True)
w1 = torch.randn((3,3), requires_grad = True)
w2 = torch.randn((3,3), requires_grad = True)
w3 = torch.randn((3,3), requires_grad = True)
w4 = torch.randn((3,3), requires_grad = True)
b = w1*a
c = w2*a
d = w3*b + w4*c
L = 10 - d
print("The grad fn for a is", a.grad_fn)
print("The grad fn for d is", d.grad_fn)

Implementing with PyTorch: Results

Running the code above prints:

The grad fn for a is None
The grad fn for d is <AddBackward0 object at 0x1033afe48>


Implementing with PyTorch: Functions [1/4]

  • All mathematical operations in PyTorch are implemented by the torch.autograd.Function class
  • forward function
  • Computes the output using its inputs
  • backward function
  • Takes the incoming gradient coming from the part of the network in front of it
  • Generates the gradient to be backpropagated from a function f:
  • (Gradient backpropagated to f from the previous layers) × (Local gradient of the output of f with respect to its inputs)
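A minimal custom Function shows the forward/backward pair in action (MulConstant is my own illustrative example, not from the slides):

```python
import torch

class MulConstant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, tensor, constant):
        ctx.constant = constant          # stash what backward will need
        return tensor * constant

    @staticmethod
    def backward(ctx, grad_output):
        # incoming gradient * local gradient; the constant gets no gradient
        return grad_output * ctx.constant, None

x = torch.ones(3, requires_grad=True)
y = MulConstant.apply(x, 5.0).sum()
y.backward()                             # x.grad becomes [5., 5., 5.]
```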

Implementing with PyTorch: Functions [2/4]

  • The Tensor here is d
  • d's grad_fn is <ThAddBackward> (an addition operation)
  • forward function of d's grad_fn
  • Inputs: w3b and w4c
  • Operation: addition
  • The resulting value is stored in d
  • backward function of d's grad_fn
  • Inputs: the incoming gradient from the previous layers (L)
  • Operation: compute the local gradients and send them to the inputs by invoking the backward method of the grad_fn of the inputs

d = f(w3b, w4c)

b = w1 × a, c = w2 × a, d = w3 × b + w4 × c, L = 10 − d

Implementing with PyTorch: Functions [3/4]

def backward(incoming_gradients):
    self.Tensor.grad = incoming_gradients
    for inp in self.inputs:
        if inp.grad_fn is not None:
            new_incoming_gradients = incoming_gradients * local_grad(self.Tensor, inp)
            inp.grad_fn.backward(new_incoming_gradients)
        else:
            pass

Implementing with PyTorch: Functions [4/4]

  • The backward function is called recursively
  • Recursion stops at leaf nodes (grad_fn is None)
  • backward can be called only on a scalar-valued Tensor
  • Not on a vector-valued Tensor
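The scalar-only rule is easy to observe (a sketch):

```python
import torch

v = torch.randn(3, requires_grad=True)

try:
    (v * 2).backward()          # vector-valued output: raises
    scalar_only = False
except RuntimeError:
    scalar_only = True

(v * 2).sum().backward()        # reducing to a scalar works
```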

PyTorch's graphs vs. TensorFlow graphs [1/3]

  • PyTorch generates a dynamic computation graph
  • The graph is generated on the fly
  • Until the forward function of a Variable is called, there is no node for the Tensor in the graph

a = torch.randn((3,3), requires_grad = True) # No graph yet, as a is a leaf
w1 = torch.randn((3,3), requires_grad = True) # Same logic as above
b = w1*a # Graph with node `mulBackward` is created

PyTorch's graphs vs. TensorFlow graphs [2/3]

  • In PyTorch
  • When the forward function is invoked
  • Buffers for the non-leaf nodes are allocated for the graph
  • When the backward function is called
  • These buffers (for non-leaf variables) are freed
  • Once the gradient is computed, the graph is destroyed
  • The next time forward runs on the same set of tensors
  • The leaf node buffers from the previous run are shared
  • The non-leaf node buffers are created again
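The freed buffers are observable: a second backward over the same graph fails unless retain_graph=True is passed (a sketch):

```python
import torch

w1 = torch.randn((3, 3), requires_grad=True)
a = torch.randn((3, 3))
L = (w1 * a).sum()
L.backward()                    # non-leaf buffers are freed here

try:
    L.backward()                # same graph again: raises
    graph_freed = False
except RuntimeError:
    graph_freed = True
```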


PyTorch's graphs vs. TensorFlow graphs [3/3]

  • TensorFlow uses a static computation graph
  • The graph is declared before running the program
  • Then the graph is "run" by feeding inputs
  • PyTorch's dynamic graph allows changing the network architecture during runtime
  • A graph may be redefined during the lifetime of a program
  • Easy to debug
  • Easy to locate the source of an error

GEAR Session 2. Machine Learning for Big Data

Lecture 4. Distributed Neural Networks-PyTorch

PyTorch: Building a Distributed Application


GEAR Session 2. Machine Learning for Big Data

Lecture 4. Distributed Neural Networks-PyTorch

PyTorch: Building a Distributed Application

  • 1. Message Passing Semantics
  • 2. Communication Backends


Distributed PyTorch Application

  • torch.distributed
  • Parallelize computations across processes and clusters of machines
  • Message-passing semantics

"""run.py:"""
#!/usr/bin/env python
import os
import torch
import torch.distributed as dist
from torch.multiprocessing import Process

def run(rank, size):
    """ Distributed function to be implemented later. """
    pass

def init_process(rank, size, fn, backend='gloo'):
    """ Initialize the distributed environment. """
    os.environ['MASTER_ADDR'] = '127.0.0.1'
    os.environ['MASTER_PORT'] = '29500'
    dist.init_process_group(backend, rank=rank, world_size=size)
    fn(rank, size)

if __name__ == "__main__":
    size = 2
    processes = []
    for rank in range(size):
        p = Process(target=init_process, args=(rank, size, run))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()

Point-to-Point Communication

  • A transfer of data from one process to another
  • send and recv functions
  • Or their immediate counterparts, isend and irecv
  • send/recv are blocking
  • both processes stop until the communication is completed
  • isend/irecv are non-blocking
  • The script continues its execution, and the methods return a Work object upon which we can choose to wait()
  • Do not modify the sent tensor nor access the received tensor before req.wait() has completed
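Plugged into the run.py template, blocking send/recv look like this: rank 0 sends a tensor to rank 1 (a self-contained sketch; the port number is my own choice).

```python
import os
import torch
import torch.distributed as dist
from torch.multiprocessing import Process

def run(rank, size):
    tensor = torch.zeros(1)
    if rank == 0:
        tensor += 1
        dist.send(tensor=tensor, dst=1)   # blocks until rank 1 has received
    else:
        dist.recv(tensor=tensor, src=0)   # blocks until the data arrives
        assert tensor.item() == 1.0       # the received value
    dist.destroy_process_group()

def init_process(rank, size, fn, backend='gloo'):
    os.environ['MASTER_ADDR'] = '127.0.0.1'
    os.environ['MASTER_PORT'] = '29501'
    dist.init_process_group(backend, rank=rank, world_size=size)
    fn(rank, size)

processes = []
for rank in range(2):
    p = Process(target=init_process, args=(rank, 2, run))
    p.start()
    processes.append(p)
for p in processes:
    p.join()
```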


Collective Communication

  • Communication patterns across all processes in a group
  • A group is a subset of all current processes
  • Creating a new group
  • Pass a list of ranks to dist.new_group(group)
  • By default, collectives are executed on all the processes (A.K.A. “world”)
  • Example
  • Calculate the sum of all tensors
  • dist.all_reduce(tensor, op, group) collective
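The sum-of-all-tensors collective can be run across two local processes with the same harness as run.py (a sketch; the port number is my own choice):

```python
import os
import torch
import torch.distributed as dist
from torch.multiprocessing import Process

def run(rank, size):
    tensor = torch.ones(1)
    dist.all_reduce(tensor, op=dist.ReduceOp.SUM)   # defaults to the whole world
    assert tensor.item() == float(size)             # every rank now holds the sum
    dist.destroy_process_group()

def init_process(rank, size, fn, backend='gloo'):
    os.environ['MASTER_ADDR'] = '127.0.0.1'
    os.environ['MASTER_PORT'] = '29502'
    dist.init_process_group(backend, rank=rank, world_size=size)
    fn(rank, size)

processes = []
for rank in range(2):
    p = Process(target=init_process, args=(rank, 2, run))
    p.start()
    processes.append(p)
for p in processes:
    p.join()
```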


Collective Communication: Scatter

dist.scatter(tensor, src, scatter_list, group)

  • Copies the i-th tensor scatter_list[i] to the i-th process

Collective Communication: Gather

dist.gather(tensor, dst, gather_list, group)

  • Copies tensor from all processes into gather_list on the dst process

Collective Communication: Reduce

dist.reduce(tensor, dst, op, group)

  • Applies op to every tensor and stores the result in the dst process

Collective Communication: All-Reduce

dist.all_reduce(tensor, op, group)

  • Same as reduce, but the result is stored in all processes

Collective Communication: Broadcast

dist.broadcast(tensor, src, group)

  • Copies tensor from src to all other processes


Collective Communication: All-Gather

dist.all_gather(tensor_list, tensor, group)

  • Copies tensor from all processes to tensor_list, on all processes

GEAR Session 2. Machine Learning for Big Data

Lecture 4. Distributed Neural Networks-PyTorch

PyTorch: Building a Distributed Application

  • 1. Message Passing Semantics
  • 2. Communication Backend


Gloo Backend

  • A collective communications library
  • Supports all point-to-point and collective operations on CPU, and all collective operations on GPU
  • Supports both Linux (since 0.2) and macOS (since 1.3)
  • Included in the pre-compiled PyTorch binaries
  • The implementation of the collective operations for CUDA tensors is not as optimized as the ones provided by the NCCL backend

NCCL Backend

  • A stand-alone library of standard collective communication routines for GPUs
  • Implements all-reduce, all-gather, reduce, broadcast, and reduce-scatter
  • Optimized to achieve high bandwidth on platforms using PCIe, NVLink, or NVSwitch
  • Communicates over InfiniBand Verbs or TCP/IP sockets
  • Supports an arbitrary number of GPUs installed in a single node or across multiple nodes
  • Supports single- or multi-process applications (e.g., MPI applications)

MPI Backend

  • The Message Passing Interface (MPI) is a standardized API for message passing
  • Highly available and optimized for large clusters
  • Leverages CUDA IPC and GPUDirect technologies
  • To avoid memory copies through the CPU
  • PyTorch's pre-built binaries cannot include an MPI implementation
  • You must recompile PyTorch by hand against your MPI installation

Questions?
