SLIDE 1-2
Adam Paszke, Sam Gross, Soumith Chintala, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Alban Desmaison, Andreas Kopf, Edward Yang, Zach Devito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy & Team

SLIDE 3-4

What is PyTorch?

  • an ndarray library with GPU support
  • an automatic differentiation engine
  • a gradient-based optimization package
  • for deep learning and reinforcement learning
  • a NumPy alternative
  • utilities (data loading, etc.)

SLIDE 5

ndarray library

  • np.ndarray <-> torch.Tensor interoperability
  • 200+ operations, similar to NumPy's
  • very fast acceleration on NVIDIA GPUs
SLIDE 6

ndarray library

(side-by-side NumPy vs. PyTorch code comparison shown on the slide)
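The comparison on the slide is an image; a minimal sketch of the correspondence (array creation, elementwise math, reduction; the variable names are illustrative):

```python
import numpy as np
import torch

# NumPy                            # PyTorch equivalent
a_np = np.ones((2, 3))
a_t = torch.ones(2, 3)

b_np = (a_np * 2).sum()            # elementwise multiply, then reduce
b_t = (a_t * 2).sum()

print(b_np, b_t.item())            # both 12.0
```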

SLIDE 7-10

ndarray / Tensor library

(tensor-manipulation code examples shown on the slides)
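The slides show screenshots of tensor code; a small sketch of typical operations (indexing, reshaping, broadcasting, matrix multiply), with arbitrary shapes:

```python
import torch

x = torch.arange(6.).view(2, 3)         # reshape a 1-D range into 2x3
row = x[0]                              # NumPy-style indexing
y = x + torch.tensor([10., 20., 30.])   # broadcasting over rows
z = torch.mm(x, x.t())                  # (2,3) @ (3,2) -> (2,2) matrix product

print(row.shape, y.shape, z.shape)
```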

SLIDE 11-14

NumPy bridge

Zero memory copy: very efficient.
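Zero-copy means both objects share the same underlying memory, so a mutation through one side is visible through the other:

```python
import numpy as np
import torch

a = np.ones(3)
t = torch.from_numpy(a)   # wraps the same memory, no copy
a[0] = 5.0                # mutate via NumPy...
print(t[0].item())        # ...and the Tensor sees it: 5.0

b = t.numpy()             # the reverse direction is zero-copy too
```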

SLIDE 15

Seamless GPU Tensors
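Moving a Tensor between CPU and GPU is a single call; a sketch that falls back to CPU when no GPU is present:

```python
import torch

x = torch.randn(3, 3)
if torch.cuda.is_available():
    x = x.cuda()          # same API, now backed by GPU memory
y = torch.mm(x, x)        # runs on whichever device x lives on
print(y.device)
```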

SLIDE 16

automatic differentiation engine

for deep learning and reinforcement learning

SLIDE 17-23

PyTorch Autograd

The graph is built as the code runs; each operation adds a node (MM, MM, Add, Tanh):

    import torch
    from torch.autograd import Variable

    x = Variable(torch.randn(1, 10))
    prev_h = Variable(torch.randn(1, 20))
    W_h = Variable(torch.randn(20, 20))
    W_x = Variable(torch.randn(20, 10))

    i2h = torch.mm(W_x, x.t())       # MM
    h2h = torch.mm(W_h, prev_h.t())  # MM
    next_h = i2h + h2h               # Add
    next_h = next_h.tanh()           # Tanh

    next_h.backward(torch.ones(20, 1))  # seed gradient; must match next_h's (20, 1) shape
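After backward(), each leaf Variable's gradient lands in its .grad attribute (a Variable must be created with requires_grad=True to receive one); a minimal check reusing the slide's shapes:

```python
import torch
from torch.autograd import Variable

W_x = Variable(torch.randn(20, 10), requires_grad=True)
x = Variable(torch.randn(1, 10))

out = torch.mm(W_x, x.t()).tanh()   # shape (20, 1)
out.backward(torch.ones(20, 1))     # seed gradient, same shape as out

print(W_x.grad.shape)               # matches W_x: (20, 10)
```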

SLIDE 24-26

Neural Networks

(model-definition code examples shown on the slides)
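The model code on the slides is an image; a minimal sketch of the nn.Module pattern (layer sizes here are arbitrary):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 32)   # layers registered as attributes
        self.fc2 = nn.Linear(32, 2)

    def forward(self, x):              # the forward pass is plain Python
        return self.fc2(F.relu(self.fc1(x)))

net = Net()
out = net(torch.randn(4, 10))          # batch of 4 -> output (4, 2)
print(out.shape)
```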

SLIDE 27

Optimization package

SGD, Adagrad, RMSProp, LBFGS, etc.
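A sketch of the usual optimizer loop with torch.optim.SGD (the model and data here are throwaway placeholders):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # Adagrad, RMSprop, LBFGS follow the same pattern

x, y = torch.randn(8, 10), torch.randn(8, 1)
for _ in range(5):
    opt.zero_grad()                                # clear old gradients
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()                                # compute new gradients
    opt.step()                                     # update parameters
print(loss.item())
```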

SLIDE 28-29

Work items in practice

  • writing dataset loaders
  • building models
  • implementing the training loop
  • checkpointing models
  • interfacing with environments
  • dealing with GPUs
  • building optimizers
  • building baselines

Python + PyTorch: an environment to do all of this

SLIDE 30-32

Writing Data Loaders

  • every dataset is formatted slightly differently
  • each has to be preprocessed and normalized differently
  • a multithreaded data loader is needed to feed GPUs fast enough

SLIDE 33-36

Writing Data Loaders

PyTorch solution:

  • share data loaders across the community!
  • use regular Python to write Datasets: leverage existing Python code (example: ParlAI)
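A minimal sketch of a regular-Python Dataset plus the DataLoader that batches it (the data here is synthetic; num_workers > 0 would enable multiprocess loading):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class Squares(Dataset):                # any Python code can live in a Dataset
    def __len__(self):
        return 10
    def __getitem__(self, i):
        return torch.tensor(float(i)), torch.tensor(float(i * i))

loader = DataLoader(Squares(), batch_size=4, shuffle=False, num_workers=0)
xb, yb = next(iter(loader))            # first batch of (input, target) pairs
print(xb.tolist(), yb.tolist())        # [0, 1, 2, 3] and [0, 1, 4, 9]
```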

SLIDE 37-39

Writing Data Loaders

PyTorch solution:

  • Code in practice

Core Philosophy · Upcoming Features · Pain Points · Research Workflows

SLIDE 40-42

Interfacing with environments

Cars, video games, the internet: pretty much every environment provides a Python API, so you can interact with it natively and directly.

SLIDE 43-47

Debugging

  • PyTorch is a Python extension
  • Use your favorite Python debugger
  • Or use the most popular debugger of all:

print(foo)
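Because execution is eager, line-by-line Python, any print (or pdb breakpoint) sees real tensor values mid-computation; a tiny illustration:

```python
import torch

x = torch.randn(4, 3)
h = x.relu()                      # an "intermediate" of the computation
print(h.shape, h.min().item())    # inspect it directly, no session or graph needed
assert (h >= 0).all()             # you can even assert on live values
```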

SLIDE 48-49

Identifying bottlenecks

  • PyTorch is a Python extension
  • Use your favorite Python profiler, e.g. line_profiler
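line_profiler is a third-party package; as one stdlib alternative, cProfile works on PyTorch code exactly as on any other Python (the workload below is an arbitrary placeholder):

```python
import cProfile
import io
import pstats
import torch

def hot_loop():
    x = torch.randn(128, 128)
    for _ in range(20):
        x = torch.mm(x, x).tanh()   # tanh keeps values bounded across iterations
    return x

prof = cProfile.Profile()
prof.enable()
hot_loop()
prof.disable()

buf = io.StringIO()
pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())               # top 5 entries by cumulative time
```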
SLIDE 50-52

Compilation Time

  • PyTorch is written for the impatient
  • absolutely no compilation time whatsoever when writing your scripts
  • all core kernels are pre-compiled
SLIDE 53

Distributed PyTorch

  • MPI-style distributed communication
  • broadcast Tensors to other nodes
  • reduce Tensors among nodes (for example: sum gradients across all nodes)
SLIDE 54

Visualization

  • TensorBoard-PyTorch: https://github.com/lanpa/tensorboard-pytorch
  • Visdom: https://github.com/facebookresearch/visdom

SLIDE 55-57

Ecosystem

  • Use the entire Python ecosystem at your will
  • Including SciPy, scikit-learn, etc.
SLIDE 58-59

Ecosystem

  • A shared model-zoo
SLIDE 60

Ecosystem

  • Probabilistic Programming

http://pyro.ai/
github.com/probtorch/probtorch

SLIDE 61

Ecosystem

  • Gaussian Processes

https://github.com/cornellius-gp/gpytorch

SLIDE 62

Ecosystem

  • QRNN - 2x to 17x faster than LSTM

https://github.com/salesforce/pytorch-qrnn

SLIDE 63

Ecosystem

  • CycleGAN, pix2pix

https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix

SLIDE 64

Ecosystem

  • Machine Translation

https://github.com/OpenNMT/OpenNMT-py
https://github.com/facebookresearch/fairseq-py

SLIDE 65

Ecosystem

  • AllenNLP

http://allennlp.org/

SLIDE 66

Ecosystem

  • Pix2PixHD

https://github.com/NVIDIA/pix2pixHD

SLIDE 67

Ecosystem

  • Sentiment Discovery

https://github.com/NVIDIA/sentiment-discovery

SLIDE 68

Ecosystem

  • FlowNet2: Optical Flow Estimation with Deep Networks

https://github.com/NVIDIA/flownet2-pytorch

SLIDE 69

http://pytorch.org

With ❤ from