

SLIDE 1
SLIDE 2

a framework for new-generation AI Research

Soumith Chintala, Adam Paszke, Sam Gross & Team

Facebook AI Research

SLIDE 3

Paradigm shifts in AI research

SLIDE 4

Keeping up with change:

  • Today's AI
  • Active Research & Future AI
  • Tools for AI

SLIDE 5


Today's AI

DenseCap by Justin Johnson & group

https://github.com/jcjohnson/densecap

SLIDE 6


Today's AI

DeepMask by Pedro Pinheiro & group

SLIDE 7


Today's AI

Machine Translation

SLIDE 8


Today's AI

  • Text classification (sentiment analysis, etc.)
  • Text embeddings
  • Graph embeddings
  • Machine translation
  • Ads ranking

SLIDE 9

Today's AI

[Diagram: Data → Model (stack of Conv2d, BatchNorm, ReLU layers) → Objective → Train Model]

SLIDE 10

Today's AI

[Diagram: Data → Model (Conv2d, BatchNorm, ReLU) → Objective → Train Model; then Deploy & Use: New Data → trained model → Prediction]

SLIDE 11

Today's AI

[Diagram as above: train on Data, then Deploy & Use: New Data → Prediction]

Static datasets + static model structure

SLIDE 12

Today's AI

[Diagram as above: train on Data, then Deploy & Use: New Data → Prediction]

Static datasets + static model structure: offline learning

SLIDE 13


Current AI Research / Future AI

Self-driving Cars

SLIDE 14


Current AI Research / Future AI

Agents trained in many environments: cars, video games, the Internet

SLIDE 15


Current AI Research / Future AI

Dynamic neural networks:

  • self-adding new memory or layers
  • changing evaluation path based on inputs

SLIDE 16


Current AI Research / Future AI

[Diagram: Live data → model (Conv2d, BatchNorm, ReLU) → Prediction, with continued online learning]
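A minimal sketch of what one step of such online learning could look like, using this era's Variable API (the model, shapes, and learning rate here are illustrative assumptions, not from the slides):

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

# Serve a prediction AND update the model on each live sample,
# instead of freezing the model after offline training.
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 1))
optimizer = optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def on_live_sample(x, y):
    # x: 1x10 input from the stream, y: 1x1 target (assumed shapes)
    x, y = Variable(x), Variable(y)
    pred = model(x)            # serve the prediction
    loss = loss_fn(pred, y)    # then keep learning from the same sample
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return pred.data

print(on_live_sample(torch.randn(1, 10), torch.randn(1, 1)))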

SLIDE 17


Current AI Research / Future AI

Data-dependent change in model structure

[Diagram: Sample-1 routed through one particular stack of Conv2d/BatchNorm/ReLU blocks → Prediction]
SLIDE 18


Current AI Research / Future AI

Data-dependent change in model structure

[Diagram: Sample-2 routed through a different stack of Conv2d/BatchNorm/ReLU blocks → Prediction]
SLIDE 19


Current AI Research / Future AI

Change in model capacity at runtime

[Diagram: a sample flows through a small stack of Conv2d/BatchNorm/ReLU blocks → Prediction]
SLIDE 20


Current AI Research / Future AI

Change in model capacity at runtime

[Diagram: the same network, now with many more Conv2d/BatchNorm/ReLU blocks → Prediction]
SLIDE 21


The need for a dynamic framework

  • Interop with many dynamic environments
    • Connecting to car sensors should be as easy as training on a dataset
    • Connect to environments such as OpenAI Universe
  • Dynamic neural networks
    • Change the behavior and structure of a neural network at runtime
  • Minimal abstractions
    • More complex AI systems are harder to debug without a simple API
SLIDE 22


Tools for AI research and deployment

Many machine learning tools and deep learning frameworks

SLIDE 23


Tools for AI research and deployment

Static graph frameworks: define-and-run

Dynamic graph frameworks: define-by-run

SLIDE 24


Dynamic graph frameworks

  • The model is constructed on the fly at runtime
  • Change the behavior and structure of the model
  • Imperative style of programming (sketch below)
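For instance, a model whose depth depends on the input. A minimal sketch (hypothetical module, not from the slides; uses this era's Variable API, where .data[0] reads a one-element result):

import torch
import torch.nn as nn
from torch.autograd import Variable

# Because the graph is built imperatively as the code runs, plain Python
# control flow can change the network's structure per input.
class DynamicNet(nn.Module):
    def __init__(self):
        super(DynamicNet, self).__init__()
        self.layer = nn.Linear(20, 20)

    def forward(self, x):
        h = torch.tanh(self.layer(x))
        # Data-dependent depth: apply up to 3 extra layers, stopping
        # early once the activation norm falls below a threshold.
        for _ in range(3):
            if h.norm().data[0] < 1.0:
                break
            h = torch.tanh(self.layer(h))
        return h

out = DynamicNet()(Variable(torch.randn(1, 20)))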
SLIDE 25
SLIDE 26

Overview

  • ndarray library with GPU support
  • automatic differentiation engine
  • gradient-based optimization package

For deep learning, reinforcement learning, and as a NumPy alternative.

SLIDE 27

ndarray library

with GPU support

SLIDE 28

ndarray library

  • np.ndarray <-> torch.Tensor
  • 200+ operations, similar to numpy
  • very fast acceleration on NVIDIA GPUs
SLIDE 29

ndarray library

[Side-by-side code: NumPy on the left, the equivalent PyTorch on the right]
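The slide's code isn't preserved in this transcript; a representative pair (my reconstruction, not the original) would be:

import numpy as np
import torch

# NumPy
a = np.random.rand(5, 3)
b = np.random.rand(3, 4)
c = a.dot(b)
print(c.shape)       # (5, 4)

# PyTorch: a near one-to-one translation of the same program
x = torch.rand(5, 3)
y = torch.rand(3, 4)
z = torch.mm(x, y)
print(z.size())      # torch.Size([5, 4])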

SLIDE 30

ndarray / Tensor library

SLIDE 31

ndarray / Tensor library

SLIDE 32

ndarray / Tensor library

SLIDE 33

ndarray / Tensor library

SLIDE 34

NumPy bridge

SLIDE 35

NumPy bridge

Zero memory-copy: very efficient
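A small sketch of the bridge (torch.from_numpy and Tensor.numpy are the standard calls; the printed values are illustrative):

import numpy as np
import torch

# The converted tensor and the source array share the same memory,
# so conversion copies nothing in either direction.
a = np.ones(5)
t = torch.from_numpy(a)   # numpy -> torch, zero-copy
a[0] = 100
print(t[0])               # 100.0: the tensor sees the change

b = torch.ones(5)
n = b.numpy()             # torch -> numpy, zero-copy
b.add_(1)                 # in-place add on the tensor
print(n[0])               # 2.0: the array sees the change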

SLIDE 36

NumPy bridge

SLIDE 37

NumPy bridge

SLIDE 38

Seamless GPU Tensors

SLIDE 39

Seamless GPU Tensors

A full suite of high-performance Tensor operations on the GPU
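A sketch of what "seamless" means in practice (assuming a CUDA device is available; .cuda() and .cpu() are the standard calls):

import torch

x = torch.rand(1024, 1024)
y = torch.rand(1024, 1024)

if torch.cuda.is_available():
    x, y = x.cuda(), y.cuda()   # move the data to GPU 0

z = torch.mm(x, y)              # same call, now a GPU matrix multiply
z = z.cpu()                     # bring the result back when needed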

SLIDE 40

GPUs are fast

  • Buy a $700 NVIDIA GTX 1080 Ti
  • 100x faster matrix multiply
  • 10x faster operations on matrices in general (timing sketch below)
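A rough way to check the claim yourself (numbers vary with hardware and sizes; this is an illustration, not a benchmark):

import time
import torch

def avg_mm_time(a, b, on_gpu=False):
    start = time.time()
    for _ in range(10):
        torch.mm(a, b)
    if on_gpu:
        torch.cuda.synchronize()   # wait for queued GPU work to finish
    return (time.time() - start) / 10

a = torch.rand(4096, 4096)
b = torch.rand(4096, 4096)
print('CPU: %.4fs' % avg_mm_time(a, b))

if torch.cuda.is_available():
    a, b = a.cuda(), b.cuda()
    torch.mm(a, b)                 # warm-up
    print('GPU: %.4fs' % avg_mm_time(a, b, on_gpu=True))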
SLIDE 41

automatic differentiation engine

for deep learning and reinforcement learning

SLIDE 42

Deep Learning Frameworks

  • Provide gradient computation
  • Gradient of one variable w.r.t. any variable in the graph

[Graph: (MM, MM) → Add → Tanh]

SLIDE 43

Deep Learning Frameworks

  • Provide gradient computation
  • Gradient of one variable w.r.t. any variable in the graph

[Graph: (MM, MM) → Add → Tanh, annotated with the gradient d(i2h)/d(W_x)]

SLIDE 44

Deep Learning Frameworks

  • Provide gradient computation
  • Gradient of one variable w.r.t. any variable in graph
  • Provide integration with high

performance DL libraries like CuDNN

Add MM MM Tanh

d(h2h)/d(W_h)

SLIDE 45

PyTorch Autograd

import torch
from torch.autograd import Variable

SLIDE 46

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

PyTorch Autograd

SLIDE 47

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())

[Graph: MM, MM]

PyTorch Autograd

SLIDE 48

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h

[Graph: MM, MM]

PyTorch Autograd

SLIDE 49

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h

[Graph: (MM, MM) → Add]

PyTorch Autograd

SLIDE 50

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h
next_h = next_h.tanh()

[Graph: (MM, MM) → Add → Tanh]

PyTorch Autograd

SLIDE 51

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h
next_h = next_h.tanh()

next_h.backward(torch.ones(20, 1))  # gradient seed shaped like next_h (20x1)

[Graph: (MM, MM) → Add → Tanh]

PyTorch Autograd
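One detail the slide code glosses over: Variables defaulted to requires_grad=False in this era, so to actually read gradients after backward() the weights need requires_grad=True. A sketch of the same graph with the gradients retrieved:

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20), requires_grad=True)
W_x = Variable(torch.randn(20, 10), requires_grad=True)

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = (i2h + h2h).tanh()

next_h.backward(torch.ones(20, 1))   # seed matches next_h's 20x1 shape
print(W_h.grad)                      # gradient w.r.t. W_h
print(W_x.grad)                      # gradient w.r.t. W_x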

SLIDE 52

side by side: TensorFlow and PyTorch
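The compared code isn't preserved in this transcript. A hedged reconstruction of the usual contrast, with TensorFlow 1.x-era APIs: TensorFlow separates graph construction from execution, while PyTorch executes as it constructs.

import numpy as np

# TensorFlow (define-and-run): declare a static graph, then run it in a
# session, feeding data through placeholders.
import tensorflow as tf
a = tf.placeholder(tf.float32, shape=(1, 10))
w = tf.Variable(tf.random_normal((10, 20)))
h = tf.tanh(tf.matmul(a, w))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(h, feed_dict={a: np.random.randn(1, 10)})

# PyTorch (define-by-run): the graph is whatever Python actually executes.
import torch
from torch.autograd import Variable
a2 = Variable(torch.randn(1, 10))
w2 = Variable(torch.randn(10, 20), requires_grad=True)
out2 = torch.tanh(torch.mm(a2, w2))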

SLIDE 53

High performance

  • Integration of:
    • cuDNN v6
    • NCCL
    • Intel MKL
  • 200+ operations, similar to NumPy
  • very fast acceleration on NVIDIA GPUs

Upcoming feature: Distributed PyTorch

SLIDE 54

Planned Feature: JIT Compilation

SLIDE 55

Compilation benefits

  • Out-of-order execution
  • Kernel fusion
  • Automatic work placement

[Diagram: ops reordered (1 2 3 → 3 1 2); Conv2d/BatchNorm/ReLU fused into a single kernel; work placed across Node 0 / Node 1 CPUs and GPUs]

SLIDE 56

JIT Compilation

  • Possible in define-by-run frameworks
  • The key idea is deferred or lazy evaluation:

    y = x + 2
    z = y * y
    # nothing is executed yet, but the graph is being constructed
    print(z)  # now the entire graph is executed: z = (x+2) * (x+2)

  • We can do just-in-time compilation on the graph before execution (toy sketch below)
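A toy sketch of the deferred-evaluation idea (plain Python, not PyTorch code): expressions build graph nodes, and nothing runs until a value is needed, which is exactly the window a JIT can use.

class Lazy(object):
    def __init__(self, op, *args):
        self.op, self.args = op, args

    def __add__(self, other):
        return Lazy('add', self, other)

    def __mul__(self, other):
        return Lazy('mul', self, other)

    def value(self):
        # A JIT would inspect and optimize the graph here, before running it.
        args = [a.value() if isinstance(a, Lazy) else a for a in self.args]
        if self.op == 'const':
            return args[0]
        return args[0] + args[1] if self.op == 'add' else args[0] * args[1]

x = Lazy('const', 3)
y = x + 2          # builds add(x, 2); nothing executed
z = y * y          # builds mul(add(x, 2), add(x, 2)); still nothing executed
print(z.value())   # 25: the whole graph runs only now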
SLIDE 57

Lazy Evaluation

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h
next_h = next_h.tanh()

next_h.backward(torch.ones(20, 1))  # gradient seed shaped like next_h (20x1)

[Graph: (MM, MM) → Add → Tanh]

SLIDE 58

Lazy Evaluation

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h
next_h = next_h.tanh()

[Graph: (MM, MM) → Add → Tanh]

Graph built but not actually executed

SLIDE 59

Lazy Evaluation

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h
next_h = next_h.tanh()
print(next_h)

[Graph: (MM, MM) → Add → Tanh]

Data accessed. Execute graph.

SLIDE 60

Lazy Evaluation

  • There is a little bit of time between building and executing the graph
  • Use it to compile the graph just in time

[Graph: (MM, MM) → Add → Tanh]

SLIDE 61

JIT Compilation

  • Fuse and optimize operations

[Graph: (MM, MM) → Add → Tanh, with an example of operations fused into a single kernel]

SLIDE 62

JIT Compilation

  • Cache subgraphs

[Graph: (MM, MM) → Add → Tanh]

"I've seen this part of the graph before; let me pull up the compiled version from the cache."

SLIDE 63

JIT Compilation

  • Possible in dynamic frameworks
  • The key idea is deferred or lazy evaluation:

    y = x + 2
    z = y * y
    # nothing is executed yet, but the graph is being constructed
    print(z)  # now the entire graph is executed: z = (x+2) * (x+2)

  • We can do just-in-time compilation on the graph before execution
  • We can cache repeating patterns in subsets of the graph to avoid recompilation (sketch below)
  • The compiler is very different from an ahead-of-time compiler:
    • fast compilation
    • compiles traces rather than the full graph
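A toy sketch of trace caching (invented helper names; real systems key the cache on op types, shapes, and more):

compiled_cache = {}

def compile_trace(trace):
    # Stand-in for real codegen (fusion, scheduling, ...).
    return lambda: 'ran %d fused ops' % len(trace)

def run_trace(trace):
    # trace: a tuple of op names recorded while the model ran,
    # e.g. ('mm', 'mm', 'add', 'tanh')
    if trace not in compiled_cache:
        compiled_cache[trace] = compile_trace(trace)   # slow path, once
    return compiled_cache[trace]                       # cached fast path

fn = run_trace(('mm', 'mm', 'add', 'tanh'))
print(fn())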
SLIDE 64

Summary

SLIDE 65
  • Fast ndarray library with GPU support
  • Build the latest neural networks and do gradient-based learning using the autograd and neural network packages
  • Large community of people; many companies using and contributing

SLIDE 66

http://pytorch.org

  • Released Jan 18th
  • 58,000+ downloads
  • 250+ community repos
  • 6,100+ user posts
  • 524k+ forum views

With ❤ from You?
