PyTorch: a framework for new-generation AI research
Soumith Chintala, Adam Paszke, Sam Gross & Team
Facebook AI Research
Paradigm shifts in AI research
- Today's AI
- Active Research & Future AI
- Tools for AI: keeping up with change
Today's AI
- DenseCap by Justin Johnson & group (https://github.com/jcjohnson/densecap)
- DeepMask by Pedro Pinheiro & group
- Machine Translation
- Text Classification (sentiment analysis etc.), Text Embeddings, Graph Embeddings, Machine Translation, Ads ranking
[Diagram: Data + Objective → Model (Conv2d → ReLU → BatchNorm) → Train Model; Deploy & Use: New Data → Model → Prediction]
- Static datasets + static model structure
- Offline learning
Current AI Research / Future AI
- Self-driving cars
- Agents trained in many environments: cars, video games, the Internet
- Dynamic Neural Networks: self-adding new memory or layers, changing the evaluation path based on inputs
[Diagram: Live data → Model (Conv2d → ReLU → BatchNorm) → Prediction; Continued Online Learning]
[Diagram: Sample-1 and Sample-2 routed through different layer stacks; Data-dependent change in model structure]
[Diagram: layer stacks of growing depth; Change in model capacity at runtime]
The need for a dynamic framework
- Interop with many dynamic environments
  - Connecting to car sensors should be as easy as training on a dataset
  - Connect to environments such as OpenAI Universe
- Dynamic Neural Networks
  - Change behavior and structure of the neural network at runtime
- Minimal Abstractions
  - More complex AI systems are harder to debug without a simple API
Tools for AI research and deployment
Many machine learning tools and deep learning frameworks
- Static graph frameworks: define-and-run
- Dynamic graph frameworks: define-by-run
Dynamic graph frameworks
- Model is constructed on the fly at runtime
- Change behavior and structure of the model
- Imperative style of programming (see the sketch below)
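A minimal sketch of the define-by-run style, using PyTorch's nn module; the DynamicNet class, layer sizes, and the norm-based stopping rule are made up for illustration. The point is that ordinary Python control flow decides the structure per input.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicNet(nn.Module):
    def __init__(self):
        super(DynamicNet, self).__init__()
        self.input_layer = nn.Linear(10, 20)
        self.hidden_layer = nn.Linear(20, 20)
        self.output_layer = nn.Linear(20, 1)

    def forward(self, x):
        h = F.relu(self.input_layer(x))
        # define-by-run: a plain Python loop decides how many times the
        # hidden layer runs, so the graph can differ for every input
        steps = 0
        while h.norm() > 1.0 and steps < 4:
            h = F.relu(self.hidden_layer(h))
            steps += 1
        return self.output_layer(h)

net = DynamicNet()
out = net(torch.randn(2, 10))   # the graph for this batch is built as forward() runs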
Overview
- ndarray library with GPU support
- automatic differentiation engine
- gradient-based optimization package
- for Deep Learning, Reinforcement Learning, and as a NumPy alternative
ndarray library with GPU support
- np.ndarray <-> torch.Tensor
- 200+ operations, similar to numpy (example below)
- very fast acceleration on NVIDIA GPUs
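A small sketch of the Tensor API mirroring NumPy; the shapes and values are arbitrary.

import torch

a = torch.rand(3, 4)             # uniform random Tensor, like np.random.rand(3, 4)
b = torch.ones(4, 2)
c = torch.mm(a, b)               # matrix multiply, result is 3x2
print(c.size())                  # torch.Size([3, 2])
print(a.sum(), a.mean(), a.t())  # reductions and transpose, as in numpy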
ndarray / Tensor library
[Side-by-side slides: NumPy code and the equivalent PyTorch code]
NumPy bridge
- Zero memory-copy, very efficient (example below)
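A minimal sketch of the bridge: torch.from_numpy() and Tensor.numpy() share the same underlying memory, so no copy is made.

import numpy as np
import torch

a = np.ones(5)
t = torch.from_numpy(a)   # t shares memory with a: zero-copy
t.mul_(2)                 # an in-place change on the Tensor...
print(a)                  # ...shows up in the numpy array: [2. 2. 2. 2. 2.]
b = t.numpy()             # back to numpy, also zero-copy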
Seamless GPU Tensors
- A full suite of high-performance Tensor operations on GPU (example below)
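A minimal sketch of moving work onto the GPU, assuming a CUDA-capable device is available; the matrix sizes are arbitrary.

import torch

x = torch.rand(1024, 1024)
y = torch.rand(1024, 1024)
if torch.cuda.is_available():
    x, y = x.cuda(), y.cuda()   # move both Tensors onto the GPU
z = torch.mm(x, y)              # same call as on CPU, now runs on the GPU
print(z.is_cuda)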
GPUs are fast
- Buy a $700 NVIDIA 1080 Ti
- 100x faster matrix multiply
- 10x faster operations in general on matrices
automatic differentiation engine
for deep learning and reinforcement learning
Deep Learning Frameworks
- Provide gradient computation
  - Gradient of one variable w.r.t. any variable in the graph
- Provide integration with high-performance DL libraries like CuDNN
[Diagram: computation graph MM / MM / Add / Tanh, annotated with gradients such as d(h2h)/d(W_h)]
PyTorch Autograd

import torch
from torch.autograd import Variable

x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20), requires_grad=True)
W_x = Variable(torch.randn(20, 10), requires_grad=True)

i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h
next_h = next_h.tanh()
next_h.backward(torch.ones(20, 1))

[Diagram: the graph MM / MM → Add → Tanh is built up as each line runs]
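After backward() runs, each Variable created with requires_grad=True holds its gradient (the product of the ones tensor passed to backward() with the Jacobian) in its .grad attribute. A minimal sketch, assuming the snippet above has just executed:

print(W_x.grad.size())   # gradient w.r.t. W_x, same shape: torch.Size([20, 10])
print(W_h.grad.size())   # gradient w.r.t. W_h, same shape: torch.Size([20, 20])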
PyTorch Autograd
side by side: TensorFlow and PyTorch
High performance
- Integration of:
- CuDNN v6
- NCCL
- Intel MKL
- 200+ operations, similar to numpy
- very fast acceleration on NVIDIA GPUs
Upcoming feature: Distributed PyTorch
Planned Feature: JIT Compilation
Compilation benefits
- Out-of-order execution
- Kernel fusion
- Automatic work placement
[Diagram: ops reordered (1 2 3 → 3 1 2); Conv2d / ReLU / BatchNorm fused into one kernel; work placed across Node 0 / Node 1 CPUs and GPUs]
JIT Compilation
- Possible in define-by-run frameworks
- The key idea is deferred or lazy evaluation:
    y = x + 2
    z = y * y
    # nothing is executed yet, but the graph is being constructed
    print(z)  # now the entire graph is executed: z = (x+2) * (x+2)
- We can do just-in-time compilation on the graph before execution
Lazy Evaluation

from torch.autograd import Variable
x = Variable(torch.randn(1, 10))
prev_h = Variable(torch.randn(1, 20))
W_h = Variable(torch.randn(20, 20))
W_x = Variable(torch.randn(20, 10))
i2h = torch.mm(W_x, x.t())
h2h = torch.mm(W_h, prev_h.t())
next_h = i2h + h2h
next_h = next_h.tanh()
# Graph built but not actually executed

print(next_h)
# Data accessed. Execute graph.

- A little bit of time between building and executing the graph
- Use it to compile the graph just-in-time
JIT Compilation
- Fuse and optimize operations
- Cache subgraphs: "I've seen this part of the graph before, let me pull up the compiled version from cache"
[Diagram: the Add / MM / MM / Tanh graph with adjacent operations fused and subgraphs cached]
JIT Compilation
- Possible in dynamic frameworks
- The key idea is deferred or lazy evaluation:
    y = x + 2
    z = y * y
    # nothing is executed yet, but the graph is being constructed
    print(z)  # now the entire graph is executed: z = (x+2) * (x+2)
- We can do just-in-time compilation on the graph before execution
- We can cache repeating patterns in subsets of the graph to avoid recompilation
- The compiler is very different from an ahead-of-time compiler:
  - fast compilation
  - compiles traces rather than the full graph
Summary
- Fast ndarray library with GPU support
- Build the latest neural networks and do gradient-based learning using the autograd and neural network packages (see the sketch below)
- Large community of people, many companies using and contributing
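A minimal end-to-end sketch of those pieces working together; the two-layer network, loss, and SGD settings here are made up for illustration.

import torch
import torch.nn as nn
import torch.optim as optim

# a small network built from the neural network package
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 1))
optimizer = optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(32, 10)        # a batch of made-up inputs
target = torch.randn(32, 1)    # made-up regression targets

optimizer.zero_grad()
loss = loss_fn(model(x), target)
loss.backward()                # autograd computes gradients for all parameters
optimizer.step()               # gradient-based update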