ONNX - Sarah Bird, Dmytro Dzhulgakov

ONNX
Sarah Bird, Dmytro Dzhulgakov
Facebook

Deep Learning Frameworks

Tensors and Dynamic neural networks in Python with strong GPU acceleration
http://pytorch.org
Flexible Development
• Research-oriented imperative model
• Python flow-control constructs
• Dynamic graph support with autograd
Released Jan 18th: 500,000+ downloads, 2700+ community repos, 17,200+ user posts, 351 contributors

A New Lightweight, Modular, and Scalable Deep Learning Framework
RUN ANYWHERE, FAST
Your favorite deep learning technology, now from zero to scale, cloud to mobile.
Production Powerhouse
• Scalable from small devices to large GPUs in DC
• Strong distributed training support
• Highly optimized mobile device support
• Based on ahead-of-time static graph: no interpreter needed in prod
Train ImageNet in 1 hour

Research to Production
Reimplementation takes weeks or months

Merge Frameworks?
• Model transfer is important, but less common
• Difficult to optimize the tools for all cases
• Separate but interoperable tools are more efficient

Shared Model Format

Deep Learning Frameworks Zoo
O(n²) pairs
Framework backends (vendor and numeric libraries): Qualcomm SNPE, Intel/Nervana ngraph, Apple CoreML, Nvidia TensorRT, …

Open Neural Network Exchange
Shared model and operator representation
From O(n²) to O(n) pairs
Framework backends (vendor and numeric libraries): Qualcomm SNPE, Intel/Nervana ngraph, Apple CoreML, Nvidia TensorRT, …
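The drop in pair count can be checked with a few lines of arithmetic. This is a rough sketch, assuming one converter per (framework, backend) pair without a shared format, and one exporter per framework plus one importer per backend with it. The function names are illustrative and not part of ONNX.

```python
def direct_converters(frameworks: int, backends: int) -> int:
    """Converters needed when every framework targets every backend directly."""
    return frameworks * backends


def shared_format_converters(frameworks: int, backends: int) -> int:
    """Converters needed when each tool only reads or writes the shared format."""
    return frameworks + backends


# With 5 frameworks and 5 backends the difference is already large.
pairs_without_format = direct_converters(5, 5)
pairs_with_format = shared_format_converters(5, 5)
```

Under these assumptions the direct approach grows quadratically with the number of tools, while the shared format grows linearly.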

Standard?

Open community
• Framework agnostic
• GitHub from the beginning
• Close partnerships and OSS contributions

Unframeworks

Unframeworks
Vision: Interoperable Tools
• Accelerate research to production
• Developers can use the best combination of tools for them
• Enables more people to contribute
Approach:
• Split toolchain into smaller components

UNIX philosophy for deep learning frameworks
Build reusable components that work well together (across frameworks)

Framework anatomy
Frontend (dev experience): Modelling abstractions, Data, Distributed engine (gloo)
High level IR / Operators
Framework glue code (ATen)
Low level IR
Execution engine; Kernel compiler (TVM, TC, XLA); Graph-level engines (TensorRT, CoreML, SNPE)
NN libraries (CUDNN, MKL, MPSCNN, ...); BLAS (cuBLAS, ...)
Backend (HW platform): Device runtime (x86, CUDA, OpenCL, ...)

ONNX high-level IR
• Initial focus on exchange for inference
• SSA graph structure, serializable
• Support for structured control flow
• Standard operator definitions
• Striking balance on granularity
• Codified semantics in tests/ref
• Common optimization passes
(Figure: example graph with ReLU, BatchNorm, and Conv2d nodes)
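To make "SSA graph structure, serializable" concrete, here is a toy sketch in plain Python. It is not the ONNX format itself: ONNX serializes with protocol buffers, while this sketch uses JSON for brevity. The `Node` and `Graph` classes and all value names are hypothetical. The operator names follow the figure on this slide.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Node:
    op_type: str   # operator name, e.g. "Conv2d"
    inputs: list   # names of the values this node consumes
    outputs: list  # names of the values this node produces


@dataclass
class Graph:
    inputs: list                               # graph-level input value names
    nodes: list = field(default_factory=list)  # nodes in topological order

    def add(self, op_type, inputs, outputs):
        """Append a node, enforcing that every value name is assigned once."""
        defined = set(self.inputs)
        for node in self.nodes:
            defined.update(node.outputs)
        for name in inputs:
            if name not in defined:
                raise ValueError(f"{name!r} used before definition")
        for name in outputs:
            if name in defined:
                raise ValueError(f"{name!r} assigned twice (violates SSA)")
        self.nodes.append(Node(op_type, list(inputs), list(outputs)))


# Build a three-node graph: Conv2d feeds BatchNorm, which feeds ReLU.
graph = Graph(inputs=["image", "weights"])
graph.add("Conv2d", ["image", "weights"], ["conv_out"])
graph.add("BatchNorm", ["conv_out"], ["bn_out"])
graph.add("ReLU", ["bn_out"], ["activation"])

# Serialize to text and rebuild an equal graph from it.
serialized = json.dumps(asdict(graph))
loaded = json.loads(serialized)
restored = Graph(inputs=loaded["inputs"],
                 nodes=[Node(**n) for n in loaded["nodes"]])
```

Because each value name is defined exactly once, a consumer can resolve every edge by name lookup, which keeps both serialization and graph-level optimization passes simple.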

Current status
• ONNX IR spec is V1.0
• Good coverage for vision models
• Iterating on:
    • Optimization-friendly RNNs
    • Control Flow
    • More hardware backends

Beyond static graphs: Capturing dynamic behavior

Declarative vs Eager mode
(Diagram labels) Declarative: Python script, Building IR, Python-independent execution, Framework’s VM, Operator implementations, Execution engine
(Diagram labels) Eager: Python interpreter, Code in Python, Regular python extension, Operator implementations

Tracing for static graph
Record which operators were invoked

    import torch

    def foo(x):
        y = x.mm(x)
        print(y)  # still works!
        return y + 1

    x = torch.Tensor([[1, 2], [3, 4]])
    foo(x)

Recorded graph: X → MatMul → Add (with constant 1)
Enough to cover CNNs and static sections
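The idea of recording which operators were invoked can be sketched without any framework. The example below is a minimal illustration, not how PyTorch implements tracing: it uses scalars instead of tensors, `Mul` stands in for the slide's MatMul, and the `Tracer` and `Traced` classes are hypothetical.

```python
import itertools


class Tracer:
    """Collects the operators invoked while a function runs."""

    def __init__(self):
        self.ops = []                  # (op, input names, output name)
        self._ids = itertools.count()

    def fresh(self):
        return f"v{next(self._ids)}"


class Traced:
    """A value that logs every operator applied to it."""

    def __init__(self, value, name, tracer):
        self.value, self.name, self.tracer = value, name, tracer

    def _apply(self, op, other, result):
        other_name = other.name if isinstance(other, Traced) else repr(other)
        out = Traced(result, self.tracer.fresh(), self.tracer)
        self.tracer.ops.append((op, (self.name, other_name), out.name))
        return out

    def __mul__(self, other):
        rhs = other.value if isinstance(other, Traced) else other
        return self._apply("Mul", other, self.value * rhs)

    def __add__(self, other):
        rhs = other.value if isinstance(other, Traced) else other
        return self._apply("Add", other, self.value + rhs)


def foo(x):
    y = x * x
    print(y.value)  # ordinary Python still runs while tracing
    return y + 1


tracer = Tracer()
result = foo(Traced(3, "x", tracer))
recorded = tracer.ops
```

The `print` call runs but leaves nothing in `recorded`: only the operators show up in the trace, which is why tracing is sufficient for code whose operator sequence does not depend on the input.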

Tracing for dynamic graphs

    import torch

    def foo(x, w):
        y = torch.zeros(1, 2)
        for t in x:
            y = y.mm(w) + t
        return y

    w = torch.Tensor([[0.5, 0.2], [0.1, 0.4]])
    x = torch.Tensor([[1, 2], [3, 4], [5, 6]])
    foo(x, w)
    x2 = torch.Tensor([[7, 8], [9, 10]])
    foo(x2, w)

Recorded graph: [0,0] → MatMul(w) → Add(X[0]) → MatMul(w) → Add(X[1]) → MatMul(w) → Add(X[2])
Doesn’t do what you want! The trace is unrolled for the three rows of x, so it does not match x2, which has two rows.
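The failure mode can be reproduced with a toy trace. This sketch assumes scalars in place of the slide's matrices, and `trace_foo` and `replay` are hypothetical helpers. The recorded trace contains one multiply and one add per loop iteration that actually ran, so the loop length is baked in.

```python
def trace_foo(xs, w):
    """Run the loop eagerly and record one (Mul, Add) pair per iteration."""
    ops = []
    y = 0.0
    for i, t in enumerate(xs):
        ops.append(("Mul", None))  # multiply the running value by w
        ops.append(("Add", i))     # add input element i; the index is fixed
        y = y * w + t
    return ops, y


def replay(ops, xs, w):
    """Execute a recorded trace on new inputs, with no Python loop logic."""
    y = 0.0
    for op, index in ops:
        if op == "Mul":
            y = y * w
        else:
            y = y + xs[index]
    return y


w = 0.5
ops, traced_result = trace_foo([1.0, 2.0, 3.0], w)

# A longer input is silently truncated to the traced length.
longer = [1.0, 2.0, 3.0, 4.0]
replayed_longer = replay(ops, longer, w)
_, eager_longer = trace_foo(longer, w)

# A shorter input fails outright because the trace indexes element 2.
try:
    replay(ops, [7.0, 9.0], w)
    shorter_failed = False
except IndexError:
    shorter_failed = True
```

Here the replayed trace and the eager run disagree on the longer input, and the shorter input raises an error.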

Tracing for dynamic graphs

    import torch

    def foo(x, w):
        y = torch.zeros(1, 2)
        for t in x:
            y = y.mm(w) + t
        return y

    w = torch.Tensor([[0.5, 0.2], [0.1, 0.4]])
    x = torch.Tensor([[1, 2], [3, 4], [5, 6]])
    foo(x, w)
    x2 = torch.Tensor([[7, 8], [9, 10]])
    foo(x2, w)

Desired graph: [0,0] → loop "for i = range(X.shape[0])" { MatMul(w) → Add(X[i]) } → y
Capture control flow from python?

Approaches for dynamic graphs
• Parse or compile Python (tricky)
• Use special primitives (annoying)
      for t in x:
          y = y.mm(w) + t
  becomes
      lib.For(x, y, lambda y, t: y.mm(w) + t)
• Capture common patterns like RNN
• Build DSL for subset of Python
• Make it easy to embed C++ calling back to framework
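The "special primitives" approach can be sketched as follows. The slide's `lib.For` is not a documented API, so the `For` function and `LoopNode` class below are hypothetical stand-ins, and scalars replace the matrices. The point is that the primitive records a single loop node holding the body, instead of one copy of the body per iteration.

```python
class LoopNode:
    """A single graph node that represents the whole loop."""

    def __init__(self, body):
        self.body = body  # function (state, element) -> new state

    def run(self, xs, init):
        state = init
        for t in xs:
            state = self.body(state, t)
        return state


trace = []  # the recorded graph, a flat list of nodes in this sketch


def For(xs, init, body):
    """Record one loop node, then execute it on the current inputs."""
    node = LoopNode(body)
    trace.append(node)
    return node.run(xs, init)


w = 0.5
first = For([1.0, 2.0, 3.0], 0.0, lambda y, t: y * w + t)

# The same recorded node handles an input of a different length.
second = trace[0].run([7.0, 9.0], 0.0)
```

The cost is the one the slide calls out: the loop has to be rewritten by hand into the primitive's form, which is less natural than a plain Python `for`.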

Putting it together
Capturing dynamic behavior
• Trace static portions
• Minimum rewrites for dynamic parts
• Establish tooling for step-by-step code migration

Get Involved! ONNX is a community project. https://onnx.ai https://github.com/onnx
