

  1. 2nd TVM and Deep Learning Compilation Conference, December 5, 2019

  2. Zach Tatlock, OctoML

  3. Let’s Get in the Wayback Machine: 2018

  4. Challenges for Deep Learning IRs
     • State-of-the-art models increasingly depend on:
       • Datatypes: lists, trees, graphs
       • Control flow: branches, loops, recursion
       • Whole-program analyses and optimizations
     • Any one feature is “easy to bolt on”
     • Folklore suggests a full, expressive IR will be slow
     Slide example of recursive control flow (sketched below):
       let encode = λ st. if (...): encode(step(st)) else: ...
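
A minimal sketch of that recursive encode pattern in Relay's Python API. The count-down guard, the tensor shape, and relu standing in for step are all invented for illustration; recursion through a GlobalVar added to an IRModule is the real Relay idiom:

    # Sketch: recursion + control flow as first-class constructs in the Relay IR.
    # Assumptions: the shapes, the relu "step", and the count-down guard are
    # illustrative, not from the talk.
    import tvm
    from tvm import relay

    mod = tvm.IRModule()
    encode = relay.GlobalVar("encode")            # global var so the body can call itself
    n = relay.var("n", shape=(), dtype="int32")   # remaining steps
    st = relay.var("st", shape=(1, 16), dtype="float32")

    step = relay.nn.relu(st)                      # hypothetical stand-in for step(st)
    body = relay.If(
        relay.greater(n, relay.const(0, "int32")),
        encode(relay.subtract(n, relay.const(1, "int32")), step),  # recursive call
        st,                                       # base case: return the state
    )
    mod[encode] = relay.Function(
        [n, st], body, ret_type=relay.TensorType((1, 16), "float32"))
    print(mod)                                    # text form shows the recursive definition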

  5. The Relay IR
     • Relay generalizes NNVM
       • Retains graph-level optimizations
       • Provides more expressive features: datatypes, control flow, code re-use
     • Functional semantics to simplify analysis
     • Automatic differentiation + optimizations
     ~ “OCaml for ML” (see the sketch below)
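
To give the “OCaml for ML” point something concrete, here is a hedged sketch of a tiny, purely functional Relay program built with the Python API and run through one of TVM's executors; the function, shapes, and executor choice are illustrative assumptions:

    # Sketch of a small functional Relay program (ops and shapes are invented).
    import numpy as np
    import tvm
    from tvm import relay

    x = relay.var("x", shape=(2, 2), dtype="float32")
    y = relay.var("y", shape=(2, 2), dtype="float32")
    f = relay.Function([x, y], relay.add(relay.multiply(x, y), x))  # f(x, y) = x*y + x

    mod = tvm.IRModule.from_expr(f)
    mod = relay.transform.InferType()(mod)   # whole-program type/shape inference

    ex = relay.create_executor("graph", mod=mod)
    a = np.ones((2, 2), dtype="float32")
    print(ex.evaluate()(a, a))               # all 2s: 1*1 + 1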

  6. Relay: Expressiveness + Performance
     • High-level Relay models match NNVM in traditional vision inference
     [Chart: Relay inference mean speedup]

  7. Relay: Expressiveness + Performance
     • Low-cost abstraction enabled by:
       • Tensor shape inference and specialization
       • High-level operator fusion
       • Whole-program partial evaluation
     • But most of all by an extensible, composable optimization framework! (sketched below)
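
What the composable optimization framework looks like in practice, as a hedged sketch: the passes named below are real Relay passes, but this particular pipeline and opt level are an illustrative selection, not the talk's exact recipe:

    import tvm
    from tvm import relay

    def optimize(mod):
        # Compose standard Relay passes; the order and selection here are
        # an example, not a prescribed pipeline.
        seq = tvm.transform.Sequential([
            relay.transform.InferType(),                 # shape/type inference
            relay.transform.SimplifyInference(),
            relay.transform.FoldConstant(),
            relay.transform.PartialEvaluate(),           # whole-program partial evaluation
            relay.transform.FuseOps(fuse_opt_level=2),   # high-level operator fusion
        ])
        with tvm.transform.PassContext(opt_level=3):
            return seq(mod)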

  8. Relay Win: Support for New Models
     • High-level Relay models for RNNs and LSTMs can outperform the rest
     • Plus support for new/improved targets via high-level transformations (see the sketch below)
     [Chart: Relay inference mean speedup]
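
A hedged sketch of retargeting from the user's side: compile a Relay module for a chosen backend with relay.build and run it. The target string, input name, and graph-executor calls are assumptions following recent TVM releases, not code from the talk:

    import tvm
    from tvm import relay
    from tvm.contrib import graph_executor

    def compile_and_run(mod, params, input_name, input_data, target="llvm"):
        # Swap target for e.g. "cuda" or an accelerator target you have set up.
        with tvm.transform.PassContext(opt_level=3):
            lib = relay.build(mod, target=target, params=params)
        dev = tvm.device(target, 0)
        m = graph_executor.GraphModule(lib["default"](dev))
        m.set_input(input_name, input_data)
        m.run()
        return m.get_output(0).numpy()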

  9. Research Ready ➡ Production Ready

  10. Relay + You!
      • Relay merged into TVM mainline
      • Documentation, tutorials, examples
      • Add your own analyses and optimizations
      • Target new accelerators
      • Support new models
      • Tons of community support, from many more amazing folks... and you!
