

SLIDE 1

A Functional Reboot for Deep Learning

Conal Elliott

Target

August 2019

Conal Elliott A Functional Reboot for Deep Learning August 2019 1 / 23

SLIDE 2

Goal

Extract the essence of DL. Shed accidental complexity and artificial limitations, i.e., simplify and generalize.

SLIDE 3

Essence

Optimization: best element of a set (by objective function). Usually via differentiation and gradient following. For machine learning, sets of functions. Objective function is defined via set of input/output pairs.

SLIDE 4

Accidental complexity in deep learning

SLIDE 5

Accidental complexity in DL (overview)

Imperative programming
Weak typing
Graphs (neural networks)
Layers
Tensors/arrays
Back propagation
Linearity bias
Hyper-parameters
Manual differentiation

SLIDE 6

Imperative programming

Thwarts correctness/dependability (usually “not even wrong”). Thwarts efficiency (parallelism). Unnecessary for expressiveness. Poor fit: DL is math, so express it in a mathematical language.

SLIDE 7

Weak typing

Requires people to manage detail & consistency. Run-time errors.

SLIDE 8

Graphs (neural networks)

Clutters API, distracting from purpose. Purpose: a representation of functions. We already have a better one: programming language. Can we differentiate?

An issue of implementation, not language or library definition. Fix accordingly.

SLIDE 10

Layers

Strong bias toward sequential composition. Neglects equally important forms: parallel & conditional. Awkward patches: “skip connections”, ResNet, HighwayNet. Don’t patch the problem; eliminate it. Replace with binary sequential, parallel, conditional composition.
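The three composition forms can be written directly in Haskell; `seqC`, `parC`, and `condC` are illustrative names, not operators from the talk:

```haskell
import Control.Arrow ((***))

-- Sequential composition: pipe one function into the next.
seqC :: (a -> b) -> (b -> c) -> (a -> c)
seqC f g = g . f

-- Parallel composition: run two functions side by side on a pair.
parC :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
parC = (***)

-- Conditional composition: choose a function per input.
condC :: (a -> c) -> (b -> c) -> Either a b -> c
condC = either
```

In GHCi, `seqC (+1) (*2) 3` yields `8`, and `parC (+1) not (3, True)` yields `(4, False)`. Sequential layering is just the first of these; the other two are equally fundamental and need no “skip connection” patches.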

SLIDE 11

“Tensors”

Really, multi-dimensional arrays. Awkward: imagine you could program only with arrays (as in Fortran). Unsafe without dependent types. Multiple intents / weakly typed: even as linear maps, what is the meaning of an m × n array? Limited: missing almost all differentiable types, and missing more natural & compositional data types, e.g., trees.

SLIDE 12

Back propagation

Specialization and rediscovery of reverse-mode auto-diff. Described in terms of graphs. Highly complex due to graph formulation. Stateful:

Hinders parallelism/efficiency. High memory use, limiting problem size.

SLIDE 13

Linearity bias

“Dense” & “fully connected” mean arbitrary linear transformation. Sprinkle in “activation functions” as exceptions to linearity. Misses simpler and more efficient architectures.
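As a point of reference for what this slide critiques, here is a minimal sketch of the dense-plus-activation pattern, with lists standing in for vectors (names and representation are illustrative only):

```haskell
-- A "dense"/"fully connected" layer: an arbitrary affine map followed by
-- a pointwise activation. Lists stand in for vectors purely for illustration.
type Vec = [Double]

dot :: Vec -> Vec -> Double
dot xs ys = sum (zipWith (*) xs ys)

-- Arbitrary linear transformation plus bias: every output sees every input.
affine :: [Vec] -> Vec -> Vec -> Vec
affine rows bias x = zipWith (+) (map (`dot` x) rows) bias

-- The "activation function", sprinkled in as the exception to linearity.
relu :: Vec -> Vec
relu = map (max 0)

dense :: [Vec] -> Vec -> Vec -> Vec
dense rows bias = relu . affine rows bias
```

Note how the linear part is completely unconstrained; structured, sparser maps never arise in this vocabulary.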

SLIDE 14

Hyper-parameters

Same essential purpose as parameters, but different mechanisms for expression and search. Inefficient and ad hoc.

SLIDE 15

A functional reboot

SLIDE 16

Values

Precision: meaning, reasoning, correctness. Simplicity: practical rigor/dependability. Generality: room to grow; design guidance.

SLIDE 17

Essence

Optimization: best element of a set (by objective function). Usually via differentiation and gradient following. For machine learning, sets of functions. Objective function is defined via set of input/output pairs.

SLIDE 18

Optimization

Describe a set of values as the range of a function: f :: p → c. Objective function: q :: c → ℝ. Find argMin (q ∘ f) :: p. When q ∘ f is differentiable, gradient descent can help. Otherwise, other methods. Consider also global optimization, e.g., with interval methods.
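A minimal numeric sketch of this recipe, with p specialized to Double and a finite-difference gradient standing in for the compile-time differentiation advocated later (`argMinGD` is a hypothetical helper, not the talk's API):

```haskell
-- A central finite difference stands in for real differentiation.
gradApprox :: (Double -> Double) -> Double -> Double
gradApprox h x = (h (x + e) - h (x - e)) / (2 * e)
  where e = 1e-6

-- Follow the (approximate) gradient of the objective q . f downhill.
argMinGD :: (Double -> c) -> (c -> Double) -> Double -> Double
argMinGD f q = go (200 :: Int)
  where
    obj = q . f
    go 0 p = p
    go n p = go (n - 1) (p - 0.1 * gradApprox obj p)
```

For example, `argMinGD id (\x -> (x - 3) ^ 2) 0` converges to approximately `3.0`. Nothing here mentions networks, graphs, or layers; only a function and an objective.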

SLIDE 19

Learning functions

Special case of optimization, where c = a → b, i.e., f :: p → (a → b), and q :: (a → b) → ℝ. Objective function often based on a sample set S ⊆ a × b. Measure mis-predictions (loss). Additivity enables a parallel, log-time learning step.
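A small sketch of this setup, assuming a squared-error loss and b specialized to Double (names are illustrative):

```haskell
-- Learning as optimization: the objective sums a per-sample loss.
type Sample a b = (a, b)

-- Squared-error loss for a single prediction.
loss :: (a -> Double) -> Sample a Double -> Double
loss h (x, y) = (h x - y) ^ (2 :: Int)

-- Total loss is a sum, i.e., a monoid fold: each chunk of the sample set
-- can be reduced independently and combined, which is the additivity that
-- enables a parallel, log-depth learning step.
totalLoss :: (a -> Double) -> [Sample a Double] -> Double
totalLoss h = sum . map (loss h)
```

For instance, `totalLoss (* 2) [(1, 2), (2, 5)]` is `1.0`: the first sample is predicted exactly, the second misses by 1.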

SLIDE 20

Differentiable functional programming

Directly on Haskell (etc) programs:

Not a library/DSEL. No graphs/networks/layers.


Differentiated at compile time. Simple, principled, and general (The Simple Essence of Automatic Differentiation). Generates efficient run-time code. Amenable to massively parallel execution (GPUs, etc.).

SLIDE 22

Beyond “tensors”

Most differentiable types are not vectors (uniform n-tuples), and most derivatives (linear maps) are not matrices. A more general alternative:

Free vector space over s: i → s ≅ f s (“i indexes f”). Special case: Fin n → s ≅ Vec n s. Algebra of representable functors: f × g, 1, g ∘ f, Id. Your (representable) functor via deriving Generic.


Linear map (f s ⊸ g s) ≅ g (f s) ≅ (g ∘ f) s (a generalized matrix). Other representations for efficient reverse-mode AD (without tears). Use with Functor, Foldable, Traversable, Scannable, etc. No need for special/limited array “reshaping” operations. Compositional and naturally parallel-friendly (Generic Parallel Functional Programming).
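A hand-rolled sketch of the “generalized matrix” idea, using a concrete `Pair` functor indexed by `Bool` rather than the `Representable`/`Generic` machinery the slide alludes to:

```haskell
-- A tiny representable functor: Pair is indexed by Bool, so i -> s ≅ f s.
data Pair s = Pair s s deriving (Eq, Show)

instance Functor Pair where
  fmap h (Pair x y) = Pair (h x) (h y)

instance Foldable Pair where
  foldr h z (Pair x y) = h x (h y z)

-- tabulate/index witness the isomorphism (Bool -> s) ≅ Pair s.
tabulate :: (Bool -> s) -> Pair s
tabulate h = Pair (h False) (h True)

index :: Pair s -> Bool -> s
index (Pair x _) False = x
index (Pair _ y) True  = y

-- A linear map Pair s ⊸ g s, represented as g (Pair s): a "generalized
-- matrix" whose rows live in g. Apply it by dotting each row with the input.
dotP :: Num s => Pair s -> Pair s -> s
dotP (Pair a b) (Pair x y) = a * x + b * y

apply :: (Functor g, Num s) => g (Pair s) -> Pair s -> g s
apply m v = fmap (`dotP` v) m
```

With `g ~ Pair` this is an ordinary 2 × 2 matrix, but `g` can just as well be a triple, a tree, or any other functor of rows; nothing forces rectangular arrays.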

SLIDE 24

Modularity

How to build function families from pieces, as in DL? A category of indexed sets of functions. Extract a monolithic function after composing. Other uses, including satisfiability. Prototyped, but currently blocked by a problem with the GHC type-checker.

SLIDE 25

Progress

Simple & efficient reverse-mode AD. Some simple regressions, simple DL, and CNN. Some implementation challenges with robustness. Looking for collaborators, including

GHC internals (compiling-to-categories plugin). Background in machine learning and statistics.

SLIDE 26

Summary

Generalize & simplify DL (more for less). Essence of DL: pure FP with argMin. Generalize from “tensors” (for composition & safety). Collaboration welcome!
