Economical machine learning via functional programming
Big Data Scala by the Bay – August 18, 2015
David Andrzejewski - @davidandrzej Data Sciences Engineering, Sumo Logic
Sumo Logic
This talk
Machine learning
– take our jobs?
– annihilate humanity?
– robots studying
– heads with gears in them
So hot right now
What kinds of “stuff” can machines learn to do?
And how do they do it?
What How
Model
Estimate Predict
Rise of complementary goods
More Cloud
(source: Forrester via Forbes)
Moore’s Law
(source: Fairchild via computerhistory.org)
More Data
(source: IDC via The Economist)
Technical debt
“...you are sure that it will make further changes harder in the future.” – Martin Fowler
ML: new & exciting ways to shoot yourself in the foot
Trough of disillusionment?
Machine Learning: The High Interest Credit Card of Technical Debt
Two big challenges in machine learning
Léon Bottou (ICML 2015 keynote)
– test and debug
– safely improve
– manage data/features
– erode boundaries
– glue / hack / duct tape
A Systems View of Machine Learning
Joshua Bloom (PyData 2015 keynote)
Machine Learning
Out of the Tar Pit
Moseley & Marks (2006) – h/t Paco Nathan
Essential Complexity | Incidental Complexity
Actual problem | “Reality tax”
Business logic | Implementation detail
SQL | Hadoop
Declarative | Imperative
Control “complexity spend” with Functional Programming
Functional programming (FP)
– no side effects
– referential transparency
– immutability
– 1st class functions
– higher-order functions
– version 7.1.3
– see “learning scalaz” blog post series by eed3si9n
Case 0: your code, does it work?
– Case class wrappers
– Unboxed tagged types
Step 0: use the types
def getData(datasetId: Long, startTime: Long, endTime: Long)
def getData(datasetId: DatasetId, timeRange: DatasetInterval)
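The second signature above can be sketched with plain case class wrappers. This is a minimal illustration, not the talk's code: the field names, the `require` check, and the stand-in body of `getData` are all assumptions.

```scala
object TypedIds {
  // Hypothetical wrappers: the field names are assumptions, not from the talk.
  final case class DatasetId(value: Long)

  final case class DatasetInterval(startMillis: Long, endMillis: Long) {
    // Reject intervals that end before they start at construction time.
    require(startMillis <= endMillis, "interval must not end before it starts")
    def lengthMillis: Long = endMillis - startMillis
  }

  // Stand-in body: a real implementation would query a data store.
  def getData(datasetId: DatasetId, timeRange: DatasetInterval): String =
    s"dataset ${datasetId.value} over ${timeRange.lengthMillis} ms"
}
```

With three bare `Long` arguments the compiler cannot catch a swapped `startTime` and `datasetId`; with the wrappers, that mistake becomes a type error.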
Step 1: unit testing
Testing for data scientists Trey Causey (PyData 2015)
Step 2: property testing
– bounded output
– find weird edge cases (e.g., empty clusters)
Universal Conditional
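A rough sketch of the universal vs. conditional distinction, using a hand-rolled random generator instead of a property-testing library such as ScalaCheck. The function under test (`normalize`), the generator, and all names are hypothetical.

```scala
import scala.util.Random

object PropertySketch {
  // Function under test: turn non-negative weights into proportions.
  def normalize(ws: Vector[Double]): Vector[Double] = {
    val total = ws.sum
    if (total == 0.0) ws else ws.map(_ / total)
  }

  // Hypothetical generator: short vectors, including empty and all-zero ones.
  def genWeights(rng: Random): Vector[Double] =
    Vector.fill(rng.nextInt(6))(if (rng.nextInt(4) == 0) 0.0 else rng.nextDouble())

  // Run a property against many generated inputs; return the first counterexample.
  def firstFailure(trials: Int, seed: Long)(prop: Vector[Double] => Boolean): Option[Vector[Double]] = {
    val rng = new Random(seed)
    Iterator.fill(trials)(genWeights(rng)).find(ws => !prop(ws))
  }

  // Universal: claimed for every input (bounded output).
  val bounded: Vector[Double] => Boolean =
    ws => normalize(ws).forall(p => p >= 0.0 && p <= 1.0)

  // Conditional: only claimed when the precondition (positive total) holds.
  val sumsToOne: Vector[Double] => Boolean =
    ws => ws.sum <= 0.0 || math.abs(normalize(ws).sum - 1.0) < 1e-9
}
```

Because the generator deliberately emits empty and all-zero vectors, it exercises the same kind of edge case as an empty cluster.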
Step 3: statistical estimators are functions
– customer generator: sets of datasets
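One way to read "estimators are functions": generate synthetic data from known parameters, run the estimator, and check that it lands near the truth. The sketch below uses a sample mean and a Gaussian generator as simple stand-ins (both are assumptions; the talk's generator produces sets of datasets).

```scala
import scala.util.Random

object EstimatorSketch {
  // An estimator is just a function from a sample to a parameter estimate.
  def sampleMean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Hypothetical generator: Gaussian data with a known mean and unit variance.
  def gaussianSample(trueMean: Double, n: Int, seed: Long): Vector[Double] = {
    val rng = new Random(seed)
    Vector.fill(n)(trueMean + rng.nextGaussian())
  }

  // Property: on synthetic data the estimate lands near the known parameter.
  def recovers(trueMean: Double, n: Int, seed: Long, tol: Double): Boolean =
    math.abs(sampleMean(gaussianSample(trueMean, n, seed)) - trueMean) < tol
}
```

The tolerance should be chosen from the estimator's standard error (here about 1/sqrt(n)) so that the check does not fail spuriously.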
Case 1: loose coupling via type class pattern
Approach | Example
Hard-wired | def foo(x: MyBuzzType)
Parametric polymorphism | def add[T](x: T)
Variance annotation | class Stack[+T]
Ad-hoc polymorphism | def sort[T: Ordering](xs: List[T])
Advantages of type classes
– Timestamped[T]
– Featurable[T]
– Labeled[T]
need label info
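A minimal sketch of how `Featurable[T]` and `Labeled[T]` might look as type classes. Only the two names come from the slide; the method signatures, the `LogLine` type, and the chosen features are assumptions.

```scala
object TypeClassSketch {
  // Type classes named on the slide; the method signatures are assumptions.
  trait Featurable[T] { def features(t: T): Vector[Double] }
  trait Labeled[T] { def label(t: T): String }

  // A hypothetical domain type.
  final case class LogLine(text: String, level: String)

  // Instances live in the companion here only for easy implicit lookup;
  // they could equally be defined in a separate module.
  object LogLine {
    implicit val featurable: Featurable[LogLine] = new Featurable[LogLine] {
      def features(l: LogLine): Vector[Double] =
        Vector(l.text.length.toDouble, l.text.count(_.isDigit).toDouble)
    }
    implicit val labeled: Labeled[LogLine] = new Labeled[LogLine] {
      def label(l: LogLine): String = l.level
    }
  }

  // Prediction-side code only needs features.
  def featureMatrix[T: Featurable](xs: Seq[T]): Seq[Vector[Double]] =
    xs.map(x => implicitly[Featurable[T]].features(x))

  // Training-side code additionally needs label info.
  def trainingSet[T: Featurable: Labeled](xs: Seq[T]): Seq[(Vector[Double], String)] =
    xs.map(x => (implicitly[Featurable[T]].features(x), implicitly[Labeled[T]].label(x)))
}
```

The context bounds state exactly what each function requires, so code that never trains a model cannot accidentally depend on labels.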
Type class laws
Property Testing + Type Classes
Case 2: Monoids + Monoids = Monoids
Implementing Monoid
(I believe Shapeless can do this automagically...!?)
Map(TestGroup -> Results(79,119,171,14), ControlGroup -> Results(34,77,136,112))
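A hand-rolled sketch of "Monoids + Monoids = Monoids" for a value shaped like the one above. It does not use Scalaz or Algebird; the `Monoid` trait, the `Results` field names, and `combineAll` are hypothetical, since the slide shows only four integers per group.

```scala
object MonoidSketch {
  trait Monoid[T] {
    def zero: T
    def plus(a: T, b: T): T
  }

  // Hypothetical counter fields; the slide shows only four integers per group.
  final case class Results(a: Int, b: Int, c: Int, d: Int)

  sealed trait Group
  case object TestGroup extends Group
  case object ControlGroup extends Group

  object Monoid {
    // Field-wise addition of counters.
    implicit val resultsMonoid: Monoid[Results] = new Monoid[Results] {
      val zero: Results = Results(0, 0, 0, 0)
      def plus(x: Results, y: Results): Results =
        Results(x.a + y.a, x.b + y.b, x.c + y.c, x.d + y.d)
    }

    // Monoids compose: a Map whose values form a monoid is itself a monoid.
    implicit def mapMonoid[K, V](implicit v: Monoid[V]): Monoid[Map[K, V]] =
      new Monoid[Map[K, V]] {
        val zero: Map[K, V] = Map.empty[K, V]
        def plus(x: Map[K, V], y: Map[K, V]): Map[K, V] =
          y.foldLeft(x) { case (acc, (k, b)) =>
            acc.updated(k, v.plus(acc.getOrElse(k, v.zero), b))
          }
      }
  }

  def combineAll[T](xs: Seq[T])(implicit m: Monoid[T]): T =
    xs.foldLeft(m.zero)((a, b) => m.plus(a, b))
}
```

The `Map` instance is derived from the `Results` instance, so per-group counters from different batches merge without any group-specific code.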
Distributed compute via monoid homomorphism
See: Twitter Algebird and related talks, Jimmy Lin “Monoidify!” paper
DATA DATA DATA
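The idea can be sketched without any cluster: summarize each partition independently, then merge the partial summaries with the monoid operation. The result matches summarizing all the data at once. `Stats` and the function names below are illustrative assumptions, not Algebird's API.

```scala
object HomomorphismSketch {
  // Sufficient statistics for a mean: (sum, count) under component-wise addition.
  final case class Stats(sum: Double, count: Long) {
    def +(that: Stats): Stats = Stats(sum + that.sum, count + that.count)
    def mean: Double = sum / count
  }
  val zero: Stats = Stats(0.0, 0L)

  // Lift one data point into the monoid.
  def lift(x: Double): Stats = Stats(x, 1L)

  // Runs independently on each partition.
  def summarize(partition: Seq[Double]): Stats =
    partition.map(lift).foldLeft(zero)(_ + _)

  // Merging partial results needs only the monoid operation.
  def merge(partials: Seq[Stats]): Stats = partials.foldLeft(zero)(_ + _)
}
```

For example, `merge(partitions.map(summarize))` and `summarize(partitions.flatten)` agree, which is what lets the work be split across machines.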
Monoidal classifiers: 400x faster than Weka
Algebraic Classifiers: a generic approach to fast cross-validation, online training, and parallel training - Izbicki, ICML13
Key trick: prefix-sum
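One plausible reading of the prefix-sum remark, sketched under the assumption that per-fold models combine by a monoid operation: the model for "every fold except i" is the prefix sum up to i combined with the suffix sum after i. The `Model` type here is a toy stand-in, not a classifier from the paper.

```scala
object PrefixSumSketch {
  // Stand-in "model": sufficient statistics that combine by addition.
  final case class Model(sum: Double, count: Long) {
    def +(that: Model): Model = Model(sum + that.sum, count + that.count)
  }
  val empty: Model = Model(0.0, 0L)

  def train(fold: Seq[Double]): Model = Model(fold.sum, fold.size.toLong)

  // For each fold i, the model trained on every fold except i,
  // using one pass of prefix sums and one pass of suffix sums.
  def leaveOneFoldOut(folds: Vector[Seq[Double]]): Vector[Model] = {
    val perFold = folds.map(train)
    val prefix = perFold.scanLeft(empty)(_ + _)  // prefix(i) covers folds 0 until i
    val suffix = perFold.scanRight(empty)(_ + _) // suffix(i) covers folds i until n
    folds.indices.toVector.map(i => prefix(i) + suffix(i + 1))
  }
}
```

Each fold is trained once and the k held-out models cost O(k) combines, instead of retraining on k-1 folds k times.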
Case 3: auditing computation with Writer Monad
Understanding multiclass predictions (credit: Kumar Avijit)
$f(x) = \arg\max_i \, w_i^T x$
Confusion matrix with “max significant feature”
Tracking illuminates “bad features”
How did we do that?
Writer Monad in simple drawings
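A minimal hand-rolled Writer (not Scalaz's `Writer`) showing how a linear multiclass prediction could carry an audit log of the most significant feature per class. The log format and all names are assumptions, and feature vectors are assumed to be non-empty.

```scala
object WriterSketch {
  // Minimal Writer: a value plus an accumulated log (here a Vector of notes).
  final case class Writer[A](log: Vector[String], value: A) {
    def map[B](f: A => B): Writer[B] = Writer(log, f(value))
    def flatMap[B](f: A => Writer[B]): Writer[B] = {
      val next = f(value)
      Writer(log ++ next.log, next.value)
    }
  }

  // Score one class as w . x, logging the feature with the largest contribution.
  def score(label: String, w: Vector[Double], x: Vector[Double]): Writer[(String, Double)] = {
    val contributions = w.zip(x).map { case (wi, xi) => wi * xi }
    val top = contributions.indexOf(contributions.max)
    Writer(Vector(s"$label: top feature $top"), (label, contributions.sum))
  }

  // argmax over classes; the log records how each score came about.
  def predict(weights: Seq[(String, Vector[Double])], x: Vector[Double]): Writer[String] =
    weights
      .map { case (label, w) => score(label, w, x) }
      .foldLeft(Writer(Vector.empty[String], Vector.empty[(String, Double)])) { (acc, s) =>
        for { scores <- acc; one <- s } yield scores :+ one
      }
      .map(scores => scores.maxBy(_._2)._1)
}
```

The prediction code never touches the log directly; `flatMap` does the bookkeeping, so auditing can be added without changing the scoring logic's return path.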
Case 4: stateful traversal
Example: sampling from p-th order autoregressive model
$x_t = \sum_{i=1}^{p} \phi_i x_{t-i} + \epsilon_t$
Re-arrange to take current window state as input
Partially apply the function for fixed parameters
Map function over random noise terms
Now we’ve got something like g: Window => (Window, Prediction)
1. Convert each position into independent State calculation
2. Traverse/Sequence to convert List[State] → State[List]
3. Supply initial window state and run()
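The three steps can be sketched with a hand-rolled State type rather than Scalaz's `State`. Everything here is illustrative: the `Window` representation (most recent value first), the coefficient vector, and the function names are assumptions.

```scala
object StateSketch {
  // Minimal State: a function from a state to (next state, result).
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] =
      State[S, B] { s => val (s1, a) = run(s); (s1, f(a)) }
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State[S, B] { s => val (s1, a) = run(s); f(a).run(s1) }
  }

  // Step 2: List[State[S, A]] => State[S, List[A]], threading state left to right.
  // Not stack-safe for very long lists; fine for a sketch.
  def sequence[S, A](steps: List[State[S, A]]): State[S, List[A]] =
    steps.foldRight(State[S, List[A]](s => (s, Nil))) { (step, rest) =>
      for { a <- step; as <- rest } yield a :: as
    }

  type Window = Vector[Double] // most recent value first

  // One AR step: coefficients fixed first, then the noise term, then the window.
  def arStep(coeffs: Vector[Double])(noise: Double): State[Window, Double] =
    State[Window, Double] { window =>
      val next = coeffs.zip(window).map { case (c, x) => c * x }.sum + noise
      ((next +: window).take(coeffs.length), next)
    }

  def sample(coeffs: Vector[Double], init: Window, noises: List[Double]): List[Double] = {
    // Partially apply for fixed parameters, then map over the noise terms (step 1).
    val step: Double => State[Window, Double] = arStep(coeffs)
    // Step 3: supply the initial window and run.
    sequence(noises.map(step)).run(init)._2
  }
}
```

With coefficients `Vector(1.0, 1.0)`, an initial window of ones, and zero noise, the samples follow the Fibonacci recurrence, which makes a convenient sanity check.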
Monoids, Monads, who cares?
– Monoids: generalized addition/combination
– Monads: computation within context
– type-checking
– generalized wiring
– optimization opportunities
– common vocabulary
design
composition
randomized behavior
Correctness
design pattern
via ad hoc polymorphism
classes
Monoid design Monad design
Manage ML tech debt with functional programming
Loose coupling
structures
general plumbing
distributed computation
validation
model prediction with Writer
failure handling