Machine Learning: Opening the Pandora’s Box
By Dhiana Deva - Machine Learning Engineer at Spotify - QCon São Paulo - May 2019
Agenda: About me; Open the Pandora’s Box; Start with simple; Aim for the skies; Hit half-way there; Enjoy the journey!
Machine Learning is like opening the Pandora’s Box: Problems, Problems, Problems, Problems.
Machine Learning is like opening the Pandora’s Box: Assumptions, Constraints, Issues, Risks.
What decisions can you affect? What are the system implications? What does your ML Infra support?
Illustration from the book "Creative People Must Be Stopped" by David A. Owens
Business Constraints
Data Constraints
Systems Constraints
Investigate, communicate, and address it strategically by either:
WARNING: Hitting an unexpected critical constraint too late in the process can kill your ML product!
You might not have enough data to back your hypothesis. Historical data is biased by existing heuristics. The hypothesis behind your ML product might be based on a critical assumption.
Assumptions bridge between "Known Unknowns" and "Known Knowns".
[Diagram: Known / Unknown quadrants, with Assumptions sitting between them]
impact analysis and further more sophisticated approaches.
assumptions first.
data.
Is there latency introduced? Did the systems need to be changed, decoupled or refactored? Issues from systems implications might impact your metrics and should not be attributed to Machine Learning.
You don’t want to compare apples and oranges!
Data
System
Other
Running A/A Tests
What to expect?
In case a discrepancy is found on the A/A Test analysis:
Decide whether to fix it based on its impact size
Run an A/A/B Test if time sensitive! But only trust the A/B part once you have validated the A/A part!
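The A/A/B advice above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's method: it assumes one numeric metric per user, and the function names, the permutation test and the 0.05 threshold are all hypothetical choices made for the example.

```python
import random
import statistics


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the analysis is repeatable
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate from being exactly zero.
    return (hits + 1) / (n_permutations + 1)


def analyse_aab(a1, a2, b, alpha=0.05):
    """Report the A/B comparison only if the two control groups agree."""
    aa_p = permutation_p_value(a1, a2)
    if aa_p < alpha:
        # The controls differ: something other than the treatment is
        # affecting the metric, so the A/B part cannot be trusted yet.
        return {"aa_p_value": aa_p, "aa_ok": False, "ab_p_value": None}
    control = list(a1) + list(a2)
    return {
        "aa_p_value": aa_p,
        "aa_ok": True,
        "ab_p_value": permutation_p_value(control, b),
    }
```

When `aa_ok` is false, the A/B p-value is deliberately withheld, so a discrepancy between the two control groups has to be investigated before anyone reads the treatment result.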
Optimizing for metric A might lead to risking metric B.
"If you optimize your business to maximize one metric, something important happens. Just like [...] squeezing it in one place makes it bulge out in another."
Quote from the book “Lean Analytics” by Benjamin Yoskovitz and Alistair Croll
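One common way to guard against this trade-off is to check "guardrail" metrics alongside the metric being optimized. The sketch below is an assumption-laden illustration: the metric names, the tolerances and the `check_guardrails` helper are hypothetical and do not come from the talk.

```python
def check_guardrails(lifts, primary, guardrails):
    """Decide whether a variant is safe to ship.

    lifts:      metric name -> relative change vs control (0.03 means +3%)
    primary:    name of the metric being optimized
    guardrails: metric name -> largest tolerated relative drop (positive number)
    """
    # A guardrail is violated when the metric dropped by more than its tolerance.
    violations = {
        name: lifts[name]
        for name, max_drop in guardrails.items()
        if lifts[name] < -max_drop
    }
    ship = lifts[primary] > 0 and not violations
    return ship, violations


# Hypothetical experiment read-out: clicks improved, retention got worse.
lifts = {"clicks": 0.04, "retention": -0.03, "revenue": 0.0}
guardrails = {"retention": 0.01, "revenue": 0.02}
ship, violations = check_guardrails(lifts, "clicks", guardrails)
```

Here the click lift alone would suggest shipping, but the retention drop exceeds its tolerance, so `ship` is false and `violations` names the metric that bulged out.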
Before experimenting
PS: The same goes when collecting randomised data.
After experiment
Illustration from the book “Feature Engineering for Machine Learning" by Alice Zheng and Amanda Casari.
Quote from the book "Doing Data Science" by Cathy O’Neil and Rachel Schutt. Chapter contributed by Claudia Perlich.
“Doing simple sanity checking to make sure things are what you think they are can sometimes get you much further in the end than web scraping and a big fancy machine learning algorithm. It may not seem cool and sexy, but it’s smart and good [...] publishable research, but at least it’s legitimate and solid work.”
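As an illustration of what such simple sanity checking might look like, here is a minimal sketch over rows held as plain dictionaries. The field names, the allowed ranges and the `sanity_check` helper are hypothetical; real checks depend on the dataset at hand.

```python
def sanity_check(rows, required, ranges):
    """Return a list of human-readable problems found in `rows` (a list of dicts)."""
    if not rows:
        return ["dataset is empty"]
    problems = []

    # 1. Required fields must be present and not None.
    for field in required:
        missing = sum(1 for r in rows if r.get(field) is None)
        if missing:
            problems.append(f"{field}: {missing} missing value(s)")

    # 2. Numeric fields must fall inside their expected range.
    for field, (low, high) in ranges.items():
        bad = sum(
            1
            for r in rows
            if r.get(field) is not None and not (low <= r[field] <= high)
        )
        if bad:
            problems.append(f"{field}: {bad} value(s) outside [{low}, {high}]")

    # 3. Exact duplicate rows often point at a broken join or a double export.
    seen = set()
    duplicates = 0
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
    if duplicates:
        problems.append(f"{duplicates} duplicate row(s)")

    return problems
```

An empty result does not prove the data is right; it only means these particular assumptions held.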
Illustration from the "Analytics Solutions Unified Method" (ASUM-DM) by IBM
Addressing the constraints, assumptions, risks and issues.
Illustration from the paper "Hidden Technical Debt in Machine Learning Systems” by D Sculley et al (Google) - 2015
ML Systems are complex systems!
Illustration adapted from the paper "Hidden Technical Debt in Machine Learning Systems” by D Sculley et al (Google) - 2015
Start with simple!
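One way to read "start with simple" is to put a trivial baseline in place before investing in a model, so that every later iteration has something to beat. The sketch below is an assumed illustration, not a method from the talk; the class name and the play/skip labels are hypothetical.

```python
from collections import Counter


class MajorityBaseline:
    """Always predicts the most frequent label seen during fit()."""

    def fit(self, labels):
        self.prediction_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, n):
        return [self.prediction_] * n


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels: did the user play or skip the recommended track?
train_labels = ["skip", "play", "play", "play"]
test_labels = ["play", "skip", "play", "play"]

baseline = MajorityBaseline().fit(train_labels)
baseline_accuracy = accuracy(test_labels, baseline.predict(len(test_labels)))
```

Any more sophisticated model then has to justify its extra complexity by clearly beating `baseline_accuracy` on the same data.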
Iterate with strategic, proportional investments across the ML stack.
And so on…
Machine Learning is a powerful tool, but buy-in and sponsorship are much needed. A big vision is vital for Machine Learning products.
Multi-Objective Optimization; Multiple Levers; Multiple ML Products
Learning.
Try to understand how decisions impact outcomes. Learn more: check out the slides from the tutorial "Algorithmic Bias in Practice" at ACM FAT*2019.
Illustration from “AAAI 2017 Spring Symposium Series - Designing the UX of ML Systems” by Henriette Cramer and Jenn Thom
Open the Pandora’s Box; Start with simple; Aim for the skies; Hit half-way there; Enjoy the journey!
dhiana@spotify.com - @dhianadeva on Twitter