Machine Learning: Opening the Pandora’s Box, by Dhiana Deva




slide-1
SLIDE 1

Machine Learning: Opening the Pandora’s Box

By Dhiana Deva - Machine Learning Engineer at Spotify
QCon São Paulo - May 2019

slide-2
SLIDE 2

Agenda

  • About me
  • Open the Pandora’s Box
  • Start with simple
  • Aim for the skies
  • Hit half-way there
  • Enjoy the journey!

slide-3
SLIDE 3

About Me

slide-4
SLIDE 4

Me @ QCon Rio 2015

slide-5
SLIDE 5

Me @ QCon São Paulo 2019

slide-6
SLIDE 6

Open the Pandora’s Box!

slide-7
SLIDE 7

Introducing Machine Learning

slide-8
SLIDE 8

Introducing Machine Learning

is like opening the Pandora’s Box

Problems, Problems, Problems, Problems

slide-10
SLIDE 10

Introducing Machine Learning

is like opening the Pandora’s Box

Assumptions, Constraints, Issues, Risks

slide-11
SLIDE 11

Constraints

slide-12
SLIDE 12

Be aware (not afraid) of constraints

  • What decisions can you affect?
  • What are the system implications?
  • What does your ML Infra support?

Illustration from the book "Creative People Must Be Stopped" by David A. Owens

slide-13
SLIDE 13

Example Constraints

Business Constraints

  • Metrics
  • Business logic
  • Legal needs

Data Constraints

  • Volume
  • Features
  • Labels

Systems Constraints

  • Available levers
  • Infrastructure support
  • Systems implications
  • Engineering effort
slide-14
SLIDE 14

Addressing Constraints

Investigate, communicate, and address each constraint strategically by either:

  • Accepting it and working within its boundaries
  • Expanding its boundaries

WARNING: Hitting an unexpected critical constraint too late in the process can kill your ML product!

slide-15
SLIDE 15

Assumptions

slide-16
SLIDE 16

"You have no idea, but you pretend you know."

You might not have enough data to back your hypothesis. Historical data is biased by existing heuristics. The hypothesis behind your ML product might be based on a critical assumption.

Assumptions bridge the gap between "Known Unknowns" and "Known Knowns".

(Diagram labels: KNOWN, UNKNOWN, ASSUMPTIONS)

slide-17
SLIDE 17

Example Assumptions

  • Are the metrics sensitive to the levers the ML approach is pulling?
  • How do customers behave under changes in the logic?
  • Impact analysis assumptions:
      • Cost of misclassification
      • Benefit of correct classification
      • Assumptions for the worst-case scenario
      • Parameters for more optimistic scenarios
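
The impact-analysis assumptions listed above can be written down as a small calculation. This is a minimal sketch, not a method from the talk: the function name, the prediction counts, and the benefit and cost values are all hypothetical placeholders, and it deliberately ignores false negatives to stay short.

```python
def expected_impact(true_pos, false_pos, benefit_correct, cost_misclass):
    """Net value of acting on the model's positive predictions.

    All four inputs are assumptions you would need to estimate yourself.
    """
    return true_pos * benefit_correct - false_pos * cost_misclass


# Hypothetical scenarios: same number of decisions, different
# assumptions about how often the model is right.
scenarios = {
    "worst_case": {"true_pos": 400, "false_pos": 600},
    "optimistic": {"true_pos": 800, "false_pos": 200},
}

for name, counts in scenarios.items():
    value = expected_impact(benefit_correct=5.0, cost_misclass=4.0, **counts)
    print(f"{name}: net impact = {value:+.1f}")
```

Writing the scenarios out this way makes it obvious which parameter an early experiment most needs to pin down.
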
slide-18
SLIDE 18

Addressing Assumptions

  • Experiment early and focus on learning the parameters needed for better impact analysis and for more sophisticated approaches later.
  • Consider reframing the initial problem to be solved, so that the most critical assumptions are validated first.
  • To be able to move forward with an unbiased approach, collect randomized data.
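
One possible way to collect randomized data is to let a small share of decisions be made uniformly at random and to log how likely each decision was. The sketch below assumes this setup; `choose_action`, the action names, and the 10% exploration rate are hypothetical, and the heuristic's choice is assumed to be one of the listed actions.

```python
import random


def choose_action(heuristic_action, all_actions, explore_rate, rng):
    """Return (action, propensity) for one decision.

    With probability `explore_rate` a uniformly random action is taken;
    otherwise the existing heuristic decides.
    """
    if rng.random() < explore_rate:
        action = rng.choice(all_actions)
    else:
        action = heuristic_action
    # Probability that this policy picks `action`; logging it allows the
    # data to be reweighted later.
    propensity = explore_rate / len(all_actions)
    if action == heuristic_action:
        propensity += 1.0 - explore_rate
    return action, propensity


rng = random.Random(42)  # fixed seed only to make the sketch reproducible
actions = ["offer_a", "offer_b", "offer_c"]  # hypothetical actions
log = [
    choose_action("offer_a", actions, explore_rate=0.1, rng=rng)
    for _ in range(1000)
]
print(log[:5])
```

The exploration rate is the lever that trades experiment cost against how quickly unbiased data accumulates.
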

slide-19
SLIDE 19

Issues

slide-20
SLIDE 20

Machine Learning itself might not be the issue!

Is there latency introduced? Did the systems need to be changed, decoupled or refactored? Issues from systems implications might impact your metrics and should not be attributed to Machine Learning.

You don’t want to compare apples and oranges!

slide-21
SLIDE 21

Example Issues

Data

  • Instrumentation
  • Metrics

System

  • Latency
  • Bugs

Other

  • UX
  • CX
slide-22
SLIDE 22

A/A Test


slide-23
SLIDE 23

Unveiling Issues

Running A/A Tests

  • A: existing system, existing heuristic
  • A*: new system, existing heuristic
      • ML “turned off”
      • Bypassing the ML decision

What to expect?

  • A should be equal to A* on:
      • Operational metrics
      • Business metrics
      • CS metrics
  • If the two A’s perform differently:
      • Trust me, there’s an issue!
      • Time to investigate!
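
For a conversion-style metric, one common way to check whether A and A* really perform the same is a two-proportion z-test. The sketch below is an illustration under assumptions, not the speaker's method: the counts are made up and the 0.05 significance threshold is a conventional choice.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_a_star, n_a_star):
    """Two-sided z-test for a difference between two proportions."""
    p_a = successes_a / n_a
    p_a_star = successes_a_star / n_a_star
    pooled = (successes_a + successes_a_star) / (n_a + n_a_star)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_a_star))
    if std_err == 0:
        return 0.0, 1.0  # both arms are all-success or all-failure
    z = (p_a - p_a_star) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: conversions and users in each arm.
z, p_value = two_proportion_z_test(500, 10_000, 650, 10_000)
if p_value < 0.05:  # threshold is an assumption
    print(f"A and A* differ (z={z:.2f}, p={p_value:.4g}): time to investigate")
else:
    print(f"No detectable difference (z={z:.2f}, p={p_value:.4g})")
```

The same check would be repeated for each operational, business, and CS metric that can be expressed as a rate.
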
slide-24
SLIDE 24

Addressing Issues

In case a discrepancy is found on the A/A Test analysis:

  • Which metric is showing discrepancies?
  • What could have caused it?
  • What is the impact of this discrepancy?

Decide whether to fix it based on the size of its impact.

slide-25
SLIDE 25

A/A/B Test

Run an A/A/B test if you are short on time, but only trust the A/B part once you have validated the A/A part!
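
Running three arms side by side needs a stable way to split traffic. The sketch below shows one possible approach, hashing a user identifier into an arm. The arm labels follow the A and A* naming used for the A/A test, with B assumed to be the new system with ML switched on; the function name and the experiment name are hypothetical.

```python
import hashlib

# A: existing system, A*: new system with ML bypassed,
# B: new system with ML on (assumed meaning of the B arm).
ARMS = ("A", "A*", "B")


def assign_arm(user_id, experiment="aab-test-1"):
    """Deterministically map a user to one of the three arms."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return ARMS[int(digest, 16) % len(ARMS)]


counts = {arm: 0 for arm in ARMS}
for user_id in range(30_000):
    counts[assign_arm(user_id)] += 1
print(counts)
```

Because the assignment depends only on the experiment name and the user identifier, a user stays in the same arm across sessions, and A can be compared with A* before anyone looks at B.
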

slide-26
SLIDE 26

Risks

slide-27
SLIDE 27

Careful about "Squeeze Toys"

Optimizing for metric A might put metric B at risk.

"If you optimize your business to maximize one metric, something important happens. Just like one of those bulging stress-relief squeeze toys, squeezing it in one place makes it bulge out in another."

Quote from the book “Lean Analytics” by Benjamin Yoskovitz and Alistair Croll

slide-28
SLIDE 28

Addressing Risks

Before experimenting

  • Simulate worst case scenarios
  • Simulate random baseline

P.S.: The same goes when collecting randomized data.

After experimenting

  • Calculate experiment costs
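
Simulating the worst case and a random baseline before experimenting can be as simple as replaying historical outcomes under different decision rules. The sketch below assumes labeled historical data is available; the function name, the synthetic labels, and the unit benefit and cost are hypothetical.

```python
import random


def simulate_policy(labels, decide, benefit=1.0, cost=1.0):
    """Replay historical labels under a decision rule and sum the value.

    `labels[i]` is True when acting on case i would have been correct.
    `decide(label)` returns True when the policy acts on that case.
    """
    total = 0.0
    for label in labels:
        if decide(label):
            total += benefit if label else -cost
    return total


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
# Synthetic history: roughly 30% of cases are worth acting on.
labels = [rng.random() < 0.3 for _ in range(10_000)]

random_baseline = simulate_policy(labels, lambda _: rng.random() < 0.5)
worst_case = simulate_policy(labels, lambda label: not label)  # always wrong
best_case = simulate_policy(labels, lambda label: label)  # perfect oracle
print(worst_case, random_baseline, best_case)
```

The gap between the worst case and the random baseline gives a rough upper bound on what an experiment could cost.
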
slide-29
SLIDE 29

Start with simple!

slide-30
SLIDE 30

Illustration from the book "Feature Engineering for Machine Learning" by Alice Zheng and Amanda Casari.


slide-31
SLIDE 31

Quote from the book "Doing Data Science" by Cathy O’Neil and Rachel Schutt. Chapter contributed by Claudia Perlich.

“Doing simple sanity checking to make sure things are what you think they are can sometimes get you much further in the end than web scraping and a big fancy machine learning algorithm. It may not seem cool and sexy, but it’s smart and good practice. People might not invite you to a meetup to talk about it. It may not be publishable research, but at least it’s legitimate and solid work.”

slide-32
SLIDE 32

Iterate!

Illustration from the "Analytics Solutions Unified Method" (ASUM-DM) by IBM

slide-33
SLIDE 33

Iterate!

Addressing the constraints, assumptions, risks and issues.

Illustration from the "Analytics Solutions Unified Method" (ASUM-DM) by IBM

Assumptions, Constraints, Issues, Risks, Assumptions, Constraints, Issues, Risks

slide-34
SLIDE 34

Iterate!

Addressing the constraints, assumptions, risks and issues.

Illustration from the "Analytics Solutions Unified Method" (ASUM-DM) by IBM

Assumptions, Constraints, Issues, Risks, Assumptions, Constraints, Risks

slide-35
SLIDE 35

Iterate!

Addressing the constraints, assumptions, risks and issues.

Illustration from the "Analytics Solutions Unified Method" (ASUM-DM) by IBM

Constraints, Issues, Risks, Assumptions, Constraints, Risks

slide-36
SLIDE 36

Iterate!

Addressing the constraints, assumptions, risks and issues.

Illustration from the "Analytics Solutions Unified Method" (ASUM-DM) by IBM

Issues, Risks, Assumptions, Constraints, Risks

slide-37
SLIDE 37

Iterate!

Addressing the constraints, assumptions, risks and issues.

Illustration from the "Analytics Solutions Unified Method" (ASUM-DM) by IBM

Risks, Assumptions, Constraints, Risks

slide-38
SLIDE 38

Iterate!

Addressing the constraints, assumptions, risks and issues.

Illustration from the "Analytics Solutions Unified Method" (ASUM-DM) by IBM

Risks, Assumptions, Constraints

slide-39
SLIDE 39

Illustration from the paper "Hidden Technical Debt in Machine Learning Systems" by D. Sculley et al. (Google), 2015

ML Systems are complex systems!

slide-40
SLIDE 40

Illustration adapted from the paper "Hidden Technical Debt in Machine Learning Systems" by D. Sculley et al. (Google), 2015

Start with simple!

slide-41
SLIDE 41

Illustration adapted from the paper "Hidden Technical Debt in Machine Learning Systems" by D. Sculley et al. (Google), 2015

Iterate with strategic, proportional investments across the ML stack.

slide-42
SLIDE 42

Illustration adapted from the paper "Hidden Technical Debt in Machine Learning Systems" by D. Sculley et al. (Google), 2015

And so on…

slide-43
SLIDE 43

Aim for the skies!

slide-44
SLIDE 44

What’s the limit of what’s achievable?

Machine Learning is a powerful tool, but it needs buy-in and sponsorship. A big vision is vital for Machine Learning products.

slide-45
SLIDE 45

Questions - cheat sheet

  • What if you had all the levers that you could possibly pull?
  • What if you could optimize all the aspects of the business and user experience?
  • What if you broke it down into multiple Machine Learning products?
  • What if you had all the data you would like to use?
  • What if you had the ideal Machine Learning infrastructure?
  • What if you used the ideal Machine Learning model and approach?
  • What if you had all monitoring in place to quickly catch any issues?
slide-46
SLIDE 46

Vision - cheat sheet

Improve _____ and reduce _____ by _____ the right _____ and _____ with the right _____ and the right _____

Multi-Objective Optimization, Multiple Levers, Multiple ML Products

slide-47
SLIDE 47

Hit half-way there!

slide-48
SLIDE 48

Good enough is better than perfect!

  • You might discover other interesting opportunities for Machine Learning.
  • You might discover other interesting opportunities even without Machine Learning.
  • You might discover there’s a third party service for your domain.
  • Machine Learning is part of the solution, not the whole solution.
slide-49
SLIDE 49

Avoid harms

Try to understand how decisions impact outcomes. Learn more: check out the slides from the tutorial "Algorithmic Bias in Practice" at ACM FAT*2019.

Illustration from “AAAI 2017 Spring Symposium Series - Designing the UX of ML Systems” by Henriette Cramer and Jenn Thom

slide-50
SLIDE 50

Enjoy the journey!

slide-51
SLIDE 51

Have fun!

  • Celebrate the invaluable improvements and learnings brought along the journey:
      • Data, metrics, instrumentation and experimentation
      • Business and domain understanding
      • System design and quality
  • Get ready for even more exciting next steps!
  • Enjoy the journey and don’t forget the bigger picture: customer value!
slide-52
SLIDE 52

Re-cap

slide-53
SLIDE 53

  • Open the Pandora’s Box
  • Start with simple
  • Aim for the skies
  • Hit half-way there
  • Enjoy the journey!

slide-54
SLIDE 54

Obrigada!

dhiana@spotify.com
@dhianadeva on Twitter