WELCOME WHO ARE WE? David Brader Noah Erickson Data Scientist - - PowerPoint PPT Presentation

SLIDE 1
SLIDE 2

WELCOME

SLIDE 3

WHO ARE WE?

David Brader

Director, Process Improvement

Titan America LLC

Noah Erickson

Data Scientist

Titan America LLC

SLIDE 4

WHAT WE LEARNED…

Highway To The Danger Zone

  • the key to staying out of it…

Mediators, Moderators and M2’s Oh My

  • when variables model differently than in the process world

Making Data Science Meaningful In Real-Time

SLIDE 5

EVER NOTICE EVERYONE IS A DATA SCIENTIST?

And they almost invariably have the wrong perspective?

  • Like this one, the typical Senior Manager And Above:
SLIDE 6

CEO & E-SUITE PERSPECTIVE

Data Science

SLIDE 7

SIMPLE RIGHT? …JUST GET THE DATA…

SLIDE 8

STILL GETTING THE DATA…

SLIDE 9

NO SUCH THING AS BAD DATA,

JUST OPPORTUNITIES FOR A LOT OF DATA CLEANING

SLIDE 10

DATA SCIENCE – THE REAL PYRAMID

SLIDE 11

WHY DOMO AND DOMO DATA SCIENCE?

Each Step Is Somewhat Automated Within Its Function But Often Manually Transported Through To The Next Stage

But Being Forward Looking, and In Order To Ensure Proper Human Capital Consumption (i.e. lazy) Our Vision Was….

Standard Playbook: Data Preparation → Analytics (ML) → Visualize

SLIDE 12

DATA SCIENCE CONSUMPTION AS A CONTINUOUS OPERATING PROCESS

  • Data Acquisition and Preparation
  • Low Level Alert and Notification
  • Data Analysis (ML)
  • High Level Alert and Notification
  • Near Real Time Process Parameter Manipulation

At Present Within DOMO:

  • >100 Data Sources
  • 20,300 Unique Columns
  • >381 M Rows
  • Process Data Being Logged Every 10 Minutes at 5 s Frequency

Every Step Happens Automatically As New Data Is Uploaded In Near Real Time

VISUALIZATION

SLIDE 13

WE WANTED TO BE ABLE TO RAPIDLY DEPLOY THAT MODEL

So We Teamed Up With DOMO Data Science – We just needed them to show us how to use the R and Python Tiles in DOMO…

But They Made Us Do All This Other Work:

  • Data Glossary and Definition Sheets
  • Give Them Process Flow Charts (Actual Process, Not Data Flow)
  • Have Meetings…

Why? We Were Already In The Data Science Zone?!?

SLIDE 14

…THEN, A WEEKEND GETS RUINED…

SLIDE 15

SHOW ME….

Turns Out They Really Read That 300+ Page Dissertation I Sent Them!

– We Were Here

THE FUNDAMENTAL THEORETICAL RELATIONSHIP BETWEEN THE PRIMARY PREDICTIVE FEATURE AND THE VALUE WE WERE TRYING TO PREDICT WAS NOT PRESENT IN OUR DATA!?!
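
One way a real relationship can go missing from a dataset is that the process history only covers a narrow operating band. That cause is an assumption made here for illustration and is not stated on this slide. The sketch below uses purely synthetic numbers (the slope, noise level, and band limits are all hypothetical) to show how a strong feature-to-target correlation can nearly vanish when only a thin slice of the feature's range is observed.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic "theory": the target really does depend on the feature.
feature = [rng.uniform(0.0, 10.0) for _ in range(2000)]
target = [2.0 * x + rng.gauss(0.0, 2.0) for x in feature]

# Hypothetical history: the process was only ever run in a narrow band.
narrow = [(x, y) for x, y in zip(feature, target) if 4.5 <= x <= 5.5]
narrow_x = [x for x, _ in narrow]
narrow_y = [y for _, y in narrow]

full_r = pearson_r(feature, target)
narrow_r = pearson_r(narrow_x, narrow_y)
print(f"full range r = {full_r:.2f}, narrow band r = {narrow_r:.2f}")
```

Comparing the two coefficients before modeling is a cheap check that the data actually spans enough variation to show the relationship the theory predicts.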

SLIDE 16

NOAH (& TONY) GOT BACK TO WORK….

Expanded Dataset To Encompass More Variation And Proper Relationships And…

Last Version Of Modeling Gave Us A 33% Improvement In The Baseline R² From Historical Modeling In The Literature!

SLIDE 17

SUCCESS!....?

But we had a model that outperforms those in the literature, so why the "maybe"? Remember, we wanted something that would run inline and in near real time. It turns out the DOMO Data Science process showed us something very different…

SLIDE 18

WE WEREN’T READY FOR THE REAL WORLD…

The bulk of the features that were significant in predicting the target value are ones that we currently ONLY GET AT THE SAME TIME as the value we wanted to predict, which is 2 days after the actual processing.

WE NEED NEW MEANS BY WHICH TO GET THE NECESSARY DATA ANALYZED AND CAPTURED IN REAL TIME!

THIS MEANS CAPEX!

NEW ANALYTICAL EQUIPMENT ON ORDER
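
The availability problem described on this slide can be caught early by tagging each candidate feature with how long after processing its value becomes known, then filtering against the time at which a decision has to be made. The sketch below is a minimal illustration under stated assumptions: the `Feature` class, the feature names, and the 0.5 hour decision window are hypothetical, and only the 48 hour (2 day) lag comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    availability_lag_hours: float  # delay between processing and the value being known


def usable_features(features, decision_lag_hours):
    """Keep only features whose values are known by the time a decision is made."""
    return [f for f in features if f.availability_lag_hours <= decision_lag_hours]


# Hypothetical feature catalog; names are placeholders, not the real plant tags.
catalog = [
    Feature("inline_sensor_reading", 0.0),
    Feature("lab_result_a", 48.0),
    Feature("lab_result_b", 48.0),
]

TARGET_LAG_HOURS = 48.0  # the target value arrives 2 days after processing

# Features that could feed a near-real-time model (assumed 0.5 h decision window).
realtime = usable_features(catalog, decision_lag_hours=0.5)

# Features that arrive no earlier than the target itself, so they cannot predict it in time.
late = [f.name for f in catalog if f.availability_lag_hours >= TARGET_LAG_HOURS]

print("usable in near real time:", [f.name for f in realtime])
print("arrive with the target:", late)
```

Running a filter like this before feature selection makes it obvious which significant predictors would need new instrumentation to be captured in time.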

SLIDE 19

NOW FOR A LOOK UNDER THE HOOD….

SLIDE 20

LESSONS LEARNED

  • The Process Will Set You Free – Or At Least Keep You Out Of Trouble
  • Be Prepared For Features & Variables That Will Behave Far Differently In A Real World Operating Process Than They Do In Controlled Situations
  • Institutionalizing Your Solutions Requires Them To Operate At The Same Frequency As Your Decision Making

SLIDE 21

THANK YOU