Optimizing Physical Assets with Machine Learning (Rajendra Koppula)



SLIDE 1

WWW.MANIFOLD.AI

Optimizing Physical Assets with Machine Learning

Rajendra Koppula

SLIDE 2

About Us

Manifold is a full-service AI development services firm that accelerates AI development for leading companies. Our team has a proven ability to design, build, deploy, and manage data applications at scale.

SLIDE 3

Audience & Agenda

Audience

  • Practitioners with some knowledge of the PyData ecosystem and ML workflows

Slides

www.manifold.ai/2019SensorsExpo

Agenda

  • Introduction & Motivation
  • Design Patterns
  • Conclusion & Key Takeaways
SLIDE 4

Lean AI

  1. Build the simplest end-to-end (E2E) system first.
  2. Iterate as quickly as possible.

SLIDE 5

Case Study

  • Leading industrial services company
  • “We want to use AI to be more efficient across our operations. The vision is to create a system for making better decisions.”

SLIDE 6

Business Understanding Workshop

  • I get paid for uptime. How can I make that higher?
  • Unplanned maintenance costs me a lot of time and money and erodes customer satisfaction. How can I prevent that?
  • I roll trucks every 30 days for preventative maintenance, no matter what. Can I go less often?
  • I have sensors on all these units and I’ve been collecting data for a few years. I want to get more value out of this instrumentation.
  • Many, many more...

What are your business problems (that you think AI can help you with)?

SLIDE 7

AI Uncertainty Principle

AI value ≤ business value × data quality × predictive signal

Multiplicative! If any term goes to 0, value goes to 0!

SLIDE 8

Create an AI Specification

  • Predict major faults, where the machine is continuously down for >2 hours.
  • Predict whether a major fault will happen over a horizon of 1, 2, …, 5 days.
  • Use machine-generated data as input features, e.g., ~30 continuous time series and ~20 discrete time series.
  • Use demographic data about machines, e.g., unit type, location, etc.
  • Do not use human-generated service data because of data quality issues.
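The first two bullets of the specification can be turned into a labeling rule. The sketch below is a minimal illustration, assuming a per-minute boolean fault flag derived from the status register; the array `faulted` and the helpers `major_fault_starts` and `label_at` are hypothetical names, not part of the original pipeline.

```python
import numpy as np

MINUTES_PER_DAY = 24 * 60
MAJOR_FAULT_MINUTES = 2 * 60  # "continuously down for >2 hours"


def major_fault_starts(faulted, min_len=MAJOR_FAULT_MINUTES):
    """Return start indices of faulted runs strictly longer than min_len minutes."""
    padded = np.concatenate(([False], np.asarray(faulted, dtype=bool), [False]))
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)   # first minute of each faulted run
    ends = np.flatnonzero(edges == -1)    # one past the last minute of each run
    return starts[(ends - starts) > min_len]


def label_at(t, starts, horizon_days):
    """1 if a major fault begins in [t, t + horizon), else 0."""
    horizon = horizon_days * MINUTES_PER_DAY
    return int(np.any((starts >= t) & (starts < t + horizon)))


# Synthetic 10-day status trace: one 30-minute fault and one 200-minute fault.
faulted = np.zeros(10 * MINUTES_PER_DAY, dtype=bool)
faulted[100:130] = True      # minor fault, cleared quickly
faulted[5000:5200] = True    # major fault (> 2 hours)

starts = major_fault_starts(faulted)
labels = {h: label_at(3000, starts, h) for h in (1, 2, 3, 4, 5)}
```

Only the 200-minute run counts as a major fault, so the label at minute 3000 is 0 for a 1-day horizon and 1 for horizons of 2 days or more.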

SLIDE 9

Typical ML Workflow

Database / S3 -> Preprocessing -> Feature Engineering -> Modeling -> Model Deployment

SLIDE 10

Lookback = 2 days Horizon = 5 days

Why This Target?

  • Clear business value because the company gets paid for uptime, and there is often a customer call and a truck roll if the machine is in major fault.
  • Acceptable data quality because it is purely machine generated, i.e., we can look at the status register.
  • Defined major as >2 hours continuously in a faulted state. Most lesser faults are automatically or manually cleared before this time.

SLIDE 11

AI Uncertainty Principle

AI value ≤ business value × data quality × predictive signal

De-risked as much as possible. Have to take leap of faith now.

SLIDE 12

Data Engineering is the Foundation

Foundation

source: Monica Rogati

SLIDE 13

Spec the Requirements

The Constants

  • AI/ML is software engineering.
  • You will develop locally.
  • You will develop in the cloud.
  • You will collaborate.
  • You will experiment.
  • You will deploy.

The Variables

  • Volume of data
  • Velocity of data
  • Source of data
  • Important features
  • Downstream integrations
  • Prediction velocity
  • Training velocity
SLIDE 14

Architecting the Solution

The Constants

  • Docker-first ML with Orbyter
      • github.com/manifoldai/orbyter-docker

The Variables

  • Sampling
      • How to generate training and test data?
      • TS data subtleties
  • Architecting for Volume
      • Spark + Dask + HDF5
  • Modeling
      • Trees and Interpretability
  • Feature Engineering
  • Evaluation
  • Deployment
SLIDE 15

What is the ML Problem?

  • 50+ sensors logged @ 1 minute intervals 24/7
  • Pose as supervised learning problem
SLIDE 16

Sampling for Supervised Learning

  • Train a supervised learning algorithm using historical examples. It learns the patterns that occur where there are failures and looks for them in future data.
  • This requires us to pass historical samples in a clean manner by slicing and dicing the time series the way we need.

ETL

X y
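The slicing step can be sketched as follows: for each prediction time, take the preceding lookback window as X and check whether a major fault starts within the horizon to get y. This is a minimal sketch under assumptions; `make_samples`, `anchors`, and `fault_starts` are hypothetical names, and the 2-day lookback and 5-day horizon are taken from the other slides.

```python
import numpy as np

MINUTES_PER_DAY = 24 * 60


def make_samples(series, fault_starts, anchors, lookback_days=2, horizon_days=5):
    """Slice one unit's sensor data into supervised samples.

    series:       (n_sensors, n_minutes) array of 1-minute readings
    fault_starts: minute indices at which major faults begin
    anchors:      candidate prediction times (minute indices)
    Returns X with shape (n_samples, n_sensors, lookback) and y with shape (n_samples,).
    """
    lookback = lookback_days * MINUTES_PER_DAY
    horizon = horizon_days * MINUTES_PER_DAY
    fault_starts = np.asarray(fault_starts)
    X, y = [], []
    for t in anchors:
        if t < lookback or t + horizon > series.shape[1]:
            continue  # skip anchors without a full lookback or a full horizon
        X.append(series[:, t - lookback:t])
        y.append(int(np.any((fault_starts >= t) & (fault_starts < t + horizon))))
    if not X:
        return np.empty((0, series.shape[0], lookback)), np.empty((0,), dtype=int)
    return np.stack(X), np.array(y)


# Synthetic unit: 54 sensors, 12 days of data, one major fault starting at minute 16000.
rng = np.random.default_rng(0)
series = rng.normal(size=(54, 12 * MINUTES_PER_DAY))
X, y = make_samples(series, fault_starts=[16000],
                    anchors=range(0, 12 * MINUTES_PER_DAY, MINUTES_PER_DAY))
```

With one anchor per day, six anchors have both a full lookback and a full horizon, and only the last one has the fault inside its horizon.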

SLIDE 17

Preventing Data Leakage

  • Separate data into a training set and a validation set: 70% training, 30% validation. No data leakage.
  • Prevent overfitting.

Figure labels: 700k samples; 200k samples; X_{700k, 54, 2880}
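The slides do not say how leakage was avoided. One common approach for overlapping time-series windows, shown here purely as an assumption, is a chronological split with a gap of lookback + horizon between the last training anchor and the first validation anchor, so that no validation window shares raw data with a training window or its label period. The helper `time_split` is a hypothetical name.

```python
import numpy as np

MINUTES_PER_DAY = 24 * 60


def time_split(anchor_times, train_frac=0.7, gap=0):
    """Chronological split; validation anchors must be more than `gap` minutes after the cutoff."""
    anchor_times = np.asarray(anchor_times)
    cutoff = np.quantile(anchor_times, train_frac)
    train_idx = np.flatnonzero(anchor_times <= cutoff)
    valid_idx = np.flatnonzero(anchor_times > cutoff + gap)
    return train_idx, valid_idx


# 100 daily anchors; gap = lookback (2 days) + horizon (5 days).
anchor_times = np.arange(100) * MINUTES_PER_DAY
gap = (2 + 5) * MINUTES_PER_DAY
train_idx, valid_idx = time_split(anchor_times, train_frac=0.7, gap=gap)
```

The samples that fall inside the gap are discarded, which is why the validation share ends up below 30% in this example.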

SLIDE 18

Sample Rebalancing and Filtering

  • Failure is a rare event.
  • Many more y=0 samples than y=1 samples. May have to rebalance the training dataset.
  • Invalid sample rejection, e.g., don’t let a fault predict a fault.

Figure annotation: The unit is already significantly faulted at this point. Predicting is not really useful at this point.

Horizon = 5 days Lookback = 2 days
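Both ideas on this slide can be sketched as a single index-selection step: drop samples whose unit is already faulted at prediction time, then downsample the negatives in the training set. This is an illustrative sketch only; the helper name, the `faulted_at_anchor` flag, and the 3:1 negative-to-positive ratio are assumptions, not values from the talk.

```python
import numpy as np


def filter_and_rebalance(y, faulted_at_anchor, neg_per_pos=3, seed=0):
    """Return sorted indices of training samples to keep."""
    y = np.asarray(y)
    valid = ~np.asarray(faulted_at_anchor, dtype=bool)  # reject: fault would predict fault
    pos = np.flatnonzero(valid & (y == 1))
    neg = np.flatnonzero(valid & (y == 0))
    rng = np.random.default_rng(seed)
    n_neg = min(len(neg), neg_per_pos * len(pos))
    neg_kept = rng.choice(neg, size=n_neg, replace=False)  # random downsampling
    return np.sort(np.concatenate([pos, neg_kept]))


# Synthetic labels: 20 positives out of 1000 samples, some already faulted at the anchor.
y = np.zeros(1000, dtype=int)
y[:20] = 1
faulted_at_anchor = np.zeros(1000, dtype=bool)
faulted_at_anchor[:5] = True       # 5 positives rejected
faulted_at_anchor[100:150] = True  # 50 negatives rejected

keep = filter_and_rebalance(y, faulted_at_anchor)
```

Rebalancing applies to the training set only; the validation set should keep its natural class balance so that metrics reflect deployment conditions.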

SLIDE 19

Feature Engineering Workshop

Desired output = prioritized list of features

  • Need the domain experts in the room, i.e., mechanical engineers, head of maintenance, SW engineering
  • Feature engineering is the main way you are encoding their domain knowledge
  • Must trade off predictive power with engineering complexity
SLIDE 20

Feature Engineering

  • Continuous Time Series Features
      • Mean over lookback
      • Variance over lookback
      • Fourier Transform
      • Trend over lookback
  • Discrete Time Series Features
      • State counts over lookback
  • Demographic Features
      • One hot encoded

Feature Matrix

Collapse the time dimension: X_{700k, 54, 2880} -> F_{700k, 54, N}
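Collapsing the time dimension can be sketched with plain NumPy. The functions below are a minimal illustration of the listed feature families, not the original implementation; the function names, the least-squares slope as the "trend", and the choice of the three lowest non-DC Fourier bins are assumptions.

```python
import numpy as np


def continuous_features(X, n_fft_bins=3):
    """X: (n_samples, n_sensors, n_steps) -> F: (n_samples, n_sensors, 3 + n_fft_bins)."""
    n_steps = X.shape[-1]
    mean = X.mean(axis=-1)
    var = X.var(axis=-1)
    # Trend: least-squares slope of each series against a centered time index.
    t = np.arange(n_steps) - (n_steps - 1) / 2.0
    slope = (X - mean[..., None]) @ t / (t @ t)
    # Fourier transform: magnitudes of the lowest non-DC frequency bins.
    spectrum = np.abs(np.fft.rfft(X, axis=-1))[..., 1:1 + n_fft_bins]
    return np.concatenate([np.stack([mean, var, slope], axis=-1), spectrum], axis=-1)


def state_counts(S, n_states):
    """S: integer states (n_samples, n_series, n_steps) -> counts (n_samples, n_series, n_states)."""
    return np.stack([(S == k).sum(axis=-1) for k in range(n_states)], axis=-1)


# Small synthetic example with a 2-day lookback (2880 one-minute steps).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3, 2880))
X[0, 0] = 2.0 + 0.5 * np.arange(2880)   # known slope of 0.5 per minute
F = continuous_features(X)

S = np.zeros((4, 2, 2880), dtype=int)
S[0, 0, :100] = 2                        # 100 minutes spent in state 2
C = state_counts(S, n_states=3)
```

Each sample goes from 2880 time steps per sensor to a handful of summary values, which is the shape change from X to F described above.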

SLIDE 21

Architect for Volume

  • Ingest is optimized for throughput and high availability
  • Data from an asset is spread across many files in S3
      • Varying sizes
      • Different time periods
  • Sampler pipeline works well if all the data from an asset is in one contiguous file => use Spark to gather, massage, and transform

SLIDE 22

Tools in the Pipeline

  • A (very) high-level picture of the pipeline
  • Spark for ETL
  • Dask for Feature Engineering
  • HDF5 as storage engine
SLIDE 23

Dask: Out of Core

  • Create a Dask array from an HDF5 dataset
  • 250 GB of data on disk
  • Pass the 3D Dask array to the feature engineering step
SLIDE 24

Dask: Parallelism

  • Build the series of features
  • All compute is delayed until .compute() is called
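The code shown on these slides did not survive extraction, so the following is only a sketch of the pattern they describe. It assumes `dask.array` is installed and uses an in-memory NumPy array as a stand-in for the HDF5 dataset; in the described pipeline an on-disk dataset object would be passed to `da.from_array` instead. The chunk size is an arbitrary choice.

```python
import dask.array as da
import numpy as np

# Stand-in for the on-disk data (samples x sensors x time steps).
raw = np.random.default_rng(0).normal(size=(32, 54, 2880)).astype(np.float32)

# Chunk along the sample axis so each task sees whole time series.
X = da.from_array(raw, chunks=(8, 54, 2880))

# Build the series of features. Nothing is computed yet: these are lazy task graphs.
mean = X.mean(axis=2)
var = X.var(axis=2)
features = da.stack([mean, var], axis=2)

# The work happens here, chunk by chunk and in parallel where possible.
F = features.compute()
```

Because only one chunk per worker needs to be in memory at a time, the same code can run over data that is much larger than RAM.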
SLIDE 25

Dask: Parallelism

  • Another example of feature engineering
  • Build a lot of histograms
SLIDE 26

SLIDE 27

The Fun Stuff

The fun stuff

source: Monica Rogati

SLIDE 28

Create a Baseline Model

  • classification > regression
      • class errors are easier to understand and learn from
      • even for continuous targets, you may want to do a binary (or multiclass) classifier before regression
  • random forest > gradient boosted trees > deep learning
      • few parameters to tune, robust to overfitting, quick to train
      • interpretable feature importance to learn from
  • pick a few features to start, then create more features

It’s all about learning! Then iterate, iterate, iterate.
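A baseline along these lines might look like the sketch below. It assumes scikit-learn is available and uses synthetic data in place of the engineered feature matrix; the hyperparameters and the use of `class_weight="balanced"` are illustrative choices, not settings from the talk.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for a few engineered features; only feature 0 carries signal.
rng = np.random.default_rng(0)
n = 600
F = rng.normal(size=(n, 5))
y = (F[:, 0] + 0.5 * rng.normal(size=n) > 1.0).astype(int)  # minority positive class

model = RandomForestClassifier(n_estimators=100, class_weight="balanced", random_state=0)

# Cross-validated ROC AUC: the single score to improve across iterations.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc = cross_val_score(model, F, y, cv=cv, scoring="roc_auc")

# Feature importances give a first hint about which features to build on.
model.fit(F, y)
ranking = np.argsort(model.feature_importances_)[::-1]
```

Shuffled folds are acceptable here only because the synthetic rows are independent; for overlapping time-series samples the folds would need to respect time order to avoid leakage.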

SLIDE 29

Evaluate to Learn

  • Aggregate Metrics
      • Cross-validated ROC and AUC = your score to improve by iterative modelling
      • Feature importance done properly
  • Individual Metrics (Sample-level)
      • Prediction probability distribution
      • “Four corners and the middle analysis”
          • most accurate negatives
          • most accurate positives
          • least accurate negatives
          • least accurate positives
          • least certain estimates
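The "four corners and the middle" analysis can be sketched as a selection of sample indices from the validation labels and predicted probabilities. The function name, the dictionary keys, and the use of distance from 0.5 as the measure of uncertainty are assumptions made for illustration.

```python
import numpy as np


def four_corners_and_middle(y_true, p, k=5):
    """Return indices of the k samples in each of the five groups to inspect by hand."""
    y_true = np.asarray(y_true)
    p = np.asarray(p, dtype=float)
    neg = np.flatnonzero(y_true == 0)
    pos = np.flatnonzero(y_true == 1)
    return {
        "most_accurate_negatives": neg[np.argsort(p[neg])[:k]],    # y=0, lowest p
        "most_accurate_positives": pos[np.argsort(-p[pos])[:k]],   # y=1, highest p
        "least_accurate_negatives": neg[np.argsort(-p[neg])[:k]],  # y=0, highest p
        "least_accurate_positives": pos[np.argsort(p[pos])[:k]],   # y=1, lowest p
        "least_certain": np.argsort(np.abs(p - 0.5))[:k],          # p closest to 0.5
    }


y_true = [0, 0, 0, 1, 1, 1]
p = [0.05, 0.40, 0.90, 0.95, 0.55, 0.10]
groups = four_corners_and_middle(y_true, p, k=1)
```

Looking at the raw time series behind each group is usually where new feature ideas and labeling problems show up.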
SLIDE 30

Iterate the Baseline Model

Diagram labels: X_{700k, 54, 2880}; Feature Engineering; Feature Matrix F_{700k, 54, N}; Tree Methods (RF and GBT); Deep Learning (CNNs); Model Evaluation; Model to Deploy

SLIDE 31

SLIDE 32

User Feedback Working Sessions

  • Multiple structured sessions with final end users. In our case they were mechanical engineers and maintenance leads.

  • Prototype tooling, e.g., nothing, Excel, Jupyter notebooks.
  • Observe their workflow and how they integrate predictions.
SLIDE 33

Not as Simple as Looking at Predictions

  • Most units with a high probability of fault are known stressed units
  • Most are in basins where line pressure is high

Example “Stressed” Unit

SLIDE 34

Prediction Filtering

  • Rules on historical predictions to find “interesting events”
  • Different filters for different use cases
      • Absolute probability => stressed units
      • % prob change => “surprising” daily changes
  • Tune rules to appropriate place
      • Currently tuned to have low false positives
      • Look for a few things and find them accurately; status quo for the rest.

current probability of failure: .62
average probability over past 3 days: .44
42% increase in chance of failure
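The two filter types can be sketched as simple rules over a history of daily predictions. The function name and both thresholds below are hypothetical and would need tuning as the slide describes. With the rounded values shown on the slide (.62 against a 3-day average of .44) this formula gives roughly a 41% increase; the slide's 42% was presumably computed from unrounded probabilities.

```python
import numpy as np


def flag_events(prob_history, abs_threshold=0.8, rel_threshold=0.4, window=3):
    """prob_history: (n_units, n_days) daily fault probabilities, most recent day last."""
    prob_history = np.asarray(prob_history, dtype=float)
    current = prob_history[:, -1]
    baseline = prob_history[:, -1 - window:-1].mean(axis=1)  # average of the prior days
    rel_change = (current - baseline) / np.maximum(baseline, 1e-9)
    return {
        "stressed": current >= abs_threshold,       # absolute probability rule
        "surprising": rel_change >= rel_threshold,  # relative daily change rule
        "rel_change": rel_change,
    }


history = [
    [0.44, 0.44, 0.44, 0.62],  # jump relative to the 3-day average
    [0.85, 0.86, 0.84, 0.85],  # consistently high: a known stressed unit
    [0.10, 0.11, 0.10, 0.10],  # quiet unit
]
flags = flag_events(history)
```

Raising either threshold trades missed events for fewer false positives, which matches the stated preference for finding a few things accurately.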

SLIDE 35

Need Diagnostics to be Actionable

  • AI analyzes 70+ parameters to predict probability of failure
  • Human spends 10+ minutes looking at the data and may not be able to see what the AI sees

  • Triage needs to be directed to within that parameter set
  • Need explainable AI to point user in right direction

“Can you tell me where to look?”

SLIDE 36

Tree Interpreter

  • Identify which sensors are driving the increased probability of failure
      • Absolute
      • Daily Change
  • This is a good starting point for the team to look for the causation

Figure panels: Today’s Contributions; Daily Change in Contribution
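The idea behind per-feature contributions for tree models can be sketched by walking each tree's decision path and attributing the change in predicted probability at every split to the feature used there. The code below is a hand-rolled illustration built on scikit-learn's fitted tree arrays, not the API of any particular interpreter package and not necessarily the tooling used in the talk; the function names and data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def tree_contributions(tree, x):
    """Decompose one tree's P(class 1) for sample x into bias + per-feature contributions."""
    t = tree.tree_
    values = t.value[:, 0, :]
    probs = values / values.sum(axis=1, keepdims=True)  # class probabilities at every node
    x = np.asarray(x, dtype=np.float32)                 # scikit-learn trees compare float32 inputs
    contrib = np.zeros(x.shape[0])
    node = 0
    while t.children_left[node] != -1:                  # -1 marks a leaf
        f = t.feature[node]
        if x[f] <= t.threshold[node]:
            nxt = t.children_left[node]
        else:
            nxt = t.children_right[node]
        contrib[f] += probs[nxt, 1] - probs[node, 1]    # credit the split feature
        node = nxt
    return probs[0, 1], contrib


def forest_contributions(forest, x):
    """Average the per-tree decompositions over the forest."""
    parts = [tree_contributions(est, x) for est in forest.estimators_]
    bias = float(np.mean([b for b, _ in parts]))
    contrib = np.mean([c for _, c in parts], axis=0)
    return bias, contrib


# Synthetic data where only feature 2 drives the label.
rng = np.random.default_rng(0)
F = rng.normal(size=(400, 4))
y = (F[:, 2] > 0.5).astype(int)
rf = RandomForestClassifier(n_estimators=25, random_state=0).fit(F, y)

today = np.array([0.0, 0.0, 1.5, 0.0])
yesterday = np.array([0.0, 0.0, -0.5, 0.0])
bias, c_today = forest_contributions(rf, today)
_, c_yesterday = forest_contributions(rf, yesterday)
daily_change = c_today - c_yesterday   # which features moved the prediction since yesterday
```

The bias plus the sum of contributions reproduces the forest's predicted probability, so the absolute and daily-change views both add up to numbers the user already sees.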

SLIDE 37

Deliver Workflow Tools, Not Models

  • The raw predictions almost always need post-processing before they are useful.
  • It is our job as AI engineers to create workflow tools that help users derive value from the AI.

“Build the UI for the AI”

SLIDE 38

Thank You

www.manifold.ai/2019SensorsExpo

Rajendra Koppula