From Whiteboard to Production System Demand Forecasting System for an Online Grocery Retailer
Robert Pesch & Robin Senge Strata Data Conference NYC, September 2019
Rewe and inovex drive big data and data science initiatives in order to
Trainings, Big Data Platform, Big Data Applications, Data Science & AI
1. Go to the online shop or mobile app
2. Fill your basket
3. Select a future delivery slot
4. Shop checks availability
5. Receive your purchase
Product availability plays a key role in the success of the business case.
Figure: Reasons for unavailability of not available articles (several possible per case): inaccurate predictions, central logistic problems, unexpected spoilage, unexpected inventory correction. Values shown: 20 %, 7 %, 17 %, 80 %, 100.0 %.
Figure: Not available articles vs. availability requests.
Whiteboard Production System
(at least ‘complex’ in terms of Cynefin framework)
usefulness of models
success
Business problem definition, Deployment, Analyse and visualize, Modelling and programming, Evaluation, Data gathering, Data preparation
Iterative, agile process model, e.g. Scrum
Supply Chain Process Owner, Data Scientist, Software Engineer, Big Data Platform Engineer
among products
products
“No free lunch” theorem: there will not be a single best model
Autoregressive Model, Exponential Smoothing, ..., (S)ARIMA(X), Prophet, ...
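As an illustration of the classical time-series family listed above, here is a minimal sketch of simple exponential smoothing in Python. The function name, the smoothing factor, and the sample sales figures are assumptions made for illustration; they are not taken from the talk.

```python
from typing import Sequence


def exponential_smoothing_forecast(history: Sequence[float], alpha: float = 0.3) -> float:
    """One-step-ahead forecast using simple exponential smoothing."""
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # initialise the level with the first observation
    for observed in history[1:]:
        # New level: weighted mix of the latest observation and the old level.
        level = alpha * observed + (1.0 - alpha) * level
    return level


# Hypothetical daily sales of one article (illustrative numbers only).
daily_sales = [114, 111, 121, 118, 109]
next_day_forecast = exponential_smoothing_forecast(daily_sales, alpha=0.5)
```

A larger `alpha` reacts faster to recent changes, a smaller one smooths more strongly.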
ID     Date        Price  ...  Amount
00001  2017-03-31  2.69   ...  114
00002  2017-03-31  0.49   ...  111
..     ..          ..
99999  2018-03-31  1.79   ...  121
Transform and define features, then train model
information
averages
location, ...
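The step of transforming a sales table into features for supervised learning could be sketched as follows. Which features the system actually uses is not recoverable from the slides; the weekday, lag, and rolling-average columns, the function name, and the window size are illustrative assumptions.

```python
from datetime import date
from statistics import mean
from typing import Dict, List, Tuple


def build_features(
    sales: List[Tuple[date, float]], window: int = 3
) -> List[Dict[str, float]]:
    """Turn a chronologically sorted (day, amount) series into feature rows."""
    rows = []
    for i in range(window, len(sales)):
        day, amount = sales[i]
        previous = [a for _, a in sales[i - window:i]]
        rows.append({
            "weekday": day.weekday(),        # calendar feature, Monday = 0
            "lag_1": previous[-1],           # amount sold on the previous day
            "rolling_mean": mean(previous),  # average over the last `window` days
            "target": amount,                # value the model should predict
        })
    return rows


# Hypothetical sales history of one article (illustrative numbers only).
sales_history = [
    (date(2017, 3, 27), 100),
    (date(2017, 3, 28), 110),
    (date(2017, 3, 29), 120),
    (date(2017, 3, 30), 114),
    (date(2017, 3, 31), 111),
]
feature_rows = build_features(sales_history, window=3)
```

Each row only uses information available before the day it predicts, which avoids leaking the target into the features.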
Easy to interpret; limited extrapolation capacity
Easy to interpret; limited expressiveness (linear dependencies)
Already strong out of the box; not much data preparation necessary; limited extrapolation capacity; prone to outliers; “blind spots”
Potentially strong model class; specialized topologies (LSTM); needs lots of computing power; high engineering effort
Ensemble effect helps combine the strengths of different models; high effort to support many models
using supervised learning, under review
not scale
using error metrics
changes
fallback model
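The fragments above mention choosing models using error metrics and keeping a fallback model. A minimal sketch of how such a selection could look, assuming a backtest on the most recent observations: the function names, the mean absolute error metric, the `max_error` threshold, and the two toy forecasters are hypothetical and not taken from the talk.

```python
from typing import Callable, Dict, Sequence

Forecaster = Callable[[Sequence[float]], float]


def backtest_mae(model: Forecaster, series: Sequence[float], holdout: int) -> float:
    """Mean absolute error of one-step-ahead forecasts on the last `holdout` points."""
    start = len(series) - holdout
    errors = [abs(series[t] - model(series[:t])) for t in range(start, len(series))]
    return sum(errors) / len(errors)


def select_model(
    candidates: Dict[str, Forecaster],
    series: Sequence[float],
    holdout: int = 3,
    max_error: float = float("inf"),
    fallback: str = "fallback",
) -> str:
    """Return the candidate with the lowest backtest error, or the fallback name."""
    if holdout < 1:
        raise ValueError("holdout must be at least 1")
    if not candidates or len(series) <= holdout:
        return fallback  # not enough history to evaluate any candidate
    scores = {name: backtest_mae(m, series, holdout) for name, m in candidates.items()}
    best = min(scores, key=scores.get)
    # Distrust even the best candidate if its error exceeds the threshold.
    return best if scores[best] <= max_error else fallback


# Two toy forecasters (assumptions for illustration).
candidates = {
    "last_value": lambda history: history[-1],
    "mean": lambda history: sum(history) / len(history),
}
trending_sales = [10, 12, 14, 16, 18, 20]
chosen = select_model(candidates, trending_sales)
```

Running the selection per article keeps a single weak model from degrading all forecasts, at the cost of maintaining several candidates.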
Sculley et al., Hidden Technical Debt in Machine Learning Systems, NIPS, 2015
Complexity
Components: Feature-Generator, Trainer & Evaluator, Prediction-Workflow, Outlier-Detection, Data collector, Model-Selector, Blacklist & Importer, Simulator
Data flows: Features, Raw data, Predictions, Filtered predictions, Results, Selection, Cleaned data, KPI simulations, Blacklist
Schedules: Runs once per day, Runs once per month
Infrastructure Data Processing Frameworks Languages Machine Learning Frontend and Monitoring
evaluation
used directly
Monitoring components for errors and failures is the standard. Add the following:
plausibility
– quantile filters
– rules
– proximity-based methods
– imputation
– removal of data points
– skip features
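One way to combine a quantile filter with two of the handling options listed above (imputation and removal of data points) is sketched below. The percentile bounds, the median imputation, and all names are assumptions for illustration; the slides do not state which thresholds or imputation rule the system uses.

```python
from statistics import median, quantiles
from typing import List, Sequence, Tuple


def quantile_bounds(
    values: Sequence[float], lower_pct: int = 5, upper_pct: int = 95
) -> Tuple[float, float]:
    """Return the lower and upper percentile bounds of the sample."""
    cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def flag_outliers(values: Sequence[float]) -> List[bool]:
    """Mark values outside the percentile bounds as outliers."""
    low, high = quantile_bounds(values)
    return [v < low or v > high for v in values]


def impute_outliers(values: Sequence[float], flags: Sequence[bool]) -> List[float]:
    """Replace flagged values with the median of the unflagged ones."""
    clean_median = median(v for v, is_out in zip(values, flags) if not is_out)
    return [clean_median if is_out else v for v, is_out in zip(values, flags)]


def remove_outliers(values: Sequence[float], flags: Sequence[bool]) -> List[float]:
    """Drop flagged values entirely."""
    return [v for v, is_out in zip(values, flags) if not is_out]


# Hypothetical daily amounts with one implausible spike.
amounts = [10, 11, 12, 11, 10, 12, 11, 500]
flags = flag_outliers(amounts)
imputed = impute_outliers(amounts, flags)
removed = remove_outliers(amounts, flags)
```

On a sample this small a percentile filter nearly always flags the largest value, so in practice the bounds would be estimated on a longer history than the data being checked.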
predictions is not feasible
○ as yet hidden programming bugs
○ unstable models
○ broken assumptions
○ blind spots
strange
99% 50% 1%
value assuming symmetric costs
asymmetrical costs and a non-trivial cost-function
(esp. spoilage)
target quantile
Distributional Regression for Demand Forecasting in e-Grocery, https://ssrn.com/abstract=3312609
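The link between asymmetric costs and a target quantile can be illustrated with the classical newsvendor critical ratio, q = c_under / (c_under + c_over). This is a textbook simplification used here as an assumption; the slides mention a non-trivial cost function whose exact form is not given. The cost values, demand samples, and function names below are hypothetical.

```python
import math
from typing import Sequence


def target_quantile(underage_cost: float, overage_cost: float) -> float:
    """Newsvendor critical ratio: the demand quantile that minimises expected cost."""
    return underage_cost / (underage_cost + overage_cost)


def empirical_quantile(samples: Sequence[float], q: float) -> float:
    """Smallest sample value with at least a share q of the samples at or below it."""
    ordered = sorted(samples)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


def pinball_loss(actual: float, predicted: float, q: float) -> float:
    """Quantile loss: under-prediction is weighted by q, over-prediction by 1 - q."""
    diff = actual - predicted
    return q * diff if diff >= 0 else (q - 1.0) * diff


# Assumed costs: a missing article costs three times as much as a spoiled one.
q = target_quantile(underage_cost=3.0, overage_cost=1.0)
# Hypothetical samples from a predicted demand distribution.
order_quantity = empirical_quantile([90, 100, 110, 120], q)
```

With symmetric costs the ratio is 0.5 and the target is the median; higher spoilage costs push the target quantile down, higher unavailability costs push it up.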
NAV SP
inovex GmbH
Ludwig-Erhard-Allee 6
76131 Karlsruhe
GERMANY
robert.pesch@inovex.de
robin.senge@inovex.de