Humans and Algorithms: Creation and Measurement of Economic Value in - - PowerPoint PPT Presentation
Humans and Algorithms: Creation and Measurement of Economic Value in Demand Forecasting
Peter Kauf, PrognosiX AG; Thomas Ott, IAS, ZHAW
Long shelf life: low costs, low margins
Short shelf life: high costs, high margins
Why forecasting?
Food waste: 0.7% - 3% of turnover = 56.3 bn CHF p.a. loss (NWS-Europe)
Stock-out: 1% - 2.3% of turnover = 55.9 bn CHF p.a. lost turnover (NWS-Europe)
We join the forces of algorithms and people
Comprehensive Forecasting
PrognosiX AG is a spin-off from the IAS Institute of Applied Simulation of ZHAW
Institute of Applied Simulation (IAS), ZHAW Zurich University of Applied Sciences
- Bio-Inspired Modeling & Learning Systems
- Predictive Analytics
- Biomedical Simulation
- Applied Computational Genomics
- Simulation & Optimisation
- Knowledge Engineering
IAS:
- 6 research groups
- about 40 people
CTI project
- Denner: distribution center
- Migros Zürich, Fruits & Vegetables: distribution center
- Bischofszell Nahrungsmittel: production planning
- Inform Software (Aachen, D): demand planning add*ONE (Denner)
- Zürcher Hochschule für angewandte Wissenschaften: algorithms, interface and usability concepts
- PrognosiX AG: software development, commercialization
Application Embedding Development
Challenge
[Chart: weekly sales]
Learning algorithms
[Chart: weekly sales]
Add economic feedback
[Diagram: sales data, external drivers, human overrides and human expertise feed a library of algorithms; forecasts determine stock-out, food waste and storage costs; the resulting economic value feeds back through error metrics]
Simple logic?
Better forecasts -> reduced leftovers / stock-outs -> cost reduction
=> just pick the best forecasting method/algorithm
How to choose the best algorithm? => Measures of forecast accuracy
The goal of good forecasting is to minimize the forecasting errors
  e_t = F_t - X_t,   (1)
where X_t is the actual demand at time t and F_t is the respective forecast.
=> How to quantify/evaluate the errors?
N.B. For now we assume that both X_t and F_t are available.
Measures of forecast accuracy
Overview:
- Standard accuracy measures / error metrics
- Advanced cost-based error metrics and sensitivity analysis
- Stock-keeping models
Measures of forecast accuracy
1. Scale-dependent metrics
The most popular measures are the mean absolute error (MAE)
  MAE(n) = (1/n) · Σ_{t=1}^{n} |e_t|   (2)
and the root mean square error (RMSE)
  RMSE(n) = sqrt( (1/n) · Σ_{t=1}^{n} e_t² )   (3)
Here and in the following we assume that the forecasting series is evaluated over a period t = 1, …, n.
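The two scale-dependent metrics are simple to compute; a minimal Python sketch (function names are mine, not from the slides), with errors e_t = F_t - X_t:

```python
import math

def mae(errors):
    """Mean absolute error, Eq. (2): (1/n) * sum of |e_t|."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean square error, Eq. (3): sqrt of (1/n) * sum of e_t^2."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Example: errors e_t = F_t - X_t (forecast minus actual demand)
errors = [2.0, -1.0, 3.0, 0.0]
print(mae(errors))   # 1.5
print(rmse(errors))  # ~1.871
```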
How to choose the best algorithm? => Measures of forecast accuracy
2. Percentage error metrics aim at scale-independence, e.g. the widely used mean absolute percentage error (MAPE)
  MAPE(n) = (1/n) · Σ_{t=1}^{n} |e_t / X_t|   (4)
Measures of forecast accuracy
3. Relative error metrics compare the errors of the forecast with the errors of some benchmark forecasting method. One measure used in this context is the relative mean absolute error (RelMAE), defined as
  RelMAE(n) = (1/n) · Σ_{t=1}^{n} |e_t| / |X_t - X_{t-1}|   (5)
Measures of forecast accuracy
4. Scale-free error metrics have been introduced to counteract the problem of zeros in the denominator. The mean absolute scaled error (MASE) introduces a scaling by means of the MAE of the naïve forecast:
  MASE(n) = (1/n) · Σ_{t=1}^{n} |e_t| / ( (1/(n-1)) · Σ_{j=2}^{n} |X_j - X_{j-1}| )   (6)
MASE (Hyndman and Koehler, 2006)
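Sketches of the three scale-related metrics of Eqs. (4)-(6), assuming the series contains no zeros where they would break the denominators (naming is mine, not from the slides):

```python
def mape(actual, forecast):
    """MAPE, Eq. (4): mean of |e_t / X_t|; undefined if any X_t is 0."""
    return sum(abs((f - x) / x) for x, f in zip(actual, forecast)) / len(actual)

def relmae(actual, forecast):
    """RelMAE, Eq. (5): errors scaled step-wise by the naive change |X_t - X_{t-1}|,
    averaged over the n-1 steps where that change is defined."""
    terms = [abs(f - x) / abs(x - x_prev)
             for x_prev, x, f in zip(actual, actual[1:], forecast[1:])]
    return sum(terms) / len(terms)

def mase(actual, forecast):
    """MASE, Eq. (6): MAE scaled by the in-sample MAE of the naive forecast."""
    n = len(actual)
    naive_mae = sum(abs(x - x_prev) for x_prev, x in zip(actual, actual[1:])) / (n - 1)
    fc_mae = sum(abs(f - x) for x, f in zip(actual, forecast)) / n
    return fc_mae / naive_mae
```

Called on a demand series `actual` and a forecast series of the same length; values below 1 for MASE mean the forecast beats the naïve one-step model.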
Measures of forecast accuracy
All measures come with advantages and disadvantages. If we just want to know which method is best, does it actually matter which metric we use?

Class                    | Advantage (e.g.)                        | Disadvantage (e.g.)
Scale-dependent metrics  | rather simple                           | no comparison across different time series
Percentage error metrics | comparison across different time series | problems with small values / zeros in the denominator
Relative error metrics   | comparison across different time series | problems with small values / zeros in the denominator
Scale-free error metrics | no problems with small errors           | interpretation of economic significance?
Choosing the error metric
Yes, it matters sometimes!
Example: Sales sequence and two different forecasts for a convenience food product (both forecasting models based on regression trees)
Choosing the error metric
Which model should be chosen?
=> No coherent answer: peak model? Baseline model? Naïve model?
What model to choose? => What metric to choose? How to decide?
Reasons for the differences?
- «Toy» example: sales sequence (blue) with five disruptive peaks. A perfect baseline model (red) that misses the peaks, and a perfect peak model (black) which is slightly shifted in between peaks.
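A minimal numeric version of this kind of setup (numbers invented for illustration) shows the disagreement between metrics directly:

```python
actual   = [10, 10, 10, 50, 10, 10, 50, 10]   # sales series with disruptive peaks
baseline = [10, 10, 10, 10, 10, 10, 10, 10]   # perfect between peaks, misses the peaks
peak     = [13, 13, 13, 50, 13, 13, 50, 13]   # hits the peaks, slightly off elsewhere

def mae(x, f):
    return sum(abs(fi - xi) for xi, fi in zip(x, f)) / len(x)

def mape(x, f):
    return sum(abs(fi - xi) / xi for xi, fi in zip(x, f)) / len(x)

print(mae(actual, baseline), mae(actual, peak))    # 10.0 vs 2.25: MAE prefers the peak model
print(mape(actual, baseline), mape(actual, peak))  # 0.2 vs 0.225: MAPE prefers the baseline
```

The large absolute errors at the peaks dominate MAE, while MAPE divides them by the large peak values, so the two metrics rank the models differently.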
Reasons for the differences?
- «Toy» example:
- MAE/RMSE seem to put a heavier penalty on single high peaks than MAPE/RelMAE => they favour the peak model over the baseline model
- Why so?
We will see later
Economic significance of forecasting error
- The examples show an incoherent picture with regard to error metrics (which is also not remedied by the many alternatives that have been proposed in the literature)
- How to resolve the situation?
=> The actual core question is: «What is the economic significance of the forecasts?», i.e. «what are the consequences in terms of costs that come along with the forecasting errors?»
Cost-based error metrics
- Costs are product-specific and market-specific
- Real costs depend on many factors such as the stock-keeping process
- Simplest assumptions:
  – forecast errors and costs are in direct relation
  – costs do not depend on the history
- Example «ultra-fresh products»:
  – e_t > 0 => forecast too high => food-waste cost
  – e_t < 0 => forecast too low => stock-out cost
  c( (X_t, F_t), (X_{t-1}, F_{t-1}), (X_{t-2}, F_{t-2}), … ) = c(e_t)   (7)
Cost-based error metrics
- Generalised mean cost error MCE (ansatz):
  MCE(n) = s( (1/n) · Σ_{t=1}^{n} c(e_t) )   (8)
where c(·) is a cost function and s(·) is a scaling function.
- MAE and RMSE are special instances: c(e) = |e| with s the identity gives the MAE; c(e) = e² with s(v) = sqrt(v) gives the RMSE.
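Eq. (8) is naturally a higher-order function: the cost function c and the scaling s are passed in. A sketch (not the authors' implementation) that recovers MAE and RMSE as special cases:

```python
import math

def mce(errors, cost, scale=lambda v: v):
    """Generalised mean cost error, Eq. (8): s( (1/n) * sum of c(e_t) )."""
    return scale(sum(cost(e) for e in errors) / len(errors))

errors = [2.0, -1.0, 3.0, 0.0]
print(mce(errors, abs))                          # MAE: c = |.|, s = identity -> 1.5
print(mce(errors, lambda e: e * e, math.sqrt))   # RMSE: c = square, s = sqrt -> ~1.871
```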
Cost-based error metrics
- Linear MCE: neglect economies of scale and assume proportionality:
  linMCE(n) = (1/n) · ( a · Σ_{e_t>0} e_t  -  b · Σ_{e_t<0} e_t )   (9)
  a: cost per item for e_t > 0 (cost per unsold item => food waste, storage)
  b: cost per item for e_t < 0 (stock-out cost => non-realised profit)
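Under these assumptions (a = food-waste/storage cost per unsold item, b = stock-out cost per missed item), the linear MCE can be sketched as:

```python
def lin_mce(errors, a, b):
    """Linear MCE: a * (sum of positive errors) + b * |sum of negative errors|, per period."""
    pos = sum(e for e in errors if e > 0)   # over-forecast -> unsold items
    neg = -sum(e for e in errors if e < 0)  # under-forecast -> missed sales
    return (a * pos + b * neg) / len(errors)

# Invented example: over-forecasts of 2 and 3 items, one under-forecast of 1 item
print(lin_mce([2.0, -1.0, 3.0, 0.0], a=1.0, b=2.0))  # (1*5 + 2*1)/4 = 1.75
```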
Linear MCE: Sensitivity analysis
- For the example of «ultra-fresh products»: linMCE expresses the cost due to food waste and stock-out that results from forecasting errors
- In practice, it might be difficult to specify a and b exactly for each product => make an estimate and perform a sensitivity analysis for a model comparison based on the ratio x = a/b
[Table: price of sale, product base price, a: food-waste cost, b: stock-out cost, x = a/b]
Linear MCE: Sensitivity analysis
Direct comparison of two forecasting models: use the ratio of linMCE values:

  f(x) = linMCE_M1 / linMCE_M2
       = ( a · linMCE⁺_M1 - b · linMCE⁻_M1 ) / ( a · linMCE⁺_M2 - b · linMCE⁻_M2 )
       = ( x · linMCE⁺_M1 - linMCE⁻_M1 ) / ( x · linMCE⁺_M2 - linMCE⁻_M2 )   (10)

  linMCE⁺_Mi: sum of all positive errors for model Mi
  linMCE⁻_Mi: sum of all negative errors for model Mi
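The model comparison of Eq. (10) only needs the per-model sums of positive and negative errors; a sketch with invented error series (f(x) < 1 means M1 is the cheaper model):

```python
def pos_neg_sums(errors):
    """linMCE+ (sum of positive errors) and |linMCE-| (magnitude of the negative errors)."""
    pos = sum(e for e in errors if e > 0)
    neg = -sum(e for e in errors if e < 0)
    return pos, neg

def linmce_ratio(errors_m1, errors_m2, x):
    """Eq. (10) with x = a/b: f(x) = linMCE_M1 / linMCE_M2."""
    p1, n1 = pos_neg_sums(errors_m1)
    p2, n2 = pos_neg_sums(errors_m2)
    return (x * p1 + n1) / (x * p2 + n2)

# Invented example: M1 only under-forecasts, M2 mixes both error types
m1 = [0, 0, -10, 0, 0]
m2 = [-1, -1, 2, -1, -1]
for x in (0.5, 1.0, 3.0, 5.0):
    print(x, linmce_ratio(m1, m2, x))   # crosses 1.0 at x = 3: the preferred model flips
```

Scanning f(x) over a plausible range of x = a/b is exactly the sensitivity analysis: the crossing point f(x) = 1 is the critical cost ratio.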
Linear MCE: Sensitivity analysis
«Toy» Example:
f(x) can be determined analytically:
  f(x) = linMCE_baseline / linMCE_peak = 2b / (0.95a) = 2.11 / x   (11)
Linear MCE: Sensitivity analysis
«Toy» example (cont.):
[Plot: f(x) over x = a/b, with critical point x = 2.11; MAE favours the peak model]
Conclusion: the peak model performs better if the food-waste cost per item is smaller than 2.11 times the stock-out cost per item => the baseline model incurs high stock-out costs during peaks
Linear MCE: Sensitivity analysis
- «Real world» example: comparison against a benchmark model (naïve model)
  f_model(x) = linMCE_model / linMCE_benchmark   (12)
Linear MCE: Sensitivity analysis
- «Real world» example:
1) 0 < x < 1.105: the peak model outperforms the baseline and benchmark models; the benchmark model is the worst choice.
2) 1.105 < x < 2.050: the baseline model outperforms the peak and benchmark models; the benchmark model is the worst choice.
3) x > 2.050: the baseline model is best, the peak model is worst.
Recapitulation: Cost-based error metrics
linMCE:
- Assumptions: costs and errors are in a direct linear relation; no dependence on the history
- Estimate the economic consequences by estimating the parameters a (cost per unsold item) and b (non-realised profit per item for stock-out)
- Perform a sensitivity analysis to assess the advantages of different forecasting models in dependence on the ratio a/b
Modelling «logistics»
Assumption: observed sales = simulated demand. Simplified ordering and stock-keeping process:
Week T:
  stock at end of week T = max( stock at beginning of week T - demand in week T, 0 )
  orders for week T+1 = max( demand forecast for week T+1 - stock at end of week T, 0 )
Week T+1:
  stock at beginning of week T+1 = stock at end of week T + orders for week T+1
  stock at end of week T+1 = max( stock at beginning of week T+1 - demand in week T+1, 0 )
  orders for week T+2 = max( demand forecast for week T+2 - stock at end of week T+1, 0 )
…
Additional features:
- service level -> safety stock
- shelf life -> product batches
- delivery time -> transport logistics
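The weekly loop above can be sketched as follows (safety stock, shelf life and delivery time omitted; names and data are mine, not from the slides):

```python
def simulate(demand, forecast):
    """Simplified stock-keeping process from the slide:
    order up to the forecast, then serve demand; track lost sales
    (stock-outs) and end-of-week stock (a proxy for storage cost)."""
    stock = 0.0
    lost_sales = 0.0
    end_stocks = []
    for d, f in zip(demand, forecast):
        stock += max(f - stock, 0.0)       # orders = max(forecast - stock at end of last week, 0)
        lost_sales += max(d - stock, 0.0)  # unmet demand -> stock-out
        stock = max(stock - d, 0.0)        # stock at end of week
        end_stocks.append(stock)
    return lost_sales, sum(end_stocks) / len(end_stocks)

# Invented demand/forecast series
lost, avg_stock = simulate([10, 12, 8], [10, 10, 10])
print(lost, avg_stock)   # 2.0 units lost, average end-of-week stock ~0.67
```

Feeding the same demand series with different forecast models into such a loop is what produces the stock-keeping and opportunity costs compared in the results below.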
Results
Assumptions: 10% stock-keeping costs, 20% margin, service level 99%
Quantity                          | Baseline model | Peak model
Average stock level               | 22'544 units   | 23'164 units
Safety stock level                | 3'886 units    | 3'415 units
Effective β service level         | 98.08%         | 99.92%
Stock-keeping costs               | 6'329 CHF      | 6'504 CHF
Opportunity costs                 | 10'266 CHF     | 385 CHF
Stock-keeping + opportunity costs | 16'595 CHF     | 6'889 CHF
Comparison to linMCE
a/b ≈ 0.5
Add economic feedback (recap)
[Diagram: sales data, external drivers, human overrides and human expertise feed a library of algorithms; forecasts determine stock-out, food waste and storage costs; the resulting economic value feeds back through error metrics]