Machine learning techniques in predicting uncertainty of environmental models
Dimitri Solomatine, Professor of Hydroinformatics, IHE Delft Institute for Water Education, Delft, The Netherlands

Outline
- Introduction: what are we analysing?
- Machine learning methods to (a) analyse and (b) predict the model uncertainty
- Suggested approach: “escalation” of uncertainty
- Examples
D.P. Solomatine. Escalation of uncertainty.

Example for a quick start: deterministic forecasts and 90% uncertainty bounds
[Figure: discharge (m3/s) versus time (days)]

Sources of model uncertainty: perceptual, structure, parameters, data
$y = M(x, p) + \varepsilon_s + \varepsilon_\theta + \varepsilon_x + \varepsilon_y$
[Figure: model structure diagram with the following components]
- SF – Snow
- RF – Rain
- EA – Evapotranspiration
- SP – Snow cover
- IN – Infiltration
- R – Recharge
- SM – Soil moisture
- CFLUX – Capillary transport
- UZ – Storage in upper reservoir
- PERC – Percolation
- LZ – Storage in lower reservoir
- Q0 – Fast runoff component
- Q1 – Slow runoff component
- Q – Total runoff, Q = Q0 + Q1
- Transform function

Traditional steps in uncertainty analysis of a calibrated model
- Identification of sources of uncertainty (input, parameter, model structure)
- Quantification of uncertainty (e.g. as a distribution)
- Studying propagation of uncertainty through the model (e.g. by Monte Carlo simulation)
- Quantification of uncertainty in the model outputs, i.e. identification of the output distribution (pdf) or its characteristics (mean, standard deviation, quantiles)
- If possible, reduction of uncertainty (e.g. model improvement, more accurate measurements, etc.)
- Application of the uncertain information in the decision-making process

Data uncertainty (input, parameters): propagation of uncertainty through the model
- y^ = M(x, p)
- x = input, p = parameters
- Uncertainty in x and p propagates to the output y
- pdf of parameters → pdf of output: pdf(p) → pdf(y)
- pdf of inputs → pdf of output: pdf(x) → pdf(y)

Monte Carlo Simulation

Monte Carlo casino: roulette wheel
It is a random number generator: it uses a uniform distribution with the range of [0, 36]

Single model run (no uncertainty)
[Figure: model structure diagram]
- Input (single time series)
- Single parameter vector P: FC, ALPHA, K, MAXBAS, etc.
- Run the model
- Output (single time series)

Monte Carlo simulation in analysing parametric uncertainty

Sampling parameters and multiple model runs
[Figure: model structure diagram]
- Sample one parameter vector from distributions; do this multiple times
- Input (single time series)
- Single parameter vector P: FC, ALPHA, K, MAXBAS, etc.
- Run the model (single output per run)
- Ensemble of multiple output time series
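The sampling loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_model` is a hypothetical one-parameter linear reservoir standing in for the real model M, and the uniform range for the parameter `k` is invented for the example, not taken from the presentation.

```python
import random


def run_model(rainfall, k, s0=10.0):
    """Toy linear reservoir (hypothetical stand-in for the real model M)."""
    storage = s0
    discharge = []
    for p in rainfall:
        storage += p          # rainfall fills the store
        q = k * storage       # outflow is proportional to storage
        storage -= q
        discharge.append(q)
    return discharge


def monte_carlo_ensemble(rainfall, n_runs, k_low=0.1, k_high=0.5, seed=0):
    """Sample k from an assumed uniform prior and run the model once per sample."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_runs):
        k = rng.uniform(k_low, k_high)   # one sampled parameter "vector"
        ensemble.append(run_model(rainfall, k))
    return ensemble


# Single input time series (made-up values), many parameter samples
rainfall = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0, 8.0, 1.0]
ensemble = monte_carlo_ensemble(rainfall, n_runs=200)
print(len(ensemble), "runs of", len(ensemble[0]), "time steps each")
```

Each element of `ensemble` is one output time series; the input stays fixed and only the parameter changes between runs.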

Monte Carlo sampling: illustration
$y = M(x, s, \theta) + \varepsilon_s + \varepsilon_\theta + \varepsilon_x + \varepsilon_y$

Sampling rainfall and multiple model runs
[Figure: model structure diagram]
- Sample one input time series from distributions; do this multiple times
- Input (single time series per run)
- Single parameter vector P: FC, ALPHA, K, MAXBAS, etc.
- Run the model (single output per run)
- Ensemble of multiple output time series

Representing uncertainty of model output by the confidence bounds
[Figure: discharge (m3/s) versus time (hr), with the q5 and q95 quantile curves]
Instead of fitting a theoretical distribution, we can use the mean, standard deviation, or quantiles. E.g., the 5% and 95% quantiles form the 90% confidence bounds.
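As a rough sketch of how such bounds could be computed from an ensemble, the snippet below takes the empirical 5% and 95% quantiles at each time step. The `quantile` helper uses linear interpolation between order statistics, which is one common convention among several, and the ensemble values are made up for illustration.

```python
def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def uncertainty_bounds(ensemble, q_low=0.05, q_high=0.95):
    """Per-time-step lower and upper quantiles across ensemble members."""
    lower, upper = [], []
    for step_values in zip(*ensemble):   # all members at one time step
        lower.append(quantile(step_values, q_low))
        upper.append(quantile(step_values, q_high))
    return lower, upper


# Hypothetical ensemble: 101 runs, 3 time steps each
ensemble = [[float(i), 2.0 * i, 100.0] for i in range(101)]
q5, q95 = uncertainty_bounds(ensemble)
print(q5, q95)
```

No distributional form is assumed here: the bounds come directly from the sorted ensemble values at each time step.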

Propagation of parameters/data uncertainty by Monte Carlo simulation is a typical practical approach. But is it the only one?

QUESTION 1. On assumptions
- We are assuming some known distributions of parameters or inputs. How safe is this?
- Could we take a safer route and assume less?
- Let’s take a step back and pose QUESTION 1: what is the uncertainty of the calibrated model itself?

Residual uncertainty: uncertainty of a calibrated (“optimal”) model
[Figure: output Y versus time, with labels for model output y^, measured value y, actual value y* (unknown), model error, and observation error]
Uncertainty of an optimal model M(x, θ)
- Model M is calibrated on measured data y
- We say the uncertainty of model M is manifested in the residual model error ε = y^ - y
- This error incorporates all uncertainties due to: observational errors, inaccurately estimated parameters, inadequate model structure
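A minimal sketch of the residual definition ε = y^ - y, using made-up predicted and measured values. The summary statistics at the end are just one simple way to characterise the residuals and are not presented here as the author's method.

```python
import statistics


def residuals(predicted, observed):
    """Residual model error: model output minus measured value, per time step."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return [y_hat - y for y_hat, y in zip(predicted, observed)]


# Hypothetical model outputs and measurements
predicted = [10.0, 12.0, 15.0, 11.0]
observed = [9.0, 13.0, 15.0, 10.0]

eps = residuals(predicted, observed)
print("residuals:", eps)
print("mean:", statistics.mean(eps), "st.dev.:", statistics.stdev(eps))
```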

ESCALATION (“build up”) of model uncertainty [message 1]
1. Study the (residual) uncertainty of an optimal model M(p*)
2. Add and study (typically, by MC simulation):
   A) uncertainty of M(p*) due to DATA uncertainty
   B) uncertainty of M(p) due to PARAMETERS uncertainty
3. Add and study uncertainty of M(p) due to STRUCTURAL uncertainty
4. Study uncertainty of a model class M(p), given the probabilistic properties of parameters and data

QUESTION 2. On what is analysed
[Figure: discharge (m3/s) versus time (hr), with the q5 and q95 quantile curves and a question mark over the period beyond the data]
- In uncertainty analysis (UA) we always use past data, so estimates of uncertainty are about the PAST.
- QUESTION 2: how can we assess the model uncertainty for new inputs, i.e. for the future? We pose this question for all sources of uncertainty, not only the residual one.

Models of Residual Uncertainty: Using Methods of Computational Intelligence
