

slide-1
SLIDE 1

Electricity Demand Forecasting using Multi-Task Learning

Jean-Baptiste Fiot, Francesco Dinuzzo Dublin Machine Learning Meetup - July 2017

1 / 32

slide-2
SLIDE 2

Outline

1 Introduction 2 Problem Formulation 3 Kernels 4 Experiments 5 Conclusion

2 / 32

slide-3
SLIDE 3

Outline

1 Introduction 2 Problem Formulation 3 Kernels 4 Experiments 5 Conclusion

3 / 32

slide-4
SLIDE 4

Electricity Demand Forecasting

Electricity is a special commodity

It cannot be stored efficiently (in large quantities). It loses value when being moved (line losses).

Demand forecasting is critical

Operations, bidding, demand response, maintenance, planning, etc.

The game is changing

Distributed renewable generation
Higher volatility on markets
Increased number of participants

4 / 32

slide-5
SLIDE 5

Demand Forecasting Methods

(Non-)linear variants of least-squares, ARMAX, fuzzy logic, etc.
Black-box models based on neural networks [Hippert et al., 2001]
Generalized Additive Models (GAM):

Great performance [Fan and Hyndman, 2012, Ba et al., 2012]
Efficient and scalable training algorithms
Interpretability of the model

Hippert, HS, et al.

Neural networks for short-term load forecasting: A review and evaluation. IEEE Transactions on Power Systems, 16(1):44–55, 2001.

Fan, S and Hyndman, R.

Short-term load forecasting based on a semi-parametric additive model. IEEE Transactions on Power Systems, 27(1):134–141, 2012.

Ba, A, et al.

Adaptive learning of smoothing functions: application to electricity load forecasting. In Advances in Neural Information Processing Systems 25 (NIPS 2012), pages 2519–2527. 2012.

5 / 32

slide-6
SLIDE 6

Demand Forecasting using Kernel Methods

In 2001, kernel-based support vector regression won the EUNITE (European Network on Intelligent Technologies for Smart Adaptive Systems) demand forecasting competition [Chen et al., 2004].
Later, kernel-based regularization and support vector techniques were successfully used [Espinoza et al., 2007, Hong, 2009, Elattar et al., 2010].

Chen, B, et al.

Load forecasting using support vector machines: A study on EUNITE competition 2001. IEEE Transactions on Power Systems, 19(4):1821–1830, 2004.

Espinoza, M, et al.

Electric load forecasting. IEEE Control Systems, 27(5):43–57, 2007.

Hong, WC.

Electric load forecasting by support vector model. Applied Mathematical Modelling, 33(5):2444–2454, 2009.

Elattar, E, et al.

Electric load forecasting based on locally weighted support vector regression. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 40(4):438–447, 2010.

6 / 32

slide-7
SLIDE 7

Outline

1 Introduction 2 Problem Formulation 3 Kernels 4 Experiments 5 Conclusion

7 / 32

slide-8
SLIDE 8

Electric Demand Forecasting

ŷ = f(t, d, c, y_l, u_l, j, s_j),

Time/Calendar features

t ∈ [0, 24) is the time of day expressed in hours, d ∈ {1, 2, . . . , 365, 366} is the day of the year, c is the type of day, e.g. Monday to Sunday,

Dynamic features

y_l is a real vector containing lagged values of the electric demand, u_l is a real vector containing lagged measurements of exogenous variables other than the demand (such as temperature),

Meter features

j is the meter ID in the electricity network, s_j is a vector of features describing the demand measured at j.

8 / 32


slide-11
SLIDE 11

Solving Multiple Demand Forecasting Problems

Consider m smart meters, indexed by j.
Goal: learn {f_j : X → R}, 1 ≤ j ≤ m, from datasets (x_ij, y_ij) ∈ X × R.

9 / 32

slide-12
SLIDE 12

Optimisation Problem

Letting f : X → R^m be the function with components f_j, we minimize

R(f, L) = Σ_{j=1}^m Σ_{i=1}^{ℓ_j} (y_ij − f_j(x_ij))² + λ ‖f‖²_{H_L},   (1)

where λ > 0 is a regularization parameter, and H_L is a Reproducing Kernel Hilbert Space (RKHS) of vector-valued functions with (matrix-valued) kernel

H(x_i, x_j) = K(x_i, x_j) · L,   (2)

where K : X × X → R is the input kernel, and L ∈ R^{m×m} is the output kernel.

Representer theorem: there exist functions f̂_j minimizing R(f, L) of the form

f̂_j(x) = Σ_{k=1}^m L_jk Σ_{i=1}^{ℓ_k} c_ik K(x_ik, x).   (3)

10 / 32
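Fixing L = I (next slide) makes objective (1) decouple into m independent kernel ridge regressions, each with the closed-form solution c = (K + λI)⁻¹ y. A minimal numpy sketch on synthetic data, not the talk's implementation:

```python
import numpy as np

def fit_krr(K, y, lam):
    """Solve (K + lam*I) c = y for one meter's coefficients.

    With L = I, objective (1) splits into m independent kernel
    ridge regressions with this closed-form solution."""
    ell = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(ell), y)

def predict_krr(K_test_train, c):
    """Evaluate f̂(x) = sum_i c_i K(x_i, x) at test points."""
    return K_test_train @ c

# Toy illustration: a Laplacian kernel on synthetic time-of-day data.
rng = np.random.default_rng(0)
X = rng.uniform(0, 24, size=50)
y = np.sin(2 * np.pi * X / 24) + 0.1 * rng.standard_normal(50)
K = np.exp(-np.abs(X[:, None] - X[None, :]))
c = fit_krr(K, y, lam=1e-2)
yhat = predict_krr(K, c)      # in-sample predictions, shape (50,)
```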

slide-13
SLIDE 13

Fixing L = I: Independent Kernel Ridge Regression

11 / 32

slide-14
SLIDE 14

Learning L: Output Kernel Learning

Remark: B = (b_ij) is a Cholesky factor of L, so that L = B Bᵀ.

12 / 32
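The remark above matters computationally: parametrizing the output kernel through a factor B keeps L positive semidefinite with bounded rank throughout learning. A quick numpy check, with toy sizes rather than the experiment's:

```python
import numpy as np

# Parametrizing the output kernel as L = B B^T (B of size m x p)
# guarantees L is positive semidefinite with rank at most p.
m, p = 5, 2
rng = np.random.default_rng(1)
B = rng.standard_normal((m, p))
L = B @ B.T

eigvals = np.linalg.eigvalsh(L)
assert np.all(eigvals >= -1e-10)        # positive semidefinite
assert np.linalg.matrix_rank(L) <= p    # rank bounded by p
```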

slide-15
SLIDE 15

Output Kernel Learning

Joint optimization problem

min_{L ∈ S^{m,p}_+} min_{f ∈ H_L} R(f, L) + λ tr(L),

where S^{m,p}_+ is the cone of p.s.d. matrices with rank ≤ p.

Re-indexing the observations {x_i}_{i=1,...,ℓ}, the solution becomes

f̂_j(x) = Σ_{k=1}^p b_jk g_k(x),   g_k(x) = Σ_{i=1}^ℓ a_ik K(x_i, x),

where

the b_jk coefficients form a low-rank factor of L,
the g_k functions can be seen as modes or typical profiles.

It is sufficient to store (ℓ + m)p parameters, which can be much smaller than Σ_{j=1}^m ℓ_j.

13 / 32
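The factorized solution above reduces prediction to two matrix products. In this sketch, A (the a_ik) and B (the b_jk) are random placeholders standing in for a trained model:

```python
import numpy as np

# Factorized OKL predictions: g_k(x) = sum_i a_ik K(x_i, x) and
# f̂_j(x) = sum_k b_jk g_k(x), i.e. F = K_test @ A @ B.T.
ell, m, p, n_test = 100, 20, 4, 7
rng = np.random.default_rng(2)
A = rng.standard_normal((ell, p))      # mode coefficients a_ik
B = rng.standard_normal((m, p))        # low-rank factor of L
K_test = rng.standard_normal((n_test, ell))   # K(x_i, x) at test points

G = K_test @ A        # modes g_k at the test points, shape (n_test, p)
F = G @ B.T           # per-meter forecasts, shape (n_test, m)

# Storage: (ell + m) * p parameters instead of one coefficient
# per training observation.
assert A.size + B.size == (ell + m) * p
```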

slide-16
SLIDE 16

Outline

1 Introduction 2 Problem Formulation 3 Kernels 4 Experiments 5 Conclusion

14 / 32

slide-17
SLIDE 17

Multiple Seasonalities in Electricity Demand

Figure: French National Demand (Réseau de Transport d'Électricité data)

15 / 32

slide-18
SLIDE 18

Capturing Demand Seasonalities with Kernels

Time-of-day kernel

K_t(t1, t2) = exp(−h_T(|t1 − t2|)/σ_t),   (4)

Day-of-year kernel

K_d(d1, d2) = exp(−h_D(|d1 − d2|)/σ_d),   (5)

where h_P(x) = min{x, P − x} is a change of variable that yields P-periodic kernels over the square [0, P]². In our experiments, σ_t and σ_d were respectively set to 4 hours and 120 days.

Day-type kernel

K_c(c1, c2) = 1 if c1 = c2, and 0 if c1 ≠ c2.   (6)

16 / 32
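Kernels (4) to (6) are short enough to sketch directly; the assert illustrates why the change of variable h_P matters (distances wrap around the period):

```python
import numpy as np

def periodic_kernel(x1, x2, period, sigma):
    """exp(-h_P(|x1 - x2|)/sigma) with h_P(x) = min(x, P - x),
    which makes the kernel P-periodic (eqs. 4-5)."""
    diff = np.abs(x1 - x2)
    return np.exp(-np.minimum(diff, period - diff) / sigma)

def day_type_kernel(c1, c2):
    """Delta kernel on the day type (eq. 6)."""
    return 1.0 if c1 == c2 else 0.0

# Bandwidths from the talk: sigma_t = 4 hours, sigma_d = 120 days.
k = periodic_kernel(23.0, 1.0, period=24, sigma=4.0)
# 23h and 1h are 2 hours apart on the clock, not 22:
assert np.isclose(k, np.exp(-2 / 4))
assert day_type_kernel("Mon", "Mon") == 1.0
assert day_type_kernel("Mon", "Sun") == 0.0
```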

slide-19
SLIDE 19

Kernels for Electric Demand Forecasting

To define K((t1, d1, c1), (t2, d2, c2)), we combine the basis kernels:

Additive Models
K_t(t1, t2) + K_d(d1, d2),   (7)
K_t(t1, t2) + K_d(d1, d2) + K_c(c1, c2),   (8)

Semi-Additive Models
K_d(d1, d2) + K_t(t1, t2) · K_c(c1, c2),   (9)
(K_t(t1, t2) + K_d(d1, d2)) · K_c(c1, c2),   (10)

Multiplicative Models
K_t(t1, t2) · K_d(d1, d2),   (11)
K_t(t1, t2) · K_d(d1, d2) · K_c(c1, c2).   (12)

17 / 32
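The six combinations can be written down mechanically; the short names (AM1, MM2, etc.) match the model labels used in the experiments:

```python
def combined_kernels(Kt, Kd, Kc):
    """The six combinations of the basis kernels (eqs. 7-12),
    given scalar evaluations Kt, Kd, Kc at one pair of inputs."""
    return {
        "AM1":  Kt + Kd,            # additive (7)
        "AM2":  Kt + Kd + Kc,       # additive (8)
        "SAM1": Kd + Kt * Kc,       # semi-additive (9)
        "SAM2": (Kt + Kd) * Kc,     # semi-additive (10)
        "MM1":  Kt * Kd,            # multiplicative (11)
        "MM2":  Kt * Kd * Kc,       # multiplicative (12)
    }

ks = combined_kernels(0.5, 0.2, 1.0)
assert abs(ks["MM2"] - 0.1) < 1e-12
```

Sums and products of positive semidefinite kernels are again valid kernels, which is what licenses these combinations.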

slide-20
SLIDE 20

Outline

1 Introduction 2 Problem Formulation 3 Kernels 4 Experiments 5 Conclusion

18 / 32

slide-21
SLIDE 21

Commission for Energy Regulation (CER) Data

6435 smart meters
536 days (Jul 14, 2009 - Dec 31, 2010)
Half-hour sampling
3 groups: residential, SME, others

19 / 32


slide-25
SLIDE 25

Pre-processing

Removed two corrupted meters
Corrected DST measurements
Downsampled to 3-hour resolution

Final dataset: m = 6433 smart meters, ℓ = 4288 time slots

Customer group     Meters   Sparsity
Residential        4225     0.028%
Industrial (SME)   485      0.035%
Others             1723     17%

20 / 32
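The downsampling step can be sketched with plain numpy. Whether the talk sums or averages the half-hourly readings is not stated; summation is assumed here:

```python
import numpy as np

# Half-hourly readings aggregated to 3-hour resolution:
# each 3-hour slot covers 6 half-hourly readings.
half_hourly = np.ones(48)                        # one synthetic day
three_hourly = half_hourly.reshape(-1, 6).sum(axis=1)
assert three_hourly.shape == (8,)                # 24h / 3h = 8 slots
assert np.all(three_hourly == 6.0)
```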

slide-26
SLIDE 26

Learning the Models

Data split

1 year (2920 obs.) used for training (80%) and validation (20%)
∼0.5 year (1368 obs.) used for testing

Independent Kernel Ridge Regression using the 6 kernels
Output Kernel Learning using MM2

1 model for {residential} ∪ {others}, p = 200 to fit in memory
1 model for {SME}, full rank (p = 485)

21 / 32

slide-27
SLIDE 27

Qualitative Analysis


Figure: Measured load (blue), indep. KRR (red) and multi-task OKL (black) forecasts for the aggregated demand (top), a single SME meter (middle), and a single residential meter (bottom).

22 / 32

slide-28
SLIDE 28

Performance Metrics (1/2)

Given a group of meters G and observation i, we define

Absolute percentage error (APE)

APE(i, G) = 100 · |Σ_{j∈G_i} y_ij − Σ_{j∈G_i} f_j(t_i, d_i, c_i)| / Σ_{j∈G_i} y_ij,   (13)

where G_i is the subset of meters with available observations at i.

Normalized absolute error (NAE)

NAE(i, G) = Σ_{j∈G_i} |y_ij − f_j(t_i, d_i, c_i)| / Σ_{j∈G_i} y_ij,   (14)

23 / 32

slide-29
SLIDE 29

Performance Metrics (2/2)

Mean absolute percentage error (MAPE)

MAPE(G) = (1/#T) Σ_{i∈T} APE(i, G),   (15)

Mean normalized absolute error (MNAE)

MNAE(G) = (1/#T) Σ_{i∈T} NAE(i, G).   (16)

24 / 32
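Equations (13) and (14) can be sketched directly; the toy example shows why the two metrics differ, since per-meter errors can cancel in the aggregate:

```python
import numpy as np

def ape(y, f):
    """Absolute percentage error of the aggregated forecast (eq. 13)."""
    return 100 * abs(y.sum() - f.sum()) / y.sum()

def nae(y, f):
    """Normalized absolute error over the group (eq. 14)."""
    return np.abs(y - f).sum() / y.sum()

# MAPE (eq. 15) and MNAE (eq. 16) are these quantities averaged
# over the test time slots.
y = np.array([2.0, 3.0])   # observed demand for two meters
f = np.array([3.0, 2.0])   # forecasts with opposite-sign errors
assert ape(y, f) == 0.0            # errors cancel in the aggregate
assert np.isclose(nae(y, f), 0.4)  # but NAE sees both meter errors
```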


slide-31
SLIDE 31

Prediction Accuracy (1/2)

Figure: MNAE (left) and MAPE (right) with standard errors, overall and per customer group (Residential, SME, Others), for Additive Models 1-2, Semi-Additive Models 1-2, Multiplicative Models 1-2, and Multi-Task OKL.

1 Multiplicative kernels outperform (semi-)additive models.

Multiplicative kernels lead to a stricter selection of training obs. EUNITE winners discarded ≥ 90% of the dataset.

2 Multi-task OKL outperforms independent kernel ridge regression.

The multi-task approach efficiently exploits the similarities between meters. 44% improvement of σ_APE for SME against the 2nd best method.

25 / 32

slide-32
SLIDE 32

Prediction Accuracy (2/2)

Figure: p-values of Welch t-tests between the overall accuracies of all methods (AM 1-2, SAM 1-2, MM 1-2, OKL) on the CER dataset, shown on a log scale from 10⁻⁴ to 10⁰, for (a) NAE and (b) APE.

26 / 32

slide-33
SLIDE 33

Basis Load Profiles gk


Figure: CER Data: Typical load profiles displayed over the horizon of one month, obtained from a low-rank OKL model with p = 10.

27 / 32

slide-34
SLIDE 34

Number of Parameters

In this experiment, the OKL model is 4.24 times more compact.

Single-task: # params = # obs. = Σ_{j=1}^m ℓ_j ≈ 1.3 · 10⁷

Multi-task OKL: # params = (ℓ + m)p ≈ 3 · 10⁶

28 / 32

slide-35
SLIDE 35

Relationships between Smart Meters

Figure: CER data: entries of the normalized output kernel L_n ∈ R^{m×m} for a subset containing 50 residential and 50 SME (small or medium enterprise) customers, where

(L_n)_ij = L_ij / √(L_ii · L_jj),   i, j = 1, . . . , m.

29 / 32
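The normalization in the figure's formula is the usual correlation-style rescaling of a p.s.d. matrix, easily checked in numpy:

```python
import numpy as np

def normalize_output_kernel(L):
    """(L_n)_ij = L_ij / sqrt(L_ii * L_jj): rescales the learned
    output kernel so its diagonal is 1, like a correlation matrix."""
    d = np.sqrt(np.diag(L))
    return L / np.outer(d, d)

L = np.array([[4.0, 2.0],
              [2.0, 9.0]])
Ln = normalize_output_kernel(L)
assert np.allclose(np.diag(Ln), 1.0)        # unit diagonal
assert np.isclose(Ln[0, 1], 2.0 / 6.0)      # 2 / sqrt(4 * 9)
```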

slide-36
SLIDE 36

Outline

1 Introduction 2 Problem Formulation 3 Kernels 4 Experiments 5 Conclusion

30 / 32

slide-37
SLIDE 37

Contributions

1

We formulated the problem of forecasting the demand measured on multiple lines of the network as a multi-task problem.

2

We designed kernels able to capture the seasonal effects present in electricity demand data.

3

We exposed the performance limits of the very popular additive models, showing that they are often outperformed by multiplicative kernel models.

4

We showed how MTL can be used to gain insight and interpretability on real demand data.

31 / 32

slide-38
SLIDE 38

Thank You

Any questions?

Contact details

Jean-Baptiste Fiot jean-baptiste.fiot@centraliens.net

32 / 32