SLIDE 1

On Self-adaptive Resource Allocation through Reinforcement Learning

Jacopo Panerati†, Filippo Sironi‡, Matteo Carminati‡, Martina Maggio§, Giovanni Beltrame†, Piotr J. Gmytrasiewicz¶, Donatella Sciuto‡ and Marco D. Santambrogio‡

†Polytechnique Montréal, ‡Politecnico di Milano, §Lund University, ¶University of Illinois at Chicago

Politecnico di Torino - Turin, 25 June 2013

SLIDE 2

POLYTECHNIQUE MONTRÉAL · Rationale · Reinforcement Learning · Self-adaptive Computing · Case Study · Conclusions · References

Rationale

Methodology: (1) Reinforcement Learning (RL).

Objective: (2) Self-adaptive Computing.

Research Question: Is RL a suitable approach for self-adaptive computing?

  • J. Panerati et al. – On Self-adaptive Resource Allocation through Reinforcement Learning

2/31 – mistlab.ca

SLIDE 8

A Typical Machine Learning Problem

Generic (Informal) Steps

  • given a (labelled or unlabelled) training set D ⊆ ℝ^d
  • pick, from a hypothesis set H, a function f : ℝ^d → ℝ (or C)
  • such that, for a new data point X ∈ ℝ^d, f(X) is the actual label of X
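The generic recipe above can be illustrated with a toy instance: take H to be the set of linear functions f(x) = w·x + b on ℝ and pick the member minimizing squared error over D. The training set below is made up for illustration; a minimal pure-Python sketch:

```python
# Given a labelled training set D, pick from the hypothesis set H
# (here: lines f(x) = w*x + b) the function that best fits D, then
# apply it to a new data point X.

def fit_line(D):
    """Least-squares fit of f(x) = w*x + b over D = [(x, y), ...]."""
    n = len(D)
    sx = sum(x for x, _ in D)
    sy = sum(y for _, y in D)
    sxx = sum(x * x for x, _ in D)
    sxy = sum(x * y for x, y in D)
    w = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - w * sx) / n
    return lambda x: w * x + b

D = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # labels follow y = 2x + 1
f = fit_line(D)
print(round(f(3.0), 6))  # -> 7.0: the learned f labels the new point X = 3.0
```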


SLIDE 13

Machine Learning Methodologies

Supervised Learning

  • Classification Algorithms: when labels are known to belong to a finite set C
  • Regression Algorithms: when labels are known to belong to ℝ

Unsupervised Learning

  • Clustering Algorithms: when labels are unknown but their cardinality K is assumed to be fixed



SLIDE 16

Example of Classification Problem

Hand-Writing

Recognition of hand-written digits is a typical classification problem. Data points are matrices of pixels (∈ ℝ^d) and the label set C is {0, 1, 2, ..., 9}.

[Figure: sample images of hand-written digits]

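A toy nearest-neighbour classifier makes the setting concrete: flattened pixel matrices are points in ℝ^d, and a new point gets the label of the closest training point. The 3x3 binary "digit images" below are invented for illustration, not a real dataset:

```python
# 1-nearest-neighbour classification: data points are flattened pixel
# matrices (vectors in R^d); labels come from C = {0, 1, ..., 9}.

def dist2(a, b):
    """Squared Euclidean distance between two pixel vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def classify(train, x):
    """Label x with the label of its nearest training point."""
    return min(train, key=lambda pair: dist2(pair[0], x))[1]

train = [
    ((1, 1, 1, 1, 0, 1, 1, 1, 1), 0),  # crude 3x3 "0"
    ((0, 1, 0, 0, 1, 0, 0, 1, 0), 1),  # crude 3x3 "1"
    ((1, 1, 1, 0, 1, 0, 1, 1, 1), 2),  # crude, stylised 3x3 "2"
]

x = (0, 1, 0, 0, 1, 0, 0, 1, 1)  # a "1" with one noisy pixel
print(classify(train, x))  # -> 1
```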


SLIDE 22

Example of Clustering Problem

Space Exploration

Clustering algorithms can be used to identify patterns in remotely sensed data (e.g. from space) and improve the scientific return by sending only statistically significant data to the ground station [1].

¹ http://nssdc.gsfc.nasa.gov/

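With K fixed in advance, the classic algorithm for this setting is k-means: alternate between assigning points to the nearest centre and recomputing each centre as its cluster's mean. A minimal sketch, with made-up 1-D "sensor readings" standing in for remotely sensed data:

```python
# Minimal k-means: labels are unknown, but their number K is fixed.

def kmeans(points, K, iters=20):
    centers = points[:K]  # naive initialisation: first K points
    for _ in range(iters):
        clusters = [[] for _ in range(K)]
        for p in points:  # assignment step: nearest centre wins
            k = min(range(K), key=lambda k: (p - centers[k]) ** 2)
            clusters[k].append(p)
        # update step: each centre becomes the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[k]
                   for k, c in enumerate(clusters)]
    return centers

data = [0.9, 1.1, 1.0, 9.8, 10.2, 10.0]  # two obvious groups
print(sorted(kmeans(data, K=2)))  # -> [1.0, 10.0]
```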


SLIDE 24

Reinforcements in Behavioural Psychology

Definition

In behavioural psychology, reinforcement is the strengthening of a behaviour associated with a stimulus through repetition.

Pioneers

B.F. Skinner (1904-1990), together with E. Thorndike (1874-1949), is considered to be one of the fathers of current theories on reinforcement and conditioning [2].


SLIDE 32

Pavlov’s Dog

A precursor of Skinner's theories

Ivan Pavlov (1849-1936) made conditioning famous with his experiments on drooling dogs.



SLIDE 34

Reinforcement learning in computer science is somewhat different from both supervised/unsupervised learning and reinforcement in behavioural psychology...


SLIDE 35

Why Reinforcement Learning is Different (I)

Supervised/Unsupervised Machine Learning

data point → label (or a cluster)

Reinforcements in Behavioural Psychology

stimulus → behaviour

Reinforcement Learning

state of the world → action



SLIDE 45

Why Reinforcement Learning is Different (II)

Reinforcement Learning

state of the world → action → new state of the world → action → ...

Because the performance metric of RL (i.e., the collected rewards) is computed over time, solving an RL problem enables

  • planning
  • complex, sequential decisions
  • even counterintuitive decisions
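The sequential, counterintuitive flavour shows up already in a tiny two-state world: "working" in the start state pays nothing immediately but leads to a state whose reward is large, so the long-run optimal action is the short-term unattractive one. For reproducibility this sketch applies the Q update with a known model (i.e., Q-value iteration); in actual RL the transitions and rewards would be sampled. The world itself is invented for illustration:

```python
# Q-value iteration on a two-state world: accumulated, discounted
# reward over time makes a short-term sacrifice the optimal choice.

GAMMA = 0.9  # discount factor: how much future reward matters

# states: 0 = "start", 1 = "goal"; actions: 0 = "idle", 1 = "work"
def step(s, a):
    """Deterministic model: returns (next state, immediate reward)."""
    if s == 0:
        return (0, 1.0) if a == 0 else (1, 0.0)  # idling pays 1 now
    return (1, 10.0)                             # the goal pays 10 per step

Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
for _ in range(200):  # repeated Bellman backups until convergence
    for s in (0, 1):
        for a in (0, 1):
            s2, r = step(s, a)
            Q[(s, a)] = r + GAMMA * max(Q[(s2, 0)], Q[(s2, 1)])

print(round(Q[(0, 1)], 1), round(Q[(0, 0)], 1))  # -> 90.0 82.0
```

Despite "idle" earning more now, Q(start, work) = 90 beats Q(start, idle) = 82: exactly the farsighted behaviour the slide describes.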


SLIDE 52

Why Reinforcement Learning is Different (III)

If today were a sunny day

  • a classification algorithm would label it as "go to the seaside"
  • RL would tell you "you might as well study now and, later in the summer, enjoy the fact that you did not fail your exams"

RL is not an epicurean carpe diem methodology, but a more farsighted and judicious approach.

"The point is, not how long you live, but how nobly you live." – Lucius Annaeus Seneca


SLIDE 58

Moving on to self-adaptive computing...


SLIDE 59

Typical Properties of Self-adaptive Computing

Self-configuration

The system requires limited or no human intervention in order to set up.

Self-optimization

The system is able to achieve user-defined goals autonomously, without human interaction.

Self-healing

The system can detect and recover from faults without human intervention.

Together with self-protection, these are the properties identified in [3] for autonomic systems.

  • J. Panerati et al. – On Self-adaptive Resource Allocation through Reinforcement Learning

15/31 – mistlab.ca

slide-61
SLIDE 61

Self-configuration Example

Multi-platform software

Software that runs seamlessly on different hardware configurations is a good example of self-configuration.

[Diagram: Hardware / Inst. Tools / Software – Detect, Config., Install, Run]

slide-65
SLIDE 65

Self-optimization Example

Smart Video Players

Players that can adjust media encoding to maintain a certain Quality of Service (QoS) can be considered self-optimizing applications.

[Diagram: Video / Manager / Encoder – Detect Quality, Control, Play]

slide-69
SLIDE 69

Self-healing Example

Reconfigurable Logic

FPGAs are a good playground for implementing self-healing. Part of the hardware resources can be used to verify the correct functioning of the rest of the logic and to force reconfiguration when a fault is detected.

[Diagram: Prog. Logic / Listener / µContr. – Detect Fault, Inform, Reconfigure]

slide-73
SLIDE 73

Research Question

Is RL a suitable approach for self-adaptive computing?

slide-74
SLIDE 74

Case Study

Testing Environment

  • Desktop workstation
  • Multi-core Intel i7 processor
  • Linux-based operating system

Objective of our Experiments

Enabling self-adaptive properties in applications of the PARSEC [4] benchmark suite through reinforcement learning algorithms.

slide-77
SLIDE 77

Tests Set-Up

Reinforcement Learning Framework

  • A finite set of states S
→ the heart rate of the PARSEC benchmark application, measured through the Heart Rate Monitor (HRM) APIs [5]
  • A finite set of actions A
→ (1) the number of cores on which the PARSEC benchmark application is scheduled (via the sched_setaffinity system call) and (2) the CPU frequency (via the cpufrequtils package)
  • A reward function R(s) : S → ℝ
→ whether a user-defined target (in heartbeats/s) is met or not
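The S/A/R framework above maps naturally onto a tabular learning loop. The sketch below is a hedged illustration only: it uses Q-learning as a stand-in for the ADP algorithm used in the experiments, `measure_heart_rate` is a hypothetical plant model replacing the real HRM measurement, and the core/frequency grids are illustrative values.

```python
import random

# Hedged sketch of the framework on this slide: tabular Q-learning as a
# stand-in for the paper's ADP algorithm. State = bucketized heart rate,
# action = (core count, CPU frequency), reward = target met or not.
# `measure_heart_rate` is a hypothetical plant model, not the HRM API.

CORES = [1, 2, 3, 4]          # core counts a sched_setaffinity call could pin
FREQS = [1.2, 1.8, 2.4, 3.0]  # illustrative GHz steps for frequency scaling
ACTIONS = [(c, f) for c in CORES for f in FREQS]

def bucket(heart_rate, width=2.0):
    """Discretize a continuous heart rate (heartbeats/s) into a state id."""
    return int(heart_rate // width)

class QController:
    def __init__(self, target, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.q = {}           # (state, action) -> estimated value
        self.target = target  # user-defined goal in heartbeats/s
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def reward(self, heart_rate):
        """R(s): 1 when the user-defined heart-rate target is met, else 0."""
        return 1.0 if heart_rate >= self.target else 0.0

    def choose(self, state):
        """Epsilon-greedy action selection over the (cores, freq) grid."""
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, next_heart_rate):
        """One temporal-difference update; returns the successor state."""
        next_state = bucket(next_heart_rate)
        best_next = max(self.q.get((next_state, a), 0.0) for a in ACTIONS)
        old = self.q.get((state, action), 0.0)
        target = self.reward(next_heart_rate) + self.gamma * best_next
        self.q[(state, action)] = old + self.alpha * (target - old)
        return next_state

def measure_heart_rate(cores, freq):
    """Hypothetical plant: throughput grows with cores and frequency."""
    return cores * freq

if __name__ == "__main__":
    random.seed(0)
    ctrl = QController(target=6.0)
    state = bucket(measure_heart_rate(1, FREQS[0]))
    for _ in range(2000):  # monitor-decide-act loop
        action = ctrl.choose(state)
        heart_rate = measure_heart_rate(*action)
        state = ctrl.update(state, action, heart_rate)
```

In the real set-up, `measure_heart_rate` would read the HRM counters, and applying an action would call sched_setaffinity and reconfigure the CPU frequency.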

slide-85
SLIDE 85

Self-configuration

[Figure: blackscholes managed exploiting ADP and core allocation – performance (M options/s) and allocated cores (1-4) over time (0-1000 s)]
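For reference, the core-allocation action behind this trace can be sketched with Python's standard `os.sched_setaffinity` wrapper around the system call used in the experiments. Pinning to the first n eligible CPUs is an illustrative simplification, not the paper's allocation policy.

```python
import os

# Hedged sketch: applying the "number of cores" action on Linux through
# os.sched_setaffinity, the standard wrapper around the sched_setaffinity
# system call used in the experiments. Pinning to the first n eligible
# CPUs is an illustrative policy, not the paper's allocation scheme.
def allocate_cores(n):
    """Restrict the current process to n of its eligible CPUs (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity control requires Linux")
    eligible = sorted(os.sched_getaffinity(0))  # CPUs we may run on now
    chosen = set(eligible[:max(1, n)])          # keep at least one CPU
    os.sched_setaffinity(0, chosen)             # 0 = the calling process
    return chosen
```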

slide-96
SLIDE 96

Self-optimization

[Figure: canneal managed exploiting ADP and core allocation – performance (M exchanges/s) and allocated cores (1-4) over time (0-800 s)]
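The performance signal in these traces comes from application heartbeats. The class below is a hedged stand-in for the idea behind the HRM APIs [5], a sliding-window heart rate plus a target check; the names and window policy are illustrative, not the actual HRM interface.

```python
import time

# Hedged stand-in for the heartbeat idea behind the HRM APIs [5]: an
# instrumented application emits one heartbeat per unit of useful work,
# and the monitor derives a heart rate over a sliding window. Names and
# the window policy are illustrative, not the actual HRM interface.
class Heartbeats:
    def __init__(self, window=1.0):
        self.window = window  # trailing window length in seconds
        self.stamps = []      # timestamps of recent heartbeats

    def beat(self, now=None):
        """Emit one heartbeat, e.g. after each processed work item."""
        self.stamps.append(time.monotonic() if now is None else now)

    def rate(self, now=None):
        """Heart rate in heartbeats/s over the trailing window."""
        now = time.monotonic() if now is None else now
        self.stamps = [t for t in self.stamps if now - t <= self.window]
        return len(self.stamps) / self.window

    def meets(self, target, now=None):
        """The reward signal: is the user-defined target met?"""
        return self.rate(now) >= target
```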

slide-98
SLIDE 98

Self-healing

[Figure: canneal managed exploiting ADP, core allocation, and frequency scaling – performance (M exchanges/s), allocated cores (1-4), and frequency level over time (0-800 s)]
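The frequency-scaling action in this trace is applied through cpufrequtils, which ultimately drives the kernel's cpufreq sysfs knobs. A hedged sketch follows: the paths are the standard Linux sysfs locations, the helper names are our own, and writing the knob requires root.

```python
# Hedged sketch of the "CPU frequency" action: cpufrequtils drives the
# kernel cpufreq subsystem, whose per-core tunables live in sysfs. The
# paths below are the standard Linux locations; writing them needs root.

SYSFS = "/sys/devices/system/cpu/cpu{cpu}/cpufreq/{knob}"

def cpufreq_knob(cpu, knob):
    """Sysfs path of one core's cpufreq tunable, e.g. scaling_max_freq."""
    return SYSFS.format(cpu=cpu, knob=knob)

def cap_frequency(cpu, khz):
    """Cap a core's maximum frequency in kHz (like `cpufreq-set -u`)."""
    with open(cpufreq_knob(cpu, "scaling_max_freq"), "w") as handle:
        handle.write(str(khz))
```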

slide-102
SLIDE 102

Conclusions

  • Reinforcement learning and its relation to other machine learning methodologies and to behavioural psychology
  • Properties of self-adaptive computing
  • How to exploit reinforcement learning for self-adaptive computing
  • Experimental results showing reinforcement learning enabling self-adaptive computing properties

slide-107
SLIDE 107

Q&A

[Comic: http://www.dilbert.com/]

slide-108
SLIDE 108

References I

  [1] D. S. Hayden, S. Chien, D. R. Thompson, and R. Castaño, "Using clustering and metric learning to improve science return of remote sensed imagery," ACM Trans. Intell. Syst. Technol., vol. 3, no. 3, pp. 51:1-51:19, May 2012. [Online]. Available: http://doi.acm.org/10.1145/2168752.2168765
  [2] B. F. Skinner, Science and Human Behavior. Free Press, 1965.
  [3] J. Kephart and D. Chess, "The vision of autonomic computing," Computer, vol. 36, no. 1, pp. 41-50, 2003.
  [4] C. Bienia, "Benchmarking modern multiprocessors," Ph.D. dissertation, Princeton, NJ, USA, 2011, AAI3445564.

slide-109
SLIDE 109

References II

  [5] F. Sironi, D. B. Bartolini, S. Campanoni, F. Cancare, H. Hoffmann, D. Sciuto, and M. D. Santambrogio, "Metronome: operating system level performance management via self-adaptive computing," in Proceedings of the 49th Annual Design Automation Conference, ser. DAC '12. New York, NY, USA: ACM, 2012, pp. 856-865. [Online]. Available: http://doi.acm.org/10.1145/2228360.2228514