SNMP-based Monitoring Agents and Heuristic Scheduling for Large-scale Grids


International Conference on Grid computing, high-performAnce and Distributed Applications (GADA'07), November 29-30,


SLIDE 1

SNMP-based Monitoring Agents and Heuristic Scheduling for Large-scale Grids

Edgar Magaña1,2, Laurent Lefèvre3, Masum Hasan1, and Joan Serrat2

1 Cisco Systems, Inc.
{emagana, masum}@cisco.com

2 Universitat Politècnica de Catalunya
serrat@tsc.upc.edu

3 INRIA RESO / Université de Lyon
LIP UMR 5668 (CNRS, ENS Lyon, INRIA, UCB), France
laurent.lefevre@inria.fr

International Conference on Grid computing, high-performAnce and Distributed Applications (GADA'07)

November 29-30, 2007 - Vilamoura, Algarve, Portugal

SLIDE 2

Outline

  • Introduction
  • Grid Resource Management (GRM)
  • Policy-based Grid Management Architecture
  • SBLOMARS Monitoring Agents Approach
  • BLOMERS Heuristic Scheduler Approach
  • Grid5000 Experiments
  • Conclusions and Future Work
SLIDE 3

Introduction (I)

  • Grid Computing Definition:

“Grid is an interconnected collection of geographically distributed and heterogeneous hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities”. [Foster_Grid2]

– A Grid system has to fulfill the following three points:

» It coordinates resources that are not under centralized management.
» It uses standard, open, generic protocols and interfaces.
» It provides non-trivial Quality of Service (QoS).

SLIDE 4

Introduction (II)

Grid Computing Tendencies:

– Grid Services should be provided to users regardless of network technology, administrative domain or operating platform (Heterogeneity).
– Effective access to large amounts of computing, network and storage resources, reducing procurement, deployment, maintenance and operational costs.
– Network Performance: Fault-tolerance, Scalability and Flexibility.

SLIDE 5

Outline

  • Introduction
  • Grid Resource Management (GRM)
  • Policy-based Grid Management Architecture
  • SBLOMARS Monitoring Agents Approach
  • BLOMERS Heuristic Scheduler Approach
  • Grid5000 Experiments
  • Conclusions and Future Work
SLIDE 6

Grid Resource Management (I)

Definition:

It is the ability to discover, allocate, negotiate, monitor, and manage the use of network-accessible capabilities in order to achieve various end-to-end or global qualities of service*.

*J. Nabrzyski, J. M. Schopf and J. Weglarz, "Grid Resource Management: State of the Art and Future Trends". Kluwer Academic Publishers, Boston, USA, October 2003. ISBN 1-4020-7575-8.

SLIDE 7

Grid Resource Management (II)

Main Activities:

  • GRM is regarded as a vital component of the Grid infrastructure.
  • It coordinates and shares multiple kinds of resources efficiently.
  • GRM must fulfill strict functional requirements from heterogeneous, and sometimes conflicting, domains (e.g., the user, application and network domains).
  • It must also adhere to rigid non-functional requirements, such as reliability and efficiency in terms of time consumption and load on the host nodes.

SLIDE 8

Grid Resource Management (III)

Therefore…

  • GRM is a very challenging issue, for which several projects are proposing their own approaches.

SLIDE 9

Grid Resource Management (IV)

  • Remaining Problems:

– Swift and dynamic allocation and reservation of computational resources.
– Algorithms to find available resources.
– Reducing resource analysis and scheduling times (makespan).
– Dynamic resource discovery and analysis.
– Makespan is normally reduced, but load balancing is not taken into account.
– Allocation of network resources per service.

Divide and Conquer

SLIDE 10

Grid Resource Management (V)

Three Grid Management Phases:

  • Resource Discovery and Monitoring (SBLOMARS)

– Determining which resources are available to a given user
– Selecting the resource sources to search in more detail
– Filtering out resources that do not meet the minimal job requirements

  • Resource Scheduling (BLOMERS)

– From a group of possible resources, those on which to schedule the job must be selected
– Scheduling algorithms should be applied
– Load-balancing techniques should be taken into account
– Setup times also have to be considered

  • Job Allocation and Execution (PbGRMA)

– Job submission through simple commands or policies
– Monitoring progress and evaluating performance
– Notifying users when jobs are done
– Cleaning up any temporary information on the server side
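The filtering step of the discovery phase ("filtering out resources that do not meet the minimal job requirements") can be sketched as follows. This is only an illustration, not the SBLOMARS implementation: the `Resource` and `JobRequirements` types, their field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # Hypothetical record for one monitored node; field names are assumptions.
    node: str
    free_memory_mb: int
    free_storage_mb: int
    cpu_load_percent: float


@dataclass
class JobRequirements:
    # Hypothetical minimal requirements attached to a job.
    min_memory_mb: int
    min_storage_mb: int
    max_cpu_load_percent: float


def filter_resources(resources, req):
    """Keep only the resources that meet the minimal job requirements."""
    return [
        r for r in resources
        if r.free_memory_mb >= req.min_memory_mb
        and r.free_storage_mb >= req.min_storage_mb
        and r.cpu_load_percent <= req.max_cpu_load_percent
    ]


candidates = filter_resources(
    [
        Resource("node-a", 512, 20000, 35.0),
        Resource("node-b", 128, 50000, 10.0),   # too little free memory
        Resource("node-c", 1024, 8000, 90.0),   # CPU load too high
    ],
    JobRequirements(min_memory_mb=256, min_storage_mb=5000, max_cpu_load_percent=80.0),
)
print([r.node for r in candidates])  # ['node-a']
```

The surviving candidates are what a scheduling phase would then choose from.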

SLIDE 11

Outline

  • Introduction
  • Grid Resource Management
  • Policy-based Grid Management Architecture
  • SBLOMARS Monitoring Agents Approach
  • BLOMERS Heuristic Scheduler Approach
  • Grid5000 Experiments
  • Conclusions
  • Ongoing and Future Work
SLIDE 12

Policy-based Grid Management Architecture

[Architecture diagram. Components shown: Policy Editor, Policy Manager, Policy Conflict, Repository, Domain Manager, Inter Domain Manager, PDPs, Extended PDPs, Service PDP, QoS PDP, PEPs, WS-RF Registry, Service Descriptor, External Interface, Network Level Policies, Element Level Policies, SBLOMARS Monitoring Agents, BLOMERS Scheduler (GA).]

XML excerpt shown on the slide:

<!-- == WS-Resource Properties Document for Newton’s Method Application .. <wsdl:portType name="Newton's Method" wsrp:ResourceProperties= "intf:GenericMethodProperties"> <xsd:sequence> <xsd:element maxDistribution="5" minDistribution="1" name=" " … <xsd:element amountMinMemory="20" MaxMemory="250" =" "... … "wsa:EndpointReferenceType"/> </wsdl:portType>

SLIDE 13

Policy-based Grid Management Architecture*

Features:

– We obtain a strong synergy by coupling policy-based technology with SNMP-based agents and genetic algorithms.
– It simplifies Grid services deployment and management.
– Support for dynamic, reconfigurable on demand, secure and highly customizable computing, storage and networking environments.
– Dynamic extensibility and flexibility of the architecture.
– Deployment and activation of Grid services in all planes.

*E. Magaña, L. Lefèvre and J. Serrat. “Autonomic Management Architecture for Flexible Grid Services Deployment Based on Policies”. ARCS'07, Zurich, Switzerland. 2007.

SLIDE 14

Outline

  • Introduction
  • Grid Resource Management (GRM)
  • Policy-based Grid Management Architecture
  • SBLOMARS Monitoring Agents Approach
  • BLOMERS Heuristic Scheduler Approach
  • Grid5000 Experiments
  • Conclusions and Future Work
SLIDE 15

SBLOMARS Monitoring Agents Approach (I)

Definition and Features:

– SBLOMARS: SNMP-based Balanced Load Monitoring Agents for Resource Scheduling.
– It is a purely decentralized monitoring system in charge of permanently capturing computational resource performance by means of autonomous distributed agents.
– It integrates SNMP technology and thus offers an alternative solution for handling heterogeneous resources.
– It implements complex dynamic software structures, which are used to monitor anything from simple personal computers to robust multiprocessor systems or clusters, even with multiple hard disks and storage partitions.
– It distributes the monitoring activities among a set of sub-monitoring instances, one for each kind of computational resource to monitor (processor, memory, software, network and storage).
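The idea of splitting monitoring into one sub-monitoring instance per kind of resource can be sketched as below. This is a rough illustration under stated assumptions, not the SBLOMARS code: the `SubMonitor` and `PrincipalAgent` classes are hypothetical, and no SNMP traffic is generated. The `poll_*` functions use the Python standard library on the local machine as a stand-in for the SNMP queries a real agent would issue.

```python
import os
import shutil
import time


class SubMonitor:
    """One sub-monitoring instance dedicated to a single kind of resource."""

    def __init__(self, kind, poll):
        self.kind = kind
        self.poll = poll          # callable standing in for an SNMP query
        self.history = []         # (timestamp, sample) pairs

    def sample(self):
        reading = self.poll()
        self.history.append((time.time(), reading))
        return reading


class PrincipalAgent:
    """Hypothetical deployer that owns one SubMonitor per resource kind."""

    def __init__(self):
        self.monitors = {}

    def deploy(self, kind, poll):
        self.monitors[kind] = SubMonitor(kind, poll)

    def collect(self):
        return {kind: m.sample() for kind, m in self.monitors.items()}


def poll_storage(path="."):
    # Local stand-in: shutil.disk_usage replaces the SNMP request here.
    usage = shutil.disk_usage(path)
    percent = usage.used * 100 // usage.total if usage.total else 0
    return {"space_total": usage.total, "space_available": usage.free,
            "space_used": usage.used, "space_used_percent": percent}


def poll_processor():
    # Local stand-in: only reports the number of logical CPUs.
    return {"logical_cpus": os.cpu_count()}


agent = PrincipalAgent()
agent.deploy("storage", poll_storage)
agent.deploy("processor", poll_processor)
report = agent.collect()
print(sorted(report))  # ['processor', 'storage']
```

Keeping each resource kind behind its own poller means a new kind (memory, network, software) only needs another `deploy` call with a matching poll function.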

SLIDE 16

SBLOMARS Monitoring Agents Approach (I)

Architecture:

[Architecture diagram with numbered interactions (1 to 8). Components shown: Administrators, Principal Agent Deployer, Resource Discovery, Network Map, Resource Sub-Monitoring Agents, Dynamic Software Structures, Real-Time Reporter, Statistical Reporter, XML Monitoring Service.]

Dynamic Software Structures

[0] “Number of Element”
    [0] Element ID (String)
    [1] Total Space Available (Long – Bytes)
    [2] Total Space Free (Long – Bytes)
    [3] Total Space Used (Long – Bytes)
    [4] Total Space Used Percentage (Long – Percent)
    [5] … (Available when it could be required)
…
[N] (One entry per resource element that exists on the monitored node)

XML Monitoring Service (excerpt as shown on the slide)

<Monitoring_Storage_Available_Information>
  <Device_Type>Storage</Device_Type>
  <Storage_Device>
    <Label>C:\ </Label>
    <Space_Total>21476171776</Space_Total>
    <Space_Available>6833168384</Space_Available>
    <Space_Used>14643003392</Space_Used>
    <Space_Used_Percent>68</Space_Used_Percent>

SLIDE 17

SBLOMARS Monitoring Agents Approach (I)

Data Structures:

Dynamic Software Structures - “Storage”

[0] “Number of Element”
    [0] Element ID (String)
    [1] Total Space Available (Long – Bytes)
    [2] Total Space Free (Long – Bytes)
    [3] Total Space Used (Long – Bytes)
    [4] Total Space Used Percentage (Long – Percent)
    [5] … (Available when it could be required)
    …
[1] “Number of Element”
    [0] Element ID (String)
    [1] Total Space Available (Long – Bytes)
    [2] Total Space Free (Long – Bytes)
    [3] Total Space Used (Long – Bytes)
    [4] Total Space Used Percentage (Long – Percent)
    [5] … (Available when it could be required)
    …
…
[N] (One entry per resource element that exists on the monitored node)

XML Reports “Storage”

<?xml version="1.0" encoding="UTF-8" ?>
<!-- Edited with Agent BLOMERSXML v1.0 …
<!-- Monitoring Resources Service xmlns:xsi= …
<Monitoring_Storage_Available_Information>
  <Device_Type>Storage</Device_Type>
  <Number_of_Elements>3</Number_of_Elements>
  <Storage_Device>
    <Label>C:\ Label: Serial Number f010b634</Label>
    <Space_Total>21476171776</Space_Total>
    <Space_Available>6833168384</Space_Available>
    <Space_Used>14643003392</Space_Used>
    <Space_Used_Percent>68</Space_Used_Percent>
  </Storage_Device>
  <Storage_Device>
    <Label>G:\ Label:Disco local Serial Number 302e</Label>
    <Space_Total>10733957120</Space_Total>
    <Space_Available>3095842816</Space_Available>
    <Space_Used>7638114304</Space_Used>
    <Space_Used_Percent>71</Space_Used_Percent>
  </Storage_Device>
  <Storage_Device>
    <Label>H:\ Label:SHARED Serial Number 48f893</Label>
    <Space_Total>34290843648</Space_Total>
    <Space_Available>13172244480</Space_Available>
    <Space_Used>21118599168</Space_Used>
    <Space_Used_Percent>61</Space_Used_Percent>
  </Storage_Device>
</Monitoring_Storage_Available_Information>
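A consumer of the storage report could load it into a per-element structure like the one listed above. The sketch below is an assumption about how such a consumer might look, not part of SBLOMARS: it uses Python's standard `xml.etree.ElementTree`, the function name `parse_storage_report` and the dictionary keys are hypothetical, and the elided XML comments of the slide are left out of the embedded report.

```python
import xml.etree.ElementTree as ET

# Storage report from the slide (XML declaration and elided comments omitted).
REPORT = r"""<Monitoring_Storage_Available_Information>
  <Device_Type>Storage</Device_Type>
  <Number_of_Elements>3</Number_of_Elements>
  <Storage_Device>
    <Label>C:\ Label: Serial Number f010b634</Label>
    <Space_Total>21476171776</Space_Total>
    <Space_Available>6833168384</Space_Available>
    <Space_Used>14643003392</Space_Used>
    <Space_Used_Percent>68</Space_Used_Percent>
  </Storage_Device>
  <Storage_Device>
    <Label>G:\ Label:Disco local Serial Number 302e</Label>
    <Space_Total>10733957120</Space_Total>
    <Space_Available>3095842816</Space_Available>
    <Space_Used>7638114304</Space_Used>
    <Space_Used_Percent>71</Space_Used_Percent>
  </Storage_Device>
  <Storage_Device>
    <Label>H:\ Label:SHARED Serial Number 48f893</Label>
    <Space_Total>34290843648</Space_Total>
    <Space_Available>13172244480</Space_Available>
    <Space_Used>21118599168</Space_Used>
    <Space_Used_Percent>61</Space_Used_Percent>
  </Storage_Device>
</Monitoring_Storage_Available_Information>"""


def parse_storage_report(xml_text):
    """Return one dict per <Storage_Device> element (key names are assumptions)."""
    root = ET.fromstring(xml_text)
    devices = []
    for dev in root.findall("Storage_Device"):
        devices.append({
            "label": dev.findtext("Label"),
            "total": int(dev.findtext("Space_Total")),
            "available": int(dev.findtext("Space_Available")),
            "used": int(dev.findtext("Space_Used")),
            "used_percent": int(dev.findtext("Space_Used_Percent")),
        })
    return devices


devices = parse_storage_report(REPORT)
print([d["used_percent"] for d in devices])                  # [68, 71, 61]
# The reported percentages match used/total rounded down.
print([d["used"] * 100 // d["total"] for d in devices])      # [68, 71, 61]
```

In the sample report, available plus used space equals total space for each of the three devices, which gives a simple consistency check on incoming reports.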

SLIDE 18

SBLOMARS Monitoring Agents Approach (II)

Graphical User Interface:

[Screenshot: Grid 5000 CPU Resource Performance Monitoring by SBLOMARS]

SLIDE 19

Outline

  • Introduction
  • Grid Resource Management (GRM)
  • Policy-based Grid Management Architecture
  • SBLOMARS Monitoring Agents Approach
  • BLOMERS Heuristic Scheduler Approach
  • Grid5000 Experiments
  • Conclusions and Future Work
SLIDE 20

BLOMERS Heuristic Scheduler Approach (III)

Definition and Activities:

– BLOMERS: Balanced Load Multi-Constraint Resource Scheduler.
– This scheduler makes use of the statistical resource availability information generated by the SBLOMARS agents.
– The scheduling procedure examines the set of available resources, generates a number of candidates, evaluates the candidate resources to select the final subset to be allocated, and communicates the results.
– Every resource (not node) is assigned an ID, which is the reference used to generate new populations in our approach.
– The reference ID is taken in its binary representation to perform the “Mutation” and “Crossover” operations.

SLIDE 21

BLOMERS Heuristic Scheduler Approach (IV)

Pseudo Code:

CleaningBuffer (Pk);
Initialize (k, Pk);
Evaluate (Pk);
Do {
    Select_Resource_Candidates (Pk);
    Crossover (Pk);
    IF Evaluate (Pk+1) == Minimal Constraints
        Ends Do-While;
    ELSE
        Mutation (Pk+1);
        IF Evaluate (Pk+2) == Minimal Constraints
            Ends Do-While;
}
Deliver (k_solution);

Where:

(Pk) is the selected population (set of resources)
(k) is one kind of resource (memory, storage, etc.)
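The loop in the pseudo code can be sketched in Python as follows. This is a loose illustration under several assumptions, not the BLOMERS implementation: resources are represented by integer IDs, the `availability` table and its values are hypothetical, "Minimal Constraints" is reduced to a single threshold, the loop is bounded by `max_generations`, and the sketch returns early when the initial population already contains a suitable resource.

```python
import random

BITS = 10  # chromosome length, matching the 10-bit IDs used in the slides' examples


def evaluate(population, availability, minimum):
    """Return the resource IDs in the population that meet the minimal constraint."""
    return [rid for rid in population if availability.get(rid, 0) >= minimum]


def crossover(a, b, point=8):
    """Single-point crossover: swap the bits after `point` (counted from the left)."""
    low = (1 << (BITS - point)) - 1
    return (a & ~low) | (b & low), (b & ~low) | (a & low)


def mutate(rid, rng):
    """Flip one randomly chosen bit of the ID."""
    return rid ^ (1 << rng.randrange(BITS))


def schedule(availability, minimum, rng, size=4, max_generations=200):
    """Assumes at least two known resource IDs in `availability`."""
    ids = sorted(availability)
    population = rng.sample(ids, min(size, len(ids)))      # Initialize(k, Pk)
    hits = evaluate(population, availability, minimum)     # Evaluate(Pk)
    if hits:
        return hits[0]
    for _ in range(max_generations):                        # Do { ... }
        a, b = rng.sample(population, 2)                    # Select_Resource_Candidates
        children = list(crossover(a, b))                    # Crossover(Pk)
        hits = evaluate(children, availability, minimum)    # Evaluate(Pk+1)
        if hits:
            return hits[0]
        mutants = [mutate(c, rng) for c in children]        # Mutation(Pk+1)
        hits = evaluate(mutants, availability, minimum)     # Evaluate(Pk+2)
        if hits:
            return hits[0]
        population = mutants + population[: len(population) - 2]
    return None                                             # no solution within the budget


# Hypothetical free capacity per resource ID.
availability = {24: 100, 25: 900, 48: 300, 49: 50, 57: 1200}
best = schedule(availability, minimum=1000, rng=random.Random(7))
print(best)
```

IDs produced by crossover or mutation that do not correspond to a known resource are treated as having no availability, so they can never be delivered as a solution.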

SLIDE 22

BLOMERS Heuristic Scheduler Approach (V)

Crossover:

CHROMOSOME
Parent 1 = 0000 0110 00 (Storage_0 ID_24: 147.83.106.199:6400)
Parent 2 = 0000 1100 01 (Storage_1 ID_45: 147.83.106.167:6401)
Child 1 = 0000 0110 01 (Storage_1 ID_25: 147.83.106.199:6401)
Child 2 = 0000 1100 00 (Storage_0 ID_44: 147.83.106.167:6400)

Mutation:

CHROMOSOME
Parent 1 = 0000 0110 01 (Storage_1 ID_25: 147.83.106.199:6401)
Child 1 = 0000 1110 01 (Storage_N ID_57: 147.83.206.199:6401)
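The two operators can be reproduced on the bit patterns of this slide with a few lines of Python. The crossover point (after the eighth bit) and the mutated position (the fifth bit from the left) are inferred from the patterns shown, not stated on the slide, and the function names are hypothetical. Note that the bit strings of Parent 2 and Child 2 evaluate to 49 and 48, while the slide labels them ID_45 and ID_44, so the sketch works from the bit patterns only.

```python
def crossover(parent1, parent2, point):
    """Single-point crossover on equal-length bit strings."""
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]


def mutate(bits, index):
    """Flip the bit at `index` (0 is the leftmost bit)."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]


parent1 = "0000011000"   # 0000 0110 00
parent2 = "0000110001"   # 0000 1100 01
child1, child2 = crossover(parent1, parent2, point=8)
print(child1, int(child1, 2))   # 0000011001 25
print(child2, int(child2, 2))   # 0000110000 48

mutant = mutate(child1, index=4)
print(mutant, int(mutant, 2))   # 0000111001 57
```

In the slide's examples the two rightmost bits change together with the storage element and port (Storage_0 on 6400, Storage_1 on 6401), while the remaining bits change together with the node address.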

SLIDE 23

Outline

  • Introduction
  • Grid Resource Management (GRM)
  • Policy-based Grid Management Architecture
  • SBLOMARS Monitoring Agents Approach
  • BLOMERS Heuristic Scheduler Approach
  • Grid5000 Experiments
  • Conclusions and Future Work
SLIDE 24

SBLOMARS and BLOMERS Approach (I)

General Scenario:

[Network diagram. Private domains private_1.net, private_2.net, private_3.net, ..., private_n.net are connected through Inter-Domain Routers to a Public Network. SBLOMARS and BLOMERS labels are repeated across the domains, together with Data stores and POLICIES DB. Node and subnet addresses are shown in 192.168.x.x ranges, plus 128.40.42.180 and 128.40.38.94.]

SLIDE 25

Grid5000 Experiments (I)

SBLOMARS PERFORMANCE:

[Figure panels: Sixty Seconds CPU Usage; Twenty-four Hours CPU Usage; Forty-eight Hours Memory Usage. Annotation on the plots: 30 msec.]

SLIDE 26

Grid5000 Experiments (II)

SBLOMARS FLEXIBILITY:

[Figure: two sets of panels a), b), c), titled "Fix Timing between SNMP Traps" and "Auto-configuration between SNMP Traps". Some panels carry the annotation "Information is missing".]

SLIDE 27

Grid5000 Experiments (III)

SBLOMARS SCALABILITY:

[Figure: three plots against Number of Nodes (10 to 110): SBLOMARS Configuration Time, SBLOMARS Starting Time and SBLOMARS Responding Time. Two of the time axes are in seconds and one is in milliseconds.]

SLIDE 28

Grid5000 Experiments (IV)

– Grid5000: 4000 CPUs/cores on 10 sites, interconnected by 10 Gbit/s networks. Heterogeneous architectures. http://www.grid5000.fr
– A total of 115 nodes took part in this experiment.
– Every node was running a processor load generator application to simulate processor load.
– Each scheduling algorithm ran for 120 minutes (2 hrs), receiving 30 jobs every 60 seconds. A background process generator was also running on every node, at random, for the whole experiment (6 hrs).
– Round-robin: this algorithm schedules every job received to the next available node from a list of available nodes.
– Least Used: this algorithm schedules each job to the node that is least used on average.

http://nmg.upc.es/~emagana/sblomars/grid5000.html
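The two baseline policies used for comparison can be sketched as below. This is a minimal reading of the one-line descriptions on the slide, not the code used in the experiment: the function names, node names and load figures are hypothetical, and "least used" is interpreted here as the node with the lowest average load.

```python
from itertools import cycle


def round_robin(nodes):
    """Return an iterator that hands out nodes in list order, wrapping around."""
    return cycle(nodes)


def least_used(average_load):
    """Return the node whose average load is lowest."""
    return min(average_load, key=average_load.get)


nodes = ["node-1", "node-2", "node-3"]
rr = round_robin(nodes)
assignments = [next(rr) for _ in range(5)]
print(assignments)  # ['node-1', 'node-2', 'node-3', 'node-1', 'node-2']

average_load = {"node-1": 0.70, "node-2": 0.20, "node-3": 0.55}
print(least_used(average_load))  # node-2
```

Round-robin ignores the state of the nodes entirely, while the least-used policy depends on up-to-date load averages for every node.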

SLIDE 29

Grid5000 Experiments (V)

[Figure: Grid 5000 CPU Resource Performance Monitoring by SBLOMARS]

SLIDE 30

Outline

  • Introduction
  • Grid Resource Management (GRM)
  • Policy-based Grid Management Architecture
  • SBLOMARS Monitoring Agents Approach
  • BLOMERS Heuristic Scheduler Approach
  • Grid5000 Experiments
  • Conclusions and Future Work
SLIDE 31

Conclusions and Future Work

  • BLOMERS implements a Genetic Algorithm, which offers parallelism for multi-constraint service requests and avoids getting trapped in local minima.
  • It improves resource load balancing and reduces the makespan of the resulting schedules.
  • BLOMERS is a scalable system based on a distributed agent design, and its flexibility allows it to handle heterogeneous devices.
  • It is able to schedule large numbers of services in real scenarios, such as the Grid5000 testbed.
  • Current evaluation results do not yet include the effect of network latency and other communication impairments.
  • We are including network performance between end-to-end edge routers as an input parameter for our genetic algorithm. Resources with high latency or jitter will be taken into account.

SLIDE 32

Questions?

Laurent Lefèvre
http://perso.ens-lyon.fr/laurent.lefevre
laurent.lefevre@inria.fr

Edgar Magaña
http://nmg.upc.es/~emagana/
emagana@cisco.com
emagana@nmg.upc.edu

http://nmg.upc.es/~emagana/sblomars/grid5000.html