Resource and Application Models for Advanced Grid Schedulers


SLIDE 1

Resource and Application Models for Advanced Grid Schedulers

Aleksandar Lazarevic, Lionel Sacks

Dept. of Electrical and Electronic Engineering, University College London

SLIDE 2

May-04 v0.2

Problem

  • Heterogeneous and dispersed systems
  • Quest for effective scheduling technique
  • Good scheduling decisions depend on quality and availability of information
  • Importance of resource-efficient information dissemination.

SLIDE 3

Motivation - Scheduling

  • Scheduling on distributed, heterogeneous and dynamic Grid resources.
  • Current Schedulers
    • Queuing or Batch: NQE, PBS, LSF, LoadLeveler
    • Application Level: AppLeS, MARS, SEA, DOME
    • Dynamic, Ranking: Condor ClassAd language and matchmaker

SLIDE 4

Motivation - Info Distribution

  • Current Globus approach: centralized LDAP information provider (MDS).
  • Little research in alternatives: MDS works for current size of Grid clusters.
  • Centralized services are becoming a bottleneck
  • SMP or clusters as gateways to the Grid?

SLIDE 5

Bright Ideas - Scheduling

  • Advance reservation and partitioning of resources complex and wasteful.
  • Low-level scheduling in multitasking OS can distort machine loading info.
  • Decouple application load and node computational output
  • Assign jobs based on requested turnaround and unsubscribed capacity.

SLIDE 6

Subscribed Load Scheduling

(Figure: CPU Usage [%], with 100 marked, plotted against Time, with instant t marked. Labels: Proc 1 CPU Time; Proc 2 CPU Time; Proc 1 Requested T/T; Proc 1 Estimated T/T; Safety Mrg; Proc 1 Projected T/T @ t; Unsubscribed @ t.)
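One possible reading of the subscribed-load idea is sketched below in Python. It is an illustration only, not the authors' scheduler: the `Job` and `Node` classes, the 20% default safety margin, and the rule "share = padded CPU time / requested turnaround" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpu_seconds: float           # estimated CPU time if the job had the whole node
    requested_turnaround: float  # seconds the submitter is willing to wait (T/T)


def required_share(job: Job, safety_margin: float = 0.2) -> float:
    """CPU fraction to subscribe so the padded estimate fits the requested T/T.

    The formula is an assumption made for illustration.
    """
    return job.cpu_seconds * (1.0 + safety_margin) / job.requested_turnaround


class Node:
    """Hypothetical node that tracks subscribed CPU shares per job."""

    def __init__(self) -> None:
        self.subscribed: dict[str, float] = {}

    def unsubscribed(self) -> float:
        """Fraction of the CPU not yet promised to any job."""
        return 1.0 - sum(self.subscribed.values())

    def try_admit(self, job: Job, safety_margin: float = 0.2) -> bool:
        """Accept the job only if its share fits in the unsubscribed capacity."""
        share = required_share(job, safety_margin)
        if share <= self.unsubscribed() + 1e-9:
            self.subscribed[job.name] = share
            return True
        return False
```

In this sketch a job that asks for a longer turnaround subscribes a smaller share, so more jobs fit on the node at once; a shrinking safety margin has the same effect.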

SLIDE 7

Application & Node Profiles

  • Distinction between volatile and non-volatile resources.
  • Profiles in XML with modular matchmaker.
  • Nodes self-assess the level of fitness for a given request and return a Bid Value.
  • Monitoring and feedback improve confidence levels and reduce safety margins
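A minimal Python sketch of how a node might self-assess against an XML profile follows. The slides do not give a schema or a bid formula, so the element and attribute names (`static`, `volatile`, `unsubscribed_cpu`, `cpu_share`) and the rule "bid = spare CPU left after accepting the request" are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical profiles; the real schema is not shown in the slides.
NODE_XML = """
<node name="n1">
  <static os="linux" memory_mb="2048"/>
  <volatile unsubscribed_cpu="0.5"/>
</node>
"""

REQUEST_XML = """
<request os="linux" memory_mb="1024" cpu_share="0.25"/>
"""


def bid_value(node_xml: str, request_xml: str) -> Optional[float]:
    """Return a bid for the request, or None if the node cannot serve it."""
    node = ET.fromstring(node_xml)
    req = ET.fromstring(request_xml)
    static = node.find("static")
    volatile = node.find("volatile")

    # Non-volatile resources act as hard constraints.
    if static.get("os") != req.get("os"):
        return None
    if int(static.get("memory_mb")) < int(req.get("memory_mb")):
        return None

    # Volatile resources decide how attractive the node is right now.
    free = float(volatile.get("unsubscribed_cpu"))
    need = float(req.get("cpu_share"))
    if need > free:
        return None
    return free - need  # assumed rule: more spare capacity means a higher bid
```

A matchmaker could collect such bids from several nodes and pick the highest; swapping the last line for a different scoring rule is where a modular matchmaker would plug in.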

SLIDE 8

Bright Ideas - Information

  • Small-Worlds principle: information shared among several neighbours and few distant nodes.
  • Fuzzy picture of the Grid environment: enables "good" but not necessarily "best" decisions.
  • Gaining credibility, good resilience to random node failures
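The small-worlds sharing pattern can be illustrated with a short Python sketch. It assumes nodes are numbered around a ring, which the slides do not state; the function name `pick_peers` and the default counts of four neighbours and two distant nodes are likewise made up for the example.

```python
import random
from typing import List, Optional, Tuple


def pick_peers(
    node: int,
    n_nodes: int,
    near: int = 4,
    far: int = 2,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[int]]:
    """Choose peers to share state with: ring neighbours plus a few distant nodes.

    Assumes n_nodes is comfortably larger than near, so neighbours are distinct.
    """
    rng = rng or random.Random()
    half = near // 2

    # Several close neighbours on either side of the node (ring topology assumed).
    neighbours = [(node + d) % n_nodes for d in range(-half, half + 1) if d != 0]

    # A few randomly chosen distant nodes provide the long-range links.
    excluded = set(neighbours) | {node}
    candidates = [i for i in range(n_nodes) if i not in excluded]
    distant = rng.sample(candidates, min(far, len(candidates)))
    return neighbours, distant
```

Because each node talks to only a handful of peers, no node holds an exact global view, which matches the "fuzzy picture" described above; the random long-range links are what keep information spreading when individual nodes fail.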

SLIDE 9

Information Flows

  • Localised, need-to-know information flow policy
  • 3-Tier Information Flow:
    • Node Current State: low-latency, short shelf life
    • Volatile Resources State: self-organized, distributed, fuzzy
    • Accounting: centralized, reliable, accurate

(Figure: architecture diagram. Labels: Accounting & Management; Resource Discovery; Integrity, Intelligence & Information [I3]; Policy-based Management; Security Management; Self-Organised Res. Discovery; Resource Management; GRAM; Local Job Manager (Fork, PBS); Operating System; Ganglia; MDS; Policy Repository & Distribution; GSI - PKI Security Infrastructure; Monitoring; Control.)
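The differing "shelf life" of the three tiers can be sketched in Python as a per-tier freshness check. The tier keys and the shelf-life values (5 s, 60 s, no expiry) are assumptions chosen for illustration; the slides give no figures.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed shelf lives in seconds; None means the record never expires.
SHELF_LIFE = {
    "node_state": 5.0,   # node current state: short shelf life
    "volatile": 60.0,    # volatile resources state: fuzzy, tolerates staleness
    "accounting": None,  # accounting: kept reliably and centrally
}


@dataclass
class Record:
    tier: str
    value: object
    stamped_at: float  # time the record was produced, in seconds


def is_fresh(record: Record, now: float) -> bool:
    """Return True while the record is still within its tier's shelf life."""
    ttl: Optional[float] = SHELF_LIFE[record.tier]
    return ttl is None or (now - record.stamped_at) <= ttl
```

A scheduler built this way would discard stale node-state records quickly while still acting on older volatile-resource data, accepting a less precise picture in exchange for less traffic.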

SLIDE 10

Conclusions & Future Work

  • New approaches needed to handle dynamic and heterogeneous resource pool.
  • Reduce complexity and possible points of failure.
  • Develop a prototype meta-scheduler and test on 200 CPU UCL Grid