Department of C S E Grid Computing

  • R. K. Ghosh

Cutting Edge, April, 2005 1 of 31

Grid Computing: Research Issues and Challenges

  • R. K. Ghosh

Dept of CSE, IIT Kanpur

Export Restriction by US

  • Computer exports from the US to India, China, Russia and the Middle East are restricted based on MTOPS (millions of theoretical operations per second).
  • Before 2001 – 28,000 MTOPS –> less powerful than a cluster of 10 1.5 GHz/2-way PCs.
  • 2001 – 85,000 MTOPS –> less powerful than a cluster of 10 2.2 GHz/4-way PCs.
  • 2002 – 195,000 MTOPS –> less powerful than a cluster of 10 3 GHz/8-way PCs.

(source: Xiaodong Zhang, NSF.)

Inadequacy of Client/Server

  • 2×10^18 bytes/year generated on the Internet.
  • But only 3×10^12 bytes/year available to the public (0.00015%).
  • Google searches only 1.3×10^8 Web pages.

(source: Gong, IEEE Internet Computing, 2001.)
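
The quoted percentage follows directly from the two byte counts. A quick check in Python, where the variable and function names are illustrative and not taken from the slides:

```python
# Figures as quoted on the slide (Gong, 2001).
generated_bytes_per_year = 2e18  # bytes generated on the Internet per year
public_bytes_per_year = 3e12     # bytes available to the public per year


def public_share_percent(public: float, generated: float) -> float:
    """Return the publicly available share of the data as a percentage."""
    return public / generated * 100.0


share = public_share_percent(public_bytes_per_year, generated_bytes_per_year)
print(f"{share:.5f}%")
```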

Inadequacy of Client/Server

  • Asymmetric utilization of services and bandwidth.
    – Clients have mainly passive roles –> computing cycles are unutilized.
    – Servers (popular ones) suffer from traffic congestion.

Characteristics of P2P Model

  • Nodes can leave and join at any time.
  • Heterogeneity: service capabilities, storage, network speed, service demand.
  • A decentralized system with equal opportunities for all participating nodes.

P2P Model

  • Client server.
  • Pure P2P system
  • Hybrid P2P (directory on top of pure P2P)
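
In a hybrid P2P system a directory tells peers where a resource lives, while the exchange itself stays peer to peer. A minimal sketch of such a directory, assuming a hypothetical in-memory `Directory` class and made-up peer and file names (none of these come from the slides):

```python
from collections import defaultdict


class Directory:
    """Hypothetical central index: maps a resource name to the peers holding it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer_id, resources):
        """Record that `peer_id` offers every resource in `resources`."""
        for name in resources:
            self._index[name].add(peer_id)

    def unregister(self, peer_id):
        """Peers may leave at any time, so drop the peer from every entry."""
        for holders in self._index.values():
            holders.discard(peer_id)

    def lookup(self, name):
        """Return the peers currently offering `name`, in sorted order."""
        return sorted(self._index.get(name, set()))


directory = Directory()
directory.register("peer-a", ["paper.pdf", "data.csv"])
directory.register("peer-b", ["paper.pdf"])
directory.unregister("peer-a")  # peer-a leaves the network
print(directory.lookup("paper.pdf"))
```

A pure P2P system would have no such index and would have to discover holders by querying neighbouring peers instead.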

Problems in P2P Computing

  • Security and privacy
    – Information leakage, malicious code and viruses, privacy protection (loss of anonymity).
  • Weak resource coordination
    – Unbalanced load due to weak or no coordination.
    – Lacks a communication/schedule monitor –> traffic congestion.
    – Relies on self-organization.

Ideal P2P Model

  • Fast peer service
    – Low-diameter region for peer-to-peer interaction.
    – Dynamically identifying and collecting trusted peers.
    – Adaptive self-organized coordination.
  • Allowing distrustful peers to exist
    – DoS attacks, malicious code and viruses, intrusion detection.
    – Exposing identity of peers (communication anonymity).
  • Measurable security metrics
    – Benchmarks, stochastic models, quantifying degree of security.

Ideal P2P Model

  • Understanding tradeoffs
    – Impact of loss of central control over security.
    – Quantifying security loss and performance loss/gain due to decentralization.
    – Conflict of common and individual objectives.
  • Building over existing infrastructures
    – Minimizing new standards and protocols.
    – Avoid modifying commonly used and general-purpose software.
    – Peer-oriented processing should be automatic.

Applications of P2P

  • Document/file sharing: with no or limited central control.
  • Instant messaging: immediate voice and file exchange among peers.
  • Distributed processing: use resources available in other remote peers.

Application Differences: Grid & P2P

  • Grid: a global problem-solving environment for large and critical scientific applications and professional collaborations, where each node is a server.
  • P2P: general and commercial information/computing services, where each peer can be both server and client.

Operation Differences: Grid & P2P

  • Grid: direct access to computing, software, and data resources in remote and targeted sites (server-based).
  • P2P: random access to available computing, software, and data resources without a specific target (client-based).

Different Participants: Grid & P2P

  • Grid: pre-determined and registered clients and servers.
  • P2P: clients and servers are neither distinguished nor registered, and can come and go as they choose.

Different QoS: Grid & P2P

  • Grid: guaranteed and reliable services are required of each grid server.
  • P2P: only partially reliable, because services from some peers are neither guaranteed nor trusted.

Security Differences: Grid & P2P

  • Grid: authentication, authorization, and firewall protection for each grid.
  • P2P: privacy, anonymity, authentication, authorization, and firewall protection are not guaranteed for each peer.

Different Controls: Grid & P2P

  • Grid: centralized control plays an important role in resource monitoring/allocation and job scheduling.
  • P2P: limited or no central control; relies mainly on self-organization.

Grid Computing

  • The term was coined around 1995 to denote a (proposed) distributed computing infrastructure for science and engineering.
  • Later extended to commercial computing applications.
  • Dynamically links resources together for execution of large-scale, resource-intensive distributed applications.
  • Integrates networking, communication, computation and information into a virtual platform for computation and data management.

Grid Computing

[Figure: a grid linking imaging instruments, data acquisition, large-scale databases, computational resources, data analysis, advanced visualization, equipment, video terminals and graphics terminals.]

Grid Computing

  • Similar to a utility grid.
  • Seeks to add, and is capable of adding, an infinite number of computing devices.
  • Capabilities can be added within the operational environment.
  • Collaboration at the global level –> huge talent pool.
  • Takes distributed computing to the next evolutionary level.
  • Creates the illusion of a simple but large self-managing virtual computer.

Grid Computing

  • A complicated global computing environment that leverages many open standards and technologies in a wide variety of implementation schemes.
    – UDDI, XML, SOAP, HTTP, WSDL, WSFL
    – Globus, Linux, Java

Grid Computing

  • Ubiquitous platform, so far as usability scenarios and virtual organizations indicate.
  • Virtual organizations
    – Financial forecasting models (e.g. deciding on a new factory location)
    – Feasibility studies (e.g. multi-disciplinary simulation of aircraft)
    – Crisis management (e.g. mitigation of chemical spills)
    – Data grid (e.g. high energy physics - 178,368 petabytes of data)
    – Internet games (e.g. virtual world - adding to population)
    – Impact of drugs on performance of the brain (low-level chemical simulation across different databases)

Grid Candidates

  • A cluster system on a local area network –> just a resource.
    – It has centralized control over the hosts that it manages.
  • Web Services are a generic solution for interoperability over a distributed environment (the Internet).

Web Service & Grid Computing

  • Grid –> extension of Web Services (WS) to solve computing problems in scientific and business domains.
  • OGSA (Open Grid Services Architecture) is a distributed interaction and computing architecture.
    – Leverages WS to define WSDL interfaces for Grid services.
    – Assures interoperability on heterogeneous systems based around Grid services.

Grid Candidates

  • Multi-site schedulers can reasonably be called (first-generation) Grids.
  • Distributed computing systems provided by Condor, Entropia, and United Devices, which harness idle desktops.
  • Peer-to-peer systems (such as Gnutella), which support file sharing among participating peers.
  • A federated deployment of the Storage Resource Broker, which supports distributed access to data resources.
  • The protocols used in these systems are too specialized, though each integrates distributed resources in the absence of centralized control and delivers interesting qualities of service in narrow domains.

Grid Solution Sphere

[Figure: grid solution sphere. Logical grid (business process sharing, application sharing; dynamic configuration) and physical grid (resource sharing; predefined configuration).]

  • Logical grid refers to software and application sharing as well as business process sharing. Configured dynamically.
  • Physical grid refers to computing power shared over a distributed network for a specific task –> predefined.

BPO: Logical Grid

[Figure: BPO on a logical grid. Labels: Enterprise A, Enterprise B, partners, suppliers, grid portal, Web services, UDDI registry, local grids, business processes, SOAP/XML transport; process labels "power", "PO", "credit", "creation", "checking".]

Remote Execution

  • The easiest use of grid computing is to run an existing application on a different machine.
  • Prerequisites for this:
    – The application must be executable remotely without undue overhead.
    – The remote machine must meet all special hardware, software or other resource requirements.
  • Using a remote machine for word processing (interactive jobs) does not make sense, but for batch jobs it is fine.
  • There are under-utilized computing resources (desktops are busy only 5% of the time). Grid computing can make use of these.
  • A grid can make data highly available. Most computers have a lot of storage, so data can be replicated.
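
The prerequisites for remote execution can be read as a simple admission check. A minimal sketch, assuming hypothetical `Job` and `Machine` records and made-up capability names (none of these come from the slides or from any grid toolkit):

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    interactive: bool                               # e.g. word processing
    requirements: set = field(default_factory=set)  # special h/w, s/w, licenses


@dataclass
class Machine:
    name: str
    capabilities: set = field(default_factory=set)


def can_run_remotely(job: Job, machine: Machine) -> bool:
    """Batch jobs qualify when the remote machine meets every requirement."""
    if job.interactive:
        return False  # interactive jobs are a poor fit for remote execution
    return job.requirements <= machine.capabilities  # subset test


node = Machine("desktop-17", {"linux", "fortran-compiler"})
batch = Job("simulation", interactive=False, requirements={"linux"})
editor = Job("word-processing", interactive=True)
print(can_run_remotely(batch, node), can_run_remotely(editor, node))
```

The check ignores overhead, which in practice would need a cost estimate for shipping the program and its data to the remote machine.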

Collaboration

  • Simplifies collaboration.
  • Users can be organized dynamically into a number of VOs (virtual organizations), each having different policy requirements, but they share resources collectively.
  • Sharing can be in data (files, databases), by replication, striping etc.
  • Equipment, software, services and licenses can all be shared.
  • But sharing calls for strong security rules.
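
Replication and striping, the two data-sharing schemes named above, place data on nodes differently. A minimal sketch with hypothetical helper functions, where each node is modelled as a plain list of byte chunks:

```python
def replicate(data, nodes):
    """Replication: every node holds a full copy of the data."""
    return [[data] for _ in range(nodes)]


def stripe(data, nodes, chunk_size):
    """Striping: fixed-size chunks are dealt to the nodes round-robin."""
    placement = [[] for _ in range(nodes)]
    for offset in range(0, len(data), chunk_size):
        node = (offset // chunk_size) % nodes
        placement[node].append(data[offset:offset + chunk_size])
    return placement


print(replicate(b"abcdefgh", 2))
print(stripe(b"abcdefgh", 3, 2))
```

Replication favours availability, since any node can serve the whole file. Striping spreads storage and lets chunks be read from several nodes at once, but needs every node to reassemble the data.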

Balanced Resource Utilization

  • Better balancing of resource utilization.

[Figure: jobs migrate from a high-load node to low-load nodes.]
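
The migration idea can be sketched as a loop that moves work from the busiest node to the idlest one. This is an illustrative toy, assuming load is counted in equal-sized job units and that `threshold` is a hypothetical tuning parameter; it is not a scheduler described in the slides.

```python
def migrate_from_overloaded(loads, threshold):
    """Move one unit of load at a time from the busiest node to the idlest.

    `loads` maps node name -> number of job units. Migration stops once no
    node is above `threshold` or a further move would not improve balance.
    """
    loads = dict(loads)  # work on a copy, leave the caller's dict untouched
    while True:
        busiest = max(loads, key=loads.get)
        idlest = min(loads, key=loads.get)
        if loads[busiest] <= threshold or loads[busiest] - loads[idlest] <= 1:
            return loads
        loads[busiest] -= 1
        loads[idlest] += 1


before = {"n1": 10, "n2": 1, "n3": 2, "n4": 1}
after = migrate_from_overloaded(before, threshold=4)
print(after)
```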

Parallel Computing

  • Parallel CPU capacity is another attractive feature.
  • Not all applications can be transformed to run in parallel.
  • The number of independently running parts into which an application can be split is the major difficulty.
    – Not every application can be transformed to run in parallel on a grid and achieve scalability.
    – There are no tools for transforming an arbitrary application to exploit the parallel capabilities of a grid.
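
One standard way to quantify this limit is Amdahl's law. The slides do not name it, so it is brought in here as an assumption: if only a fraction p of an application can run in parallel, the speedup on n nodes is 1 / ((1 - p) + p / n). A short sketch:

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Upper bound on speedup under Amdahl's law (ignores communication cost)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


# An application that is 90% parallelizable, run on growing grids.
for n in (1, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this model, adding nodes to a 90% parallel application never pushes the speedup past 10, which is why the number of independent parts matters more than the size of the grid.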

Reliability & Fault Tolerance

  • Redundancy in a conventional system must be built explicitly and is expensive (both hardware and software).
  • Inherent redundancy in a grid configuration allows fault tolerance and reliability without any extra cost.

[Figure: three copies of job A running on the grid.]
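
Redundant placement of a job can be sketched as failover across replicas. This is a simplified, sequential illustration: the replicas are plain Python callables standing in for grid nodes, and the function names are hypothetical. A real grid could instead run the copies concurrently.

```python
def run_redundantly(job, replicas):
    """Try the same job on each replica in turn and return the first result."""
    errors = []
    for node in replicas:
        try:
            return node(job)
        except Exception as exc:  # a failed node just means "try the next one"
            errors.append(exc)
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


def failing_node(job):
    raise ConnectionError("node unreachable")


def healthy_node(job):
    return f"{job} done"


result = run_redundantly("job A", [failing_node, healthy_node, healthy_node])
print(result)
```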

Grid Management

  • Controlled expenditure on computing resources over a larger organization.
  • Priorities among different projects can be better managed.
  • Aggregated utilization enhances the ability to anticipate future upgrades and eases maintenance (jobs can be rerouted away from sites under maintenance).
  • Autonomic computing tools (recovery from various grid outages and failures) can be deployed.

Application Development

  • Grid Application Development Software (GrADS) provides tools and an execution environment.
  • Ongoing since 1999. Participating universities: UC San Diego, U Tennessee Knoxville, UIUC, Univ of Houston.
  • The development framework has two distinct parts:
    – GrADS program preparation system (GrADS PPS).
    – GrADS execution environment (GrADS EE).

Development Framework

[Figure: GrADS development framework. Components shown: problem solving environment, application, software components, GrADS compiler, GrADS library, configurable object program, scheduler, service negotiator, binder, real-time monitor and GrADS runtime system. The program preparation environment and the execution environment are linked by negotiation and feedback.]

Challenges

  • Comprehensive administration.
  • Resource provisioning.
  • Adaptive application integration.
  • Flexible data sharing and access.
  • Activity monitoring.
  • Policy-based grid management mechanisms.

References

[1] F. Berman, et al. New Grid Scheduling and Rescheduling Methods in the GrADS Project. International Journal of Parallel Programming (to appear), 2005.
[2] X. Zhang. Research Issues for Building and Integrating Peer-Based and Grid Systems. http://www.nesc.ac.uk/talks/china meet/ zhang beijing talk.pdf
[3] Ian Foster, et al. The Anatomy of the Grid. International Journal of Supercomputer Applications, 2001.
[4] Jen-Yao Chung, Liang-Jie Zhang. Business Grid: Grid Computing Infrastructure for e-Business Solutions. OMG's 2nd Workshop on Web Services Modeling, Architectures, Infrastructures and Standards.
[5] Li Gong. A Software Architecture for Open Service Gateways. IEEE Internet Computing 5(1), 2001.