1. A Group Membership Service for Large-Scale Grids*
Fernando Castor Filho 1,4, Raphael Y. Camargo 2, Fabio Kon 3, and Augusta Marques 4
1 Informatics Center, Federal University of Pernambuco
2 School of Arts, Sciences, and Humanities, University of São Paulo
3 Department of Computer Science, University of São Paulo
4 Department of Computing and Systems, University of Pernambuco
*Supported by CNPq/Brazil, grants #481147/2007-1 and #550895/2007-8

2. Faults in Grids
- Important problem
  - Wastes computing and network resources
  - Wastes time (resources might need to be reserved again)
- Scale worsens matters
  - Failures become common events
- Opportunistic grids
  - Shared grid infrastructure
  - Nodes leave/fail frequently
- Fault tolerance can allow for more efficient use of the grid

3. Achieving Fault Tolerance
- First step: detecting failures...
  - And then doing something about them
- Other grid nodes must also be aware
  - Otherwise, progress might be hindered
- More generally: each node should have an up-to-date view of group membership
  - In terms of correct and faulty processes

4. Requirements for Group Membership in Grids
1. Scalability
2. Autonomy
3. Efficiency
4. Capacity to handle dynamism
5. Platform independence
6. Distribution (decentralization)
7. Ease of use

5. Our Proposal
- A group membership service that addresses the aforementioned requirements
  - Very lightweight
  - Assumes a crash-recovery fault model
  - Deployable on any platform that has an ANSI C compiler
- Leverages recent advances in:
  - Gossip/infection-style information dissemination
  - Accrual failure detectors

6. Gossip/Infection-Style Information Dissemination
- Based on the way infectious diseases spread
  - Or, alternatively, on how gossip spreads
- Periodically, each participant randomly infects some of its neighbors (see the sketch below)
  - Infects = passes information that (potentially) modifies the recipient's state
- Weakly consistent protocols
  - Sufficient for several practical applications
- Highly scalable and robust
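
A minimal sketch of one gossip round, in Lua like the rest of the system. The peer list and the send function below are assumptions made for illustration; they are not part of the service's actual API.

```lua
-- One gossip round: "infect" a few randomly chosen neighbors with our state.
-- `peers` is a flat list of peer addresses; `send(peer, state)` stands in
-- for the real transport. Both are placeholders for illustration.
local function gossip_round(self_state, peers, fanout, send)
  if #peers == 0 then return end
  for _ = 1, fanout do
    local peer = peers[math.random(#peers)]  -- random neighbor (repeats possible)
    send(peer, self_state)                   -- pass state that may modify the recipient
  end
end

-- Example: push a small state table to 3 random peers each round.
-- gossip_round({ alive = { "n1", "n2" } }, { "n2", "n3", "n4" }, 3, send)
```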

7. Accrual Failure Detectors
- Decouple monitoring and interpretation
- Output values on a continuous scale
  - Suspicion level (sketch below)
- Eventually strongly accurate failure detectors
- Heartbeat interarrival times define a probability distribution function
- Several thresholds can be set
  - Each triggering different actions
- As good as "regular" adaptive FDs
  - More flexible and easier to use
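
A sketch of an accrual-style detector. The slides do not fix a particular distribution, so this assumes exponentially distributed interarrival times purely for illustration; the window size and names are likewise assumptions.

```lua
-- Accrual-style suspicion sketch: track the last m interarrival times and
-- turn "how late is the next heartbeat" into a value in [0, 1).
local AccrualFD = {}
AccrualFD.__index = AccrualFD

function AccrualFD.new(window_size)
  return setmetatable({
    window = {},            -- last `window_size` interarrival times (seconds)
    window_size = window_size,
    last_heartbeat = nil,   -- timestamp of the most recent heartbeat
  }, AccrualFD)
end

-- Record an incoming heartbeat and update the interarrival window.
function AccrualFD:heartbeat(now)
  if self.last_heartbeat then
    table.insert(self.window, now - self.last_heartbeat)
    if #self.window > self.window_size then
      table.remove(self.window, 1)       -- keep only the last m samples
    end
  end
  self.last_heartbeat = now
end

-- Suspicion level: probability that the process has failed, assuming an
-- exponential model of interarrival times (an assumption, see above).
function AccrualFD:suspicion(now)
  if #self.window == 0 then return 0 end
  local sum = 0
  for _, dt in ipairs(self.window) do sum = sum + dt end
  local mean = sum / #self.window
  return 1 - math.exp(-(now - self.last_heartbeat) / mean)
end
```

The value rises continuously toward 1 the longer the next heartbeat is overdue, which is what lets several thresholds (such as the 85%, 90%, and 95% mentioned later) trigger different actions.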

8. Architecture of the Group Membership Service
[Architecture diagram: each node (Node1-Node4) runs an instance of the service; the instance comprises a Failure Detector (accrual failure detector + monitor), Failure Handlers 1..N, Information Dissemination, and Membership Management, attached to the monitored process]
- Each computer runs an instance of the group membership service

9. Membership Management
- Handles membership requests
- Disseminates information about new members
  - Informs them about existing members
- Removes failed members from the group
- Failed processes can also rejoin
  - Epoch mechanism (sketch below)
  - Only 32 extra bits in each heartbeat message
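
A sketch of the epoch mechanism. The slides only state that it costs 32 extra bits per heartbeat; the message layout and function names below are assumptions for illustration.

```lua
-- Epoch-based rejoin sketch: a process that rejoins after a crash bumps its
-- epoch, so receivers can tell a fresh incarnation from stale information.
local function make_heartbeat(node_id, epoch, members)
  return { id = node_id, epoch = epoch, members = members }  -- epoch: 32-bit counter
end

-- Receiver side: `view[id] = { epoch = ..., alive = ... }`
local function on_heartbeat(view, hb)
  local entry = view[hb.id]
  if entry == nil or hb.epoch > entry.epoch then
    view[hb.id] = { epoch = hb.epoch, alive = true }  -- new member or new incarnation
  elseif hb.epoch == entry.epoch then
    entry.alive = true                                -- same incarnation, still alive
  end
  -- Heartbeats carrying an older epoch are ignored as stale.
end
```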

10. Failure Detector
- Collects data about k processes
  - Push heartbeats
  - Gossiped periodically (every T_hb)
  - If p1 monitors p2, then there is a TCP connection between them
- Accrual failure detector
  - Keeps track of the last m interarrival times for a given process
  - Derives a probability that a process has failed
  - Calculation is performed in O(log|S|) steps (sketch below)
[Figure: suspicion level scale from 10% to 100%]
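
The slides state only the O(log|S|) bound, not the calculation itself. One method that fits the bound, shown here purely as an assumption, is a binary search over the sorted window of the last m interarrival times, i.e. an empirical CDF lookup.

```lua
-- Illustrative O(log m) suspicion lookup (assumed method, not the paper's).
-- `sorted_window` holds the last m interarrival times in ascending order;
-- `elapsed` is the time since the last heartbeat was received.
local function empirical_suspicion(sorted_window, elapsed)
  if #sorted_window == 0 then return 0 end
  local lo, hi = 1, #sorted_window + 1
  while lo < hi do                          -- binary search: O(log m) steps
    local mid = math.floor((lo + hi) / 2)
    if sorted_window[mid] < elapsed then
      lo = mid + 1
    else
      hi = mid
    end
  end
  -- Fraction of past interarrival times shorter than the current silence.
  return (lo - 1) / #sorted_window          -- e.g. 0.95 -> 95% suspicion
end
```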

11. Collecting Enough Information
- Adaptive FDs need to receive information about monitored processes regularly
  - Also applies to accrual FDs
- Traditional gossip protocols are not regular
- Solution: persistent monitoring relationships between processes
  - Established randomly (sketch below)
  - Exhibit the desired properties of gossip protocols
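
A sketch of establishing persistent monitoring relationships at random, which is all the slides specify; the selection code below is an illustration, not the service's actual logic.

```lua
-- Pick k distinct monitoring targets at random and keep them fixed, so each
-- monitor receives heartbeats from the same processes at regular intervals.
local function choose_monitoring_targets(self_id, all_members, k)
  local candidates = {}
  for _, id in ipairs(all_members) do
    if id ~= self_id then table.insert(candidates, id) end
  end
  for i = #candidates, 2, -1 do             -- Fisher-Yates shuffle
    local j = math.random(i)
    candidates[i], candidates[j] = candidates[j], candidates[i]
  end
  local targets = {}
  for i = 1, math.min(k, #candidates) do targets[i] = candidates[i] end
  return targets   -- one persistent TCP connection would be kept per target
end
```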

12. Failure Handlers
- For each monitored process, a set of thresholds is set
  - For example: 85%, 90%, and 95%
- A handler is associated with each one (sketch below)
- Several handling strategies are possible
  - Each executed when the corresponding threshold is reached
- It is easy to define application-specific handlers
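
A sketch of per-process thresholds and handlers. The 85/90/95% values come from the slide; the specific actions attached to them are placeholders, not the service's built-in behavior.

```lua
-- Each threshold gets a handler; a handler fires once when its threshold
-- is first crossed. The actions below are placeholders for illustration.
local handlers = {
  { threshold = 0.85, fired = false, action = function(id) print("suspecting", id) end },
  { threshold = 0.90, fired = false, action = function(id) print("warning application about", id) end },
  { threshold = 0.95, fired = false, action = function(id) print("removing", id, "from the group") end },
}

-- Called whenever the suspicion level for `node_id` is re-evaluated.
local function check_thresholds(node_id, suspicion)
  for _, h in ipairs(handlers) do
    if suspicion >= h.threshold and not h.fired then
      h.fired = true
      h.action(node_id)
    end
  end
end
```

An application-specific handler would simply be an extra entry in this table.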

13. Information Dissemination
- Responsible for gossiping information
  - About failed nodes (specific messages)
    - Important for failure handling
  - About correct members (piggybacked in heartbeat messages)
- Dissemination speed is based on parameter j
  - j should be O(log(N)) (sketch below)
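
A sketch of choosing the fanout j from the group size. The slides only say j should be O(log(N)); the logarithm base and rounding below are assumptions.

```lua
-- Fanout (gossip targets per round) that grows logarithmically with N.
local function fanout_for(group_size)
  if group_size <= 1 then return 0 end
  return math.max(1, math.ceil(math.log(group_size)))  -- ceiling of ln(N)
end

-- fanout_for(140) == 5; the evaluation later in the slides used j = 6
-- for up to 140 processes.
```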

14. Implementation
- Written in Lua
  - Compact, efficient, extensible, and platform-independent
- The service is packaged as a reusable Lua module
- Uses a lightweight CORBA ORB (OiL) for IPC
  - Also written in Lua
- Approximately 80 KB of source code

15. Initial Evaluation
- Main goal: to assess scalability and resilience to failures
- 20-140 concurrent nodes
  - Distributed across three machines equipped with 1 GB RAM
  - 100 Mbps Fast Ethernet network
- Emulated WAN
  - latency = 500 ms and jitter = 250 ms
- Parameters: T_hb = 2 s, k = 4, j = 6

16. Initial Evaluation
- Two situations:
  - When no failures occur
    - 20, 40, 60, 80, 100, 120, 140 processes
  - When processes fail, including realistically large numbers of simultaneous failures
    - 140 processes
    - 10, 20, 30, and 40% of failures
- Number of messages sent per process as a measure of scalability

17. Scenario 1: No failures

18. Scenario 2: 10-40% of process failures
- No process became isolated
- Almost 95% were still monitored by at least k - 1 processes

19. Scenario 2: 40% of process failures

20. Concluding Remarks
- Main contribution: combining gossip-based information dissemination and accrual FDs
  - while guaranteeing that the AFD collects enough information;
  - scalably; and
  - in a timely and fault-tolerant way
- Ongoing work:
  - More experiments
  - Self-organization for better resilience and better scalability
  - Periodic dissemination of failure information

21. Thank You!
Contact: Fernando Castor, fcastor@acm.org
