
Building High-available systems Chander Damodaran - Collabera - PowerPoint PPT Presentation



  1. Building High-available systems Chander Damodaran - Collabera

  2. Agenda
     • Introduction
     • Key Concepts
     • Approach
     • Availability Index
     • Key HA Design Principles

  3. Sample Business Scenarios
     Scenarios: IRCTC, RPO, Mid-size Company, GitHub
     Root causes:
     • Single point of failure
     • System used beyond design limits
     • Software error
     • Human error
     • Wrong design assumptions

  4. Key Concepts
     • Availability: Availability is the measure of how often or how long a service or a system component is available for use.
       Availability = Uptime / (Uptime + Downtime)
     • Reliability: Reliability is the measure of fault avoidance.
     • Serviceability: Serviceability is a measurement that expresses how easily a system is serviced or repaired.
     • Disaster Recovery: Disaster recovery is the ability to continue with services in the case of major outages, often with reduced capabilities or performance.
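The availability ratio above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function names and the 365-day year are assumptions, and uptime and downtime only need to share a unit.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def availability(uptime: float, downtime: float) -> float:
    """Return uptime / (uptime + downtime); both arguments use the same unit."""
    total = uptime + downtime
    if total <= 0:
        raise ValueError("uptime + downtime must be positive")
    return uptime / total


def yearly_downtime_budget(target: float) -> float:
    """Return the seconds of downtime per year that a target availability allows."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("target availability must be between 0 and 1")
    return (1.0 - target) * SECONDS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical figures: 99 hours up, 1 hour down.
    print(f"availability: {availability(99.0, 1.0):.2%}")
    hours = yearly_downtime_budget(0.999) / 3600
    print(f"downtime budget at 99.9%: {hours:.2f} hours per year")
```

Turning a target percentage into a downtime budget is often the more useful direction, because it states the requirement in terms an operations team can plan against.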

  5. Approach
     • List vulnerabilities
     • Evaluate scenarios, and determine their probability
     • Map scenarios to requirements
     • Design solution
     • Review the solution, and check its behaviour against failure scenarios

     VULNERABILITY     | LIKELIHOOD (1-5) | IMPACT (1-5) | LEVEL OF CONCERN | SOLUTION
     Failed disk       | 5                | 1            | 5                | Implement mirrored disks
     Application crash | 5                | 4            | 20               | Distributed application, failover, clustering
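Both rows of the vulnerability table are consistent with level of concern = likelihood × impact (5 × 1 = 5 and 5 × 4 = 20). The slide does not state that formula, so the sketch below treats it as an assumption. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vulnerability:
    name: str
    likelihood: int  # 1-5, as on the slide
    impact: int      # 1-5, as on the slide
    solution: str

    def __post_init__(self) -> None:
        # Reject scores outside the 1-5 scale used in the table.
        for field_name in ("likelihood", "impact"):
            value = getattr(self, field_name)
            if not 1 <= value <= 5:
                raise ValueError(f"{field_name} must be in 1..5, got {value}")

    @property
    def level_of_concern(self) -> int:
        # Assumed scoring rule: likelihood multiplied by impact.
        return self.likelihood * self.impact


def rank(vulnerabilities: list[Vulnerability]) -> list[Vulnerability]:
    """Return the vulnerabilities ordered from highest to lowest concern."""
    return sorted(vulnerabilities, key=lambda v: v.level_of_concern, reverse=True)


# The two rows from the slide.
REGISTER = [
    Vulnerability("Failed disk", 5, 1, "Implement mirrored disks"),
    Vulnerability("Application crash", 5, 4,
                  "Distributed application, failover, clustering"),
]

if __name__ == "__main__":
    for v in rank(REGISTER):
        print(f"{v.level_of_concern:>2}  {v.name}: {v.solution}")
```

Ranking by the product puts the application crash ahead of the failed disk, which matches the order in which the table suggests investing in protection.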

  6. Availability Index
     [Figure: availability (vertical axis) plotted against investment (horizontal axis). Levels, top to bottom:]
     • Disaster Recovery
     • Replication
     • Failovers
     • Services and Applications
     • Client Management
     • Local Environment
     • Networking
     • Disk and Volume Management
     • Reliable Backups
     • Good System Administration Practices
     Source: Blueprints for High Availability

  7. Components, failures & protection mechanism

     Component category         | Typical failure              | Fault protection
     User environment           | Data deletion or corruption  | Disaster-recovery processes
     Administration environment | Data deletion or corruption  | Disaster-recovery processes
     Application                | Crashes, data corruption     | Distributed application, failover, clustering
     Middleware                 | Crashes, memory leaks        | Clustering
     (Network) infrastructure   | Connection loss              | Independent high-availability architecture
     Operating system           | Crash, device driver errors  | Clustering
     Hardware                   | Device defect                | Redundant components, hot-spare disks, maintenance contracts
     Physical environment       | Power outage, fire, floods   | UPS, backup data center
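Failover appears in the table as a protection against application crashes. A minimal sketch of the idea, assuming each replica is a plain callable that raises on failure, might look like this. All names are hypothetical, and a real cluster manager does considerably more (health checks, state replication, fencing).

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:  # any failure triggers failover to the next replica
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed: {errors!r}")


def crashed_primary(request: str) -> str:
    # Stand-in for a crashed application instance.
    raise ConnectionError("primary is down")


def healthy_standby(request: str) -> str:
    # Stand-in for a working standby instance.
    return f"standby handled {request}"


if __name__ == "__main__":
    print(call_with_failover([crashed_primary, healthy_standby], "GET /status"))
```

The caller sees one logical service, so the crash of the primary turns into a slower response rather than an outage.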

  8. Key High Availability Design Principles
     • Assume Nothing
     • Remove Single Points of Failure (SPOFs)
     • Plan Ahead & Design for Growth
     • One Problem, One Solution
     • Choose Mature, Reliable Hardware
     • Choose Mature Software
     • Learn from History
     • Separate Your Environments
     • Test Everything
     • Employ Service Level Agreements
     • Document Everything
     • Enforce Change Control
     • Watch Your Speed
     • Consolidate Your Servers
     • Enforce Security
     • Don’t Be Cheap
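Some standard availability arithmetic shows why removing single points of failure matters. The sketch below is not taken from the deck and assumes that components fail independently, which real systems only approximate. The function names are hypothetical.

```python
import math
from typing import Iterable


def series_availability(components: Iterable[float]) -> float:
    """Availability of a chain in which every component must be up.

    Each component in the chain is a single point of failure.
    """
    return math.prod(components)


def redundant_availability(components: Iterable[float]) -> float:
    """Availability of a redundant group in which any one component suffices."""
    return 1.0 - math.prod(1.0 - a for a in components)


if __name__ == "__main__":
    # Hypothetical figures: two components at 99% availability each.
    print(f"chained:   {series_availability([0.99, 0.99]):.4%}")
    print(f"redundant: {redundant_availability([0.99, 0.99]):.4%}")
```

Under the independence assumption, chaining components lowers availability below that of the weakest part, while duplicating a component raises it above that of either copy.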

  9. QUESTIONS? ChanderD@Collabera.com
