Building High-available systems - Chander Damodaran - Collabera



SLIDE 2

Building High-available systems

Chander Damodaran - Collabera

SLIDE 3

Agenda

  • Introduction
  • Key Concepts
  • Approach
  • Availability Index
  • Key HA Design Principles

SLIDE 4

Sample Business Scenarios

Scenarios:

  • IRCTC
  • RPO
  • Mid-size Company
  • GitHub

Root causes:

  • Single point of failure
  • System used beyond design limits
  • Software error
  • Human error
  • Wrong design assumptions

SLIDE 5

Key Concepts

  • Availability: a measure of how often or how long a service or a system component is available for use.

    Availability = Uptime / (Uptime + Downtime)

  • Reliability: a measure of fault avoidance.
  • Serviceability: a measure of how easily a system is serviced or repaired.
  • Disaster Recovery: the ability to continue with services in the case of major outages, often with reduced capabilities or performance.
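
The availability ratio on this slide can be worked through numerically. The following is a minimal sketch, not part of the original slides: the function names `availability` and `allowed_downtime` and the sample figures are hypothetical, chosen only to illustrate the arithmetic.

```python
def availability(uptime: float, downtime: float) -> float:
    """Return uptime / (uptime + downtime), the ratio from the Key Concepts slide."""
    total = uptime + downtime
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime / total


def allowed_downtime(target: float, period_hours: float) -> float:
    """Invert the ratio: hours of downtime a target availability permits in a period."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be between 0 and 1")
    return (1.0 - target) * period_hours


if __name__ == "__main__":
    # Hypothetical sample: 719 hours up and 1 hour down in a 720-hour (30-day) month.
    print(f"availability: {availability(719, 1):.4%}")
    # Hypothetical target: 99.9% over a 365-day year (8760 hours).
    print(f"downtime budget: {allowed_downtime(0.999, 8760):.2f} hours")
```

Both units must match (hours in this sketch); the ratio itself is unitless.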

SLIDE 6

Approach

  • List vulnerabilities
  • Evaluate scenarios, and determine their probability
  • Map scenarios to requirements
  • Design solution
  • Review the solution, and check its behaviour against failure scenarios

VULNERABILITY     | LIKELIHOOD (1-5) | IMPACT (1-5) | LEVEL OF CONCERN | SOLUTION
------------------|------------------|--------------|------------------|--------------------------------------------
Failed disk       | 5                | 1            | 5                | Implement mirrored disks
Application crash | 5                | 4            | 20               | Distributed application, failover, clustering
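
The two table rows suggest that the level of concern is the product of likelihood and impact (5 × 1 = 5 and 5 × 4 = 20). The slides do not state the formula, so treat it as an assumption. Under that assumption, scoring and ranking vulnerabilities might be sketched as follows; the `Vulnerability` class and the `rank` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vulnerability:
    name: str
    likelihood: int  # 1-5 scale, as in the table
    impact: int      # 1-5 scale, as in the table
    solution: str

    def __post_init__(self) -> None:
        # Reject scores outside the 1-5 range used on the slide.
        for field_name in ("likelihood", "impact"):
            value = getattr(self, field_name)
            if not 1 <= value <= 5:
                raise ValueError(f"{field_name} must be in 1..5, got {value}")

    @property
    def level_of_concern(self) -> int:
        # Assumption: concern = likelihood * impact, inferred from the table rows.
        return self.likelihood * self.impact


def rank(vulnerabilities):
    """Return the vulnerabilities sorted by level of concern, highest first."""
    return sorted(vulnerabilities, key=lambda v: v.level_of_concern, reverse=True)


if __name__ == "__main__":
    rows = [
        Vulnerability("Failed disk", 5, 1, "Implement mirrored disks"),
        Vulnerability("Application crash", 5, 4,
                      "Distributed application, failover, clustering"),
    ]
    for v in rank(rows):
        print(f"{v.level_of_concern:>2}  {v.name}: {v.solution}")
```

Sorting by the score puts the application crash ahead of the failed disk, which matches the priority the table implies.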

SLIDE 7

Availability Index

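Figure: levels of the Availability Index, plotted as availability (vertical axis) against investment (horizontal axis). Levels, from top to bottom:

  • Disaster Recovery
  • Replication
  • Failovers
  • Services and Applications
  • Client Management
  • Local Environment
  • Networking
  • Disk and Volume Management
  • Reliable Backups
  • Good System Administration Practices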

*Blueprints for High Availability

SLIDE 8

Components, failures & protection mechanisms

Component category         | Typical failure             | Fault protection
---------------------------|-----------------------------|--------------------------------------------------------------
User environment           | Data deletion or corruption | Disaster-recovery processes
Administration environment | Data deletion or corruption | Disaster-recovery processes
Application                | Crashes, data corruption    | Distributed application, failover, clustering
Middleware                 | Crashes, memory leaks       | Clustering
(Network) infrastructure   | Connection loss             | Independent high-availability architecture
Operating system           | Crash, device driver errors | Clustering
Hardware                   | Device defect               | Redundant components, hot-spare disks, maintenance contracts
Physical environment       | Power outage, fire, floods  | UPS, backup data center

SLIDE 9

Key High Availability Design Principles

  • Assume Nothing
  • Remove Single Points of Failure (SPOFs)
  • Plan Ahead & Design for Growth
  • One Problem, One Solution
  • Choose Mature, Reliable Hardware
  • Choose Mature Software
  • Learn from History
  • Separate Your Environments
  • Test Everything
  • Employ Service Level Agreements
  • Document Everything
  • Enforce Change Control
  • Watch Your Speed
  • Consolidate Your Servers
  • Enforce Security
  • Don’t Be Cheap

SLIDE 10

QUESTIONS?


ChanderD@Collabera.com
