

SLIDE 1

Emerging trends for High Availability

Asim Zuberi

Senior Consultant, Collective Technologies

Ayaz Mudarris

Senior Consultant, Collective Technologies

SLIDE 2

Module 1: Concepts…

SLIDE 3

What is Downtime?

– If a user cannot get his job done on time, the system is down and downtime is incurred.

SLIDE 4

Causes of Downtime!

SLIDE 5

What is Availability?

A = MTBF / (MTBF + MTTR)

where:
  A    is the degree of availability, expressed as a percentage
  MTBF is the mean time between failures (uptime)
  MTTR is the mean time to repair (downtime)
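A quick numeric illustration of the formula; the MTBF/MTTR figures below are made up for the example, not taken from the presentation:

    # Availability from MTBF and MTTR; example figures are illustrative only.
    def availability(mtbf_hours, mttr_hours):
        return mtbf_hours / (mtbf_hours + mttr_hours)

    # Case I: driving MTTR toward zero pushes A toward 100%.
    print(availability(1000, 8.0))    # ~0.9921
    print(availability(1000, 0.5))    # ~0.9995
    # Case II: with a much larger MTBF, the same MTTR matters far less.
    print(availability(100000, 8.0))  # ~0.99992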

SLIDE 6

Availability Equation (A closer look)

Case I: As MTTR approaches zero, A increases toward 100%.

A = MTBF / (MTBF + MTTR)

SLIDE 7

Availability Equation (A closer look)

Case I: As MTTR approaches zero, A increases toward 100%.
Case II: As MTBF gets larger, MTTR has less impact on A.

A = MTBF / (MTBF + MTTR)

SLIDE 8

Increasing Availability

  • Key is obviously to minimize downtime
  • As downtime approaches zero, availability approaches 100%

[Chart: availability (%) plotted against downtime, rising toward 100% as downtime approaches zero]

SLIDE 9

The Rule of 9’s

% Uptime            Annual Downtime
99.00 (2 nines)     87 hours 36 minutes
99.50               43 hours 48 minutes
99.70               26 hours 17 minutes
99.80               17 hours 31 minutes
99.90 (3 nines)     8 hours 45 minutes
99.98               1 hour 45 minutes
99.99 (4 nines)     52.6 minutes
100                 0 hours
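These downtime figures follow directly from the uptime percentage; a quick sketch of the arithmetic (a 365-day year is assumed):

    # Annual downtime implied by an uptime percentage (365-day year assumed).
    def annual_downtime_minutes(uptime_pct):
        return (100.0 - uptime_pct) / 100.0 * 365 * 24 * 60

    for pct in (99.0, 99.5, 99.9, 99.99):
        hours, minutes = divmod(round(annual_downtime_minutes(pct)), 60)
        print(f"{pct}% uptime -> {hours} h {minutes} min of downtime per year")
    # 99.0% -> 87 h 36 min; 99.99% -> 0 h 53 min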

SLIDE 10

Why do you need Availability?

– Issues which have caused problems or concerns with computer availability…

  • Terrorist attacks
  • Satellite Outages
  • Attacks by computer viruses
  • Emergence of the Internet as a viable force
SLIDE 11

Levels of Availability

  • Level 1: Regular Availability (Do Nothing Special)
  • Level 2: Increased Availability (Protect the Data)
  • Level 3: High Availability (Protect the System)
  • Level 4: Disaster Recovery (Protect the Organization)

SLIDE 12

Twenty Key System Design Principles

20) Spend Money…but not blindly
19) Assume Nothing
18) Remove/Identify SPOFs
17) Maintain Tight Security
16) Consolidate Your Servers
15) Automate Common Tasks
14) Document Everything
13) Establish Service Level Agreements
12) Plan Ahead
11) Test Everything
10) Maintain Separate Environments
9) Invest in Failure Isolation
8) Examine the History of the System
7) Build for Growth
6) Choose Mature Software
5) Select Reliable and Serviceable Hardware
4) Reuse Configurations
3) Exploit External Resources
2) One Problem, One Solution
1) KISS: Keep It Simple, Stupid

SLIDE 13

End-to-end Availability Measurement

[Diagram: end-to-end availability (E-E-A) measured across the full stack: Application, Network Infrastructure, System Software, Operating System, Hardware]

SLIDE 14

Modeling Availability

  • Complex, as a system comprises many components
  • Most common techniques

– Monte Carlo simulation
– Markov techniques: basically state diagrams
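As a concrete illustration of the Monte Carlo approach, here is a minimal two-state (up/down) simulation; the MTBF and MTTR figures and the exponential distributions are assumptions for the example, not from the presentation:

    import random

    # Minimal Monte Carlo sketch of a two-state (up/down) availability model.
    def simulated_availability(mtbf=1000.0, mttr=4.0, horizon=1_000_000.0):
        t, up_time = 0.0, 0.0
        while t < horizon:
            up = random.expovariate(1.0 / mtbf)    # time until the next failure
            down = random.expovariate(1.0 / mttr)  # time spent repairing
            up_time += min(up, horizon - t)
            t += up + down
        return up_time / horizon

    print(simulated_availability())  # close to MTBF / (MTBF + MTTR) = 0.996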
SLIDE 15

State Diagram

SLIDE 16

What Does It Mean to Us?

How do you minimize downtime?

[Chart: recovery-time scale from 1 minute to 24 hours, with Clustering, Replication, Snapshot, Mirroring, and Backups placed along it; clustering recovers fastest, tape backups slowest]
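One way to read the chart is as a lookup from a recovery-time objective to candidate techniques; the pairings below are assumed, order-of-magnitude values rather than figures from the presentation:

    # Assumed, order-of-magnitude recovery times per technique (minutes).
    recovery_minutes = {
        "clustering": 1,
        "replication": 10,
        "snapshot": 60,
        "mirroring": 12 * 60,
        "tape backups": 24 * 60,
    }

    def candidates(rto_minutes):
        """Techniques whose typical recovery time fits within the RTO."""
        return [t for t, m in recovery_minutes.items() if m <= rto_minutes]

    print(candidates(60))   # ['clustering', 'replication', 'snapshot']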

SLIDE 17

Trinity of TTs

SLIDE 18

Module 2: Storage Area Networks…

SLIDE 19

Why Storage Area Networks?

  • Management of distant configurations.
  • Soft recabling.
  • Storage consolidation.
  • Heterogeneous connectivity.
  • Data sharing.
  • Massive configurations.
  • LAN-less and/or server-less backup.
SLIDE 20

Why Fibre Channel?

  • Reliable Communication

– Removes the performance barriers of legacy LANs.
– Support for other, typically "non-network" protocols, such as SCSI.

  • Low-latency message passing
  • High bandwidth transfer

– Connection-oriented and connectionless data delivery.
– Sustained data transfer rates of 90 MBps.
– Variable-length (0-2 KB) frames.
– Highly effective for protocol frames of less than 100 bytes, as well as for bulk data transfer.

  • Scalable networks.
SLIDE 21

SAN Components

  • Host Bus Adapter (HBA)
  • Channel
  • Switch/Hub/Bridge
SLIDE 22

HBA

  • Fibre Channel Cards

– Every device on the SAN, including HBAs, has a World Wide Name (WWN)
– 64-bit identifier assigned by the IEEE
– Similar to the way MAC addresses are assigned to Network Interface Cards (NICs)
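To make the MAC-address analogy concrete, a WWN is just a 64-bit value usually written as eight colon-separated bytes; the value below is hypothetical:

    # Format a (hypothetical) 64-bit WWN the way HBA tools usually display it.
    wwn = 0x10000000C9ABCDEF
    print(":".join(f"{(wwn >> shift) & 0xFF:02x}" for shift in range(56, -1, -8)))
    # -> 10:00:00:00:c9:ab:cd:ef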

  • Vendors

– JNI for Solaris
– Emulex for NT

SLIDE 23

Channel

  • Medium

– Copper: up to 30 m
– Fibre optics: multimode up to 500 m, single mode up to 10 km

  • Buffer-to-buffer copy
  • Transmission isolated from control
  • FC-0 through FC-4 (the Fibre Channel layer stack)
SLIDE 24

Topologies

  • Point-to-point

– Two Nodes

SLIDE 25

Topologies

  • Arbitrated loop

– 126 nodes
– Practically even less

[Diagram: FC-AL devices connected in a loop through a hub]

SLIDE 26

Topologies

  • Fabric

– 16 million nodes

[Diagram: fabric built from hubs, switches, an enterprise switch, and a bridge, connecting hosts to JBODs and storage arrays]

SLIDE 27

Switches/Hubs/Bridges

  • Workgroup switches

– 8 or 16 ports
– Redundant power supplies
– Hot-swappable GBICs

  • Enterprise switches

– 64 ports
– Everything redundant, everything hot swappable

  • Hubs

– Connect FC-AL devices to the FC-SW fabric

  • FC/SCSI bridges

– Reuse old JBODs or SCSI tape drives

SLIDE 28

NAS vs. SAN

  • NAS devices are storage appliances: big, single-purpose servers that you plug into your network.
  • These appliances perform one task, and they perform it well: they serve files very fast.
  • The difference between how a NAS appliance and a SAN function is subtle.
  • NAS is a defined product that sits between your application server and your file system.
  • SAN is a defined architecture that sits between your file system and your underlying physical storage.
  • NAS is network-centric.
  • A SAN is data-centric.
SLIDE 29

The Final Conflict

  • NAS appliances offer

– Performance and reliability at a low cost
– Excellent devices for collaboration and data storage, especially in heterogeneous computing environments
– Yet NAS appliances can send only files, not data blocks, which limits their ability

  • SAN promises to free your network of bottlenecks

– Traffic relief comes at a high price

SLIDE 30

Third level of High Availability

  • 85% of storage on Unix servers is unprotected!
  • RAID, replication, and snapshots can protect you when disaster strikes.
  • New emerging concepts

– Business Continuance Volumes (BCV)
– Shared Storage Option (SSO) / Smart Media
– SAN over WAN
– iSCSI

SLIDE 31

Business Continuance Volumes

[Diagram: typical BCV use cases]

  • Backup/Restore: high-speed, tapeless, offsite
  • Test Environment: software lifecycles, Y2K / Euro currency conversion
  • Decision Support: reporting, data warehouse

SLIDE 32

BCV

  • Sync-split-mount sequence

– Data flows directly from disk to internal cache and then to the BCV
– Works at the volume group level

  • Block-by-block copy
  • Only changed tracks are copied at the next sync
  • Instantaneous fallback
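A toy sketch of the resync idea (only blocks changed since the split are copied back to the copy); real arrays track dirty tracks internally, and everything below is simplified for illustration:

    # Toy model of BCV incremental resync: full copy at the split, then only
    # blocks dirtied on the production volume are copied at the next sync.
    class Volume:
        def __init__(self, blocks):
            self.blocks = list(blocks)

    def split(source):
        """Return a point-in-time copy and an (empty) dirty-block set."""
        return Volume(source.blocks), set()

    def resync(source, bcv, dirty):
        for i in dirty:                    # copy only the changed blocks
            bcv.blocks[i] = source.blocks[i]
        dirty.clear()

    prod = Volume(b"ABCDEFGH")
    bcv, dirty = split(prod)
    prod.blocks[2] = ord("x"); dirty.add(2)   # a write made after the split
    resync(prod, bcv, dirty)
    print(bytes(bcv.blocks))                  # b'ABxDEFGH'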
SLIDE 33

Sharing Tape Libraries

  • Tape drives are shared

– Heterogeneous connectivity
– Reduces cost
– Increases availability

[Diagram: Sun, NT, and Tru64 hosts sharing tape drives through a switch, an enterprise switch, and an FC/SCSI bridge]

SLIDE 34

Module 3: High Availability trends for SAN…

SLIDE 35

SPOF: Switch/Switch Components

[Diagram: mirrored cluster in which all storage paths run through a single FC switch]

SLIDE 36

SPOF: Switch/Switch Components

[Diagram: the same mirrored cluster with two FC switches, removing the single switch as a point of failure]

SLIDE 37

SPOF:

[Diagram: mirrored cluster connected through a single enterprise FC switch]

SLIDE 38

SPOF: Tape Drives

[Diagram: tape drives split into two 50% groups behind an enterprise FC switch]

SLIDE 39

Module 4: Design Issues for Clustering…

SLIDE 40

Design Issues

  • Objectives

– Understand the design issues of high availability
– Understand the trade-offs of those design issues
SLIDE 41

Design Suggestions

  • Keep it simple

– Complexity hurts long term maintenance and manageability

  • Know all single points of failure (SPOFs)
  • Avoid failover if possible
SLIDE 42

Know and Document ALL Single Points of Failure

  • Look for all SPOFs in both hardware and software
  • Look for SPOFs both on individual hosts and on the cluster as a whole
  • Could the failure of any single component prevent a client from accessing a vital service?
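A trivial way to operationalize that question: inventory how many independent instances of each component the cluster has, and flag anything with only one. The component names and counts below are purely illustrative:

    # Anything the cluster has only one of is, by definition, a SPOF.
    inventory = {
        "heartbeat link": 2,
        "service NIC": 2,
        "FC switch": 1,
        "power feed": 1,
        "tape robot": 1,
    }
    spofs = [name for name, count in inventory.items() if count < 2]
    print("Single points of failure:", spofs)
    # ['FC switch', 'power feed', 'tape robot']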

SLIDE 43

A Typical Layout

[Diagram: two clustered hosts with redundant Ethernet heartbeat links, NICs and hubs on the service network, dual FC paths (FC0/FC1) through switches and a bridge to shared disks and tape, and local OS disks on SCSI buses]

SPOFs to check in each area:

NETWORK:

  • HBAs/NICs
  • routers
  • switches
  • hubs
  • power source

SAN:

  • HBAs
  • routers
  • switches
  • hubs
  • power source

HOSTS:

  • critical file systems, e.g. / and /usr
  • power source

TAPE:

  • HBAs
  • drives
  • robots
  • power source

DISKS:

  • HBAs
  • drives
  • power source

SLIDE 44

SPOF: Hosts

[Diagram: a single host with its NICs, service-network hub, Ethernet heartbeat link, and SCSI-attached OS disks]

HOSTS:

  • critical file systems, e.g. / and /var
  • power source

SLIDE 45

SPOF: Disks

[Diagram: clustered hosts with Ethernet heartbeat link, service network, and OS disks; shared disks attached over SCSI]

DISKS:

  • controllers
  • drives
  • power source

SLIDE 46

SPOF: Heartbeat

[Diagram: clustered hosts with Ethernet heartbeat links through hubs, service-network NICs, and SCSI-attached disks]

NETWORK:

  • NICs
  • switches
  • hubs
  • power source

SLIDE 47

SPOF: Storage/SAN

[Diagram: clustered hosts with dual FC paths (FC0/FC1), Ethernet heartbeat links, service network, and SCSI-attached OS disks]

Storage/SAN:

  • HBAs
  • cabling
  • switches
  • hubs
  • power source

SLIDE 48

SPOF: Tape

[Diagram: clustered hosts reaching the tape library through FC paths (FC0/FC1) and an FC/SCSI bridge; Ethernet heartbeat links and SCSI OS disks also shown]

TAPE:

  • HBAs
  • drives
  • robots
  • power source

SLIDE 49

SPOF: Network Switching

[Diagram: SystemA and SystemB reaching their clients through a single switch]

SLIDE 50

SPOF: Network Switching

[Diagram: SystemA and SystemB reaching their clients through two switches]

  • IEEE 802.3ad port aggregation ("Cisco EtherChannel")
  • IP MultiNIC

SLIDE 51

SPOF: Network Switching

[Diagram: SystemA and SystemB connected to redundant main switches through multiple intermediate switches]

  • Spanning Tree Protocol

SLIDE 52

SPOF: Network Routing

[Diagram: duplicated main switches and routers serving Net 1 and Net 2]

  • Hot Standby Router Protocol (HSRP)

SLIDE 53

SPOF: Network Support Services

[Diagram: SystemA hosting the NIS master and primary DNS, SystemB hosting an NIS slave and secondary DNS, each reachable through its own switch]

SLIDE 54

Avoid failover if possible

  • All failovers cause some type of client disruption
  • Utilize all resources to make individual servers as reliable as possible
  • Failover should always be the last resort

SLIDE 55

Avoid dependencies on any outside resource!

  • Name services

– Look up local data first
– Ensure redundant network paths exist to any outside service required by the cluster

  • Data service

– Network Attached Storage (NAS): avoid letting it become a single point of failure

SLIDE 56

Application Considerations

  • Not all applications are compatible with clustering solutions
  • Recoverability
  • Failover times
  • License issues
  • Other issues
SLIDE 57

Module 5: A Ride thru VCS

SLIDE 58

Heartbeats (HB)

  • HBs are set at the “network layer”.
  • Low Latency Transport (LLT)

– /etc/llttab
– /etc/llthosts

  • Group Membership Services / Atomic Broadcast (GAB)

– /etc/gabtab
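For orientation, roughly what these files contain on a hypothetical two-node Solaris cluster; the node names, cluster ID, and qfe devices are made-up examples, and exact syntax varies by platform and VCS version:

    /etc/llttab:
        set-node nodeA
        set-cluster 10
        link qfe0 /dev/qfe:0 - ether - -
        link qfe1 /dev/qfe:1 - ether - -

    /etc/llthosts:
        0 nodeA
        1 nodeB

    /etc/gabtab:
        /sbin/gabconfig -c -n2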
SLIDE 59

main.cf & types.cf

  • main.cf is the “ONLY” VCS configuration file.
  • types.cf contains all the agent information.
  • Locations:

– /etc/VRTSvcs/conf/config/main.cf
– /etc/VRTSvcs/conf/config/types.cf

SLIDE 60

Service Group and Resources

  • Service Groups

– A service group is a set of resources that work together to provide application services to clients.
– Group operations are standard for each resource in a group.
– If a group faults on one system, the group and its resources can be configured to start on another system.

SLIDE 61

Service Group and Resources

  • Resources

– A resource is a software or hardware component required by an application under VCS control.

  • Disk
  • Disk Group
  • IP
  • NIC
  • Mount
  • NFS
  • Process
  • Share
  • Volume
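To tie service groups and resources together, here is a heavily trimmed, main.cf-style sketch of one group built from a few of the resource types above; all names, devices, and addresses are invented, and exact syntax details vary between VCS versions:

    include "types.cf"

    cluster demo_cluster ( )

    system nodeA ( )
    system nodeB ( )

    group app_sg (
        SystemList = { nodeA = 0, nodeB = 1 }
        AutoStartList = { nodeA }
        )

        DiskGroup app_dg (
            DiskGroup = appdg
            )

        Mount app_mnt (
            MountPoint = "/app"
            BlockDevice = "/dev/vx/dsk/appdg/appvol"
            FSType = vxfs
            )

        NIC app_nic (
            Device = hme0
            )

        IP app_ip (
            Device = hme0
            Address = "192.168.10.20"
            )

        app_mnt requires app_dg
        app_ip requires app_nic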
SLIDE 62
SLIDE 63

VCS vs Sun Clustering

Sun Clustering                                 VCS
---------------------------------------------  ---------------------------------------------
                                               Same code on any platform
Edit a file on each node to make changes       One configuration file
Only supports 8 nodes                          Cluster can grow to 32 nodes
Complete tear-down is required for upgrades    Upgrades are easy
No support for SNMP                            Supports SNMP
                                               Terminal concentrators or management consoles are NOT required
Only supports Sun disks                        Supports non-Sun disks

SLIDE 64

Conclusion

  • High Availability cannot be achieved by merely installing failover software and walking away.
  • Build your systems so reliably that they never have to fail over.
  • Figure out how much downtime costs on your systems, and then use that figure to determine how much you can afford to spend to protect against that downtime.
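As a back-of-the-envelope example of that last point (every figure below is invented):

    # Estimated annual downtime cost sets a ceiling on the HA budget.
    cost_per_hour = 50_000          # what one hour of outage costs the business
    expected_downtime_hours = 8.75  # e.g. roughly 99.9% uptime over a year
    ceiling = cost_per_hour * expected_downtime_hours
    print(f"Spending more than ${ceiling:,.0f}/year on protection is hard to justify")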