Site Recovery Manager 6.1 Overview and Technical Walkthru
GS Khalsa, Technical Marketing, Storage & Availability
About This Presentation
Author(s): GS Khalsa & Ken Werneburg (Technical Marketing)
Title and general description: Site Recovery Manager 6.1 Technical Overview
Distribution and audience type: Internal, External, Partner, Customer. This deck is designed to be a deep dive on SRM 6.1. It is considerably longer than most presentations require because it is designed to be fairly exhaustive. For most sales conversations this deck should be dramatically shortened; use it in its entirety only for a longer session exploring the technology of Site Recovery Manager. For further SRM information, please visit the Vault and look for the SRM Overview deck as well as the SRM offline demo.
Primary target audience: Technical. Server team (managers, directors, VI Admin), BC/DR teams.
Technical level: High
Time required to present: 90+ minutes
Date updated: September 17, 2015
Agenda
- Overview
- Architecture
- Topologies
- Deployment and Configuration
- Replication
- Protection
- Recovery
- Workflows
Overview
VMware Site Recovery Manager
[Diagram: a Production Site and a Recovery Site, each with vSphere, vCenter Server, Site Recovery Manager, and servers, linked by array-based replication or vSphere Replication]
- Centralized recovery plans for thousands of VMs
- Non-disruptive recovery testing
- Automated DR workflows
- Integrated with the VMware product stack
- Eliminates complexity and risk of manual processes
- Enables fast and highly predictable RTOs
- Provides policy-driven DR control for any virtualized app
Transforms Management of Recovery And Migration Plans
From complex runbooks:
- Weeks or months to set up recovery plans
- Unstructured, manual processes make them error-prone
- Quickly fall out of sync with infrastructure changes
To simple recovery plans:
- Simple setup in minutes
- Software-defined workflows eliminate errors
- Simple to update and keep in sync with changes
Terminology
- RPO (Recovery Point Objective): the span from the last viable restore point to the moment disaster strikes
- RTO (Recovery Time Objective): the span from the moment disaster strikes until all functionality is recovered
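As a rough illustration of the two terms, the sketch below measures an achieved RPO and RTO from three timestamps and compares them with objectives. The timeline, the 15-minute and 1-hour objectives, and the function names are assumptions made up for this example; they do not come from SRM.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_restore_point: datetime, disaster: datetime) -> timedelta:
    """Data-loss window: from the last viable restore point to the disaster."""
    return disaster - last_restore_point


def achieved_rto(disaster: datetime, all_recovered: datetime) -> timedelta:
    """Downtime: from the disaster until all functionality is recovered."""
    return all_recovered - disaster


# Hypothetical timeline, for illustration only.
last_restore_point = datetime(2015, 9, 17, 11, 45)
disaster = datetime(2015, 9, 17, 12, 0)
all_recovered = datetime(2015, 9, 17, 13, 30)

rpo = achieved_rpo(last_restore_point, disaster)
rto = achieved_rto(disaster, all_recovered)

# Compare against hypothetical objectives of 15 minutes (RPO) and 1 hour (RTO).
rpo_met = rpo <= timedelta(minutes=15)
rto_met = rto <= timedelta(hours=1)
print(rpo, rto, rpo_met, rto_met)
```

In this made-up timeline the RPO objective is met and the RTO objective is missed, which is the kind of gap a recovery test is meant to surface.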
Architecture
Replication
- SRM is not a replication solution
- SRM monitors and interacts with replication solutions
- Choice of replication options
– vSphere Replication (host-based replication)
– Array-based replication
[Diagram: vSphere Replication between VMs (OS, App, Data) on SAN and Virtual SAN storage; only changes are replicated]
Site Recovery Manager Architecture
[Diagram: Protected Site and Recovery Site. Each site has a vSphere Web Client with the SRM Plugin, an SRM Server with an SRA, a vCenter Server with PSC/SSO, a VR Appliance, vSphere, and storage. The vCenter Servers are in Linked Mode, and the sites are connected by array replication and vSphere Replication.]
Use Cases and Topologies
Disaster Recovery
- Least frequent but most-critical use case
- Ensures fastest RTO
Disaster Avoidance
- Ensures app-consistency and zero data loss
- Zero downtime if used with stretched storage
- Proactive, controlled workflow
Planned Migration
- Most common use case
- Frequent on-ramp for SRM
- Enables data center maintenance and global load balancing
Active-Passive Failover
- Dedicated resources for recovery
- Most common
- Paying for idle resources
Active-Active Failover
- Run low-priority apps on recovery infrastructure
- Shut down low-priority apps as part of recovery
Bi-directional Failover
- Production applications at both sites
- Each site acts as the recovery site for the other
Multi-Site Failover
[Diagram: multi-site topologies with an SRM server and vCenter Server (VC) at every site: Remote Offices A, B, and C with a Main Data Center, and Sites A, B, and C]
- One-to-one pairing of SRM servers
- Each VM is protected only once
- Each VM is replicated only once
- Uses Enhanced Linked Mode
Stretched Storage & Orchestrated vMotion
- Production apps at both sites with seamless mobility across sites
- Zero downtime for planned events
- Typically limited to a Metro distance (less than 100 km)
Deployment
Site Recovery Manager Concepts
- Site Pairing: the SRM Server and vCenter Server at the Protected Site are paired with the SRM Server and vCenter Server at the Recovery Site
- Mapping of resources: Networks, Folders, Resource Pools, Storage Policies, Placeholder Datastores
- Protection Groups: groups of VMs recovered together
- Recovery Plans: one or more Protection Groups
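The relationship between these concepts can be sketched as a small data model. This is an illustrative sketch only, not the SRM object model or API: the class names, the `map_resource` helper, and all VM, network, and site names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProtectionGroup:
    """A group of VMs that are recovered together."""
    name: str
    vms: List[str]


@dataclass
class RecoveryPlan:
    """A recovery plan contains one or more protection groups."""
    name: str
    protection_groups: List[ProtectionGroup]

    def vms(self) -> List[str]:
        """All VMs covered by the plan, in protection-group order."""
        result: List[str] = []
        for group in self.protection_groups:
            for vm in group.vms:
                if vm not in result:
                    result.append(vm)
        return result


@dataclass
class SitePair:
    """A protected site paired with a recovery site, plus resource mappings."""
    protected_site: str
    recovery_site: str
    # kind -> {object at protected site: object at recovery site}
    mappings: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def map_resource(self, kind: str, name: str) -> Optional[str]:
        """Return the recovery-site counterpart, or None if unmapped."""
        return self.mappings.get(kind, {}).get(name)


# Hypothetical inventory, for illustration only.
web = ProtectionGroup("Web App", ["web-01", "web-02"])
email = ProtectionGroup("Email", ["mail-01"])
whole_site = RecoveryPlan("Whole Site", [web, email])
pair = SitePair("Site A", "Site B", {"network": {"Prod-Net": "DR-Net"}})
```

For example, `whole_site.vms()` lists the VMs of both groups, and `pair.map_resource("network", "Prod-Net")` returns the network a recovered VM would be attached to at the recovery site.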
Replication
Storage Replication Adapters (SRAs):
- Discover arrays
- Determine which LUNs are replicated
- Assist in initiating tests, recovery
- Other SRA capabilities
– Reprotect
– Synchronization
– Planned Migration
- SRM Compatibility Matrix: http://www.vmware.com/pdf/srm_storage_partners.pdf
Storage Array Integration
[Diagram: the SRM Server works through an SRA, which communicates with the vendor management interface (array manager or replication manager) for the arrays]
vSphere Replication Overview
- Per-VM, host-based replication
- Network-efficient by replicating only changed data
- Included with vSphere Essentials Plus and higher editions
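To illustrate why replicating only changed data saves bandwidth, the sketch below compares two versions of a disk image block by block and ships only the blocks that differ. It is a toy model under stated assumptions: the 4-byte block size and the function names are invented for the example, and it does not describe how vSphere Replication actually tracks changes.

```python
from typing import Dict, List

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(old: bytes, new: bytes) -> Dict[int, bytes]:
    """Return {block index: new contents} for every block that differs."""
    old_blocks = split_blocks(old)
    delta: Dict[int, bytes] = {}
    for index, block in enumerate(split_blocks(new)):
        if index >= len(old_blocks) or old_blocks[index] != block:
            delta[index] = block
    return delta


def apply_delta(replica: bytes, delta: Dict[int, bytes], new_length: int) -> bytes:
    """Apply the changed blocks to the replica at the recovery site."""
    blocks = split_blocks(replica)
    for index, block in delta.items():
        while len(blocks) <= index:
            blocks.append(b"")
        blocks[index] = block
    return b"".join(blocks)[:new_length]


source_before = b"AAAABBBBCCCC"
source_after = b"AAAAXXXXCCCCDD"
delta = changed_blocks(source_before, source_after)
replica = apply_delta(source_before, delta, len(source_after))
sent_bytes = sum(len(block) for block in delta.values())
```

Here only two of the four blocks cross the network, and the replica still ends up identical to the source.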
[Diagram: vCenter Server managing vSphere Replication between VMs (OS, App, Data) on SAN and Virtual SAN storage; only changes are replicated]
vSphere Replication Features and Benefits
- Easy virtual appliance deployment
– Minimal time investment, no hardware procurement
- Integration with vSphere Web Client
– Ease of administration and monitoring
- Protect any VM regardless of OS and apps
– One solution reduces complexity and cost
vSphere Replication Features and Benefits
- Flexible recovery point objective (RPO) policies
– Supports a wide variety of business requirements
- Compatible with Virtual SAN, SAN, NAS, local storage
– One solution reduces complexity and cost
- Quick recovery for individual VMs
– Reduces downtime, minimizes resource requirements
vSphere Replication Features and Benefits
- End-to-end network compression
– Further reduces bandwidth requirements
- Network traffic isolation
– Control bandwidth, improve performance, security
- Windows VSS and Linux file system quiescing
– Increased reliability when recovering VMs
[Diagram: Management and Replication traffic on separate networks (LAN and WAN)]
vSphere Replication Overview
- Reliable: Protecting thousands of VMs since 2011
- Efficient: WAN-friendly replication with compression
- Value: Included with vSphere Essentials Plus Kit and higher
- Easy: Virtual appliance deployment, vSphere Web Client management
Protection
Protection Groups
- Group of VMs that will be recovered together
– Application
– Department
– System type
– Or any other criterion
- Different depending on replication type
- A VM can only belong to one Protection Group
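The one-group-per-VM rule can be checked with a simple membership count. The sketch below is only an illustration of the rule; the `find_conflicts` helper and the VM names are hypothetical and are not part of SRM.

```python
from collections import defaultdict
from typing import Dict, List


def find_conflicts(protection_groups: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return {vm: [groups]} for every VM listed in more than one group."""
    membership: Dict[str, List[str]] = defaultdict(list)
    for group, vms in protection_groups.items():
        for vm in vms:
            membership[vm].append(group)
    return {vm: groups for vm, groups in membership.items() if len(groups) > 1}


# Hypothetical grouping in which one VM was mistakenly added twice.
groups = {
    "Web App": ["web-01", "web-02", "db-01"],
    "Email": ["mail-01"],
    "SharePoint": ["sp-01", "db-01"],
}
conflicts = find_conflicts(groups)
```

An empty result means the grouping respects the rule; here `db-01` is reported because it appears in both Web App and SharePoint.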
vSphere Replication Protection Groups
- Group VMs as desired into Protection Groups
- The storage the VMs are located on does not matter
[Diagram: Protection Group 1 (Web App), Protection Group 2 (Email), Protection Group 3 (SharePoint)]
Array Based Protection Groups
[Diagram: LUN 1 (Datastore A), LUN 2 (Datastore B), LUN 3 (Datastore C), LUN 4 (Datastore D), and LUN 5 (Datastore F), with a consistency group and Protection Group 1 (Web App), Protection Group 2 (Email), and Protection Group 3 (SharePoint)]
Storage Policy-Based Protection Groups
- Policy-driven protection
- New style of protection group that leverages storage profiles
- High level of automation compared to traditional protection groups
- Policy-based approach reduces OpEx
- Simpler integration of VM provisioning, migration, and decommissioning
[Diagram: a storage policy driving a profile-driven protection group]
Recovery
Protection Groups fit into Recovery Plans
- Recovery Plan 1 (Web App): Protection Group 1 (Web App)
- Recovery Plan 2 (Email): Protection Group 2 (Email)
- Recovery Plan 3 (Whole Site): Protection Group 1 (Web App), Protection Group 2 (Email), Protection Group 3 (SharePoint)
Priorities and Dependencies UI
[Diagram: VMs assigned to Priority Groups 1 through 5; the VMs shown are Database, App Server 1, App Server 2, Exchange, Mail Sync, Apache (two instances), and Desktop (four instances)]
Priorities and Dependencies
[Diagram: a dependency on the Master Database VM]
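One way to picture how priorities and dependencies combine is the ordering sketch below: priority groups start in ascending order, and within a group a VM starts only after the VMs it depends on. The priority assignments, the dependency shown, and the choice to honor dependencies only inside the same priority group are assumptions made for this example, not a statement of SRM's exact behavior.

```python
from typing import Dict, List, Tuple

# name -> (priority group, names of VMs it depends on)
VmSpec = Dict[str, Tuple[int, List[str]]]


def startup_order(vms: VmSpec) -> List[str]:
    """Order VMs by priority group, then by dependencies inside each group."""
    order: List[str] = []
    for priority in sorted({spec[0] for spec in vms.values()}):
        group = sorted(name for name, spec in vms.items() if spec[0] == priority)
        placed = set()

        def visit(name: str, path: Tuple[str, ...] = ()) -> None:
            if name in placed:
                return
            if name in path:
                raise ValueError("circular dependency involving " + name)
            for dependency in vms[name][1]:
                if dependency in group:  # assumption: same-group dependencies only
                    visit(dependency, path + (name,))
            placed.add(name)
            order.append(name)

        for name in group:
            visit(name)
    return order


# Hypothetical assignment of the VM names shown in the diagram.
plan = {
    "Database": (1, []),
    "App Server 1": (2, ["App Server 2"]),
    "App Server 2": (2, []),
    "Exchange": (3, []),
    "Mail Sync": (3, ["Exchange"]),
    "Apache": (4, []),
    "Desktop": (5, []),
}
order = startup_order(plan)
```

With this input, App Server 2 starts before App Server 1 even though both sit in priority group 2, because of the declared dependency.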
VM IP Customization
- IP Subnet Mapping
– Ability to map entire subnets rather than individual addresses
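The idea behind subnet mapping can be shown with Python's standard `ipaddress` module: keep the host portion of an address and swap the network portion. This is a sketch of the concept, not SRM's implementation; the `map_ip` function and the example subnets are hypothetical.

```python
import ipaddress


def map_ip(address: str, source_subnet: str, target_subnet: str) -> str:
    """Translate an address from the source subnet to the same host offset
    in the target subnet."""
    source = ipaddress.ip_network(source_subnet)
    target = ipaddress.ip_network(target_subnet)
    ip = ipaddress.ip_address(address)
    if ip not in source:
        raise ValueError(f"{address} is not in {source_subnet}")
    if source.prefixlen != target.prefixlen:
        raise ValueError("source and target subnets must be the same size")
    offset = int(ip) - int(source.network_address)
    return str(target.network_address + offset)


# Hypothetical protected-site and recovery-site subnets.
recovered_ip = map_ip("192.168.10.25", "192.168.10.0/24", "10.20.30.0/24")
```

A single rule like this covers every VM in the subnet, which is the advantage over maintaining one mapping per address.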
Shutdown & Startup Actions
- Can be customized for each VM
Pre and post power on steps
- Script or Prompt
- Can be run on
– Recovered VM
– SRM server
Workflows
Workflows for Recovery Plans
- Recovery
– Planned Migration
– Disaster Recovery
- Reprotect
- Test
- Cleanup
Running a Recovery Plan – Planned Migration or Disaster Recovery
[Diagram: Protected Site and Recovery Site with replication between them]
- Synchronize storage
- Power off VMs
- Synchronize storage again
- Break replication
- Mount datastores to hosts at Recovery Site
- Power off non-critical VMs at Recovery Site (optional)
- Power on VMs
Differences between Planned Migration & Disaster Recovery
- Planned Migration Mode
– Allows for a data synchronization as part of the process
– Stops on errors and allows you to resolve them before continuing
– Because it shuts down the virtual machines being migrated, application-consistent VMs are recovered at the recovery site
- Disaster Recovery Mode
– Allows for a data synchronization as part of the process
– Does not stop on errors
– If the protected site is available, the virtual machines being migrated will be application-consistent at the recovery site
– If the protected site is not available, the consistency state will be whatever was designed into the solution
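The difference in error handling between the two modes can be sketched as a small step runner: the same step list is executed either way, but planned migration halts at the first failure while disaster recovery records it and carries on. The runner, the mode strings, and the simulated outage are assumptions made for this illustration, not SRM code.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_recovery_plan(steps: List[Step], mode: str):
    """Run steps in order; return (completed step names, failed steps)."""
    if mode not in ("planned_migration", "disaster_recovery"):
        raise ValueError(f"unknown mode: {mode}")
    completed: List[str] = []
    failed: List[Tuple[str, str]] = []
    for name, action in steps:
        try:
            action()
            completed.append(name)
        except Exception as exc:
            failed.append((name, str(exc)))
            if mode == "planned_migration":
                break  # stop so the error can be resolved before continuing
    return completed, failed


def ok() -> None:
    pass


def protected_site_unreachable() -> None:
    raise RuntimeError("protected site unreachable")


# Simulated outage: every step that touches the protected site fails.
steps = [
    ("Synchronize storage", protected_site_unreachable),
    ("Power off VMs", protected_site_unreachable),
    ("Synchronize storage again", protected_site_unreachable),
    ("Break replication", ok),
    ("Mount datastores at recovery site", ok),
    ("Power on VMs", ok),
]
migration_result = run_recovery_plan(steps, "planned_migration")
dr_result = run_recovery_plan(steps, "disaster_recovery")
```

In the simulated outage, the planned migration stops at the first synchronization, while the disaster recovery run still brings up the VMs at the recovery site.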
Failback is a process of “Reverse Recovery”
- Reliable and automated for both array-based replication (ABR) and vSphere Replication (VR)
- Easily return environments to the primary production site
Failback – continued
- After a reprotect, replication runs in reverse, back toward the original protected site
Testing a Recovery Plan
[Diagram: Protected Site and Recovery Site; replication is not impacted; test VMs run from a snapshot on an isolated test network]
- Entirely non-disruptive to production VMs and replication
- Allows for data synchronization as part of the process
- Supports a recovery that uses a different network
- Uses a clone or snapshot
Steps:
- Replicate storage (optional)
- Snapshot VM or Storage
- Mount snapshot to hosts
- Power on VMs
Cleaning up a Test Recovery
- Run after testing is complete
- Steps:
– Power off VMs
– Remove VMs from inventory
– Delete snapshot
- Following cleanup, no test resources are in use at the recovery site
- Test or recovery is now ready to be run
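The test and cleanup cycle can be pictured as a two-state loop: a test leaves resources behind at the recovery site, and cleanup releases them so the plan is ready again. The class, state names, and resource labels below are hypothetical and serve only to illustrate that cycle.

```python
class RecoveryPlanTestCycle:
    """Toy model of the test -> cleanup -> ready cycle for one recovery plan."""

    def __init__(self) -> None:
        self.state = "ready"
        self.test_resources = []

    def test(self) -> None:
        if self.state != "ready":
            raise RuntimeError("cleanup must be run before the plan is used again")
        # A test snapshots storage, mounts the snapshot, and powers on test VMs.
        self.test_resources = ["snapshot", "test VMs"]
        self.state = "test complete"

    def cleanup(self) -> None:
        if self.state != "test complete":
            raise RuntimeError("nothing to clean up")
        # Power off VMs, remove them from inventory, and delete the snapshot.
        self.test_resources = []
        self.state = "ready"


plan = RecoveryPlanTestCycle()
plan.test()
state_after_test = plan.state
resources_after_test = list(plan.test_resources)
plan.cleanup()
```

After `cleanup()` the plan holds no test resources and can be tested or run again.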
History Reports
- Each workflow operation has an associated history report
History Reports - continued
Additional Resources
- Hands on Lab
- SRM Technical Overview
- SRM Evaluation Guide
- Product Documentation
- Trial Licenses
- VMTN Community Forums
- SRM FAQ
Supplemental
Recovery Plan Steps
- Include all the steps that need to be taken to recover
- Pre-sync storage to reduce downtime
- Shut down VMs to ensure no data loss
- Sync storage to get the latest data
- Power on VMs in the desired sequence
- VMs can be left shut down as part of recovery
Multi-site UI
Advanced – IP Customization
Forced Recovery (Introduced in 5.0.1)
[Diagram: Site A (Primary) and Site B (Recovery), each with VMware vSphere, VMware vCenter Server, Site Recovery Manager, and servers; the state of the primary site is marked as unknown]
Avoid delays to RTO when protected site is inconsistent
Testing a Recovery Plan
VMs are now ready to be used
All Paths Down Handling
[Diagram: two pairs of vSphere hosts, each pair with a VMFS datastore]
Limits
Maximum:
- Protected virtual machines total: 5000
- Simultaneously recoverable VMs: 2000
- Protected virtual machines in a single protection group: 500
- Protection groups: 250
- Simultaneous running recovery plans: 10
- vSphere Replicated virtual machines: 2000
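A planned design can be checked against these maximums with a simple lookup. The values below are the ones listed above; the dictionary keys, the `exceeded_limits` helper, and the planned figures are hypothetical names and numbers chosen for this sketch.

```python
from typing import Dict, Tuple

# Maximums as listed above for SRM 6.1; the key names are made up for this sketch.
LIMITS = {
    "protected_vms_total": 5000,
    "simultaneously_recoverable_vms": 2000,
    "protected_vms_per_protection_group": 500,
    "protection_groups": 250,
    "simultaneous_recovery_plans": 10,
    "vsphere_replicated_vms": 2000,
}


def exceeded_limits(planned: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return {item: (planned value, maximum)} for every item over its maximum."""
    return {
        item: (value, LIMITS[item])
        for item, value in planned.items()
        if value > LIMITS[item]
    }


# Hypothetical design figures.
planned = {
    "protected_vms_total": 4200,
    "protection_groups": 300,
    "vsphere_replicated_vms": 2000,
}
violations = exceeded_limits(planned)
```

Values equal to a maximum pass the check; only the 300 planned protection groups are flagged here.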
SDRS, sVmotion & Array Based Replication
- SRM + SDRS supported with heterogeneous datastore clusters
– Replicated and non-replicated
– Mix of consistency groups
- Protected VM state maintained
[Diagram: a datastore cluster containing consistency groups CG 1 and CG 2, with SDRS moving VMs]
SDRS, sVmotion & vSphere Replication
- SDRS or Storage vMotion moves between devices are supported at the protected or recovery site
- Protected VM state maintained
- Full sync resumes (not restarts) if interrupted by a Storage vMotion or SDRS move
Embedded vPostgres Database
- Provides an alternative, integrated, and simplified installation option
- Supports any size SRM environment
Enhanced topology support
- Shared recovery site and shared protected site support
[Diagrams: Remote Offices A, B, and C with a Main Data Center; remote offices and Site A with a Shared DR Site; each site has an SRM server and vCenter Server (VC)]