Architecting for the cloud: lessons learned from 100 CloudStack deployments
Sheng Liang
CTO, Cloud Platforms, Citrix
CloudStack History
Sept 2008: VMOps founded
Nov 2009: CloudStack 1.0 GA
May 2010: Cloud.com launch and CloudStack 2.0 GA
July 2011: Citrix acquires Cloud.com
April 2012: Apache CloudStack
The inventor of IaaS cloud: Amazon EC2
[Diagram: Amazon EC2 stack]
Amazon eCommerce Platform
EC2 API
Amazon Proprietary Orchestration Software
Open Source Xen Hypervisor
Networking | Storage | Commodity Servers
CloudStack is inspired by Amazon EC2
[Diagram: Amazon EC2 stack alongside the CloudStack stack]
Amazon: Amazon eCommerce Platform; EC2 API; Amazon Proprietary Orchestration Software; Open Source Xen Hypervisor
CloudStack: CloudPortal; Cloud APIs; CloudStack; XenServer, ESX, Hyper-V, KVM, OVM
Both: Networking | Storage | Commodity Servers
There will be 1000s of clouds
[Chart labels: Owner | Operator (IT, SP); Horizontal General Purpose; Vertical Special Purpose; Desktop Cloud]
Data center mgmt and automation
Learning from 100s of CloudStack deployments
Enterprise Service Providers Web 2.0
What is the biggest difference between traditional-style data center automation and Amazon-style cloud?
How to handle failures
Annual Failure Rate of servers
70% - hard disk
6% - RAID controller
5% - memory
18% - other factors
Network failure
Software bugs
Human admin error
[Diagram labels: Internet, Core Routers, Access Routers, Aggregation Switches, Load Balancers, Top of Rack Switches, Servers]
Effectiveness of network redundancy in reducing failures
mechanism
as TCP back-off, timeouts, and spanning tree reconfiguration
never fail
when failure happens
C. Tell users to expect failure. Users back up VMs and handle failure themselves
zCloud East Zone AWS East Zone zCloud West Zone AWS West Zone
Design for Failure
Cloud workloads
Traditional-Style
Reliable hardware, backup entire cloud, and restore for users when failure happens
Amazon-Style
Tell users to expect failure. Users build apps that can withstand infrastructure failure
Traditional-style techniques: link aggregation, storage multi-pathing, VM HA and fault tolerance, VM live migration, VM backup/snapshots, strong consistency
Amazon-style techniques: multi-site redundancy, Chaos Monkey, ephemeral resources, eventual consistency
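The Amazon-style expectation that apps withstand infrastructure failure can be illustrated with a small client-side failover loop. This is a minimal sketch, not anything taken from CloudStack: `call_with_failover`, `ZoneUnavailable`, and `make_backend` are hypothetical names, and the zone labels simply reuse the ones from the zone diagram.

```python
from typing import Callable, List, Sequence, Set


class ZoneUnavailable(Exception):
    """Raised by a (hypothetical) zone endpoint that cannot serve a request."""


def call_with_failover(zones: Sequence[str],
                       request: Callable[[str], str],
                       attempts_per_zone: int = 2) -> str:
    """Try each zone in order and return the first successful response."""
    errors: List[str] = []
    for zone in zones:
        for _ in range(attempts_per_zone):
            try:
                return request(zone)
            except ZoneUnavailable as exc:
                # Record the failure and keep going: failure is expected.
                errors.append(f"{zone}: {exc}")
    raise RuntimeError("all zones failed: " + "; ".join(errors))


def make_backend(down: Set[str]) -> Callable[[str], str]:
    """Build a fake backend in which the zones listed in `down` always fail."""
    def backend(zone: str) -> str:
        if zone in down:
            raise ZoneUnavailable("zone is down")
        return f"served from {zone}"
    return backend


if __name__ == "__main__":
    zones = ["zCloud East Zone", "zCloud West Zone"]
    print(call_with_failover(zones, make_backend({"zCloud East Zone"})))
```

The app, not the infrastructure, owns the recovery path here, which is the opposite of relying on VM HA or live migration.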
Designing a zone for a traditional workload
Enterprise Traditional-Style Availability Zone
Hypervisor: vSphere or XenServer
Storage: SAN
Networking: L2 VLANs
Network Services: Load Balancing, VPN
Multi-tier Apps: Ent App Mgmt
[Diagram labels: vCenter/XenCenter, Hypervisor Cluster, Enterprise Networking (e.g., VLAN), Enterprise Storage (e.g., SAN)]
Designing a zone for an Amazon-style workload
Amazon-Style Availability Zone
Hypervisor: XenServer or KVM
Storage: Local, EBS
Networking: L3, SDN based L2, Elastic IP
Network Services: Security Groups, ELB, GSLB
Multi-tier Apps: 3rd Party Tools (e.g., RightScale, enStratus)
[Diagram labels: Server Racks, Software Defined Networks (e.g., Security Groups, EIP, ELB,...), Elastic Block Storage, ELB/GSLB]
Object store is critical for Amazon-style cloud
[Diagram labels: NetScaler, Availability Zone 2, Storage Cloud]
Same Cloud can Support Both Styles
[Diagram labels: Apache CloudStack Mgmt Server, Traditional Style Availability Zone, Replication/DR]
Tests for a “true” cloud app
Learning from 100s of CloudStack deployments
Enterprise Service Providers Web 2.0
Traditional-style Mostly Amazon-style Mostly traditional style
[Diagram labels: Internet, Load Balancer, CloudStack Admin, Primary CloudStack Mgmt Server Cluster, Primary MySQL, Standby CloudStack Mgmt Server Cluster, Backup MySQL, Availability Zone 1 (Servers, Object Store, Pod 1, Pod 2, Pod 3, … Pod N), Availability Zone 2]
Layer 3 cloud networking (security groups)
[Diagram: Web VMs in the Web Security Group and DB VMs in the DB Security Group]
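Security groups filter traffic at layer 3 by rule rather than by VLAN membership. The sketch below shows one way such ingress rules could be evaluated for a Web group and a DB group. Everything in it is an assumption made for illustration: the `IngressRule` and `SecurityGroup` types, the `is_allowed` helper, and the port numbers (80 and 3306) do not come from the slides or from the CloudStack API.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class IngressRule:
    port: int
    source_cidr: Optional[str] = None    # allow traffic from an address range
    source_group: Optional[str] = None   # allow traffic from members of a group


@dataclass
class SecurityGroup:
    name: str
    rules: List[IngressRule] = field(default_factory=list)


def is_allowed(dest: SecurityGroup, port: int, src_ip: str,
               src_group: Optional[str] = None) -> bool:
    """Return True if any ingress rule of `dest` admits this traffic."""
    for rule in dest.rules:
        if rule.port != port:
            continue
        if rule.source_group is not None and rule.source_group == src_group:
            return True
        if rule.source_cidr is not None and \
                ip_address(src_ip) in ip_network(rule.source_cidr):
            return True
    return False  # default deny


# Hypothetical policy: anyone may reach the web tier on port 80,
# but only members of the web group may reach the DB tier on 3306.
web = SecurityGroup("web", [IngressRule(port=80, source_cidr="0.0.0.0/0")])
db = SecurityGroup("db", [IngressRule(port=3306, source_group="web")])

if __name__ == "__main__":
    print(is_allowed(web, 80, "203.0.113.7"))        # Internet client to web
    print(is_allowed(db, 3306, "10.0.0.5", "web"))   # web VM to DB
    print(is_allowed(db, 3306, "203.0.113.7"))       # Internet client to DB
```

Because the DB rule names a source group rather than an address range, web VMs can be added or replaced without touching the DB policy.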
Layer 2 VLAN networking
[Diagram: VMs belonging to User 1 and User 2]
OVS networking
[Diagram: VMs belonging to User 1 and User 2 attached to OVS instances; traffic carries GRE Key 1 or GRE Key 2]
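The GRE keys in the OVS diagram suggest per-tenant isolation: each user's traffic is tagged with its own key, so VMs only see peers carrying the same key. The following is a rough sketch of that idea under stated assumptions. `GreKeyAllocator`, `VM`, and `reachable_peers` are hypothetical names, and the rule that keys are handed out in order of first use is an assumption, not something the slides specify.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VM:
    name: str
    owner: str
    host: str


class GreKeyAllocator:
    """Hand out one GRE key per tenant, numbered from 1 in order of first use."""

    def __init__(self) -> None:
        self._keys: Dict[str, int] = {}

    def key_for(self, owner: str) -> int:
        if owner not in self._keys:
            self._keys[owner] = len(self._keys) + 1
        return self._keys[owner]


def reachable_peers(sender: VM, vms: List[VM],
                    allocator: GreKeyAllocator) -> List[str]:
    """List the VMs that share the sender's GRE key, on any host."""
    key = allocator.key_for(sender.owner)
    return [vm.name for vm in vms
            if vm is not sender and allocator.key_for(vm.owner) == key]


if __name__ == "__main__":
    alloc = GreKeyAllocator()
    vms = [
        VM("a1", "User 1", "host-1"),
        VM("a2", "User 1", "host-2"),
        VM("b1", "User 2", "host-1"),
        VM("b2", "User 2", "host-2"),
    ]
    print(reachable_peers(vms[0], vms, alloc))
```

Note that isolation in this model follows the key rather than the physical host: a1 and b1 share host-1 but cannot reach each other.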
Multi-tier virtual networking
Web subnet 10.1.1.0/24: Web VM 1, Web VM 2, Web VM 3, Web VM 4
App subnet 10.1.2.0/24: App VM 1, App VM 2
DB subnet 10.1.3.0/24: DB VM 1
External connectivity: Internet; Customer Premises (IPSec VPN); MPLS VLAN
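The three tier subnets can be modeled directly with Python's standard `ipaddress` module. The sketch below maps an address to its tier and checks a simple tier-to-tier policy. The subnets are the ones shown on the slide; the `ALLOWED_FLOWS` policy (web to app, app to db) and the helper names `tier_of` and `can_talk` are assumptions added for illustration.

```python
from ipaddress import ip_address, ip_network
from typing import Optional

# Subnets as shown on the slide.
TIERS = {
    "web": ip_network("10.1.1.0/24"),
    "app": ip_network("10.1.2.0/24"),
    "db": ip_network("10.1.3.0/24"),
}

# Hypothetical policy: each tier may only initiate traffic to the next one.
ALLOWED_FLOWS = {("web", "app"), ("app", "db")}


def tier_of(address: str) -> Optional[str]:
    """Return the tier whose subnet contains `address`, or None."""
    addr = ip_address(address)
    for name, subnet in TIERS.items():
        if addr in subnet:
            return name
    return None


def can_talk(src: str, dst: str) -> bool:
    """Allow traffic within a tier or along an explicitly allowed flow."""
    src_tier, dst_tier = tier_of(src), tier_of(dst)
    if src_tier is None or dst_tier is None:
        return False
    return src_tier == dst_tier or (src_tier, dst_tier) in ALLOWED_FLOWS


if __name__ == "__main__":
    print(tier_of("10.1.2.17"))
    print(can_talk("10.1.1.5", "10.1.2.5"))
    print(can_talk("10.1.1.5", "10.1.3.5"))
```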
Network Services
Public VLAN
Network flexibility
Network Services
Network Isolation
Service Providers
“The Apache Way”
Apache CloudStack Community
Metric | Pre Apache Move (Jan 2012) | June Actuals
# of companies endorsing project | 1 | 68
# of companies participating | 10 | 140
# of developers working on project | 40 | 238
Apache CloudStack community projects
Nicira Midokura Big Switch Networks Stratosphere
Sungard
Cisco Brocade (ADX)
Hadoop + S3 API for object store NetApp (FlexPod, object store) Basho RIAK CS Caringo object store Cloudian S3
CloudFoundry implementation through IronFoundry and Stackato teams Engine Yard Cumulogic GigaSpaces
Workload requirements drive cloud architecture
There is real demand for SDN in cloud infrastructure
Open source developers drive cloud adoption
More info http://cloudstack.org