MULTI-SITE OPENSTACK DEPLOYMENT OPTIONS & CHALLENGES FOR TELCOS


SLIDE 1

MULTI-SITE OPENSTACK

DEPLOYMENT OPTIONS & CHALLENGES FOR TELCOS

Azhar Sayeed, Chief Architect, asayeed@redhat.com

SLIDE 2

DISCLAIMER

Important Information


The information described in this slide set does not provide any commitments to roadmaps or availability of products or features. Its intention is purely to provide clarity in describing the problem and to drive a discussion that can then be used to drive open source communities. Red Hat Product Management owns the roadmap and supportability conversation for any Red Hat product.

SLIDE 3

AGENDA

  • Background: OpenStack Architecture
  • Telco Deployment Use case
  • Distributed deployment – requirements
  • Multi-Site Architecture
  • Challenges
  • Solution and Further Study
  • Conclusions
SLIDE 4

SLIDE 5

OPENSTACK ARCHITECTURE

SLIDE 6

WHY MULTI-SITE FOR TELCO?

  • Compute requirements – not just at the Data Center
  • Multiple Data Centers
  • Managed Service Offering
  • Managed Branch Office
  • Thick vCPE
  • Mobile Edge Compute
  • vRAN – vBBU locations
  • Virtualized Central Offices
  • Hundreds to thousands of locations
  • Primary and Backup Data Center – disaster recovery
  • IoT Gateways – fog computing

Centrally managed Compute closer to the user

SLIDE 7

Multiple DCs or Central Offices

[Diagram: Main Data Center and Remote Sites connected by an overlay tunnel over the Internet, under an E2E Orchestrator; remote sites provide Security & Firewall, Quality of Service (QoS), Traffic Shaping, and Device Management; Backup and Remote Data Centers run Independent OpenStack Deployments]

  • Hierarchical connectivity model of the CO
  • Remote sites with compute requirements
  • Extend OpenStack to these sites

A typical service almost always spans multiple DCs

SLIDE 8

Multiple DCs – NFV Deployment

L2 or L3 Extensions between DCs

Real Customer Requirements

[Diagram: Regions 1 through 25, each a fully redundant system of Controllers, Storage Nodes, and Compute Nodes]

  • 25 Sites
  • 2-5 VNFs required at each site
  • Maximum of 2 Compute Nodes per site needed for these VNFs
  • Storage requirements = image storage only
  • Total number of Control Nodes = 25 * 3 = 75
  • Total number of Storage Nodes = 25 * 3 = 75
  • Total number of Compute Nodes = 25 * 2 = 50

Redundant configuration overhead: 75% (see the sketch below)
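The arithmetic behind that 75% figure, as a quick sanity check (a minimal Python sketch; the per-site counts come from the bullets above):

    # Per-site counts from the slide: 25 sites, each with 3 controllers,
    # 3 storage nodes (image storage only), and 2 compute nodes.
    SITES = 25
    CONTROL_PER_SITE, STORAGE_PER_SITE, COMPUTE_PER_SITE = 3, 3, 2

    control = SITES * CONTROL_PER_SITE    # 75 control nodes
    storage = SITES * STORAGE_PER_SITE    # 75 storage nodes
    compute = SITES * COMPUTE_PER_SITE    # 50 compute nodes
    total = control + storage + compute   # 200 nodes in all

    # Only the compute nodes host VNFs; everything else is redundancy
    # and management overhead: (75 + 75) / 200 = 0.75.
    overhead = (control + storage) / total
    print(f"overhead = {overhead:.0%}")   # overhead = 75%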

SLIDE 9

Virtual Central Office

L2 or L3 Extensions between DCs

Real Customer Challenge

[Diagram: Regions 1 through 1000+, each a fully redundant system of Controllers, Storage Nodes, and Compute Nodes]

  • 1000+ Sites – Central Offices
  • From a few 10s to 100s of VMs
  • Fully redundant configurations
  • Termination of Residential, Business, and Mobile Services
  • Managing 1000 OpenStack islands
  • Tier 1 Telcos already have >100 sites today

Management Challenge

SLIDE 10

DEPLOYMENT OPTIONS


SLIDE 11

OPTIONS

  • Multiple Independent Island Model – seen this already
  • Common Authentication and Management
    – External user policy management with LDAP integration
    – Common Keystone
  • Stretched deployment model
    – Extend Compute and Storage Nodes into other Data Centers
    – Keep central control of all remote resources
  • Allow Data Centers to share workloads – Tricircle approach
  • Proxy the APIs – master/slave or cascading model
  • Agent-based model
  • Something else??

SLIDE 12

Multiple DCs or Central Offices

L2 or L3 Extensions between DCs

Feed the load balancer

  • Site capacity independent of the others
  • User information separate or replicated offline
  • Load balancer directs where traffic goes – good for load sharing
  • DR – external problem

[Diagram: a load balancer in front of Independent OpenStack Deployments – Regions 1 through 2…N, each a fully redundant system of Controllers, Storage Nodes, and Compute Nodes, under a Cloud Management Platform with a Directory]

Good for a few 10s of sites – what about 100s or thousands of sites?

SLIDE 13

Extended OpenStack Model

L2 or L3 Extensions between DCs

Common or Shared Keystone

  • Single Keystone for authentication
  • User information in one location
  • Independent resources
  • Modify the Keystone endpoint table (see the sketch below)
  • Endpoint, Service, Region, IP
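To make the endpoint table concrete, an illustrative sketch (plain Python, not the actual Keystone schema; region names and addresses are hypothetical documentation values). Each region registers its own service endpoints in the one shared catalog, and a client filters the catalog by region after authenticating once:

    # Hypothetical rows of a shared Keystone's endpoint table: one set
    # of service endpoints per region, all behind a single identity.
    ENDPOINTS = [
        {"service": "nova",   "region": "Region1",
         "url": "http://192.0.2.10:8774/v2.1"},
        {"service": "nova",   "region": "Region2",
         "url": "http://198.51.100.10:8774/v2.1"},
        {"service": "glance", "region": "Region1",
         "url": "http://192.0.2.10:9292"},
        {"service": "glance", "region": "Region2",
         "url": "http://198.51.100.10:9292"},
    ]

    def endpoints_for(region):
        """Authenticate once centrally, then pick endpoints by region."""
        return [e for e in ENDPOINTS if e["region"] == region]

    print(endpoints_for("Region2"))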

Shared Keystone Deployment

[Diagram: Regions 1 through 2…N, each a fully redundant system of Controllers, Storage Nodes, and Compute Nodes, sharing one Keystone under a Cloud Management Platform with a Directory]

Identity: Keystone – a single point of control

SLIDE 14

Extended OpenStack Model

L2 or L3 Extensions between DCs

Central Controller

  • Single authentication
  • Distributed Compute resources
  • Single Availability Zone per Region

Central Controller and Remote Compute & Storage (HCI) Nodes

[Diagram: a central fully redundant system of Controllers with remote Compute and Storage (HCI) Nodes in Regions 1 through 2…N, under a Cloud Management Platform with a Directory]

Replicated storage – Galera cluster; Cinder, Glance, and image manual restore

SLIDE 15

Revisiting the Branch Office - Thick CPE

[Diagram: Enterprise vCPE – an x86 server with VNFs (Security & Firewall, Quality of Service (QoS), Traffic Shaping, Device Management) running OpenStack or OpenShift/Kubernetes as the NFVI, connected to the Data Center over the Internet via an IPSec, MPLS, or other tunnel mechanism, with an E2E Network Orchestrator deploying Nova Compute]

Can we deploy compute nodes at all the branch sites and centrally control them? How do I scale it to thousands of sites?

SLIDE 16

OSP 10 – Scale components independently

Most OpenStack HA services and VIPs must be launched and managed by Pacemaker or HAProxy. However, some can be managed via systemctl, thanks to the simplification of Pacemaker constraints introduced in OSP 9 and 10.

SLIDE 17

COMPOSABLE SERVICES AND CUSTOM ROLES

  • Leverage the composable services model
    – to define a central Keystone
    – to place functionality where it is needed, i.e. disaggregate (see the sketch after the diagram)
  • Deployable standalone on separate nodes, or combined with other services into Custom Role(s)
    – Distribute the functionality depending on the DC locations

[Diagram: a hardcoded Controller role bundling Keystone, Ceilometer, Neutron, RabbitMQ, and Glance, versus the same services disaggregated into a custom Controller role, a custom Ceilometer role, a custom Networker role, and so on]
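As an illustration of the disaggregation idea (plain Python data, not actual TripleO roles_data syntax; the role names are invented for the example):

    # Illustrative only: custom roles are just named bundles of services
    # that can be placed on whichever nodes or DCs need them.
    HARDCODED = {
        "Controller": ["Keystone", "Ceilometer", "Neutron",
                       "RabbitMQ", "Glance"],
    }

    CUSTOM = {
        "CentralKeystone": ["Keystone"],    # one shared identity service
        "TelemetryRole":   ["Ceilometer"],  # telemetry on its own nodes
        "NetworkerRole":   ["Neutron"],
        "ControllerRole":  ["RabbitMQ", "Glance"],
    }

    # Sanity check: disaggregation moves services around but must not
    # lose any of them.
    assert sorted(s for v in CUSTOM.values() for s in v) == \
           sorted(s for v in HARDCODED.values() for s in v)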

SLIDE 18

Revisiting the Virtual Central Office use case

L2 or L3 Extensions between DCs

Real Customer Challenge

[Diagram: Regions 1 through 4, each a fully redundant system of Controllers, Storage Nodes, and Compute Nodes, with sub-regions 3a and 3b hanging off Region 3]

Require flexibility and some hierarchy

SLIDE 19

Scaling across a thousand sites?


CONSIDERATIONS

  • Some areas that we need to look at:
  • Latency and outage times
  • Delays due to distance between DCs and link speeds – RTT
  • The remote site is lost – headless operation and subsequent recovery
  • Startup storms
  • Scaling oslo.messaging
  • RabbitMQ
  • Scaling of nodes => scale RabbitMQ/messaging
  • Ceilometer (Gnocchi & Aodh) – heavy user of MQ
SLIDE 20

Scaling across a thousand sites?


LATENCY AND OUTAGE TIMES

  • Latency between sites – Nova API calls
  • 10, 50, 100 ms? Round-trip time => queue tuning
  • Bottleneck link/node speed
  • Outage time – recovery time
  • 30s or more?
  • Nova Compute services flapping (see the sketch below)
  • Confirmation – from provisioning to operation
  • Neutron timeouts – binding issues
  • Headless operation
  • Restart – causes storms
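Why the flapping happens, in a minimal sketch. The two thresholds mirror nova.conf's report_interval and service_down_time options; the default values below are an assumption used for illustration:

    # nova-compute heartbeats every report_interval seconds; the control
    # plane marks the service down if no heartbeat arrives within
    # service_down_time. Values are assumed nova.conf defaults.
    REPORT_INTERVAL = 10     # [DEFAULT] report_interval
    SERVICE_DOWN_TIME = 60   # [DEFAULT] service_down_time

    def flaps(outage_seconds):
        """True if a WAN outage lasts long enough for the remote compute
        service to be declared down, then up again on recovery."""
        return outage_seconds + REPORT_INTERVAL > SERVICE_DOWN_TIME

    for outage in (10, 30, 90):
        print(f"{outage}s outage ->", "flap" if flaps(outage) else "ok")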
SLIDE 21


RABBITMQ TUNING

  • Tune the buffers – increase buffer size
  • Take into account messages in flight – rates and round-trip times
  • BDP = bottleneck speed * RTT (worked below)
  • Number of messages:
  • Servers * backends * requests/sec = number of messages/sec
  • Split into multiple instances of message queues for a distributed deployment
  • Ceilometer into an MQ – heaviest user of MQ
  • Nova into a single MQ
  • Neutron into an MQ
  • Refer to an interesting presentation on this topic – "Tuning RabbitMQ at Large Scale Cloud", OpenStack Summit, Austin 2016

[Diagram: three separate MQ instances – Nova Conductor and Compute on one, the Ceilometer collector and agents on another, Neutron on a third]
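The two sizing formulas above as a runnable sketch (the example numbers are hypothetical):

    # Bandwidth-delay product: bytes that can be in flight on the link,
    # and hence roughly what the buffers must absorb.
    def bdp_bytes(bottleneck_bps, rtt_seconds):
        return bottleneck_bps / 8 * rtt_seconds

    # Aggregate message rate: servers * backends * requests/sec.
    def msgs_per_sec(servers, backends, req_per_sec):
        return servers * backends * req_per_sec

    print(bdp_bytes(100e6, 0.050))   # 100 Mb/s at 50 ms RTT -> 625000.0 bytes
    print(msgs_per_sec(25, 4, 10))   # hypothetical fleet -> 1000 msgs/sec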

SLIDE 22

RECENT AMQP ENHANCEMENTS

  • Eliminates the broker-based model
  • Enhances AMQP 1.0
  • Separates the messaging endpoint from the message routers
  • Newton has an AMQP driver for oslo.messaging (see the sketch below)
  • Ocata provides perf tuning and upstream support for TripleO
  • If you must use RabbitMQ:
  • Use clustering and exchange configurations
  • Use the shovel plugin with exchange configurations and multiple instances

[Diagram: a hierarchical tree of brokers versus a routed mesh of message routers]
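Conceptually, moving to the AMQP 1.0 driver is a transport-URL change in oslo.messaging. A hedged sketch (hostnames and credentials are placeholders; get_transport reflects the Newton-era API):

    # oslo.messaging picks its driver from the transport URL scheme:
    # "rabbit://" is the classic broker model, while "amqp://" selects
    # the AMQP 1.0 driver, where endpoints talk to message routers
    # instead of a broker.
    from oslo_config import cfg
    import oslo_messaging

    conf = cfg.CONF

    brokered = oslo_messaging.get_transport(
        conf, url="rabbit://guest:guest@rabbit.example.net:5672/")

    routed = oslo_messaging.get_transport(
        conf, url="amqp://guest:guest@router.example.net:5672/")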

SLIDE 23

OPENSTACK CASCADING PROJECT


[Diagram: a Parent OpenStack cascading to multiple Children across AZ1 … AZn]

A proxy for the Nova, Cinder, Ceilometer, and Neutron subsystems per site; at the Parent, loads of proxies – one set per Child. The user communicates with the master.

SLIDE 24

Cascading solution split into two projects


TRICIRCLE AND TRIO2O

  • Tricircle – networking across OpenStack clouds
  • Trio2o – single API gateway for Nova, Cinder

Expand workloads into other OpenStack instances; create networking extensions; isolation of east-west traffic; application HA.

[Diagram: User1 … UserN behind an API Gateway in front of pods spanning AZ1 … AZx … AZn]

Tricircle – make Neutron(s) work as a single cluster. Trio2o – OPNFV Multi-Site Project, Euphrates release.

Single Region with multiple sub-regions (pods); shared or federated Keystone; shared or distributed Glance; UID = TenantID + PODID (see the sketch below).
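One illustrative reading of UID = TenantID + PODID (hypothetical names and URLs, not Trio2o's actual implementation): a composite identifier lets the gateway route a tenant's request to the pod that hosts its resources.

    # Hypothetical sketch: combine tenant and pod identifiers so an API
    # gateway can map a request to the right sub-region.
    PODS = {
        "pod-az1": "http://az1.example.net:8774",
        "pod-azn": "http://azn.example.net:8774",
    }

    def make_uid(tenant_id, pod_id):
        return f"{tenant_id}:{pod_id}"

    uid = make_uid("tenant-42", "pod-az1")
    tenant_id, pod_id = uid.split(":")
    print("route", uid, "->", PODS[pod_id])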

SLIDE 25

Remote Compute Nodes


WHAT’S THE ALTERNATIVE?

  • Should we abandon the idea of remote Nova nodes?
  • Use Packstack/All-in-One – OSP in a box, a la Vz uCPE
  • High overhead if you want to run 1-2 VNFs
  • Perhaps some optimization is possible using the Kolla/container model
  • Initialize the remote nodes – need L3/L2 connectivity for PXE
  • Make that a Kubernetes node – use containers on that node
  • Implement a new interface for remote nodes
  • Nova agent on remote nodes?
  • Abandon the idea of OpenStack – No!!!! No OpenStack, really!!!?
  • Use a CMP to manage remote bare-metal nodes
  • KVM – hypervisor
  • Run containers on remote nodes – do we run into the same issues?
SLIDE 26

Virtual controllers – to get around node restrictions


VIRTUAL CONTROLLER MODEL

Kolla – containerizing the control plane

  • Kolla-Kubernetes and Kolla-Ansible
  • Containerizing the OSP control plane makes the previous options easier
  • Can remote nodes be treated as pods in Kubernetes environments?
  • Interface between the master and the host node
  • Containers can be deployed on those nodes to manage apps or even OSP services

[Diagram: containerized Keystone, Glance, Nova, and Neutron services alongside VM1 and VM2]

SLIDE 27

SUMMARY

  • Deploying OpenStack at multiple sites is a must for Telcos
  • Tricircle and Trio2o offer good promise
  • Tune RabbitMQ or move to the MQ enhancements (AMQP 1.0)
  • Partition MQ
  • Scale MQ instances
  • Carefully craft the Availability Zone model
  • Nova agent proxy
  • Deploying bare metal at remote sites is still an issue – it does not solve the problem of access
  • Another avenue for automation: call home
  • Use Kubernetes as the master orchestrator => Kubernetes managing OSP managing container workloads – a K8s sandwich

SLIDE 28

plus.google.com/+RedHat linkedin.com/company/red-hat youtube.com/user/RedHatVideos facebook.com/redhatinc twitter.com/RedHatNews

THANK YOU

SLIDE 29

ABSTRACT

Important Information


OpenStack provides a great Infrastructure-as-a-Service (IaaS) platform for deployment of applications in virtual machines and containers. For telcos specifically, OpenStack unifies the point of presence (PoP), central office, and datacenter infrastructure. However, many telcos need OpenStack deployed in many datacenters around the region or country. The question is: how should they deploy OpenStack for multi-site needs? Should they consider a stretched deployment where different components sit in different locations? Or should they consider replicating the entire OpenStack environment in each location? What impact does this have on Keystone, messaging, disaster recovery, and, more importantly, unified management of all these sites? This presentation will discuss architectural and deployment options for multi-site deployments of OpenStack.