Flexible Networking at Large Mega-Scale: Exploring issues and solutions (PowerPoint PPT presentation)



SLIDE 1

Flexible Networking at Large Mega-Scale

Exploring issues and solutions

SLIDE 2

What is “Mega-Scale”?

One or more of:

  • > 10,000 compute nodes
  • > 100,000 IP addresses
  • > 1 Tb/s aggregate bandwidth
  • Massive East/West traffic between tenants

Yahoo is “Mega-Scale”
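The thresholds above can be written as a simple predicate. This is only an illustration; the numbers come straight from the slide, while the function and parameter names are invented:

```python
def is_mega_scale(compute_nodes, ip_addresses, aggregate_tbps):
    """The slide's definition: "mega-scale" means meeting one or
    more of the listed thresholds."""
    return (compute_nodes > 10_000        # > 10,000 compute nodes
            or ip_addresses > 100_000     # > 100,000 IP addresses
            or aggregate_tbps > 1)        # > 1 Tb/s aggregate bandwidth
```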

SLIDE 3

What are our goals?

  • Mega-Scale, with

○ Reliability

■ Yahoo supports ~200 million users/day -- it must be reliable

○ Flexibility

■ Yahoo has 100s of internal and user-facing services

○ Simplicity

■ Undue complexity is the enemy of scale!

SLIDE 4

Our Strategy

Leverage high-performance network design with:

➢ OpenStack
➢ Augmented with additional automation
➢ Hosting applications designed to be “disposable”

  • Fortunately, we already had many of the needed pieces
SLIDE 5

Traditional network design

  • Large layer 2 domains
  • Cheap to build and manage
  • Allows great flexibility of solutions
  • Leverage pre-existing network design
  • IP mobility across the entire domain

It’s Simple. But...

SLIDE 6

L2 Networks Have Limits

  • The L2 Domain can only be extended so far

○ Hardware TCAM limitations (size and update rate)
○ STP scaling/stability issues

  • But an L3 network can

○ scale larger
○ at less cost
○ but limits flexibility

SLIDE 7

Potential Solutions

  • Why not use a Software Defined Network?

○ Overlay allows IP mobility but

■ Control plane limits scale and reliability
■ Overhead at on-ramp boundaries

○ OpenFlow-based solutions

■ Not ready for mega-scale yet w/ L3 support
■ Control plane complexities

Not Ready for Mega-Scale

SLIDE 8

Our Solution

  • Use Clos design network backplane
  • Each cabinet has a Top-Of-Rack router

○ Cabinet is a separate L2 domain
○ Cabinets “own” one or more subnets (CIDRs)
○ OpenStack is patched to “know” which subnet to use

  • Network backplane supports East-West and North-South traffic equally well

  • Structure is ideal if we decide to deploy SDN overlay
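A minimal sketch of the “cabinets own subnets” idea. The rack names, CIDRs, and hostname convention are assumptions; the deck does not say how the patched OpenStack actually discovers a hypervisor's rack:

```python
import ipaddress

# Hypothetical layout: each cabinet (rack) "owns" one or more CIDRs,
# and the rack is recoverable from the hypervisor's hostname.
RACK_SUBNETS = {
    "rack-a01": [ipaddress.ip_network("10.1.0.0/24")],
    "rack-a02": [ipaddress.ip_network("10.1.1.0/24"),
                 ipaddress.ip_network("10.1.2.0/24")],
}

def subnets_for_hypervisor(hostname):
    """Return the CIDRs an instance on this hypervisor may draw an
    address from -- the lookup a patched scheduler would perform."""
    rack = "-".join(hostname.split("-")[:2])  # "rack-a02-node03" -> "rack-a02"
    return RACK_SUBNETS.get(rack, [])
```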
SLIDE 9

A solution for scale: Layer 3 to the rack

[Diagram: compute racks and a combined compute + admin rack (Admin = API, DB, MQ, etc.); L2 within each rack, L3 between racks]

  • Clos-based L3 network
  • TOR (Top Of Rack) routers
SLIDE 10

Adding Robustness With Availability Zones

SLIDE 11

Problems

  • No IP Mobility Between Cabinets

○ Moving a VM between cabinets requires a re-IP
○ Many small subnets rather than one or more large ones
○ Scheduling complexities:

■ Availability zones, rack-awareness

  • Other issues

○ Coordination between clusters
○ Integration with existing infrastructure

You call that “flexible?”

SLIDE 12

(re-)Adding Flexibility

  • Leverage Load Balancing

○ Allows VMs to be added and removed

(remember, our VMs are mostly “disposable”)

○ Conceals IP changes (such as rack-to-rack movement)
○ Facilitates high availability
○ Is the key to flexibility in what would otherwise be a constrained architecture
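The concealment role of the VIP can be sketched in a few lines. The `Vip` class and the addresses below are illustrative, not Yahoo's implementation:

```python
class Vip:
    """Minimal sketch: a stable VIP fronting disposable VM backends.
    Clients only ever see the VIP; backend IPs can churn freely."""
    def __init__(self, address):
        self.address = address   # never changes from the client's view
        self.members = set()     # backend VM IPs, may change often

    def add_member(self, ip):
        self.members.add(ip)

    def remove_member(self, ip):
        self.members.discard(ip)

# A VM "moves" cabinets by being deleted and re-created with a new IP;
# updating VIP membership conceals the re-IP from clients.
vip = Vip("192.0.2.10")
vip.add_member("10.1.0.5")
vip.remove_member("10.1.0.5")   # old instance torn down
vip.add_member("10.2.0.9")      # replacement in another cabinet
```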

SLIDE 13

(re-)Adding Flexibility (cont’d)

  • Automate it:

○ Load Balancer Management

■ Device selection based on capacity & quotas
■ Association between service groups and VIPs
■ Assignment of VMs to VIPs

○ Availability Zone selection & balancing
○ Multiple cluster integration

  • Implement “Service Groups”

○ (external to OpenStack -- for now)
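One of the automated steps above, device selection by capacity and quota, might look roughly like this. The field names and the "most headroom wins" policy are assumptions for illustration:

```python
def pick_lb_device(devices, needed_vips=1):
    """Hypothetical device-selection step: choose the load balancer
    with the most spare capacity that can still fit the request."""
    candidates = [d for d in devices
                  if d["capacity"] - d["used"] >= needed_vips]
    if not candidates:
        raise RuntimeError("no load balancer has free capacity")
    # Prefer the device with the most headroom.
    return max(candidates, key=lambda d: d["capacity"] - d["used"])

devices = [
    {"name": "lb-1", "capacity": 100, "used": 98},
    {"name": "lb-2", "capacity": 100, "used": 40},
]
```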

SLIDE 14

Service Groups

  • Consists of groups of VMs running the same application
  • Can be a layer of an application stack, an implementation of an internal service, or a user-facing server

  • Present an API that functions behind a VIP

○ Web services everywhere!
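A sketch of what a service-group record might hold, given the slide's description; the class, field names, and addresses are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class ServiceGroup:
    """Sketch of a service group: a named set of interchangeable VMs
    running one application, reachable only through a VIP."""
    name: str
    vip: str
    members: set = field(default_factory=set)

    def endpoint(self):
        # Consumers address the group's API, never an individual VM.
        return f"http://{self.vip}/"

sg = ServiceGroup(name="image-resize", vip="192.0.2.20")
sg.members.update({"10.1.0.11", "10.3.0.4"})  # VMs may sit in different racks
```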

SLIDE 15

Service Group Creation

SLIDE 16

Integrating With OpenStack

SLIDE 17

Putting It Together

  • Registration of hosts and services

○ A VM is associated with a service group at creation
○ A tag associated with the service group is accessible to resource allocation

  • Control of load balancers

○ Allocates and controls hardware
○ Manages VMs for each service group
○ Provides elasticity and robustness
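The registration step above can be sketched as a simple lookup table; the identifiers and tag format are hypothetical:

```python
# Hypothetical registry: at creation a VM is tied to its service group,
# and the group's tag is available to resource-allocation decisions.
REGISTRY = {}

def register_vm(vm_id, service_group, tag):
    REGISTRY[vm_id] = {"group": service_group, "tag": tag}

def tag_for(vm_id):
    """Lookup performed by scheduling / load-balancer automation."""
    return REGISTRY[vm_id]["tag"]

register_vm("vm-0042", "image-resize", "tier:web")
```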

SLIDE 18

Putting It Together (cont’d)

  • OpenStack Extensions and Patches

○ Three points of integration:

■ 1. Intercept request before issue
■ 2a. Select network based on hypervisor
■ 2b. Transmit new instance information to external automation
■ 3. Transmit deleted instance information to external automation
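The integration points can be modeled as lifecycle hooks. Everything below is an illustrative sketch under assumed data shapes, not the actual patches:

```python
def intercept_request(request):
    """(1) Intercept the boot request before it is issued,
    e.g. to attach service-group metadata."""
    request.setdefault("tags", []).append("service-group")
    return request

def select_network(rack, rack_subnets):
    """(2a) Select the network from the subnets owned by the
    target hypervisor's rack."""
    return rack_subnets[rack][0]

NOTIFICATIONS = []

def notify_external(event, instance_id):
    """(2b)/(3) Push created/deleted instance information to the
    external automation that drives the load balancers."""
    NOTIFICATIONS.append((event, instance_id))

notify_external("created", "vm-7")
notify_external("deleted", "vm-7")
```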
SLIDE 19

Whither OpenStack?

  • Our Goals:

○ Minimize patching code
○ Minimize points of integration with external systems
○ Contribute back patches of general use
○ Replace custom code with community code:

■ Use Heat for automation
■ Use LBaaS to control load balancers

○ Share our experiences

SLIDE 20

Complications

  • OpenStack clusters don’t exist in a vacuum -- this makes scaling them harder

○ Existing physical infrastructure
○ Existing management infrastructure
○ Interaction with off-cluster resources
○ Security and organizational policies
○ Requirements of existing software stack
○ Stateful applications introduce complexities

SLIDE 21

Conclusion

  • Mega-Scale has unique issues

○ Many potential solutions don’t scale sufficiently
○ Some flexibility must be sacrificed

*BUT*

○ Mega-Scale also admits solutions that aren’t practical or cost-effective at smaller scale
○ Automation and integration with external infrastructure is key

SLIDE 22

Questions?

email: edhall@yahoo-inc.com