Robotron: Top-down Network Management at Scale


Yu-Wei Eric Sung, Xiaozheng Tie, Starsky H.Y. Wong, Hongyi Zeng. ACM SIGCOMM 2016, August 25, 2016.


slide-1
SLIDE 1
slide-2
SLIDE 2

Robotron: Top-down Network Management at Scale

Yu-Wei Eric Sung, Xiaozheng Tie, Starsky H.Y. Wong, Hongyi Zeng

ACM SIGCOMM 2016 August 25, 2016

slide-3
SLIDE 3

Scale of Facebook Community

  • Facebook: 1.7 Billion monthly users
  • WhatsApp: 1 Billion monthly users
  • Instagram: 500 Million monthly users
  • Messenger: 1 Billion monthly users
slide-4
SLIDE 4

Network Management at Facebook

[Figure: network diagram; nodes labeled R, numbered 1, 511, 512, 1024]
  • Goals: Build and evolve FB network
  • Example tasks: circuit/device turnup, network monitoring

  • Human interactions -> outages

What’s involved?

slide-5
SLIDE 5
Network Management at Facebook

Why is it hard?

  • Distributed Configurations
  • Multiple Domains
  • Versioning
  • Dependency
  • Vendor Differences

slide-6
SLIDE 6

Network Management at Facebook

2004-2007 2008 2009 2010 2011 2012 2013 2014 2015

Manual Configuration and Monitoring with ad-hoc scripts

Early days…

slide-7
SLIDE 7

Contribution

2004-2007 2008 2009 2010 2011 2012 2013 2014 2015

Manual Configuration and Monitoring with ad-hoc scripts → Robotron started → Our Paper

  • Shed light on
  • Network management tasks
  • Robotron’s usage
  • Evolution of Robotron
  • Our experiences using Robotron
slide-8
SLIDE 8

Overview of Facebook’s Network

Lifecycle of user requests

Users → Internet → POPs → Backbone → Data Centers

slide-9
SLIDE 9

Point of Presence (POP)

POPs Internet Backbone Data Centers Users

  • Standardized topology
  • Services: LB, Cache
  • Common tasks
  • Build/upgrade a cluster
  • Provisioning new peering circuits

slide-10
SLIDE 10

Backbone

POPs Internet Backbone Data Centers Users

  • Irregular, demand-driven topology

  • Common tasks:
  • Add/migrate circuits
  • Add/remove routers
slide-11
SLIDE 11

Datacenter

POPs Internet Backbone Data Centers Users

  • Standardized topology
  • Services: Web, Cache, Database

  • Common tasks
  • Build/decomm a cluster
  • Cluster capacity upgrade
slide-12
SLIDE 12

Overview of Facebook’s Network

Multiple versions of FB cluster architectures co-exist

[Figure: two panels, POP and DC, each plotting # of clusters (normalized) over time per cluster generation. One panel shows Gen1 and Gen2; the other shows 8 generations: Gen1, Gen2-A, Gen2-B, Gen2-C, Gen2-D, Gen2V6, Gen3, Gen3V6.]

slide-13
SLIDE 13

Robotron: “Top-Down” Network Management System@FB

Overview: FBNet DB, Network Design, Config Generation, Deployment, Monitoring

slide-14
SLIDE 14

FBNet: Modeling the Network

Example 4-post POP cluster

[Figure: 4-post POP cluster with PR1, PR2, PSWa, PSWb, PSWc, PSWd, BB1, BB2 and the Internet; links labeled 20G; the PSWs lead to top-of-rack switches & servers]

slide-15
SLIDE 15

FBNet: Modeling the Network

Object

Objects: Networkswitch, Linecard, PhysicalInterface (x2), AggregatedInterface, V6Prefix, BgpV6Session, Circuit (x2)

[Figure: PR1 and PSWa joined by two 10G circuits, with a linecard, physical interfaces et1/1, et1/2, et2/1, et3/1, aggregated interfaces ae0 and ae1, prefixes 2001::1 and 2001::2, and an eBGP session]

slide-16
SLIDE 16

FBNet: Modeling the Network

Value

  • Networkswitch: name=PSWa
  • Linecard: slot=1, model=X
  • PhysicalInterface: name=et1/1
  • PhysicalInterface: name=et1/2
  • AggregatedInterface: name=ae0
  • V6Prefix: prefix=2001::1
  • BgpV6Session
  • Circuit: speed=10G
  • Circuit: speed=10G

[Figure: PR1 and PSWa joined by two 10G circuits, with a linecard, physical interfaces et1/1, et1/2, et2/1, et3/1, aggregated interfaces ae0 and ae1, prefixes 2001::1 and 2001::2, and an eBGP session]

slide-17
SLIDE 17

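FBNet: Modeling the Network

Relationship

  • Networkswitch: name=PSWa
  • Linecard: slot=1, model=X, device=
  • PhysicalInterface: name=et1/1, linecard=, agg_interface=
  • PhysicalInterface: name=et1/2, agg_interface=, linecard=
  • AggregatedInterface: name=ae0
  • V6Prefix: prefix=2001::1, interface=
  • BgpV6Session: a_prefix=, z_prefix=
  • Circuit: a_endpoint=, z_endpoint=, speed=10G
  • Circuit: a_endpoint=, z_endpoint=, speed=10G

Fields shown without a value (device=, linecard=, agg_interface=, interface=, a_prefix=, z_prefix=, a_endpoint=, z_endpoint=) are the relationships: on the slide they point at other objects.

[Figure: PR1 and PSWa joined by two 10G circuits, with a linecard, physical interfaces et1/1, et1/2, et2/1, et3/1, aggregated interfaces ae0 and ae1, prefixes 2001::1 and 2001::2, and an eBGP session]

It’s complicated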

slide-18
SLIDE 18

FBNet Model Snippet

class PhysicalInterface(Interface):
    linecard = models.ForeignKey(Linecard)
    agg_interface = models.ForeignKey(AggregatedInterface)

slide-19
SLIDE 19

FBNet Model Snippet

Related models

class PhysicalInterface(Interface):
    linecard = models.ForeignKey(Linecard)
    agg_interface = models.ForeignKey(AggregatedInterface)

slide-20
SLIDE 20

FBNet Model Snippet

Model inheritance

class PhysicalInterface(Interface):
    linecard = models.ForeignKey(Linecard)
    agg_interface = models.ForeignKey(AggregatedInterface)
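The snippet above uses Django-style model fields. As a dependency-free illustration of the same three ideas (objects, values, relationships, plus model inheritance), here is a minimal sketch using plain dataclasses. The field types and the example objects are assumptions taken from the PR1/PSWa diagram, not the actual FBNet schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Linecard:
    slot: int
    model: str


@dataclass
class AggregatedInterface:
    name: str


@dataclass
class Interface:
    # Base model: values shared by every kind of interface.
    name: str


@dataclass
class PhysicalInterface(Interface):
    # Relationships: references to other objects rather than plain values.
    linecard: Optional[Linecard] = None
    agg_interface: Optional[AggregatedInterface] = None


lc = Linecard(slot=1, model="X")
ae0 = AggregatedInterface(name="ae0")
pifs = [
    PhysicalInterface(name="et1/1", linecard=lc, agg_interface=ae0),
    PhysicalInterface(name="et1/2", linecard=lc, agg_interface=ae0),
]

# Walk the relationship in reverse: which physical interfaces belong to ae0?
members = [p.name for p in pifs if p.agg_interface is ae0]
print(members)
```

In a real ORM the reverse lookup would be a query rather than a list scan; the point here is only that a relationship is a reference, while a value is stored on the object itself.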

slide-21
SLIDE 21

FBNet: Architecture

[Figure: FBNet behind an API Layer, with a Read Service and a Write Service, each exposing several APIs]

  • RPC services
  • Read: fine-grained per-model query
  • Write: task-based
  • High Availability: Multiple replicas per DC

slide-22
SLIDE 22

FBNet: Architecture

[Figure: API Layer, Read Service and Write Service over a Primary DB and Slave (secondary) DBs, connected by a replication stream]

  • 1 primary, multiple secondary DBs
  • Scalability: 1 slave per DC

slide-23
SLIDE 23

Robotron’s management life cycle

[Figure: life cycle stages Network Design, Config Generation, Deployment and Monitoring, with the FBNet DB at the center]

slide-24
SLIDE 24

Network Design

Design intent → FBNet objects

Template for a POP cluster:

Cluster(
    devices={
        PR: DeviceSpec(
            hardware="Router_Vendor1",
            num_devices=2),
        PSW: DeviceSpec(
            hardware="Switch_Vendor2",
            num_devices=4),
    },
    Link_groups=[
        LinkGroup(
            a_device=PR,
            z_device=PSW,
            pifs_per_agg=2,
            ip=V6)
    ]
)

FBNet objects:

  • BackboneRouters: 2
  • NetworkSwitches: 4
  • Circuits: 16
  • PhysicalInterfaces: 32
  • AggregatedInterfaces: 16
  • V6Prefixes: 16
  • BgpV6Sessions: 8

94 objects across 7 models

[Figure: PR1, PR2, PSWa, PSWb, PSWc, PSWd]
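The object counts on this slide can be reproduced with a little arithmetic. The sketch below assumes every PR is linked to every PSW (a full mesh between the two device groups), which is an assumption that happens to match the listed totals; `LinkGroup` and `count_objects` here are hypothetical helpers, not Robotron's API.

```python
from dataclasses import dataclass


@dataclass
class LinkGroup:
    a_count: int        # number of devices on the "a" side (PR)
    z_count: int        # number of devices on the "z" side (PSW)
    pifs_per_agg: int   # physical interfaces bundled into each aggregate


def count_objects(lg: LinkGroup) -> dict:
    pairs = lg.a_count * lg.z_count      # assumed: every a-device links to every z-device
    circuits = pairs * lg.pifs_per_agg   # one circuit per bundled physical link
    return {
        "BackboneRouters": lg.a_count,
        "NetworkSwitches": lg.z_count,
        "Circuits": circuits,
        "PhysicalInterfaces": circuits * 2,  # each circuit has two endpoints
        "AggregatedInterfaces": pairs * 2,   # one aggregate on each side of a pair
        "V6Prefixes": pairs * 2,             # one prefix per aggregate
        "BgpV6Sessions": pairs,              # one session per device pair
    }


counts = count_objects(LinkGroup(a_count=2, z_count=4, pifs_per_agg=2))
total = sum(counts.values())
print(counts, total)
```

With 2 PRs, 4 PSWs and 2 physical interfaces per aggregate, this yields the 94 objects across 7 models quoted on the slide, which shows how a short template fans out into many interdependent objects.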

slide-25
SLIDE 25

Config Generation

FBNet objects → Device configs

[Figure: FBNet objects for PR1, PR2, PSWa, PSWb, PSWc, PSWd are split into per-device objects (vendor agnostic) that follow the Config Schema]

Config Schema:

struct Device {
  1: list<AggregatedInterface> aggs,
}

struct AggregatedInterface {
  1: string name,
  2: i32 number,
  3: string v4_prefix,
  4: string v6_prefix,
  5: list<PhysicalInterface> pifs,
}

struct PhysicalInterface {
  1: string name,
}

slide-26
SLIDE 26

Config Generation

FBNet objects → Device configs

[Figure: vendor-agnostic per-device objects for PR1, PR2, PSWa, PSWb, PSWc, PSWd are combined with vendor-specific templates (Vendor 1 and Vendor 2 each have an interface template, a BGP template and an MPLS template) to produce one vendor-specific config per device]

Vendor-specific interface template:

{% for agg in device.aggs %}
interface {{agg.name}}
  mtu 9192
  no switchport
  load-interval 30
  {% if agg.v4_prefix %}
  ip addr {{agg.v4_prefix}}
  {% endif %}
  {% if agg.v6_prefix %}
  ipv6 addr {{agg.v6_prefix}}
  {% endif %}
  no shutdown
!
{% endfor %}
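To make the schema-plus-template step concrete, here is a minimal sketch that mirrors the Config Schema with dataclasses and reproduces the interface template's logic in plain Python. It is a stand-in for a template engine, not the engine the slides use; the two-space indentation of the output and the `render_interfaces` name are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PhysicalInterface:
    name: str


@dataclass
class AggregatedInterface:
    name: str
    number: int
    v4_prefix: Optional[str] = None
    v6_prefix: Optional[str] = None
    pifs: List[PhysicalInterface] = field(default_factory=list)


@dataclass
class Device:
    aggs: List[AggregatedInterface] = field(default_factory=list)


def render_interfaces(device: Device) -> str:
    """Emit one interface stanza per aggregate, like the template's for-loop."""
    lines: List[str] = []
    for agg in device.aggs:
        lines.append(f"interface {agg.name}")
        lines += ["  mtu 9192", "  no switchport", "  load-interval 30"]
        if agg.v4_prefix:  # address lines appear only when a prefix is set
            lines.append(f"  ip addr {agg.v4_prefix}")
        if agg.v6_prefix:
            lines.append(f"  ipv6 addr {agg.v6_prefix}")
        lines += ["  no shutdown", "!"]
    return "\n".join(lines)


pswa = Device(aggs=[AggregatedInterface(name="ae0", number=0, v6_prefix="2001::1")])
config = render_interfaces(pswa)
print(config)
```

Because the per-device object is vendor agnostic, supporting a second vendor would mean adding another renderer (or template) over the same `Device` data rather than changing the data itself.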


slide-30
SLIDE 30
Usage Statistics

  • # of FBNet model changes?
  • # of changed FBNet objects per design change?
  • Frequency and size of config changes?

slide-31
SLIDE 31

FBNet Model Changes

How much does the FBNet model change over time?

  • Still many changes over time
  • Reasons: new models, values, relationships
slide-32
SLIDE 32

Design Changes

How many FBNet objects are changed per design change?

[Figure: two panels, POP/DC and Backbone. Each plots the CDF across design changes against the # of FBNet objects changed (1 to 10,000), with one curve each for All, Interface, Circuit, v6 Prefix, v4 Prefix and Device.]

slide-33
SLIDE 33

Design Changes

How many FBNet objects are changed per design change?

[Figure: two panels, POP/DC and Backbone. Each plots the CDF across design changes against the # of FBNet objects changed (1 to 10,000), with one curve each for All, Interface, Circuit, v6 Prefix, v4 Prefix and Device.]

  • POP/DC: bigger design changes
  • Backbone: smaller design changes
slide-34
SLIDE 34
Configuration Changes

What’s the frequency and size of configuration changes?

  • Median number of config lines changed per week
      • POP/DC devices: 500 lines
      • Backbone devices: <100 lines
  • Average number of changes per week
      • POP/DC devices: 2.53
      • Backbone devices: 12.46
  • POP/DC: few bigger config changes
  • Backbone: many smaller config changes
slide-35
SLIDE 35

Evolution of Robotron

Bottom-up, experience driven

[Timeline, 2008 to 2016, labeled Robotron. Milestones shown: FBNet modeling started; active monitoring; passive monitoring; basic deployment; basic design and config generation]

slide-36
SLIDE 36
Experience: Modeling is laborious

Problem Scenario: new eBGP session configuration

  • A new eBGP session needed a proper import policy
  • Robotron was used without proper support → egress link saturated
  • Most development time spent on model changes
  • Lesson: Modeling is hard
  • Open problem: Lack of a network model widely accepted by vendors

slide-37
SLIDE 37
Experience: Coupling changes is key

Problem Scenario: POP cluster switch turnup

  1. An engineer updated FBNet to add a new rack, but forgot to generate config
  2. The engineer pushed stale config
  3. The rack added never came online

  • Lesson: Network design, config generation and deployment should be tightly coupled
  • Open problem:
      • Atomicity
      • Conflict resolution
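One way to picture the coupling this lesson asks for is to stamp every generated config with a fingerprint of the design it came from, and have deployment refuse anything stale. The sketch below is purely illustrative: `design_fingerprint`, `generate` and `deploy` are hypothetical names, and the slides do not say Robotron works this way.

```python
import hashlib
import json
from dataclasses import dataclass


def design_fingerprint(objects: dict) -> str:
    # Stable hash of the design objects a config was generated from.
    blob = json.dumps(objects, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()


@dataclass
class GeneratedConfig:
    text: str
    fingerprint: str


def generate(objects: dict) -> GeneratedConfig:
    text = "\n".join(f"rack {name}" for name in sorted(objects["racks"]))
    return GeneratedConfig(text=text, fingerprint=design_fingerprint(objects))


def deploy(config: GeneratedConfig, current_objects: dict) -> str:
    # Refuse to push a config that predates the current design.
    if config.fingerprint != design_fingerprint(current_objects):
        raise RuntimeError("stale config: regenerate before deploying")
    return config.text


design = {"racks": ["rack1"]}
config = generate(design)
design["racks"].append("rack2")  # step 1: design updated, config not regenerated
try:
    deploy(config, design)       # step 2: the stale push is rejected
    rejected = False
except RuntimeError:
    rejected = True
fresh = deploy(generate(design), design)
print(rejected, fresh)
```

This only detects staleness; the open problems named on the slide (atomicity and conflict resolution across concurrent changes) are not addressed by a check like this.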
slide-38
SLIDE 38
Experience: Fallback is important

Problem Scenario: Robotron-less management

  • Engineer bypassed Robotron to manually configure devices
      • SSH into device
      • Make config change
      • Log out
  • Needed upon emergencies
  • Passively curtail with config monitoring
  • Lesson: Bypassing mechanism is needed
  • Open problem:
      • How to reliably account for such activities?
      • How to safely revert such activities?
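Config monitoring of this kind boils down to comparing what was generated with what is actually running on the device. A minimal sketch using the standard library's `difflib`; the config text and the `config_drift` name are made up for illustration.

```python
import difflib
from typing import List


def config_drift(generated: str, running: str) -> List[str]:
    """Return the lines that differ between generated and running config."""
    diff = difflib.ndiff(generated.splitlines(), running.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    return [line for line in diff if line.startswith(("+ ", "- "))]


generated = "interface ae0\n  mtu 9192\n  no shutdown"
running = "interface ae0\n  mtu 1500\n  no shutdown"  # hand-edited on the device

drift = config_drift(generated, running)
for line in drift:
    print(line)
```

An empty result means the device matches its generated config; a non-empty one flags a manual change that still has to be accounted for or reverted, which is the open problem the slide raises.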
slide-39
SLIDE 39
Conclusion

  • First work sharing experience on a production network management system
  • Open research problems:
      • Network modeling
      • Atomicity and conflict resolution across management tasks
      • Make network management system work with manual fallback mechanisms

slide-40
SLIDE 40

Questions?

  • robotron@fb.com
  • Poster session on Thursday
slide-41
SLIDE 41
Overview of Facebook’s Network

Backbone: Interconnecting POPs/DCs

  • Irregular, demand-driven topology
  • PRs/DRs form an iBGP mesh
  • Common tasks:
      • Add/migrate circuits
      • Add/remove BBs/PRs/DRs

[Figure: backbone routers (BB) interconnecting PR1 and PR2 (to POPs & Internet) with DR1 and DR2 (to DCs)]

slide-42
SLIDE 42
Overview of Facebook’s Network

Point of Presence (POP)

  • Standardized topology
  • Services: LB (Proxygen), Cache
  • Common tasks
      • Build/upgrade a cluster
      • Provisioning new peering circuits

[Figure: PR1 and PR2 shown with the Internet, BB1 and BB2, and the POP clusters]

slide-43
SLIDE 43
Overview of Facebook’s Network

Data Center

  • Standardized topology
  • Services: Web, Cache (TAO), Database
  • Common tasks
      • Build/decomm a cluster
      • Cluster capacity upgrade

[Figure: DR1 and DR2 shown with BB3, BB4 and the DC clusters]

slide-44
SLIDE 44

FBNet: Modeling the Network

Object, Value, and Relationship

  • Networkswitch: name=PSWa
  • Linecard: slot=1, model=X, device=
  • PhysicalInterface: name=et1/1, linecard=, agg_interface=
  • PhysicalInterface: name=et1/2, agg_interface=, linecard=
  • AggregatedInterface: name=ae0
  • V6Prefix: prefix=2001::1, interface=
  • BgpV6Session: a_prefix=, z_prefix=
  • Circuit: a_endpoint=, z_endpoint=, speed=10G
  • Circuit: a_endpoint=, z_endpoint=, speed=10G

[Figure: PR1 and PSWa joined by two 10G circuits, with a linecard, physical interfaces et1/1, et1/2, et2/1, et3/1, aggregated interfaces ae0 and ae1, prefixes 2001::1 and 2001::2, and an eBGP session]

slide-45
SLIDE 45

Dependencies between FBNet models

[Figure: CDF across models of the # of related models (x-axis from 5 to 30)]

slide-46
SLIDE 46
Experience: Fallback is needed

Problem Scenario: manual changes to devices

  • Manual config changes on devices are error-prone
  • Ideal: All changes made through Robotron
  • Reality: Robotron has latency, bugs and missing features. Quick fixes needed upon emergency
  • Alternatives to discourage manual changes:
      • Config monitoring
      • Automatic config override after emergency window

slide-47
SLIDE 47
Related Work

  • Bottom-up config analysis: [Benson11, Sung09, Kim11, …]
  • Abstraction-driven design and config generation:
      • Top-down config optimization: [Condor, Sun13]
      • Centralized platform for network management: [Onix, Statesman]
      • Template-based config generation: [Enck09]
      • Config modeling: [OpenConfig, DMTF]

slide-48
SLIDE 48

FBNet: Modeling the Network

Desired versus Derived

[Figure: the FBNet Desired state (A, B, C) set against the Derived state (A, B, C), asking whether the two are equal]
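The "= ?" question can be read as a comparison between two sets of objects. The sketch below assumes "Desired" is the intended state recorded in FBNet and "Derived" is the state observed from the network; that reading, the `compare` helper and the example values are all assumptions for illustration.

```python
def compare(desired: dict, derived: dict) -> dict:
    """Classify objects by how the derived state deviates from the desired one."""
    shared = desired.keys() & derived.keys()
    return {
        "missing": sorted(desired.keys() - derived.keys()),     # wanted, not seen
        "unexpected": sorted(derived.keys() - desired.keys()),  # seen, not wanted
        "mismatched": sorted(k for k in shared if desired[k] != derived[k]),
    }


desired = {"A": "10G", "B": "10G", "C": "10G"}
derived = {"A": "10G", "B": "1G", "D": "10G"}

report = compare(desired, derived)
print(report)
```

When all three lists come back empty, desired and derived agree; anything else is a discrepancy to investigate.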

slide-49
SLIDE 49
Deployment

Device configs → Devices

  • New device: full config replacement
  • Existing devices: Incremental “Live” updates
  • Dryrun, Atomic, Phased, etc.
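As a rough illustration of the "Dryrun" and "Phased" modes named above, the sketch below pushes to devices in fixed-size phases and stops as soon as a phase fails a health check. The `push` and `healthy` callbacks, the phase size and the halting rule are all hypothetical; the slides do not describe how Robotron implements these modes.

```python
from typing import Callable, List


def phased_deploy(
    devices: List[str],
    push: Callable[[str], None],
    healthy: Callable[[List[str]], bool],
    phase_size: int,
    dryrun: bool = False,
) -> List[str]:
    """Push configs phase by phase; return the devices actually pushed."""
    pushed: List[str] = []
    for start in range(0, len(devices), phase_size):
        phase = devices[start:start + phase_size]
        if dryrun:
            # Report what would happen without touching any device.
            print("dryrun: would push to", phase)
            continue
        for device in phase:
            push(device)
            pushed.append(device)
        if not healthy(phase):
            # Halt so later phases are never touched by a bad change.
            break
    return pushed


log: List[str] = []
devices = ["PSWa", "PSWb", "PSWc", "PSWd"]
preview = phased_deploy(devices, log.append, lambda phase: True, 2, dryrun=True)
result = phased_deploy(devices, log.append, lambda phase: "PSWb" not in phase, 2)
print(preview, result)
```

In this run the dry run touches nothing, and the real rollout stops after the first phase because its health check fails, leaving PSWc and PSWd unchanged.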

slide-50
SLIDE 50
Monitoring

Is the network healthy?

  • Passive monitoring
  • Active monitoring
  • Config monitoring
