Using PubSub For Scheduling in Azure SDN Qi Zhang (Microsoft - - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Using PubSub For Scheduling in Azure SDN

Qi Zhang (Microsoft - Azure Networking)

slide-2
SLIDE 2
slide-3
SLIDE 3

Azure Networking

DC Hardware

  • SmartNIC/FPGA
  • SONiC

Services

  • Virtual Networks
  • Load Balancing
  • VPN Services
  • Firewall
  • DDoS Protection
  • DNS & Traffic Management

Intra-Region

  • DC Networks
  • Regional Networks
  • Optical Modules

WAN Backbone

  • Software WAN
  • Subsea Cables
  • Terrestrial Fiber
  • National Clouds

Edge and ExpressRoute

  • Internet Peering
  • ExpressRoute

CDN

  • Acceleration for applications and content

Last Mile

  • E2E monitoring (Network Watcher, Network Performance Monitoring)

[Diagram labels: Consumers; Enterprise, SMB, mobile; Enterprise DC/Corpnet; Internet Exchanges; Cable; Carrier; Edge; ExpressRoute; CDN; Microsoft WAN; Regional Networks; Azure Region 'A'; Azure Region 'B']

slide-4
SLIDE 4

Microsoft Global Network

One of the largest private networks in the world

  • 8,000+ ISP sessions
  • 130+ edge sites
  • 44 ExpressRoute locations
  • 33,000 miles of lit fiber
  • SDN Managed (SWAN, OLS)

DCs and Network sites not exhaustive

[World map of data centers and network sites. Legend: Data center; Owned Capacity; Moving to Owned; Leased Capacity; Edge Site]

slide-5
SLIDE 5

Software Defined Networking (SDN)

Azure SDN

Basis of all network virtualization in our datacenters

Control Plane

Centralized, hierarchical, highly scalable and available controllers

Data Plane

Host agent, drivers

[Diagram labels: vNIC (x6), Commodity HW, Central Controllers, SmartNIC, Host Agents]

Key to flexibility and scale is SDN Management API

slide-6
SLIDE 6

PubSub in SDN

  • Scale:
      • 40+ regions, hundreds of DCs, millions of servers
      • millions of VNets and LBs
  • Flexible, scalable and efficient scheduling between controllers and agents
  • Publisher/Subscriber pattern

[Diagram: a Controller publishes to PubSub (publish flow); PubSub notifies Agent 1 … Agent i … Agent N (notification flow)]
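The publish flow and notification flow on this slide can be illustrated with a minimal in-memory Python sketch. It is not the Azure service or its API: the `PubSub` class, its `subscribe`/`publish` methods, the topic name, and the message are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, str], None]


class PubSub:
    """Hypothetical in-memory broker that decouples one controller from N agents."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        # An agent registers interest in a topic once.
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        # Publish flow in, notification flow out; returns how many were notified.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)


received = []
broker = PubSub()
for agent in ("agent-1", "agent-i", "agent-N"):
    # Bind the agent name as a default argument so each callback keeps its own.
    broker.subscribe("vnet-42", lambda t, m, a=agent: received.append((a, t, m)))

notified = broker.publish("vnet-42", "goal-state-v2")
```

The controller never addresses agents directly, so adding an agent only means adding a subscription.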

slide-7
SLIDE 7

Virtual Network in Azure

  • Secure per customer virtual datacenter in the cloud
  • Instantiate and configure complex topologies in minutes
  • Rich security and networking services

[Diagram labels: Internet; Cross premises Connectivity; Virtual Network (x5); VNet Peering]

slide-8
SLIDE 8

CA-PA Mappings

[Diagram: VM1 and VM2 on Host Nodes 1, 2, and 3, each with a virtual switch (VM-SW1, VM-SW2, VM-SW3); a Directory Service sends control msgs; data traffic crosses the physical network]

Payload, including CA, is encapsulated
Traverses physical network

First mapping table:

CA          PA
10.0.0.1    10.1.1.2
10.0.0.4    10.1.1.3
10.0.0.6    10.1.3.3
10.0.0.7    10.1.5.2

Second mapping table:

CA          PA
10.0.0.1    10.1.5.3
10.0.0.7    10.1.1.4
10.0.0.4    10.1.3.2

Packets shown:

Encapsulated: 10.1.5.2 → 10.1.1.2 | 10.0.0.7 → 10.0.0.1 | Payload
Original:     10.0.0.7 → 10.0.0.1 | Payload
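The encapsulation shown on this slide can be sketched in Python under some assumptions. The addresses are the first CA/PA table from the slide; the `Packet` type and the `encapsulate`/`decapsulate` helpers are hypothetical and ignore the real wire format.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object  # a string, or an inner Packet when encapsulated


# First CA -> PA table from the slide.
CA_TO_PA: Dict[str, str] = {
    "10.0.0.1": "10.1.1.2",
    "10.0.0.4": "10.1.1.3",
    "10.0.0.6": "10.1.3.3",
    "10.0.0.7": "10.1.5.2",
}


def encapsulate(inner: Packet, mappings: Dict[str, str]) -> Packet:
    """Wrap a CA-addressed packet in an outer packet addressed by PA."""
    return Packet(src=mappings[inner.src], dst=mappings[inner.dst], payload=inner)


def decapsulate(outer: Packet) -> Packet:
    """Recover the CA-addressed packet at the destination host."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("packet is not encapsulated")
    return outer.payload


inner = Packet(src="10.0.0.7", dst="10.0.0.1", payload="Payload")
outer = encapsulate(inner, CA_TO_PA)
```

With the slide's table, the outer header becomes 10.1.5.2 → 10.1.1.2 while the inner header stays 10.0.0.7 → 10.0.0.1, which matches the packets drawn in the diagram.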

slide-9
SLIDE 9

PubSub for CA-PA Mapping

Challenges:

  • Scale: hundreds of thousands of agents, millions of VNets
  • Scope: cluster, regional, global
  • VNet size limit: 4K mappings -> 64K mappings, 500 peerings
  • Provisioning Speed: minutes -> seconds

[Diagram: a Directory Service with a VNet Controller serving Agent 1 … Agent i … Agent N, compared with a VNet Controller serving Agent 1 … Agent i … Agent N through PubSub]

slide-10
SLIDE 10

Scenario I: Global Peering

[Diagram: Region A / VNET A and Region B / VNET B, each with a VNet Controller, a PubSub instance, and an Agent]

slide-11
SLIDE 11

Scenario II: DataExfil

Policy:
{ id: "policy-123", service: "xstore", subscription: "{guid}", accounts: [ "users", "wiki.*" ], storage_type: "blob", access: "rw" }

METADATA (resource A): { subscription: "{guid}", account: "users", storage_type: "blob" }
METADATA (resource B): { subscription: "{guid}", account: "users", storage_type: "table" }
METADATA (resource C): { subscription: "{guid}", account: "wikimain", storage_type: "blob" }

[Diagram labels: NRP; PubSub; Host; Agent; VNetPolicyCache; Service Tunnel Policy; Storage FE; Policy; Resource "Metadata" for Resource A, Resource B, and Resource C; BLOCK]
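One way to read the policy and the three metadata records on this slide is as a match test. The sketch below is an assumption, not the documented evaluation logic: it treats each entry in `accounts` as a regular expression that must match the whole account name, and requires `subscription` and `storage_type` to be equal. The `matches` helper is hypothetical, and the slide does not say which resource the BLOCK label applies to.

```python
import re
from typing import Dict

policy = {
    "id": "policy-123",
    "service": "xstore",
    "subscription": "{guid}",
    "accounts": ["users", "wiki.*"],
    "storage_type": "blob",
    "access": "rw",
}

resources: Dict[str, Dict[str, str]] = {
    "A": {"subscription": "{guid}", "account": "users", "storage_type": "blob"},
    "B": {"subscription": "{guid}", "account": "users", "storage_type": "table"},
    "C": {"subscription": "{guid}", "account": "wikimain", "storage_type": "blob"},
}


def matches(policy: dict, meta: Dict[str, str]) -> bool:
    """Assumed semantics: equal subscription and storage type, account matches a pattern."""
    return (
        meta["subscription"] == policy["subscription"]
        and meta["storage_type"] == policy["storage_type"]
        and any(re.fullmatch(p, meta["account"]) for p in policy["accounts"])
    )


decisions = {name: matches(policy, meta) for name, meta in resources.items()}
```

Under these assumed semantics, resources A and C match the policy and resource B does not, because its storage type is "table".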

slide-12
SLIDE 12

Overview

  • Persisted KV Store
  • Hierarchical name space
  • Set watcher on a node
      • Single watcher
      • Bulk watcher
  • Interfaces
      • Publish (batch/multi supported)
      • Subscribe
      • Notification
      • Query
  • State Update/Delivery
      • Initial state
      • Subsequent state updates

Operations by interface:

Publish:       CreateNode, UpdateNode
Subscribe:     watcher, bulkwatcher
Notification:  Created, Deleted, DataChanged, ChildrenChanged
Query:         GetNodeInfo

[Diagram: a Publisher and a Subscriber around a tree with Root, partition keys PK1 … PKi … PKn, and child nodes a1, a2, a3, n, b1, b2, b3, b4, b5; watchers are marked W]
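The node store with single and bulk watchers described on this slide can be approximated with a small in-memory Python sketch. It is not the Madari API: `NodeStore` and its method names are hypothetical, the "Initial" event kind is invented here to stand for initial state delivery, persistence is omitted, and the Deleted and ChildrenChanged notifications are left out for brevity.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[str, str, str]  # (event kind, node path, node data)
Watcher = Callable[[Event], None]


class NodeStore:
    """Hypothetical hierarchical key-value store with watchers."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Watcher]] = {}
        self._bulk_watchers: Dict[str, List[Watcher]] = {}

    @staticmethod
    def _under(prefix: str, path: str) -> bool:
        return path == prefix or path.startswith(prefix.rstrip("/") + "/")

    def set_watcher(self, path: str, watcher: Watcher) -> None:
        self._watchers.setdefault(path, []).append(watcher)
        if path in self._data:  # initial state
            watcher(("Initial", path, self._data[path]))

    def set_bulk_watcher(self, prefix: str, watcher: Watcher) -> None:
        self._bulk_watchers.setdefault(prefix, []).append(watcher)
        for path in sorted(self._data):  # initial state for the whole subtree
            if self._under(prefix, path):
                watcher(("Initial", path, self._data[path]))

    def _notify(self, kind: str, path: str) -> None:
        event = (kind, path, self._data[path])
        targets = list(self._watchers.get(path, []))
        for prefix, watchers in self._bulk_watchers.items():
            if self._under(prefix, path):
                targets.extend(watchers)
        for watcher in targets:
            watcher(event)

    def create_node(self, path: str, data: str) -> None:
        if path in self._data:
            raise KeyError(f"node exists: {path}")
        self._data[path] = data
        self._notify("Created", path)

    def update_node(self, path: str, data: str) -> None:
        if path not in self._data:
            raise KeyError(f"no such node: {path}")
        self._data[path] = data
        self._notify("DataChanged", path)

    def get_node_info(self, path: str) -> str:
        return self._data[path]


PATH = "/Vnet/v1/mappings/ipv4/10.0.0.1"
store = NodeStore()
store.create_node(PATH, "10.1.1.2")

events: List[Event] = []
store.set_bulk_watcher("/Vnet/v1", events.append)        # receives initial state
store.update_node(PATH, "10.1.5.3")                      # subsequent state update
store.create_node("/Vnet/v2/mappings/ipv4/10.0.0.1", "10.9.9.9")  # outside the watched subtree
```

A bulk watcher on `/Vnet/v1` sees the existing mapping first and then every later change under that prefix, which mirrors the "initial state, then subsequent state updates" delivery listed above.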

slide-13
SLIDE 13

SDN PubSub Service

4 Microservices:

Stateless Services

  • Routing Service
  • Notification Service

Stateful Services

  • Selector Service
  • Madari Service
slide-14
SLIDE 14

SDN PubSub Service

[Diagram: Publisher (VNet Controller) and Subscriber (Agent), each with a numbered flow of steps 1-6]

Publish request:
  PK: /Vnet/{VnetId1}, Path: /mappings/ipv4/{CA1}
  Data (bond message): {PA1}
  Partition /Vnet/{VnetId1} is on MadariService_02

Subscribe request:
  SetBulkWatcher: PK: /Vnet/{VnetId2}, Path: /
  Partition /Vnet/{VnetId2} is on MadariService_03
  <notifications> are returned to the subscriber

slide-15
SLIDE 15

Madari Selector Service: Data Partitioning

[Diagram: AddPartitionKey("baz") goes to the Selector Service, which sits in front of MadariService_01, MadariService_02, and MadariService_03; steps are numbered 1, 2, 3]

Partition Key    Madari Instance
"foo"            MadariService_01
"bar"            MadariService_02
…..              …..
"baz"            MadariService_01

Madari Instance     Total Data Size
MadariService_01    1.05G
MadariService_02    1.9G
MadariService_03    1.6G
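In the example on this slide, "baz" lands on MadariService_01, which is also the instance with the smallest total data size. A minimal Python sketch of that placement follows, assuming a least-loaded rule; the slide does not state the actual selection policy, and `SelectorService` and `add_partition_key` here are hypothetical stand-ins for the real service.

```python
from typing import Dict


class SelectorService:
    """Hypothetical selector that maps partition keys to Madari instances."""

    def __init__(self, data_size_gb: Dict[str, float]) -> None:
        self.data_size_gb = dict(data_size_gb)
        self.assignments: Dict[str, str] = {}

    def add_partition_key(self, key: str) -> str:
        # Assumed policy: place a new key on the instance holding the least data.
        if key not in self.assignments:
            instance = min(
                self.data_size_gb,
                key=lambda name: (self.data_size_gb[name], name),
            )
            self.assignments[key] = instance
        return self.assignments[key]


selector = SelectorService(
    {"MadariService_01": 1.05, "MadariService_02": 1.9, "MadariService_03": 1.6}
)
# Existing rows of the partition table from the slide.
selector.assignments.update({"foo": "MadariService_01", "bar": "MadariService_02"})

chosen = selector.add_partition_key("baz")
```

Once a key is assigned it keeps its instance, so later lookups for the same partition key route to the same Madari service.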

slide-16
SLIDE 16

Subscription through Notification Service

[Diagram: Subscriber I, Subscriber II, and Subscriber III subscribe to VNets (vnet1, vnet2, vnet3) through NotificationService_03 and NotificationService_08. MadariService_02 holds Root with vnet 1 and vnet 2; MadariService_04 holds Root with vnet 3 and vnet 4. Connections are labeled A, B, C, D.]

slide-17
SLIDE 17

Service Fabric Ring

  • Service Fabric ring
      • Multiple PaaS tenants form a Service Fabric ring
      • Service Fabric ring is on a VNET
  • PubSub as Service Fabric application
      • Routing Service/Notification Service
          • Stateless
          • On every node
      • MadariService/MadariSelectorService(s)
          • Stateful
          • Min 3, target 7

[Diagram: nodes n1 … n15 spread across Tenant1, Tenant2, Tenant3 and Cluster1, Cluster2, Cluster3]

slide-18
SLIDE 18

Client Libraries

  • Managed Libraries
      • Madari.ClientLibrary
          • Publishing through WCF channel
      • Reliable Publisher
          • IMOS-based publishers
          • User implements:
              • Commit hooks
              • Handler
          • NuGet packages: Madari.ReliablePublisher.RSL, Madari.ReliablePublisher.ServiceFabric
  • Native Libraries
      • Publish
          • NuGet package: Madari.MadariFrontEnd.Native
      • Subscribe
          • NuGet package: Madari.Subscriber.Native

[Diagram: IMOS Repo, Commit hooks, Lib, Runtime, Worker, Handler. Labeled actions: Persist reliable tasks; Commit hooks triggered; Execute handler; Mark objects modified; Pick up tasks; Delete executed tasks on success; Retry on failure]
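The reliable publisher actions named on this slide (persist tasks from commit hooks, pick up tasks, execute the handler, delete on success, retry on failure) can be sketched as a small Python loop. This is an illustration under assumptions, not the Madari.ReliablePublisher library: the class, its methods, the in-memory queue standing in for persisted tasks, and the retry limit are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List


class ReliablePublisher:
    """Hypothetical worker: tasks survive until the handler succeeds."""

    def __init__(self, handler: Callable[[str], None], max_attempts: int = 3) -> None:
        self._tasks: Deque[str] = deque()  # stands in for a persisted task store
        self._handler = handler
        self._max_attempts = max_attempts
        self.failed: List[str] = []

    def on_commit(self, modified_object: str) -> None:
        # Commit hook: persist one task per modified object.
        self._tasks.append(modified_object)

    def pending(self) -> int:
        return len(self._tasks)

    def run_worker(self) -> None:
        while self._tasks:
            task = self._tasks[0]  # pick up the task but keep it stored
            for _attempt in range(self._max_attempts):
                try:
                    self._handler(task)
                except Exception:
                    continue  # retry on failure
                break  # handler succeeded
            else:
                self.failed.append(task)  # attempts exhausted
            self._tasks.popleft()  # delete the task once it is resolved


calls: List[str] = []


def flaky_publish(task: str) -> None:
    calls.append(task)
    if len(calls) == 1:
        raise ConnectionError("transient publish failure")


publisher = ReliablePublisher(flaky_publish)
publisher.on_commit("mapping-1")
publisher.run_worker()
```

The task is removed only after the handler returns, so a crash between pick-up and completion leaves it available for another attempt in a persisted implementation.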

slide-19
SLIDE 19

Hierarchical PubSub Infrastructure

Resource            Scope       Publisher          Subscriber
CA-PA mapping       regional    VNet Controller    Agent
DataExfil policy    global      NRP                Agent

Resource Scope => PubSub Service Scope

[Diagram: one Global PubSub carrying the DataExfil policy above three Regional PubSub instances, each carrying CA->PA mappings]
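The rule "Resource Scope => PubSub Service Scope" amounts to a lookup from resource type to the PubSub tier that carries it. A minimal Python sketch follows, using the two rows of the slide's table; the `pubsub_service_for` function and the returned labels are hypothetical.

```python
from typing import Dict

# Resource -> scope, taken from the table on this slide.
RESOURCE_SCOPE: Dict[str, str] = {
    "CA-PA mapping": "regional",
    "DataExfil policy": "global",
}


def pubsub_service_for(resource: str, region: str) -> str:
    """Pick the PubSub tier whose scope matches the resource's scope."""
    scope = RESOURCE_SCOPE[resource]
    if scope == "regional":
        return f"Regional PubSub ({region})"
    return "Global PubSub"


mapping_target = pubsub_service_for("CA-PA mapping", "Region A")
policy_target = pubsub_service_for("DataExfil policy", "Region A")
```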

slide-20
SLIDE 20

Global PubSub

[Diagram: the Global PubSub spans Region A and Region B. Each region runs PubSub (AZ01), PubSub (AZ02), and PubSub (AZ03), and a Replication Service connects the regions.]

slide-21
SLIDE 21

Publish Policy – No Replication (Sync)

[Diagram: numbered steps 1-8 among the Routing Service, Selector Service, Madari Service, and Replication Service in the Global PubSub, and a Remote Regional P/S. The node /DataExfil/Policies/{policyid} appears in both the Global PubSub and the Remote Regional P/S.]

slide-22
SLIDE 22

Replication Service

Operation Tracking Table

Op Id   Status        Operation                              Replication Details
1001    Replicated    [add] /DataExfil/Policies/Policy1      {Dest1:Y, Dest2:Y, Dest3:Y}
1002    Replicating   [update] /DataExfil/Policies/Policy1   {Dest1:Y, Dest2:N, Dest3:Y}
1003    Committed     [remove] /DataExfil/Policies/Policy1   {Dest1:N, Dest2:N, Dest3:N}

Destination Tracker

Dest1: req1002
Dest 2: req1001
Dest 3: req1001

[Diagram: a request to Partition 1 reaches Madariservice/01 and Replicationservice/01 for Partition 1, which holds the Replication Queue, the Operation Tracking Table, and the Destination Tracker]
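The Status column in the operation tracking table is consistent with a simple rule over the per-destination flags: all Y is Replicated, some Y is Replicating, no Y is Committed. The Python sketch below encodes that reading; the rule is inferred from the three rows on the slide rather than stated by it, and `replication_status` is a hypothetical helper.

```python
from typing import Dict


def replication_status(acks: Dict[str, bool]) -> str:
    """Derive an operation's status from its per-destination acknowledgements."""
    if not acks:
        raise ValueError("operation has no replication destinations")
    if all(acks.values()):
        return "Replicated"
    if any(acks.values()):
        return "Replicating"
    return "Committed"  # committed locally, not yet acknowledged anywhere


# Replication details for the three operations shown on the slide.
operations: Dict[int, Dict[str, bool]] = {
    1001: {"Dest1": True, "Dest2": True, "Dest3": True},
    1002: {"Dest1": True, "Dest2": False, "Dest3": True},
    1003: {"Dest1": False, "Dest2": False, "Dest3": False},
}

statuses = {op_id: replication_status(acks) for op_id, acks in operations.items()}
```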

slide-23
SLIDE 23

Global SF Ring

Tenant     Region           Nodes
Tenant1    uswest           n1 n2 n3 n4 n5
Tenant2    useast           n1 n2 n3 n4 n5
Tenant3    uswestcentral    n1 n2 n3 n4 n5
Tenant4    asiasoutheast    n1 n2 n3 n4 n5
Tenant5    europewest       n1 n2 n3 n4 n5

[Diagram labels: vnet1, vnet2, vnet3, vnet4, vnet5]

slide-24
SLIDE 24

Major Performance KPIs

KPI                    Value
Write throughput       10k req/s
Read throughput        42k req/s
End to End latency     10ms/300ms (50%/99%)
Max subscribers        500K

  • 15 partitions
  • In a large region:
      • < 300k agents
      • < 100K VNets
      • ~1k read/sec, ~200 write/sec
slide-25
SLIDE 25

Work in Progress

  • Accelerating read flow
  • End to end validation
slide-26
SLIDE 26

Q & A

Thank you!