OpenStack internal messaging at the edge: In-depth evaluation

SLIDE 1

OpenStack internal messaging at the edge: In-depth evaluation

Ken Giusti, Javier Rojas Balderrama, Matthieu Simonin

SLIDE 2

Who’s here?

Javier Rojas Balderrama, Matthieu Simonin, Ken Giusti
Fog, Edge and Massively Distributed Cloud Working Group (a.k.a. FEMDC)

  • Wed. 5:30pm - 6:10pm
SLIDE 3

Challenges at the edge

[Diagram: core network connecting DC1 and DC2, regional sites, local sites and edge sites]

Conceptual challenges

  • Scalability
  • Locality
  • Placement
  • Resiliency
  • ...
SLIDE 4
Messaging challenges at the edge

Conceptual challenges: the messaging perspective

  • Scalability: increase the number of communicating agents
  • Locality
    ○ Keep control traffic in the same latency domain (site) as much as possible
    ○ Mitigate control traffic over WAN:
      ■ APIs
      ■ Database state accesses and internal management (Thursday, 9AM: Keystone in the context of Fog/Edge MDC)
      ■ Remote Procedure Calls

➢ Scope: OpenStack’s Remote Procedure Calls in a massively distributed context

SLIDE 5

What’s next?

  • Oslo.messaging
  • Scalability evaluation
  • Locality evaluation
  • Lessons learnt
SLIDE 6

OpenStack Internal Messaging

SLIDE 7

OpenStack Interprocess Communications

[Diagram: processes exchanging Remote Procedure Calls (RPC) over a message bus]

SLIDE 8

OpenStack Interprocess Communications

  • Oslo.messaging

    ○ Part of the OpenStack Oslo project (https://wiki.openstack.org/wiki/Oslo)
    ○ APIs for messaging services
    ○ Remote Procedure Call (RPC)
      ■ Inter-project control messaging in OpenStack
    ○ Abstraction: hides the actual message bus implementation
      ■ Opportunity to evaluate different messaging architectures

SLIDE 9
  • oslo.messaging RPC
  • Remote Procedure Call (RPC)

    ○ Synchronous request/response pattern
    ○ Three different flavors:
      ■ Call: typical request/response
      ■ Cast: request, no response expected
      ■ Fanout: multicast version of Cast
    ○ How does the message get to the proper server?

[Diagram: RPC clients and servers]
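As a rough illustration of the three flavors, here is a minimal client-side sketch using the oslo.messaging API; the transport URL, topic and method names are illustrative and not taken from the presentation.

    import oslo_messaging
    from oslo_config import cfg

    # Connect to the message bus (URL is a placeholder).
    transport = oslo_messaging.get_rpc_transport(
        cfg.CONF, url='rabbit://guest:guest@bus-host:5672/')

    target = oslo_messaging.Target(topic='service-a')
    client = oslo_messaging.RPCClient(transport, target)

    # Call: typical request/response, blocks until the reply arrives.
    result = client.call({}, 'do_work', item=42)

    # Cast: fire-and-forget, no response expected.
    client.cast({}, 'log_event', event='started')

    # Fanout: multicast version of Cast, delivered to every server
    # subscribed to the topic.
    client.prepare(fanout=True).cast({}, 'refresh_cache')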

SLIDE 10
  • oslo.messaging Addressing (Targets)
  • Service Addressing

    ○ Project assigns servers a well-known address
      ■ Example: “Service-A”
    ○ Server subscribes to that address on the message bus
    ○ Clients send requests to “Service-A”
    ○ Represented by a Target class in the API
    ○ Unique to a particular server
      ■ Direct messaging
    ○ Or shared among servers
      ■ Load balancing / multicast

[Diagram: a client puts a message addressed “To: Service-A” on the bus; it is delivered to the server subscribed to Service-A, not to Service-B]
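A matching server-side sketch, again with illustrative names only: the server subscribes to the well-known topic through a Target, and the endpoint’s methods become callable from RPC clients.

    import time

    import oslo_messaging
    from oslo_config import cfg


    class ServiceAEndpoint(object):
        # Methods of the endpoint are invoked by name from RPC clients.
        def do_work(self, ctxt, item):
            return item * 2

        def log_event(self, ctxt, event):
            print('event: %s' % event)


    transport = oslo_messaging.get_rpc_transport(
        cfg.CONF, url='rabbit://guest:guest@bus-host:5672/')

    # topic='service-a' is the shared, well-known address; server='host-1'
    # additionally registers a unique, direct address for this instance.
    target = oslo_messaging.Target(topic='service-a', server='host-1')

    server = oslo_messaging.get_rpc_server(
        transport, target, [ServiceAEndpoint()], executor='threading')
    server.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        server.stop()
        server.wait()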

SLIDE 11
  • oslo.messaging Alternative Message Buses
  • Supports multiple underlying messaging implementations:

    ○ RabbitMQ broker (based on the AMQP 0-9-1 protocol)
    ○ Apache Qpid Dispatch Router (AMQP 1.0, ISO/IEC 19464)
      ■ Barcelona Summit router presentation: https://bit.ly/2Iuw6Pu

[Diagram: RPC clients and servers connected through a central broker]
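Which bus a service talks to is selected through its transport URL; a hedged configuration sketch, where host names and credentials are placeholders:

    # Any OpenStack service configuration (e.g. nova.conf)
    [DEFAULT]
    # RabbitMQ driver (AMQP 0-9-1):
    transport_url = rabbit://user:password@rabbit-host:5672/
    # Qpid Dispatch Router driver (AMQP 1.0) would instead use:
    # transport_url = amqp://router-host:5672/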

SLIDE 12
Broker and brokerless approaches to RPC

  • Brokered RPC Messaging (RabbitMQ)
    ○ Centralized communications hub (broker)
    ○ Queues “break” the protocol transfer
    ○ Non-optimal path

[Diagram: an RPC client at a local site reaches RPC Server A and RPC Server B through a broker placed in the core network, via a regional site]

SLIDE 15

Broker and brokerless approaches to RPC

  • Brokerless (Apache Qpid Dispatch Router)
    ○ Deployed in any topology
    ○ Dynamic routing protocol (Dijkstra)
    ○ Least-cost path between RPC client and server

[Diagram: a mesh of routers across the core network, regional site and local site; the RPC client reaches RPC Server A and RPC Server B over the least-cost path]
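To make the least-cost routing concrete, here is a hedged qdrouterd.conf fragment for one interior router of such a mesh; router ids, host names and cost values are invented for the example.

    # qdrouterd.conf (fragment) for a regional-site router -- illustrative values
    router {
        mode: interior
        id: regional-1
    }

    # Local RPC clients and servers attach here
    listener {
        host: 0.0.0.0
        port: 5672
        role: normal
    }

    # Link toward the core-network router; the cost feeds the
    # least-cost (Dijkstra) path computation
    connector {
        host: core-router.example.net
        port: 5673
        role: inter-router
        cost: 10
    }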

SLIDE 18

The Tools used for Testing

SLIDE 19

Oslo messaging benchmarking toolkit

[Diagram: the ombt controller sends commands to RPC clients and servers exercising the test bus and collects application metrics; the oo controller deploys the setup and collects system metrics]

  • Oslo Messaging Benchmarking Tool (ombt): https://git.io/vp2kX
  • Ombt orchestrator (oo): https://git.io/vp2kh
SLIDE 20

Scalability evaluation

SLIDE 21

Evaluation

  • Test Plan http://bit.do/os-mdrpc

    ○ Six scenarios (synthetic / operational)
    ○ Two drivers
      ■ Router
      ■ Broker
    ○ Three comm. patterns
      ■ Cast
      ■ Call
      ■ Fanout
    ○ Grid’5000 testbed

Methodology

SLIDE 22

Scalability evaluation

  • Single (large) shared distributed target
    ○ Global shared target
    ○ Scale the # of producers, constant throughput
    ○ How big can a single target be?
  • Multiple distinct distributed targets
    ○ Many targets running in parallel
    ○ Scale the # of targets
    ○ How many targets can be created?
  • Single (large) distributed fanout
    ○ Scale the # of consumers
    ○ How large can a fanout be?

Synthetic scenarios (TC1, TC2, TC3)

SLIDE 23

Scalability evaluation

Parameters

  • # of calls, # of agents (producers, consumers), call types
  • Bus topologies

    ○ RabbitMQ cluster of size 1, 3, 5
    ○ Complete graph of routers of size 1, 3, 5
    ○ Ring of routers up to size 30 (incl. latency between routers)

  • Bus load

    ○ Light: 1K msgs/s
    ○ Heavy: 10K msgs/s

    > oo test_case_2 --nbr_topics 100 --call-type rpc-call --nbr_calls 1000
    > oo campaign --incremental --provider g5k --conf conf.yaml test_case_1

SLIDE 24

Scalability evaluation

Memory consumption: Single shared target (TC1) - rpc-call

[Chart: memory consumption per bus instance vs. max # of supported agents; annotations: >25GB, ~12GB each, <10GB each, <5GB, ~2GB each, <2GB each]

SLIDE 25

Scalability evaluation

CPU consumption: Single shared target (TC1) - rpc-call

[Chart: CPU consumption per bus instance; annotations: >20 cores, >15 cores each, <10 cores each, ~3 cores, ~2 cores each, ~1 core each]

SLIDE 26

Scalability evaluation

Latency: Single shared target (TC1) - rpc-call - 8K clients


SLIDE 27

Scalability evaluation

Latency: Multiple distinct targets (TC2) - rpc-call - 10K Targets


SLIDE 28

Scalability evaluation

Latency: single large fanout (TC3) - rpc-fanout - 10K consumers


SLIDE 29

Wrap up

  • All test cases

○ Routers are lightweight (CPU, memory, network connections)

  • Single shared distributed target:

    ○ Implicit parallelism is observed only with routers (single queue in brokers)
    ○ Scale up to 10K producers

  • Multiple distinct distributed targets:

    ○ Similar behaviour for both drivers because of short buffering
    ○ Scale up to 10K targets (20K agents)

  • Single distributed fanout:

    ○ Router is less sensitive to the size of the broadcast
    ○ Scale up to 10K consumers

SLIDE 30

Locality evaluation

SLIDE 31

Multisite: producers and consumers spread over different, distant locations

  • Scalability
  • Locality

○ Keep traffic in the same latency domain as much as possible

Locality evaluation

Conceptual challenges: reminder

SLIDE 32

Locality evaluation

Strategies: Centralized message bus

  • Producer in site 1
    ○ Need to break the symmetry of the consumers
      ■ Give remote consumers a lower rabbit_qos_prefetch_count / rpc_server_credit
      ■ Effects depend on
        • Latency
        • Actual workload
  • Producer in site 2
    ○ Sub-optimal data path

➢ Bad locality in the general case

[Diagram: producer and consumer in Site 1 within LAN latency; additional consumers in Site 2 reachable only over WAN latency]
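These knobs live in the driver sections of the oslo.messaging configuration; a minimal sketch with illustrative values (the exact numbers are not from the presentation):

    # Service configuration on the remote (Site 2) consumers -- illustrative values
    [oslo_messaging_rabbit]
    # RabbitMQ driver: how many messages a consumer may prefetch; a lower
    # value on remote consumers makes the broker favour local ones
    rabbit_qos_prefetch_count = 1

    [oslo_messaging_amqp]
    # AMQP 1.0 driver: receive credit granted by an RPC server, the
    # equivalent lever for router-based deployments
    rpc_server_credit = 16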

SLIDE 33

Locality evaluation

Strategies: Sharded Message bus

  • E.g.: Nova Cells v2
  • Strict locality
    ○ A shard can be a latency domain
    ○ Traffic remains in a shard
  • Caveat:
    ○ Routing requests (consumer index, inter-shard communication)

➢ Routing is deferred to the application

[Diagram: a top shard (orchestration) above Shard 1, Shard 2 and Shard 3, each with its own producers and consumers]
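Nova Cells v2 realizes this sharding by giving every cell its own message bus, registered through the cell’s transport URL; a hedged sketch of creating such a cell, where names and URLs are placeholders:

    # Each cell points at its own RabbitMQ cluster (its own shard of the bus)
    nova-manage cell_v2 create_cell \
        --name cell-site-1 \
        --transport-url rabbit://user:password@rabbit-site-1:5672/ \
        --database_connection mysql+pymysql://nova:password@db-site-1/nova_cell1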

SLIDE 34

Locality evaluation

Strategies: Alternative to sharding

  • A tree of routers

➢ Routing is transparent to the application
➢ How is locality ensured?

[Diagram: routers r1, r2 and r3, one per shard, connected to a top-level router r0; each shard keeps its own producers and consumers]

SLIDE 35

Two levels of locality

  • Strict locality
    ○ Closest mode
    ○ Local consumers are picked over remote ones
    ○ Caveat: backlog can build up on local consumers
  • Load-sharing
    ○ Balanced mode
    ○ A cost is associated with every consumer
    ○ The consumer with the lowest cost is picked
    ○ Cost is dynamic

Locality evaluation

Strategies: Decentralized bus (AMQP 1.0 only)

[Diagram: producers attached to router r1 in Site 1; Consumer 1 (C1), Consumer 2 (C2) and Consumer 3 (C3) behind routers r1, r2 and r3 in Site 1, Site 2 and Site 3, with 50ms and 100ms of inter-site latency]
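In qdrouterd the two modes map onto the distribution of an address; a hedged configuration fragment, where the address prefix is assumed to match the RPC addresses produced by the oslo.messaging AMQP 1.0 driver:

    # qdrouterd.conf (fragment) -- consumer selection policy per address prefix
    address {
        # anycast RPC requests: exactly one consumer receives each message
        prefix: openstack.org/om/rpc/anycast
        # closest  = strict locality: the nearest consumer always wins
        # balanced = locality-aware load sharing driven by a dynamic cost
        distribution: balanced
    }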

SLIDE 36

➢ Cost is dynamic
➢ Load sharing is locality-aware

Locality evaluation

Decentralization of the message bus

  • Up to n=30 sites (ring)
  • Increase the message rate
  • Increase the inter-site latency
  • Evaluate the locality of message delivery

[Chart (n=3): locality of message delivery under increasing load and increasing inter-site latency; annotations: 99% local, 95% local, 66% local, 66% local]

SLIDE 37

Locality evaluation

Decentralization of the message bus: from High Availability OpenStack to High Locality OpenStack

[Diagram: routers r1, r2 and r3, one per site; each site hosts Control (C), Network (N) and compute nodes (cpts), from compute node 1 to compute node n]

  • One single OpenStack
  • Shared bus (router mesh)

➢ RPC locality only

In the future:
  • API locality
  • Database locality

➢ Join the FEMDC!

SLIDE 38

Lessons learnt

SLIDE 39

Lessons learnt

  • Communication layer of OpenStack

    ○ Two implementations: AMQP 0-9-1 / RabbitMQ and AMQP 1.0 / qpid-dispatch-router

  • Centralized deployments

    ○ Similar scalability
    ○ Routers are lightweight and achieve low-latency message delivery, especially under high load

  • Decentralized Deployments

    ○ A mesh of routers offers guarantees on the locality of messages
    ○ Two levels of locality: strict / locality-aware load sharing

  • Toward a high-locality OpenStack for multisite deployment

    ○ Leveraging a router mesh
    ○ The same ideas need to be applied to APIs and the database

  • FEMDC working group is studying different options

  • Wed. 4:40pm - 5:20pm
SLIDE 40

OpenStack internal messaging at the edge: In-depth evaluation

kgiusti@redhat.com
javier.rojas-balderrama@inria.fr
matthieu.simonin@inria.fr

http://bit.do/oo-jupyter-tc1
http://bit.do/oo-jupyter-tc2
http://bit.do/oo-jupyter-tc3
http://bit.do/oo-tc1-ring30
