SLIDE 1

Edge Resource Management Systems: From Today to Tomorrow

November 2018 Berlin OpenStack Summit

SLIDE 2

OpenStack Summit Berlin 2

Who are we?

Adrien Lebre
Professor (HdR) at IMT Atlantique
FEMDC SiG Co-chair (2016-2018), Discovery PI
http://beyondtheclouds.github.io

Abdelhadi Chari
Cloud/NFV innovation Project Manager, Orange
abdelhadi.chari@orange.com

Sandro Mazziotta
Director Product Management, OpenStack NFV, Red Hat
smazziot@redhat.com

SLIDE 3

Multiple use cases are triggering Edge...

Source: https://wiki.akraino.org/display/AK/Akraino+Edge+Stack

SLIDE 4

Edge from the Infrastructure viewpoint?

A set of independent computing sites that should be seen as a single, global infrastructure.

Form factors:
  • National Core: multiple racks
  • Regional Core: less than 10 servers
  • Edge: 1 rack
  • Far Edge: 1 to 3 servers

SLIDE 5

New constraints on the resource mgmt system

  • Latency / bandwidth / intermittent network: consumer ⇔ service, service ⇔ service, control plane ⇔ resources
  • Resilience: make edge sites autonomous, minimize the failure domain to one site
  • Regulations: keep sensitive data on-site / within the regulatory region
  • Geo-distribution: need to deploy and perform the lifecycle management of distributed systems
  • Scale: from a few regional sites to a large number of remote resources

And many others...

SLIDE 6

Orange: Edge through the NFV perspective

SLIDE 7

NFV hosting infrastructure needs: lower and lower

[Figure: NFV functions placed along the network hierarchy, from country-level data centers down to the distribution level]

  • Country / International Data Centers: 1000+ km, a few per country — control plane focus. Mobile Core: GiLAN, vEPC, MVNO, vCDN control, vPCRF, vMME, MVAS, vWiFi access control Gateway, vIMS, vSBC
  • Regional PoPs: 300 - 500 km, a few tens — backbone network
  • Local PoPs / Central Offices: 100 - 250 km, a few hundreds — data plane focus. 4G vBBU, 2G/3G vRNC, vSSL/IPSec Gateway, vCDN, vCPE, vBOX, vEPC4Business, vBNG
  • Distribution level (DSLAM / OLT): 5 - 50 km, tens of thousands — RAN Virtual DU (MAC/RLC), vOLT, MEC

SLIDE 8

What do we expect from this distributed infrastructure?

  • Simply speaking: we should be able to "play" with this distributed cloud infrastructure as if it was located in a single data center
  • Not so easy!
    ○ Scalability
    ○ Lifecycle of control components
    ○ Networking: especially the interactions with the WAN
    ○ On-site operations for initial setup, hardware upgrades/troubleshooting
    ○ How to architect the control plane components for better resiliency/efficiency?
    ○ Can we really share the infrastructure among a large variety of NFV functions (mission-critical and best-effort)?

SLIDE 9

What do we expect from this distributed infrastructure?

  • And additional requirements will appear, driven by the nature of the NFV functions themselves:
    ○ Performance and real-time constraints (e.g. for Higher-PHY vRAN functions)
    ○ Mix of workloads to be supported (VMs + containers)
    ○ Location awareness for resource allocation/reconfiguration/interconnection

Orange, like other global telcos, is strongly interested in preparing different scenarios for how to use OpenStack to address these requirements, and in supporting OpenStack evolutions for this purpose.

SLIDE 10

Can we operate such a topology with OpenStack?

SLIDE 11

Edge: Envisioned Topology

The Akraino View

Source: https://wiki.akraino.org/display/AK/Akraino+Edge+Stack

E: Edge Site R: Regional Site C: Central Site

Another possible view

[Figure: a wired/wireless WAN connecting a large data center (central / regional site), medium/micro DCs (edge sites), a distributed DC, and embedded DCs (constrained and mobile edge sites); controllers and compute/storage nodes]

SLIDE 12

Can we operate such a topology with OpenStack?

  • Two questions to address:
    ○ How does OpenStack behave in each of these deployment scenarios (i.e., what are the challenges of each scenario)?
    ○ How can we make each OpenStack instance collaborate with the others?

[Figure: WAN topology — central site (data center), regional site (micro DC), distributed DC, embedded DCs at customer premises and on public transport; wired/wireless links, controllers, compute/storage nodes]

SLIDE 13

Edge: Envisioned Topology

[Figure: WAN topology — micro DC (edge site), distributed DC, embedded DC (constrained and mobile edge sites), data center (central / regional site); wired/wireless links, controllers, compute/storage nodes. Axes of concern: scalability, footprint, synchronization, network specifics]

SLIDE 14

Edge: Envisioned Topology

[Figure: same WAN topology, highlighting the distributed DC scenario. Axes of concern: scalability, footprint, synchronization, network specifics]

A few challenges (scalability, latency, intermittent network connectivity, deployment, etc.): Inria, Orange, and Red Hat have been investigating the distributed DC scenario for the last two years.

SLIDE 15

Let’s focus on Distributed DC

SLIDE 16

Distributed DC => Distributed Compute Nodes

[Figure: a central location hosting the OpenStack controllers and TripleO (lifecycle mgmt), connected over the WAN to remote sites 1 to N, each hosting compute nodes]

Features:
➢ One shared OpenStack cluster
➢ One central control plane and N remote sites with compute nodes
➢ Each remote site is an availability zone (AZ)

A few performance studies:
➢ Evaluating OpenStack WAN-wide (see the FEMDC OpenStack wiki page)
➢ Supported by Red Hat since Newton

SLIDE 17

Lessons learnt

  • The testing effort was focused on clarifying limitations and expectations in Newton
    ○ Latency
      ■ At 50 ms round-trip latency, the testing started producing errors and timeouts.
      ■ Beyond this is not generally supported, but the testing was focused on the infrastructure.
      ■ Once our testing reached 300 ms round-trip latency, errors increased to the point where service communication failed.
    ○ Size of images
      ■ Our initial tests validated the size of the images and the impact of the first deployment.
      ■ Beyond 2 GB images, deployments failed when deploying 1000 VMs over 10 compute nodes.
    ○ Bandwidth
      ■ Highly dependent on the environment (size of images, unique images, application needs).
      ■ Since images are cached, the most bandwidth is needed when sending a unique image the first time.
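To make the caching point above concrete, here is a rough back-of-the-envelope sketch (ours, not part of the testing campaign; the function name is hypothetical): with image caching, a unique image crosses the WAN once per remote site, regardless of how many VMs boot from it.

```python
def first_boot_transfer_gb(image_size_gb: float, n_sites: int,
                           n_vms_per_site: int, cached: bool = True) -> float:
    """Rough WAN transfer volume to boot VMs from one image at remote sites.

    With caching, the image crosses the WAN once per site; without it,
    every VM boot pulls the image across the WAN.
    """
    pulls_per_site = 1 if cached else n_vms_per_site
    return image_size_gb * n_sites * pulls_per_site

# A 2 GB image, 10 sites, 100 VMs per site:
print(first_boot_transfer_gb(2.0, 10, 100, cached=True))   # 20.0 (GB)
print(first_boot_transfer_gb(2.0, 10, 100, cached=False))  # 2000.0 (GB)
```

The two orders of magnitude between the cached and uncached cases is why image caching at remote sites dominates the bandwidth discussion.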

SLIDE 18

Upstream agenda

  • In Queens/Rocky

○ Compute Node with Ephemeral Storage ○ Director Deployment ○ Split Stack => Distributed Deployment ○ (L3 support)/Multi-subnet configurations

  • In Stein

○ Compute Node with local (persistent) Storage ○ Distributed Ceph Support: A single Ceph cluster across the central and remote sites ○ Glance Images on multi-store ○ Enable HCI Node (Converged Compute, Storage) ○ Ceph Cluster on remote sites (min of 3 servers)

  • In Train

○ Advanced monitoring capabilities to collect data and distribute them to the central location (see the Thursday 3 PM presentation, "Using Prometheus Operator ...")
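The "Glance images on multi-store" item above can be illustrated with a glance-api.conf fragment (a hedged sketch: store names and paths are hypothetical, and the exact options may differ per release):

```ini
[DEFAULT]
# Two filesystem-backed stores: one central, one at an edge site
enabled_backends = central:file, edge1:file

[glance_store]
default_backend = central

[central]
filesystem_store_datadir = /var/lib/glance/central/

[edge1]
filesystem_store_datadir = /var/lib/glance/edge1/
```

With such a layout, an image can be published to the store closest to the compute nodes that will boot from it.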

SLIDE 19

Are there other DCN challenges?

  • Ongoing activities
    ○ New performance evaluations based on Queens (Red Hat)
    ○ Impact of remote failures/network disconnections (under investigation at Inria/Orange)
      http://beyondtheclouds.github.io/blog/
    ○ Qpid Dispatch Router as an alternative to RabbitMQ: a few studies have been performed since 2017
      ■ PTG in Dublin / Boston presentation
        https://www.openstack.org/videos/vancouver-2018/openstack-internal-messaging-at-the-edge-in-depth-evaluation
      ■ Berlin presentation
        https://www.openstack.org/summit/berlin-2018/rabbitmq-or-qpid-dispatch-router-pushing-openstack-to-the-edge

  • Challenges
    ○ SDN solutions with DCN
      ■ Not all SDN solutions will work with DCN
      ■ In particular, the interface with the WAN needs to be taken into account
    ○ Lifecycle of remote resources
    ○ Impact throughout the whole infrastructure (control and data planes)

See videos online.

SLIDE 20

Let's consider several OpenStack control planes

SLIDE 21

We need several control planes

[Figure: WAN topology — central site (data center), regional site (micro DC), distributed DC, embedded DCs at customer premises and on public transport; wired/wireless links, controllers, compute/storage nodes]

SLIDE 22

Several control planes: academic investigations

  • Collaborations between/for all services
  • A major constraint/objective: do not modify the code
  • Two alternative approaches:

  1. One ring to rule them all
    ■ A global AMQP bus and a global shared DB
    ■ A few presentations have been given (see the FEMDC OpenStack wiki page)
      https://wiki.openstack.org/wiki/Fog_Edge_Massively_Distributed_Clouds#Achieved_Actions

[Figure: OpenStack instances (3, 9, 45) sharing a global Galera database]

  Assessment:
    ○ Almost straightforward (integration at the oslo level)
    ○ Collaboration is not only sharing states (a few services have to be extended)
    ○ Scalability / Partitioning / Versioning

SLIDE 23

Several control planes: academic investigations

  • A major constraint/objective: do not modify the code
  • Two alternative approaches:

  2. Reify location information at the API level

  openstack server create my-vm --flavor m1.tiny --image cirros.uec --scope {"image":"edge2"}

  scope: {"identity":"edge1","compute":"edge1","volume":"edge1","network":"edge1","image":"edge2"}

SLIDE 24

Several control planes: academic investigations

  • A major constraint/objective: do not modify the code
  • Two alternative approaches:

  2. Reify location information at the API level (cross-service collaborations)

  openstack server create my-vm --flavor m1.tiny --image cirros.uec --scope {"image":"edge2"}
  openstack image list --scope {"image":"edge1 AND edge2 AND edge3"}
  openstack image create ... --scope {"image":"edge1 AND edge2"}
  openstack server create my-vm ... --scope {"compute":"edge1 XOR edge2"}

  Assessment:
    ○ Almost straightforward (with HAProxy)
    ○ Collaboration is not only sharing states (a few services have to be extended)
    ○ Scalability / Partitioning / Versioning
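As an illustration only (our sketch, not the authors' implementation), the AND/XOR scope expressions above could be interpreted as "operate on all listed sites" and "pick exactly one site", respectively:

```python
# Hypothetical interpreter for the scope expressions shown above:
# "edge1 AND edge2" -> the operation targets both sites;
# "edge1 XOR edge2" -> exactly one site is chosen (here: the first reachable one).

def resolve_scope(expr: str, reachable: set) -> list:
    if " XOR " in expr:
        candidates = [s.strip() for s in expr.split(" XOR ")]
        for site in candidates:          # exactly one site wins
            if site in reachable:
                return [site]
        raise RuntimeError(f"no reachable site in {expr!r}")
    # AND (or a single site): the operation targets every listed site
    sites = [s.strip() for s in expr.split(" AND ")]
    missing = [s for s in sites if s not in reachable]
    if missing:
        raise RuntimeError(f"unreachable sites: {missing}")
    return sites

reachable = {"edge1", "edge2", "edge3"}
print(resolve_scope("edge1 AND edge2", reachable))  # ['edge1', 'edge2']
print(resolve_scope("edge1 XOR edge2", reachable))  # ['edge1']
```

A real implementation would sit in front of the per-site OpenStack APIs (e.g. behind the HAProxy mentioned above) and fan the request out to the resolved sites.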

SLIDE 25

Takeaway

SLIDE 26

Takeaway

  • OpenStack WAN-wide, a single control plane
    ○ Distributed Compute Nodes for NFV use cases
      ■ First concrete deployments in an edge context
    ○ A few challenges to address, but several ongoing efforts

  • Multiple control planes: a few technical and research challenges
    ○ Partitioning/resynchronisation issues (intermittent networks, embedded DCs...)
    ○ Cross operations (Neutron/Cinder between sites...)
    ○ "Single pane of glass" (4000 edge sites... how many nodes?)
      ■ Does it make sense?
      ■ How can it be implemented?
    ○ etc.

SLIDE 27

Takeaway (cont.)

  • And Kubernetes?
    ○ Can we leverage lessons learnt from the previous studies?
      ■ Does it make sense to perform WAN-wide Kubernetes evaluations?
      ■ Cluster federation?
      ■ Kubernetes has been designed to hide the resource distribution, while edge aims at controlling location aspects. Can these two objectives be addressed simultaneously?
  • Could we deal with both ecosystems (some sites with Kubernetes, others with OpenStack)?
  • Is the API approach enough to fulfill the expected capabilities?
  • Should we consider edge agreements between resource providers, similar to network peering agreements?

SLIDE 28

Interested in all these aspects? See you in Denver to discuss new achievements!

SLIDE 29

Edge is about all smart-* apps and IoT devices

Smart cities, public transportation, industrial internet (Industry 4.0), internet of skills.

Source: https://opentechdiary.wordpress.com/2015/07/22/part-5-a-walk-through-internet-of-things-iot-basics/
Source: https://www.ericsson.com/thinkingahead/the-networked-society-blog/2017/02/14/virtual-reality-comes-age-internet-skills/