Designing NFVi Architecture at the Edge: Requirements, Challenges and Solutions

Designing NFVi Architecture at the Edge: Requirements, Challenges and Solutions. Brent Roskos, Senior Principal Telco Architect, Red Hat; Jaromir Coufal, Edge Computing Product Management, Red Hat. April 29, 2019, Open Infrastructure Summit - Denver.


SLIDE 1

Designing NFVi Architecture at the Edge

Requirements, Challenges and Solutions

Brent Roskos, Senior Principal Telco Architect, Red Hat
Jaromir Coufal, Edge Computing Product Management, Red Hat

April 29, 2019, Open Infrastructure Summit - Denver

SLIDE 2

OPEN INFRASTRUCTURE SUMMIT - DENVER 2019

Network function virtualization infrastructure (NFVi) architecture can be complicated

During this session we'll look into:

  • Requirements
  • Challenges
  • Solutions
  • Best practice
  • What’s next
  • Questions
SLIDE 3

Edge Computing Motivation

Latency

Place processing power closer to the data source

Bandwidth

Reduce the amount of traffic that needs to travel back to the data center core

Resilience

Continuous operation of edge sites in the event of a link drop

Regulations

Meet standards and compliance requirements

SLIDE 4

Edge Computing Challenges

Scale

Architecture requires horizontal scale

Environmental

Potentially inconsistent connectivity, dust, heat, and space constraints

Expertise

Limited to no IT expertise in remote sites

All while controlling costs to ensure budget goals are met.

SLIDE 5

Edge Tiers

SLIDE 6

Deployment Configuration Considerations

  • Distributed nodes
  • Standalone cluster(s)

SLIDE 7

Standalone Clusters

  • Multi-cluster deployment
  • Each site has its own standalone deployment
  • Complete cluster at each site (control + resource)

Benefits

  • Full isolation (in case of disaster)
  • High redundancy and availability
  • Very low impact in case of network drop out

Complications

  • Bigger hardware footprint (need for control plane)
  • More complex management (versioning)
SLIDE 8

Distributed Compute Nodes

  • Single cluster deployment
  • Primary site has shared control plane (and resource nodes)
  • Remote sites have only resource nodes

Benefits

  • Smaller footprint at the remote sites
  • Faster to scale to new location (resource scale out)
  • Easier operational management (single cluster, single config)

Complications

  • Control plane is still a single point of failure
  • Network drop affects management of workloads
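The footprint trade-off between the two deployment models can be illustrated with a rough node count. This is a minimal sketch, not a sizing method from the talk: the figure of three controller nodes per control plane and the number of resource nodes per site are assumptions chosen only for illustration, and the primary site is counted as one of the sites.

```python
from dataclasses import dataclass

# Assumption (not from the slides): one control plane is three controller nodes.
CONTROLLERS_PER_CONTROL_PLANE = 3


@dataclass(frozen=True)
class Footprint:
    control_nodes: int
    resource_nodes: int

    @property
    def total(self) -> int:
        return self.control_nodes + self.resource_nodes


def standalone_clusters(sites: int, resource_nodes_per_site: int) -> Footprint:
    """Every site runs its own control plane plus its resource nodes."""
    return Footprint(
        control_nodes=sites * CONTROLLERS_PER_CONTROL_PLANE,
        resource_nodes=sites * resource_nodes_per_site,
    )


def distributed_compute_nodes(sites: int, resource_nodes_per_site: int) -> Footprint:
    """One shared control plane at the primary site; every site runs resource nodes."""
    return Footprint(
        control_nodes=CONTROLLERS_PER_CONTROL_PLANE,
        resource_nodes=sites * resource_nodes_per_site,
    )


if __name__ == "__main__":
    for sites in (1, 5, 20):
        s = standalone_clusters(sites, resource_nodes_per_site=4)
        d = distributed_compute_nodes(sites, resource_nodes_per_site=4)
        print(f"{sites:>2} sites: standalone={s.total} nodes, DCN={d.total} nodes")
```

Under these assumptions the control-plane overhead grows with the number of sites for standalone clusters and stays constant for distributed compute nodes, which is the "smaller footprint" benefit listed above. The count says nothing about the resilience cost of the shared control plane.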
SLIDE 9

OpenStack Distributed Compute Nodes

Infra Solution / Reference Architecture

Red Hat has recently published a guide for distributing compute nodes to edge sites: "Deploying Distributed Compute Nodes to Edge Sites".

Red Hat OpenStack Platform 13 allows you to implement edge computing using distributed compute nodes (DCN). With this approach, you share the control plane at the primary site and deploy compute nodes to the remote locations. The OpenStack services are able to tolerate the network issues that can arise, such as connectivity and latency, among others.

SLIDE 10

Red Hat OpenStack Platform 13 (Queens) with distributed compute nodes

  • Support: Fully Supported by Red Hat
  • Use Cases: Cross-Industry
  • Based on: OpenStack Queens
  • Deployment Tool: director (TripleO)
  • Edge Site Resources: Only Compute
  • Storage: Local Ephemeral for Computes
  • Networking: L3 Routed (recommended)
  • Max Network Latency: 100 ms (roundtrip time)
  • Network Drop Outs: Best Effort (workload operational during core-edge network loss)
  • Image Sizing: No limit (network bandwidth impacts transfer time)
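Two of the constraints above lend themselves to a quick pre-deployment check: the 100 ms round-trip limit and the effect of bandwidth on image transfer time. The sketch below is illustrative only. The function names are hypothetical, judging a link by its worst measured sample is a design choice made here rather than something the slides prescribe, and the transfer estimate is idealised (no protocol overhead, no contention).

```python
from typing import Sequence

MAX_RTT_MS = 100.0  # round-trip limit quoted on the slide


def rtt_within_limit(rtt_samples_ms: Sequence[float],
                     limit_ms: float = MAX_RTT_MS) -> bool:
    """Return True if the worst observed round-trip time stays within the limit."""
    if not rtt_samples_ms:
        raise ValueError("at least one RTT sample is required")
    return max(rtt_samples_ms) <= limit_ms


def image_transfer_seconds(image_size_gib: float, link_mbps: float) -> float:
    """Idealised time to move an image of the given size over the given link.

    Size is in GiB (2**30 bytes); link speed is in megabits per second (10**6 bit/s).
    """
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    bits = image_size_gib * 1024 ** 3 * 8
    return bits / (link_mbps * 1_000_000)


if __name__ == "__main__":
    samples = [42.0, 57.5, 96.0]  # made-up measurements in milliseconds
    print("RTT ok:", rtt_within_limit(samples))
    print("Transfer time (s):", round(image_transfer_seconds(2, 100), 1))
```

Because there is no hard image size limit, the transfer estimate is the more useful planning number: a large image over a thin edge link mainly costs time for the first instance launch at that site.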
SLIDE 11

SLIDE 12

Deployment & Management

  • Deployed with Director/TripleO
  • Undercloud at the primary site
  • Container Registry at the primary site
  • Single Deployment Stack
  • An operation at a single Edge Site runs through the whole stack

[Diagram labels: L3 Routed; Primary Site; DCN Site 1; DCN Site 2; Undercloud; Deployment Stack; Red Hat Satellite (RPM Repos & Container Registry)]

SLIDE 13

Computing

  • Sites defined by Availability Zones
  • VM deployed to specific location by scheduling to desired AZ
  • Primary Site can optionally also contain Compute Nodes
  • All DCN Site Compute Nodes need to be using the same storage (local ephemeral)
  • Primary Site can have specific Compute Role to use Cinder Volumes (backed by Ceph, etc.)
  • Each site is a specific Compute Role

[Diagram: Primary Site with Ceph Cluster 0 (optional) and AZ0 Compute Nodes (local ephemeral, optional); DCN Site 1 with AZ1 Compute Nodes (local ephemeral)]
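Placing a VM at a particular site by scheduling it to that site's availability zone can be sketched as a small helper that builds the `openstack server create` command line. This is a hedged illustration: the site names and the mapping are hypothetical (only the AZ0/AZ1 labels come from the diagram), and the image, flavor and network values are placeholders.

```python
from typing import Dict, List

# Hypothetical site-to-AZ mapping; the AZ0/AZ1 names follow the slide's diagram,
# the site keys are made up for this example.
SITE_TO_AZ: Dict[str, str] = {
    "primary": "AZ0",
    "dcn-site-1": "AZ1",
}


def server_create_command(name: str, site: str, image: str,
                          flavor: str, network_id: str) -> List[str]:
    """Build an `openstack server create` argument list pinned to the site's AZ."""
    try:
        az = SITE_TO_AZ[site]
    except KeyError:
        raise ValueError(f"unknown site: {site!r}") from None
    return [
        "openstack", "server", "create",
        "--availability-zone", az,
        "--image", image,
        "--flavor", flavor,
        "--nic", f"net-id={network_id}",
        name,
    ]


if __name__ == "__main__":
    # Placeholder image, flavor and network values; replace with real ones.
    print(" ".join(server_create_command(
        "vnf-1", "dcn-site-1", "rhel-guest", "m1.small", "NETWORK_UUID")))
```

Keeping the mapping in one place means the workload owner only names a site, and the availability zone stays an infrastructure detail.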

SLIDE 14

Storage

DCN is only available for Compute services using local ephemeral storage. Edge cloud applications will need to be designed to consider data availability, locality awareness, and/or replication mechanisms. In addition, live migration will not be available.

* Note that Compute nodes at the primary site must use the same nova ephemeral backend that the remote sites use (to be improved soon).

[Diagram: Primary Site with Ceph Cluster 0 (optional) and AZ0 Compute Nodes (local ephemeral, optional); DCN Site 1 with AZ1 Compute Nodes (local ephemeral)]

SLIDE 15

Networking

  • L3 Routed topology is the recommended network setup
  • Recommendation is to use Provider Networks over site-specific networks (complex tunnel mesh across sites)
  • Optimize for network performance

○ SR-IOV
○ OVS-DPDK
○ OVS-Offload (tech preview) With Review (support exception)

  • Routed Provider Networks with IP & Metadata via config-drive, or with network-provided DHCP relays to DHCP Agents on Controllers

[Diagram labels: L3 Routed; Primary Site; DCN Site 1; DCN Site 2]

SLIDE 16

SLIDE 17

Ongoing Work Upstream

  • Deploy each site as an independent stack

○ Better resiliency and scale

  • Add ability to configure different storage backends per nova instance

○ Deploy Ceph cluster for primary site while DCN sites use local ephemeral

  • Deploy multiple Ceph clusters

○ Ability to dedicate a Ceph cluster per DCN site

  • Ability to combine compute & storage at the same node in the DCN site

○ Hyperconverged solution

  • Increase scale of deployment

○ Deploying for large sets of edge sites

  • Distributed DHCP and Metadata agents

○ Place agents at edge sites for enhanced availability and flexibility

SLIDE 18

Where to Learn More?

  • Join the upstream community: OpenStack Edge Computing Group
  • Upstream Gathered Use Cases
  • Upstream Reference Architectures / Models
  • Freenode - #edge-computing-group
  • Mailing list
  • Meetings: Regular calls in alternating slots:

○ Every first Thursday of the month: 0700 UTC
○ On other weeks: Tuesdays at 7am PDT / 1500 UTC

SLIDE 19

Questions

SLIDE 20

THANK YOU