The Role of Machine Learning in Network Automation

slide-1
SLIDE 1

The Role of Machine Learning in Network Automation

Alberto Leon-Garcia
University of Toronto
alberto.leongarcia@utoronto.ca

Acknowledgment to: Dr. Saeideh Parsaei Fard and Iman Tabrizian

slide-2
SLIDE 2

Outline

  • Context:
  • Network Automation
  • Addressing global challenges
  • Networks & Application Platforms
  • MAPE-K Loops and ML Pipelines
  • AI as a Service
  • Challenges & Recent Work


slide-3
SLIDE 3

AI and ML!


AI engines:

  • Rule engines
  • Expert systems
  • Evolutionary algorithms

Machine learning algorithms

  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
  • Online learning

Neural networks / Deep Learning: MLP, CNN, RNN

Source: www.atis.org, 2018 report, Evolution to an Artificial Intelligence Enabled Network

slide-4
SLIDE 4

Use Cases of AI in Networking

https://www.etsi.org/images/files/ETSIWhitePapers/etsi_wp22_ENI_FINAL.pdf


slide-5
SLIDE 5

Ridesharing

With full penetration of ridesharing, the number of cars on the road could be reduced to 1/3.


slide-6
SLIDE 6

EVs, Ridesharing, Renewable Energy

  • Real-time control of EVs
  • Ridesharing
  • Recharging from Renewables
  • Global Policy
  • Carbon Footprint
  • Energy Efficiency
  • Environmental Impact
  • Productivity
  • Immense potential to address urbanization challenges

slide-7
SLIDE 7

Air Quality & Climate Change

  • In 25 years, there could be no air pollution in Santiago, Chile

slide-8
SLIDE 8

Smart is all about Data!

  • Real-time Situational Awareness
  • Continuous Monitoring & Data Collection
  • Learning and Intelligence
  • Analytics & Machine Learning
  • Visualization
  • Anomalies, Trends, Forecasting, KPI Analysis
  • Smart Applications Enablement
  • APIs provide real-time intelligence to Decision-making
slide-9
SLIDE 9

Making the Network Agile & Smart

Major functionalities for software-defined control

  • Identify the state of the network
  • Disseminate data (efficiently) to where it is consumed
  • Analyze data; understand network & service behaviour
  • Decide what changes are required
  • Apply changes to the network
  • MAPE-K Loop!
slide-10
SLIDE 10

MAPE-K Loop for Autonomous Network Management

  • Source Domain
  • Step 1: Monitoring (data gathering and storage, data preparation)
  • Step 2: Analysis (analyzing, optimizing and learning)
  • Step 3: Planning (preparing a plan and related parameters for action)
  • Step 4: Execution (execution and implementation)
  • Destination Domain
  • Knowledge: shared by all four steps
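
The four steps and the shared knowledge base can be sketched as one loop iteration in Python. This is only an illustrative toy, not the deck's implementation: the `ToyLink` element, the `Knowledge` fields, the 0.8 utilization threshold, and the "resize the link" action are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Knowledge:
    """Shared knowledge base: a policy threshold plus a history of observations."""
    max_utilization: float = 0.8          # hypothetical policy threshold
    history: list = field(default_factory=list)


class ToyLink:
    """Hypothetical managed element: a link with a capacity and an offered load."""
    def __init__(self, capacity_mbps: float, load_mbps: float):
        self.capacity_mbps = capacity_mbps
        self.load_mbps = load_mbps


def monitor(link, knowledge):
    """Step 1: gather a measurement and store it in the knowledge base."""
    utilization = link.load_mbps / link.capacity_mbps
    knowledge.history.append(utilization)
    return utilization


def analyze(knowledge, window=3):
    """Step 2: average the most recent samples and compare with the threshold."""
    recent = knowledge.history[-window:]
    return mean(recent) > knowledge.max_utilization


def plan(link, knowledge, overloaded):
    """Step 3: choose an action and its parameter (here: a new capacity)."""
    if not overloaded:
        return None
    # Size the link so that the current load sits at the threshold.
    return link.load_mbps / knowledge.max_utilization


def execute(link, new_capacity):
    """Step 4: apply the planned change to the managed element."""
    if new_capacity is not None:
        link.capacity_mbps = new_capacity


def mape_k_iteration(link, knowledge):
    """Run Monitor, Analyze, Plan, Execute once over the shared knowledge."""
    monitor(link, knowledge)
    overloaded = analyze(knowledge)
    new_capacity = plan(link, knowledge, overloaded)
    execute(link, new_capacity)
    return new_capacity


link = ToyLink(capacity_mbps=100.0, load_mbps=95.0)
kb = Knowledge()
action = mape_k_iteration(link, kb)
print(action, link.capacity_mbps)
```

In a real deployment each step would be a separate service, and the analysis step is where an ML model would typically replace the fixed threshold.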


slide-11
SLIDE 11

ETSI Experiential Networked Intelligence

Evolution to Network Intelligence

  • Dynamic network conditions
  • More services & users
  • Better network telemetry
  • Cost-effective AI and ML

Evolution to Mgmt & Ops Intelligence

  • Human decisions
  • Complex policy decisions
  • Complex manual operations

Sensing, Actuation, Intelligent Components (figure labels)

https://www.etsi.org/images/files/ETSIWhitePapers/etsi_wp22_ENI_FINAL.pdf

slide-12
SLIDE 12

Outline

  • Context:
  • Network Automation
  • Addressing global challenges
  • Networks & Application Platforms
  • MAPE-K Loops and ML Pipelines
  • AI as a Service
  • Challenges & Recent Work


slide-13
SLIDE 13

Network & Application Platform

[Figure: Multi-Tier Software Defined Infrastructure platform; labels as extracted]

  • Physical Resources; Cloud Controllers; (SD) Network Controllers; Access/Things Controllers
  • SDI Resource Management: SDI Manager, Topology Manager, Monitoring & Analytics
  • PaaS: End-to-End, Multi-Domain Orchestration; Information-Centric Data Dissemination
  • BIaaS: Publish/Subscribe Overlay; Algorithmic Engines; Analytics Engines
  • SaaS: APIs; Portal; Custom KPIs; Urban Planning; Congestion Pricing; 3rd-Party Apps

slide-14
SLIDE 14

Questions

  • How to deploy MAPE-K loops?
  • Where to deploy ML?
  • To share or not to share?
  • AI as a Service?
slide-15
SLIDE 15

Outline

  • Context:
  • Network Automation
  • Addressing global challenges
  • Networks & Application Platforms
  • MAPE-K Loops and ML Pipelines
  • AI as a Service
  • Challenges & Recent Work


slide-16
SLIDE 16

3GPP Network Data Analytics Function

Service-based architecture for 5G control plane:

  • Components query NRF to discover & communicate with each other
  • Cloud-native

NWDAF

  • Network Analytics logical function
  • Collection of data
  • Provide slice-specific network analytics to other Network Functions
  • Enabler of network automation

slide-17
SLIDE 17

ENI Assisting MANO

Intelligent Components


slide-18
SLIDE 18

ENI Assisting SDN

Intelligent Components


slide-19
SLIDE 19

ITU ML-Aware 5G Architecture

ML Overlay

  • Logical entities (& functionalities) combined to form analytics functions
  • Independent of underlying networks
  • Common vocabulary & nomenclature for ML functions & interfaces
  • Enables interoperability of ML apps with heterogeneous networks
  • Rapid provisioning of ML apps
  • Cost-effective AI and ML

Technology-specific realization

  • Apply logical ML overlay to specific technology
  • 3GPP, MEC, EdgeX, …

ML pipeline nodes shown: collector, pre-processor, policy, distributor

https://www.itu.int/en/ITU-T/focusgroups/ml5g/Documents/ML5G-delievrables.pdf
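
As a rough illustration of the logical ML overlay, the nodes named on this slide (collector, pre-processor, policy, distributor) can be modelled as plain functions chained in order. Everything else in the sketch is an assumption for illustration only: the record fields (`cell`, `latency_ms`), the `model` stage (a fixed cut-off standing in for a trained model), and the `reroute` action are hypothetical and not taken from the ITU deliverable.

```python
from typing import Callable, Iterable, Sequence

Stage = Callable[[list], list]


def collector(raw_records: Iterable[dict]) -> list:
    """Collect measurements from a source; keep records that carry a value."""
    return [r for r in raw_records if "latency_ms" in r]


def pre_processor(records: list) -> list:
    """Clean the data: drop negative readings and convert to seconds."""
    return [{**r, "latency_s": r["latency_ms"] / 1000.0}
            for r in records if r["latency_ms"] >= 0]


def model(records: list) -> list:
    """Stand-in for an ML model (assumption): flag latency above a fixed cut-off."""
    return [{**r, "congested": r["latency_s"] > 0.1} for r in records]


def policy(records: list) -> list:
    """Turn model output into actions."""
    return [{"cell": r["cell"], "action": "reroute"}
            for r in records if r["congested"]]


def distributor(actions: list) -> list:
    """Deliver actions to their targets; here, just return them sorted by cell."""
    return sorted(actions, key=lambda a: a["cell"])


def run_pipeline(data, stages: Sequence[Stage]) -> list:
    """Chain the stages: the output of one node is the input of the next."""
    for stage in stages:
        data = stage(data)
    return data


samples = [
    {"cell": "B", "latency_ms": 250},
    {"cell": "A", "latency_ms": 40},
    {"cell": "C", "latency_ms": -1},      # invalid reading, dropped
    {"cell": "D"},                        # no measurement, dropped
    {"cell": "E", "latency_ms": 120},
]
actions = run_pipeline(samples, [collector, pre_processor, model, policy, distributor])
print(actions)
```

Because every stage has the same list-in, list-out shape, the same chain could in principle be placed over different underlying networks by swapping only the collector and distributor.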

slide-20
SLIDE 20

Unified Architecture for ML in 5G

Management Subsystem

  • Orchestration
  • Management (VNFM, EMS)
  • Platform (VIM)
  • ML Function Orchestrator

Multi-level ML pipeline

  • Overlaid on existing NFs
  • Instantiated by MLFO
  • Multilevel chaining

Closed-loop subsystem

  • Allows ML pipeline to adapt to change
  • Sandbox with simulator & real data

https://www.itu.int/en/ITU-T/focusgroups/ml5g/Documents/ML5G-delievrables.pdf

slide-21
SLIDE 21

Multi-level ML in MEC & 3GPP

  • ML pipeline 3 → 6: Local predictions at NMS affect configurations in different domains (e.g., optimizations).
  • 9 → 2 → 4 → ML pipeline 1: Inputs from RAN & UE/RAN to make predictions at CN (e.g., MPP).
  • 10 → 7 → ML pipeline 2 → 8: Inputs from MEC platform to make predictions at the edge and apply them to MEC. Could also use side information from the UE and RAN (e.g., caching decisions made at the MEC).
  • 3 → 4 → ML pipeline 1 → 5: Inputs from CN and possibly UE/RAN inputs to make predictions at CN, and apply to NMS parameters, that in turn affect configurations in different domains (e.g., SON decisions made at the CN).

https://www.itu.int/en/ITU-T/focusgroups/ml5g/Documents/ML5G-delievrables.pdf

slide-22
SLIDE 22

5G Slice Broker in NEC

Challenge:

  • How to map heterogeneous service requirements onto the network resource availability?

Solution: 5G Network Slice Broker

  • A mediator should be interposed between external tenants and mobile network management

slide-23
SLIDE 23

5G Network Slice Broker

  • Resource monitoring: e.g., resource blocks, MCSs
  • Machine Learning operations for traffic forecasting (online reinforcement learning)
  • Admission Control for network slice requests (based on forecasting info)
  • Support for multiple classes of Network Slice SLAs
  • Heterogeneous QoS traffic requirements (data rate and latency)
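
The interplay of monitoring, forecasting, and admission control listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the broker's actual algorithm: exponential smoothing is swapped in for the online reinforcement learning mentioned on the slide, and the class name, capacity, latency figure, and request fields are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SliceRequest:
    tenant: str
    data_rate_mbps: float     # requested data rate
    max_latency_ms: float     # latency the tenant can tolerate


class ToySliceBroker:
    """Hypothetical broker: forecasts load, then admits or rejects slice requests."""

    def __init__(self, capacity_mbps: float, network_latency_ms: float,
                 alpha: float = 0.5):
        self.capacity_mbps = capacity_mbps
        self.network_latency_ms = network_latency_ms
        self.alpha = alpha                # smoothing factor of the forecast
        self.forecast_mbps = 0.0          # forecast of background traffic
        self.reserved_mbps = 0.0          # rate promised to admitted slices

    def observe(self, measured_mbps: float) -> None:
        """Resource monitoring: fold a new measurement into the forecast."""
        self.forecast_mbps = (self.alpha * measured_mbps
                              + (1 - self.alpha) * self.forecast_mbps)

    def admit(self, request: SliceRequest) -> bool:
        """Admission control based on the forecast and the request's QoS needs."""
        if request.max_latency_ms < self.network_latency_ms:
            return False                  # latency requirement cannot be met
        headroom = self.capacity_mbps - self.forecast_mbps - self.reserved_mbps
        if request.data_rate_mbps > headroom:
            return False                  # not enough forecast capacity left
        self.reserved_mbps += request.data_rate_mbps
        return True


broker = ToySliceBroker(capacity_mbps=1000.0, network_latency_ms=10.0)
for sample in (400.0, 600.0, 500.0):
    broker.observe(sample)

decisions = [
    broker.admit(SliceRequest("video", 300.0, 50.0)),
    broker.admit(SliceRequest("urllc", 50.0, 5.0)),
    broker.admit(SliceRequest("iot", 400.0, 100.0)),
]
print(broker.forecast_mbps, decisions)
```

The forecast starts at zero, so the first few observations are underweighted; a production forecaster would be warmed up on historical data before it gates admissions.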
slide-24
SLIDE 24

Machine Learning is NOT Only About Model Code

Hidden Technical Debt in Machine Learning Systems https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf

slide-25
SLIDE 25

Applying Big Data & ML Model to SON/SDN Controller


  • L. Le, D. Sinh, B. P. Lin and L. Tung, "Applying Big Data, Machine Learning, and SDN/NFV to 5G Traffic Clustering, Forecasting, and Management," 2018 4th IEEE Conference on Network Softwarization and Workshops (NetSoft), Montreal, QC, 2018, pp. 168-176.

slide-26
SLIDE 26


Use cases:

  • Traffic forecasting
  • Congestion avoidance
  • Abnormality detection
  • Energy Saving
  • L. Le, D. Sinh, B. P. Lin and L. Tung, "Applying Big Data, Machine Learning, and SDN/NFV to 5G Traffic Clustering, Forecasting, and Management," 2018 4th IEEE Conference on Network Softwarization and Workshops (NetSoft), Montreal, QC, 2018, pp. 168-176. doi: 10.1109/NETSOFT.2018.8460129

slide-27
SLIDE 27

Acumos AI

  • Started by AT&T and Tech Mahindra
  • Currently a Linux Foundation project
  • With the goal of making it easier to build, share, and deploy AI apps

  • Acumos Marketplace
slide-28
SLIDE 28
slide-29
SLIDE 29

Outline

  • Context:
  • Network Automation
  • Addressing global challenges
  • Networks & Application Platforms
  • MAPE-K Loops and ML Pipelines
  • AI as a Service
  • Challenges & Recent Work


slide-30
SLIDE 30

SDI-based Architecture for AI-aaS

[Figure: SDI-based architecture for AI-aaS; recoverable labels listed below]

  • Planes: AI-aaS Management Plane, AI-aaS Application Plane, AI-aaS Training Plane
  • Networking Application Layer (NAL): e.g., load balancing, QoS assurance, anomaly detection, SLA management
  • Business Application Layer or Over-the-Top (OTT) Layer: drones, e-factory, robotics, haptics, image processing, big data analysis
  • Control and management: AI-aaS Orchestrator(s), X-MKL Manager(s), SDN Controllers, Slice Controllers, Computation Domain Controllers, SDI Manager(s)
  • Subdomains: Transport, Core, Internal, Third Parties
  • Slices, each a chain of network functions: Application n, Health, AR/VR, URLLC
  • Sandboxes: Intra AI-NAL Sandbox, Intra Slice Sandbox, X-MKL Sandbox
  • MKL chains of NAL and MKL chains of OTT
  • Open Interface-1 and Open Interface-2

slide-31
SLIDE 31

Kubeflow

  • An open source project by Google
  • Makes it easier to run ML workflows on Kubernetes
  • Open-sourcing the way Google ran TensorFlow
  • Democratizing ML
  • Response to increasing number of ML users at Google
  • Active contribution by SAVI members to this project

slide-32
SLIDE 32

Kubeflow Features

  • Hyperparameter Tuning
  • Multi-tenant Jupyter notebooks
  • Distributed Machine Learning
  • Monitoring
  • Machine learning pipelines based on the Argo project
slide-33
SLIDE 33

Kubeflow Pipelines Architecture

slide-34
SLIDE 34

Outline

  • Context:
  • Network Automation
  • Addressing global challenges
  • Networks & Application Platforms
  • MAPE-K Loops and ML Pipelines
  • AI as a Service
  • Challenges & Recent Work


slide-35
SLIDE 35

Online Monitoring Data Compression

  • SAVI Edge Victoria: VM 1, VM 2, Metric Exporter
  • SAVI Edge Waterloo: VM 1, VM 2, Metric Exporter
  • SAVI Edge Carleton: VM 1, VM 2, Metric Exporter
  • SAVI Core Toronto: VM 1, VM 2, Metric Collector
  • Kubeflow: Node 1, Node 2, Node 3

slide-36
SLIDE 36

Our Setup over SAVI

  • 16 VMs (blue boxes)
  • 9 VMs acting as switches (orange circles)
  • Each VM has Open vSwitch and monitoring tools preinstalled
  • For each region, there is a Prometheus server
  • An HAProxy load balancer (VM 1) in the "Core"

slide-37
SLIDE 37

Autoencoder

[Figure: autoencoder with encoder, bottleneck (learned encoding), and decoder]

  • One of the main issues for cognitive network management is how to compress diverse types of data more efficiently and reduce the huge volume of data in networking
  • We have 111 data features and reduce the size by around 30% (the bottleneck layer has 70 outputs)

[Figure: Use Case 1 pipeline. Step 1: Data Monitoring; Step 2: AI Engines (autoencoders); Steps 3 & 4: Policy & Execute. Other labels: Transformer, Serve (TFX), Report]
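
To make the encoder, bottleneck, and decoder roles concrete, here is a minimal dependency-free sketch of an autoencoder trained by stochastic gradient descent. It is deliberately simplified and is not the model used in this work: it is linear (no activation functions), it uses 8 synthetic features and a 4-unit bottleneck so that it runs quickly in plain Python, and the data generator `synthetic_metrics` is hypothetical. The same class accepts `LinearAutoencoder(111, 70)` for the dimensions quoted above, though a real experiment would normally use a deep learning framework.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class LinearAutoencoder:
    def __init__(self, n_features, n_bottleneck, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / n_features ** 0.5
        # Encoder maps n_features -> n_bottleneck, decoder maps back.
        self.encoder = [[rng.uniform(-scale, scale) for _ in range(n_features)]
                        for _ in range(n_bottleneck)]
        self.decoder = [[rng.uniform(-scale, scale) for _ in range(n_bottleneck)]
                        for _ in range(n_features)]

    def encode(self, x):
        return matvec(self.encoder, x)

    def decode(self, z):
        return matvec(self.decoder, z)

    def train_step(self, x, lr):
        """One SGD step on the squared reconstruction error of sample x."""
        z = self.encode(x)
        err = [y - t for y, t in zip(self.decode(z), x)]
        # Gradient w.r.t. the code, computed before the decoder is updated.
        dz = [sum(self.decoder[i][j] * err[i] for i in range(len(x)))
              for j in range(len(z))]
        for i, row in enumerate(self.decoder):
            for j in range(len(z)):
                row[j] -= lr * err[i] * z[j]
        for j, row in enumerate(self.encoder):
            for k in range(len(x)):
                row[k] -= lr * dz[j] * x[k]


def mean_loss(model, data):
    """Mean squared reconstruction error over a data set."""
    total = 0.0
    for x in data:
        err = [y - t for y, t in zip(model.decode(model.encode(x)), x)]
        total += sum(e * e for e in err) / len(x)
    return total / len(data)


def synthetic_metrics(n_samples, seed=1):
    """Hypothetical monitoring data: 8 correlated features from 2 hidden factors."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        cpu, net = rng.random(), rng.random()
        data.append([cpu, net, 0.5 * (cpu + net), cpu - net,
                     0.5 * cpu, 0.5 * net, 0.8 * cpu, 0.8 * net])
    return data


data = synthetic_metrics(100)
model = LinearAutoencoder(n_features=8, n_bottleneck=4)
loss_before = mean_loss(model, data)
for _ in range(30):                       # training epochs
    for x in data:
        model.train_step(x, lr=0.01)
loss_after = mean_loss(model, data)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Only the output of `encode` would need to be shipped from an edge site to the core; `decode` reconstructs an approximation of the original features at the receiving end.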

slide-38
SLIDE 38

  • More than 85% of "CPU usage" data is reconstructed with less than 10% error.
  • This result is promising for the potential of autoencoders in networking applications

Evaluation is based on the reconstruction error
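
A statistic of this kind can be computed as sketched below. The slide does not say how the error is defined, so this example assumes a per-sample relative error, |estimate - truth| / |truth|; the function name and the sample values are invented for illustration and are not measurements from the experiment.

```python
def fraction_within_error(original, reconstructed, max_rel_error=0.10):
    """Share of samples whose relative reconstruction error is below the bound."""
    if len(original) != len(reconstructed):
        raise ValueError("series must have the same length")
    within = 0
    for truth, estimate in zip(original, reconstructed):
        if truth == 0:
            # Relative error is undefined at zero; count only exact matches.
            ok = estimate == 0
        else:
            ok = abs(estimate - truth) / abs(truth) < max_rel_error
        within += ok
    return within / len(original)


cpu_usage = [50.0, 20.0, 80.0, 10.0, 40.0]            # made-up true values (%)
cpu_reconstructed = [52.0, 19.0, 79.0, 13.0, 40.5]    # made-up decoder output
share = fraction_within_error(cpu_usage, cpu_reconstructed)
print(f"{share:.0%} of samples reconstructed with < 10% error")
```

Relative error penalizes small true values heavily (the 10 vs 13 sample above fails), so an absolute-error bound may be a better fit for metrics that often sit near zero.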

slide-39
SLIDE 39

Evaluations

  • The effect of the number of neurons in the bottleneck layer

[Figure: autoencoder with encoder, bottleneck (learned encoding), and decoder]

A good tradeoff between error and compression ratio
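
The compression side of this tradeoff is simple arithmetic and can be tabulated for candidate bottleneck sizes. The sketch below counts values only, assuming every feature and every bottleneck output takes the same number of bytes; the 111-feature input is the figure quoted for the autoencoder, while the list of bottleneck sizes is arbitrary. A size reduction quoted elsewhere may have been measured differently, for example on serialized bytes.

```python
def size_reduction(n_features: int, n_bottleneck: int) -> float:
    """Fraction of values saved when n_features are encoded as n_bottleneck."""
    if not 0 < n_bottleneck <= n_features:
        raise ValueError("bottleneck must be between 1 and n_features")
    return 1.0 - n_bottleneck / n_features


N_FEATURES = 111    # feature count quoted for the autoencoder
for bottleneck in (100, 90, 70, 50, 30):
    reduction = size_reduction(N_FEATURES, bottleneck)
    print(f"bottleneck={bottleneck:3d}  size reduction={reduction:.1%}")
```

A smaller bottleneck saves more bandwidth but generally raises the reconstruction error, which is why the error has to be measured for each candidate size before one is chosen.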

slide-40
SLIDE 40

Federated Learning

Data → ML Model → Optimization problem → System setting to solve the optimization problem

Cloud-based training

  • Data is sent to the cloud to derive the model or solve the optimization problem

Federated Learning

  • Data is not sent to the cloud
  • Each device solves its own optimization problem

[Figure: several devices, each holding its own local data]
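
The federated setting can be illustrated with a small sketch in which each device fits a model on its own data and only the model parameters are sent to a server for averaging. It uses the sample-count-weighted averaging rule commonly known as federated averaging (FedAvg); the linear model, the synthetic data following y = 2x + 1, the learning rate, and the number of rounds are assumptions chosen for illustration and are not taken from this work.

```python
import random


def make_device_data(n_samples, seed):
    """Hypothetical local data set following y = 2x + 1 plus small noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        x = rng.uniform(0.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)))
    return data


def local_update(weights, samples, lr=0.5, epochs=5):
    """One device: a few gradient-descent passes on its own data only."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(samples)
        b -= lr * grad_b / len(samples)
    return w, b


def federated_average(updates, sizes):
    """Server: average the device models, weighted by local sample counts."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


# Three devices with different amounts of local data; raw data never leaves them.
devices = [make_device_data(n, seed) for seed, n in enumerate((20, 50, 30))]
global_model = (0.0, 0.0)
for _ in range(100):                      # communication rounds
    updates = [local_update(global_model, d) for d in devices]
    global_model = federated_average(updates, [len(d) for d in devices])

w, b = global_model
print(f"learned w={w:.3f}, b={b:.3f}")
```

In the cloud-based alternative, the three data sets would be concatenated and a single `local_update`-style optimization would run centrally; the federated version trades that simplicity for keeping raw data on the devices.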

slide-41
SLIDE 41

Cost Function of the Centralized Model vs Federated Model

slide-42
SLIDE 42

Concluding Remarks

  • ML very promising for network automation
  • And AIaaS
  • Scalable and efficient approaches required
  • Many discoveries ahead!
slide-43
SLIDE 43

Thank You!
