
Virtualization

Università degli Studi di Roma “Tor Vergata”, Dipartimento di Ingegneria Civile e Ingegneria Informatica

Distributed Systems and Cloud Computing course (Sistemi Distribuiti e Cloud Computing), A.Y. 2017/18, Valeria Cardellini

Main references

  • “Virtual machines and virtualization of clusters and data centers”, chapter 3 of Distributed and Cloud Computing, http://bit.ly/2xBa2xg
  • “Virtualization”, chapter 3 of Mastering Cloud Computing
  • J.E. Smith, R. Nair, “The architecture of virtual machines”, IEEE Computer, 2005, http://bit.ly/2z5cW0X
  • D. Bernstein, “Containers and Cloud: From LXC to Docker to Kubernetes”, IEEE Cloud Computing, 2014, http://bit.ly/2hqudbf
  • More papers on the course web site

Valeria Cardellini - SDCC 2017/18 1


Virtualization

  • Virtualization: a high level of abstraction that hides the details of the underlying implementation
  • Virtualization: abstraction of computational resources
    – The user is presented with a logical view that differs from the physical one
  • Virtualization technologies comprise a variety of mechanisms and techniques used to address problems of:
    – Reliability, performance, security, …
  • How? By decoupling the architecture and behavior of the resources, as perceived by the user, from their physical realization

Resource virtualization

  • Virtualization of hardware and system software resources
    – Virtual machine, container, …
  • Storage virtualization
    – Storage Area Network (SAN), …
  • Network virtualization
    – Virtual LAN (VLAN), Virtual Private Network (VPN), …
  • Data center virtualization


Components of a virtualized environment

  • Three major components:
    – Guest
    – Host
    – Virtualization layer
  • Guest: the system component that interacts with the virtualization layer rather than with the host
  • Host: the original environment where the guest is supposed to be managed
  • Virtualization layer: responsible for recreating the same or a different environment where the guest will operate

Virtual machine

  • A virtual machine (VM) makes it possible to represent the hardware/software resources of a machine differently from their physical reality
    – E.g., the hardware resources of the virtual machine (CPU, network card, SCSI controller) can differ from the physical components of the real machine
  • A single physical machine can be represented and used as different computing environments
    – Multiple VMs on a single physical machine


Virtualization: historical notes

  • The VM concept is an “old” idea, defined in the 1960s in a centralized context
    – Conceived to let legacy (existing) software run on very expensive mainframes and transparently share the (scarce) physical resources
    – E.g., the IBM System/360-67 mainframe
  • In the 1980s, with the shift to PCs, the problem of transparently sharing computing resources was solved by multitasking OSes
    – Interest in virtualization faded

Virtualization: historical notes (2)

  • In the late 1990s, interest revived as a way to make programming special-purpose hardware less costly
    – VMware was founded in 1998
  • The problem of the management costs and underutilization of heterogeneous hardware and software platforms became more acute
    – Hardware changes faster than software (middleware and applications)
    – Management costs increase and portability decreases
  • Sharing hardware and unused computing capacity became important again, in order to reduce infrastructure costs
  • It is one of the enabling technologies of Cloud computing


Virtualization: advantages

  • Eases compatibility, portability, interoperability and migration of applications and environments
    – Hardware independence
    – Create Once, Run Anywhere
    – Legacy VMs: run old OSes on new platforms

Virtualization: advantages (2)

  • Enables server consolidation in a data center, with economic, management and energy benefits
    – Goal: reduce the total number of servers and use them more efficiently
    – Benefits:
      • Lower costs and energy consumption
      • Simpler management, maintenance and upgrade of servers
      • Less floor space and less downtime


Virtualization: advantages (3)

  • Allows isolating malfunctioning components, or components under security attack, thus increasing application reliability and security
    – VMs of different applications cannot access each other's resources
    – Software bugs, crashes, or viruses in one VM cannot damage other VMs running on the same physical machine
  • Allows performance isolation
    – E.g., through suitable scheduling of the physical resources shared among multiple VMs
  • Allows load balancing across servers
    – By migrating a VM from one server to another

Taxonomy of virtualization techniques

  • Execution environment virtualization is the oldest, most popular and most developed area; it is the one we will mostly investigate


Use of virtualized execution environments

  • In personal and educational settings
    – To run different OSes simultaneously and to simplify software installation
  • In professional settings
    – For debugging, testing and application development
  • In enterprise settings
    – To consolidate the data center infrastructure
    – To guarantee business continuity: entire systems are encapsulated in single files (system images) that can be replicated, migrated or reinstalled on any server

Virtual machine architectures

At which level should virtualization be realized?

  • It strongly depends on the interfaces offered by the various system components
    – Interface between hardware and software (user ISA: unprivileged machine instructions, invocable by any program) [interface 4]
    – Interface between hardware and software (system ISA: machine instructions invocable only by privileged programs) [interface 3]
    – System calls [interface 2]
    – Library calls (API) [interface 1]
  • ABI (Application Binary Interface): interface 2 + interface 4
  • Goal of virtualization
    – Mimic the behavior of these interfaces

Reference: “The architecture of virtual machines”


Machine reference model

An application uses library functions (A1), makes system calls (A2), and executes machine instructions (A3).

[Figure: machine reference model. Applications sit on Libraries (API), which sit on the Operating System (system calls, ABI), which sits on the Hardware (User ISA, System ISA); A1, A2 and A3 mark the three interaction points]

Virtualization layers

  • Common virtualization layers include:
    – Application level (also process VM)
    – Library level (user-level API)
    – Operating system level (also containers)
    – ISA level
      • Requires binary translation and optimization, e.g., dynamic binary translation
    – Hardware abstraction layer (also system VM): based on the virtual machine monitor (VMM), also called hypervisor
      • VMM: software that securely partitions the resources of a computer system into one or more VMs
  • Our focus: operating system level and hardware abstraction layer

Virtualization layers (2)

[Figure: examples of the virtualization layers, including Docker]

Process virtual machine

  • Virtualization for a single process
    – Process VM: a virtual platform that executes a single process
    – Provides a virtual ABI or API environment for user applications
  • The program is compiled into (portable) intermediate code, which is then executed by the runtime system
  • Examples: JVM, .NET CLR

Multiple instances of <application, runtime system> combinations
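The idea of compiling to portable intermediate code that a runtime system then executes can be sketched with a toy stack-based interpreter. This is only an illustration: the opcodes below are invented, far simpler than real JVM or CLR bytecode.

```python
# Toy process VM: a stack-based interpreter for a tiny, hypothetical
# intermediate code (inspired by, but much simpler than, JVM bytecode).

def run(bytecode):
    """Execute a list of (opcode, operand) pairs on an operand stack."""
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":          # push a constant onto the stack
            stack.append(arg)
        elif op == "ADD":         # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":         # pop two operands, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "RET":         # return the top of the stack
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

# The same "compiled" program runs unchanged on any host with the runtime:
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("RET", None)]
```

Here `run(program)` computes (2 + 3) * 4; portability comes from shipping `program`, not native machine code.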


Virtual machine monitor

  • A separate software layer that (completely) shields the underlying hardware and mimics the instruction set of the architecture
  • Different operating systems can run on the VMM independently and simultaneously
  • Examples: VMware, Parallels, VirtualBox, Xen, KVM

Multiple instances of <applications, operating system> combinations

VMM terminology and classification

  • Host: the base platform on which VMs are realized; it comprises:
    – The physical machine
    – The native operating system, if any
    – The VMM
  • Guest: everything that belongs to each single VM
    – The operating system and the applications executed in the VM
  • We first consider system-level virtualization (VMM or hypervisor)
  • For the VMM, we distinguish:
    – System VMM vs. hosted VMM
    – Full virtualization vs. paravirtualization


System VMM or hosted VMM

At which level of the system architecture does the VMM sit?
  – Directly on the hardware (system VMM)
  – As an application on an existing OS (hosted VMM)

[Figure: system VMM vs. hosted VMM, showing the host and guest layers in each case]

System VMM or hosted VMM (2)

  • System VMM (type-1): runs directly on the hardware and offers virtualization features integrated into a simplified OS
    – The hypervisor can have a micro-kernel architecture (only basic functions, no device drivers) or a monolithic one
    – Examples: Xen, VMware ESX
  • Hosted VMM (type-2): runs on the host OS and accesses the hardware resources through the system calls of the host OS
    – Interacts with the host OS through the ABI and emulates the ISA of virtual hardware for the guest OSes
    – Advantage: no need to modify the guest OS
    – Advantage: can use the host OS to manage peripherals and to exploit low-level services (e.g., resource scheduling)
    – Disadvantage: performance degradation with respect to a system VMM
    – Examples: Bochs, Parallels Desktop, VirtualBox


VMM reference architecture

  • Dispatcher: the VMM entry point that reroutes the instructions issued by the VM
  • Allocator: decides which system resources are to be provided to the VM
  • Interpreter: executes a proper routine when the VM executes a privileged instruction
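A minimal sketch of this reference architecture, with all class and opcode names purely illustrative: the dispatcher receives each instruction issued by a VM, routes resource requests to the allocator and privileged instructions to the interpreter, and lets unprivileged instructions run directly.

```python
# Toy VMM following the dispatcher/allocator/interpreter model.
# Instructions are plain strings; privileged ones are trapped and emulated.

PRIVILEGED = {"lidt", "out", "hlt"}   # illustrative privileged opcodes

class Allocator:
    """Decides which system resources are granted to a VM."""
    def __init__(self, total_mem):
        self.free_mem = total_mem
    def grant_memory(self, vm, amount):
        granted = min(amount, self.free_mem)
        self.free_mem -= granted
        return granted

class Interpreter:
    """Emulates a privileged instruction on behalf of the VM."""
    def execute(self, vm, instr):
        return f"{vm}: emulated privileged '{instr}'"

class Dispatcher:
    """VMM entry point: reroutes instructions issued by a VM."""
    def __init__(self, allocator, interpreter):
        self.allocator, self.interpreter = allocator, interpreter
    def dispatch(self, vm, instr):
        if instr.startswith("alloc "):            # resource request
            amount = int(instr.split()[1])
            return self.allocator.grant_memory(vm, amount)
        if instr in PRIVILEGED:                   # trap to the interpreter
            return self.interpreter.execute(vm, instr)
        return f"{vm}: ran '{instr}' directly"    # unprivileged: direct execution

vmm = Dispatcher(Allocator(total_mem=1024), Interpreter())
```

For instance, `vmm.dispatch("vm1", "alloc 512")` goes through the allocator, while `vmm.dispatch("vm1", "lidt")` is trapped and emulated by the interpreter.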

Full virtualization or paravirtualization

How do the VM and the VMM interact for access to the physical resources?
  – Full virtualization
  – Paravirtualization

  • Full virtualization
    – The VMM exposes to each VM simulated hardware interfaces that are functionally identical to those of the underlying physical machine
    – The VMM intercepts privileged hardware access requests (e.g., I/O instructions) and emulates their expected behavior
    – The VMM manages a CPU context for each VM and shares the physical CPUs among all the VMs
    – Examples: KVM, VMware ESXi, Microsoft Hyper-V


Full virtualization or paravirtualization (2)

How do the VM and the VMM interact for access to the physical resources?

  • Paravirtualization
    – The VMM exposes to each VM simulated hardware interfaces that are functionally similar (but not identical) to those of the underlying physical machine
    – The hardware is not emulated; instead, a minimal software layer (Virtual Hardware API) is created to manage the single VM instances and their isolation
    – Examples: Xen, Oracle VM (based on Xen), PikeOS

A qualitative comparison of the different VM solutions: en.wikipedia.org/wiki/Comparison_of_platform_virtual_machines

Full virtualization

  • Advantages
    – No need to modify the guest OS
    – Complete isolation among VM instances: security, ease of emulating different architectures
  • Disadvantages
    – More complex VMM
    – Requires the processor's cooperation for an effective implementation
      • Why?


Problems for virtualization

  • The processor architecture operates with at least 2 protection levels (rings): supervisor and user
    – Ring 0: maximum privileges
    – Ring 3: minimum privileges

[Figure: x86 architecture without virtualization]

  • With virtualization:
    – Only the VMM operates in supervisor mode
    – The guest OS and the applications (hence the VM) operate in user mode
    – Ring deprivileging problem: the guest OS operates in a ring that is not its own and cannot execute privileged instructions
    – Ring compression problem: since the applications and the guest OS run at the same level, the OS address space must be protected

Full virtualization (2)

  • How to solve ring deprivileging?
    – Trap-and-emulate: when the guest OS attempts to execute a privileged instruction (e.g., lidt on x86), the processor raises an exception (trap) to the VMM and transfers control to it; the VMM checks the correctness of the requested operation and emulates its behavior
    – Unprivileged instructions executed by the guest OS are instead executed directly
  • How to realize the trap mechanism?
    – In hardware: hardware-assisted CPU virtualization
    – In software: fast binary translation


Hardware-assisted CPU virtualization

  • Hardware-assisted CPU virtualization (Intel VT-x and AMD-V) provides two new forms of CPU operating modes, called root mode and non-root mode, each supporting all four x86 protection rings
  • The VMM runs in root mode (Root-Ring 0), while all guest OSes run in guest mode at their original privilege levels (Non-Root Ring 0): no more ring deprivileging and ring compression problems
  • The VMM can control guest execution through the control bits of hardware-defined structures

[Figure: x86 architecture with full virtualization and hardware-assisted CPU virtualization]

Fast binary translation

  • The trap-to-VMM mechanism for privileged instructions is offered only by processors with hardware support for virtualization (Intel VT-x and AMD-V)
    – IA-32 is not one of them: how to realize full virtualization without hardware support?
  • Fast binary translation: the VMM scans the code before its execution and replaces blocks containing privileged instructions with functionally equivalent blocks containing instructions that raise exceptions to the VMM
  • The translated blocks are executed directly on the hardware and kept in a cache for possible future reuse
  • Greater VMM complexity and lower performance

[Figure: x86 architecture with full virtualization and binary translation]
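The scan, replace and cache cycle can be sketched as follows; instructions are modeled as plain strings, and the privileged set and the trap stub are illustrative, not a real x86 encoding.

```python
# Toy fast binary translation: scan a basic block before execution,
# replace privileged instructions with traps to the VMM, and cache
# the translated block for future reuse.

PRIVILEGED = {"lidt", "out", "cli"}     # illustrative privileged opcodes
translation_cache = {}                  # block id -> translated block

def translate_block(block_id, block):
    """Return a functionally equivalent block that traps to the VMM."""
    if block_id in translation_cache:           # reuse a previous translation
        return translation_cache[block_id]
    translated = [f"trap_to_vmm({i})" if i in PRIVILEGED else i
                  for i in block]               # unprivileged ops kept as-is
    translation_cache[block_id] = translated
    return translated

guest_block = ["mov", "lidt", "add", "out"]
```

Translating `guest_block` leaves `mov` and `add` untouched and rewrites `lidt` and `out` into trap stubs; a second translation of the same block is served from the cache.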


Paravirtualization

  • Non-transparent virtualization solution
  • The guest OS kernel must be modified to let it invoke the special API exposed by the virtualization layer
  • Nonvirtualizable OS instructions are replaced by hypercalls that communicate directly with the hypervisor
  • A hypercall is to a hypervisor what a syscall is to a kernel

[Figure: x86 architecture with paravirtualization]

Paravirtualization (2)

  • Pros (vs. full virtualization):
    – Overhead reduction
    – Relatively easier and more practical implementation: the VMM simply transfers the execution of performance-critical operations (hard to virtualize) to the host
  • Cons:
    – Requires the source code of the OSes to be available
      • OSes that cannot be ported (e.g., Windows) can still take advantage of virtualization by using ad hoc device drivers that remap the execution of critical instructions to the virtual API exposed by the VMM
    – Cost of maintaining paravirtualized OSes


Paravirtualization: hypercall execution

  • The hypervisor (not the kernel) has interrupt handlers installed
  • When a VM application issues a guest OS system call, execution jumps to the hypervisor, which handles it and then passes control back to the guest OS

Courtesy of “The Definitive Guide to the XEN Hypervisor” by D. Chisnall
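The hypercall mechanism can be sketched as a dispatch table owned by the hypervisor, analogous to a kernel's syscall table. The hypercall numbers and handler names below are invented for illustration and do not reflect Xen's real hypercall ABI.

```python
# Toy hypercall dispatch: the hypervisor owns the handler table and,
# after servicing a call, passes control back to the guest kernel.

class Hypervisor:
    def __init__(self):
        # Hypercall number -> handler, like a kernel's syscall table
        self.hypercalls = {
            1: self._update_page_table,
            2: self._send_event,
        }
    def _update_page_table(self, guest, arg):
        return f"{guest}: page table updated with {arg}"
    def _send_event(self, guest, arg):
        return f"{guest}: event {arg} delivered"
    def hypercall(self, number, guest, arg):
        handler = self.hypercalls[number]   # execution jumps to the hypervisor
        return handler(guest, arg)          # then control returns to the guest

class GuestKernel:
    """A paravirtualized kernel invokes hypercalls instead of privileged ops."""
    def __init__(self, name, hypervisor):
        self.name, self.hv = name, hypervisor
    def set_page_table_entry(self, entry):
        # A nonvirtualizable instruction is replaced by a hypercall
        return self.hv.hypercall(1, self.name, entry)

guest = GuestKernel("domU", Hypervisor())
```

Calling `guest.set_page_table_entry(...)` traverses exactly the path on the slide: guest kernel, then hypervisor handler, then back to the guest.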

Memory virtualization

  • In a non-virtualized environment
    – One-level mapping: from virtual memory to physical memory, provided by page tables
    – MMU and TLB hardware components optimize virtual memory performance
  • In a virtualized environment
    – All VMs share the same machine memory, and the VMM partitions memory among the VMs
    – Two-level mapping: from virtual memory to physical memory, and from physical memory to machine memory
  • Terminology
    – Host physical memory: the actual hardware memory visible to the VMM
    – Guest physical memory: the memory visible to the guest OS
    – Guest virtual memory: the memory visible to applications; the contiguous virtual address space presented by the guest OS to applications


Two-level memory mapping

  • Going from guest virtual memory to host physical memory requires a two-level memory mapping:
    – Guest VA (virtual address) → guest PA (physical address) → host MA (machine address)
  • Guest physical address ≠ host machine address

Shadow page table

  • To avoid an unbearable performance drop due to the extra memory mapping, the VMM maintains shadow page tables (SPTs)
    – Direct guest virtual-to-host physical address mapping
  • The SPT maps guest virtual addresses to host physical addresses
    – The guest OS maintains its own virtual memory page table (PT) in the guest physical memory frames
    – For each guest physical memory frame, the VMM should map it to a host physical memory frame
    – The SPT maintains the mapping from guest virtual address to host machine address
    – The VMM needs to keep the SPTs consistent with the changes made by the guest OS to its PT
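The two mappings and their composition into an SPT can be sketched with dictionaries; page numbers stand in for full addresses, and all the values are illustrative.

```python
# Toy shadow page table: compose the guest page table (guest virtual ->
# guest physical) with the VMM's mapping (guest physical -> host machine)
# into a direct guest virtual -> host machine mapping.

guest_pt = {0: 5, 1: 7, 2: 3}      # guest OS page table (per-page numbers)
vmm_map = {5: 12, 7: 40, 3: 9}     # VMM: guest physical -> host machine

def build_spt(guest_pt, vmm_map):
    """Build the shadow page table for one guest address space."""
    return {gva: vmm_map[gpa] for gva, gpa in guest_pt.items()}

spt = build_spt(guest_pt, vmm_map)

def on_guest_pt_update(gva, new_gpa):
    """The VMM must keep the SPT consistent with guest PT changes."""
    guest_pt[gva] = new_gpa
    spt[gva] = vmm_map[new_gpa]
```

With the values above, guest virtual page 0 maps to guest physical page 5 and hence to host machine page 12; the SPT stores the 0 → 12 mapping directly, skipping the intermediate step at translation time.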


Challenges in memory virtualization with SPT

  • Address translation
    – The guest OS expects contiguous, zero-based physical memory, but the underlying machine memory may be discontiguous: the VMM must preserve this illusion
  • Page table shadowing
    – The SPT implementation is complex
    – The VMM intercepts paging operations and constructs a copy of the PTs
  • Overheads
    – VM exits add to execution time
    – SPTs consume significant host memory
    – SPTs need to be kept synchronized with the guest PTs

Hardware support for memory virtualization

  • SPT is a software-managed solution; let us consider a hardware solution
    – Second Level Address Translation (SLAT) is the hardware-assisted solution for memory virtualization (Intel EPT and AMD RVI) that translates guest virtual addresses into the machine's physical addresses
    – Using SLAT yields a significant performance gain with respect to SPT: around 50% for MMU-intensive benchmarks


Case study: Xen

  • The most notable example of paravirtualization, www.xenproject.org (developed at the University of Cambridge)
    – Open-source type-1 (system VMM) hypervisor
    – Offers to the guest OS a virtual interface (hypercall API) to which the guest OS must refer to access the machine's physical resources
    – With paravirtualization (PV), Xen requires PV-enabled guest OSes and PV drivers (now part of the Linux kernel as well as of other operating systems)
      • OSes ported to Xen: Linux, NetBSD, FreeBSD and OpenSolaris
    – Can also support hardware-assisted virtualization (HVM)
      • With HVM, unmodified guest OSes (e.g., Windows) can be used
    – Foundation for many products and platforms (e.g., Oracle VM and Qubes OS); powers some of the largest IaaS providers (e.g., Amazon, Rackspace)

Xen: pros and cons

  • Pros
    – Thin hypervisor model
      • 270K lines of code in Xen v4.0 (45K LoC in Xen v2.0)
      • More robust and secure than other hypervisors
      • But still vulnerable to attacks: https://xenbits.xen.org/xsa/
    – Continuously improved
    – Flexibility in management
      • Tuning for performance
    – Minimal overhead (within 2.5%) with respect to the bare-metal machine without virtualization
    – Supports migration
  • Cons
    – I/O performance still remains challenging


Xen architecture

  • Goal of the Cambridge group (late 1990s)
    – Design a VMM capable of scaling to about 100 VMs running applications and services without any modifications to the ABI
  • First public release in 2003
  • Microkernel design
  • What can be paravirtualized?
    – Disk and network devices
    – Emulated platform: motherboard, device buses, BIOS, legacy boot
    – Privileged instructions and page tables (memory access)
      • Privileged instructions issued by a guest OS are replaced with hypercalls

Xen architecture: domains

  • Xen domain
    – Represents a VM instance
    – An ensemble of address spaces hosting a guest OS and the applications running over the guest OS
    – Runs on a virtual CPU
  • Dom0 (or control domain): a specialized domain devoted to the execution of Xen control functions and privileged instructions
    – The initial domain, started by the Xen hypervisor on boot
    – Special privileges: capability to access the hardware directly, access to the system's I/O functions, and interaction with the other VMs
  • DomU (or unprivileged domain): user domain


Xen architecture

wiki.xen.org/wiki/Xen_Project_Software_Overview

[Figure: Xen architecture overview]

Xen architecture and guest OS management

  • The Xen hypervisor runs in the most privileged mode and controls the access of the guest OSes to the underlying hardware
    – Domains are run in ring 1
    – Applications in ring 3


Xen components: XenStore and Toolstack

  • XenStore: an information storage space shared between domains, managed by the xenstored daemon
    – System-wide registry and naming service
    – Implemented as a hierarchical key-value storage
    – When values are changed, a watch function informs listeners of changes to the keys in storage they have subscribed to
    – Communicates with guest VMs via shared memory, using Dom0 privileges
  • Toolstack: manages the VM lifecycle (create, shutdown, pause, migrate)
    – To create a new VM, a user provides a configuration file describing memory and CPU allocations and device configurations
    – The Toolstack parses this file and writes the information in XenStore
    – It takes advantage of Dom0 privileges to map guest memory, to load the kernel and virtual BIOS, and to set up the initial communication channels with XenStore and with the virtual console when a new VM is created
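XenStore's behavior as a hierarchical key-value store with watches can be sketched as follows. This is a simplified model, not the real xenstored protocol; the paths are shaped like XenStore paths but the API is invented.

```python
# Toy XenStore-like registry: hierarchical keys (slash-separated paths)
# plus watch callbacks fired when a watched subtree changes.

class Store:
    def __init__(self):
        self.data = {}       # path -> value
        self.watches = []    # (path prefix, callback)
    def watch(self, prefix, callback):
        """Register a listener for changes under a key prefix."""
        self.watches.append((prefix, callback))
    def write(self, path, value):
        self.data[path] = value
        for prefix, callback in self.watches:
            if path.startswith(prefix):      # change inside a watched subtree
                callback(path, value)
    def read(self, path):
        return self.data[path]

store = Store()
events = []
store.watch("/local/domain/1", lambda p, v: events.append((p, v)))
store.write("/local/domain/1/memory/target", "524288")
store.write("/local/domain/2/name", "web")   # outside the watch: no event
```

The first write fires the watch because the path falls under the subscribed prefix; the second does not, mirroring how XenStore notifies only the listeners subscribed to the changed subtree.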

CPU schedulers in Xen

  • The job of a hypervisor's scheduler is to decide, among all the virtual CPUs (vCPUs) of the various VMs, which ones should execute on the host's physical CPUs (pCPUs) at any given point in time
    – A further scheduling level with respect to those provided by the OS (scheduling of processes and scheduling of user-level threads within processes)
  • Xen allows choosing among different CPU schedulers
    – We consider the Credit scheduler (the default scheduler in Xen)
  • Scheduling algorithm goals:
    – Make sure that domains get a “fair” share of CPU
      • Proportional share algorithm: allocates pCPUs in proportion to the number of shares (weights) assigned to the vCPUs
    – Keep the CPU busy
      • Work-conserving algorithm: does not allow the CPU to be idle when there is work to be done
    – Schedule with low latency


Credit scheduler

  • Proportional fair-share and work-conserving scheduler
  • Each domain is assigned a weight and optionally a cap (tunable parameters)
    – Weight: relative CPU allocation per domain (default 256)
    – Cap: maximum amount of CPU a domain can use. If the cap is 0 (default), a vCPU can receive any extra CPU (i.e., work-conserving); a non-zero cap limits the amount of CPU a vCPU receives (e.g., 100 = 1 pCPU, 50 = 0.5 pCPU)
    – The scheduler transforms the weight into a credit allocation for each vCPU; as a vCPU runs, it consumes credits
  • For each pCPU, the scheduler maintains a queue of vCPUs, with all the under-credit vCPUs first, followed by the over-credit vCPUs; the scheduler picks the first vCPU in the queue
  • Automatically load-balances vCPUs across pCPUs on an SMP host
    – Before a pCPU goes idle, it will consider other pCPUs in order to find any runnable vCPU; this approach guarantees that no pCPU idles when there is runnable work in the system

wiki.xen.org/wiki/Credit_Scheduler
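The weight-to-credit step and the under/over-credit queue ordering can be sketched as follows. This is a simplified model of a Credit-style scheduler, not Xen's actual implementation; the credits-per-period value and the tie-breaking rule are illustrative.

```python
# Toy Credit-like scheduler: weights become per-period credit allocations;
# the runqueue keeps under-credit vCPUs ahead of over-credit ones.

def allocate_credits(weights, credits_per_period=300):
    """Split a period's credits among vCPUs in proportion to their weights."""
    total = sum(weights.values())
    return {vcpu: credits_per_period * w / total for vcpu, w in weights.items()}

def pick_next(credit_balance):
    """Pick the first vCPU: under-credit (positive balance) before over-credit."""
    queue = sorted(credit_balance, key=lambda v: (credit_balance[v] <= 0, v))
    return queue[0]

# A domain with default weight 256 vs. a domain with double weight:
credits = allocate_credits({"vm1": 256, "vm2": 512})
```

With 300 credits per period, vm1 receives 100 credits and vm2 receives 200, i.e., CPU time in a 1:2 ratio; `pick_next` then prefers any vCPU that still has credits left over one that has exhausted them.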

Performance comparison of hypervisors

  • Developments in virtualization techniques and CPU architectures have reduced the performance cost of virtualization, but overheads still exist
    – Especially when multiple VMs compete for hardware resources
  • We consider two performance comparison studies
    – See the course site for the full papers
  • Shared results of the studies
    – No one-size-fits-all solution exists
    – Different hypervisors show different performance characteristics for varying workloads


Performance comparison of hypervisors (2)

A component-based performance comparison of four hypervisors (IM 2013), http://bit.ly/2igBGZX

  – Microsoft Hyper-V, KVM, VMware vSphere and Xen, all with hardware-assisted virtualization settings
  – Analyzed components: CPU, memory, disk I/O and network I/O

  • Overall results
    – Performance can vary between 3% and 140% depending on the type of hardware resource, but no single hypervisor always outperforms the others
      • vSphere performs the best, but the other 3 perform respectably
      • CPU and memory: lowest levels of overhead
      • I/O and network: Xen overhead for small disk operations
    – Takeaway: consider the type of application, because different hypervisors may be best suited for different workloads

Performance comparison of hypervisors (3)

Performance overhead among three hypervisors: an experimental study using Hadoop benchmarks (BigData 2013), http://bit.ly/2ziKCZM

  • Uses Hadoop MapReduce apps to evaluate and compare the performance impact of three hypervisors
    – A commercial one (not disclosed), Xen, and KVM
  • For CPU-intensive benchmarks, negligible performance difference between the three hypervisors
  • Significant performance variations were seen for I/O-intensive benchmarks
    – The commercial hypervisor was better at disk writing, while KVM was better at disk reading
    – Xen was better when disk reading and writing were combined with CPU-intensive computations


OS-level virtualization

  • So far we have considered system-level virtualization; we now analyze operating-system-level virtualization (or container-based virtualization)
  • It allows running multiple mutually isolated execution environments within a single OS
    – Such environments are called: containers, jails, zones, virtual execution environments (VEs), virtual private servers
    – The solution originated as an evolution of chroot in Unix-like systems and of cgroups and namespaces in Linux
      • chroot (change root directory): a command to change the root directory of running processes
      • cgroups (control groups): a mechanism to limit, measure and isolate the resource usage (CPU, memory, block I/O, network) of a set of processes
      • namespaces: a mechanism to isolate what a set of processes can see of the operating environment (files, ports, …)

OS-level virtualization (2)

  • Each container has:
    – Its own set of processes, file system, users, network interfaces with IP addresses, routing tables, firewall rules, …
  • The containers share the kernel of the same OS (e.g., Linux)


OS-level virtualization: advantages

  • With respect to VMM-based virtualization:
    ✓ Almost no performance degradation
      Applications invoke system calls directly; there is no need to go through the VMM
    ✓ Minimal startup and shutdown/cleanup times
      Seconds for a container, minutes for a VM
    ✓ High density
      Hundreds of instances on a single physical machine (PM); e.g., up to 8191 with Solaris Containers
    ✓ Smaller image (footprint)
      Does not include the OS kernel
    ✓ Memory pages can be shared among multiple containers running on the same PM
    ✓ Greater portability and interoperability for cloud applications
      The application in the container is independent of the execution environment

In a nutshell: lightweight vs. heavyweight

OS-level virtualization: disadvantages

  • With respect to VMM-based virtualization:
    – Less flexibility
      • Kernels of different OSes cannot run simultaneously on the same PM
      • Only applications native to the supported OS (e.g., native Linux applications)
    – Weaker isolation
    – Higher risk of vulnerabilities
      • A single vulnerability in the OS kernel can compromise the entire system

[Figure: VMM-based (type-2) vs. container-based virtualization]


OS-level virtualization: products

  • Docker
    – Our case study
  • FreeBSD Jail
  • Solaris Zones/Containers
  • LXC (LinuX Containers)
    – Supported by the mainline Linux kernel
    – For full system containers (full OS image)
    – LXD: built on top of LXC, it is a system container manager
  • Virtuozzo
  • OpenVZ (for Linux)
  • IBM LPAR
  • rkt
    – Application container engine

OS-level virtualization: only Linux?

  • Windows and OS X now support container-based virtualization
    – Docker for Windows: integrated with Hyper-V virtualization, networking and file system, https://www.docker.com/docker-windows
  • You can always install a Linux VM and then use a container-based virtualization product inside the Linux VM
    – Con: performance loss
    – Con: the apps in the containers must run on Linux (no OS X or Windows native applications)


Case study: Docker

  • Lightweight, open and secure container-based virtualization
    – Containers include the application and all of its dependencies, but share the kernel with the other containers
    – Containers run as isolated processes in user space on the host operating system
    – Containers are also not tied to any specific infrastructure

Topic of the lesson held on November 8

Docker and hypervisors: performance comparison

Hypervisors vs. lightweight virtualization: a performance comparison (IC2E 2015 Workshops) http://bit.ly/2yX8xNF

  • Performance comparison: hypervisor (KVM) vs.

lightweight virtualization (LXC, Docker and OSV)

– Docker runs over LXC, OSV runs over KVM (for OSV see next slides) – CPU, memory, disk I/O, and network

  • Overhead introduced by containers is almost

negligible

– Versatility and ease of management are paid for in terms of security

  • KVM improvement in the last years

– Disk I/O efficiency still a bottleneck for some types of apps

  • Network efficiency is still challenging

– Especially UDP traffic

SLIDE 30

Microservices

  • Architectural style for distributed applications
  • Derive from SOA
  • Address how to build, manage, and evolve

architectures out of small, self-contained units

– Decompose app into a set of independently deployable services, that are loosely-coupled and cooperating and can be rapidly deployed and scaled
– Services equipped with dedicated memory persistence tools (e.g., databases)


Example of microservices

  • E-commerce application that takes orders from customers,

verifies inventory and available credit, and ships them

  • Components: user interface along with some backend

services for checking credit, maintaining inventory and shipping orders

SLIDE 31

Microservices and containers

  • Microservices as ideal complementation of

container technology

– Package each service as a container image and deploy each service instance as a container

  • Pros:

– Service instance scaling out/in by changing the number of container instances
– Service instance isolation
– Resource limits on service instance
– Build and start rapidly

  • Cons:

– Need container orchestration to manage the multi-container app


Containers and DevOps

  • Container-based virtualization helps in the shift to DevOps https://www.docker.com/use-cases/devops

  • DevOps = Development and Operations

– “DevOps is a development methodology with a set of practices aimed at bridging the gap between Development and Operations, emphasizing communication and collaboration, continuous integration, quality assurance and delivery with automated deployment” (Jabbari et al., 2016)

SLIDE 32

New lightweight approaches to virtualization

  • With microservices, increased demand for low-overhead virtualization techniques

– OS-level virtualization is not enough
– How to have tiny one-shot VMs that run on hypervisors with great density and that self-scale their resource needs? Lightweight OSes and unikernels
– Basic idea: avoid OS overhead

  • CoreOS Container Linux https://coreos.com

– Open-source and Linux-based lightweight OS

  • Only minimal functionalities required for deploying apps inside

containers, together with built-in mechanisms for service discovery, container management and process management

– Designed for clustered deployments, with focus on automation, ease of application deployment, security, and scalability


New lightweight approaches to virtualization

  • Deployment strategies examined so far

SLIDE 33

New lightweight approaches to virtualization

  • Unikernel: the library OS concept http://unikernel.org

– Single-purpose, single-language virtual machine hosted on a minimal environment

  • Specialized OS with minimal set of libraries which correspond to OS constructs required for app to run, all in a single address space

– Pros:

  • Lightweight (minimal memory footprint)
  • Fast (no context switching)
  • Secure (reduced attack surface)

– Cons:

  • Only work in hypervisor-based virtual environments
  • Poor debugging
  • Single language runtime

See https://www.youtube.com/watch?v=oHcHTFleNtg


New lightweight approaches to virtualization

  • Some examples of unikernels: Xen Mirage, OSV,

LinuxKit, includeOS

  • OSV http://osv.io

– Unikernel designed for the Cloud
– Intended to be run on top of a hypervisor (e.g., KVM, Xen)
– Achieves the isolation benefits of hypervisor-based systems, but avoids the overhead of the guest OS
– Uses its own application-image system, not Docker

SLIDE 34

Hypervisors and containers in the Cloud

  • Hypervisor-based virtualization: greater flexibility

(different OSs on same PM) and security

  • Container-based virtualization: smaller-size deployment
  • Containers and container development platforms now offered as first-class Cloud services

– Amazon EC2 Container Service (ECS)
– Azure Container Service
– Google Container Engine
– Alauda (Container-as-a-Service solution) https://www.alauda.io
– Docker Cloud

  • Some open questions

– Containers on top of VMs?
– Will container engines replace hypervisors in Cloud offering?
– Nested virtualization? (Now possible in Azure http://bit.ly/2zjqnZ2)


Dynamic resizing and migration

  • Two useful techniques to deploy and

manage large-scale virtualized environments

– Dynamic resizing of VMs and containers for vertical scaling
– Live migration of VMs and containers

  • Move VM/container between different physical

machines (or data centers) without stopping it

SLIDE 35

VM dynamic resizing

  • Fine-grained mechanism compared to migrating or rebooting a VM

– Example: an application running on a VM starts consuming a lot of resources and the VM starts running out of RAM and CPU → resize the VM

  • Pros: more cost-effective and faster than VM reboot
  • Cons: not supported by all virtualization products and

guest OSs

  • What can be resized without powering off and

rebooting the VM?

– Number of CPUs
– Memory size


VM dynamic resizing: CPU

  • To add or remove CPUs (without switching off the

machine)

  • In Linux-based systems support for CPU hot-plug/hot-

unplug (e.g., KVM)

– Uses information in the virtual file system sysfs (processor info in /sys/devices/system/cpu)
– /sys/devices/system/cpu/cpuX for cpuX (X=0, 1, 2, …)
– To turn on cpu #5: echo 1 > /sys/devices/system/cpu/cpu5/online
– To turn off cpu #5: echo 0 > /sys/devices/system/cpu/cpu5/online
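The echo commands above can be wrapped in a small helper; a minimal sketch (the sysfs root is a parameter only so the logic can be exercised outside a real Linux host, where it would be /sys/devices/system/cpu and require root privileges):

```python
import pathlib

def set_cpu_online(cpu, online, sysfs_root="/sys/devices/system/cpu"):
    """Hot-plug/hot-unplug cpuX by writing '1' or '0' to cpuX/online,
    like `echo 1 > /sys/devices/system/cpu/cpu5/online`."""
    path = pathlib.Path(sysfs_root) / f"cpu{cpu}" / "online"
    path.write_text("1" if online else "0")
    return path.read_text()
```

Note that on a real host cpu0 is often not hot-unpluggable, and not all guest OSs support the operation.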

SLIDE 36

VM dynamic resizing: memory

  • Based on memory ballooning

– In KVM: virtio_balloon driver


  • When the balloon inflates: less memory for the VM

– swap area
– out-of-memory (OOM) killer

  • When the balloon deflates: more memory for the VM

– the memory size cannot exceed maxMemory
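A toy model of the inflate/deflate behavior (illustrative only; the real virtio_balloon driver negotiates memory pages with the hypervisor, and the MB sizes below are made up):

```python
class Balloon:
    """Toy model of memory ballooning for a single VM."""
    def __init__(self, max_memory_mb, current_mb):
        self.max_memory_mb = max_memory_mb  # upper bound fixed at VM definition
        self.current_mb = current_mb        # memory currently seen by the guest

    def inflate(self, mb):
        # Hypervisor reclaims memory: guest-visible memory shrinks, and the
        # guest may start swapping or trigger the OOM killer.
        self.current_mb = max(0, self.current_mb - mb)
        return self.current_mb

    def deflate(self, mb):
        # Memory is given back to the VM, never beyond maxMemory.
        self.current_mb = min(self.max_memory_mb, self.current_mb + mb)
        return self.current_mb

b = Balloon(max_memory_mb=8192, current_mb=4096)
b.inflate(1024)    # guest now sees 3072 MB
b.deflate(100000)  # capped at maxMemory (8192 MB)
```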


Container dynamic resizing

  • Resize (CPU, memory, I/O) the container

limits

– Possibly without restarting the container
– Low-level solution: cgroups limits can be changed on the fly
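A sketch of the low-level cgroup route, assuming a cgroup-v1 memory controller (the directory would really be something like /sys/fs/cgroup/memory/docker/&lt;id&gt;; here it is a parameter so the logic is self-contained). With Docker, a similar effect is exposed by `docker update --memory`:

```python
import pathlib

def resize_memory_limit(cgroup_dir, limit_bytes):
    """Change a container's memory limit on the fly by rewriting
    memory.limit_in_bytes; the kernel applies it without a restart."""
    limit_file = pathlib.Path(cgroup_dir) / "memory.limit_in_bytes"
    limit_file.write_text(str(limit_bytes))
    return int(limit_file.read_text())
```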

SLIDE 37

Migration of VMs and containers

  • Advantages of migration

– Useful in virtual clusters and data centers to:

  • Consolidate the infrastructure
  • Gain flexibility in failover
  • Balance the load

  • Drawbacks and issues

– Support required from the VMM
– Non-negligible migration overhead
– Migration over WAN is non-trivial


VM migration

  • Approaches to migrate virtual machine instances between physical machines:

– Stop and copy: the source VM is shut down and the VM image is transferred to the destination host, but the downtime can be too long

  • The VM image can be large and the network bandwidth limited

– Live migration: the source VM keeps running during the migration
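To see why stop-and-copy downtime can be too long, a back-of-the-envelope estimate (the 20 GB image and 1 Gbit/s link are illustrative numbers, not from the slides):

```python
def stop_and_copy_downtime_s(image_gb, bandwidth_gbps):
    # The whole image crosses the network while the VM is down:
    # convert GB to gigabits (x8), then divide by the link rate.
    return image_gb * 8 / bandwidth_gbps

# A 20 GB image over a 1 Gbit/s link keeps the VM down for about 160 s.
print(stop_and_copy_downtime_s(20, 1.0))
```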

SLIDE 38

Live VM migration

  • Before starting the live migration

– Setup phase: the destination host is selected (e.g., with the goal of load balancing, energy efficiency, or server consolidation)

  • What to migrate? Memory, storage and network connections
  • How? Transparently to the applications running on the VM

– Cost of live migration: there is still some application downtime


Live VM migration: storage and network

  • To migrate storage:

– Use storage shared by the source and destination hosts

  • SAN (Storage Area Network) or the cheaper NAS (Network Attached Storage) or a distributed file system

– Without shared storage: the source VMM saves all the data of the source VM into an image file, which is then transferred to the destination host

  • To migrate network connections:

– The source VM has a virtual IP address (possibly also a virtual MAC address)

  • The VMM knows the mapping between virtual IP and VM

– If source and destination are on the same IP subnet, no forwarding at the source is needed

  • The destination sends an unsolicited ARP reply to announce that the IP address has moved to a new location, thus updating the ARP tables

SLIDE 39

Live VM migration: memory

  • To migrate memory (including CPU registers):

1. Pre-copy phase: the VMM iteratively copies pages from the source VM to the destination VM while the source VM keeps running
2. Stop-and-copy phase: the source VM is stopped and only the dirty pages are copied

  • Downtime: from a few msec to a few sec, depending on memory size, application type and network bandwidth

3. Commitment and reactivation phases: the destination VM loads the state and resumes execution; the source VM is removed (and the source host possibly powered off)

  • Called the pre-copy approach

– The VM state is copied from source to destination before the VM execution resumes at the destination
– It is the standard solution (e.g., in KVM)
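The iterative pre-copy loop described above can be sketched as a toy simulation (not hypervisor code; page count and dirtying rate are made-up parameters):

```python
import random

def pre_copy(num_pages, dirty_rate, max_rounds, rng):
    """Iteratively resend pages dirtied while copying; whatever is left
    after max_rounds must be sent during the final stop-and-copy phase."""
    to_send = set(range(num_pages))      # round 1: copy every page
    for _ in range(max_rounds):
        if not to_send:
            break
        # While these pages are on the wire, the running VM dirties others.
        to_send = {p for p in range(num_pages) if rng.random() < dirty_rate}
    # Pages still dirty at the end determine the stop-and-copy downtime.
    return len(to_send)

leftover = pre_copy(num_pages=1000, dirty_rate=0.05, max_rounds=5,
                    rng=random.Random(42))
# Far fewer pages than the initial 1000 remain: downtime shrinks accordingly.
```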


VM live migration: overall process


Source: C. Clark et al., “Live Migration of Virtual Machines”, NSDI’05.

SLIDE 40

VM live migration: alternatives for memory

  • Pre-copy cannot migrate in a transparent manner

workloads that are CPU and/or memory intensive

  • 1. Alternative approach: post-copy

– Post-copy moves the execution to the destination host at the beginning of the migration process and then transfers the memory pages in an on-demand manner as they are requested by the VM

  • 2. Alternative approach: hybrid

– A special case of post-copy migration: post-copy preceded by a limited pre-copy stage
– Idea: a subset of the most frequently accessed memory pages is transferred before the VM execution is switched to the destination, so as to reduce performance degradation after the VM is resumed

  • No standard implementation of post-copy and hybrid

approaches in current hypervisors


Approaches for migrating memory


Courtesy of C.Vojtech, http://bit.ly/2h7wSWB

SLIDE 41

Live VM migration and hypervisors

  • Supported by commercial and open-source hypervisors

– E.g., KVM, Hyper-V, Xen, VirtualBox

  • Migration over WAN networks has only limited support


VM migration in WAN environments

  • So far we focused on VM migration within a single

data center

  • How to achieve live migration of VMs across multiple

geo-distributed data centers?

SLIDE 42

VM migration in WAN environments: storage

  • Some approaches to migrate storage in a WAN

– Shared storage

  • Cons: storage access time can be too slow

– On-demand fetching

  • Transfer only some blocks on the destination and then fetch

remaining blocks from the source only when requested

  • Cons: it does not work if the source crashes

– Pre-copy/write throttling

  • Pre-copy the disk image of the VM to the destination whilst

the VM continues to run, keep track of write operations on the source (delta) and then apply the delta on the destination

  • If the write rate at the source is too fast, use write throttling to

slow down the VM so that migration can proceed

Valeria Cardellini - SDCC 2017/18 82

VM migration in WAN environments: network

  • Some approaches to migrate network connections in

a WAN

– IP tunneling

  • Set up an IP tunnel between the old IP address at the source

and the new VM IP address at the destination

  • Use the tunnel to forward all packets that arrive at the source

for the old IP address

  • Once the migration has completed and the VM can respond at

its new location, update the DNS entry with the new IP address

  • Tear down the tunnel when no connections remain that use the old IP address
  • Cons: it does not work if the source crashes

– Virtual Private Network (VPN)

  • Use MPLS-based VPN to create the abstraction of a private

network and address space shared by multiple data centers

– Software-Defined Networking

  • Change the control plane, no need to change IP address!

SLIDE 43

Migration in container-based virtualization

  • Up to now we focused on live migration of virtual

machines

  • What about live migration of containers?
  • As for VM migration we need to:

– Save state
– Transfer state
– Restore from state

  • State saving, transferring and restoring happen with tasks frozen (migration downtime)

– Use memory pre-copy or memory post-copy

  • More complicated than VM migration


Migration in container-based virtualization (2)


  • Use CRIU project and P.Haul
  • CRIU: for checkpointing/restoring in userspace
  • P.Haul: on top of CRIU, for pre-checks, memory pre-

copy and post-copy, and file system migration

SLIDE 44
  • Let us take a look at:

– Storage virtualization
– Network virtualization
– Cluster virtualization


Storage virtualization

  • Decouple the physical organization of the storage from

its logical representation

– “Storage virtualization means that applications can use storage without any concern for where it resides, what the technical interface is, how it has been implemented, which platform it uses, and how much of it is available” (R. van der Lans)

  • Two primary types of storage virtualization

– Block level

  • Aggregate multiple network storage devices into a single

block-level substrate, present to users a logical space for data storage and handle the process of mapping it to the actual physical location

– File level

  • Decouple data access from location where files are

physically stored (e.g., distributed file system)
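A hypothetical sketch of the block-level mapping just described: a flat logical address space is laid over extents of several physical devices, and the virtualization layer resolves each logical block to its actual physical location (device names and sizes are invented for illustration):

```python
class LogicalVolume:
    """Map a flat logical block space onto extents of physical devices."""
    def __init__(self, extents):
        # extents: list of (device_name, device_offset, length_in_blocks)
        self.extents = extents

    def resolve(self, logical_block):
        """Return (device, physical_block) for a logical block number."""
        base = 0
        for device, offset, length in self.extents:
            if logical_block < base + length:
                return device, offset + (logical_block - base)
            base += length
        raise IndexError("logical block out of range")

vol = LogicalVolume([("diskA", 0, 100), ("diskB", 50, 200)])
vol.resolve(150)   # falls in the second extent, on diskB
```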

SLIDE 45

Storage virtualization: SAN

  • Storage Area Networks (SAN): the most common

solution for block-level storage virtualization

– SAN uses a network-accessible device through a large bandwidth connection to provide storage facilities

  • Fiber Channel (FC): high-speed

network technology primarily used to connect storage

– Requires special-purpose cabling
– For high performance requirements

  • Internet SCSI (iSCSI): IP-based

protocol for linking data storage facilities

– Uses existing network infrastructures
– For moderate performance requirements


Network virtualization

  • Process of abstraction which separates logical

network behavior from the underlying physical network resources

– Virtualizable network resources: Network Interface Card (NIC), L2 switch, L2 network, L3 router, L3 network

  • A method of combining the available resources in a

network by splitting up the available bandwidth into channels

– Each channel is independent and can be assigned (or reassigned) to a server or device in real-time

  • Virtualization disguises the true complexity of the

network by separating it into manageable parts

SLIDE 46

Network virtualization (2)

  • At the physical machine level (also internal network

virtualization): using VMM features

– Goal: to create a “network in the box”
– Various options, including Network Address Translation (NAT)

  • Among multiple systems (also external network

virtualization): using virtual LAN (VLAN) technology and “intelligent” (layer 3) switches

– VLAN: a group of hosts with a common set of requirements that communicate as if they were attached to the same broadcast domain, regardless of their physical location


Software-Defined Networking

  • “Software-Defined Networking (SDN) is an emerging

network architecture where network control is decoupled from forwarding and is directly programmable” (www.opennetworking.org)

  • Characterized by four distinguished features


Reference: R. Jain and S.Paul, “Network Virtualization and Software Defined Networking for Cloud Computing: A Survey”, IEEE Comm., 2013.

– Decoupling the control plane from the data plane
– Centralization of the control plane
– Programmability of the control plane
– API standardization

  • OpenFlow is the first standard interface designed for SDN

SLIDE 47

Virtual clusters

  • Virtual cluster nodes: either physical or virtual

machines

– The VMs/containers in a virtual cluster are interconnected logically by a virtual network across several physical networks

  • VMs/containers can be replicated and/or migrated on

multiple physical nodes to achieve elasticity, fault tolerance, and disaster recovery

– Also the size (number of nodes) of a virtual cluster can grow or shrink dynamically

  • How to efficiently store large number of VM images?

– VM images by hypervisors are large (typically 1-30 GB in size)
– Container-based virtualization helps in reducing the image size


Sharing resources in virtual clusters


  • Need to run multiple frameworks on a single (physical or virtual) cluster
  • How to share the cluster resources among multiple and

non homogeneous frameworks executed in VMs/ containers?

  • The classical solution:

Static partitioning

  • Is it efficient?

SLIDE 48

What we need

  • The Datacenter as a Computer idea by D.

Patterson

– Share resources to maximize their utilization
– Share data among frameworks
– Provide a unified API to the outside
– Hide the internal complexity of the infrastructure from applications

  • The solution: a cluster-scale resource

manager that employs dynamic partitioning


Apache Mesos


Dynamic partitioning

  • Cluster manager that provides a common resource

sharing layer over which diverse frameworks can run

  • Abstracts the entire datacenter into a single pool of

computing resources, simplifying running distributed systems at scale

  • A distributed system to run distributed systems on top of it
SLIDE 49

Apache Mesos (2)


  • Designed and developed at UC Berkeley
  • Top open-source project by Apache mesos.apache.org
  • Twitter and Airbnb as first users; now supports some of

the most popular apps (e.g., Siri, Uber, Yelp)

  • Cluster: a dynamically shared pool of resources


Mesos in the data center

  • Where does Mesos fit as an abstraction layer

in the datacenter?

SLIDE 50

Apache Mesos: architecture


  • Master-slave architecture
  • Slaves publish available resources to the master
  • Master sends resource offers to frameworks
  • Master election and service discovery via ZooKeeper

Apache Mesos: resource offers

  • Mesos master offers resources to frameworks

– Framework selected by Dominant Resource Fairness (DRF) algorithm
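A simplified sketch of the DRF idea (capacities and usages below are invented; the real Mesos allocator also handles roles and weights): each offer goes to the framework whose dominant share, i.e. its largest fractional use of any single resource, is currently the smallest.

```python
def dominant_share(usage, capacity):
    """A framework's dominant share: its max fraction of any resource."""
    return max(usage[r] / capacity[r] for r in capacity)

def next_offer(frameworks, capacity):
    """Offer resources to the framework with the minimum dominant share."""
    return min(frameworks, key=lambda f: dominant_share(frameworks[f], capacity))

capacity = {"cpu": 9, "mem": 18}        # total cluster resources
frameworks = {
    "A": {"cpu": 2, "mem": 8},          # dominant share: mem 8/18 ~ 0.44
    "B": {"cpu": 3, "mem": 2},          # dominant share: cpu 3/9  ~ 0.33
}
next_offer(frameworks, capacity)        # "B" receives the next offer
```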

SLIDE 51

Container orchestration

  • Container orchestration: set of capabilities to provision,

deploy, schedule (i.e., place, migrate and replicate), monitor, and control at runtime (i.e., reconfigure) multi-container packaged applications (in the Cloud)

– Tools used to manage containers at scale


  • Some container orchestration engines

– Docker Compose and Docker Swarm
– Cloudify
– Kubernetes

See http://bit.ly/2j2uiFr

Kubernetes

  • Google’s open-source platform for automating

deployment, scaling, and operations of application containers across clusters of hosts, providing container-centric infrastructure

http://kubernetes.io

  • Features:

– Portable: public, private, hybrid, multi-cloud
– Extensible: modular, pluggable, hookable, composable
– Self-healing: auto-placement, auto-restart, auto-replication, auto-scaling
– Can run on Mesos

  • Offered as Cloud service on Google Cloud Platform

– Kubernetes management and deployment on underlying infrastructure is up to Cloud provider
