
Contributions to Large Scale Distributed Systems: The Infrastructure Viewpoint. Adrien Lebre, September 1, 2017. President and Examiner: Claude Jard, Nantes Univ. Reviewers: Erik Elmroth, Umeå Univ.; Frédéric Desprez, Inria; Manish …


1-3. Utility Computing Infrastructures
• A common objective: provide computing resources (both hardware and software) in a flexible, transparent, efficient, secure and reliable way.
• Distributed infrastructures (since the 1990s).
• A lot of challenges: data sharing, software/hardware heterogeneity, workload placement, isolation between applications, performance…
[Diagram: a frontend (resource management system) dispatches Alice's and then Bob's workloads onto compute nodes and storage nodes (distributed file system); each new request raises the question of where to place it]

4-5. (Dynamic VM) Placement Contributions
• Research activities mainly supported by IMT Atlantique and Inria.
• PhD of F. Quesnel (co-supervised with M. Südholt): Distributed VM Scheduler.
• PhD of J. Pastor (co-supervised with F. Desprez): Locality-Aware Placement.
• With J.-M. Menaud and F. Hermenier: Cluster-Wide Context Switch.
[Timeline: 10/2008 to 10/2017]

6. Placement Problem
• Jobs 1, 2, 3, and 4 arrive in the queue and have to be scheduled (processors vs. running time).
• FCFS + EASY backfilling: although Jobs 2 and 3 have been backfilled, some resources remain unused (dark gray areas).
• EASY backfilling with preemption: Job 4 can be started earlier without impacting Job 1's performance.
• However, jobs cannot be easily preempted (OS internal states), and even with preemption some resources are still wasted.
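As a concrete illustration of the policies above, here is a minimal sketch of FCFS with EASY backfilling (hypothetical job structure and helper names, not the batch scheduler discussed in the manuscript; preemption, used in the third scenario of the slide, is not modelled): a waiting job may jump ahead only if it fits in the currently idle processors and finishes before the reserved start time of the job at the head of the queue.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    procs: int      # processors requested
    runtime: int    # estimated runtime (time units)

def schedule_pass(queue, total_procs, running):
    """One FCFS + EASY backfilling pass at the current instant.

    queue   : list of waiting Jobs (head first), modified in place
    running : list of (Job, remaining_time) pairs already on the machine
    Returns the jobs started now."""
    started = []
    free = total_procs - sum(j.procs for j, _ in running)

    # 1. FCFS: start jobs from the head of the queue while they fit.
    while queue and queue[0].procs <= free:
        job = queue.pop(0)
        free -= job.procs
        started.append(job)

    if not queue:
        return started

    # 2. Reservation: earliest time at which the head job gets enough processors.
    head = queue[0]
    avail, reserved_start = free, None
    occupants = running + [(j, j.runtime) for j in started]
    for job, finish in sorted(occupants, key=lambda p: p[1]):
        avail += job.procs
        if avail >= head.procs:
            reserved_start = finish
            break

    # 3. EASY backfilling: a later job may start now only if it fits in the idle
    #    processors and finishes before the head job's reserved start time.
    for job in list(queue[1:]):
        if job.procs <= free and (reserved_start is None or job.runtime <= reserved_start):
            queue.remove(job)
            free -= job.procs
            started.append(job)
    return started

# Example: J1 must wait for J0 to release its processors; J4 (3 time units)
# fits in the gap without delaying J1's reservation.
queue = [Job("J1", procs=4, runtime=10), Job("J2", procs=2, runtime=4),
         Job("J3", procs=2, runtime=4), Job("J4", procs=2, runtime=3)]
print([j.name for j in schedule_pass(queue, total_procs=4,
                                     running=[(Job("J0", 2, 3), 3)])])  # -> ['J4']
```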

7-17. Virtual Machine: The New Building Block
• System virtualization: one to multiple OSes on a physical machine (PM) thanks to a hypervisor (an operating system of OSes).
• VM capabilities: suspend/resume, live migration.
[Animation: VM 1, VM 2 and VM 3 run on top of the hypervisor of one PM; VM 3 is then live-migrated to the hypervisor of a second PM]

18-21. From Jobs to Virtualised Jobs
• A job is now encapsulated in one or several VMs.
[State diagram: jobs move between waiting, running, sleeping, ready and terminated states through run, stop, suspend, resume and migrate operations]
• Challenge: maintain viable mappings between VMs and PMs, knowing that each VM consumes CPU, RAM…
credits: F. Hermenier, OSDI poster session 2008

22. From Jobs to Virtualised Jobs
• Maintain viable mappings between VMs and PMs.
• MAPE control loop (leveraging the Entropy framework).
• Make the reconfiguration phase automatic: cluster-wide context switch. [VTDC2010]
[Diagram: the infrastructure (PM 1, PM 2, PM 3, each with a hypervisor) moves from its current, non-viable status to a correct status; candidate reconfiguration plans have different costs (e.g., 3 vs. 2) and the cheapest viable one is applied]
credits: F. Hermenier, OSDI poster session 2008
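A minimal, self-contained sketch of such a MAPE-style loop (all names are illustrative; the actual system delegates the plan phase to the Entropy constraint solver rather than the greedy first-fit used here):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VM:
    name: str
    cpu: int
    ram: int

@dataclass
class PM:
    name: str
    cpu_capacity: int
    ram_capacity: int
    vms: List[VM] = field(default_factory=list)

def viable(pms):
    """A configuration is viable if every PM can satisfy its VMs' demands."""
    return all(sum(v.cpu for v in pm.vms) <= pm.cpu_capacity and
               sum(v.ram for v in pm.vms) <= pm.ram_capacity
               for pm in pms)

def plan_reconfiguration(pms):
    """Plan phase: return a list of (vm, source, destination) migrations.
    Entropy uses a constraint solver; this greedy first-fit only stands in for it."""
    moves = []
    for src in pms:
        while (sum(v.cpu for v in src.vms) > src.cpu_capacity or
               sum(v.ram for v in src.vms) > src.ram_capacity):
            vm = src.vms[-1]
            dst = next((p for p in pms if p is not src and
                        sum(v.cpu for v in p.vms) + vm.cpu <= p.cpu_capacity and
                        sum(v.ram for v in p.vms) + vm.ram <= p.ram_capacity), None)
            if dst is None:
                break                      # no viable destination: give up on this PM
            src.vms.remove(vm)
            dst.vms.append(vm)
            moves.append((vm.name, src.name, dst.name))
    return moves

def mape_iteration(pms):
    """One Monitor/Analyze/Plan/Execute iteration (Execute would trigger live migrations)."""
    if viable(pms):                        # Monitor + Analyze
        return []
    return plan_reconfiguration(pms)       # Plan

# Example: PM1 is overloaded; one iteration migrates a VM away.
pms = [PM("PM1", cpu_capacity=4, ram_capacity=8,
          vms=[VM("vm1", 2, 4), VM("vm2", 2, 4), VM("vm3", 2, 4)]),
       PM("PM2", cpu_capacity=4, ram_capacity=8)]
print(mape_iteration(pms))                 # -> [('vm3', 'PM1', 'PM2')]
```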

23. Cluster-Wide Context Switch: Evaluations
• Scheduling policy: a FIFO queue (priority between jobs to prevent starvation).
• Testbed (further details in the manuscript): 11 working nodes (22 CPUs); a queue of 8 vjobs (NASGrid benchmarks); each job uses 9 VMs (9 CPUs).
• Cumulated completion times have been reduced by 40%.

24-25. Cluster-Wide Context Switch: Evaluations
• Same setup as above (FIFO queue, 11 working nodes / 22 CPUs, 8 vjobs of 9 VMs each); cumulated completion times reduced by 40%.
• But what about scalability/reactivity beyond 11 nodes / 72 VMs? A Google data center is orders of magnitude larger…
credits: A. Simonet, Introduction to Cloud Computing Lecture - Inside a Google DC

26. Scalability/Reactivity Challenge
• Computing phase: an NP-hard problem in most cases.
• Most works have focused on heuristics to reduce the computing phase, but reconfiguring the infrastructure is time consuming too!
• Timer: 1. Monitoring, 2. Computing, 3. Reconfiguring, repeated over time.
credits: F. Quesnel, PhD defense 2013

27-29. Scalability/Reactivity Challenge
• Computing phase: an NP-hard problem in most cases.
• Most works have focused on heuristics to reduce the computing phase, but reconfiguring the infrastructure is time consuming too!
[Chart: CPU load of VM 1 and VM 2 over time; while the monitoring, computing and reconfiguring phases (1, 2, 3) run, the question "is the configuration still viable?" keeps arising]
• Can we reduce all phases?
credits: F. Quesnel, PhD defense 2013

30. Leverage P2P Algorithms
• Make dynamic partitioning of the system according to the effective usage of resources.
• Make direct cooperation between hypervisors (no service node).
• Distributed Virtual Machine Scheduler (DVMS): an event-driven, P2P-like system. [CCPE2012]
• Local interactions between nodes: when an event occurs on a node, the current partition tries to compute a valid schedule; if it cannot, it contacts a neighbour and asks it to join and solve the problem (see the sketch below).
• Scheduling is performed on partitions of the system, created dynamically (nodes are reserved for exclusive use by a scheduler, to prevent several schedulers from migrating the same VMs).
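A minimal sketch of the event-driven partition growth described above, on a ring of nodes (toy viability test and data structures; the real DVMS protocol also handles node reservations, concurrent events and partition destruction):

```python
def solvable(partition, capacity=10):
    """Toy viability test: the partition can host its load if the total load fits
    the aggregated capacity (stands in for the real constraint-solver call)."""
    return sum(load for _, load in partition) <= capacity * len(partition)

def handle_event(nodes, origin):
    """An overload event occurs on `origin`; grow a partition around it,
    node by node along the ring, until a valid schedule can be computed."""
    n = len(nodes)
    partition = [(origin, nodes[origin])]
    nxt = (origin + 1) % n
    while not solvable(partition):
        if nxt == origin:                 # whole ring reserved, still no solution
            return None
        # Reserve the neighbour and ask it to (try to) solve the problem.
        partition.append((nxt, nodes[nxt]))
        nxt = (nxt + 1) % n
    return partition                      # nodes involved in the reconfiguration

# Example: node 2 is overloaded (load 25 > capacity 10); the partition grows
# until the aggregated capacity can absorb the load.
loads = [3, 4, 25, 2, 6, 5]               # per-node load, indexed by node id
print(handle_event(loads, origin=2))      # -> [(2, 25), (3, 2), (4, 6), (5, 5)]
```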

31-38. Understanding DVMS
[Animation over a ring of nine nodes: overload events create partitions that grow node by node until a schedule is found, with several partitions handled in parallel and released once solved]
credits: J. Pastor, PhD Defense 2016

39-40. DVMS Evaluations
• Development of a proof of concept (PoC).
• Evaluations (in vivo) up to 5K VMs: 2000 VMs on 251 PMs, 3325 VMs on 309 PMs, 4754 VMs on 467 PMs.
• IEEE SCALE Challenge 2013.
[Charts: number of PMs per Grid'5000 cluster (Griffon, Graphene, Paradent, Parapide, Parapluie, Sol, Suno, Pastel); time to apply a reconfiguration, duration of an iteration and time to solve an event (mean and standard deviation) for DVMS vs. Entropy at the three scales. Overlay annotations ask how the approach compares with other proposals and how it can be tested at larger scales.]

41-45. VM Placement: A (Hot Topic) Problem
• Lots of articles (too many?).
• Evaluations are performed either at a small scale for in vivo experiments or with ad hoc simulators.
• How can we evaluate/compare them?

46-47. VM Simulator Toolkits
• Research activities mainly supported by IMT Atlantique, Inria, the French ANR SONGS project, the Hemera Inria Large-Scale Initiative, the Discovery Inria Project Lab and the EU BigStorage project.
• T. Hirofuchi (postdoc/invited researcher): SimGridVM, VM abstractions.
• A. Simonet (postdoc): VMPlaceS, energy dimension.
• T. L. Nguyen (PhD): boot time model.
[Timeline 10/2008-10/2017, building on the earlier contributions: cluster-wide context switch, distributed VM scheduler, locality-aware placement]

48. Toward a VM PLACEment Simulator
• A dedicated simulator to:
• Evaluate/compare VM placement policies at large scale (and in a reproducible manner).
• Relieve researchers of the burden of dealing with VM creation and workload generation/injection.
• SimGrid as a base: a scientific instrument to study the behaviour of large-scale distributed systems.
• Design abstractions and models that let researchers control VMs in the same manner as in the real world (e.g., create/destroy, start/shutdown, suspend/resume and migrate).
• Focus on the migration model.

49-55. Accurate Live Migration Model
• Migration time is not a linear function of the size of the VM: the more memory-intensive the VM, the longer the migration.
[Chart: observed migration time (s) vs. memory update speed (MB/s), diverging from a naive linear approximation as the update speed grows]
• Live migration transfers the VM's state to the destination without a perceptible interruption of service (pre-copy algorithm):
1. Transfer all memory pages of the VM (the VM is still running at the source).
2. Transfer the memory pages updated during the previous step.
3. Iterate until the set of remaining dirty pages becomes small enough to meet an acceptable downtime (30 ms in KVM).
4. Stop the VM; transfer the remaining memory pages and the device states, then restart the VM on the destination.
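A back-of-the-envelope sketch of how migration time grows with the memory update speed under the pre-copy algorithm above (illustrative parameter values; the published model additionally accounts for network contention and the actual memory refresh behaviour):

```python
def precopy_migration_time(mem_mb, bw_mbps, dirty_mbps,
                           downtime_threshold_mb=1.0, max_rounds=30):
    """Estimate pre-copy live-migration duration (s) and transferred data (MB).

    mem_mb     : VM memory size
    bw_mbps    : migration bandwidth (MB/s)
    dirty_mbps : memory update (page dirtying) speed (MB/s)
    """
    total_time, total_data = 0.0, 0.0
    to_send = mem_mb                        # round 1: the whole memory
    for _ in range(max_rounds):
        t = to_send / bw_mbps               # time to push this round
        total_time += t
        total_data += to_send
        dirtied = dirty_mbps * t            # pages re-dirtied meanwhile
        if dirtied <= downtime_threshold_mb or dirty_mbps >= bw_mbps:
            to_send = dirtied               # final stop-and-copy round
            break
        to_send = dirtied
    total_time += to_send / bw_mbps         # downtime: VM stopped for the last copy
    total_data += to_send
    return total_time, total_data

# An idle VM migrates in roughly mem/bandwidth; a memory-intensive one takes longer.
print(precopy_migration_time(mem_mb=2048, bw_mbps=125, dirty_mbps=0))
print(precopy_migration_time(mem_mb=2048, bw_mbps=125, dirty_mbps=80))
```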

56. Accurate Live Migration Model
• Application memory footprints can be considered as linear functions (e.g., Apache, PostgreSQL).
• First accurate live migration model implementing the pre-copy strategy, validated on Grid'5000 against a naive simulation. [CloudCom 2013]
[Chart: migration time (s) vs. CPU utilization (%) for the Grid'5000 measurements, the pre-copy simulation and the naive simulation]
• The time and the resulting traffic of a migration should be computed by taking into account the competition arising in the presence of resource sharing and the memory refresh rate.

57. SimGrid VM [TCC 2015]
• SimGrid VM allows users to launch hundreds of thousands of VMs in their simulation programs and to control VMs in the same manner as in the real world.
• Users can execute computation and communication tasks on physical machines (PMs) and VMs through the same SimGrid API, which provides a seamless migration path to IaaS simulations for hundreds of SimGrid users.
• SimGrid without VM support: solve all the constraint problems at once, e.g. X1 + X2 + X3 < C for three tasks on a PM of capacity C.
• SimGrid with VM support: solve the constraint problems at the physical machine layer first (X1 + X2 < C for VM1 and VM2), then at the virtual machine layer (X1,1 + X1,2 < X1 for the tasks inside VM1, X2,1 < X2 for the task inside VM2).
• All extensions have been integrated into SimGrid.

58. VMPlaceS
• A three-step engine to evaluate VM placement strategies.
• Input: infrastructure topology, number of VMs, workloads.
• Initialization phase, then injector/scheduling phase: the injector injects events (CPU variations, node crashes…) while the scheduler reacts to them.
• Researchers should (only) develop their scheduling algorithm in Java (or Scala) using the SimGrid MSG API and a more abstract interface provided by VMPlaceS.
• Analysis phase. Output: a JSON trace file, consumed by an R statistics system to deliver tables/graphs (VMPlaceS records several metrics during the simulation execution).
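The overall injector/scheduler structure can be pictured with the sketch below (written in Python for brevity; none of these names belong to the real VMPlaceS Java/Scala API, and the "scheduler" is reduced to a simple violation check):

```python
import heapq
import random

def run_simulation(pms, duration=1800, scheduling_period=30, seed=0):
    """Injector/scheduling phase: interleave injected load events with
    periodic scheduler invocations, and record violation metrics."""
    rng = random.Random(seed)
    events = [(t, "scheduler") for t in range(0, duration, scheduling_period)]
    # Injector: random CPU-variation events on random PMs.
    events += [(rng.uniform(0, duration), "load_change") for _ in range(10 * len(pms))]
    heapq.heapify(events)

    trace = []                                   # would become the JSON trace file
    while events:
        t, kind = heapq.heappop(events)
        if kind == "load_change":
            pm = rng.choice(pms)
            pm["load"] = max(0.0, min(1.0, rng.gauss(0.6, 0.2)))
        else:                                    # periodic scheduler invocation
            overloaded = [pm["name"] for pm in pms if pm["load"] > 1.0 - 1e-9]
            trace.append({"time": t, "violations": overloaded})
    return trace

pms = [{"name": f"PM{i}", "load": 0.5} for i in range(8)]
print(len(run_simulation(pms)), "scheduler invocations recorded")
```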

59. VMPlaceS: A First Use-Case [EuroPar2015]
• To illustrate how different strategies can be evaluated/compared: centralized (Entropy [VEE'09]), hierarchical (Snooze [CCGRID'12], with a group leader, group managers and local controllers) and distributed (DVMS [CCPE'12]).
• Each strategy iterates over the same three phases: 1. resource monitoring, 2. computing a viable scheduling, 3. applying the reconfiguration actions.
• Simulation input parameters:
• PMs: 8 cores, 32 GB, 1 Gbps; 7 cores are considered.
• VMs: 1 core, 1 GB, 1 Gbps; memory footprint varies between 0 and 80%.
• VM CPU load: Gaussian (μ=60, σ=20).
• 10 VMs per PM; cluster infrastructure composed of 128/256/512/1024 PMs.
• Duration: 1800 seconds; period of scheduling invocations: 30 seconds.

60. Entropy/Snooze/DVMS Analysis
[Chart: cumulated violation time (s) for the Centralized, Hierarchical and Distributed strategies and without scheduling, on 128/256/512/1024 nodes hosting 1280/2560/5120/10240 VMs]
• The centralized strategy looks useless?

61. Entropy/Snooze/DVMS Analysis
[Chart: duration of each violation (s) over time (0-3500 s), distinguishing first detections and false positives for Entropy and DVMS]
• Another view focusing on Entropy and DVMS.

62-65. Entropy/Snooze/DVMS Analysis
[Table: average and standard deviation of the monitoring, computation and reconfiguration times for each strategy]
• DVMS outperforms the others!?
• While the centralized approach does not scale, both phases are constant from the time viewpoint for the two other approaches.
• 1. Can we find a good partitioning size for Snooze? 2. What would be the benefit for Snooze of a reactive approach?

66-70. Investigate Variants
• Evaluate the impact of having smaller partitions in Snooze: same numbers of PMs, but partitions grow from 2 LCs to 32 LCs per GM.
[Chart: cumulated violation time (s) for Hierarchical with 2/4/8/32 LCs per GM, on 128/256/512/1024 nodes hosting 1280/2560/5120/10240 VMs]
• The smaller the partition, the higher the probability of not finding a viable solution.
• Other variants and possible improvements (for instance, contact neighbours two by two in DVMS).

71. VMPlaceS / VM Simulator Toolkits
• Difficulties in conducting relevant evaluations of VM placement strategies (in vivo conditions, lots of metrics to monitor, scalability/reactivity, …).
• VMPlaceS, a framework providing:
• Programming support for the definition of new VM placement strategies.
• Execution support for their accurate simulation at large scale.
• Means to analyze the collected traces.
• Validated up to 10K PMs / 100K VMs. [TPDS submission under review]
• Available online: http://beyondtheclouds.github.io/VMPlaceS/
• On-going and future work:
• Collect energy metrics.
• VM boot time. [PDP2017]
• VM image migrations (storage challenge).
• Workloads reproducing real traces (complex to get real traces).
• Provide similar abstractions for container technologies (must have).

72. Beyond the Clouds
• Research activities mainly supported by IMT Atlantique, Inria, the French ANR SONGS project, the Hemera Inria Large-Scale Initiative, the Discovery Inria Project Lab and the EU BigStorage project.
• Research engineers R.-A. Cherrueau and M. Simonin: EnOS; OpenStack: from SQL to NoSQL backends; Discovery vision.
[Timeline 10/2008-10/2017, building on VMPlaceS, SimGridVM, locality-aware placement, distributed VM scheduler and cluster-wide context switch]

  73. UTILITY COMPUTING From mainframes to …

  74. UTILITY COMPUTING From mainframes to … …larger “mainframes” Microsoft DC, Quincy, WA state

75. Major Brakes for the Adoption of the CC Model (2012-2013)
• Jurisdiction concerns.
• Reliability.
• Distance (network overheads).

76-77. Discovery Vision
• Bring clouds back to the cloud. [VHPC2011] [Discovery2013]
• Leverage the concept of µDC/nDC to extend any point of presence of network backbones (aka PoP) with servers.
• From network hubs up to major DSLAMs that are operated by telecom companies, network institutions… (e.g., the Géant, RENATER and Internet2 backbones).
• How can such a massively distributed infrastructure be operated and used from the software viewpoint?

78-80. What about Brokering Approaches?
• Sporadic usage (hybrid computing / cloud bursting) is almost ready for production.
• Brokers are rather limited to simple usages, not advanced administration operations.
• Advanced brokers must reimplement standard IaaS mechanisms while facing the API limitations.
[Diagram: Alice, Bob and Charles each access several clouds through a broker]

81. Would OpenStack Be the Solution?
• Do not reinvent the wheel… it's too late: OpenStack is about 20 million LOC, 3M just for the core services.
• Discovery objectives (overview):
• Study to what extent the current OpenStack mechanisms can handle such massively distributed infrastructures.
• Propose revisions/extensions of internal mechanisms when appropriate.
• From SQL to NoSQL backends in OpenStack (a research PoC, just the tip of the iceberg, numerous challenges). [IC2E 2017]
• Toward a holistic framework for conducting scientific evaluations of OpenStack: EnOS, a tool for diving into OpenStack and performing scientific investigations. [CCGRID 2017]

82. Conclusion / Future Work
• Research activities mainly supported by IMT Atlantique, Inria, the French ANR SONGS project, the Hemera Inria Large-Scale Initiative, the Discovery Inria Project Lab and the EU BigStorage project.
[Timeline 10/2008-10/2017 recapping the contributions: cluster-wide context switch, distributed VM scheduler, locality-aware placement, SimGridVM, VMPlaceS, Discovery vision, OpenStack from SQL to NoSQL backends, EnOS, STACK proposal]

83-84. Conclusion / Future Work
• Virtualization technologies play a key role in the adoption of cloud computing (flexibility, portability), but at a cost:
• Complexity of the software stack.
• Difficulty to guarantee performance.
• Placement challenges:
• How to express placement constraints? [plasma2013] is a good starting point.
• Can we consider the network and storage dimensions (e.g., a Map/Reduce framework leveraging attached storage facilities)?
• People expect container technologies to help, but:
• Similar consolidation issues ("what you expect" vs. "what you may have").
• Naive use (containers on top of VMs on top of PMs).
• Current trend: server densification (more cores per PM, more RAM…).
