Improving Resource Efficiency in Cloud Computing
Christina Delimitrou
Stanford University
Defense – May 26th, 2015

Resource efficiency is a first-order system constraint: how efficiently do we utilize resources?
¨ Flexibility
¤ Provision and launch new services in seconds
¨ High performance
¤ High throughput & low tail latency
¨ Cost effectiveness
¤ Low capital & operational expenses
¨ Switch to commodity servers
¨ Improve cooling/power distribution
¨ Build more datacenters
¨ Add more servers
¨ Rely on processor technology
1 C. Delimitrou and C. Kozyrakis. "Quasar: Resource-Efficient and QoS-Aware Cluster Management." ASPLOS 2014.
2 L. A. Barroso, U. Hölzle. "The Datacenter as a Computer." 2013.
[Figure: distribution of CPU utilization (%) across servers, 0–100]
¨ Twitter: up to 5x CPU & up to 2x memory overprovisioning
¨ 20% of jobs under-sized, ~70% of jobs over-sized [ASPLOS'14]
[Figure: reserved vs. used resources per job; the diagonal marks Reservation = Usage]
¨ Automate resource management
¤ Large, multi-dimensional space → leverage big data
¨ General solution
¤ Different application types (batch, latency-critical)
¤ Different types of hardware
¨ Cross-layer design
¤ Architecture → OS → Scheduler → Application design
[IISWC’13]
Resource reservations
ISPASS’11]
[ASPLOS’13, TopPicks’14]
¨ Heterogeneity
¤ DCs provisioned over 15 years
¤ Multiple server generations & configurations
¨ Interference
¤ Apps contend on shared resources:
n CPU & cache hierarchy
n Memory system
n Storage & network I/O
[Figure: performance when ignoring heterogeneity, interference, or both]
¨ Naïve: exhaustive characterization
¤ ~10–20 platforms × 1,000 apps
¨ Looks like a recommendation problem
¨ Content-based systems:
¤ Description of items (keywords, feature vector, etc.)
¤ Profile of user preferences (history, model, user-system interaction)
¨ Collaborative filtering:
¤ Uncovers similarities between users and items
¤ No need to know item features or explicit user preferences in advance
¨ Collaborative filtering – similar to the Netflix Challenge system
¤ Singular Value Decomposition (SVD) + PQ reconstruction via stochastic gradient descent (SGD)
[Figure: sparse utility matrix (users × movies) → SVD → PQ reconstruction → dense utility matrix → recommendations]
SVD of the utility matrix:

A = U Σ Vᵀ

¤ A (m × n): utility matrix with entries a_ij
¤ U (m × r): left singular vectors u₁ … u_m (one row per user)
¤ Σ = diag(σ₁, …, σ_r): singular values
¤ V (n × r): right singular vectors (one row per item)
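As a concrete sketch of the decomposition above — the matrix values and sizes here are illustrative, not from the talk — NumPy computes A = U Σ Vᵀ directly, and truncating to the top-r singular values gives the best rank-r approximation:

```python
import numpy as np

# Toy dense utility matrix A (3 users x 4 movies); values are illustrative.
A = np.array([[5., 4., 1., 3.],
              [2., 4., 1., 5.],
              [3., 3., 5., 2.]])

# SVD: A = U * diag(sigma) * V^T
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

# Keeping only the top-r singular values yields the best rank-r
# approximation of A in the Frobenius norm (Eckart-Young theorem).
r = 2
A_r = U[:, :r] @ np.diag(sigma[:r]) @ Vt[:r, :]
print(np.linalg.norm(A - A_r))  # low-rank approximation error
```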
The Netflix utility matrix (users × movies) maps directly to scheduling: rows become applications (App A … App N), columns become server platforms (Platform 1 … Platform M), and entries are measured performance.

Profiled performance (each incoming app is briefly sampled on two platforms):
App A: 1,500 QPS, 843 QPS
App B: 987 QPS, 1,836 QPS
App N: 9,893 QPS, 7,686 QPS

Inferred performance (collaborative filtering fills in the remaining platforms):
App A: 843, 675, 1,786, 8,675 QPS
App B: 458, 773, 986, 1,073 QPS
App N: 1,354, 786, 1,118, 997 QPS

Performance metric depends on app type: QPS, completion time, IPC, …
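The missing entries can be filled by PQ reconstruction: stochastic gradient descent fits a low-rank factorization R ≈ P·Qᵀ to the observed entries only, then P·Qᵀ predicts the rest. A minimal sketch — the matrix values (in kQPS), rank, and hyperparameters below are made up for illustration, not the system's internals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Apps x platforms performance matrix in kQPS; NaN marks unprofiled entries.
R = np.array([[1.500, 0.843, np.nan, np.nan],
              [np.nan, 0.946, 0.458, np.nan],
              [1.016, np.nan, np.nan, 0.186]])

m, n, r = R.shape[0], R.shape[1], 2
P = rng.standard_normal((m, r)) * 0.5  # per-app latent factors
Q = rng.standard_normal((n, r)) * 0.5  # per-platform latent factors

# Train only on the observed (profiled) entries.
obs = [(i, j) for i in range(m) for j in range(n) if not np.isnan(R[i, j])]
lr, reg = 0.05, 1e-3

for _ in range(3000):
    for i, j in obs:
        err = R[i, j] - P[i] @ Q[j]
        P[i] += lr * (err * Q[j] - reg * P[i])
        Q[j] += lr * (err * P[i] - reg * Q[j])

R_hat = P @ Q.T  # dense matrix: inferred performance for every (app, platform)
rmse = np.sqrt(np.mean([(R[i, j] - R_hat[i, j]) ** 2 for i, j in obs]))
```

After training, `R_hat` contains a performance estimate for every (app, platform) pair, including the ones that were never profiled.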
The same reconstruction applies to interference sensitivity: columns become sources of interference (L1-i$, LLC, mem bw, CPU, int, I/O bw, net bw), and entries score sensitivity from 0 to 100.

Profiled sensitivity:
App A: 95, 56
App B: 92, 78
App N: 45, 49

Inferred sensitivity:
App A: 81, 7, 43, 100
App B: 4, 14, 81, 18
App N: 54, 56, 11, 99
¨ Cross-application profiling: infeasible
¨ Measuring in hardware: platform-dependent & inaccurate
¨ iBench1: set of microbenchmarks of tunable intensity

1 "iBench: Quantifying Interference for Datacenter Applications" [IISWC'13]
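The tunable-intensity knob can be illustrated with a duty-cycle loop. The real iBench antagonists are native microbenchmarks, each targeting one specific shared resource; this Python sketch (function name and parameters are my own) only conveys the idea of dialing contention from 0 to 1:

```python
import time

def antagonist(intensity, period=0.05, duration=0.5, footprint=1 << 20):
    """Apply pressure for `intensity` fraction of each period.

    intensity: 0.0 (idle) .. 1.0 (continuous contention)
    footprint: size of the buffer streamed through to generate memory traffic
    """
    assert 0.0 <= intensity <= 1.0
    buf = bytearray(footprint)
    end = time.monotonic() + duration
    while time.monotonic() < end:
        busy_until = time.monotonic() + intensity * period
        while time.monotonic() < busy_until:
            # Touch one byte per 4 KB page to pressure the memory system.
            buf[::4096] = b"x" * len(buf[::4096])
        time.sleep((1.0 - intensity) * period)
```

Running several such antagonists at increasing intensities while measuring an application's performance yields its sensitivity curve for that resource.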
¤ Low CPU, high LLC → similar to streaming apps
¤ Apps that benefit from high CPU frequency
¤ Apps similar in i-cache behavior are also similar in branch behavior
¨ Select servers that:
¤ Can tolerate the interference of the new application
¤ Generate interference the new application can tolerate
¤ Have the appropriate platform configuration
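These checks can be sketched as a simple filter. The dict layout and the 0–100 score scale below are assumptions for illustration, not Paragon's actual data structures:

```python
RESOURCES = ("cpu", "cache", "mem_bw", "io_bw", "net_bw")

def can_colocate(server, app):
    """Check both interference conditions for one candidate server.

    server["pressure"][r]: interference currently generated on resource r
    server["headroom"][r]: interference the resident apps can still tolerate
    app["tolerates"][r]:   interference the new app can absorb
    app["causes"][r]:      interference the new app generates
    """
    for r in RESOURCES:
        if server["pressure"][r] > app["tolerates"][r]:
            return False  # existing load would hurt the new app
        if app["causes"][r] > server["headroom"][r]:
            return False  # new app would hurt the apps already there
    return True

def candidates(servers, app, platform):
    """Keep servers of the right platform that pass both checks."""
    return [s for s in servers
            if s["platform"] == platform and can_colocate(s, app)]
```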
¨ 1,000 EC2 servers
¤ 14 different server configurations
¤ 2 vCPU to 16 vCPU instances
¨ 5,000 applications
¤ SPEC, PARSEC, SPLASH-2, BioParallel, MineBench, SPECweb, Hadoop benchmarks
¨ Objectives:
¤ High application performance
¤ High resource utilization
¨ 1,000 servers
¨ 5,000 applications
¨ Start with zero knowledge
[Table: classification engine accuracy — workload breakdown (CPU-bound, memory-bound, I/O-bound; % of applications) and average estimation error for heterogeneity and for interference]
¨ Least loaded scheduler (common practice today)
¤ Violates QoS for 97% of workloads
¨ Paragon preserves QoS for 71% of workloads ¨ Bounds degradation to less than 10% for 90% of workloads
¨ Utilization increases from 19% to 58%
Resource reservations
[ASPLOS’14]
¨ Declarative interfaces:
¤ SQL → describe the queries, not how they should be executed
¤ DSLs → user describes the program; language/compiler optimize it
¨ Performance targets:
¤ Batch: completion time, deadline
¤ Interactive: throughput, tail latency
¨ Need to translate performance targets to resources
¨ Exhaustive characterization is infeasible
¤ Dimensions: heterogeneity, interference, resources per server, resource ratio
¤ Combinations grow with the number of servers and application parameters:
10 servers, 40 apps → ~1,000
100 servers, 300 apps → ~1,000,000
1,000 servers, 1,200 apps → ~1,000,000,000
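A back-of-the-envelope count shows why exhaustion fails even for one application. The per-dimension ranges below are illustrative assumptions, not the talk's numbers:

```python
# Illustrative per-dimension choices for a single application's allocation.
platforms = 10          # heterogeneity: server generations/configurations
interference_mixes = 8  # co-scheduled workload mixes to test against
cores_per_server = 16   # resources per server (cores granted)
server_counts = 100     # scale-out options (number of servers)

per_app = platforms * interference_mixes * cores_per_server * server_counts
print(per_app)  # 128000 allocations to profile for ONE application

# Even at only 10 seconds of profiling per allocation:
hours = per_app * 10 / 3600
print(round(hours))  # ~356 hours of profiling per application
```

Multiply by thousands of incoming applications and the need to re-profile as the cluster changes, and direct characterization is clearly off the table — hence sampling plus inference.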
¨ Exhaustive classification is impractical
[Figure: classify per dimension instead — candidate allocations such as "Platform 1 & LLC & 2 CPU/64 GB RAM & 1 server", "Platform 1 & L1-i$ & 2 CPU/64 GB RAM & 2 servers", "Platform M & net bw & 10 CPU/48 GB RAM & 1 server", each annotated with estimated performance (QPS) for App A, App B, App N]
[Figure: greedy algorithm animation — given each app's resource preferences and QoS target, the greedy algorithm assigns apps to servers one at a time]
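A minimal sketch of a greedy sizing loop: starting from the smallest allocation, repeatedly grow whichever resource the classification output predicts gives the largest performance gain, until the target is met. The `predict` interface, resource names, toy model, and caps are stand-ins of my own, not Quasar's actual interface:

```python
def greedy_allocate(target_qps, predict, caps):
    """Grow an allocation one resource unit at a time until `predict`
    says the performance target is met (or no resource helps)."""
    alloc = {r: 1 for r in caps}  # minimal starting allocation
    while predict(alloc) < target_qps:
        best, best_gain = None, 0.0
        for r in caps:
            if alloc[r] >= caps[r]:
                continue  # this resource is already at its cap
            trial = dict(alloc, **{r: alloc[r] + 1})
            gain = predict(trial) - predict(alloc)
            if gain > best_gain:
                best, best_gain = r, gain
        if best is None:
            return None  # target unreachable within the caps
        alloc[best] += 1
    return alloc

# Toy performance model: scales with servers, saturates on cores and memory.
def model(a):
    return 100.0 * a["servers"] * min(a["cores"], 4) * min(a["memory"], 2)

caps = {"servers": 8, "cores": 8, "memory": 8}
plan = greedy_allocate(1600.0, model, caps)
```

Because each step adds only the resource with the highest predicted marginal benefit, the loop avoids over-provisioning dimensions that have stopped helping (here, cores beyond 4 and memory beyond 2).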
¨ ~10,000 lines of C++ and Python
¨ Runs on Linux and OS X
¨ Supports frameworks in C/C++, Java, Scala, and Python
¤ ~100–600 lines of framework-specific code
¨ Side-effect-free profiling runs with sealed containers
¨ Cluster
¤ 200 EC2 servers, 14 different server types
¨ Workloads: 1,200 apps with 1 sec inter-arrival time
¤ Analytics: Hadoop, Spark, Storm
¤ Latency-critical: memcached, HotCRP, Cassandra
¤ Single-threaded: SPEC CPU2006
¤ Multi-threaded: PARSEC, SPLASH-2, BioParallel, SPECjbb
¤ Multiprogrammed: 4-app mixes of SPEC CPU2006
¨ Objectives: high cluster utilization and good app QoS
[Figure: per-server cluster utilization (0%–100%) over time as memcached, Cassandra, Storm, Hadoop, single-node, and Spark workloads arrive]
¨ 91% of applications meet QoS
¨ ~10% overprovisioning, as opposed to up to 5x
¨ Up to 70% cluster utilization at steady state
¨ 23% shorter scenario completion time
https://github.com/att-innovate/charmander
¨ Resource efficiency: a significant challenge in systems of all scales
¤ Focus on scalability of large-scale datacenters
¨ Cluster management: high utilization & high app performance
¤ High-level declarative interface
¤ Practical data mining techniques
¤ Cross-layer design