CS 744: DATACENTER AS A COMPUTER
Shivaram Venkataraman, Fall 2020


SLIDE 1

CS 744: DATACENTER AS A COMPUTER

Shivaram Venkataraman Fall 2020


SLIDE 2

ANNOUNCEMENTS

  • Assignments
    • Assignment zero is due!
    • Form groups for Assignment 1 on Piazza (out Thursday)
  • Class format
    • Review
    • Lecture
    • Discussion

SLIDE 3

Course stack, from applications down to hardware:

  • Applications: Machine Learning, SQL, Streaming, Graph
  • Computational Engines
  • Resource Management
  • Scalable Storage Systems
  • Datacenter Architecture (Hardware)

SLIDE 4

OUTLINE

  • Hardware Trends
  • Datacenter design
  • WSC workloads
  • Discussion
SLIDE 5

Why is One Machine Not Enough?

  • Limited parallelism: not enough resources on one machine
  • Cost of a single large machine can be high
  • Redundancy: failures are easier to manage across multiple machines
  • Data volumes are high: a single machine is too slow

SLIDE 6

What’s in a Machine?

Interconnected compute and storage

Newer hardware:
  • GPUs, FPGAs
  • RDMA, NVLink

[Diagram: processor connected to DRAM over the memory bus, to SSD/HDD over SATA and PCIe v4, and to the network over Ethernet]

SLIDE 7

Scale Up: Make More Powerful Machines

Moore’s law
  – Stated 52 years ago by Intel founder Gordon Moore
  – Number of transistors on a microchip doubles every 2 years
  – Today “closer to 2.5 years” (Intel CEO Brian Krzanich)
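The doubling arithmetic above is easy to check with a short sketch (the function name and starting transistor count are illustrative, not from the slide):

```python
# Hypothetical illustration of Moore's law: transistor count doubling
# every `period` years (2 classically, ~2.5 per the Krzanich quote).
def transistor_growth(start_count, years, period=2.0):
    """Projected transistor count after `years` of doubling every `period` years."""
    return start_count * 2 ** (years / period)

# Starting from 1 billion transistors, a decade of 2-year doubling
# gives a 32x increase; at 2.5-year doubling it is only 16x.
print(transistor_growth(1e9, 10, period=2.0))  # 3.2e10
print(transistor_growth(1e9, 10, period=2.5))  # 1.6e10
```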
SLIDE 8

Dennard Scaling is the Problem

Suggested that power requirements are proportional to the area of a transistor
  – Both voltage and current are proportional to transistor length
  – Stated in 1974 by Robert H. Dennard (DRAM inventor)
Broken since 2005

“Adapting to Thrive in a New Economy of Memory Abundance,” Bresniker et al

[Sketch: one large core vs. a 32-core chip]

SLIDE 9

Dennard Scaling is the Problem

Per-core performance is stalled; the number of cores is increasing

“Adapting to Thrive in a New Economy of Memory Abundance,” Bresniker et al


SLIDE 10

MEMORY TRENDS

[Chart (log scale): DRAM capacity and bandwidth over time; memory bandwidth is roughly 10-15 GB/s per core]
SLIDE 11

MEMORY TAKEAWAY

Growing ~15% per year

Data access from memory is getting more expensive!
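The takeaway follows from compound-growth arithmetic; a minimal sketch (the 15%/year figure is the slide's, the 10-year horizon is illustrative):

```python
def compound_growth(rate, years):
    """Growth factor after `years` at `rate` per year (e.g. 0.15 = 15%)."""
    return (1 + rate) ** years

# At ~15%/year, the quantity roughly quadruples in a decade; anything
# growing more slowly (e.g. bandwidth relative to capacity) falls
# steadily behind, which is why memory access gets relatively pricier.
print(round(compound_growth(0.15, 10), 2))  # 4.05
```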

SLIDE 12

HDD CAPACITY

[Chart: HDD capacity over time; data from Backblaze, a backup provider]

SLIDE 13

HDD BANDWIDTH

Disk bandwidth is not growing

Read bandwidth remains roughly 100-200 MB/s.
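One consequence of flat bandwidth with growing capacity: full-disk scans keep getting slower. A quick sketch (drive size and rate are illustrative picks within the slide's range):

```python
def scan_time_hours(capacity_gb, bandwidth_mb_s):
    """Hours needed to read an entire disk sequentially."""
    return capacity_gb * 1000 / bandwidth_mb_s / 3600  # GB -> MB -> s -> h

# A 10 TB drive at 150 MB/s takes ~18.5 hours to read once through.
print(round(scan_time_hours(10_000, 150), 1))  # 18.5
```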
SLIDE 14

SSDs

Performance:
  – Reads: 25 µs latency
  – Writes: 200 µs latency
  – Erase: 1.5 ms
Steady state, when the SSD is full:
  – One erase every 64 or 128 reads (depending on page size)
Lifetime: 100,000 to 1 million writes per page

Notes: HDD latency is ~10 ms; deleting and overwriting data is expensive on SSDs.
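The latency numbers above support a back-of-the-envelope steady-state read cost (a sketch only; real SSD firmware amortizes erases in the background, so this is an upper-bound intuition):

```python
READ_US, ERASE_US = 25.0, 1500.0  # per-op latencies from the slide

def steady_state_read_us(ops_per_erase):
    """Average per-read latency when one 1.5 ms erase is amortized
    over `ops_per_erase` reads (64 or 128 depending on page size)."""
    return READ_US + ERASE_US / ops_per_erase

print(round(steady_state_read_us(64), 1))   # 48.4 -- roughly 2x the raw read
print(round(steady_state_read_us(128), 1))  # 36.7
```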
SLIDE 15

SSD VS HDD COST

[Chart: SSD vs HDD cost per GB over time]

SLIDE 16

Ethernet Bandwidth

1995, 1998, 2002, 2017

Growing 33-40% per year! (Compare: disk bandwidth stuck near ~100 MB/s)

SLIDE 17

AMAZON EC2 (2019)

[Table: Amazon EC2 instance types (2019), including instances with local flash drives]
SLIDE 18

TRENDS SUMMARY

  • CPU speed per core is flat
  • Memory bandwidth growing slower than capacity
  • SSD, NVMe replacing HDDs
  • Ethernet bandwidth growing

What are the limitations of a single machine?

SLIDE 19

DATACENTER ARCHITECTURE

Within a server: memory bus, SATA, PCIe; between servers: Ethernet

[Diagram: servers grouped into racks with top-of-rack switches; racks connected by higher-level switches]
SLIDE 20

STORAGE HIERARCHY (DC AS A COMPUTER v2)

[Figure: storage hierarchy from “The Datacenter as a Computer” (v2): local DRAM, SSD, and HDD, then rack-level and datacenter-level storage; bandwidth falls from GB/s locally to ~100 MB/s across the network]
SLIDE 21

Warehouse-Scale Computers

  • Single organization
  • Homogeneity (to some extent)
  • Cost efficiency at scale
    – Multiplexing across applications and services
    – Rent it out!
  • Many concerns: infrastructure, networking, storage, software, power/energy, failure/recovery, …


SLIDE 22

SOFTWARE IMPLICATIONS

  • Workload diversity
  • Reliability despite component failures
  • Single organization
  • Storage hierarchy
SLIDE 23

WORKLOAD: Partition-Aggregate

Top-level Aggregator → Mid-level Aggregators → Workers

Notes: low latency is the goal; the index is sharded across workers, and partial results are aggregated up the tree.
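The aggregation tree above can be sketched in a few lines (all names and shard contents below are made up for illustration):

```python
def worker(shard, query):
    # Each worker searches only its own shard of the index.
    return [doc for doc in shard if query in doc]

def mid_level_aggregator(shards, query):
    # Merge partial results from the workers this aggregator owns.
    results = []
    for shard in shards:
        results.extend(worker(shard, query))
    return results

def top_level_aggregator(shard_groups, query):
    # Merge across mid-level aggregators to form the final answer.
    results = []
    for shards in shard_groups:
        results.extend(mid_level_aggregator(shards, query))
    return results

index = [[["cat facts", "dog facts"]], [["cat memes"], ["dog memes"]]]
print(top_level_aggregator(index, "cat"))  # ['cat facts', 'cat memes']
```

In a real system each level is a fan-out of RPCs, so the request's latency is set by the slowest worker on the critical path.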

SLIDE 24

WORKLOAD: SCHOLAR SIMILARITY

Map Stage → Reduce Stage

[Diagram: MapReduce-style job: records mapped in parallel, shuffled by key, then reduced to compute scholar similarity]
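A tiny map/reduce sketch of the job structure above (the actual similarity computation is not shown on the slide, so shared coauthors here are a made-up stand-in signal):

```python
from collections import defaultdict

def map_stage(records):
    # Emit (key, value) pairs; here: key by coauthor.
    for scholar, coauthor in records:
        yield coauthor, scholar

def reduce_stage(pairs):
    # Shuffle: group values by key, then reduce each group.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {k: sorted(v) for k, v in groups.items()}

records = [("alice", "carol"), ("bob", "carol"), ("alice", "dave")]
print(reduce_stage(map_stage(records)))
# {'carol': ['alice', 'bob'], 'dave': ['alice']}
```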
SLIDE 25

VIDEO ENCODING

Notes: encoding is compute-intensive; videos are split into fragments that are encoded in parallel (e.g., at YouTube).
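The split-and-encode-in-parallel pattern can be sketched as follows (`encode` is a stand-in for a real codec invocation, e.g. shelling out to an encoder):

```python
from concurrent.futures import ThreadPoolExecutor

def encode(fragment):
    # Placeholder for a real, CPU-heavy codec call.
    return fragment.upper()

def encode_video(fragments, workers=4):
    # Fragments are independent, so they can be encoded in parallel
    # and reassembled in order afterwards; map() preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode, fragments))

print(encode_video(["frag-a", "frag-b", "frag-c"]))
# ['FRAG-A', 'FRAG-B', 'FRAG-C']
```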

SLIDE 26

MACHINE LEARNING


SLIDE 27

DISCUSSION

https://forms.gle/CrrrhCPYHerwXNEt5

SLIDE 28

Discussion

Scale-up vs Scale-out

Scale up:
  • If your app has no parallelism, heavy communication, or a small dataset, scale-out is overkill

Scale out:
  • Fault tolerance
  • Pay as you go
SLIDE 29

DISCUSSION

Microsoft Word vs. online document editor like Google Docs

Word:
  • Yearly release cycle; patches between releases
  • Compatibility with the local machine and hardware

Docs:
  • Collaboration (consistency is a challenge)
  • Monthly (or continuous) online updates
  • Access it from anywhere
  • Server-side redundancy → durable storage; 99.99% uptime

SLIDE 30

DISCUSSION

Even if 99% of servers work well, parallelism makes tail latencies worse: with a wide fan-out, only a tiny fraction of machines need to be slow for most requests to see a slowdown.
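The effect above follows from simple independence arithmetic (assuming independent slowdowns, which real clusters only approximate):

```python
def prob_any_slow(n_servers, p_fast=0.99):
    """Probability that at least one of n parallel servers is slow,
    when each is fast independently with probability p_fast."""
    return 1 - p_fast ** n_servers

# With 100-way fan-out, a 1% per-server slowdown rate means ~63% of
# requests hit at least one slow server -- the tail dominates.
print(round(prob_any_slow(1), 2))    # 0.01
print(round(prob_any_slow(100), 2))  # 0.63
```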
SLIDE 31

NEXT STEPS

Next class: Storage Systems
Assignment 1 out Thursday. Submit groups before that!
Waitlist