[PPT] - From something that fits in your pocket ... ... to, well, this. The PowerPoint Presentation

SLIDE 1

SLIDE 2

From something that fits in your pocket ...

SLIDE 3

... to, well, this.

SLIDE 4

The future? ...

SLIDE 5

Energy

A look at cluster computers and datacenters

Tarun Prabhu, Radha Venkatagiri

SLIDE 6

Datacenters

SLIDE 7

Datacenters

Most (all?) of you probably know what they are
Most (all?) of you know what they are used for

SLIDE 8

Energy usage in datacenters

Used 76,000,000,000 kWh in 2010
2% of all electricity produced in the US
≈1.3% of all electricity produced globally †

†Koomey, J. “Growth in Data Center Electricity Use 2005 to 2010”, Analytics Press, 2011
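To put the yearly total in perspective, it can be converted into an average continuous power draw. The sketch below does only that arithmetic; it assumes the 76 billion kWh is the US figure that the 2% share refers to, and the function names are illustrative rather than taken from the cited report.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def average_power_gw(energy_kwh: float) -> float:
    """Average continuous power (GW) that adds up to energy_kwh over one year."""
    average_kw = energy_kwh / HOURS_PER_YEAR
    return average_kw / 1e6  # kW -> GW


def implied_total_kwh(part_kwh: float, share: float) -> float:
    """Total consumption implied when part_kwh is the given share (0..1) of it."""
    return part_kwh / share


datacenter_kwh = 76e9  # figure from the slide

print(round(average_power_gw(datacenter_kwh), 2))  # 8.68 (GW, around the clock)
print(implied_total_kwh(datacenter_kwh, 0.02))     # total implied by the 2% share, in kWh
```

Under that assumption, the datacenters draw close to 9 GW on average, all year.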

SLIDE 9

Increasing datacenter efficiency

Reduce infrastructure overheads
Reduce ancillary (non-computing) costs
Reduce computing costs

SLIDE 10

How efficient are these?

PUE (Power Usage Effectiveness): indicates how much energy is used for non-computing functions.
Average PUE is 1.8 (this means that for every 1 Watt used for computing, another 0.8 Watts is used in overheads) †

Company    PUE   Comments
Facebook   1.07
Google     1.14  Individual facility goes to 1.06
Yahoo      –     Individual facility goes to 1.08
Amazon     1.45  Assumption by Amazon themselves
Microsoft  1.25  Target for April 2013
Apple      –     They shall never tell ...

Table: Efficiency of datacenter giants ‡

†http://www.datacenterknowledge.com/archives/2011/05/10/uptime-institute-the-average-pue-is-1-8/
‡http://gigaom.com/cloud/whose-data-centers-are-more-efficient-facebooks-or-googles/
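The PUE figures can be read as a simple ratio: total facility power divided by the power that reaches the computing equipment. The sketch below works through that definition; the function names and the 1800 kW / 1000 kW facility are made up for illustration.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """PUE = total facility power / power delivered to computing equipment."""
    if it_kw <= 0:
        raise ValueError("computing power must be positive")
    return total_facility_kw / it_kw


def overhead_per_computing_watt(pue_value: float) -> float:
    """Watts spent on cooling, power conversion etc. per watt of computing."""
    return pue_value - 1.0


def overhead_share_of_total(pue_value: float) -> float:
    """Fraction of the whole facility's power that never reaches the computers."""
    return (pue_value - 1.0) / pue_value


# Hypothetical facility: 1800 kW at the meter, 1000 kW reaching the servers.
average = pue(1800.0, 1000.0)
print(average, round(overhead_per_computing_watt(average), 2))

# PUE values quoted in the table above.
for company, value in [("Facebook", 1.07), ("Google", 1.14), ("Amazon", 1.45)]:
    print(company, f"{overhead_share_of_total(value):.1%} of facility power is overhead")
```

A PUE of 1.0 would mean every watt goes to computing; the quoted average of 1.8 corresponds to 0.8 W of overhead per computing watt.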

SLIDE 11

Infrastructure overheads - What are they?

SLIDE 12

Infrastructure overheads - Cooling, power etc.

Heat management to reduce hot spots
Natural cooling
  Air - Buffalo (Yahoo), Lulea (Facebook)
  Sea-water - Hamina (Google)
  Evaporative cooling - Prineville (Facebook)
Optimize power distribution
  Efficient power-supplies
  Minimize AC/DC conversion stages
Nifty new ideas, for instance
  Oil baths

SLIDE 13

Ancillary costs ...

What Facebook did

SLIDE 14

Ancillary costs ...

What Facebook did
Toss everything and go back to the drawing board.

SLIDE 15

Ancillary costs ...

What Facebook did
Toss everything and go back to the drawing board.
Literally

SLIDE 16

Open Compute Project

Facebook custom-designed ... everything †
  motherboards
  power supplies
  server chassis
  server racks
  battery cabinets
Kept only what was strictly necessary
38% more efficient, 24% cheaper
Made all specifications (CAD drawings etc.) publicly available
http://www.opencompute.org

†https://www.facebook.com/notes/facebook-engineering/building-efficient-data-centers-with-the-opencompute-project/10150144039563920

SLIDE 17

Reducing computing costs

SLIDE 18

Reducing computing costs

Tackle under-utilization and overprovisioning

SLIDE 19

Server utilization

Average server utilization in datacenters is ≈ 50%

SLIDE 20

Reasons for under-utilization

Planning for traffic spikes
Reliability considerations
System-software maintenance is safer

SLIDE 21

Real reason

Clients get cranky!

SLIDE 22

Energy-proportional computing

Under-utilization is a problem because, as things stand today, power consumed is not proportional to work done
Ideally, the dynamic range of energy consumption should be increased: no power consumed when idle, little power consumed when doing minimal work, and consumption increasing gradually until the machine is fully loaded
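The gap between today's servers and the energy-proportional ideal can be illustrated with a linear power model. This is a rough sketch only: the 150 W idle and 300 W peak figures are made-up placeholders rather than measurements, and real servers are not exactly linear.

```python
def server_power_w(utilization: float, idle_w: float, peak_w: float) -> float:
    """Linear model: idle power plus a share of the idle-to-peak range."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_w + (peak_w - idle_w) * utilization


def dynamic_range(idle_w: float, peak_w: float) -> float:
    """Share of peak power that actually varies with load (1.0 is ideal)."""
    return (peak_w - idle_w) / peak_w


PEAK_W = 300.0          # hypothetical peak draw
TYPICAL_IDLE_W = 150.0  # hypothetical server that idles at half of peak
IDEAL_IDLE_W = 0.0      # energy-proportional ideal: nothing consumed when idle

for u in (0.0, 0.1, 0.5, 1.0):
    typical = server_power_w(u, TYPICAL_IDLE_W, PEAK_W)
    ideal = server_power_w(u, IDEAL_IDLE_W, PEAK_W)
    print(f"utilization {u:.0%}: typical {typical:.0f} W, ideal {ideal:.0f} W")
```

With these placeholder numbers, the server at 50% utilization draws 225 W (75% of peak) while doing half the work; the ideal machine would draw 150 W.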

SLIDE 23

Reducing compute costs - I

Tackling under-utilization with operating system support
Turn off/suspend hosts during low-usage periods
Intelligent load-balancing
Resource-aware scheduling
Power-aware scheduling
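One way to picture "suspend hosts during low-usage periods" is consolidation: pack the current load onto as few hosts as possible and suspend the rest. The sketch below uses a first-fit-decreasing heuristic, which is a stand-in chosen for illustration and not something the slides prescribe; the load values, the 80% cap and the function name are assumptions.

```python
def consolidate(loads_pct, capacity_pct=100, headroom_pct=20):
    """Pack per-service loads (percent of one host) onto few hosts.

    First-fit decreasing: place the biggest loads first, each on the first
    host that still has room. headroom_pct is kept free on every host so a
    traffic spike does not immediately overload it.
    Returns a list of hosts, each a list of the loads placed on it.
    """
    limit = capacity_pct - headroom_pct
    hosts = []
    for load in sorted(loads_pct, reverse=True):
        if load > limit:
            raise ValueError(f"load {load}% does not fit on any single host")
        for host in hosts:
            if sum(host) + load <= limit:
                host.append(load)
                break
        else:
            hosts.append([load])  # no existing host had room
    return hosts


# Six lightly loaded hosts, one service each (hypothetical numbers).
loads = [50, 20, 30, 10, 40, 10]
active = consolidate(loads)
print(active)
print(f"{len(loads) - len(active)} of {len(loads)} hosts can be suspended")
```

The headroom parameter reflects the under-utilization trade-off from the earlier slides: the more spare capacity kept for spikes, the fewer hosts can be turned off.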

SLIDE 24

Google’s warehouse computing

Google’s approach to building datacenters
Treat entire datacenter as one BIG computer
Centralized resource management. Provides greater flexibility in decision-making to improve metrics

SLIDE 25

Reducing compute costs - II

Customizing hardware to applications

SLIDE 26

An example - FAWN

Fast Array of Wimpy Nodes
Single-core AMD Geode processor (500 MHz)
256 MB DDR SDRAM (400 MHz)
4 GB CompactFlash storage
Intel Atom front-end

SLIDE 27

Academic research

Jointly optimize computing and cooling energy (ICDCS ’12)
Data-centric approaches that focus on where to place data to minimize energy consumption (SC ’12)
Improving network and interconnect efficiency by scaling the network up and down based on traffic demands (USENIX ’10)
Intelligent allocation of work to compute-units based on job characteristics, environmental conditions etc.

SLIDE 28

Supercomputers

SLIDE 29

Supercomputers

Tens of thousands (or more) of compute elements operating together
TBs (now PBs) of memory
PBs (nearing EBs) of storage

SLIDE 30

Uses of supercomputers

Molecular dynamics
Fluid dynamics: Airframe design
Modelling astrophysics phenomena
Earthquake system science
Simulation of spread of contagion
Cosmology (formation of the first galaxies)
Climate modelling and hypothesis confirmation

SLIDE 31

Uses of supercomputers

Molecular dynamics
Fluid dynamics: Airframe design
Modelling astrophysics phenomena
Earthquake system science
Simulation of spread of contagion
Cosmology (formation of the first galaxies)
Climate modelling and hypothesis confirmation (of global warming perhaps)

SLIDE 32

Top 500 List

SLIDE 33

Top 500 List

#   Name        Location  Cores   PFLOPS  Power (MW)
1   Titan       USA       560K*   17.59   8.21
2   Sequoia     USA       1572K   16.32   7.89
3   K Computer  Japan     705K    10.5    12.66
4   Mira        USA       786K    8.16    3.95
5   JuQueen     Germany   131K    4.14    1.97
6   SuperMUC    Germany   147K    2.89    3.42
7   Stampede    USA       204K*   2.66    –
8   Tianhe-1A   China     186K    2.56    4.04
9   Fermi       Italy     163K    1.72    0.82
10  DTS         USA       63K     1.51    3.57

Table: Power consumption of the world’s fastest computers

http://www.top500.org/list/2012/06/100
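The two rightmost columns of the table are enough to estimate each machine's energy efficiency. The sketch below divides performance by power using the figures above (a few rows only); the helper name is illustrative, and the results are rounded table values, so they can differ slightly from officially published efficiency numbers.

```python
# (name, PFLOPS, power in MW) copied from the table above.
MACHINES = [
    ("Titan", 17.59, 8.21),
    ("Sequoia", 16.32, 7.89),
    ("K Computer", 10.5, 12.66),
    ("Mira", 8.16, 3.95),
    ("JuQueen", 4.14, 1.97),
]


def gflops_per_watt(pflops: float, megawatts: float) -> float:
    """Efficiency in GFLOPS/W; 1 PFLOPS = 1e6 GFLOPS and 1 MW = 1e6 W."""
    gflops = pflops * 1e6
    watts = megawatts * 1e6
    return gflops / watts


# Most efficient machine first.
for name, pflops, mw in sorted(MACHINES, key=lambda m: -gflops_per_watt(m[1], m[2])):
    print(f"{name:<11} {gflops_per_watt(pflops, mw):.2f} GFLOPS/W")
```

Because the two unit conversions cancel, PFLOPS per MW and GFLOPS per W are the same number, which makes the table easy to eyeball.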

SLIDE 34

How efficient is this?

These machines can simulate a rat’s brain

SLIDE 35

How efficient is this?

WARNING: Some math here

SLIDE 36

How efficient is this?

WARNING: Some Bad math here

SLIDE 43

How efficient is this?

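WARNING: Some Bad math here

Brain weight comparison
$W_{\mathrm{human}} = 1400$ g †, $W_{\mathrm{rat}} = 2$ g †, $P_{\mathrm{human}} \approx 30$ W ‡
$\therefore P_{\mathrm{rat\ brain}} \approx \frac{2}{1400} \times 30 = 0.043$ W

Metabolism fraction (here $W$ is whole-body weight in kg and $P_{\mathrm{human}}$ is whole-body power in W)
$P_{\mathrm{rat}} \approx \frac{W_{\mathrm{rat}}}{W_{\mathrm{human}}} \times P_{\mathrm{human}} = \frac{0.4}{62} \times 100$ §‡ $= 0.64$ W
$P_{\mathrm{rat\ brain}} = 0.05 \times P_{\mathrm{rat}} = 0.032$ W

†http://faculty.washington.edu/chudler/facts.html
‡http://hypertextbook.com/facts/2001/JacquelineLing.shtml
§http://www.biomedcentral.com/1471-2458/12/439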
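The two back-of-the-envelope estimates above can be replayed in a few lines. The sketch keeps the slide's (admittedly bad) assumption that power scales linearly with weight; the function name is illustrative, and the numbers are the ones quoted on the slide.

```python
def scale_power_by_weight(reference_power_w, weight, reference_weight):
    """Assume power scales linearly with weight (the 'bad math' assumption)."""
    return reference_power_w * weight / reference_weight


# Estimate 1: scale human brain power by brain weight (grams).
rat_brain_by_weight = scale_power_by_weight(30.0, weight=2.0, reference_weight=1400.0)

# Estimate 2: scale whole-body power by body weight (kg), then take the
# 5% share that the slide attributes to the rat's brain.
rat_body_power = scale_power_by_weight(100.0, weight=0.4, reference_weight=62.0)
rat_brain_by_metabolism = 0.05 * rat_body_power

print(f"brain-weight estimate: {rat_brain_by_weight:.3f} W")
print(f"rat body power:        {rat_body_power:.2f} W")
print(f"metabolism estimate:   {rat_brain_by_metabolism:.3f} W")
```

Both routes land in the same range, a few hundredths of a watt.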

SLIDE 44

One of these ...

SLIDE 45

is equivalent to ...

SLIDE 46

is equivalent to ...

SLIDE 47

Exascale?

SLIDE 48

Exascale?

Enough computing power to simulate the human brain (2019?)

SLIDE 49

Exascale?

Needs 700 MW or more?

SLIDE 50

Exascale?

Needs 700 MW or more?

http://farm5.staticflickr.com/4011/4710638282_5e226f00f6.jpg

SLIDE 51

Exascale?

Needs 700 MW or more?

http://farm5.staticflickr.com/4011/4710638282_5e226f00f6.jpg
http://images4.wikia.nocookie.net/__cb20100331223557/simpsons/images/0/0c/Springfield_Nuclear_Power_Plant_1.PNG

SLIDE 52

Exascale?

Typical nuclear power plant produces 400-1200 MW
Needs 700 MW or more?

http://farm5.staticflickr.com/4011/4710638282_5e226f00f6.jpg
http://images4.wikia.nocookie.net/__cb20100331223557/simpsons/images/0/0c/Springfield_Nuclear_Power_Plant_1.PNG
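A naive way to get a feel for the exascale power question is to scale a current machine up to 1 exaFLOPS (1000 PFLOPS) at unchanged efficiency. The slides do not say how the 700 MW figure was derived, so this is only an illustrative extrapolation using the Top 500 table values; the function name is made up.

```python
EXAFLOPS_IN_PFLOPS = 1000.0


def power_at_exascale_mw(pflops: float, megawatts: float) -> float:
    """Power needed for 1 exaFLOPS if efficiency stayed exactly as it is today."""
    return megawatts * EXAFLOPS_IN_PFLOPS / pflops


# (name, PFLOPS, MW) from the Top 500 table.
for name, pflops, mw in [("Titan", 17.59, 8.21),
                         ("Sequoia", 16.32, 7.89),
                         ("K Computer", 10.5, 12.66)]:
    needed = power_at_exascale_mw(pflops, mw)
    verdict = "within" if 400 <= needed <= 1200 else "beyond"
    print(f"{name:<11} {needed:7.0f} MW ({verdict} one typical nuclear plant)")
```

Under this assumption, the estimates range from a few hundred MW to over a gigawatt depending on which machine is used as the starting point.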

SLIDE 53

Improving efficiency of clusters

Clusters                               Datacenters
Tightly coupled execution              Requests are typically independent
Usually compute-intensive              Usually data-intensive
I/O tends to occur in waves            Small amount of I/O most of the time
Drastically different application      Workload for any one application is
characteristics make tuning            the same, so clusters of machines
nearly impossible                      can be tuned

SLIDE 54

Improving efficiency of clusters

Modelling power consumption at a fine granularity is even harder

SLIDE 55

Improving efficiency of clusters

Modelling power consumption at a fine granularity is even harder
Where do you stick the meter?

SLIDE 56

Improving efficiency of clusters

Divide application into phases and run each phase in the best power mode
Reconfigurable network interconnects
Minimize communication in the program. Or else, exploit patterns in the communication and allocate interacting processes to nearby compute units
Use dynamic voltage and frequency scaling intelligently

SLIDE 57

Improving efficiency of clusters

Divide application into phases and run each phase in the best power mode
Reconfigurable network interconnects
Minimize communication in the program. Or else, exploit patterns in the communication and allocate interacting processes to nearby compute units
Use dynamic voltage and frequency scaling intelligently
Fall back on heterogeneity (each component of the cluster optimized for a specific task, e.g. wimpy nodes for I/O, GPUs for matrix operations)
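Why "the best power mode" differs from phase to phase can be sketched with a toy energy model. It rests on several simplifying assumptions that are not from the slides: dynamic power follows the usual CMOS relation P = C V² f with voltage scaled in step with frequency (so roughly f³), static power is constant, and time spent waiting on communication does not shrink at higher clock rates. All constants and names are hypothetical.

```python
STATIC_W = 20.0  # hypothetical constant (leakage, memory, fans) power
K_DYNAMIC = 10.0  # hypothetical W per GHz^3


def dynamic_power_w(f_ghz: float) -> float:
    """P = C * V^2 * f with V proportional to f, i.e. roughly f cubed."""
    return K_DYNAMIC * f_ghz ** 3


def phase_energy_j(work_gcycles: float, stall_s: float, f_ghz: float) -> float:
    """Energy for one phase: compute time shrinks with f, stall time does not."""
    runtime_s = work_gcycles / f_ghz + stall_s
    return (STATIC_W + dynamic_power_w(f_ghz)) * runtime_s


def best_frequency(work_gcycles, stall_s, frequencies):
    """Pick the available frequency that minimizes energy for this phase."""
    return min(frequencies, key=lambda f: phase_energy_j(work_gcycles, stall_s, f))


FREQS = [0.5, 1.0, 1.5, 2.0]  # available DVFS steps in GHz (hypothetical)

compute_phase = best_frequency(work_gcycles=10.0, stall_s=0.0, frequencies=FREQS)
comm_phase = best_frequency(work_gcycles=1.0, stall_s=10.0, frequencies=FREQS)
print("compute-bound phase:", compute_phase, "GHz")
print("communication-bound phase:", comm_phase, "GHz")
```

In this toy model the communication-bound phase is cheapest at the lowest clock, while the compute-bound phase prefers a middle setting because running too slowly keeps paying static power for longer.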

SLIDE 58

The hard way

Programmer explicitly coding for energy-efficiency?
Potentially nightmarish, but tool support might be possible (gcc -P2)?

SLIDE 59

Clusters: Green 500 List

#   Name     Country       GFlops/W  Power (kW)  Top 500 #
1   Beacon   USA           2.499     44.89       253
2   SANAM    Saudi Arabia  2.351     179.5       53
3   Titan    USA           2.142     8209        1
4   Todi     Switzerland   2.121     129         91
5   JuQueen  Germany       2.102     1970        5
6   (UoT)    Canada        2.101     41.09       401
7   (LLNL)   USA           2.101     41.09       399
8   (IBM)    USA           2.101     41.09       400
9   (IBM)    USA           2.101     82.19       140
10  CADMOS   Switzerland   2.100     82.19       141

SLIDE 60

How green are the Top 500?

#   Name        Power (MW)  Green 500 #
1   Titan       8.21        3
2   Sequoia     7.89        30
3   K Computer  12.66       85
4   Mira        3.95        29
5   JuQueen     1.97        5
6   SuperMUC    3.42        82
7   Stampede    –           142
8   Tianhe-1A   4.04        87
9   Fermi       0.82        23
10  DTS         3.57        159

SLIDE 61

What we really need to do is ...

SLIDE 62

What we really need to do is ...

Innovate ...