A CASE FOR RANDOM TOPOLOGIES IN HPC INTERCONNECTS Henri Casanova - - PowerPoint PPT Presentation



slide-1
SLIDE 1

A CASE FOR RANDOM TOPOLOGIES IN HPC INTERCONNECTS

Henri Casanova (Univ. of Hawai`i at Manoa)

with M. Koibuchi (NII, Japan), H. Matsutani and H. Amano (Keio Univ., Japan), and D.F. Hsu (Fordham Univ., U.S.A.)

slide-2
SLIDE 2

DISCLAIMER

This is the first talk at the Scheduling Workshop

Yet, I won’t talk about scheduling at all

Instead, I’ll talk mostly about graphs and networking hardware

slide-3
SLIDE 3

WHY GIVE THIS TALK?

Pseudo-reason #1 - Among the research I did last year, this is probably the most fun I had

And after all it got published in ISCA 2012

Pseudo-reason #2 - It could revolutionize cluster interconnects (by tomorrow or so...)

at least for some kinds of applications/workloads

impact on mapping applications to compute nodes

slide-4
SLIDE 4

MAIN IDEA

Forget age-old topologies (tori, grids, hypercubes, trees) that try to be economical or clever

Instead, just run around the machine room and pull cables into routers at random

slide-5
SLIDE 5

QUEST FOR “GOOD” TOPOLOGIES

Diameter of a graph: longest shortest path between any two vertices

Highly correlated to communication latency in network topologies

Typical problem: maximize the number of vertices in a graph for a given diameter and degree

Or equivalently: given vertices and a bound on the degree, add edges so as to minimize diameter

Studied by graph theoreticians for decades

Moore bound gives an upper bound on (regular) graph size

Many interesting graphs (De Bruijn, (n,k)-star, etc.)

Several graphs used in practice for HPC interconnects strike different compromises between diameter and degree:

grids and tori, hypercube (with many variations), omega and butterfly networks (with many variations), fat trees, etc.
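
The Moore bound mentioned above can be computed directly. This is a minimal sketch; the function name `moore_bound` is ours, not from the talk. It counts one root vertex, its `degree` neighbors, and then at most `degree - 1` new vertices per vertex for each additional hop.

```python
def moore_bound(degree: int, diameter: int) -> int:
    """Upper bound on the number of vertices of a graph with the
    given maximum degree and diameter (the Moore bound)."""
    if degree < 1 or diameter < 0:
        raise ValueError("degree must be >= 1 and diameter >= 0")
    total = 1        # the root vertex
    layer = degree   # vertices reachable in exactly one hop
    for _ in range(diameter):
        total += layer
        layer *= degree - 1  # each new vertex has one edge pointing back
    return total


if __name__ == "__main__":
    # Degree 3, diameter 2: the bound is 10, reached by the Petersen graph.
    print(moore_bound(3, 2))
    # Degree 10, diameter 3: at most 911 switches.
    print(moore_bound(10, 3))
```

For example, no degree-10 topology can connect more than 911 switches within 3 hops.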

slide-6
SLIDE 6

WHY WOULD WE CARE TODAY?

Isn’t this all done already?

Platform scales are increasing and platforms are built as networks of switches

Switch delay > 100ns, link delay ~ 5ns/m

As usual, we want low diameter (i.e., few hops on node-to-node paths)

But switches with high radix (e.g., > 100 ports) are becoming cheaper

Therefore, we can use topologies with relatively high degree without incurring too high a cost

Different from the “hypercube days” in which increasing the degree by 1 led to an n-fold increase in cost
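
A back-of-the-envelope sketch of why hop count dominates, using the delays quoted on this slide (100ns per switch taken as the lower bound, 5ns per meter of cable). The two paths compared are hypothetical, chosen only for illustration.

```python
SWITCH_DELAY_NS = 100.0      # per switch traversed (the slide says "> 100ns")
LINK_DELAY_NS_PER_M = 5.0    # per meter of cable (the slide says "~ 5ns/m")


def path_delay_ns(switches_traversed: int, total_cable_m: float) -> float:
    """Zero-load delay of a path: switch delays plus cable propagation."""
    return (switches_traversed * SWITCH_DELAY_NS
            + total_cable_m * LINK_DELAY_NS_PER_M)


if __name__ == "__main__":
    # Hypothetical paths: many short hops vs. few hops over longer cables.
    many_hops = path_delay_ns(10, 20.0)  # 10 switches, 20 m of cable
    few_hops = path_delay_ns(4, 60.0)    # 4 switches, 60 m of cable
    print(many_hops, few_hops)           # 1100.0 700.0
```

Under these assumptions the path with fewer switches is faster even with three times as much cable.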

slide-7
SLIDE 7

TOPOLOGIES OF SWITCHES

[Figure: six 100-port switches; each uses 90 ports for compute nodes and its remaining 10 ports for links to other switches]

TOPOLOGY OF DEGREE 10

slide-8
SLIDE 8

TOPOLOGIES OF SWITCHES

What graph should we pick for creating a topology of high-radix switches?

  • M. Koibuchi came to visit my lab and asked this question

Our initial attempt: borrow some ideas from structured peer-to-peer networks

Degree is O(log n) to keep routing tables “small”

So perhaps we can do something similar, but that’s better than, say, a hypercube?

and without constraints on the number of nodes

Common approach in p2p networks: add shortcut edges to a ring to build a Distributed Loop Network (DLN)

DLN-x: DLN with degree x
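
A minimal sketch of a non-random DLN and its diameter. The slides do not spell out the shortcut rule, so this assumes chords of stride n/2, n/4, ... added at every vertex. With this assumed rule, `levels=0` is the plain ring (degree 2), `levels=1` has degree 3, and `levels=2` has degree 5. The names `ring_with_chords` and `diameter` are ours.

```python
from collections import deque


def ring_with_chords(n: int, levels: int) -> list[set[int]]:
    """Ring of n vertices plus chords of stride n // 2**k, k = 1..levels.
    The stride rule is an assumption, not taken from the slides."""
    adj = [{(i - 1) % n, (i + 1) % n} for i in range(n)]
    for k in range(1, levels + 1):
        stride = n // 2 ** k
        for i in range(n):
            j = (i + stride) % n
            if j != i:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def diameter(adj: list[set[int]]) -> int:
    """Longest shortest path, via one breadth-first search per vertex."""
    worst = 0
    for src in range(len(adj)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    for levels in range(3):
        print(levels, diameter(ring_with_chords(16, levels)))
```

For n = 16 this gives diameters 8, 4 and 3, in line with the ~n/2, ~n/4 and ~n/8 trend.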

slide-9
SLIDE 9

DLN-2

diameter ~ n/2

slide-10
SLIDE 10

DLN-3

diameter ~ n/4

slide-11
SLIDE 11

DLN-5

diameter ~ n/8

slide-12
SLIDE 12

DLN-5

diameter ~ n/8

slide-13
SLIDE 13

DLN TOPOLOGIES

There are likely many smarter (cheaper) ways to organize the shortcut links if your goal is a low diameter

For instance with irregular graphs

diam ~ n/16 + 1 + n/16 ~ n/8 (degree ≤ 4)

What’s a good (optimal) deterministic construction here for a bounded degree?

For regular graphs or irregular graphs

This is when we started reading graph theory literature...

diam ~ n/8 (degree ≤ 3)

slide-14
SLIDE 14

RANDOM DLN???

The Diameter of a Cycle Plus a Random Matching, Bollobás, SIAM J. Discrete Math., 1988

Consider a ring of degree 2 (with an even number of vertices)

Add one edge between two randomly picked vertices until all vertices have degree 3

Question: how good is the diameter?

Answer: very close to optimal w.h.p. as n gets large

General lesson: for a given degree and given bound on the diameter, random graphs are much larger than all cleverly designed non-random graphs

In other words, random graphs have low diameter

We quit looking for a deterministic DLN and instead went random!

Edges are cheap, we like regular graphs, so perfect matchings are fine
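
A minimal sketch of the "cycle plus a random matching" construction, to see the effect on a small example. The rejection step (redraw the matching if it repeats a ring edge) is our assumption for keeping every vertex at degree exactly 3. Function names are ours.

```python
import random
from collections import deque


def cycle_plus_matching(n: int, rng: random.Random) -> list[set[int]]:
    """Ring of n vertices (n even) plus one random perfect matching."""
    if n % 2 or n < 4:
        raise ValueError("n must be even and at least 4")
    ring = [{(i - 1) % n, (i + 1) % n} for i in range(n)]
    while True:
        order = list(range(n))
        rng.shuffle(order)
        pairs = list(zip(order[::2], order[1::2]))
        # Redraw if a matching edge duplicates a ring edge.
        if all(b not in ring[a] for a, b in pairs):
            break
    adj = [set(s) for s in ring]
    for a, b in pairs:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def diameter(adj: list[set[int]]) -> int:
    """Longest shortest path, via one breadth-first search per vertex."""
    worst = 0
    for src in range(len(adj)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    n = 256
    graph = cycle_plus_matching(n, random.Random(0))
    # The plain ring has diameter n // 2 = 128; compare with the printed value.
    print(diameter(graph))
```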

slide-15
SLIDE 15

RANDOM DLN

DLN-x-y: DLN with degree x+y, where y “additional” random shortcut edges are added at each vertex

DLN-x-0 is a non-random DLN

y perfect matchings are added to the DLN-x-0 graph using a simple algorithm

Pick the best generated DLN-x-y sample (best diameter, best average shortest path length for equal diameters) among 100 trials

Let’s compute the diameter and average shortest path length of DLN-2-(d-2) vs. d for 2^15 vertices

And show a comparison to DLN-2-0, just for kicks
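
A minimal sketch of the sample-and-select procedure for DLN-2-y. The talk only says "a simple algorithm", so the generator here is an assumption: each matching is drawn by shuffling the vertices and is redrawn if it repeats an existing edge. That rejection step is only practical for small y. All names are ours.

```python
import random
from collections import deque


def random_dln(n: int, y: int, rng: random.Random) -> list[set[int]]:
    """Ring of n vertices (n even) plus y random perfect matchings."""
    if n % 2 or n < 4:
        raise ValueError("n must be even and at least 4")
    adj = [{(i - 1) % n, (i + 1) % n} for i in range(n)]
    for _ in range(y):
        while True:
            order = list(range(n))
            rng.shuffle(order)
            pairs = list(zip(order[::2], order[1::2]))
            # Redraw if any pair is already connected (keeps the graph regular).
            if all(b not in adj[a] for a, b in pairs):
                break
        for a, b in pairs:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def diameter_and_aspl(adj: list[set[int]]) -> tuple[int, float]:
    """Diameter and average shortest path length over all ordered pairs."""
    n = len(adj)
    worst, total = 0, 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
        total += sum(dist.values())
    return worst, total / (n * (n - 1))


def best_of(n: int, y: int, trials: int, seed: int = 0) -> tuple[int, float]:
    """Best (diameter, ASPL) among `trials` samples of DLN-2-y."""
    rng = random.Random(seed)
    # Tuples compare diameter first, then ASPL, which is the tie-break above.
    return min(diameter_and_aspl(random_dln(n, y, rng)) for _ in range(trials))


if __name__ == "__main__":
    print(best_of(n=64, y=2, trials=5))
```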

slide-16
SLIDE 16

DLN VS. RANDOM DLN (n=2^15)

[Plot: hops (log scale, 1 to 100000) vs. degree (5 to 30); curves: Non-random Diameter, Non-random Avg. Shortest Path, Random Diameter, Random Avg. Shortest Path]

slide-17
SLIDE 17

DLN VS. RANDOM DLN (n=2^15)

[Plot: hops (log scale, 1 to 100000) vs. degree (5 to 30); curves: Non-random Diameter, Non-random Avg. Shortest Path, Random Diameter, Random Avg. Shortest Path]

At degree log2(2^15): diam(DLN-2-1) = 6 < diam(HyperCube)/2

slide-18
SLIDE 18

OUTLINE

It is still important to think of topologies today

A few random shortcuts drastically reduce diameter

Comparison to other topologies

How random is it?

Network simulations for throughput and latency

Caveats

Does any of this matter?

slide-19
SLIDE 19

COMPARISON TO OTHER TOPOLOGIES

TORUS-d: Torus of degree d

Not at all designed for good diameter of course

HYPERCUBE

F-HYPERCUBE: Folded Hypercube [El-Amawy et al., 1991]

degree n+1 for 2^n vertices

add an edge between vertex x and !x

T-HYPERCUBE: Multiply-twisted Hypercube [Efe, 1991]

degree n for 2^n vertices

achieves a lower diameter than the hypercube

FLATBUTTERFLY: Flattened Butterfly [Kim et al., 2007]

start with a k-ary, n-layer butterfly network

then merge switches into higher-radix switches

can be seen as a more extreme hypercube

for 2^n vertices, we use the lowest degree flattened butterfly with degree > n
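
A minimal sketch comparing the hypercube with the folded hypercube, reading "!x" as the bitwise complement of x over n bits. Both graphs are vertex-transitive, so the eccentricity of vertex 0 equals the diameter. Function names are ours.

```python
from collections import deque


def hypercube(n: int, folded: bool = False) -> list[set[int]]:
    """Adjacency of the n-dimensional hypercube on 2**n vertices.
    With folded=True, each vertex is also linked to its bitwise complement."""
    size = 1 << n
    adj = [{v ^ (1 << b) for b in range(n)} for v in range(size)]
    if folded:
        mask = size - 1
        for v in range(size):
            adj[v].add(v ^ mask)
    return adj


def eccentricity(adj: list[set[int]], src: int = 0) -> int:
    """Largest hop distance from src, via breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


if __name__ == "__main__":
    n = 6
    print(eccentricity(hypercube(n)))               # 6
    print(eccentricity(hypercube(n, folded=True)))  # 3
```

One extra port per vertex halves the diameter here, from n to ceil(n/2).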

slide-20
SLIDE 20

DIAMETER COMPARISON (n=2^10)

[Plot: diameter (log scale, 1 to 1000) vs. degree (5 to 30); curves: DLN-x-0, TORUS-x-0, HYPERCUBE-0, F-HYPERCUBE-0, T-HYPERCUBE-0, FLATBUTTERFLY-0, DLN-2-y]

slide-21
SLIDE 21

ASPL COMPARISON (n=2^10)

[Plot: average shortest path (log scale, 1 to 1000) vs. degree (5 to 30); curves: DLN-x-0, TORUS-x-0, HYPERCUBE-0, F-HYPERCUBE-0, T-HYPERCUBE-0, FLATBUTTERFLY-0, DLN-2-y]

slide-22
SLIDE 22

DIAMETER IMPROVEMENT SCALING

[Plot: increase in diameter (hops, 2 to 10) vs. network size (log2 N, 6 to 20); curves: DLN-3-0, DLN-5-0, DLN-7-0, TORUS-4-0, TORUS-6-0, TORUS-8-0, HYPERCUBE-0, F-HYPERCUBE-0, T-HYPERCUBE-0, FLATBUTTERFLY-0]

slide-23
SLIDE 23

ASPL IMPROVEMENT SCALING

[Plot: increase in average path length (hops, 2 to 10) vs. network size (log2 N, 6 to 20); curves: DLN-3-0, DLN-5-0, DLN-7-0, TORUS-4-0, TORUS-6-0, TORUS-8-0, HYPERCUBE-0, F-HYPERCUBE-0, T-HYPERCUBE-0, FLATBUTTERFLY-0]

slide-24
SLIDE 24

OUTLINE

It is still important to think of topologies today

A few random shortcuts drastically reduce diameter

Comparison to other topologies

Observations on randomness

Network simulations for throughput and latency

Caveats

Does any of this matter?

slide-25
SLIDE 25

NEEDLE IN HAY STACK?

Question: what’s the variation among our 100 samples?

[Plot: % of samples with best diameter (20 to 100) and best diameter (5 to 20) vs. degree (5 to 30)]

slide-26
SLIDE 26

NEEDLE IN HAY STACK?

In fact, at degree d, topologies have diameters that vary by at most 1 hop

Some have diameter x, some diameter x+1

Say that x decreases at degree d+1

Question: Is there a “lucky” topology with degree d and diameter x-1?

Empirical answer: No improvement when using 10,000 samples

In practice, a “good” topology is found in the first 100 samples

slide-27
SLIDE 27

BETTER RANDOMNESS?

We have generated random shortcut edges without caring about the “quality” of the shortcut

e.g., if two vertices already have a short shortest path, then it’s not useful to add a shortcut between them

When generating a shortcut, generate k candidate shortcuts and pick the one between the vertices that have the longest shortest path

k=2 improves diameter over k=1 in < 8% of the cases

k=5 improves diameter over k=2 in < 4% of the cases

The improvement is one hop (and increasing the degree by 1 “negates” the improvement)

Improvements in ASPL are at most 0.02%

In the end, “stupid” shortcuts are fine
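
A minimal sketch of the k-candidate heuristic described above. How candidates are drawn is not given in the slides; this assumes each candidate is a uniformly random pair taken from a list of vertices that still need a shortcut (`free`, a hypothetical parameter). Function names are ours.

```python
import random
from collections import deque


def hop_distance(adj: list[set[int]], src: int, dst: int) -> int:
    """Shortest-path length between src and dst (breadth-first search)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    raise ValueError("dst is not reachable from src")


def farthest_pair(adj, candidates):
    """Among candidate (u, v) pairs, return the one currently farthest apart."""
    return max(candidates, key=lambda pair: hop_distance(adj, pair[0], pair[1]))


def pick_shortcut(adj, free: list[int], k: int, rng: random.Random):
    """Draw k random candidate pairs from `free` and keep the best one."""
    candidates = [tuple(rng.sample(free, 2)) for _ in range(k)]
    return farthest_pair(adj, candidates)


if __name__ == "__main__":
    ring = [{(i - 1) % 8, (i + 1) % 8} for i in range(8)]
    print(farthest_pair(ring, [(0, 1), (0, 4), (0, 2)]))  # (0, 4)
    print(pick_shortcut(ring, list(range(8)), k=2, rng=random.Random(0)))
```

With k=1 this reduces to the plain random shortcut.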

slide-28
SLIDE 28

NON-REGULAR TOPOLOGIES

How about not enforcing that the graph is regular?

vertices can have different degrees

which is fine for a topology of high-radix switches

Makes shortcut generation simpler

But in fact leads to slightly worse diameter and average path length

In the end, enforcing regularity is a good idea

slide-29
SLIDE 29

LESS RANDOMNESS

How about replacing DLN-2 by a better base topology before adding shortcuts?

Perhaps enhancing a smart topology with a few random edges will lead to good results...

slide-30
SLIDE 30

LESS RANDOMNESS (DIAMETER)

[Plot: diameter (log scale, 1 to 100) vs. degree (2 to 20); curves: FLATBUTTERFLY-0-y, HYPERCUBE-0-y, F-HYPERCUBE-0-y, T-HYPERCUBE-0-y, TORUS-8-y, TORUS-6-y, TORUS-4-y, DLN-5-y, DLN-3-y, DLN-2-y]

slide-31
SLIDE 31

LESS RANDOMNESS (ASPL)

[Plot: average shortest path (log scale, 1 to 100) vs. degree (2 to 20); curves: FLATBUTTERFLY-0-y, HYPERCUBE-0-y, F-HYPERCUBE-0-y, T-HYPERCUBE-0-y, TORUS-8-y, TORUS-6-y, TORUS-4-y, DLN-5-y, DLN-3-y, DLN-2-y]

slide-32
SLIDE 32

LESS RANDOMNESS

Adding a few shortcut links to a base topology leads to large payoffs

But starting with a good base topology doesn’t work better than using DLN-2

In the end, the more non-random edges the higher the diameter/ASPL

slide-33
SLIDE 33

GUIDELINES FOR RANDOM TOPOLOGIES

Use DLN-2 as a base topology

Add perfect matchings to it to maintain regularity

Few random samples are sufficient to obtain a good topology

Generating high-quality shortcuts only pays off a little bit

Great pay-offs at low degree

And it can all be done for whatever number of switches

slide-34
SLIDE 34

OUTLINE

It is still important to think of topologies today

A few random shortcuts drastically reduce diameter

Comparison to other topologies

How random is it?

Network simulations for throughput and latency

Caveats

Does any of this matter?

slide-35
SLIDE 35

SIMULATION ENVIRONMENT

We use a cycle-accurate flit-level network simulator of cluster interconnects

1 packet is 33 flits, 1 flit is 256 bits

Switch delay = 100ns, Link delay = 20ns

Link bandwidth = 96 Gbps

Three classic traffic patterns used in “how good is my network?” studies:

Uniform (random)

Matrix transpose

Bit reversal

We measure latency and throughput
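
A minimal sketch of the two permutation traffic patterns, using the definitions commonly given for node addresses of `bits` bits: bit reversal sends to the address with the bits in reverse order, and matrix transpose swaps the upper and lower halves of the address. The simulator's exact definitions are not stated in the slides, so treat these as assumptions.

```python
def bit_reversal(src: int, bits: int) -> int:
    """Destination whose address is src with its bits in reverse order."""
    dst = 0
    for i in range(bits):
        if (src >> i) & 1:
            dst |= 1 << (bits - 1 - i)
    return dst


def matrix_transpose(src: int, bits: int) -> int:
    """Destination obtained by swapping the two halves of the address."""
    if bits % 2:
        raise ValueError("this sketch needs an even number of address bits")
    half = bits // 2
    low = src & ((1 << half) - 1)
    high = src >> half
    return (low << half) | high


if __name__ == "__main__":
    print(format(bit_reversal(0b0001, 4), "04b"))      # 1000
    print(format(matrix_transpose(0b0011, 4), "04b"))  # 1100
```

Uniform traffic needs no function: each source simply picks a destination at random.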

slide-36
SLIDE 36

SOME RESULTS

[Plots: n=256, bit-reversal; n=512, matrix-transpose]

slide-37
SLIDE 37

SOME RESULTS

[Plots: n=256, bit-reversal; n=512, matrix-transpose]

slide-38
SLIDE 38

OUTLINE

It is still important to think of topologies today

A few random shortcuts drastically reduce diameter

Comparison to other topologies

How random is it?

Network simulations for throughput and latency

Caveats

Does any of this matter?

slide-39
SLIDE 39

ARE YOU SERIOUS?

Graph analysis and simulation results seem to show that random topologies are a good idea

Good luck trying to convince anybody to “go random” for a real platform today

Likely complaints:

Routing scalability

Cabling costs

slide-40
SLIDE 40

ROUTING SCALABILITY

Because the topology is random, routing must be done with routing tables

Routing on a torus is trivial

No clever hypercube-like routing scheme with tiny electronics solutions

But, 87% of Top500 platforms use Ethernet or Infiniband, meaning that they use routing tables

So the vast majority of high-end HPC platforms suffer from routing table scalability anyway

And there are solutions to compact routing tables anyway

We conclude that routing scalability is not a show stopper for random topologies
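
A minimal sketch of what table-based routing means on an arbitrary (e.g., random) topology: each switch stores one next hop per destination, computed here by a breadth-first search from each destination. This is plain shortest-path routing for illustration, not the routing scheme used in the study; names are ours.

```python
from collections import deque


def next_hop_tables(adj: list[set[int]]) -> list[dict[int, int]]:
    """tables[s][d] is a neighbor of s on one shortest path from s to d."""
    n = len(adj)
    tables: list[dict[int, int]] = [dict() for _ in range(n)]
    for dst in range(n):
        # Search outward from dst; the BFS parent of a vertex is its
        # next hop toward dst (the graph is undirected).
        seen = {dst}
        queue = deque([dst])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    tables[v][dst] = u
                    queue.append(v)
    return tables


def route(tables: list[dict[int, int]], src: int, dst: int) -> list[int]:
    """Follow the tables hop by hop from src to dst."""
    path = [src]
    while path[-1] != dst:
        path.append(tables[path[-1]][dst])
    return path


if __name__ == "__main__":
    # Ring of 6 switches with one shortcut between switches 0 and 3.
    adj = [{(i - 1) % 6, (i + 1) % 6} for i in range(6)]
    adj[0].add(3)
    adj[3].add(0)
    tables = next_hop_tables(adj)
    print(route(tables, 0, 3))  # [0, 3]
```

Each table has one entry per destination, which is the scalability concern raised above.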

slide-41
SLIDE 41

CABLING COST

Cabling cost is proportional to cable length but mostly to link type:

passive copper: 10m

active copper: 40m

optical: ~100m

Assuming standard cabinet layouts and Manhattan distance
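
A minimal sketch of the cable-length estimate. It reads the lengths listed above as the maximum reach of each link type, which is our interpretation. Cabinet positions are given in meters, and the `overhead_m` term (cable run inside and above the cabinets) is a hypothetical parameter, not a value from the talk.

```python
def cable_length_m(pos_a: tuple[float, float],
                   pos_b: tuple[float, float],
                   overhead_m: float = 2.0) -> float:
    """Manhattan distance between two cabinet positions (meters),
    plus a hypothetical fixed overhead."""
    return abs(pos_a[0] - pos_b[0]) + abs(pos_a[1] - pos_b[1]) + overhead_m


def link_type(length_m: float) -> str:
    """Cheapest link type that can span the given length."""
    if length_m <= 10:
        return "passive copper"
    if length_m <= 40:
        return "active copper"
    if length_m <= 100:
        return "optical"
    raise ValueError("longer than the ~100m optical reach assumed here")


if __name__ == "__main__":
    length = cable_length_m((0.0, 0.0), (12.0, 6.0))
    print(length, link_type(length))  # 20.0 active copper
```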

[Plot: average physical cable length (m, 2 to 16) vs. network size (log2 N, 5 to 13); curves: TORUS-6-0, HYPERCUBE-0, DLN-2-4, DLN-2-8, DLN-2-12, DLN-2-22]

slide-42
SLIDE 42

CABLING COST

DLN-2-x leads to longer average cable length

But it can use the same cheap cabling technology as non-random topologies for most cables

There may be some particular shortcuts that require long cables

slide-43
SLIDE 43

DOES ANY OF THIS MATTER?

If you’re doing a single parallel dense linear algebra app, you want a torus anyway

And HPC networks will likely always provide a torus-like network

Would be interesting to see how much is lost in practice when switching to a random topology

If you’re running an irregular application then good diameter and ASPL make your life easier

No matter your application mapping, you’ll do pretty well

Coming up with a clever mapping of the application on a particular topology is known to be hard in general

but good research topics for students

Even more true if you’re running multiple arbitrary communicating applications/services onto a cluster

which is where most of the interest comes from I think

slide-44
SLIDE 44

CONCLUSION

Future work:

What’s the penalty for bounding above the maximum cable length of a shortcut edge?

Are perfect matchings overkill?

Do we care?

Questions?