Augmenting Low-latency HPC Network with Free-space Optical Links


21st IEEE Symposium on High Performance Computer Architecture (HPCA) 2015-02-11 Augmenting Low-latency HPC Network with Free-space Optical Links Ikki Fujiwara National Institute of Informatics Michihiro Koibuchi Tomoya Ozaki Keio University


slide-1
SLIDE 1

Augmenting Low-latency HPC Network with Free-space Optical Links

Ikki Fujiwara, Tomoya Ozaki, Henri Casanova, Michihiro Koibuchi, Hiroki Matsutani

National Institute of Informatics, Keio University, University of Hawai’i at Manoa

21st IEEE Symposium on High Performance Computer Architecture (HPCA) 2015-02-11

slide-2
SLIDE 2

Story at a Glance

  • What if steerable wireless links appear on top of cabinets?

[Figure labels: FSO Terminal, Switch, Cable, Laser Beam, Cabinet]

  • Efficient power-aware on/off link regulation
  • Reduced cable length & latency
  • Topology optimization for diverse apps

slide-3
SLIDE 3
  • Motivation
  • How to make Free Space Optics (FSO)
      – FSO Terminal Devices
      – Layout of FSO Terminals
  • How to use FSOs in an HPC system
      – For Reduced Cable Length and Latency
      – For Improved Topology Embedding
      – For Power-aware On/off Link Regulation

  • Conclusion
slide-4
SLIDE 4

Motivation 1/3: Cable Reduction

  • Earth Simulator, 1st gen. (crossbar): 83,200 cables, 2,400 km, 140 tons (photo (c) kan-haru)
  • K Computer (6-D mesh/torus): 200,000 cables, 1,000 km (photo (c) Riken)

FSO provides shorter cable length and lower link delay

slide-5
SLIDE 5

Motivation 2/3: Topology Optimization

  • Different parallel applications prefer different topologies

[Figure: relative performance of Torus and Random topologies on CG and FT (NAS Parallel Benchmarks) and Graph500. Discrete-event simulation with SimGrid; 64 switches; switch degree = 8.]

FSO provides a reconfigurable network

slide-6
SLIDE 6

Motivation 3/3: Leveraging Power-aware On/Off Link Regulation

  • Links consume power regardless of workload
  • Turning off links (e.g. Energy Efficient Ethernet) saves link power but hurts performance in HPC use [1]
      – Performance loss is not acceptable for HPC systems
  • Let’s turn off more links!
      – As long as the performance loss is compensated by replacing wired links with FSO-based shortcuts

[1] Saravanan et al., “Power/performance evaluation of energy efficient Ethernet (EEE) for High Performance Computing”, ISPASS 2013

slide-7
SLIDE 7
  • Motivation
  • How to make Free Space Optics (FSO)
      – FSO Terminal Devices
      – Layout of FSO Terminals
  • How to use FSOs in an HPC system
      – For Reduced Cable Length and Latency
      – For Improved Topology Embedding
      – For Power-aware On/off Link Regulation

  • Conclusion
slide-8
SLIDE 8

Free-space Optical Links

[Figure labels: Collimator Lens, Optical Circulator, Transceiver, Optical fiber]

  • 10–100 Gbps over 200 m using a commodity laser (e.g. 1310 nm)
  • Negligible interference enables a high-density layout on top of cabinets
  • Terminal devices applicable to HPC use: our prototype, Hamedazimi’s [2], Arimoto’s [3]

[2] Hamedazimi et al., “FireFly: a reconfigurable wireless data center fabric using free-space optics”, SIGCOMM 2014
[3] Arimoto et al., “Wide field-of-view singlemode-fiber coupled laser communication terminal”, SPIE 2013
slide-9
SLIDE 9
  • Motivation
  • How to make Free Space Optics (FSO)
      – FSO Terminal Devices
      – Layout of FSO Terminals
  • How to use FSOs in an HPC system
      – For Reduced Cable Length and Latency
      – For Improved Topology Embedding
      – For Power-aware On/off Link Regulation

  • Conclusion
slide-10
SLIDE 10

Line-of-sight Layout of FSO Terminals

  • No laser beam should be blocked by other terminals
  • Goal: lay out the FSO terminals to minimize blocking, i.e. maximize the line-of-sight ratio $\mathrm{LSR} = \frac{2M}{O(O-1)}$
      – $O$ = number of terminals
      – $M$ = number of terminal pairs with direct line of sight
      – Calculated using a ray tracer

[Figure labels: FSO Transmitter, FSO Receiver, Other FSO terminal]
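
The slide computes the line-of-sight ratio with a ray tracer. The sketch below is a much simpler stand-in that only illustrates the ratio itself: 2 × (pairs with line of sight) / (terminals × (terminals − 1)). It rests on assumptions that are not in the slides: each terminal is modelled as a sphere of a given `radius`, a beam is a straight segment between terminal centres, and the function names are hypothetical.

```python
import itertools
import math


def point_to_segment_distance(a, b, p):
    """Shortest distance from point p to the segment a-b (all 3-D tuples)."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    ab_len2 = sum(c * c for c in ab)
    if ab_len2 == 0:
        t = 0.0
    else:
        # Parameter of the closest point on the segment, clamped to [0, 1].
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(ab, ap)) / ab_len2))
    return math.sqrt(sum((a[i] + t * ab[i] - p[i]) ** 2 for i in range(3)))


def line_of_sight_ratio(terminals, radius):
    """Fraction of terminal pairs whose beam is not blocked by a third terminal."""
    n = len(terminals)
    visible_pairs = 0
    for i, j in itertools.combinations(range(n), 2):
        blocked = any(
            point_to_segment_distance(terminals[i], terminals[j], terminals[k]) < radius
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            visible_pairs += 1
    return 2 * visible_pairs / (n * (n - 1))


# Four terminals in a straight row at equal height: only neighbours see each other.
row = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
# Same row with increasing heights (assumed values), so no beam crosses a terminal.
stepped = [(0, 0, 0), (1, 0, 0.5), (2, 0, 1.5), (3, 0, 3.0)]

print(line_of_sight_ratio(row, radius=0.1))      # 0.5
print(line_of_sight_ratio(stepped, radius=0.1))  # 1.0
```

The check is quadratic in pairs and linear in blockers, which is adequate for a few hundred terminals but is not a substitute for a real ray tracer with terminal geometry.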

slide-11
SLIDE 11

Straight Layout (Naive)

[Figure: line-of-sight ratio (0.5–1) vs. number of cabinets = FSO terminals (100–800)]

~60% LSR

slide-12
SLIDE 12

Random Layout

[Figure: line-of-sight ratio (0.5–1) vs. number of cabinets = FSO terminals (100–800)]

92.5% LSR at 800 cabinets

slide-13
SLIDE 13

Theater Layout

[Figure: line-of-sight ratio (0.5–1) vs. number of cabinets = FSO terminals (100–800)]

100% LSR up to 252 cabinets
Scalability limited by the headroom

slide-14
SLIDE 14

Alternative Layout using a Mirror

  • FSO beams can be reflected by a mirror
  • A similar idea is used for 60 GHz wireless [4]
  • 100% LSR, unlimited scalability (ideally)
  • Hereafter we assume 100% LSR

[4] Zhou et al., “Mirror mirror on the ceiling: flexible wireless links for data centers”, SIGCOMM 2012

slide-15
SLIDE 15
  • Motivation
  • How to make Free Space Optics (FSO)
      – FSO Terminal Devices
      – Layout of FSO Terminals
  • How to use FSOs in an HPC system
      – For Reduced Cable Length and Latency
      – For Improved Topology Embedding
      – For Power-aware On/off Link Regulation

  • Conclusion
slide-16
SLIDE 16

Physical Merits of FSO links

  • Reduced cable length
  • Lower end-to-end communication latency
      – At most 53% lower latency (in theory)
  • Calculated using graph analysis
      – When replacing long cables with FSO links
      – 1,024 switches; 512 cabinets; 1, 2, 4 FSO terminals/cabinet

Cable: 0.2 m/ns, Manhattan distance (go orthogonal)
FSO: 0.3 m/ns, Euclidean distance (go diagonal)
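
One reading that reproduces the "at most 53%" figure from the two speeds above: on a 2-D floor plan, a cable follows the Manhattan distance at 0.2 m/ns while an FSO beam follows the Euclidean distance at 0.3 m/ns, and the gap is largest on a 45° diagonal. The sketch below checks that arithmetic. The 2-D assumption and the function names are mine, not the slides'.

```python
import math

CABLE_SPEED_M_PER_NS = 0.2  # from the slide
FSO_SPEED_M_PER_NS = 0.3    # from the slide


def link_delays_ns(dx, dy):
    """Propagation delay of one link between points offset by (dx, dy) metres."""
    cable = (abs(dx) + abs(dy)) / CABLE_SPEED_M_PER_NS  # Manhattan route
    fso = math.hypot(dx, dy) / FSO_SPEED_M_PER_NS       # Euclidean route
    return cable, fso


def latency_reduction(dx, dy):
    """Fraction of link delay saved by replacing the cable with an FSO link."""
    cable, fso = link_delays_ns(dx, dy)
    return 1.0 - fso / cable


print(round(latency_reduction(10, 10), 3))  # diagonal link: 0.529
print(round(latency_reduction(10, 0), 3))   # axis-aligned link: 0.333
```

Under these assumptions the saving ranges from about 33% (speed advantage only) to about 53% (speed plus the diagonal shortcut), and it applies to link delay only, not to switch delay.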

slide-17
SLIDE 17

Reduced Cable Length

[Figure: total cable length [km] vs. #FSOs/cab (2, 4) for base topologies 3D Torus, 5D Torus, Rand deg=6, Rand deg=10. Annotated reductions, in extraction order: −32.0%, −36.5%, −23.0%, −31.6%]

slide-18
SLIDE 18

Lower End-to-end Latency

[Figure: zero-load latency [μs], average and maximum, split into link delay and switch delay, vs. #FSOs/cab (2, 4) for base topologies 3D Torus, 5D Torus, Rand deg=6, Rand deg=10. Annotated reductions range from −2.0% to −8.9%]
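
The figure splits zero-load latency into link delay and switch delay. A minimal sketch of that decomposition for a single path follows. The per-switch delay of 100 ns, the hop lengths, and the function name are illustrative assumptions, not values from the slides; only the two propagation speeds come from the deck.

```python
SPEED_M_PER_NS = {"cable": 0.2, "fso": 0.3}  # propagation speeds from the slides
SWITCH_DELAY_NS = 100.0                      # hypothetical per-switch delay


def zero_load_latency_ns(path, switch_delay_ns=SWITCH_DELAY_NS):
    """Return (link_delay, switch_delay) in ns for a path.

    `path` is a list of (length_m, medium) hops, medium being "cable" or "fso".
    A path of h hops is assumed to traverse h + 1 switches.
    """
    link_delay = sum(length / SPEED_M_PER_NS[medium] for length, medium in path)
    switch_delay = switch_delay_ns * (len(path) + 1)
    return link_delay, switch_delay


wired_path = [(10.0, "cable")] * 3  # three 10 m cable hops (assumed)
fso_path = [(20.0, "fso")]          # one 20 m FSO shortcut (assumed)

for name, path in (("wired", wired_path), ("fso", fso_path)):
    link, switch = zero_load_latency_ns(path)
    print(name, round(link, 1), round(switch, 1), round(link + switch, 1))
```

With numbers like these the switch delay dominates, which is why a shortcut that removes hops can matter more than the faster propagation speed alone.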

slide-19
SLIDE 19
  • Motivation
  • How to make Free Space Optics (FSO)
      – FSO Terminal Devices
      – Layout of FSO Terminals
  • How to use FSOs in an HPC system
      – For Reduced Cable Length and Latency
      – For Improved Topology Embedding
      – For Power-aware On/off Link Regulation

  • Conclusion
slide-20
SLIDE 20

Topology Embedding

  • Many small jobs run simultaneously in an HPC system
  • Want to efficiently allocate each job its preferred topology
      – Graph embedding problem (NP-hard)
  • FSO largely alleviates the embedding problem
  • Optimized using a genetic algorithm
      – So as to maximize the number of embedded topologies

[Figure: 2×4 mesh found in a random topology; labels: Switch, FSO terminal]
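
To make the embedding idea concrete, the sketch below checks one candidate placement of a 2×4 mesh onto a host network and lists the mesh edges that have no matching host link, i.e. the links that FSO shortcuts would have to supply. It is only the kind of evaluation a search such as a genetic algorithm could call repeatedly, not the authors' optimizer. The host graph (an 8-switch ring), the placement, and all function names are assumptions for illustration.

```python
def mesh_edges(rows, cols):
    """Edges of a rows x cols mesh; nodes are (row, col) tuples."""
    edges = set()
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                edges.add(frozenset({(r, c), (r + 1, c)}))
            if c + 1 < cols:
                edges.add(frozenset({(r, c), (r, c + 1)}))
    return edges


def missing_links(guest_edges, host_edges, placement):
    """Host switch pairs needed by the guest topology but absent from the host.

    `placement` maps each guest node to a distinct host switch.
    """
    needed = set()
    for edge in guest_edges:
        u, v = tuple(edge)
        host_link = frozenset({placement[u], placement[v]})
        if host_link not in host_edges:
            needed.add(host_link)
    return needed


# Hypothetical host: 8 switches wired as a ring 0-1-...-7-0.
host = {frozenset({i, (i + 1) % 8}) for i in range(8)}
# Hypothetical placement: top row on switches 0..3, bottom row on 7..4.
placement = {(0, c): c for c in range(4)}
placement.update({(1, c): 7 - c for c in range(4)})

guest = mesh_edges(2, 4)
gaps = missing_links(guest, host, placement)
print(len(guest), sorted(sorted(link) for link in gaps))  # 10 [[1, 6], [2, 5]]
```

Here two FSO links (1–6 and 2–5) would complete the mesh without touching the wired ring.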

slide-21
SLIDE 21

2×4 Tori Found

[Figure: coverage [%] vs. degree of physical topology (random), 8–40, for 1, 2 and 4 FSOs/cab]

With 4 FSOs/cab, more than 80% of nodes are well allocated

FSO opens a possibility for better job allocation

slide-22
SLIDE 22
  • Motivation
  • How to make Free Space Optics (FSO)
      – FSO Terminal Devices
      – Layout of FSO Terminals
  • How to use FSOs in an HPC system
      – For Reduced Cable Length and Latency
      – For Improved Topology Embedding
      – For Power-aware On/off Link Regulation

  • Conclusion
slide-23
SLIDE 23

Power-aware On/Off Link Regulation

  • Our idea: let’s turn off more links!
      – As long as the performance loss is compensated by replacing wired links with FSO-based shortcuts, depending on a given workload
  • Loop:
      1. Deactivate: turn off the wired links that contribute least to the average path length
      2. Compensate: insert an FSO shortcut to remedy the average path length
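
The Deactivate/Compensate loop can be sketched as a greedy procedure on an unweighted graph. This is a simplified illustration, not the authors' algorithm: it uses hop-count average shortest path length as the only metric, ignores workload, removes one link and adds one shortcut per iteration, and lets the shortcut connect any switch pair. All names are hypothetical.

```python
from collections import deque
from itertools import combinations


def avg_path_length(nodes, edges):
    """Average shortest path length in hops; infinity if the graph is disconnected."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return float("inf")
        total += sum(dist.values())
    return total / (len(nodes) * (len(nodes) - 1))


def regulate_once(nodes, wired, fso):
    """One loop iteration; returns the new (wired, fso) sets of frozenset edges."""
    # Deactivate: drop the wired link whose removal raises the average least.
    victim = min(wired, key=lambda e: avg_path_length(nodes, (wired - {e}) | fso))
    wired = wired - {victim}
    # Compensate: add the unused switch pair that lowers the average most.
    active = wired | fso
    candidates = [frozenset(p) for p in combinations(nodes, 2) if frozenset(p) not in active]
    shortcut = min(candidates, key=lambda e: avg_path_length(nodes, active | {e}))
    return wired, fso | {shortcut}


nodes = list(range(8))
ring = {frozenset({i, (i + 1) % 8}) for i in nodes}  # hypothetical wired topology
before = avg_path_length(nodes, ring)
wired, fso = regulate_once(nodes, ring, set())
after = avg_path_length(nodes, wired | fso)
print(len(wired), len(fso), after <= before)  # 7 1 True
```

Because the deactivated link itself remains a shortcut candidate, one iteration never leaves the average path length worse than it started in this sketch. A real regulator would also weigh traffic and the number of FSO terminals per cabinet.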

slide-24
SLIDE 24

Power-aware On/Off Link Regulation

  • Evaluation results using a flit-level simulator
      – p percent of the wired links are replaced with FSO
      – q percent of the links are deactivated

[Figure: average latency and hop count, each relative to p=0, q=0, for BT, CG, IS, LU, SP; configurations p=0, q=20; p=0, q=40; p=20, q=0; p=20, q=20; p=20, q=40]

FSO works well with a power-aware link regulation

slide-25
SLIDE 25

Comparable Technologies

  • 60 GHz radio wireless links [Zhou et al., SIGCOMM 2012]
      – Larger interference than FSO
  • Embedding using Optical Circuit Switches (OCS)
      – Wired links via an optical circuit switch can support partial reconfiguration
      – Its embedding capability is lower than FSO

Only FSO realizes our three objectives

slide-26
SLIDE 26
  • Motivation
  • How to make Free Space Optics (FSO)
      – FSO Terminal Devices
      – Layout of FSO Terminals
  • How to use FSOs in an HPC system
      – For Reduced Cable Length and Latency
      – For Improved Topology Embedding
      – For Power-aware On/off Link Regulation

  • Conclusion
slide-27
SLIDE 27

Conclusion

  • Augmenting a low-latency HPC network with free-space optical links, we get:
      – Efficient power-aware on/off link regulation
      – Reduced cable length (−36%) & latency (−9%)
      – Topology optimization for diverse apps

[Figure labels: FSO Terminal, Switch, Cable, Laser Beam, Cabinet; Random, FatTree, Torus]