SLIDE 1

LIRA: Adaptive Contention-Aware Thread Placement for Parallel Runtime Systems

Alexander Collins*, Tim Harris†, Murray Cole*, Christian Fensch‡

* University of Edinburgh
† Oracle Labs, UK
‡ Heriot-Watt University

SLIDE 2

The Problem

  • Multi-socket machines are commonplace
  • Run multiple parallel programs
  • Co-location affects performance
  • Which programs should we co-locate?

SLIDE 4

The Problem

  • System workload is constantly changing
  • Best co-location changes
  • Need an online adaptive solution

SLIDE 5

Our Insight

  • Balance load instruction rate across sockets

SLIDE 6

Our Solution

  • Schedule programs to sockets
  • Maximise difference in load instruction rate (LIRA heuristic)
  • Built on top of Callisto [1]
  • Each program pins one thread to each core
  • One thread on each core is high priority
  • High priority thread runs unless it stalls

[1] Callisto: Co-scheduling Parallel Runtime Systems, Harris et al., EuroSys '14
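
The placement step named on this slide can be illustrated with a small sketch. It is not the authors' implementation. It assumes two sockets, an equal number of programs per socket, and that "difference" means the absolute gap between the summed load instruction rates of the two sockets. The function name `choose_placement` and the rates are hypothetical.

```python
from itertools import combinations


def choose_placement(load_rates):
    """Split programs across two sockets so that the gap between the
    sockets' summed load instruction rates is as large as possible.

    Assumed reading of the heuristic; `load_rates` maps a program name
    to a measured load instruction rate (for example, loads per cycle).
    """
    names = sorted(load_rates)
    half = len(names) // 2
    best_split, best_gap = None, -1.0
    # Enumerate every way of choosing the programs for socket 0;
    # the remaining programs go to socket 1.
    for socket0 in combinations(names, half):
        socket1 = tuple(n for n in names if n not in socket0)
        gap = abs(sum(load_rates[n] for n in socket0)
                  - sum(load_rates[n] for n in socket1))
        if gap > best_gap:
            best_split, best_gap = (socket0, socket1), gap
    return best_split, best_gap


# Made-up rates for four co-running programs.
rates = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.0625}
placement, gap = choose_placement(rates)
print(placement, gap)
```

Exhaustive enumeration is only practical for a handful of programs. An online version would re-run the choice periodically as measured rates change.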

SLIDES 7-14

Our Solution (no text content on these slides)

SLIDE 30

Evaluation

  • 11 benchmarks from SPEC OpenMP 2001
  • 4 from the Green-Marl project
  • 1 using CDDP (betweenness centrality)
  • Dual-socket Xeon E5-2660
  • 8 cores per socket (hyperthreading disabled)

SLIDE 31

Evaluation

  • Measure 32 combinations of four programs
  • System performance metrics: ANTT (average normalised turnaround time) and STP (system throughput)
  • Comparing:
      • Socket-unaware Callisto
      • LIRA static tuning
      • LIRA adaptive tuning
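
The slides do not give formulas for the two metrics. The sketch below assumes the commonly used definitions: ANTT is the mean per-program slowdown relative to running alone (lower is better), and STP is the sum of per-program normalised progress (higher is better). The function names and timings are hypothetical.

```python
def antt(single_times, multi_times):
    """Average normalised turnaround time: mean of T_multi / T_single."""
    slowdowns = [m / s for s, m in zip(single_times, multi_times)]
    return sum(slowdowns) / len(slowdowns)


def stp(single_times, multi_times):
    """System throughput: sum of T_single / T_multi over all programs."""
    return sum(s / m for s, m in zip(single_times, multi_times))


# Made-up run times in seconds for four programs:
# alone on the machine, and when co-scheduled with the other three.
single = [10.0, 20.0, 8.0, 16.0]
multi = [20.0, 40.0, 8.0, 16.0]

antt_value = antt(single, multi)
stp_value = stp(single, multi)
print(antt_value, stp_value)
```

With four programs, a perfect STP would be 4.0 and a perfect ANTT would be 1.0, so the two metrics together show both throughput and per-program fairness.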

SLIDE 32

Evaluation

SLIDE 33

Conclusions

  • Co-location affects performance
  • Adaptive online tuning is required
  • LIRA heuristic improves performance
  • More details in the paper
