3rd Data Prefetching Championship, June 23rd, 2019, held in conjunction with ISCA 2019



SLIDE 1

3rd Data Prefetching Championship

June 23rd, 2019
Held in conjunction with ISCA 2019
Seth Pugsley (Intel Labs)

SLIDE 2

Closing and Results

  • Thank you for your hard work
  • 14 submissions
  • Top 6 performing prefetchers presented
  • Final Score = (Geomean of all single-core speedups) + (Geomean of all 4-core speedups)
  • Prize for the overall winner
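
The scoring rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming speedups are given as plain positive floats; the function names `geomean` and `final_score` and the sample numbers are hypothetical and are not championship data or the organizers' scripts.

```python
import math


def geomean(values):
    """Geometric mean of a non-empty list of positive speedups."""
    if not values:
        raise ValueError("need at least one speedup")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def final_score(single_core_speedups, four_core_speedups):
    """Sum of the single-core geomean and the 4-core geomean, as on the slide."""
    return geomean(single_core_speedups) + geomean(four_core_speedups)


# Hypothetical speedups, for illustration only (not championship results).
example_single = [1.0, 2.0, 4.0]   # geomean is 2.0
example_multi = [1.0, 1.0, 1.0]    # geomean is 1.0
score = final_score(example_single, example_multi)
```

Because the two geomeans are added, a prefetcher with no benefit in either configuration would score 2.0 under this reading.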
SLIDE 3

Overall Score

Red = presented at DPC3

  • Most submitted prefetchers very competitive
  • Top of the field looks crowded
  • +0.6% between 1st and 2nd
  • +1.4% between 1st and 6th
  • Strengths vary between single- and multi-core

[Figure: "Total Score". x-axis: rank, 14 to 1 (labeled "Higher Rank"); y-axis: Score, 2 to 2.7.]

SLIDE 4

Single-Core Speedup S-Curves

  • 1st place prefetcher shows large benefits compared to 2nd and 3rd place
  • 2nd place behind 3rd place for single core
  • 2nd place also behind average (11th place single-core)
  • 1st place 1.3% ahead of second best single-core prefetcher (not shown)

[Figure: "1-Core Speedup S-Curves". Series: 1st, 2nd, 3rd, average. x-axis: workloads sorted by increasing speedup per prefetcher, 1 to 46; y-axis: Speedup, 0.9 to 3.4.]
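
The x-axis ordering of an S-curve can be sketched as below. This is a minimal illustration, assuming "per prefetcher" means each prefetcher's speedups are sorted independently, so position i on the x-axis need not be the same workload for every curve. The prefetcher names, workload names, and numbers are hypothetical.

```python
def s_curve(speedup_by_workload):
    """Return one prefetcher's speedups in increasing order (the curve's y-values)."""
    return sorted(speedup_by_workload.values())


# Hypothetical per-workload speedups, for illustration only.
speedups = {
    "prefetcher_a": {"w1": 1.30, "w2": 0.95, "w3": 2.10},
    "prefetcher_b": {"w1": 1.10, "w2": 1.05, "w3": 1.60},
}

# Each prefetcher gets its own sort, so the curves are monotonically increasing.
curves = {name: s_curve(per_workload) for name, per_workload in speedups.items()}
```

Sorting each curve separately makes the best and worst cases easy to compare at the two ends of the plot, at the cost of losing the per-workload correspondence between curves.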

SLIDE 5

4-Core Speedup S-Curves

  • 2nd place prefetcher wins 4-core configuration due to greatly improved worst case
  • 2nd place 1.5% ahead of second best 4-core prefetcher (3rd)
  • Top 3 4-core prefetchers also top 3 overall

[Figure: "4-Core Speedup S-Curves". Series: 1st, 2nd, 3rd, average. x-axis: workloads sorted by increasing speedup per prefetcher, 1 to 40; y-axis: Speedup, 0.8 to 1.6.]

SLIDE 6

Score Breakdowns

  • 1st place has 1st best single-core and 3rd best 4-core perf.
  • 2nd place has 11th best single-core and 1st best 4-core perf.
  • 3rd place has 7th best single-core and 2nd best 4-core perf.
  • Strong 4-core performance made the difference

[Figure: "Score Breakdown". Series: 1core score, 4core score. x-axis: rank, 14 to 1 (labeled "Higher Rank"); y-axis: Score, 1 to 1.45.]

SLIDE 10

Top 3 prefetchers …

  • 3rd place: “Multi-Lookahead Offset Prefetching”
  • by Mehran Shakerinava, Mohammad Bakhshalipour, Pejman Lotfi-Kamran, Hamid Sarbazi-Azad
  • 2nd place: “Accurately and Maximally Prefetching Spatial Data Access Patterns with Bingo”
  • by Mohammad Bakhshalipour, Mehran Shakerinava, Pejman Lotfi-Kamran, Hamid Sarbazi-Azad
  • 1st place: “Bouquet of Instruction Pointers: Instruction Pointer Classifier-based Hardware Prefetching”
  • by Samuel Pakalapati, Biswabandan Panda
SLIDE 11

Thank You!

  • Congratulations to the winners!