LoGA: Low-Overhead GPU Accounting Using Events

Jens Kehne, Stanislav Spassov, Marius Hillenbrand, Marc Rittinghaus, Frank Bellosa

10th ACM International Systems and Storage Conference

Operating Systems Group, Karlsruhe Institute of Technology


SLIDE 1

KIT – The Research University in the Helmholtz Association

Operating Systems Group, Karlsruhe Institute of Technology (KIT)

www.kit.edu

LoGA: Low-Overhead GPU Accounting Using Events

Jens Kehne, Stanislav Spassov, Marius Hillenbrand, Marc Rittinghaus, Frank Bellosa

10th ACM International Systems and Storage Conference

SLIDE 2

Operating Systems Group, Karlsruhe Institute of Technology. LoGA: Low-Overhead GPU Accounting Using Events. Jens Kehne, Stanislav Spassov, Marius Hillenbrand, Marc Rittinghaus, Frank Bellosa. 22.05.17

GPU Sharing

GPUs increasingly popular in computing
Not every application saturates a GPU
Move GPUs to the cloud
  Sharing increases cost-efficiency
Problem: Need fairness
Software scheduling is inefficient

[Figure: timeline; axis: time]

SLIDE 3

NEON (University of Rochester, ASPLOS '14)

Applies fair queuing to GPUs

[Figure: diagram with labels App 1, App 2, and NEON]


SLIDE 7

NEON (University of Rochester, ASPLOS '14)

Applies fair queuing to GPUs

[Figure: diagram with labels App 1, App 2, NEON, and Threshold]

SLIDE 8

NEON (University of Rochester, ASPLOS '14)

Applies fair queuing to GPUs

[Figure: diagram with labels App 1, App 2, NEON, and STOP]

SLIDE 9

NEON: Accounting Problem

NEON's accounting disables GPU access

[Figure: timeline with labels Freerun Phase, STOP, and Sampling Phase]

GPU idle!
→ High accounting overhead if application does not saturate the GPU

SLIDE 10

Idea

GPUs have lots of status registers for the device driver
  Leak information about GPU's internal state
Reading has no effect on GPU driver or running application
Idea:
  Poll status registers
  Infer GPU-internal context switches
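
As a rough illustration of the idea above, the following Python sketch infers context switches from a series of polled samples. The sample format (a timestamp paired with a context ID) and the function name `infer_context_switches` are assumptions made for this example; the slides do not specify how the register values are represented.

```python
from typing import Iterable, List, Tuple


def infer_context_switches(
    samples: Iterable[Tuple[float, int]],
) -> List[Tuple[float, int, int]]:
    """Return (timestamp, old_ctx, new_ctx) for each observed change.

    Each sample is a hypothetical (timestamp, context ID) pair, as if the
    context ID had been read from a status register at that time.
    """
    switches = []
    prev_ctx = None
    for timestamp, ctx in samples:
        # A switch is visible only when two consecutive polls differ.
        if prev_ctx is not None and ctx != prev_ctx:
            switches.append((timestamp, prev_ctx, ctx))
        prev_ctx = ctx
    return switches


# Made-up samples: context 1 runs, then context 2, then context 1 again.
samples = [(0.0, 1), (0.1, 1), (0.2, 2), (0.3, 2), (0.4, 1)]
print(infer_context_switches(samples))  # [(0.2, 1, 2), (0.4, 2, 1)]
```

A switch is only detected at the first poll after it happened, so the timestamps are accurate to within one polling interval.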

SLIDE 11

Design

[Figure: design diagram with components App, Accounting thread, and Scheduler, and labels Submit, Poll, Query, and STOP]

SLIDE 12

Accounting Thread

[Figure: diagram with labels Accounting thread, Status reg, App 1, App 2, and Poll]

SLIDE 13

Accounting Thread

[Figure: diagram with labels Accounting thread, Status reg (shown twice), App 1, App 2, and Poll]

SLIDE 14

Accounting Thread

Poll periodically
Polling interval must be shorter than the kernel length
→ High CPU load

[Figure: diagram with labels Accounting thread, Status reg (shown three times), App 1, App 2, and Poll]
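
The polling-interval constraint can be illustrated with a small simulation. This is a sketch under stated assumptions, not the accounting thread from the talk: the schedule, the microsecond units, and the helper names `context_at` and `account` are invented for the example, and each polling interval is simply charged to the context observed at its start.

```python
from collections import defaultdict


def context_at(schedule, t):
    """Return the context ID running at time t, or None if the GPU is idle."""
    for start, end, ctx in schedule:
        if start <= t < end:
            return ctx
    return None


def account(schedule, poll_interval_us, duration_us):
    """Charge each polling interval to the context observed at its start."""
    usage = defaultdict(int)
    for t in range(0, duration_us, poll_interval_us):
        ctx = context_at(schedule, t)  # stands in for a register read
        if ctx is not None:
            # Do not charge past the end of the observation window.
            usage[ctx] += min(poll_interval_us, duration_us - t)
    return dict(usage)


# Hypothetical schedule in microseconds: (start, end, context ID).
# Context 1 runs 300 us kernels, context 2 runs 50 us kernels.
schedule = [(0, 300, 1), (300, 350, 2), (350, 650, 1), (650, 700, 2)]

fine = account(schedule, poll_interval_us=10, duration_us=700)
coarse = account(schedule, poll_interval_us=200, duration_us=700)
print(fine)    # {1: 600, 2: 100}
print(coarse)  # {1: 700}
```

With a 10 µs interval the estimate matches the schedule. With a 200 µs interval, which is longer than context 2's kernels, context 2 is never observed and all time is charged to context 1.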

SLIDE 15

Scheduling

Scheduling thread queries accounting thread
Updates fair queuing counters
Stops applications if necessary
Different metric than NEON
  NEON: Average kernel length
  LoGA: Total GPU time consumed
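
One way to picture the scheduling step is the sketch below. It is a simplified illustration, not the LoGA scheduler: the function name `update_and_pick_stopped`, the per-application counter dictionary, and the rule "stop an application once it is more than a threshold ahead of the least-served one" are all assumptions chosen to show how total consumed GPU time can drive a fair-queuing decision.

```python
def update_and_pick_stopped(consumed_us, new_usage_us, threshold_us):
    """Add newly accounted GPU time and return the applications to stop.

    consumed_us:  running total of GPU time per application (updated in place)
    new_usage_us: GPU time reported by the accounting thread since the last query
    threshold_us: how far an application may run ahead of the least-served one
    """
    for app, used in new_usage_us.items():
        consumed_us[app] = consumed_us.get(app, 0) + used
    floor = min(consumed_us.values())
    return sorted(
        app for app, total in consumed_us.items() if total - floor > threshold_us
    )


counters = {"app1": 0, "app2": 0}

# First query: app1 consumed far more GPU time than app2.
print(update_and_pick_stopped(counters, {"app1": 900, "app2": 100}, 500))
# ['app1']

# Second query: app2 caught up while app1 was stopped.
print(update_and_pick_stopped(counters, {"app2": 600}, 500))
# []
```

Because the counters hold total GPU time rather than an average kernel length, an application that submits many short kernels is charged the same as one that submits a few long kernels with the same total.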

SLIDE 16

Benchmark Scenario

Accounting overhead
Accuracy of accounting (→ scheduling quality)
Competing workload: Throttle
  Creates well-defined GPU load

[Figure: timeline alternating between run and sleep phases]
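
A run/sleep workload of this kind can be parameterized with simple duty-cycle arithmetic. The sketch below assumes, hypothetically, that Throttle alternates a fixed-length run phase with a sleep phase; the function name and the 1 ms run phase are made up for illustration and are not taken from the talk.

```python
def sleep_time_us(run_us, load_percent):
    """Sleep time that yields the target load for a given run phase.

    load = run / (run + sleep)  =>  sleep = run * (100 - load) / load
    Integer arithmetic is used to avoid floating-point rounding.
    """
    if not 0 < load_percent <= 100:
        raise ValueError("load_percent must be in (0, 100]")
    return run_us * (100 - load_percent) // load_percent


print(sleep_time_us(1000, 1))   # 99000
print(sleep_time_us(1000, 20))  # 4000
```

Under this assumption, a 1% load with a 1 ms run phase means sleeping for 99 ms between runs.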

SLIDE 17

Accounting Overhead (10% load)

[Figure: bar chart of normalized runtime (y-axis from 0.7 to 1.6) for NEON and LoGA. Benchmarks: backprop, bfs, b+tree, cfd, dwt2d, gaussian, heartwall, hotspot, hotspot3D, huffman, hybridsort, kmeans, lavaMD, leukocyte, lud, mummergpu, myocyte, nn, nw, particlefilter, pathfinder, srad_v1, srad_v2, streamcluster, and the geometric mean]

Normalized against execution time w/o accounting
10 instances of throttle, 1% load each
Scheduling disabled
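
For reference, the metric on this chart can be computed as in the sketch below. The runtimes are made-up placeholder values, not measurements from the talk, and the helper names are chosen for this example.

```python
import math


def normalized_runtime(with_accounting_s, baseline_s):
    """Runtime with accounting divided by runtime without accounting."""
    return with_accounting_s / baseline_s


def geometric_mean(values):
    """Geometric mean, the usual way to average normalized ratios."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Placeholder (runtime with accounting, baseline runtime) pairs in seconds.
runs = {"bench_a": (12.0, 10.0), "bench_b": (5.0, 5.0)}

ratios = [normalized_runtime(w, b) for w, b in runs.values()]
print(ratios)                  # [1.2, 1.0]
print(geometric_mean(ratios))
```

A value of 1.0 means accounting adds no slowdown; values above 1.0 are overhead.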

SLIDE 18

Accounting Overhead (90% load)

[Figure: bar chart of normalized runtime (y-axis from 0.7 to 1.6) for NEON and LoGA. Benchmarks: backprop, bfs, b+tree, cfd, dwt2d, gaussian, heartwall, hotspot, hotspot3D, huffman, hybridsort, kmeans, lavaMD, leukocyte, lud, mummergpu, myocyte, nn, nw, particlefilter, pathfinder, srad_v1, srad_v2, streamcluster, and the geometric mean]

Normalized against execution time w/o accounting
10 instances of throttle, 9% load each
Scheduling disabled

SLIDE 19

Fairness

Goal: LoGA should not reduce fairness
Problem: Which application runtime is fair?
Measure total GPU time for each application
Calculate optimal scheduling
  Next slide: Speedup over fair schedule
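
The slides do not say how the optimal schedule is calculated. As one possible reading, the sketch below uses an idealized processor-sharing model: every application is assumed to always have GPU work pending, and the GPU is split equally among the applications still running. That assumption, the function names, and the numbers are all illustrative and not taken from the talk.

```python
def fair_share_finish_times(gpu_time):
    """Finish times under ideal equal sharing of the GPU.

    gpu_time maps each application to its measured total GPU time.
    Applications finish in order of increasing demand; while `active`
    applications remain, each receives 1/active of the GPU.
    """
    finish = {}
    now = 0.0
    served = 0.0  # GPU time every still-active application has received
    active = len(gpu_time)
    for app, need in sorted(gpu_time.items(), key=lambda kv: kv[1]):
        now += (need - served) * active
        served = need
        finish[app] = now
        active -= 1
    return finish


def speedup_over_fair(fair_finish, measured_finish):
    """Values above 1 mean the application finished earlier than its fair time."""
    return {app: fair_finish[app] / measured_finish[app] for app in fair_finish}


fair = fair_share_finish_times({"a": 1.0, "b": 3.0})
print(fair)  # {'a': 2.0, 'b': 4.0}
print(speedup_over_fair(fair, {"a": 1.0, "b": 4.0}))  # {'a': 2.0, 'b': 1.0}
```

In the usage example, application `a` ran as if it had the GPU to itself, so its speedup over the fair schedule is 2.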

SLIDE 20

Fairness: Results (4x throttle, 20% load each)

[Figure: bar chart of speedup over fair schedule (y-axis from 0.5 to 5) for No scheduling, NEON, LoGA, and LoGA-CL. Benchmarks: backprop, bfs, b+tree, cfd, dwt2d, gaussian, heartwall, hotspot3D, hotspot, huffman, hybridsort, and kmeans]

SLIDE 21

Conclusion

Sharing beneficial if applications do not saturate the GPU
Scheduler interference reduces sharing
LoGA accounts GPU usage without overhead
  Poll GPU status registers, detect context switches
  Time between context switches as input for fair queuing

SLIDE 22

Finding Registers

Envytools project has documented some registers
  Unfortunately, far from complete
Trial and error:
  Run workload with known behavior
  Dump register values
  See which registers correlate with workload behavior
Two registers identified:
  ID of currently running GPU context
  Activity status of entire GPU
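
The trial-and-error search described above can be sketched as a correlation test between register dumps and the known workload pattern. The register offsets, the dump format (one dictionary per sampling step), and the 0.9 threshold are hypothetical values for illustration; they are not real register addresses and do not come from Envytools or from the talk.

```python
def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def candidate_registers(dumps, workload, min_corr=0.9):
    """Return offsets whose values track the known workload behavior."""
    result = []
    for offset in sorted(dumps[0]):
        series = [dump[offset] for dump in dumps]
        if abs(pearson(series, workload)) >= min_corr:
            result.append(offset)
    return result


# Known workload behavior per sampling step: 1 = GPU busy, 0 = GPU idle.
workload = [0, 1, 1, 0, 1, 0]

# Made-up register dumps taken at the same steps (offset -> value).
dumps = [
    {0x100: 0, 0x104: 7, 0x108: 3},
    {0x100: 1, 0x104: 7, 0x108: 9},
    {0x100: 1, 0x104: 7, 0x108: 1},
    {0x100: 0, 0x104: 7, 0x108: 4},
    {0x100: 1, 0x104: 7, 0x108: 2},
    {0x100: 0, 0x104: 7, 0x108: 8},
]

print([hex(r) for r in candidate_registers(dumps, workload)])  # ['0x100']
```

Only the register whose value follows the busy/idle pattern is reported; the constant register and the unrelated one are filtered out. A register holding a context ID would need a workload signal that encodes which application is running rather than a busy/idle flag.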