 
LoGA: Low-Overhead GPU Accounting Using Events
Jens Kehne, Stanislav Spassov, Marius Hillenbrand, Marc Rittinghaus, Frank Bellosa
10th ACM International Systems and Storage Conference
Operating Systems Group, Karlsruhe Institute of Technology (KIT)
www.kit.edu
KIT – The Research University in the Helmholtz Association

GPU Sharing
GPUs increasingly popular in computing
Not every application saturates a GPU
  [Diagram over a time axis]
Move GPUs to the cloud
Sharing increases cost-efficiency
Problem: Need fairness
Software scheduling is inefficient
22.05.17 LoGA: Low-Overhead GPU Accounting Using Events, Operating Systems Group, Jens Kehne, Stanislav Spassov, Marius Hillenbrand, Marc Rittinghaus, Frank Bellosa, Karlsruhe Institute of Technology

NEON (University of Rochester, ASPLOS '14)
Applies fair queuing to GPUs
[Animated diagram, labels: NEON, App 1, App 2; later builds add a "Threshold" marker and a STOP sign]

NEON: Accounting Problem
NEON's accounting disables GPU access
[Diagram labels: STOP, GPU idle!, Sampling Phase, Freerun Phase]
→ High accounting overhead if application does not saturate the GPU

Idea
GPUs have lots of status registers for the device driver
  Leak information about GPU's internal state
  Reading has no effect on GPU driver or running application
Idea: Poll status registers
  Infer GPU-internal context switches

Design
[Diagram labels: App, Scheduler, Accounting thread; arrows: Submit, Query, Poll, STOP]

Accounting Thread
[Animated diagram: the accounting thread repeatedly polls the status register while App 1 and App 2 run]
Polling interval must be shorter than kernel length
  → High CPU load
  → Poll periodically
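
The accounting idea on this slide can be illustrated with a minimal sketch. It assumes the polled register values have already been collected as (timestamp, context ID) pairs; the sample format, the `IDLE` marker, and the function name are hypothetical and not taken from the talk, and no real register access is shown.

```python
from collections import defaultdict

IDLE = None  # hypothetical marker: the GPU reports no running context


def account_gpu_time(samples):
    """Attribute GPU time to contexts from polled samples.

    samples: iterable of (timestamp_seconds, context_id) in time order.
    Each interval between two consecutive polls is charged to the context
    that was observed at the start of the interval. A change of context ID
    between two polls is counted as one inferred context switch.
    """
    usage = defaultdict(float)
    switches = 0
    prev_time, prev_ctx = None, IDLE
    for time, ctx in samples:
        if prev_time is not None:
            if prev_ctx is not IDLE:
                usage[prev_ctx] += time - prev_time
            if ctx != prev_ctx:
                switches += 1
        prev_time, prev_ctx = time, ctx
    return dict(usage), switches


if __name__ == "__main__":
    polled = [(0.0, 1), (1.0, 1), (2.0, 2), (3.0, IDLE), (4.0, 1), (5.0, 1)]
    print(account_gpu_time(polled))
```

In this sketch, a kernel that starts and finishes between two polls is never observed, which is why the polling interval has to be shorter than the kernel length.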

Scheduling
Scheduling thread queries accounting thread
  Updates fair queuing counters
  Stops applications if necessary
Different metric than NEON
  NEON: Average kernel length
  LoGA: Total GPU time consumed
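
A minimal sketch of how such a scheduler could use total GPU time as its fair queuing metric. The class name, the per-application counters, and in particular the rule "stop an application once it is more than a threshold ahead of the least-served application" are assumptions made for illustration; the slides do not specify the exact policy.

```python
class FairQueueScheduler:
    """Toy fair queuing bookkeeping based on total GPU time consumed."""

    def __init__(self, apps, threshold):
        # One counter per application, in seconds of GPU time (assumed unit).
        self.consumed = {app: 0.0 for app in apps}
        self.threshold = threshold

    def update(self, usage_delta):
        """Add GPU time reported by the accounting thread since the last query."""
        for app, gpu_time in usage_delta.items():
            self.consumed[app] += gpu_time

    def apps_to_stop(self):
        """Applications that are more than `threshold` ahead of the least-served one."""
        floor = min(self.consumed.values())
        return sorted(app for app, t in self.consumed.items()
                      if t - floor > self.threshold)


if __name__ == "__main__":
    sched = FairQueueScheduler(["app1", "app2"], threshold=1.0)
    sched.update({"app1": 2.5, "app2": 1.0})
    print(sched.apps_to_stop())
```

Because the counters accumulate measured GPU time rather than an average kernel length, an application with few long kernels and one with many short kernels are charged by the same measure.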

Benchmark Scenario
Accounting overhead
Accuracy of accounting (→ scheduling quality)
Competing workload: Throttle
  Creates well-defined GPU load
  [Diagram: alternating run and sleep phases]
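
A workload that alternates between running and sleeping produces a defined load through its duty cycle. The sketch below only computes the phase lengths; the function name and the fixed-period model are assumptions, since the slides do not describe how Throttle is implemented.

```python
def duty_cycle(load, period):
    """Split one period into (run, sleep) durations for a target load.

    load:   target GPU load as a fraction between 0.0 and 1.0
    period: length of one run/sleep cycle in seconds (assumed parameter)
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    run = load * period
    return run, period - run


if __name__ == "__main__":
    run, sleep = duty_cycle(0.25, 1.0)
    print(f"run {run} s, sleep {sleep} s")
```

Under this model, several instances with the same per-instance load add up to the total load named in the benchmark titles, as long as the GPU is not saturated.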

Accounting Overhead (10% load)
10 instances of throttle, 1% load each
Scheduling disabled
[Figure: normalized runtime per benchmark for NEON and LoGA, normalized against execution time without accounting; y-axis from 0.7 to 1.6; benchmark names on the x-axis not recoverable]

Accounting Overhead (90% load)
10 instances of throttle, 9% load each
Scheduling disabled
[Figure: normalized runtime per benchmark for NEON and LoGA, normalized against execution time without accounting; y-axis from 0.7 to 1.6; benchmark names on the x-axis not recoverable]

Fairness
Goal: LoGA should not reduce fairness
Problem: Which application runtime is fair?
  Measure total GPU time for each application
  Calculate optimal scheduling
Next slide: Speedup over fair schedule
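
The slides do not say how the optimal schedule is calculated. As one possible illustration, the sketch below assumes an idealized model in which all applications start together, are always ready to run, and share the GPU equally (processor sharing). The function names and the model itself are assumptions, not the authors' method.

```python
def fair_finish_times(demands):
    """Finish times under idealized equal sharing of one GPU.

    demands: total GPU time each application needs, in seconds.
    While k applications are still unfinished, each receives 1/k of the GPU.
    Returns the finish times in the same order as `demands`.
    """
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    finish = [0.0] * len(demands)
    now, served, remaining = 0.0, 0.0, len(demands)
    for i in order:
        # All remaining applications advance together until app i is done.
        now += (demands[i] - served) * remaining
        served = demands[i]
        finish[i] = now
        remaining -= 1
    return finish


def speedup_over_fair(fair_runtime, measured_runtime):
    """Values above 1 mean the application finished faster than its fair share allows."""
    return fair_runtime / measured_runtime


if __name__ == "__main__":
    fair = fair_finish_times([3.0, 1.0, 2.0])
    print(fair, speedup_over_fair(fair[0], 3.0))
```

With this definition, a speedup of 1 corresponds to the fair schedule, and an application that runs as if it had the GPU to itself shows a speedup well above 1.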

Fairness: Results (4x throttle, 20% load each)
[Figure: speedup over fair schedule per benchmark for No scheduling, NEON, LoGA, and LoGA-CL; y-axis from 0 to 5; benchmark names on the x-axis not recoverable]

Conclusion
Sharing beneficial if applications do not saturate the GPU
  Scheduler interference reduces sharing
LoGA accounts GPU usage without overhead
  Poll GPU status registers, detect context switches
  Time between context switches as input for fair queuing

Finding Registers
Envytools project has documented some registers
  Unfortunately, far from complete
Trial and error:
  Run workload with known behavior
  Dump register values
  See which registers correlate with workload behavior
Two registers identified:
  ID of currently running GPU context
  Activity status of entire GPU
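
The trial-and-error search can be sketched as a comparison between register dumps and the known behavior of the workload. The sketch assumes the dumps are available as one list of sampled values per register and the workload as a list of active/inactive flags taken at the same points in time; the data layout and the register names in the example are hypothetical.

```python
def correlated_registers(dumps, activity):
    """Return registers whose value changes exactly when the workload changes state.

    dumps:    dict mapping a register name to its sampled values over time
    activity: list of booleans, True while the known workload was running,
              sampled at the same points in time as the dumps
    """
    def change_points(seq):
        return [i for i in range(1, len(seq)) if seq[i] != seq[i - 1]]

    expected = change_points(activity)
    return sorted(name for name, values in dumps.items()
                  if len(values) == len(activity)
                  and change_points(values) == expected)


if __name__ == "__main__":
    activity = [False, True, True, False]
    dumps = {
        "reg_a": [0, 5, 5, 0],   # follows the workload
        "reg_b": [1, 1, 1, 1],   # constant
        "reg_c": [0, 1, 2, 3],   # free-running counter
    }
    print(correlated_registers(dumps, activity))
```

An exact match of change points is a deliberately strict criterion; real dumps would likely need some tolerance for sampling jitter.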