Coz: Finding Code that Counts with Causal Profiling
Charlie Curtsinger, Emery D. Berger
Grinnell College, University of Massachusetts Amherst
Presented at SOSP 2015
Paper-Reading Group, 22.10.2015

Motivation
Conventional profilers:
● don't tell you what to optimize
● don't tell you how much an optimization would gain
● are not really made for concurrency

Idea
Perform performance experiments during program execution that measure how much faster the program would get if a specific line of code were sped up.
A line cannot actually be made faster at run time, so the speedup is virtual: whenever the selected line runs, all other threads are slowed down instead.

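The virtual-speedup idea can be checked with a small model. The sketch below assumes a deliberately simplified program: two threads that run in parallel and join at the end, with the selected line executed only by thread A. All function and parameter names are made up for illustration and are not part of Coz. It compares really shortening the line with delaying the other thread by the same amount and subtracting the inserted delay afterwards.

```python
def runtime_with_real_speedup(a_other, line_cost, visits, b_total, speedup):
    """Runtime if the selected line in thread A really ran `speedup` faster."""
    a_total = a_other + visits * line_cost * (1 - speedup)
    return max(a_total, b_total)  # both threads join at the end


def runtime_with_virtual_speedup(a_other, line_cost, visits, b_total, speedup):
    """Same prediction, obtained by delaying thread B instead of speeding up A."""
    delay = visits * line_cost * speedup      # total delay inserted into thread B
    a_total = a_other + visits * line_cost    # thread A runs unmodified
    measured = max(a_total, b_total + delay)  # what a stopwatch would show
    return measured - delay                   # subtract the inserted delay


if __name__ == "__main__":
    args = dict(a_other=40, line_cost=2, visits=10, b_total=50)
    for s in (0.0, 0.25, 0.5, 1.0):
        print(s,
              runtime_with_real_speedup(speedup=s, **args),
              runtime_with_virtual_speedup(speedup=s, **args))
```

In this toy model both functions agree for every speedup, and thread B caps the gain: once the line is 50% faster, speeding it up further no longer shortens the run.
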
Implementation
1) Startup: coz run --- <program> <args>
2) Experiment initialization: randomly choose a source line and a speedup (in 5% steps)
3) Apply virtual speedup: sample each thread every ms; apply the virtual speedup whenever the selected source line is running
4) End the experiment: after a predetermined time, provided the selected line was visited at least five times; otherwise double the experiment time
5) Produce profile: analyze all logged results, calculate speedups
6) Interpret causal profiles

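Steps 2 and 4 can be sketched in a few lines. This is only an illustration under stated assumptions, not Coz's actual code: the function names are hypothetical, the 0% to 100% range for the speedup is assumed (the slide only says "5% steps"), and both choices are drawn uniformly at random.

```python
import random

# Candidate virtual speedups in 5% steps; the 0%..100% range is an assumption.
SPEEDUPS = [pct / 100 for pct in range(0, 101, 5)]
MIN_VISITS = 5  # step 4: at least five visits to the selected line


def choose_experiment(source_lines, rng):
    """Step 2: pick a random source line and a random virtual speedup."""
    return rng.choice(source_lines), rng.choice(SPEEDUPS)


def next_duration_ms(duration_ms, visits):
    """Step 4: keep the duration if the line was visited often enough, else double it."""
    return duration_ms if visits >= MIN_VISITS else duration_ms * 2


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    line, speedup = choose_experiment(["a.c:10", "a.c:42", "b.c:7"], rng)
    print(line, speedup, next_duration_ms(100, visits=3))
```

Doubling the duration after too few visits gives rarely executed lines a longer window in later experiments, without slowing down experiments on hot lines.
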
Challenges
Blocking: who has to wait, and when?
→ I/O: simple (apply the delay after the I/O finishes)
→ Synchronization primitives: hard; a thread must not be punished twice

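One way to avoid punishing a thread twice is to count delays rather than apply them immediately. The sketch below roughly follows the counter scheme described in the Coz paper, under simplifying assumptions: the class and method names are hypothetical, threads are plain dictionary keys, and nothing actually sleeps. A global counter records how many delays are required; each thread tracks how many it has already paid or been credited.

```python
class DelayCounters:
    """Global count of required delays plus per-thread counts of delays already paid."""

    def __init__(self):
        self.global_delays = 0
        self.local = {}

    def register(self, tid):
        # A new thread starts with no debt.
        self.local[tid] = self.global_delays

    def selected_line_ran(self, tid):
        # Everyone else now owes one more delay; the running thread is credited.
        self.global_delays += 1
        self.local[tid] += 1

    def pay_pending(self, tid):
        # Called before unblocking another thread, or after blocking I/O finishes.
        owed = self.global_delays - self.local[tid]
        self.local[tid] = self.global_delays
        return owed  # number of delays this thread has to sit out now

    def woken_by_other_thread(self, tid):
        # The waking thread already paid these delays, so the woken thread skips them.
        self.local[tid] = self.global_delays


if __name__ == "__main__":
    c = DelayCounters()
    for tid in ("A", "B", "C"):
        c.register(tid)
    c.selected_line_ran("A")
    c.selected_line_ran("A")        # B is blocked on a lock held by C meanwhile
    print(c.pay_pending("C"))       # C pays before unlocking
    c.woken_by_other_thread("B")
    print(c.pay_pending("B"))       # B owes nothing extra
```

In the demo, thread C pays both delays before releasing the lock, so thread B, which was waiting on C, has already been slowed down and is not charged again. A thread returning from blocking I/O would instead call `pay_pending`, because nobody paid on its behalf.
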
Results (dedup)
While loop in function hashtable_search
Speedup: 9%

Results (ferret)
Improvement opportunities in three of the four threads
Speedup: 21%

Results (SQLite)
Remove unnecessary 'vtable' calls
Speedup: 26%

Results (more)
Benchmark      Cause                       Improvement
fluidanimate   Custom barrier              37.5%
streamcluster  Custom barrier              68%
memcached      'false' lock sharing        9%
blackscholes   Common subexpressions       2.6%
swaptions      Inefficient array handling  15%

Efficiency
[Figure not preserved]

Discussion
● Cool idea
● How long do you have to sample for reasonable coverage of the 148k lines of SQLite? (1 ms per sample, five visits per line, so 500 s? How good is the coverage after that?)

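A back-of-the-envelope version of this estimate, assuming (optimistically) that every 1 ms sample counts as one visit and always lands on a line that still needs one. The function name is hypothetical. Taken literally, the numbers from the question give about 740 s, the same order of magnitude as the 500 s guess.

```python
def min_sampling_time_s(num_lines, visits_per_line, sample_period_ms):
    """Lower bound on sampling time if no sample is ever wasted."""
    return num_lines * visits_per_line * sample_period_ms / 1000


if __name__ == "__main__":
    # 148k lines, five visits each, one sample per millisecond
    print(min_sampling_time_s(148_000, 5, 1))
```

Real runs would need longer than this bound, since samples concentrate on hot lines and many lines are rarely or never executed.
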