
Limits of Parallel Marking Garbage Collection



  1. Limits of Parallel Marking Garbage Collection: how parallel can a GC become?
     Dr. Fridtjof Siebert, CTO, aicas
     ISMM 2008, Tucson, 7 June 2008

  2. Introduction
     Parallel hardware is becoming the norm
     ● even for embedded computers
     ● even for real-time systems
     We need parallel garbage collection
     ● that is not only optimized for maximum throughput
     ● but that gives guarantees on its performance
     ● the worst-case GC timing must be predictable and fast

  3. Terminology (one diagram, built up over slides 3 to 9)
     ● Blocking GC: timeline showing cycle 1 and cycle 2
     ● Incremental GC: timeline showing cycle 1 and cycle 2
     ● Concurrent GC: CPU 1: Application, CPU 2: GC, CPU 3: Application
     ● Parallel GC: CPU 1, CPU 2 and CPU 3 each work on cycle 1 and then on cycle 2
     ● Parallel & Concurrent: CPU 1: Application, CPU 2: GC, CPU 3: GC
     ● Parallel & Concurrent (second variant): timelines for CPU 1, CPU 2 and CPU 3

  10. Parallel Mark & Sweep: Incremental Mark & Sweep
      ● uses three-color marking: white, grey and black
      ● a mark phase step is
        ● take a grey object o
        ● mark all white objects referenced by o grey
        ● mark o black
      ● a sweep phase step is
        ● take a white object
        ● free its memory
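The two kinds of step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the JamaicaVM implementation: the heap is assumed to be a dict that maps each object to the list of objects it references, and the names `mark_step`, `sweep_step` and `collect` are made up for this sketch.

```python
from enum import Enum


class Color(Enum):
    WHITE = "white"
    GREY = "grey"
    BLACK = "black"


def mark_step(heap, color, grey):
    """One mark step: take a grey object, shade its white referents, blacken it."""
    if not grey:
        return False
    o = grey.pop()
    for ref in heap[o]:
        if color[ref] is Color.WHITE:
            color[ref] = Color.GREY
            grey.append(ref)
    color[o] = Color.BLACK
    return True


def sweep_step(heap, color, white):
    """One sweep step: take a white object and free its memory."""
    if not white:
        return False
    obj = white.pop()
    del heap[obj]
    del color[obj]
    return True


def collect(heap, roots):
    """Run one full cycle step by step; returns the list of freed objects."""
    color = {obj: Color.WHITE for obj in heap}
    grey = []
    for r in roots:
        if color[r] is Color.WHITE:
            color[r] = Color.GREY
            grey.append(r)
    while mark_step(heap, color, grey):
        pass
    white = [obj for obj in heap if color[obj] is Color.WHITE]
    freed = list(white)
    while sweep_step(heap, color, white):
        pass
    return freed
```

In an incremental collector the two `while` loops would be spread over many small increments interleaved with the application; they run back to back here only to keep the sketch short.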

  11. Parallel Mark & Sweep: Parallel Sweep Steps
      ● not addressed here
      ● sweeping can be performed fully in parallel by
        ● sweeping different regions of the heap by different CPUs
      ● need parallel access to the free lists
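Although the talk does not address sweeping, the idea of sweeping regions independently can be illustrated as follows. This is only a sketch under assumptions of mine: the heap is modelled as a list of cell identifiers, `marked` is the set of surviving cells, Python threads stand in for CPUs, and each worker builds its own free list that is merged at the end (one possible way to avoid contention on a shared free list, not something the slide prescribes).

```python
from concurrent.futures import ThreadPoolExecutor


def sweep_region(region, marked):
    """Return the unmarked (white) cells of one heap region."""
    return [cell for cell in region if cell not in marked]


def parallel_sweep(heap_cells, marked, workers=3):
    """Sweep contiguous regions of the heap with one worker per region."""
    size = max(1, -(-len(heap_cells) // workers))  # ceiling division
    regions = [heap_cells[i:i + size] for i in range(0, len(heap_cells), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the per-region free lists in region order.
        local_free_lists = list(
            pool.map(sweep_region, regions, [marked] * len(regions))
        )
    # Merge the per-region free lists into a single free list.
    return [cell for free_list in local_free_lists for cell in free_list]
```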

  12. Parallel Mark & Sweep: Parallel Mark
      ● several threads may scan grey objects in parallel
      ● new color anthracite for a grey object that is being scanned by one CPU
      ● stalls are possible if the grey set is temporarily empty!
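A possible shape for such a parallel mark phase is sketched below. It is an illustration under my own assumptions, not the collector from the talk: the heap is a dict from object to referenced objects, Python threads stand in for CPUs, and a single condition variable protects the colors and the grey set. An object is colored anthracite while exactly one worker scans it, and a worker that finds the grey set empty while another scan is still in progress counts a stall and waits.

```python
import threading
from enum import Enum


class Color(Enum):
    WHITE = 0
    GREY = 1
    ANTHRACITE = 2  # grey object currently being scanned by one worker
    BLACK = 3


def parallel_mark(heap, roots, workers=3):
    """Mark everything reachable from roots; returns (colors, stall count)."""
    color = {obj: Color.WHITE for obj in heap}
    grey = []
    for r in roots:
        if color[r] is Color.WHITE:
            color[r] = Color.GREY
            grey.append(r)
    cond = threading.Condition()
    state = {"scanning": 0, "stalls": 0}

    def worker():
        while True:
            with cond:
                while not grey:
                    if state["scanning"] == 0:
                        cond.notify_all()  # marking is complete
                        return
                    state["stalls"] += 1   # grey set temporarily empty
                    cond.wait()
                o = grey.pop()
                color[o] = Color.ANTHRACITE
                state["scanning"] += 1
            refs = list(heap[o])           # scan outside the lock
            with cond:
                for ref in refs:
                    if color[ref] is Color.WHITE:
                        color[ref] = Color.GREY
                        grey.append(ref)
                color[o] = Color.BLACK
                state["scanning"] -= 1
                cond.notify_all()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return color, state["stalls"]
```

The stall count depends on thread scheduling, so it varies from run to run; only the final coloring is deterministic.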

  13. Worst Case: Linked List (animation over slides 13 to 20: the root references a linked list that CPU1, CPU2 and CPU3 try to mark)
      ● CPU1 starts a mark step
      ● CPU2: no grey object, stalls!
      ● CPU3: no grey object, stalls!
      ● CPU1: mark step finished
      ● all CPUs compete for one grey object!
      ● e.g., CPU2 is successful, CPU1 and CPU3 stall!

  21. Worst Case: Linked List
      With n CPUs performing the mark in parallel
      ● there might be n-1 stalls for each mark step
      ● only one CPU is performing a mark step at any time
      Worst-case performance is equal to non-parallel GC!
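As a worked instance of this claim, assume a linked list of m objects (m is my notation, not the slides'), one mark step per object, and n CPUs that take their steps in lock step. Under these assumptions:

```latex
\begin{align*}
T_1 &= m && \text{mark steps on a single CPU} \\
T_n &= m && \text{steps on } n \text{ CPUs, since only one object is grey at a time} \\
\text{stalls} &= (n - 1)\, m && \text{the other } n - 1 \text{ CPUs stall in every step} \\
\text{speedup} &= T_1 / T_n = 1
\end{align*}
```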

  22. Can we find a better limit for real applications?
      First, look at two-processor parallel mark only
      ● what if the memory graph consists of two linked lists?

  23. Two Linked Lists with two CPUs
      ● we might be lucky and see no stalls

  24. Two Linked Lists with two CPUs
      ● but we might have bad luck: one list is scanned first, and there is a single linked list left!

  25. Limit on stalls depends on object depth
      [diagram: memory graph reachable from the root, with objects labelled by numbers from 1 to 13, marked by CPU1 and CPU2]

  26. Limit on stalls depends on object depth (2 processors)
      ● after the 1st stall, all objects with depth ≤ 1 are black
      ● after the 2nd stall, all objects with depth ≤ 2 are black
      ● etc.
      ● after the nth stall, all objects with depth ≤ n are black
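This invariant can be checked on small graphs with a lock-step simulation. The sketch below rests on assumptions of mine: the heap is a dict from object to referenced objects, the root object counts as depth 1, all CPUs take one mark step per round, and a round counts as a stall round when fewer grey objects than CPUs are available. The function names are hypothetical.

```python
from collections import deque


def max_depth(heap, root):
    """Largest shortest-path depth of any reachable object (root has depth 1)."""
    depth = {root: 1}
    queue = deque([root])
    while queue:
        o = queue.popleft()
        for ref in heap[o]:
            if ref not in depth:
                depth[ref] = depth[o] + 1
                queue.append(ref)
    return max(depth.values())


def stall_rounds(heap, root, cpus=2):
    """Lock-step parallel mark; counts rounds in which at least one CPU stalls."""
    grey, seen = [root], {root}
    stalls = 0
    while grey:
        batch, grey = grey[:cpus], grey[cpus:]
        if len(batch) < cpus:
            stalls += 1          # some CPU found no grey object this round
        for o in batch:          # each busy CPU scans one object
            for ref in heap[o]:
                if ref not in seen:
                    seen.add(ref)
                    grey.append(ref)
    return stalls
```

With two CPUs a stall round contains exactly one stalled CPU, so the value returned by `stall_rounds` is the number of stalls, and it never exceeds `max_depth` in this model.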

  27. Limit on stalls depends on object depth (2 processors)
      The number of stalls s of a two-processor parallel mark is limited by the maximum depth of the memory graph H:
      s ≤ depth(H)

  28. Generalization for more processors
      The number of stalls s of a p-processor parallel mark is limited by:
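The formula on this slide is not preserved in the transcript. What follows is a reconstruction from the argument on the preceding slides, and it may differ in form from what was shown in the talk. It assumes that the CPUs take their mark steps in lock step and that depth(H) denotes the maximum depth of the memory graph H, counting the root object as depth 1.

```latex
% Whenever at least one CPU stalls, every grey object is being scanned,
% so the set of black objects advances by at least one depth level.
\begin{align*}
\#\{\text{steps with a stall}\} &\le \operatorname{depth}(H) \\
\#\{\text{stalled CPUs in such a step}\} &\le p - 1 \\
\Rightarrow \quad s &\le (p - 1) \cdot \operatorname{depth}(H)
\end{align*}
% For p = 2 this reduces to the two-processor limit s \le \operatorname{depth}(H).
```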

  29. Analysis and Measurements
      Instrumented the JamaicaVM Java implementation to
      ● measure the maximum depth of the heap graph,
      ● sample the current heap graph every 10,000 reference store operations, and
      ● output the maximum depths and the maximum ratios of depth / heap size (in number of objects)
      The instrumented VM was then used to run the SPECjvm98 benchmark suite
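The sampling scheme can be pictured with a toy model. This is not the JamaicaVM instrumentation: the class `DepthSampler`, its `store` method and the dict-based heap are assumptions made for illustration, stores only ever add references, and the root object counts as depth 1.

```python
from collections import deque


class DepthSampler:
    """Toy heap that samples its own depth every `interval` reference stores."""

    def __init__(self, root, interval=10_000):
        self.heap = {root: []}
        self.root = root
        self.interval = interval
        self.stores = 0
        self.max_depth = 0
        self.max_ratio = 0.0

    def store(self, source, target):
        """Record the reference store source -> target, sampling when due."""
        self.heap.setdefault(source, []).append(target)
        self.heap.setdefault(target, [])
        self.stores += 1
        if self.stores % self.interval == 0:
            self.sample()

    def sample(self):
        """Measure the current depth and update the running maxima."""
        depth = {self.root: 1}
        queue = deque([self.root])
        while queue:
            o = queue.popleft()
            for ref in self.heap[o]:
                if ref not in depth:
                    depth[ref] = depth[o] + 1
                    queue.append(ref)
        deepest = max(depth.values())
        self.max_depth = max(self.max_depth, deepest)
        # Relative depth: depth divided by heap size in number of objects.
        self.max_ratio = max(self.max_ratio, deepest / len(self.heap))
```

Because the graph is only inspected at sample points, a deeper structure that exists briefly between two samples would go unnoticed in this model.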

  30. Measurements: Maximum depths of SPECjvm98 benchmarks
      [bar chart: vertical axis from 0 to 1500; benchmarks check, jess, db, mpegaudio, jack, compress, raytrace, javac, mtrt]

  31. Measurements: Maximum relative depths of SPECjvm98 benchmarks
      [bar chart: vertical axis from 0.00% to 4.00%; benchmarks check, jess, db, mpegaudio, jack, compress, raytrace, javac, mtrt]

  32. Measurements: Worst-case scalability of SPECjvm98 benchmarks
      [two plots over 1, 2, 4, 8, 16, 32, 64, 128 and 256 CPUs, one with a linear axis from 0.0 to 1.0 and one with a logarithmic axis from 1 to 100; curves: ideal, check, compress, jess, raytrace, db, javac, mpegaudio, mtrt, jack, non-parallel]
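One plausible way to turn a stall limit into a scalability estimate is sketched below. It is my own model and not necessarily the one behind these plots: it assumes a heap of n objects with one mark step each, at most (p - 1) · d stalls for p CPUs and maximum depth d, and a stall that costs as much as one mark step. The worst-case time is then (n + (p - 1) · d) / p steps, which gives a speedup of p / (1 + (p - 1) · r) for the relative depth r = d / n.

```python
def worst_case_speedup(p, relative_depth):
    """Lower bound on mark-phase speedup for p CPUs under the model above."""
    return p / (1 + (p - 1) * relative_depth)


def worst_case_efficiency(p, relative_depth):
    """Speedup divided by the number of CPUs (1.0 would be ideal)."""
    return worst_case_speedup(p, relative_depth) / p


if __name__ == "__main__":
    # Relative depth of 4%, the top of the axis range on the previous slide.
    for p in (1, 2, 4, 8, 16, 32, 64, 128, 256):
        print(p, round(worst_case_speedup(p, 0.04), 2))
```

A relative depth of 1.0 corresponds to the linked-list worst case, where the model yields a speedup of 1 for any number of CPUs.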

  33. Conclusions
      In the general case, the mark phase of a garbage collector cannot be parallelized.
      However, if the depth of the memory graph is limited, a parallel mark phase generally works well.
      To give real-time guarantees on the performance of the mark phase, we need a guarantee from the application on its maximum heap depth.
