SLIDE 1 Derivation And Evaluation
David F. Bacon, Perry Cheng and Dave Grove IBM T.J. Watson Research Center
Martin T. Vechev University of Cambridge
SLIDE 2
Outline
Motivation and Benefits
New Generalizations
Abstract Algorithms
Practical Algorithms
Derivations
Evaluation
SLIDE 3
Motivation
Concurrent Collectors: Difficult to Construct Correctly
Initial Errors in Dijkstra and Steele Algorithms
Difficult to Understand
Difficult to Implement
No systematic comparisons, largely folklore
SLIDE 4
Contributions
Generalization Of Existing Mechanisms
Abstract Collectors Based on Generalizations
Precise, but inefficient
New Algorithm
Derived from the power of generalizations
Experimental Evaluation Of 4 Concurrent Collectors
SLIDE 5
Benefits of Generalization
[Figure: Steele, Dijkstra, Hybrid, and Yuasa compared by Memory Overhead (Floating Garbage) and Completion Time]
SLIDE 6
Assumptions
Single Collector Thread
Multiple Mutator Threads
Atomic Write Barrier
Non-Moving
SLIDE 7 Why Is It Hard?
Concurrent Interleaving
[Diagram: timeline with one heap snapshot; objects A, B, C, D; reference R1]
GC marks B
SLIDE 8 Why Is It Hard?
[Diagram: timeline with two heap snapshots; objects A, B, C, D; references R1 and R2]
GC marks B
Mutator creates R2
SLIDE 9 Why Is It Hard?
[Diagram: timeline with three heap snapshots; objects A, B, C, D; references R1 and R2; an X marks the removed R1]
GC marks B
Mutator creates R2
Mutator removes R1
SLIDE 10 Why Is It Hard?
[Diagram: timeline with four heap snapshots; objects A, B, C, D; references R1 and R2; an X marks the removed R1]
GC marks B
Mutator creates R2
Mutator removes R1
GC reclaims C and D while they are still live: WRONG!
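The interleaving on slides 7 to 10 can be replayed in a few lines. This is a minimal sketch, not the authors' code: the slides do not spell out which objects R1 and R2 connect, so the topology below (A and B are roots, R1 is A -> C, R2 is B -> C, and C -> D) is an assumption chosen to reproduce the four steps, and the helper names are hypothetical.

```python
def reachable_from(heap, roots):
    """Return every object reachable from the given roots."""
    seen, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(heap[obj])
    return seen


def interleaved_mark_without_barrier():
    # Assumed topology (not taken from the slides): A and B are roots,
    # R1 is the edge A -> C, and C -> D.
    heap = {"A": ["C"], "B": [], "C": ["D"], "D": []}
    roots = ["A", "B"]
    marked = set()

    def gc_scan(obj):
        # Mark obj and everything reachable from it at this instant.
        marked.update(reachable_from(heap, [obj]))

    gc_scan("B")             # GC marks B
    heap["B"].append("C")    # mutator creates R2 (B -> C)
    heap["A"].remove("C")    # mutator removes R1 (A -> C)
    gc_scan("A")             # GC resumes with the remaining root

    live = reachable_from(heap, roots)
    return marked, live


marked, live = interleaved_mark_without_barrier()
lost = live - marked
print(sorted(lost))  # ['C', 'D']
```

With no write barrier the collector never learns about R2, so C and D are reachable yet unmarked, which is exactly the objects a sweep would reclaim.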
SLIDE 11
Wavefront
[Diagram: objects A, B, C, D, E, F, G with references R1 and R2]
SLIDE 12
Protection
[Diagram: two panels over objects A, B, C, D, E, F, G with references R1 and R2; an X marks a removed reference]
Installation: Remember Crossing Pointer R1
Deletion: Remember R2
SLIDE 13
New generalizations
Precise Wavefront
Shade
Precise Counting Of Cross Pointers
Scanned Reference Count (S-RC)
SLIDE 14
Collector Progress (Shade)
[Diagram: objects A, B, C, D, E, F, G with reference R2; the SHADE value is shown beside A]
SHADE:0
SLIDE 15
Collector Progress (Shade)
SHADE:1
SLIDE 16
Collector Progress (Shade)
SHADE:2
SLIDE 17
Collector Progress (Shade)
SHADE:3
SLIDE 18
Shade Observations
Computed by Collector
Generalization of the tri-color abstraction
Different Granularities
Different Objects
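One way to read Shade is as a per-object counter of how many fields the collector has scanned so far, advancing 0, 1, 2, 3 as on slides 14 to 17. The sketch below is illustrative only: the `Obj` class, its method names, and the mapping back to white/grey/black are assumptions, not definitions from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Obj:
    fields: List[Optional["Obj"]]
    marked: bool = False
    shade: int = 0  # how many of this object's fields have been scanned

    def scan_next_field(self) -> Optional["Obj"]:
        """Move the wavefront past one more field and return its contents."""
        target = self.fields[self.shade]
        self.shade += 1
        return target

    def tricolor(self) -> str:
        """Collapse (marked, shade) to the classic three colors (assumed mapping)."""
        if not self.marked:
            return "white"
        return "black" if self.shade == len(self.fields) else "grey"


a = Obj(fields=[None, None, None], marked=True)
shades = [a.shade]
colors = [a.tricolor()]
while a.shade < len(a.fields):
    a.scan_next_field()
    shades.append(a.shade)
    colors.append(a.tricolor())
print(shades)  # [0, 1, 2, 3]
print(colors)  # ['grey', 'grey', 'grey', 'black']
```

The tri-color view loses the intermediate values 1 and 2; keeping the counter is what makes the wavefront precise at field granularity.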
SLIDE 19
Scanned Reference Count (S-RC)
[Diagram: objects A, B, C, D, E, F, G]
S-RC:0
SLIDE 20
Scanned Reference Count (S-RC)
S-RC:1
SLIDE 21
Scanned Reference Count (S-RC)
S-RC:2
SLIDE 22
Scanned Reference Count (S-RC)
S-RC:1 (X marks a removed reference)
SLIDE 23
Scanned Reference Count (S-RC)
S-RC:0 (X marks a removed reference)
SLIDE 24
Scanned Reference Count (S-RC)
S-RC:0
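The 0, 1, 2, 1, 0 progression on slides 19 to 24 can be sketched as a write barrier that counts only references held in already-scanned slots. This is an illustration under stated assumptions: the `Obj` class and `write` function are hypothetical, "scanned" is modeled as "field index below the holder's shade", and only mutator stores adjust the count.

```python
class Obj:
    def __init__(self, name, nfields):
        self.name = name
        self.fields = [None] * nfields
        self.shade = 0   # fields [0, shade) are behind the wavefront
        self.src = 0     # scanned reference count (S-RC)


def write(holder, index, new):
    """Store `new` into holder.fields[index], keeping S-RC precise."""
    if index < holder.shade:          # the slot has already been scanned
        old = holder.fields[index]
        if old is not None:
            old.src -= 1              # a scanned reference disappears
        if new is not None:
            new.src += 1              # a scanned reference appears
    holder.fields[index] = new


h1, h2, target = Obj("H1", 1), Obj("H2", 1), Obj("T", 0)
h1.shade = h2.shade = 1               # pretend both holders are fully scanned
history = [target.src]
for holder, value in [(h1, target), (h2, target), (h1, None), (h2, None)]:
    write(holder, 0, value)
    history.append(target.src)
print(history)  # [0, 1, 2, 1, 0]

unscanned = Obj("U", 1)               # shade 0: the collector will still visit it
write(unscanned, 0, target)
print(target.src)  # 0
```

A store into a slot ahead of the wavefront leaves S-RC untouched because the collector will still see that reference when it scans the slot.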
SLIDE 25
Outline
Motivation and Benefits
New Generalizations
Abstract Algorithms
Practical Algorithms
Derivations
Evaluation
SLIDE 26
Abstract Algorithms
Utilize Shade and S-RC
Installation-Based and Deletion-Based
Mutator nominates candidates
Does not mark objects
SLIDE 27 Concurrent System Structure
COLLECT() {
    do {
        mark();
        processNominated();
    } while (!finished);
}
MUTATE(obj, field, target) {
    nominate(target);
}
SLIDE 28 Mutator Nominates (Installation)
[Diagram: objects A, B, C, D, E, F, G and the NOMINATED OBJECT BUFFER]
A: S-RC:0
NOMINATED OBJECT BUFFER: (empty)
SLIDE 29 Mutator Nominates (Installation)
A: S-RC:1
NOMINATED OBJECT BUFFER: A
SLIDE 30 Mutator Nominates (Installation)
A: S-RC:1
C: S-RC:1
NOMINATED OBJECT BUFFER: A, C
SLIDE 31 Mutator Nominates (Installation)
A: S-RC:0 (X marks the removed reference)
C: S-RC:1
NOMINATED OBJECT BUFFER: A, C
SLIDE 32 After Mark (Installation)
A: S-RC:0
C: S-RC:1
NOMINATED OBJECT BUFFER: A, C
COLLECT() {
    do {
        mark();
        processNominated();
    } while (!finished);
}
SLIDE 33 After Find (Installation)
Collector Marks Object C
A: S-RC:0
C: S-RC:1
NOMINATED OBJECT BUFFER: A, C
COLLECT() {
    do {
        mark();
        processNominated();
    } while (!finished);
}
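The installation walk-through above can be approximated with a single-threaded simulation in which mutator steps are interleaved by hand. This is a sketch under assumptions, not the authors' implementation: the `Obj` and `Collector` classes, the rule "process only nominated objects whose S-RC is still positive", and the use of one root with two slots are all hypothetical choices made to reproduce the outcome on the slides (A is nominated but dropped, C is nominated and marked).

```python
class Obj:
    def __init__(self, name, nfields=0):
        self.name = name
        self.fields = [None] * nfields
        self.marked = False
        self.shade = 0      # fields [0, shade) have been scanned
        self.src = 0        # scanned reference count


class Collector:
    def __init__(self, roots):
        self.worklist = list(roots)
        self.nominated = []

    def mutate(self, obj, index, target):
        """Mutator store: keep S-RC precise and nominate the target."""
        if index < obj.shade:                 # slot is behind the wavefront
            old = obj.fields[index]
            if old is not None:
                old.src -= 1
            if target is not None:
                target.src += 1
        obj.fields[index] = target
        if target is not None:
            self.nominated.append(target)     # nominate only; never mark here

    def mark(self):
        while self.worklist:
            obj = self.worklist.pop()
            obj.marked = True
            while obj.shade < len(obj.fields):
                child = obj.fields[obj.shade]
                obj.shade += 1
                if child is not None and not child.marked:
                    self.worklist.append(child)

    def process_nominated(self):
        """Queue nominated objects that still have S-RC > 0; report if any."""
        pending = [o for o in self.nominated if o.src > 0 and not o.marked]
        self.nominated.clear()
        self.worklist.extend(pending)
        return bool(pending)

    def collect(self):
        while True:                           # do { ... } while (!finished)
            self.mark()
            if not self.process_nominated():
                break


root, a, c = Obj("Root", 2), Obj("A"), Obj("C")
gc = Collector([root])
gc.mark()                    # the collector has already scanned the root
gc.mutate(root, 0, a)        # A nominated, S-RC(A) = 1
gc.mutate(root, 1, c)        # C nominated, S-RC(C) = 1
gc.mutate(root, 0, None)     # reference to A removed, S-RC(A) = 0
gc.collect()
print(a.marked, c.marked)  # False True
```

Because the mutator only nominates and the collector consults S-RC afterwards, A is not retained once its scanned reference is gone, which is the precision the abstract collector pays for with extra bookkeeping.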
SLIDE 34
Allocation
In Installation-Based Collectors
No difference
In Deletion-Based Collectors
Remembered Upon Allocation
SLIDE 35 Allocation
[Diagram: objects A, B, C, D, E, F, G]
NOMINATED OBJECT BUFFER: (empty)
SLIDE 36 Allocation
[Diagram: objects A, B, C, D, E, F, G and the newly allocated object N]
NOMINATED OBJECT BUFFER: N
N: S-RC:1
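The allocation rule can be sketched as a single branch. This is only an illustration: `Obj`, `allocate`, and the `deletion_based` flag are hypothetical names, and "remembered" is modeled here as "appended to the nominated buffer", which is how slide 36 appears to show it.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.marked = False


def allocate(name, nominated, deletion_based):
    """Create an object; a deletion-based collector remembers it at birth."""
    obj = Obj(name)
    if deletion_based:
        nominated.append(obj)   # remembered upon allocation
    return obj


buffer_install, buffer_delete = [], []
allocate("N", buffer_install, deletion_based=False)
n = allocate("N", buffer_delete, deletion_based=True)
print(len(buffer_install), [o.name for o in buffer_delete])  # 0 ['N']
```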
SLIDE 37
Outline
Motivation and Benefits
New Generalizations
Abstract Algorithms
Practical Algorithms
Derivations
Evaluation
SLIDE 38
Practical Algorithms
Stacks
Non-Barriered Region
Scanned Object: behind wavefront
S-RC affected
Stack rescanning
S-RC and Shade compression (tri-color)
Reachability Effect
SLIDE 39
Compressing S-RC (sticky bit)
[Diagram: objects A, B, C, D, E, F, G]
S-RC:0
SLIDE 40
Compressing S-RC (sticky bit)
S-RC:1
SLIDE 41
Compressing S-RC (sticky bit)
S-RC:1 (X marks a removed reference)
SLIDE 42
Compressing S-RC (sticky bit)
S-RC:1
Object A – Unreachable but kept Alive
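The cost of the sticky bit can be seen by running the same install/remove sequence against a precise counter and a one-bit version. The two classes below are hypothetical and exist only for this comparison; the sticky variant assumes, as slides 39 to 42 suggest, that the bit is set on the first scanned reference and never cleared.

```python
class PreciseSRC:
    """Full scanned reference count."""
    def __init__(self):
        self.count = 0

    def inc(self):
        self.count += 1

    def dec(self):
        self.count -= 1

    def keeps_alive(self):
        return self.count > 0


class StickySRC:
    """S-RC compressed to one bit that saturates at 1."""
    def __init__(self):
        self.bit = False

    def inc(self):
        self.bit = True

    def dec(self):
        pass  # one bit cannot tell whether other scanned references remain

    def keeps_alive(self):
        return self.bit


precise, sticky = PreciseSRC(), StickySRC()
for counter in (precise, sticky):
    counter.inc()   # a reference is installed behind the wavefront
    counter.dec()   # the same reference is removed again
print(precise.keeps_alive(), sticky.keeps_alive())  # False True
```

The sticky version keeps the object for this cycle even though nothing refers to it, which is the floating garbage noted on slide 42.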
SLIDE 43
Compressing Shade
[Diagram: objects A, B, C, D, E, F, G; the SHADE value is shown beside A]
SHADE:0
SLIDE 44
Collector Progress (Shade)
SHADE:3
SLIDE 45
Collector Progress (Shade)
[Diagram: reference R1 added]
SHADE:3 PRECISE: 1
S-RC:1
SLIDE 46
Collector Progress (Shade)
[Diagram: X marks the removed reference R1]
SHADE:3 PRECISE: 1
S-RC:1 (NOT decremented)
SLIDE 47
Collector Progress (Shade)
SHADE:3 PRECISE: 1
Object C – Unreachable but kept Alive
SLIDE 48
Outline
Motivation and Benefits
New Generalizations
Abstract Algorithms
Practical Algorithms
Derivations
Evaluation
SLIDE 49 Deriving Dijkstra
INSTALLATION-BASED GC
  -> Stack Regions
RESCANNED STACKS
  -> Compress S-RC to sticky bit for ALL objects
INSTALLATION with 1-bit
  -> Compress Shade to 1-bit for ALL objects
DIJKSTRA
SLIDE 50 Deriving Yuasa
DELETION-BASED GC
  -> Stack Regions
RESCANNED STACKS
  -> Compress S-RC to 0-bit for ALL objects
DELETION with 0-bits
  -> NO S-RC needed => NO rescanning
Deletion with NO rescanning
  -> Compress Shade to 1-bit for ALL objects
YUASA
SLIDE 51 Deriving a New Collector (Hybrid)
DELETION-BASED GC
  -> Stack Regions
RESCANNED STACKS
  -> Compress S-RC to sticky bit for Allocated objects
  -> Compress S-RC to 0-bit for Existing objects
MIXED DELETION
  -> Compress Shade to 1-bit for ALL objects
HYBRID
SLIDE 52
Algorithms
[Figure: Steele, Dijkstra, Hybrid, and Yuasa compared by Memory Overhead (Floating Garbage) and Completion Time]
SLIDE 53 Evaluation
First Systematic Comparison Of Concurrent Collectors
IBM J9 Production Virtual Machine
J2ME Profile, microJit
512MB RAM, Pentium 4, 3GHz
Comparison in terms of Execution time And Space Overhead
Dijkstra, Steele, Yuasa, Hybrid
Benchmarks:
SpecJVM98 (-s100)
Work-Based Incremental Scheme
Collect 9K for every 6K allocated.
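The 9K-per-6K rule amounts to charging the mutator 1.5 units of collector work per unit allocated. A minimal sketch of such pacing follows; the `WorkBasedPacer` class is hypothetical, and since the slide does not say what "K" measures or how an increment is triggered, the quanta are treated as abstract units and the trigger as an assumption.

```python
ALLOC_QUANTUM = 6 * 1024   # allocation that triggers one increment (units assumed)
WORK_QUANTUM = 9 * 1024    # collector work owed per increment (units assumed)


class WorkBasedPacer:
    def __init__(self):
        self.allocated_since_increment = 0
        self.work_requested = 0

    def on_allocate(self, amount):
        """Record an allocation and request collector work when a quantum fills."""
        self.allocated_since_increment += amount
        while self.allocated_since_increment >= ALLOC_QUANTUM:
            self.allocated_since_increment -= ALLOC_QUANTUM
            self.work_requested += WORK_QUANTUM


pacer = WorkBasedPacer()
for _ in range(4):
    pacer.on_allocate(3 * 1024)      # 12K allocated in total
print(pacer.work_requested // 1024)  # 18
print(WORK_QUANTUM / ALLOC_QUANTUM)  # 1.5
```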
SLIDE 54 Maximum Space Usage
[Bar chart: Maximum Space per benchmark (jess, db, javac, mtrt, jack, geomean) for YUASA, DIJKSTRA, STEELE, and HYBRID; axis ticks from 0.2 to 1.6]
SLIDE 55 Execution Time
[Bar chart: End-to-End Time per benchmark (jess, db, javac, mtrt, jack, geomean) for YUASA, DIJKSTRA, STEELE, and HYBRID; axis ticks from 0.2 to 1.4]
SLIDE 56
Summary
Generalization Of Existing Mechanisms
S-RC, Shade
Abstract Collectors Based on Generalizations
Precise, but inefficient (S-RC, Shade)
New Concrete Algorithm
Combines good properties of Yuasa and Dijkstra
Suitable for Real-Time Domains
Experimental Evaluation
SLIDE 57
On-Going Work
More Transformations
Formal Proof Of Correctness
Transformations
Unified Abstract Collector
Formal Relation Between Algorithms
SLIDE 59
IF TIME PERMITS SLIDES
SLIDE 62
Abstract Object Layout
Barrier Reference Count Marked Shade Don’t Sweep Recorded
DATA
Computed By Mutator Computed By Collector
SLIDE 63 The Transitive Loss
[Diagram: timeline with four heap snapshots over objects A, B, C, D and pointers P1, P2; labels "unreachable" and "OK" annotate the later snapshots]
Collector Working
- Marks B as live
Thread Working
- Installs P2
Thread Working
Collector Working
- C was not seen
- Reclaims C
SLIDE 64 Common Structure Example
YUASA
WriteBarrier(Obj, field, New) {
    if (Phase == Tracing) {
        Old = Obj[field];   // pointer about to be overwritten
        Remember(Old);
    }
    Obj[field] = New;
}
DIJKSTRA
WriteBarrier(Obj, field, New) {
    if (Phase == Tracing) {
        Remember(New);      // pointer being installed
    }
    Obj[field] = New;
}
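The two barriers can be exercised side by side on a small heap. This is a sketch, not the evaluated implementation: the heap layout (A -> C, C -> D, with B already scanned), the `run` helper, and the modeling of `Remember` as "push onto a list the collector drains later" are assumptions made for the demonstration.

```python
def run(barrier):
    """Replay one interleaving with the chosen barrier; return the marked set."""
    # Assumed heap: A.f -> C, C.f -> D, B.f is empty; A and B are roots.
    heap = {"A": {"f": "C"}, "B": {"f": None}, "C": {"f": "D"}, "D": {"f": None}}
    marked, remembered = set(), []

    def write(obj, field, new):
        # The collector is tracing for the whole run, so the barrier is active.
        if barrier == "yuasa":
            old = heap[obj][field]
            if old is not None:
                remembered.append(old)    # Remember(Old)
        elif barrier == "dijkstra":
            if new is not None:
                remembered.append(new)    # Remember(New)
        heap[obj][field] = new

    def trace(start):
        stack = [start]
        while stack:
            obj = stack.pop()
            if obj not in marked:
                marked.add(obj)
                stack.extend(v for v in heap[obj].values() if v is not None)

    trace("B")                # collector scans B first
    write("B", "f", "C")      # mutator installs B -> C
    write("A", "f", None)     # mutator deletes A -> C
    trace("A")                # collector scans the remaining root
    while remembered:         # collector drains remembered objects
        trace(remembered.pop())
    return marked


print(sorted(run("none")))      # ['A', 'B']
print(sorted(run("yuasa")))     # ['A', 'B', 'C', 'D']
print(sorted(run("dijkstra")))  # ['A', 'B', 'C', 'D']
```

Both barriers rescue C and D here, but for different reasons: the Yuasa-style barrier catches the deleted reference from A, while the Dijkstra-style barrier catches the newly installed reference from B.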