Derivation And Evaluation of Concurrent Collectors, Martin T. Vechev

slide-1
SLIDE 1

Derivation And Evaluation of Concurrent Collectors

David F. Bacon, Perry Cheng and Dave Grove IBM T.J. Watson Research Center

Martin T. Vechev University of Cambridge

slide-2
SLIDE 2

Outline

  • Motivation and Benefits
  • New Generalizations
  • Abstract Algorithms
  • Practical Algorithms
  • Derivations
  • Evaluation

slide-3
SLIDE 3

Motivation

  • Concurrent Collectors:
      • Difficult to Construct Correctly
          • Initial Errors in Dijkstra and Steele Algorithms
      • Difficult to Understand
      • Difficult to Implement
  • No systematic comparisons, largely folklore

slide-4
SLIDE 4

Contributions

  • Generalization Of Existing Mechanisms
  • Abstract Collectors Based on Generalizations
      • Precise, but inefficient
  • New Algorithm
      • Derived from the power of generalizations
  • Experimental Evaluation Of 4 Concurrent GCs

slide-5
SLIDE 5

Benefits of Generalization

[Figure: the Steele, Dijkstra, Hybrid, and Yuasa collectors compared on Memory Overhead (Floating Garbage) and Completion Time.]

slide-6
SLIDE 6

Assumptions

  • Single Collector Thread
  • Multiple Mutator Threads
  • Atomic Write Barrier
  • Non-Moving

slide-7
SLIDE 7

Why Is It Hard?

  • Concurrent Interleaving

[Figure: timeline, step 1 of 4. GC marks B. Heap diagram with objects A, B, C, D and reference R1.]

slide-8
SLIDE 8

Why Is It Hard?

[Figure: timeline, steps 1 to 2 of 4. GC marks B; mutator creates R2. Heap diagrams with objects A, B, C, D and references R1, R2.]

slide-9
SLIDE 9

Why Is It Hard?

[Figure: timeline, steps 1 to 3 of 4. GC marks B; mutator creates R2; mutator removes R1 (marked X). Heap diagrams with objects A, B, C, D and references R1, R2.]

slide-10
SLIDE 10

Why Is It Hard?

[Figure: timeline, steps 1 to 4 of 4. GC marks B; mutator creates R2; mutator removes R1 (marked X); the final caption reads "GC reclaims C & D live – WRONG!". Heap diagrams with objects A, B, C, D and references R1, R2.]
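The interleaving on slides 7 to 10 can be replayed in a few lines. This is a minimal sketch, not the presenters' code: the heap shape (roots A and B, A pointing to C through R1, C pointing to D, and R2 installed from B to C) is an assumption, since the slide diagrams are not recoverable here, and the names `Obj`, `reachable`, and `replay_without_barrier` are hypothetical.

```python
class Obj:
    """A heap object with named pointer fields."""
    def __init__(self, name):
        self.name = name
        self.fields = {}


def reachable(roots):
    """Names of all objects reachable from the roots (ground truth)."""
    seen, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj.name not in seen:
            seen.add(obj.name)
            stack.extend(obj.fields.values())
    return seen


def replay_without_barrier():
    """Replay the four steps with no write barrier; return the lost objects."""
    heap = {name: Obj(name) for name in "ABCD"}
    heap["A"].fields["R1"] = heap["C"]    # assumed: A --R1--> C
    heap["C"].fields["next"] = heap["D"]  # assumed: C --> D
    roots = [heap["A"], heap["B"]]        # assumed roots
    marked = set()

    # Step 1: GC marks B and scans its (currently empty) fields.
    marked.add("B")
    # Step 2: mutator creates R2 from the already-scanned B to C.
    heap["B"].fields["R2"] = heap["C"]
    # Step 3: mutator removes R1, the only pointer the GC could still find.
    del heap["A"].fields["R1"]
    # Step 4: GC finishes marking from the remaining unscanned root A.
    worklist = [heap["A"]]
    while worklist:
        obj = worklist.pop()
        if obj.name not in marked:
            marked.add(obj.name)
            worklist.extend(obj.fields.values())

    # Anything live but unmarked would be reclaimed by mistake.
    return reachable(roots) - marked


lost = replay_without_barrier()
print(sorted(lost))
```

Under these assumptions C and D stay reachable through R2 yet are never marked, which is the kind of failure the slide labels WRONG.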

slide-11
SLIDE 11

Wavefront

[Figure: heap diagram with objects A, B, C, D, E, F, G and references R1, R2, showing the collector's wavefront.]

slide-12
SLIDE 12

Protection

[Figure: two heap diagrams over objects A to G with references R1 and R2, labelled Installation and Deletion. Captions: "Remember Crossing Pointer R1" and "Remember R2". One pointer is marked X.]

slide-13
SLIDE 13

New generalizations

  • Precise Wavefront
      • Shade
  • Precise Counting Of Cross Pointers
      • Scanned Reference Count (S-RC)

slide-14
SLIDE 14

Collector Progress (Shade)

[Figure: heap diagram with objects A to G and reference R2. Label: SHADE:0.]

slide-15
SLIDE 15

Collector Progress (Shade)

[Figure: heap diagram with objects A to G and reference R2. Label: SHADE:1.]

slide-16
SLIDE 16

Collector Progress (Shade)

[Figure: heap diagram with objects A to G and reference R2. Label: SHADE:2.]

slide-17
SLIDE 17

Collector Progress (Shade)

[Figure: heap diagram with objects A to G and reference R2. Label: SHADE:3.]

slide-18
SLIDE 18

Shade Observations

  • Computed by Collector
  • Generalization of the tri-color abstraction
  • Different Granularities
  • Different Objects
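One way to read "generalization of the tri-color abstraction" is sketched below. It assumes that an object's shade is the number of its fields the collector has scanned so far, which fits the SHADE:0 to SHADE:3 progression on the preceding slides but is not spelled out in them. The names `Obj`, `scan_next_field`, and `tricolor` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    fields: list = field(default_factory=list)
    marked: bool = False
    shade: int = 0  # assumed meaning: number of fields already scanned


def scan_next_field(obj):
    """Collector step: scan one more field and advance the shade."""
    target = obj.fields[obj.shade]
    obj.shade += 1
    return target


def tricolor(obj):
    """Collapse the precise shade into white / grey / black."""
    if not obj.marked:
        return "white"
    return "black" if obj.shade == len(obj.fields) else "grey"


a = Obj(fields=["B", "C", "D"])
colors = [tricolor(a)]          # not yet reached by the collector
a.marked = True
colors.append(tricolor(a))      # shade 0
for _ in range(3):
    scan_next_field(a)
    colors.append(tricolor(a))  # shade 1, 2, 3
print(colors)
```

The three tri-color states discard how far scanning has progressed inside a grey object; the integer shade keeps that information.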

slide-19
SLIDE 19

Scanned Reference Count (S-RC)

[Figure: heap diagram with objects A to G. Label: S-RC:0.]

slide-20
SLIDE 20

Scanned Reference Count (S-RC)

[Figure: heap diagram with objects A to G. Label: S-RC:1.]

slide-21
SLIDE 21

Scanned Reference Count (S-RC)

[Figure: heap diagram with objects A to G. Label: S-RC:2.]

slide-22
SLIDE 22

Scanned Reference Count (S-RC)

[Figure: heap diagram with objects A to G; one pointer is marked X. Label: S-RC:1.]

slide-23
SLIDE 23

Scanned Reference Count (S-RC)

[Figure: heap diagram with objects A to G; one pointer is marked X. Label: S-RC:0.]

slide-24
SLIDE 24

Scanned Reference Count (S-RC)

[Figure: heap diagram with objects A to G. Label: S-RC:0.]
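The S-RC progression on these slides (0, 1, 2, 1, 0) can be reproduced with a small model. It assumes that S-RC counts the pointers to an object held in fields the collector has already scanned, and that a field counts as scanned when its index is below the holder's shade. Both are assumptions, as are all names in the code.

```python
class Obj:
    def __init__(self, name, nfields=0):
        self.name = name
        self.fields = [None] * nfields
        self.shade = 0  # fields scanned so far (assumption)
        self.src = 0    # scanned reference count


def collector_scan_field(obj):
    """Collector: scan the next field and count its target, if any."""
    target = obj.fields[obj.shade]
    obj.shade += 1
    if target is not None:
        target.src += 1


def mutator_write(obj, index, new):
    """Mutator: overwrite a field, adjusting S-RC only behind the wavefront."""
    old = obj.fields[index]
    if index < obj.shade:  # the field has already been scanned
        if old is not None:
            old.src -= 1
        if new is not None:
            new.src += 1
    obj.fields[index] = new


def demo():
    a, b, c = Obj("A"), Obj("B", 1), Obj("C", 1)
    collector_scan_field(b)  # B and C are scanned while still empty
    collector_scan_field(c)
    history = [a.src]
    # Two installations of a pointer to A, then two deletions.
    for holder, value in [(b, a), (c, a), (b, None), (c, None)]:
        mutator_write(holder, 0, value)
        history.append(a.src)
    return history


print(demo())
```

Writes to fields ahead of the wavefront leave S-RC untouched in this model, because the collector will count them when it scans those fields.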

slide-25
SLIDE 25

Outline

  • Motivation and Benefits
  • New Generalizations
  • Abstract Algorithms
  • Practical Algorithms
  • Derivations
  • Evaluation

slide-26
SLIDE 26

Abstract Algorithms

  • Utilize Shade and S-RC
  • Installation-Based and Deletion-Based
  • Mutator nominates candidates
      • Does not mark objects

slide-27
SLIDE 27

Concurrent System Structure

COLLECT()
    do
        mark();
        processNominated();
    while (!finished);

MUTATE(obj, field, target)
    obj.field = target;
    nominate(target);

slide-28
SLIDE 28

Mutator Nominates (Installation)

[Figure: heap diagram with objects A to G. Label: S-RC:0. NOMINATED OBJECT BUFFER: empty.]

slide-29
SLIDE 29

Mutator Nominates (Installation)

[Figure: heap diagram with objects A to G. Label: S-RC:1. NOMINATED OBJECT BUFFER: A.]

slide-30
SLIDE 30

Mutator Nominates (Installation)

[Figure: heap diagram with objects A to G. Labels: S-RC:1, S-RC:1. NOMINATED OBJECT BUFFER: A, C.]

slide-31
SLIDE 31

Mutator Nominates (Installation)

[Figure: heap diagram with objects A to G; one pointer is marked X. Labels: S-RC:0, S-RC:1. NOMINATED OBJECT BUFFER: A, C.]

slide-32
SLIDE 32

After Mark (Installation)

[Figure: heap diagram with objects A to G. Labels: S-RC:0, S-RC:1. NOMINATED OBJECT BUFFER: A, C.]

COLLECT()
    do
        mark();
        processNominated();
    while (!finished);

slide-33
SLIDE 33

After Find (Installation)

  • Collector Marks Object C

[Figure: heap diagram with objects A to G. Labels: S-RC:0, S-RC:1. NOMINATED OBJECT BUFFER: A, C.]

COLLECT()
    do
        mark();
        processNominated();
    while (!finished);
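The installation walk-through above can be sketched as follows. The sketch assumes the two S-RC labels belong to A and C respectively, and that `processNominated()` marks a nominated object only if its S-RC is still positive. Neither is stated outright on the slides, and the class and method names are hypothetical.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.src = 0         # scanned reference count
        self.marked = False


class InstallationCollector:
    """Abstract installation-based collector: mutators only nominate."""

    def __init__(self):
        self.nominated = []  # the nominated object buffer

    def install(self, target):
        """Mutator: a scanned field now points at target; nominate it."""
        target.src += 1
        self.nominated.append(target)

    def delete(self, target):
        """Mutator: a scanned field no longer points at target."""
        target.src -= 1

    def process_nominated(self):
        """Collector: mark nominated objects still referenced from scanned fields."""
        newly_marked = []
        while self.nominated:
            obj = self.nominated.pop(0)
            if obj.src > 0 and not obj.marked:
                obj.marked = True
                newly_marked.append(obj.name)
        return newly_marked


collector = InstallationCollector()
a, c = Obj("A"), Obj("C")
collector.install(a)  # S-RC(A) = 1, buffer: A
collector.install(c)  # S-RC(C) = 1, buffer: A, C
collector.delete(a)   # S-RC(A) = 0, A stays in the buffer
marked_now = collector.process_nominated()
print(marked_now)
```

Because the mutator only nominates and the collector consults the precise count, A is not retained even though it was nominated, while C is marked.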

slide-34
SLIDE 34

Allocation

  • In Installation-Based Collectors
      • No difference
  • In Deletion-Based Collectors
      • Remembered Upon Allocation

slide-35
SLIDE 35

Allocation

[Figure: heap diagram with objects A to G. NOMINATED OBJECT BUFFER: empty.]

slide-36
SLIDE 36

Allocation

[Figure: heap diagram with objects A to G and a newly allocated object N. Label: S-RC:1. NOMINATED OBJECT BUFFER: N.]

slide-37
SLIDE 37

Outline

  • Motivation and Benefits
  • New Generalizations
  • Abstract Algorithms
  • Practical Algorithms
  • Derivations
  • Evaluation

slide-38
SLIDE 38

Practical Algorithms

  • Stacks
      • Non-Barriered Region
      • Scanned Object: behind wavefront
          • S-RC affected
      • Stack rescanning
  • S-RC and Shade compression (tri-color)
      • Reachability Effect

slide-39
SLIDE 39

Compressing S-RC (sticky bit)

[Figure: heap diagram with objects A to G. Label: S-RC:0.]

slide-40
SLIDE 40

Compressing S-RC (sticky bit)

[Figure: heap diagram with objects A to G. Label: S-RC:1.]

slide-41
SLIDE 41

Compressing S-RC (sticky bit)

[Figure: heap diagram with objects A to G; one pointer is marked X. Label: S-RC:1.]

slide-42
SLIDE 42

Compressing S-RC (sticky bit)

[Figure: heap diagram with objects A to G. Label: S-RC:1.]

Object A: unreachable but kept alive
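The effect of compressing S-RC to a sticky bit can be compared against the precise count in a short sketch. It assumes the sticky bit is set on any increment and never cleared on a decrement, which is what the unchanged S-RC:1 label after the deletion suggests. The class names are hypothetical.

```python
class PreciseCount:
    """Full scanned reference count."""
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1

    def decrement(self):
        self.value -= 1

    def keeps_alive(self):
        return self.value > 0


class StickyBit:
    """S-RC compressed to one bit that is set once and never cleared."""
    def __init__(self):
        self.bit = False

    def increment(self):
        self.bit = True

    def decrement(self):
        pass  # a single bit cannot tell whether other references remain

    def keeps_alive(self):
        return self.bit


def replay(counter):
    """A pointer is installed behind the wavefront, then deleted again."""
    counter.increment()
    counter.decrement()
    return counter.keeps_alive()


print(replay(PreciseCount()), replay(StickyBit()))
```

The precise count lets the object go, while the sticky bit retains it for this collection cycle as floating garbage.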

slide-43
SLIDE 43

Compressing Shade

[Figure: heap diagram with objects A to G. Label: SHADE:0.]

slide-44
SLIDE 44

Collector Progress (Shade)

[Figure: heap diagram with objects A to G. Label: SHADE:3.]

slide-45
SLIDE 45

Collector Progress (Shade)

[Figure: heap diagram with objects A to G and reference R1. Labels: SHADE:3, PRECISE: 1, S-RC:1.]

slide-46
SLIDE 46

Collector Progress (Shade)

[Figure: heap diagram with objects A to G and reference R1; one pointer is marked X. Labels: SHADE:3, PRECISE: 1, S-RC:1 (NOT decremented).]

slide-47
SLIDE 47

Collector Progress (Shade)

[Figure: heap diagram with objects A to G. Labels: SHADE:3, PRECISE: 1.]

Object C: unreachable but kept alive

slide-48
SLIDE 48

Outline

  • Motivation and Benefits
  • New Generalizations
  • Abstract Algorithms
  • Practical Algorithms
  • Derivations
  • Evaluation

slide-49
SLIDE 49

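Deriving Dijkstra

[Figure: derivation diagram. Each step below is listed as transformation: resulting collector.]

  • Start: INSTALLATION-BASED GC
  • Stack Regions: RESCANNED STACKS
  • Compress S-RC to sticky bit for ALL objects: INSTALLATION with 1-bit
  • Compress Shade to 1-bit for ALL objects: DIJKSTRA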

slide-50
SLIDE 50

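Deriving Yuasa

[Figure: derivation diagram. Each step below is listed as transformation: resulting collector.]

  • Start: DELETION-BASED GC
  • Stack Regions: RESCANNED STACKS
  • Compress S-RC to 0-bit for ALL objects: DELETION with 0-bits
  • NO S-RC needed => NO rescanning: Deletion with NO rescanning
  • Compress Shade to 1-bit for ALL objects: YUASA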

slide-51
SLIDE 51

Deriving a New Collector (Hybrid)

[Figure: derivation diagram. Each step below is listed as transformation: resulting collector.]

  • Start: DELETION-BASED GC
  • Stack Regions: RESCANNED STACKS
  • Compress S-RC to sticky bit for Allocated objects, compress S-RC to 0-bit for Existing objects: MIXED DELETION
  • Compress Shade to 1-bit for ALL objects: HYBRID

slide-52
SLIDE 52

Algorithms

[Figure: the Steele, Dijkstra, Hybrid, and Yuasa collectors compared on Memory Overhead (Floating Garbage) and Completion Time.]

slide-53
SLIDE 53

Evaluation

  • First Systematic Comparison Of Concurrent Collectors
  • IBM J9 Production Virtual Machine
      • J2ME Profile, microJit
      • 512MB RAM, Pentium 4, 3GHz
  • Comparison in terms of Execution time And Space Overhead
      • Dijkstra, Steele, Yuasa, Hybrid
  • Which benchmarks:
      • SpecJVM98 (-s100)
  • Work-Based Incremental Scheme
      • Collect 9K for every 6K allocated.
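The work-based incremental scheme in the last bullet can be sketched as a pacing rule tied to allocation. The sketch takes K to mean 1024 bytes and measures collector work in bytes traced. Both are assumptions, and `WorkBasedPacer` is a hypothetical name rather than anything from J9.

```python
ALLOC_QUANTUM = 6 * 1024  # assumed: 6K bytes allocated per collector increment
TRACE_QUANTUM = 9 * 1024  # assumed: 9K bytes of tracing work per increment


class WorkBasedPacer:
    """Hands the collector tracing work in proportion to allocation."""

    def __init__(self):
        self.pending = 0  # bytes allocated since the last increment

    def on_allocate(self, nbytes):
        """Return how many bytes the collector should trace right now."""
        self.pending += nbytes
        work = 0
        while self.pending >= ALLOC_QUANTUM:
            self.pending -= ALLOC_QUANTUM
            work += TRACE_QUANTUM
        return work


pacer = WorkBasedPacer()
schedule = [pacer.on_allocate(n) for n in (4096, 4096, 12288)]
print(schedule)
```

With these constants the collector traces 1.5 bytes for every byte allocated, so tracing progress stays tied to the allocation rate instead of to wall-clock time.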

slide-54
SLIDE 54

Maximum Space Usage

[Figure: chart of Maximum Space (y-axis, 0.2 to 1.6) per benchmark (jess, db, javac, mtrt, jack, geomean) for YUASA, DIJKSTRA, STEELE, and HYBRID.]

slide-55
SLIDE 55

Execution Time

[Figure: chart of End-to-End Time (y-axis, 0.2 to 1.4) per benchmark (jess, db, javac, mtrt, jack, geomean) for YUASA, DIJKSTRA, STEELE, and HYBRID.]

slide-56
SLIDE 56

Summary

  • Generalization Of Existing Mechanisms
      • S-RC, Shade
  • Abstract Collectors Based on Generalizations
      • Precise, but inefficient (S-RC, Shade)
  • New Concrete Algorithm
      • Combines good properties of Yuasa and Dijkstra
      • Suitable for Real-Time Domains
  • Experimental Evaluation

slide-57
SLIDE 57

On-Going Work

  • More Transformations
  • Formal Proof Of Correctness
      • Transformations
      • Unified Abstract Collector
  • Formal Relation Between Algorithms

slide-58
SLIDE 58
slide-59
SLIDE 59

IF TIME PERMITS SLIDES

slide-60
SLIDE 60
slide-61
SLIDE 61
slide-62
SLIDE 62

Abstract Object Layout

[Figure: object layout. Header fields, in slide order: Barrier Reference Count, Marked, Shade, Don’t Sweep, Recorded; followed by DATA. Legend: Computed By Mutator, Computed By Collector.]

slide-63
SLIDE 63

The Transitive Loss

[Figure: timeline of four heap snapshots over objects A, B, C, D with pointers P1 and P2.]

  • Collector Working
      • Marks B as live
  • Thread Working
      • Installs P2
  • Thread Working
      • Deletes P1
      • C becomes unreachable
  • Collector Working
      • C was not seen
      • Reclaims C is OK
      • Reclaims D: live!

slide-64
SLIDE 64

Common Structure Example

YUASA

    WriteBarrier(Obj, field, New) {
        if (Phase == Tracing) {
            Old = Obj[field];
            Remember(Old);
        }
        Obj[field] = New;
    }

DIJKSTRA

    WriteBarrier(Obj, field, New) {
        if (Phase == Tracing) {
            Remember(New);
        }
        Obj[field] = New;
    }
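The two barriers differ only in which pointer they remember, as a runnable sketch makes visible. Objects are modelled as plain dictionaries, and `Collector`, `yuasa_write`, and `dijkstra_write` are hypothetical names. The null check in `remember` is an addition that is not in the slide pseudocode.

```python
class Collector:
    def __init__(self, tracing=False):
        self.tracing = tracing  # stands in for Phase == Tracing
        self.remembered = []

    def remember(self, ref):
        if ref is not None:     # added null check, not in the slide pseudocode
            self.remembered.append(ref)


def yuasa_write(collector, obj, field, new):
    """Deletion barrier: remember the pointer that is about to be overwritten."""
    if collector.tracing:
        collector.remember(obj.get(field))
    obj[field] = new


def dijkstra_write(collector, obj, field, new):
    """Installation barrier: remember the pointer that is being stored."""
    if collector.tracing:
        collector.remember(new)
    obj[field] = new


yuasa, dijkstra = Collector(tracing=True), Collector(tracing=True)
obj1, obj2 = {"f": "old"}, {"f": "old"}
yuasa_write(yuasa, obj1, "f", "new")
dijkstra_write(dijkstra, obj2, "f", "new")
print(yuasa.remembered, dijkstra.remembered)
```

Both writes leave the field holding the new value. Only the remembered set differs: the overwritten target for Yuasa, the installed target for Dijkstra.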