Compilation 2010

Garbage Collection

Jan Midtgaard Michael I. Schwartzbach Aarhus University

The Garbage Collector

  • A garbage collector is part of the runtime system
  • It reclaims heap-allocated records (objects) that are no longer in use
  • A garbage collector should:
      • reclaim all unused records
      • spend very little time per record
      • not cause significant delays
      • allow all of memory to be used
  • These are difficult and conflicting requirements

Life Without Garbage Collection

  • Unused records must be explicitly deallocated
  • This is superior if done correctly
  • But it is easy to miss some records
  • And it is dangerous to handle pointers
  • Memory leaks in real life (ical v.2.1):

[Figure: plot of memory use (MB) against running time (hours).]

Record Liveness

  • Which records are still in use?
  • Ideally, those that will be accessed in the future execution of the program
  • But that is of course undecidable...
  • Basic conservative approximation: a record is live if it is reachable from a stack location (local variable or local stack)

  • Dead records may still point to each other

A Heap With Live and Dead Records

[Figure: a heap of records with the stack variables p, q, and r pointing into it; some records are reachable from these variables (live) and the rest are dead.]

The Mark-and-Sweep Algorithm

  • Explore pointers starting from all stack locations and mark all the records encountered
  • Sweep through all records in the heap and reclaim the unmarked ones
  • Unmark all marked records
  • Assumptions:
      • we know the start and size of each record in memory
      • we know which record fields are pointers
      • reclaimed records are kept in a freelist

Pseudo Code for Mark-and-Sweep

function Mark() {
  foreach (v in a stack frame)
    DFS(v);
}

function DFS(x) {
  if (x is a heap pointer)
    if (x is not marked) {
      mark x;
      for (i=1; i<=|x|; i++)
        DFS(x.fi);
    }
}

function Sweep() {
  p = first address in heap;
  while (p < last address in heap) {
    if (p is marked)
      unmark p;
    else {
      p.f1 = freelist;
      freelist = p;
    }
    p = next object pointer after p
  }
}
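
The mark and sweep phases can be tried out on a toy heap. The following is a minimal Java sketch, not the authors' implementation: the class `MarkSweepSketch`, the `Rec` type, and the use of Java lists for the heap, the roots, and the freelist are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy mark-and-sweep heap; all names here are illustrative, not from the slides. */
final class MarkSweepSketch {
    static final class Rec {
        final String name;
        final Rec[] fields;      // every field is a pointer (or null)
        boolean marked;
        Rec(String name, int size) { this.name = name; this.fields = new Rec[size]; }
    }

    final List<Rec> heap = new ArrayList<>();      // all allocated records
    final List<Rec> roots = new ArrayList<>();     // stands in for the stack locations
    final List<Rec> freelist = new ArrayList<>();  // reclaimed records

    Rec alloc(String name, int size) {
        Rec r = new Rec(name, size);
        heap.add(r);
        return r;
    }

    // Mark phase: depth-first search from one pointer.
    void dfs(Rec x) {
        if (x == null || x.marked) return;
        x.marked = true;
        for (Rec f : x.fields) dfs(f);
    }

    // Sweep phase: unmark the survivors, move unmarked records to the freelist.
    void sweep() {
        List<Rec> live = new ArrayList<>();
        for (Rec r : heap) {
            if (r.marked) { r.marked = false; live.add(r); }
            else freelist.add(r);
        }
        heap.clear();
        heap.addAll(live);
    }

    void collect() {
        for (Rec v : roots) dfs(v);
        sweep();
    }

    /** Builds a small heap, collects it, and returns the names on the freelist. */
    static List<String> demo() {
        MarkSweepSketch gc = new MarkSweepSketch();
        Rec a = gc.alloc("a", 1), b = gc.alloc("b", 1);
        Rec c = gc.alloc("c", 1), d = gc.alloc("d", 1);
        a.fields[0] = b;          // a -> b, both reachable from the root
        c.fields[0] = d;          // c and d point to each other but are dead
        d.fields[0] = c;
        gc.roots.add(a);
        gc.collect();
        List<String> names = new ArrayList<>();
        for (Rec r : gc.freelist) names.add(r.name);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The dead records `c` and `d` point to each other, yet both end up on the freelist, because liveness is decided by reachability from the roots alone.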

Marking and Sweeping (steps 1/11 to 11/11)

[Figure sequence, 11 steps: mark-and-sweep run on the heap shown earlier, with the stack variables p, q, and r as roots; from step 6 onward a freelist pointer appears as the sweep links the unmarked records together.]

Analysis of Mark-and-Sweep

  • Assume the heap has H words
  • Assume that R words are reachable
  • The cost of garbage collection is $c_1 R + c_2 H$ (marking touches the reachable words, sweeping touches the whole heap)
  • The cost per reclaimed word is $(c_1 R + c_2 H)/(H - R)$
  • If R is close to H, then this is expensive
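
A worked instance shows how quickly the cost per reclaimed word grows. The values $c_1 = 10$, $c_2 = 3$, and $H = 1000$ are assumptions chosen only for illustration, not measurements.

```latex
\begin{align*}
R = 500:&\quad \frac{c_1 R + c_2 H}{H - R}
  = \frac{10 \cdot 500 + 3 \cdot 1000}{1000 - 500} = \frac{8000}{500} = 16 \\
R = 900:&\quad \frac{c_1 R + c_2 H}{H - R}
  = \frac{10 \cdot 900 + 3 \cdot 1000}{1000 - 900} = \frac{12000}{100} = 120
\end{align*}
```

With these numbers, going from a half-full heap to a 90% full heap makes each reclaimed word 7.5 times more expensive.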

Allocation

  • The freelist must be searched for a record that is large enough to provide the requested memory
  • Free records may be sorted by size
  • The freelist may become fragmented: containing many small free records but none that is large enough

  • Defragmentation joins adjacent free records
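
Fragmentation and defragmentation can be illustrated with a small first-fit allocator. This is a sketch under assumptions: the class `FreelistSketch`, the `Block` type, and the choice of an address-ordered list (rather than one sorted by size) are illustrative and not taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

/** First-fit allocation from an address-ordered freelist; a hypothetical sketch. */
final class FreelistSketch {
    /** A free block covering the addresses [start, start + size). */
    static final class Block {
        int start, size;
        Block(int start, int size) { this.start = start; this.size = size; }
    }

    final List<Block> free = new ArrayList<>();   // kept sorted by start address

    /** Returns the start address of an allocated block, or -1 if no block fits. */
    int allocate(int size) {
        for (int i = 0; i < free.size(); i++) {
            Block b = free.get(i);
            if (b.size >= size) {
                int addr = b.start;
                b.start += size;          // split: the tail stays on the freelist
                b.size -= size;
                if (b.size == 0) free.remove(i);
                return addr;
            }
        }
        return -1;                        // the freelist is full or fragmented
    }

    /** Returns a block to the freelist, keeping the list sorted by address. */
    void release(int start, int size) {
        int i = 0;
        while (i < free.size() && free.get(i).start < start) i++;
        free.add(i, new Block(start, size));
    }

    /** Defragmentation: joins free blocks that are adjacent in memory. */
    void coalesce() {
        for (int i = 0; i + 1 < free.size(); ) {
            Block a = free.get(i), b = free.get(i + 1);
            if (a.start + a.size == b.start) {
                a.size += b.size;
                free.remove(i + 1);
            } else {
                i++;
            }
        }
    }

    public static void main(String[] args) {
        FreelistSketch f = new FreelistSketch();
        f.release(0, 4);
        f.release(4, 4);
        f.release(12, 4);
        System.out.println(f.allocate(6));   // no single block is large enough
        f.coalesce();
        System.out.println(f.allocate(6));   // the two adjacent blocks were joined
    }
}
```

Before `coalesce()` the list holds 12 free words in total but cannot serve a request for 6, which is exactly the fragmented state described in the bullets.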

Pointer Reversal

  • The DFS recursion stack could have size H
  • It has at least size log(H)
  • This may be too much (after all, memory is low)
  • The recursion stack may be cleverly embedded in the fields of the marked records

  • This technique makes mark-and-sweep practical
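
One way to embed the stack in the records is the classic pointer-reversal scheme, in which the field being followed is temporarily overwritten with a pointer back to the parent. The sketch below is an assumption about how this could look in Java, not code from the course: the names `PointerReversalSketch`, `Rec`, and `done` are hypothetical, and each record pays for one small counter instead of a stack entry.

```java
/** Pointer-reversal marking (Deutsch-Schorr-Waite style); names are illustrative. */
final class PointerReversalSketch {
    static final class Rec {
        final Rec[] fields;
        boolean marked;
        int done;                 // index of the next field to examine
        Rec(int size) { fields = new Rec[size]; }
    }

    /** Marks all records reachable from root using no recursion and no separate stack. */
    static void mark(Rec root) {
        if (root == null || root.marked) return;
        Rec t = null;             // parent of x; the rest of the path sits in reversed fields
        Rec x = root;
        x.marked = true;
        x.done = 0;
        while (true) {
            int i = x.done;
            if (i < x.fields.length) {
                Rec y = x.fields[i];
                if (y != null && !y.marked) {
                    x.fields[i] = t;   // reverse: the field now points back to the parent
                    t = x;
                    x = y;             // descend into the child
                    x.marked = true;
                    x.done = 0;
                } else {
                    x.done = i + 1;    // null or already marked: skip this field
                }
            } else {
                Rec y = x;             // all fields of x are done: climb back up
                x = t;
                if (x == null) return; // we have climbed past the root
                i = x.done;
                t = x.fields[i];       // the reversed field holds x's own parent
                x.fields[i] = y;       // restore the original pointer
                x.done = i + 1;
            }
        }
    }

    public static void main(String[] args) {
        Rec a = new Rec(2), b = new Rec(2), c = new Rec(1);
        a.fields[0] = b; a.fields[1] = c;
        b.fields[0] = c; b.fields[1] = a;    // cycle back to a
        mark(a);
        System.out.println(a.marked && b.marked && c.marked);
        System.out.println(a.fields[0] == b && b.fields[1] == a);  // pointers restored
    }
}
```

When marking finishes, every reversed field has been restored, so the heap looks exactly as before apart from the mark bits.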

The Reference Counting Algorithm

  • Maintain a counter of the total number of references to each record
  • For each assignment, update the counters
  • A record is dead when its counter is zero
  • Advantages:
      • catches dead records immediately
      • does not cause long pauses
  • Disadvantages:
      • cannot detect cycles of dead records
      • is rather expensive: every pointer assignment must also update counters

Pseudo Code for Reference Counting

function Increment(x) {
  x.count++;
}

function Decrement(x) {
  x.count--;
  if (x.count==0)
    PutOnFreelist(x);
}

function PutOnFreelist(x) {
  Decrement(x.f1);
  x.f1 = freelist;
  freelist = x;
}

function RemoveFromFreelist(x) {
  for (i=2; i<=|x|; i++)
    Decrement(x.fi);
}
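
The strength and the weakness of reference counting both show up in a few lines of Java. This sketch is a simplification and an assumption, not the scheme in the pseudocode: it releases children eagerly instead of deferring the work to the freelist, records have a single pointer field, and the names `RefCountSketch`, `setNext`, and `reclaimed` are hypothetical.

```java
/** Eager reference counting on records with one pointer field; names are illustrative. */
final class RefCountSketch {
    static final class Rec {
        int count;      // number of references to this record
        Rec next;       // the record's only pointer field
    }

    int reclaimed = 0;  // how many records have been returned to the allocator

    void increment(Rec x) {
        if (x != null) x.count++;
    }

    void decrement(Rec x) {
        if (x == null) return;
        x.count--;
        if (x.count == 0) {
            reclaimed++;
            Rec child = x.next;
            x.next = null;
            decrement(child);   // eager: release what the dead record pointed to
        }
    }

    /** The assignment x.next = y, together with the counter updates it requires. */
    void setNext(Rec x, Rec y) {
        increment(y);           // increment first, in case y is also the old value
        decrement(x.next);
        x.next = y;
    }

    public static void main(String[] args) {
        RefCountSketch rc = new RefCountSketch();
        Rec a = new Rec(), b = new Rec();
        rc.increment(a);        // a stack variable refers to a
        rc.setNext(a, b);       // a -> b
        rc.decrement(a);        // the stack variable goes away
        System.out.println(rc.reclaimed);   // both a and b were reclaimed at once

        Rec c = new Rec(), d = new Rec();
        rc.increment(c);
        rc.setNext(c, d);       // c -> d
        rc.setNext(d, c);       // d -> c closes a cycle
        rc.decrement(c);        // the stack variable goes away
        System.out.println(c.count + " " + d.count);  // neither counter reaches zero
    }
}
```

The chain `a -> b` is reclaimed the moment the last reference disappears, while the cycle `c <-> d` keeps both counters at one and is never reclaimed.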

The Stop-and-Copy Algorithm

  • Divide the heap space into two parts
  • Only use one part at a time
  • When it runs full, copy live records to the other part of the heap space
  • Then switch the roles of the two parts
  • Advantages:
      • fast allocation (no freelist)
      • avoids fragmentation
  • Disadvantage:
      • wastes half your memory

Before and After Stop-and-Copy

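[Figure: the heap before and after a collection. Before, the records lie in from-space and to-space is unused. After, the two spaces have switched roles, the surviving records lie contiguously in the new space, and the next and limit pointers delimit its free area.]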

Pseudo Code for Stop-and-Copy

function Forward(x) {
  if (x ∈ from-space) {
    if (x.f1 ∈ to-space)
      return x.f1;
    else {
      for (i=1; i<=|x|; i++)
        next.fi = x.fi;
      x.f1 = next;
      next = next + sizeof(x);
      return x.f1;
    }
  }
  else return x;
}

function Copy() {
  scan = next = start of to-space;
  foreach (v in a stack frame)
    v = Forward(v);
  while (scan < next) {
    for (i=1; i<=|scan|; i++)
      scan.fi = Forward(scan.fi);
    scan = scan + sizeof(scan);
  }
}
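
The forwarding scheme can be run on a heap modelled as an `int` array. The layout is an assumption made for this sketch: word 0 of a record holds its number of fields, every field is a pointer (an index into the array, or -1 for null), and every record has at least one field so that the first field can hold the forwarding pointer. The class and method names are hypothetical.

```java
/** Two-space copying collection over an int array; layout and names are assumptions. */
final class StopAndCopySketch {
    static final int NULL = -1;
    final int[] mem;
    final int half;          // size of one semispace in words
    int cur = 0;             // start of the space in use (from-space during a collection)
    int next = 0;            // next free word

    StopAndCopySketch(int half) {
        this.half = half;
        this.mem = new int[2 * half];
    }

    boolean in(int p, int start) { return p >= start && p < start + half; }

    /** Allocates a record with n >= 1 pointer fields: word 0 holds n, words 1..n the fields. */
    int alloc(int n) {
        if (next + n + 1 > cur + half) throw new IllegalStateException("space full");
        int p = next;
        mem[p] = n;
        for (int i = 1; i <= n; i++) mem[p + i] = NULL;
        next += n + 1;       // allocation is just a pointer bump
        return p;
    }

    int forward(int p, int to) {
        if (!in(p, cur)) return p;                     // NULL or not a from-space pointer
        if (in(mem[p + 1], to)) return mem[p + 1];     // f1 already holds the forwarding pointer
        int n = mem[p];
        for (int i = 0; i <= n; i++) mem[next + i] = mem[p + i];
        mem[p + 1] = next;                             // leave a forwarding pointer in f1
        next += n + 1;
        return mem[p + 1];
    }

    /** Copies everything reachable from roots to the other space and updates roots in place. */
    void collect(int[] roots) {
        int to = (cur == 0) ? half : 0;
        int scan = to;
        next = to;
        for (int r = 0; r < roots.length; r++) roots[r] = forward(roots[r], to);
        while (scan < next) {
            int n = mem[scan];
            for (int i = 1; i <= n; i++) mem[scan + i] = forward(mem[scan + i], to);
            scan += n + 1;
        }
        cur = to;            // the two spaces switch roles
    }

    public static void main(String[] args) {
        StopAndCopySketch h = new StopAndCopySketch(16);
        int a = h.alloc(1);
        h.alloc(2);                 // never referenced: garbage
        int b = h.alloc(1);
        h.mem[a + 1] = b;           // a -> b
        h.mem[b + 1] = a;           // b -> a, a live cycle
        int[] roots = { a };
        h.collect(roots);
        System.out.println(roots[0] + " " + h.next);
    }
}
```

Only the two live records are copied; the garbage record is never touched, and the live cycle is handled correctly because the second visit to `a` finds the forwarding pointer.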

Stopping and Copying (steps 1/13 to 13/13)

[Figure sequence, 13 steps: the records reachable from the stack variables p, q, and r are copied one at a time from from-space to to-space and the pointers to them are updated; in the last step only the copied records remain and the two spaces have switched roles.]

Analysis of Stop-and-Copy

  • Assume the heap has H words
  • Assume that R words are reachable
  • The cost of garbage collection is $c_3 R$ (only the reachable words are touched)
  • The cost per reclaimed word is $c_3 R/(H/2 - R)$
  • This can be made arbitrarily small by letting H grow
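
A worked instance makes the effect of a larger heap concrete. The values $c_3 = 10$ and $R = 500$ are assumptions chosen only for illustration.

```latex
\begin{align*}
H = 2000:&\quad \frac{c_3 R}{H/2 - R} = \frac{10 \cdot 500}{1000 - 500} = \frac{5000}{500} = 10 \\
H = 11000:&\quad \frac{c_3 R}{H/2 - R} = \frac{10 \cdot 500}{5500 - 500} = \frac{5000}{5000} = 1 \\
H = 101000:&\quad \frac{c_3 R}{H/2 - R} = \frac{10 \cdot 500}{50500 - 500} = \frac{5000}{50000} = 0.1
\end{align*}
```

The numerator does not depend on H, so for a fixed amount of live data the cost per reclaimed word goes to zero as the heap grows.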

Recognizing Records and Pointers

  • Earlier assumptions:
      • we know the start and size of each record in memory
      • we know which record fields are pointers
  • For object-oriented languages, each record already contains a pointer to a class descriptor
  • For general languages, we must sacrifice a few bytes per record
  • For the stack frame:
      • use a bit per stack location
      • use a table per program point

Conservative Garbage Collection

  • For mark-and-sweep, we may use a conservative approximation to recognize pointers
  • A word is a pointer if it looks like one (its value is an address in the range of the heap space)

  • This will recognize too many pointers
  • Thus, too many records will be marked as live
  • This does not work for stop-and-copy...
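
The "looks like a pointer" test is just a range check. The Java sketch below simulates it on an array of stack words; the class name, the method names, and the example addresses are hypothetical.

```java
/** Conservative pointer recognition by address range; names and values are illustrative. */
final class ConservativeScanSketch {
    /** A word looks like a pointer if its value lies within the heap's address range. */
    static boolean looksLikePointer(long word, long heapStart, long heapEnd) {
        return word >= heapStart && word < heapEnd;
    }

    /** Counts the stack words that a conservative collector would treat as roots. */
    static int countCandidateRoots(long[] stackWords, long heapStart, long heapEnd) {
        int n = 0;
        for (long w : stackWords) {
            if (looksLikePointer(w, heapStart, heapEnd)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Suppose the heap occupies [0x1000, 0x2000). The word 0x1FF8 could be a real
        // pointer or an integer that happens to fall in that range; the collector cannot tell.
        long[] stack = { 17L, 0x1000L, 0x1FF8L, 0x2000L, 42L };
        System.out.println(countCandidateRoots(stack, 0x1000L, 0x2000L));
    }
}
```

Marking a record because of an integer that merely looks like an address only wastes memory. A copying collector, however, must overwrite every pointer with the record's new address, and overwriting such an integer would corrupt the program, which is why the approximation cannot be used for stop-and-copy.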

Triggering Garbage Collection

  • A collection must be triggered when there is no more free heap space
  • But this may cause a long pause in the execution
  • Collections may be triggered by heuristics:
      • after a certain number of records have been allocated
      • when only a certain fraction of the heap is free
      • after a certain period of time
      • when the program is not busy

Generational Collection

  • Observation: the young die quickly!
  • The collector should focus on young records
  • Divide the heap into generations: $G_0$, $G_1$, $G_2$, ...
  • All records in $G_i$ are younger than records in $G_{i+1}$
  • Collect $G_0$ often, $G_1$ less often, and so on
  • Promote a record from $G_i$ to $G_{i+1}$ when it survives several collections

Collecting a Generation

  • How to collect the G0 generation:
      • roots are no longer just stack locations, but also pointers from G1, G2, ...
      • it could be expensive to find those pointers
      • fortunately they are rare, so we can remember them
  • Ways to remember pointers:
      • maintain a set of all updated records
      • mark pages of memory that contain updated records (using hardware or software)
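
The first option, a set of updated records, can be sketched as a write barrier that every pointer assignment goes through. The Java below is an illustration under assumptions: records have one pointer field, the generation is stored in the record, and the names `RememberedSetSketch`, `write`, and `extraRoots` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Remembered set for collecting the youngest generation; names are illustrative. */
final class RememberedSetSketch {
    static final class Rec {
        final int generation;    // 0 is the youngest generation
        Rec field;               // the record's only pointer field
        Rec(int generation) { this.generation = generation; }
    }

    /** All records outside G0 that have been updated since the last collection. */
    final Set<Rec> remembered = new LinkedHashSet<>();

    /** Performs x.field = y and remembers x if it lives in an older generation. */
    void write(Rec x, Rec y) {
        x.field = y;
        if (x.generation > 0) remembered.add(x);
    }

    /** Extra roots for a G0 collection: G0 records referenced from remembered records. */
    List<Rec> extraRoots() {
        List<Rec> roots = new ArrayList<>();
        for (Rec r : remembered) {
            if (r.field != null && r.field.generation == 0) roots.add(r.field);
        }
        return roots;
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch();
        Rec old = new Rec(1), young = new Rec(0), other = new Rec(0);
        heap.write(old, young);      // an old record now points into G0
        heap.write(other, young);    // young-to-young: nothing to remember
        System.out.println(heap.remembered.size() + " " + heap.extraRoots().size());
    }
}
```

A collection of G0 then scans the stack plus `extraRoots()` instead of searching all older generations for pointers into G0.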

Incremental Collection

  • A garbage collector creates (long) pauses
  • This is bad for real-time programs
  • An incremental collector runs concurrently with the program (in a separate thread)

  • It must now handle simultaneous heap updates

The Tricoloring Algorithm

  • Records are colored black, grey, or white:
      • black: visited, and all direct children visited
      • grey: visited, but not all direct children visited
      • white: not visited
  • The program may update the heap as it pleases, but must maintain an invariant: no black record points to a white record

Pseudo Code for Tricoloring

function Tricolor() {
  color all records white;
  color all roots grey;
  while (more grey records) {
    x = a grey record;
    for (i=1; i<=|x|; i++)
      if (x.fi is white)
        color x.fi grey;
    color x black;
  }
  reclaim all white records;
}
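
The tricoloring loop translates almost directly into Java when the set of grey records is kept as a worklist. This is a minimal sketch that runs the marking to completion without any interleaved program updates; the names `TricolorSketch`, `Rec`, and `collect` are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Tricolor marking with an explicit grey worklist; names are illustrative. */
final class TricolorSketch {
    enum Color { WHITE, GREY, BLACK }

    static final class Rec {
        final String name;
        final List<Rec> fields = new ArrayList<>();
        Color color = Color.WHITE;
        Rec(String name) { this.name = name; }
    }

    /** Runs the marking to completion and returns the white (dead) records. */
    static List<Rec> collect(List<Rec> heap, List<Rec> roots) {
        Deque<Rec> grey = new ArrayDeque<>();
        for (Rec r : heap) r.color = Color.WHITE;
        for (Rec r : roots) {
            if (r.color == Color.WHITE) { r.color = Color.GREY; grey.push(r); }
        }
        while (!grey.isEmpty()) {
            Rec x = grey.pop();
            for (Rec f : x.fields) {
                if (f.color == Color.WHITE) { f.color = Color.GREY; grey.push(f); }
            }
            x.color = Color.BLACK;   // all direct children are now grey or black
        }
        List<Rec> dead = new ArrayList<>();
        for (Rec r : heap) {
            if (r.color == Color.WHITE) dead.add(r);
        }
        return dead;
    }

    public static void main(String[] args) {
        Rec a = new Rec("a"), b = new Rec("b"), c = new Rec("c");
        a.fields.add(b);             // a -> b
        c.fields.add(a);             // c -> a, but nothing points to c
        List<Rec> heap = new ArrayList<>();
        heap.add(a); heap.add(b); heap.add(c);
        List<Rec> roots = new ArrayList<>();
        roots.add(a);
        for (Rec r : collect(heap, roots)) System.out.println(r.name);
    }
}
```

Mark-and-sweep and stop-and-copy can both be read as instances of this loop; they differ in how the grey set is represented.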

Maintaining the Invariant

  • Two possibilities: write barriers and read barriers
  • Write barriers: the assignment x.fi = y; is replaced by black2grey(x).fi = y;
  • Read barriers: the assignment x.fi = y; is replaced by x.fi = white2grey(y);
  • Both require synchronization between the running program and the collector
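
The two helper functions can be sketched in Java to show how each one protects the invariant. The types and the `invariantHolds` check are assumptions added for illustration; a real collector would also put the greyed record back on its worklist and synchronize with the collector thread, both of which are omitted here.

```java
/** The two barrier helpers from the slide, on one-field records; names are illustrative. */
final class WriteBarrierSketch {
    enum Color { WHITE, GREY, BLACK }

    static final class Rec {
        Color color = Color.WHITE;
        Rec field;
    }

    /** Returns x, first turning it grey again if it was black, so that it gets rescanned. */
    static Rec black2grey(Rec x) {
        if (x.color == Color.BLACK) x.color = Color.GREY;
        return x;
    }

    /** Returns y, first turning it grey if it was white, so that it gets scanned. */
    static Rec white2grey(Rec y) {
        if (y != null && y.color == Color.WHITE) y.color = Color.GREY;
        return y;
    }

    /** The invariant for one record: a black record never points to a white one. */
    static boolean invariantHolds(Rec x) {
        return !(x.color == Color.BLACK && x.field != null && x.field.color == Color.WHITE);
    }

    public static void main(String[] args) {
        Rec x = new Rec(), y = new Rec();
        x.color = Color.BLACK;
        x.field = y;                           // unguarded store: black now points to white
        System.out.println(invariantHolds(x));

        x.field = null;
        black2grey(x).field = y;               // guarded store: x is demoted to grey
        System.out.println(invariantHolds(x));
    }
}
```

Without a barrier, the collector would never revisit the black record `x`, and the white record `y` would be reclaimed even though it is reachable.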

Garbage Collection in Java

  • Sun's HotSpot VM uses by default:
      • two generations: "nursery" and "old objects"
      • the nursery is collected using stop-and-copy
      • the old objects are collected using mark-and-sweep in a version that also compacts the live records
  • For real-time applications:
      • use option -Xincgc
      • a more sophisticated incremental algorithm
      • 10% slower
      • but with shorter pauses

Finalizers

  • If an object has a finalize() method, it will be invoked before the object is reclaimed by the garbage collector
  • But there is no guarantee how soon this happens
  • This method may actually resurrect the object
  • Typically, the garbage collector needs an extra pass to find out if the dead really stay dead

Interacting With the Garbage Collector

  • Trigger the garbage collector manually:
      • System.gc();
  • The java.lang.ref package allows variations of the pointer concept:
      • SoftReference
      • WeakReference

Soft References

  • The garbage collector may reclaim an object that has soft references but no ordinary (strong) references
  • This is typically used for caching:

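SoftReference<Image> sr;
...
// Reuse the cached image unless the collector has reclaimed it
Image img = (sr == null) ? null : sr.get();
if (img == null) {
  img = getImage("huge.gif");
  sr = new SoftReference<Image>(img);
}
display(img);
img = null;  // drop the strong reference; only the soft one remains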

Weak References

  • The garbage collector will reclaim an object that has weak references but no strong or soft references
  • This is used in java.util.WeakHashMap, where keys are automatically removed when they are no longer in use
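
The guaranteed half of this behaviour is easy to demonstrate: as long as a strong reference exists, a weak reference and a WeakHashMap entry stay intact, even across a collection. The sketch below uses only `java.lang.ref.WeakReference` and `java.util.WeakHashMap`; the class and method names are hypothetical.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/** Weak references while a strong reference is still held; names are illustrative. */
final class WeakReferenceSketch {
    /** Returns true if the referent and the map entry are intact while strongly held. */
    static boolean intactWhileStronglyHeld() {
        Object key = new Object();                        // strong reference
        WeakReference<Object> weak = new WeakReference<>(key);
        Map<Object, String> table = new WeakHashMap<>();
        table.put(key, "metadata");
        System.gc();                                      // only a hint to the collector
        // key is still used below, so the object cannot have been reclaimed
        return weak.get() == key && "metadata".equals(table.get(key));
    }

    public static void main(String[] args) {
        System.out.println(intactWhileStronglyHeld());
        // Once the last strong reference is dropped, weak.get() may return null and
        // the table entry may disappear, but only after a collection has actually run.
    }
}
```

The opposite direction is deliberately not asserted here, since the moment at which a weakly reachable object is cleared depends on when the collector runs.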