
Garbage Collection, COMP 520: Compiler Design (4 credits)

COMP 520 Winter 2016 Garbage Collection (1) Garbage Collection COMP 520: Compiler Design (4 credits) Professor Laurie Hendren, hendren@cs.mcgill.ca


  1. Garbage Collection. COMP 520: Compiler Design (4 credits), Professor Laurie Hendren, hendren@cs.mcgill.ca

  2. A garbage collector is part of the run-time system: it reclaims heap-allocated records that are no longer used.
     A garbage collector should:
     • reclaim all unused records;
     • spend very little time per record;
     • not cause significant delays; and
     • allow all of memory to be used.
     These are difficult and often conflicting requirements.

  3. Life without garbage collection:
     • unused records must be explicitly deallocated;
     • superior if done correctly;
     • but it is easy to miss some records; and
     • it is dangerous to handle pointers.
     Memory leaks in real life (ical v.2.1):
     [Figure: memory usage in MB (axis from 0 to 31) against running time in hours (axis from 0 to 24).]

  4. Which records are dead, i.e. no longer in use?
     Ideally, records that will never be accessed in the future execution of the program. But that is of course undecidable.
     Basic conservative assumption: a record is live if it is reachable from a stack-based program variable, otherwise dead.
     Dead records may still be pointed to by other dead records.

  5. A heap with live and dead records:
     [Figure: heap diagram in which the program variables p and q point to records that in turn point to other records.]

  6. The mark-and-sweep algorithm:
     • explore pointers starting from the program variables, and mark all records encountered;
     • sweep through all records in the heap and reclaim the unmarked ones; also
     • unmark all marked records.
     Assumptions:
     • we know the size of each record;
     • we know which fields are pointers; and
     • reclaimed records are kept in a freelist.

  7. Pseudo code for mark-and-sweep:

     function Mark()
       for each program variable v do
         DFS(v)

     function DFS(x)
       if x is a pointer into the heap then
         if record x is not marked then
           mark record x
           for i := 1 to |x| do
             DFS(x.f_i)

     function Sweep()
       p := first address in heap
       while p < last address in heap do
         if record p is marked then
           unmark record p
         else
           p.f_1 := freelist
           freelist := p
         p := p + sizeof(record p)
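
The mark and sweep phases can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the heap is modelled as a plain list of objects rather than a range of addresses, the freelist is a Python list, and the `Record` class and the example values are hypothetical.

```python
class Record:
    """A heap record: a payload plus fields that may point to other records."""

    def __init__(self, value, fields=None):
        self.value = value
        self.fields = fields if fields is not None else []
        self.marked = False


def dfs(x):
    """Mark every record reachable from x (None stands for a non-pointer)."""
    if x is not None and not x.marked:
        x.marked = True
        for f in x.fields:
            dfs(f)


def mark(roots):
    """Mark phase: start a depth-first search from each program variable."""
    for v in roots:
        dfs(v)


def sweep(heap):
    """Sweep phase: unmark live records and collect the unmarked ones."""
    freelist = []
    for p in heap:
        if p.marked:
            p.marked = False
        else:
            p.fields = []
            freelist.append(p)
    return freelist


# Hypothetical heap: a points to b, while c and d form an unreachable cycle.
a, b, c, d = Record(12), Record(15), Record(7), Record(37)
a.fields = [b]
c.fields = [d]
d.fields = [c]
heap = [a, b, c, d]

mark([a])
freelist = sweep(heap)
print(sorted(r.value for r in freelist))  # [7, 37]
```

Note that the dead cycle between `c` and `d` is reclaimed, because liveness is decided by reachability from the roots only.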

  8. Marking and sweeping:
     [Figure: two snapshots of the example heap with program variables p and q; reclaimed records are linked into the freelist.]

  9. Analysis of mark-and-sweep:
     • assume the heap has size $H$ words; and
     • assume that $R$ words are reachable.
     The cost of garbage collection is: $c_1 R + c_2 H$
     Realistic values are: $10R + 3H$
     The cost per reclaimed word is: $\frac{c_1 R + c_2 H}{H - R}$
     • if $R$ is close to $H$, then this is expensive;
     • the lower bound is $c_2$;
     • increase the heap when $R > 0.5H$; then
     • the cost per word is $c_1 + 2c_2 \approx 16$.
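
As a small worked example of the formula above, the following sketch evaluates the cost per reclaimed word for a few occupancy levels, using the constants $c_1 = 10$ and $c_2 = 3$ from the slide. The function name and the sample heap size are made up for illustration.

```python
def mark_sweep_cost_per_word(R, H, c1=10, c2=3):
    """Amortized cost (c1*R + c2*H) / (H - R) per reclaimed word."""
    if R >= H:
        raise ValueError("no words are reclaimed when R >= H")
    return (c1 * R + c2 * H) / (H - R)


H = 1000  # hypothetical heap size in words
for R in (100, 500, 900):
    print(R, round(mark_sweep_cost_per_word(R, H), 2))
# 100 4.44
# 500 16.0
# 900 120.0
```

At $R = 0.5H$ the result is exactly $c_1 + 2c_2 = 16$, and the cost grows quickly as $R$ approaches $H$.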

  10. Other relevant issues:
     • The DFS recursion stack could have size $H$ (and has at least size $\log H$), which may be too much; however, the recursion stack can cleverly be embedded in the fields of marked records (pointer reversal).
     • Records can be kept sorted by sizes in the freelist. Records may be split into smaller pieces if necessary.
     • The heap may become fragmented: containing many small free records but none that are large enough.
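
The splitting and fragmentation points can be illustrated with a tiny first-fit allocator. This is a sketch under assumptions that are not in the slides: free records are represented as `(address, length)` pairs in a Python list, and first-fit is just one possible allocation policy.

```python
def first_fit(free_blocks, size):
    """Allocate `size` words from a list of (address, length) free blocks.

    Returns the allocated address, or None if no single block is large
    enough. A block larger than the request is split, and the remainder
    stays in the list.
    """
    for i, (addr, length) in enumerate(free_blocks):
        if length >= size:
            if length == size:
                del free_blocks[i]
            else:
                free_blocks[i] = (addr + size, length - size)
            return addr
    return None


# Seven words are free in total, but no block holds four contiguous words.
free_blocks = [(0, 2), (10, 2), (20, 3)]
print(first_fit(free_blocks, 4))  # None
print(first_fit(free_blocks, 1))  # 0
print(free_blocks)                # [(1, 1), (10, 2), (20, 3)]
```

The first request fails even though enough memory is free overall, which is exactly the fragmentation problem; the second request shows a record being split.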

  11. The reference counting algorithm:
     • maintain a counter of the references to each record;
     • for each assignment, update the counters appropriately; and
     • a record is dead when its counter is zero.
     Advantages:
     • is simple and attractive;
     • catches dead records immediately; and
     • does not cause long pauses.
     Disadvantages:
     • cannot detect cycles of dead records; and
     • is much too expensive.

  12. Pseudo code for reference counting:

     function Increment(x)
       x.count := x.count + 1

     function Decrement(x)
       x.count := x.count − 1
       if x.count = 0 then
         PutOnFreelist(x)

     function PutOnFreelist(x)
       Decrement(x.f_1)
       x.f_1 := freelist
       freelist := x

     function RemoveFromFreelist(x)
       for i := 2 to |x| do
         Decrement(x.f_i)
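
A minimal Python sketch of reference counting follows. It differs from the pseudo code above in one respect that should be stated plainly: it releases the children of a dead record eagerly, instead of deferring that work until the record is removed from the freelist. The `Record` class, the `assign` helper, and the `freed` log are hypothetical names introduced for the example.

```python
class Record:
    def __init__(self, name, nfields=1):
        self.name = name
        self.count = 0
        self.fields = [None] * nfields


freed = []  # names of reclaimed records, in reclamation order


def increment(x):
    x.count += 1


def decrement(x):
    x.count -= 1
    if x.count == 0:
        # Eager variant (an assumption of this sketch): release children now.
        for child in x.fields:
            if child is not None:
                decrement(child)
        x.fields = []
        freed.append(x.name)


def assign(record, index, target):
    """Perform record.fields[index] := target and update both counters."""
    old = record.fields[index]
    if target is not None:
        increment(target)
    record.fields[index] = target
    if old is not None:
        decrement(old)


# An acyclic chain is reclaimed as soon as the last root reference goes away.
a, b = Record("a"), Record("b")
increment(a)          # a program variable refers to a
assign(a, 0, b)       # a.f1 := b
decrement(a)          # the program variable is dropped
print(freed)          # ['b', 'a']

# A cycle keeps its counters above zero and is never reclaimed.
c, d = Record("c"), Record("d")
increment(c)
assign(c, 0, d)
assign(d, 0, c)
decrement(c)
print(c.count, d.count)  # 1 1
```

The second half shows the disadvantage named on the previous slide: once the cycle is unreachable, both counters remain at 1.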

  13. The stop-and-copy algorithm:
     • divide the heap into two parts;
     • only use one part at a time;
     • when it runs full, copy live records to the other part; and
     • switch the roles of the two parts.
     Advantages:
     • allows fast allocation (no freelist);
     • avoids fragmentation;
     • collects in time proportional to $R$; and
     • avoids stack and pointer reversal.
     Disadvantage:
     • wastes half your memory.

  14. Before and after stop-and-copy:
     [Figure: from-space and to-space before and after a collection, with the next and limit pointers; the two spaces swap roles afterwards.]
     • next and limit indicate the available heap space; and
     • copied records are contiguous in memory.

  15. Pseudo code for stop-and-copy:

     function Forward(p)
       if p ∈ from-space then
         if p.f_1 ∈ to-space then
           return p.f_1
         else
           for i := 1 to |p| do
             next.f_i := p.f_i
           p.f_1 := next
           next := next + sizeof(record p)
           return p.f_1
       else
         return p

     function Copy()
       scan := next := start of to-space
       for each program variable v do
         v := Forward(v)
       while scan < next do
         for i := 1 to |scan| do
           scan.f_i := Forward(scan.f_i)
         scan := scan + sizeof(record scan)
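
The copying scheme can be sketched in Python as below. Several simplifications are assumptions of this sketch rather than part of the pseudo code: to-space is a Python list, appending to it stands in for bumping `next`, and the forwarding pointer lives in a separate `forward` attribute instead of overwriting the first field. The `Record` class and example values are hypothetical.

```python
class Record:
    def __init__(self, value, fields=None):
        self.value = value
        self.fields = fields if fields is not None else []
        self.forward = None  # forwarding pointer, set once the record is copied


def forward(p, to_space):
    """Return the to-space copy of p, copying it first if necessary."""
    if p is None:
        return None
    if p.forward is None:
        # The copy's fields still point into from-space until it is scanned.
        copy = Record(p.value, list(p.fields))
        p.forward = copy
        to_space.append(copy)  # plays the role of advancing `next`
    return p.forward


def collect(roots):
    """Copy everything reachable from roots; return (new_roots, to_space)."""
    to_space = []
    new_roots = [forward(v, to_space) for v in roots]
    scan = 0
    while scan < len(to_space):  # len(to_space) plays the role of `next`
        record = to_space[scan]
        record.fields = [forward(f, to_space) for f in record.fields]
        scan += 1
    return new_roots, to_space


# Hypothetical heap: a and b reference each other; c is unreachable.
a, b, c = Record(12), Record(15), Record(7)
a.fields = [b, b]
b.fields = [a]
c.fields = [a]

new_roots, to_space = collect([a])
print([r.value for r in to_space])                    # [12, 15]
print(new_roots[0].fields[0].fields[0] is new_roots[0])  # True
```

No recursion stack is needed: the region of to-space between `scan` and the end of the list acts as the work queue, so the traversal is breadth-first.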

  16. Snapshots of stop-and-copy:
     [Figure: the example heap before collection, and after forwarding p and q and scanning 1 record, with the scan and next pointers shown.]

  17. Analysis of stop-and-copy:
     • assume the heap has size $H$ words; and
     • assume that $R$ words are reachable.
     The cost of garbage collection is: $c_3 R$
     A realistic value is: $10R$
     The cost per reclaimed word is: $\frac{c_3 R}{\frac{H}{2} - R}$
     • this has no lower bound as $H$ grows;
     • if $H = 4R$ then the cost is $c_3 \approx 10$.
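
To make the formula concrete, this short sketch evaluates the cost per reclaimed word for a fixed amount of live data and growing heap sizes, using $c_3 = 10$ from the slide. The function name and the sample sizes are invented for the example.

```python
def copy_cost_per_word(R, H, c3=10):
    """Amortized cost c3*R / (H/2 - R) per reclaimed word."""
    if R >= H / 2:
        raise ValueError("live data must fit in one half of the heap")
    return c3 * R / (H / 2 - R)


R = 100  # hypothetical number of reachable words
for H in (400, 800, 4000):
    print(H, round(copy_cost_per_word(R, H), 2))
# 400 10.0
# 800 3.33
# 4000 0.53
```

With $H = 4R$ the cost equals $c_3$, and it keeps shrinking as the heap grows, since only live data is ever touched.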

  18. Earlier assumptions:
     • we know the size of each record; and
     • we know which fields are pointers.
     For object-oriented languages, each record already contains a pointer to a class descriptor.
     For general languages, we must sacrifice a few bytes per record.

  19. We use mark-and-sweep or stop-and-copy.
     But garbage collection is still expensive: ≈ 100 instructions for a small object!
     Each algorithm can be further extended by:
     • generational collection (to make it run faster); and
     • incremental (or concurrent) collection (to make it run smoother).

  20. Generational collection:
     • observation: the young die quickly;
     • hence the collector should focus on young records;
     • divide the heap into generations: $G_0, G_1, G_2, \ldots$;
     • all records in $G_i$ are younger than records in $G_{i+1}$;
     • collect $G_0$ often, $G_1$ less often, and so on; and
     • promote a record from $G_i$ to $G_{i+1}$ when it survives several collections.

  21. How to collect the $G_0$ generation:
     • the roots are no longer just the program variables, but also pointers into $G_0$ from records in older generations;
     • it might be very expensive to find those pointers;
     • fortunately, they are rare; so
     • we can try to remember them.
     Ways to remember:
     • maintain a list of all updated records (use marks to make this a set); or
     • mark pages of memory that contain updated records (in hardware or software).
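
The first way of remembering can be sketched as a remembered set maintained on every pointer store. This is a simplified illustration, not a full generational collector: the `Record` class, the `write_field` barrier, and `collect_g0` are hypothetical names, the sketch only computes which young records survive, and it never prunes the remembered set.

```python
class Record:
    def __init__(self, name, generation=0):
        self.name = name
        self.generation = generation
        self.fields = {}


remembered = set()  # records in older generations that have been updated


def write_field(record, field, target):
    """Store a pointer; remember the updated record if it is not in G0."""
    record.fields[field] = target
    if record.generation > 0:
        remembered.add(record)


def collect_g0(roots, g0):
    """Return the G0 records reachable from the roots or remembered records."""
    live = set()
    worklist = list(roots)
    for old in remembered:
        worklist.extend(old.fields.values())
    while worklist:
        r = worklist.pop()
        # Older generations are not traversed during a G0 collection.
        if r is None or r.generation != 0 or r in live:
            continue
        live.add(r)
        worklist.extend(r.fields.values())
    return [r for r in g0 if r in live]


old = Record("old", generation=1)
y1, y2, y3 = Record("y1"), Record("y2"), Record("y3")
write_field(old, "f", y1)   # an old record now points into G0
write_field(y1, "g", y2)

survivors = collect_g0(roots=[], g0=[y1, y2, y3])
print([r.name for r in survivors])  # ['y1', 'y2']
```

Without the remembered set, `y1` and `y2` would look dead here, because no program variable points to them directly.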

  22. Incremental collection:
     • garbage collection may cause long pauses;
     • this is undesirable for interactive or real-time programs; so
     • try to interleave the garbage collection with the program execution.
     Two players access the heap:
     • the mutator: creates records and moves pointers around; and
     • the collector: tries to collect garbage.
     Some invariants are clearly required to make this work. The mutator will suffer some slowdown to maintain these invariants.
