CS502: Compiler Design Runtime Environments Manas Thakur Fall 2020



SLIDE 1

CS502: Compiler Design Runtime Environments Manas Thakur

Fall 2020

SLIDE 2

Manas Thakur CS502: Compiler Design 2

Going backstage

Phases of a compiler, with the data passed between them:

  • Frontend
    – Character stream → Lexical Analyzer → Token stream
    – Token stream → Syntax Analyzer → Syntax tree
    – Syntax tree → Semantic Analyzer → Syntax tree
    – Syntax tree → Intermediate Code Generator → Intermediate representation
  • Backend
    – Intermediate representation → Machine-Independent Code Optimizer → Intermediate representation
    – Intermediate representation → Code Generator → Target machine code
    – Target machine code → Machine-Dependent Code Optimizer → Target machine code
  • Symbol Table: shared by all phases

SLIDE 3

What all from the runtime interests a compiler?

  • Memory
    – holds data and code
    – Our interest: Storage layouts
  • Processor(s)
    – perform(s) computations in registers
    – Our interest: Register allocation
  • Instruction set
    – defines the primitives available for execution
    – Our interest: Code generation
  • Ultimate aim: Performance
    – in terms of time and memory
    – Our interest: Optimization

SLIDE 4

Typical memory subdivision while executing a program

  • Code: Instructions
  • Static: Data across all procedures (globals, constants)
  • Heap: Data that outlives procedures (malloc, new)
  • Stack: Data local to procedures (variables, parameters, temporaries)
  • Free memory

Compiler’s responsibility: To reserve space for all these kinds of memory

SLIDE 5

Procedure abstraction

  • A namespace for locals and parameters
  • Also the return value
  • Compiler passes (recall ICG) introduce temporaries
  • We need to reserve space for all of them
  • Operations:
    – Call another procedure
        • caller vs callee
    – Return from the current procedure

SLIDE 6

How do we call and return from a procedure?

  • In the caller:
    – Save state of current procedure
        • Program counter (where to resume)
        • Registers (holding current computations)
    – Store arguments in a callee-accessible location
    – Transfer control-flow
  • In the callee:
    – Collect parameters
    – Declare variables
    – Perform computations (perhaps in temporaries)
        • May involve accessing globals
    – Return to caller
        • Store return value in a caller-accessible location

Some of these tasks can be performed either by the caller or by the callee, e.g., caller-save vs callee-save registers.
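
The call and return steps above can be sketched on a toy machine. This is only an illustration under assumptions: the shared `stack` list, the `registers` dictionary, the string standing in for the program counter, and the function names are all hypothetical, and the caller-save convention shown is just one of the possible choices.

```python
# Hypothetical toy machine; every name here is illustrative.
stack = []                      # memory visible to both caller and callee
registers = {"r1": 0, "r2": 0}  # registers holding current computations

def callee_square_plus():
    # Callee: collect parameters from the callee-accessible location.
    y = stack.pop()
    x = stack.pop()
    # Compute; this clobbers r1, which is why the caller saved it
    # (a caller-save convention is assumed).
    registers["r1"] = x * x
    result = registers["r1"] + y
    # Store the return value in a caller-accessible location.
    stack.append(result)

def caller():
    registers["r1"] = 7              # a live value in the caller
    resume_at = "after_call"         # stand-in for the program counter
    # Caller: save state, then store the arguments.
    stack.append(resume_at)
    stack.append(dict(registers))
    stack.extend([3, 4])             # arguments x = 3, y = 4
    callee_square_plus()             # transfer control-flow
    # Back in the caller: fetch the result, then restore saved state.
    result = stack.pop()
    registers.update(stack.pop())
    resumed = stack.pop()
    return result, registers["r1"], resumed

outcome = caller()
```

With a callee-save convention, the saving and restoring of `r1` would move into `callee_square_plus` instead.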

SLIDE 7

Supporting procedure calls

  • Only one procedure runs at a time
    – Unless?
  • If foo calls bar, bar returns before foo
    – bar comes last but goes first
    – Last In, First Out!
  • Procedure calls are modelled using a stack
    – Called control-stack or “the stack”
  • Each active procedure has an activation record or frame in the stack
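
The LIFO behaviour can be traced with a small sketch. The `control_stack` list, the `call` helper, and the dictionary used as a frame are assumptions made for illustration; a real frame holds much more than a procedure name.

```python
control_stack = []   # frames of active procedures; the top is the last element
trace = []           # records (event, procedure, stack height)

def call(name, body=lambda: None):
    control_stack.append({"proc": name})          # push a frame on entry
    trace.append(("enter", name, len(control_stack)))
    body()                                        # run the procedure body
    frame = control_stack.pop()                   # pop the frame on return
    trace.append(("leave", frame["proc"], len(control_stack)))

def foo():
    call("bar")      # foo calls bar

call("foo", foo)
```

The trace shows bar entering after foo and leaving before it, and the stack is empty once foo returns.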

SLIDE 8

Activation record (a general structure)

Layout of a frame, lying between the previous frame and the next frame:

  • Actual parameters
  • Return value(s)
  • Control/access link (point to callers or other frames)
  • Saved machine status (e.g., register values while transferring control-flow)
  • Local data
  • Temporaries

Frame pointer: boundary of current frame
Stack pointer: used to access other items

SLIDE 9

Addressing items in activation records

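  • Actual parameters
  • Return value(s)
  • Control/access link
  • Saved machine status
  • Local data
  • Temporaries

Items are addressed relative to the stack pointer (SP): as SP - offset on one side of SP and as SP + offset on the other, with the diagram marking the direction of growing addresses.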
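
A minimal sketch of SP-relative addressing follows. The flat `memory` list, the position of SP inside the frame, and the concrete offsets in `OFFSETS` are all assumptions; the slide only says that items are reached at SP - offset or SP + offset.

```python
memory = [None] * 16   # a toy flat memory

# Assumed layout: field -> signed offset from SP. Fields shared with the
# caller get negative offsets, the callee's own data non-negative ones.
OFFSETS = {
    "param0": -4,
    "return_value": -3,
    "control_link": -2,
    "saved_status": -1,
    "local0": 0,
    "temp0": 1,
}

def store(sp, field, value):
    """Write value into the slot of field in the frame addressed by sp."""
    memory[sp + OFFSETS[field]] = value

def load(sp, field):
    """Read the slot of field in the frame addressed by sp."""
    return memory[sp + OFFSETS[field]]

sp = 8
store(sp, "param0", 42)
store(sp, "local0", 7)
store(sp, "temp0", load(sp, "param0") + load(sp, "local0"))
```

Because every offset is a compile-time constant, the generated code needs only SP and an addition to reach any fixed-size item.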

SLIDE 10

Activation records: Design decisions

  • Items communicated between caller and callee placed near the caller
    – Parameters
    – Return value
    – Advantage?
  • Fixed-length items placed together
    – Parameters
    – Return value
    – Control link
  • Space requirement of locals/temporaries sometimes not known early

Frame layout, starting next to the caller’s AR: Actual parameters, Return value(s), Control/access link, Saved machine status, Local data, Temporaries

SLIDE 11

Complications

  • Access to non-local data
    – Store globals at a “globally known” location (recall Static from Slide 4?)
  • Nested procedures
    – Similar to yet different from nested blocks
    – Store nesting-depth with each variable
    – Use access links to point to the frames of enclosing procedures
  • Passing procedures as arguments
    – Or as return values
    – Functional languages
  • Challenge:
    – Doing all this efficiently

Some other time!

SLIDE 12

Referencing variables with access links

  • An access link points to the most recent activation of the procedure that contains the current procedure
    – When can we have multiple activations of a procedure on the control-stack?
  • Suppose
    – Np is the nesting-depth of procedure p that refers to non-local variable a
    – Na is the nesting-depth of the procedure, say q, that defines a
  • Np - Na access links would have to be traversed when in procedure p to get to the activation record of q

  • Can we make this more efficient?
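
The Np - Na traversal can be sketched as follows. The `Frame` class, its fields, and the intermediate procedure r are hypothetical; the sketch only illustrates the hop count, not how a compiler would lay out frames.

```python
class Frame:
    def __init__(self, proc, depth, access_link=None):
        self.proc = proc
        self.depth = depth              # nesting-depth of the procedure
        self.access_link = access_link  # frame of the enclosing procedure
        self.vars = {}

def lookup(frame, name, defining_depth):
    """Follow (frame.depth - defining_depth) access links, then read name."""
    hops = frame.depth - defining_depth
    target = frame
    for _ in range(hops):
        target = target.access_link
    return target.vars[name], hops

# q (depth 1) defines a; r (depth 2) is nested in q; p (depth 3) in r.
q = Frame("q", 1)
q.vars["a"] = 10
r = Frame("r", 2, access_link=q)
p = Frame("p", 3, access_link=r)
value, hops = lookup(p, "a", defining_depth=1)
```

Here Np = 3 and Na = 1, so two links are followed before a is read from q's frame.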

SLIDE 13

Displays as an alternative to access links

  • Traversing access links one-by-one may be costly when the nesting-depth difference to the variable being accessed is high
  • Idea:
    – Use a global array whose entry at index i points to the AR of the most recently activated procedure with nesting-depth i
    – The array is called a display (say d)
    – Advantage:
        • If I am a procedure m with nesting-depth k, and I want to access a variable a with nesting-depth l ≤ k, I only have to follow a maximum of two pointers:
            – One to d[l], which gives the AR defining a
            – Another for the offset of a from the SP of the obtained AR
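
A display can be sketched as a plain array indexed by nesting-depth. The `Frame` class and the `enter`/`leave` helpers are assumptions; in particular, saving and restoring the overwritten entry is a common way to maintain a display but is not described on the slide.

```python
class Frame:
    def __init__(self, proc, depth):
        self.proc, self.depth, self.vars = proc, depth, {}

display = [None] * 8   # display[i] -> most recent frame at nesting-depth i

def enter(frame):
    """On entry, remember the old display entry and install this frame."""
    saved = display[frame.depth]
    display[frame.depth] = frame
    return saved

def leave(frame, saved):
    """On exit, restore the entry that enter() overwrote."""
    display[frame.depth] = saved

def lookup(name, defining_depth):
    # One index into the display, then one access inside that frame.
    return display[defining_depth].vars[name]

q = Frame("q", 1)
q.vars["a"] = 10
m = Frame("m", 4)
saved_q = enter(q)
saved_m = enter(m)
value = lookup("a", defining_depth=1)   # same cost at any depth difference
leave(m, saved_m)
leave(q, saved_q)
```

Unlike the access-link walk, the cost of `lookup` does not grow with the difference between the two nesting-depths.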

Next class: Heap management.

SLIDE 14

CS502: Compiler Design Runtime Environments (Cont.) Manas Thakur

Fall 2020

SLIDE 15

Heap

  • A chunk of memory usually used for dynamically allocated data
    – using malloc, calloc, new, etc.
  • Goal:
    – Have as much space as possible to serve allocation requests
  • Challenge:
    – When to deallocate a previously allocated chunk
    – Why didn’t this challenge exist with a stack?
        • Memory associated with a frame gets popped automatically once the corresponding procedure finishes execution.

SLIDE 16

Memory allocation

  • Simple task
  • Keep a pointer to the first available memory location
  • Allocate the requested block when a request comes
    – Well, there are again multiple ways to do this:
        • First fit
        • Best fit
        • Read OS books for more!
  • Move the pointer to the next free location
  • Challenge:
    – Memory eventually fills up!
    – Need deallocations.
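
First fit and best fit can be contrasted on a small free list. Representing free memory as a list of `(start, length)` pairs, and the helper names below, are assumptions made for illustration; real allocators use more elaborate structures.

```python
def first_fit(free_blocks, size):
    """Return the index of the first free block that is large enough."""
    for i, (_start, length) in enumerate(free_blocks):
        if length >= size:
            return i
    return None

def best_fit(free_blocks, size):
    """Return the index of the smallest free block that is large enough."""
    best = None
    for i, (_start, length) in enumerate(free_blocks):
        if length >= size and (best is None or length < free_blocks[best][1]):
            best = i
    return best

def allocate(free_blocks, size, policy):
    """Carve size units out of the chosen block; return its start address."""
    i = policy(free_blocks, size)
    if i is None:
        return None                       # no block fits the request
    start, length = free_blocks[i]
    if length == size:
        del free_blocks[i]                # block used up completely
    else:
        free_blocks[i] = (start + size, length - size)
    return start

free_a = [(0, 50), (100, 20), (200, 30)]
free_b = list(free_a)
addr_first = allocate(free_a, 20, first_fit)   # carves the 50-unit block
addr_best = allocate(free_b, 20, best_fit)     # uses the exact 20-unit block
too_big = allocate(free_b, 60, first_fit)      # nothing is large enough
```

First fit is quicker to decide but splits the large block, whereas best fit keeps it intact for a later large request.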

SLIDE 17

Explicit deallocation

  • Programmer’s task to deallocate memory
  • Most languages till 1990s had explicit deallocation
    – Exception: Lisp had garbage collection far back in 1958!
  • Examples:
    – free in C
    – delete in C++
  • Problem:
    – Often difficult to visualize when to free memory
    – Deleting conservatively as well as aggressively may lead to memory-related issues

SLIDE 18

Problems with bad explicit deallocation

  • Too conservative:
    – Memory leaks
        • Memory fills up while running applications
        • Buy next smartphone with higher GBs of RAM!
            – What if it’s a high-end server at a government institute?
            – What if it’s an iPhone? :-)
  • Too aggressive:
    – Dangling pointers
        • A pointer to freed memory
        • Using such pointers might lead to weird (and harmful) behaviour

SLIDE 19

Implicit deallocation of memory

  • Also called garbage collection
  • Motto:
    – Don’t trust the programmer
    – Instead: Trust the compiler writer!
  • Idea:
    – Memory that is no longer in use should be reclaimed automatically
  • Examples:
    – OO: Java, Smalltalk
    – Functional: Lisp, ML, Haskell
    – Logic: Prolog
    – Scripting: Awk, Perl

SLIDE 20

Garbage collection schemes

  • One shot:
    – Pause the program
    – Give full control to a GC pass
    – Hope the situation improves once GC is over
  • On-the-fly (aka incremental):
    – Perform some GC actions periodically
        • say after each call to new, and/or every time a procedure returns
    – Sometimes a one-shot GC may be kept as backup
  • Concurrent:
    – Separate thread for GC
    – Relatively complicated, but gaining popularity

SLIDE 21

Garbage collection algorithms

  • Reference counting
  • Mark and sweep
  • Baker’s
  • Lieberman’s
  • Generational
  • Region-based
  • Parallel
  • Many in the JVM itself:
    – G1, Parallel, Concurrent mark and sweep (CMS), Serial, Shenandoah, ZGC ... list keeps growing.

We will get a glimpse of the colored ones (the slides that follow cover reference counting, mark and sweep, and generational GC)

SLIDE 22

Reference counting GC

  • With each allocated chunk (from now on, object) obj:
    – maintain the count of references (or pointers) that point to obj
  • Operations and actions:
    – Allocate obj (e.g., q = new T(), such that the allocated chunk is named obj):
        • Initialize obj.rc to one
    – Copy a reference to obj (e.g., using p = q):
        • ++obj.rc
    – A reference to obj changes to point to another object obj’ (e.g., q = r):
        • --obj.rc
    – obj.rc becomes zero:
        • Reclaim obj
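
The four rules above can be sketched with a simulated heap. The `Obj` class, the `env` dictionary of reference variables, and the `assign`/`new` helpers are hypothetical names; objects here have no fields of their own, so reclaiming one does not need to decrement any further counts.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 0        # reference count

heap = set()   # objects currently allocated
env = {}       # reference variables: name -> Obj or None

def assign(var, obj):
    """Make var refer to obj, adjusting the counts on both sides."""
    old = env.get(var)
    if obj is not None:
        obj.rc += 1              # one more reference to the new target
    env[var] = obj
    if old is not None:
        old.rc -= 1              # var no longer refers to old
        if old.rc == 0:
            heap.discard(old)    # reclaim

def new(var, name):
    """Allocate an object and bind var to it, so its rc becomes one."""
    obj = Obj(name)
    heap.add(obj)
    assign(var, obj)
    return obj

a = new("q", "A")        # q = new T()  -> A.rc == 1
assign("p", env["q"])    # p = q        -> A.rc == 2
b = new("r", "B")        # r = new T()  -> B.rc == 1
assign("q", env["r"])    # q = r        -> A.rc == 1, B.rc == 2
assign("p", None)        # p = null     -> A.rc == 0, A is reclaimed
```

Incrementing the new target before decrementing the old one keeps a self-assignment such as q = q from reclaiming the object by mistake.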

SLIDE 23

Reference counting: Disadvantages

  • Expensive to maintain counts with each object
    – Extra storage
    – Extra computation with each statement that updates references
  • Cyclic data structures
    – References keep pointing to each other, though no external reference may be pointing to the data structure as a whole
  • Memory fragmentation
    – No inbuilt compaction
  • Still a simple technique
    – Example: Used by the UNIX kernel to recover file descriptors
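
The cyclic-structure problem can be observed in CPython, which combines reference counting with a separate cycle collector. This sketch assumes the CPython interpreter; the `Node` class is hypothetical, and `gc.disable()` is used only to leave reference counting on its own for the experiment.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

gc.disable()                    # switch off the automatic cycle collector
a, b = Node(), Node()
a.other, b.other = b, a         # a and b now point to each other
probe = weakref.ref(a)          # observes the object without keeping it alive
del a, b                        # drop the only external references
alive_after_del = probe() is not None      # counts never reached zero

gc.collect()                    # run the tracing cycle collector once
alive_after_collect = probe() is not None
gc.enable()
```

The pair survives the `del` because each object still holds a reference to the other, and it is reclaimed only when a tracing pass runs.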

SLIDE 24

Mark and sweep GC

  • Two phases:
    – Mark
    – Sweep
  • Mark phase:
    – Marks all objects that are reachable from at least one reference
  • Sweep phase:
    – Reclaims all objects that were not marked in the mark phase
  • Advantages:
    – Cost is incurred only during the GC phase(s)
    – Compaction can be performed before actually reclaiming the memory during the sweep phase
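
The two phases can be sketched over a simulated heap. The `Obj` class, the explicit `roots` list, and modelling the heap as a Python list are assumptions; compaction is left out.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references to other objects
        self.marked = False

def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if not obj.marked:
            obj.marked = True
            worklist.extend(obj.refs)

def sweep(heap):
    """Sweep phase: keep marked objects, drop the rest, clear the marks."""
    survivors = [obj for obj in heap if obj.marked]
    for obj in survivors:
        obj.marked = False    # ready for the next collection
    return survivors

a, b, c, d = (Obj(n) for n in "abcd")
a.refs.append(b)              # b is reachable through the root a
c.refs.append(d)
d.refs.append(c)              # c and d form an unreachable cycle
heap = [a, b, c, d]
mark(roots=[a])
heap = sweep(heap)
```

The unreachable cycle between c and d is reclaimed here, which is exactly the case that plain reference counting misses.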

SLIDE 25

Generational GC

  • (Statistical) Idea:
    – Most objects fall out of use very quickly (high infant mortality!)
    – Conversely, an old object might stay in business for some more time
  • Divide objects into two or more classes (generations):
    – Older generation
    – Younger generation
  • Each GC pass checks objects only in the younger generation for reclamation, and keeps moving objects to the older generation
  • Trigger full GC (of any kind) less frequently
  • Overall, reduces GC cost while still being correct
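
A minor collection over two generations can be sketched as below. Several choices are assumptions rather than anything the slide prescribes: survivors are promoted after a single pass, and every reference held by an old object is treated as a root, which is a crude stand-in for the bookkeeping real collectors use to track old-to-young references.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []

young, old = [], []   # the two generations

def new(name):
    """Allocate into the younger generation."""
    obj = Obj(name)
    young.append(obj)
    return obj

def minor_gc(roots):
    """Reclaim unreachable young objects; promote the survivors."""
    young_ids = {id(o) for o in young}
    # Roots are the program roots plus whatever old objects point to.
    worklist = list(roots) + [t for o in old for t in o.refs]
    live = {}
    while worklist:
        obj = worklist.pop()
        if id(obj) in young_ids and id(obj) not in live:
            live[id(obj)] = obj
            worklist.extend(obj.refs)   # trace only through young objects
    survivors = [o for o in young if id(o) in live]
    old.extend(survivors)               # promote
    young.clear()                       # the rest of young is reclaimed
    return survivors

cache = new("cache")
minor_gc(roots=[cache])            # cache survives and is promoted
tmp = new("tmp")                   # short-lived, never referenced again
entry = new("entry")
cache.refs.append(entry)           # old -> young reference
survivors = minor_gc(roots=[cache])
```

Old objects are never examined for reclamation in `minor_gc`; that is left to the less frequent full collection.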

SLIDE 26

Swachch Heap Abhiyaan (Clean Heap Mission)

  • Motto: Help the garbage collector
  • Why?
    – GC is costly; pauses can be noticed while running large programs
  • How?
    – Program analyses:
        • Escape analysis (identify and allocate objects local to a procedure on the stack)
    – Programmer-guided:
        • Assign null to no-longer needed reference variables
        • Mix explicit and implicit deallocation
  • Easier said than done; popular subjects of research.

SLIDE 27

What next?

  • Code Generation and Optimization (CGO)
    – A very interleaved and interesting topic
    – Final outcome:
        • Target code that is also efficient