Runtime System (COMP 524: Programming Languages)



SLIDE 1

COMP 524: Programming Languages

Based in part on slides and notes by J. Erickson, S. Krishnan, B. Brandenburg, S. Olivier, A. Block, and others

Runtime System

SLIDE 2
  • B. Ward — Spring 2014

What is the Runtime System (RTS)?

Language runtime environment.

➡OS view: RTS is part of the user program.
➡But RTS was not programmed by the language user.
➡The RTS is everything not part of the OS and not explicitly provided by the user (i.e., the program or 3rd party libraries).

[Figure: two layer diagrams. Interpreted, layers shown: Program, Standard Library, Interpreter / Virtual Machine, Operating System, Hardware. Compiled, layers shown: Program, Standard Library, Operating System, Hardware.]

SLIDE 3

What is the Runtime System (RTS)? (contd.)

Examples: memory allocator, garbage collector, support for runtime casts, exception handling infrastructure, just-in-time (JIT) compiler, support for closures and anonymous functions, lazy evaluation, dynamic type checking, byte code verifier, OS abstraction layers (if any), class-loading and plugin support (if any), multi-threading support, remote procedure calls (e.g., Java RMI), …

SLIDE 4

What is the Runtime System (RTS)? (contd.)

RTS: the infrastructure required to (transparently) realize higher-level language abstractions at runtime.

SLIDE 5

Our Focus

We’ll discuss three RTS components.

➡Garbage collection.
➡Just-in-Time compilation.
➡Security issues.

SLIDE 6

Heap Management

Allocation and deallocation of objects on the heap.

➡Arbitrary object lifetime.
➡Traditional language design:
  • Code, static data, and runtime stack managed by compiler / interpreter.
  • Heap managed by programmer.

[Figure: simplified 32-bit memory model. Regions shown: Code, Static, Runtime stack, Heap, over increasing virtual addresses from 0x0 to 0xffffffff.]

SLIDE 7

Garbage and Memory Reclamation

Memory reclamation.

➡An object is “garbage” if it is not going to be used again.
➡Memory holding garbage must be reclaimed in long-running programs.

SLIDE 8

Explicit Heap Management

Classic imperative approach.

➡malloc/free, new/delete, etc.
➡Problems: dangling pointers, memory leaks…
➡Experience suggests that programmers, on average, are not very good at correctly identifying garbage.

SLIDE 9

Garbage Collection

Automatic heap management.

➡The RTS, not the programmer, should manage memory.
➡First developed for Lisp in 1958.
➡Merits hotly contested until the ’90s.

SLIDE 10

Languages Using GC

➡Essential in functional languages
  • e.g., Haskell, ML.
➡Key feature of scripting languages
  • e.g., Python, Perl.
➡Increasingly popular in modern imperative languages
  • e.g., Java, C#.

SLIDE 11

Reachable Objects

Root Set: the set of objects that are immediately available to a program without following any pointers/references.

Object graph.

➡Allocated objects form a graph.
  • Vertices: objects.
  • Edges: references/pointers.
➡Any non-garbage object must be reachable from the root set.

SLIDE 12

Detecting Garbage

➡When is an object no longer being referenced?
➡False positives (a live object is treated as garbage): program crash.
➡False negatives (garbage is not detected): memory leak.

SLIDE 13

Garbage Collection Techniques

Garbage collection techniques.

➡Reference counting.
➡Mark-and-sweep collection.
➡Stop-and-copy.
➡Generational collection.

SLIDE 14

Reference Counting

Indirect reachability.

➡Each object has an associated reference counter.
➡Object graph: how many incoming edges?

SLIDE 15

Reference Counting Invariant

➡Counter is incremented when a new reference is acquired.
➡Counter is decremented when a reference is removed.
➡If an object is reachable, then its associated reference counter is positive.
➡So what can we say if the reference counter is zero?

SLIDE 16

Reference Counting

Widespread use.

➡Easy to implement in C (but error-prone).
➡Used in Linux kernel, Python, many other projects.
➡Some C++ projects/libraries have a RefPtr class to automate the process (counter updated in constructor/destructor; less error-prone than C).

SLIDE 17

Reference Counting Example

Each object has an associated reference counter.
  • System keeps reference counter up to date.

str1 = “foo”

[Figure: stack variables str1 and str2; heap object “foo” with reference count 1.]

SLIDE 18

Reference Counting Example

After object allocation: reference counter is initially one.

str1 = “foo”

[Figure: stack variables str1 and str2; heap object “foo” with reference count 1.]

SLIDE 19

Reference Counting Example

Adding a new reference increments the counter.

str2 = str1

[Figure: stack variables str1 and str2; heap object “foo” with reference count 2.]

SLIDE 20

Reference Counting Example

Removing a reference decrements the counter.

str1 = null

[Figure: stack variables str1 and str2; heap object “foo” with reference count 1.]

SLIDE 21

Reference Counting Example

No remaining references: it is now safe to deallocate the object.

str2 = null

[Figure: stack variables str1 and str2; heap object “foo” with no remaining references.]
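
The four assignments in this example can be replayed as a small Python sketch. It is only an illustration: the `RefCounted` class and the `acquire`/`release` helpers are hypothetical names, and the counter is updated by hand here, whereas a real RTS would do so transparently.

```python
class RefCounted:
    """Toy heap object with an explicit reference counter (hypothetical)."""

    def __init__(self, value):
        self.value = value
        self.count = 0
        self.freed = False


def acquire(obj):
    """A new reference to obj is created: increment the counter."""
    obj.count += 1
    return obj


def release(obj):
    """A reference to obj is removed: decrement, and free at zero."""
    obj.count -= 1
    if obj.count == 0:
        obj.freed = True  # stands in for returning memory to the allocator
    return None


foo = RefCounted("foo")
str1 = acquire(foo)    # str1 = "foo"  -> counter is 1
str2 = acquire(str1)   # str2 = str1   -> counter is 2
str1 = release(str1)   # str1 = null   -> counter is 1
still_alive = not foo.freed
str2 = release(str2)   # str2 = null   -> counter is 0, object is freed
```

Note that `foo` itself is only a handle for the sketch and is deliberately not counted as a reference.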

SLIDE 22

Reference Counting Efficiency Problems

➡Increases number of (slow) writes.
➡With multithreading, it may require (even slower) atomic updates.

SLIDE 23

Reference Counting Accuracy Problems

➡Disjoint union types: what if one variant contains a reference, and another doesn’t?
  • Reference counting must track variant tags.
➡In a weakly typed language such as C?
  • Cannot reliably tell pointers and integers apart.
➡Cannot detect circular garbage.

SLIDE 24

Cycles in the Object Graph

[Figure: heap objects “larry”, “moe”, and “curly” with their reference counts, shown twice. First with the stack variable stooges present (counts 2, 1, 1), then with counts 1, 1, 1.]

SLIDE 25

Cycles in the Object Graph, contd.

Memory leak: not reachable, but will not be deallocated.
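
The leak can be observed directly, assuming the code runs on CPython, which combines reference counting with a separate cycle detector. The `Node` class is a hypothetical stand-in for the objects in the figure. With the cycle detector switched off, the three objects survive even though no root references them.

```python
import gc
import weakref


class Node:
    """Hypothetical list node; `next` is an edge in the object graph."""

    def __init__(self, name):
        self.name = name
        self.next = None


gc.disable()  # switch off CPython's cycle detector; counting still runs

larry, moe, curly = Node("larry"), Node("moe"), Node("curly")
larry.next, moe.next, curly.next = moe, curly, larry  # a three-object cycle
probe = weakref.ref(larry)  # weak references do not keep the object alive

del larry, moe, curly          # no root references the cycle any more
leaked = probe() is not None   # the counters never reached zero

gc.collect()                   # an explicit tracing pass finds the cycle
reclaimed = probe() is None
gc.enable()
```

On other Python implementations the timing of reclamation may differ, so the two flags are only meaningful under the CPython assumption.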

SLIDE 26

Mark and Sweep GC

Direct reachability.

➡Instead of using a counter to track possible incoming paths, actually discover all paths at runtime by traversing the object graph.
➡Anything not visited must be garbage.
➡Every object carries an “in-use” flag.

SLIDE 27

Mark and Sweep, contd.

Algorithm concept.

➡Mark every object in the heap as unreachable by clearing all “in-use” flags.
➡Starting from the root set, traverse all references.
➡Mark every visited object as reachable by setting its flag.
➡Reclaim all unused objects (“sweep”).
➡Run when memory is “low”.
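
The algorithm concept above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions: the heap is a plain list, `Obj` is a hypothetical object type, and "reclaiming" just means leaving an object out of the returned list.

```python
class Obj:
    """Hypothetical heap object: a name, outgoing references, an in-use flag."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.in_use = False


def mark_and_sweep(heap, roots):
    """Return the surviving objects; everything else counts as reclaimed."""
    for obj in heap:            # clear all in-use flags
        obj.in_use = False
    pending = list(roots)       # traverse all references from the root set
    while pending:
        obj = pending.pop()
        if not obj.in_use:
            obj.in_use = True   # mark visited objects as reachable
            pending.extend(obj.refs)
    return [obj for obj in heap if obj.in_use]   # sweep


a, b, c, d = (Obj(n) for n in "abcd")
a.refs.append(b)   # a -> b, reachable from the root a
c.refs.append(d)   # c and d form a cycle with no path from the roots
d.refs.append(c)
survivors = mark_and_sweep([a, b, c, d], roots=[a])
print([obj.name for obj in survivors])   # prints ['a', 'b']
```

Unlike reference counting, the unreachable cycle between `c` and `d` is reclaimed, because it is never visited during the traversal.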

slide-28
SLIDE 28
  • B. Ward — Spring 2014

Mark and Sweep: Challenges

28

“Stop the world” GC.

➡What if object graph is changed during traversal? ➡Simple solution: program execution is halted during GC.

  • Can cause noticeable pauses.

SLIDE 29

Mark and Sweep: Challenges, contd.

Concurrent Garbage Collector: GC and program can run concurrently (i.e., any interleaving is acceptable).

Incremental Garbage Collector: GC does not process the whole object graph at once. Instead, it is invoked more frequently.
SLIDE 30

Mark and Sweep: Challenges

Identifying objects.

➡How to identify objects in the heap?
  • Must carry size/type tags, or have uniform size.
  • Alternative: allocate objects of equal size/type from specific address ranges.
  • Sometimes called “Big Bag of Pages” (BIBOP).
➡How to discern arbitrary values from pointers?
  • Could have a number that “points” to a garbage object.
  • Could have a number that “points” outside of heap bounds.

SLIDE 31

Mark and Sweep: Challenges, contd.

Precise Garbage Collector: GC can unambiguously determine whether a given value is a pointer/reference.

Conservative Garbage Collector: works without discerning pointers/references from other values with certainty.
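
As a rough illustration of the conservative approach, the following Python sketch treats every machine word that falls inside an allocated object as a potential pointer. The heap bounds, the `allocated` table, and the function name are all assumptions made up for this example.

```python
HEAP_START = 0x1000   # hypothetical heap bounds for this sketch
HEAP_END = 0x2000
allocated = {0x1000: 32, 0x1040: 64}   # start address -> object size in bytes


def may_be_pointer(word):
    """Conservative test: could this machine word refer to a heap object?"""
    if not (HEAP_START <= word < HEAP_END):
        return False   # "points" outside of heap bounds
    return any(start <= word < start + size
               for start, size in allocated.items())


stack_words = [0x1040, 42, 0x1050, 0x1900, 0x5000]
kept_alive = [hex(w) for w in stack_words if may_be_pointer(w)]
```

Here `0x1050` may be a plain integer that merely looks like an address inside the second object. A conservative collector keeps that object alive anyway: it can leak memory, but it never frees an object that is still in use.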

SLIDE 32

Mark and Sweep: Challenges

Memory requirements.

➡GC algorithm runs when memory is scarce.
➡Graph traversal requires memory itself!
  • Proportional to the longest path in the object graph.
  • Reserves are wasteful…

Tradeoff.

➡Implementation complexity vs. efficiency.
➡Could use incremental GC to reduce problem.
➡Specialized stack-less techniques exist.

SLIDE 33

Mark&Sweep vs. Ref. Counting

Reference counting.

➡Occurs continuously: no pauses.
  • But: overheads are incurred continuously, too.
➡Leaks circular structures.
➡Relatively easy to implement.

Mark & Sweep.

➡Difficult to implement efficiently.
  • Esp. avoiding “stop the world”.
➡Pauses, but otherwise fast execution and allocation.
➡With precise GC, no leaking of unreachable objects.

SLIDE 34

Copying Garbage Collection

Partitioned heap.

➡Two arenas: live objects arena and free space.
➡Allocate from live objects arena until full.
➡Then mark&sweep to find all live objects.
➡Copy all live objects to free space.
  • Fast consecutive allocation.
➡Switch roles: formerly live arena is now free.

Simply identifying and freeing garbage doesn’t solve fragmentation.
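
A compact way to see the copy step is the sketch below. It departs from the slide in one respect, stated plainly: instead of a separate marking pass followed by copying, it finds and copies live objects in a single scan, in the style of Cheney's algorithm. Objects are modelled as `(name, refs)` tuples and "addresses" are list indices, both assumptions of this example.

```python
def collect(from_space, roots):
    """Copy everything reachable from roots; return (to_space, new_roots)."""
    to_space = []
    forward = {}   # old address -> new address (forwarding table)

    def copy(addr):
        if addr not in forward:
            forward[addr] = len(to_space)         # bump pointer in to-space
            name, refs = from_space[addr]
            to_space.append((name, list(refs)))   # refs still hold old addresses
        return forward[addr]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):   # scan copied objects, fixing up their refs
        name, refs = to_space[scan]
        to_space[scan] = (name, [copy(r) for r in refs])
        scan += 1
    return to_space, new_roots


# Addresses 1 and 3 hold garbage; "a" and "b" reference each other.
from_space = [("a", [2]), ("junk", [0]), ("b", [0]), ("junk2", [])]
to_space, new_roots = collect(from_space, roots=[0])
```

After the call, the survivors sit next to each other at the start of the new arena, so the next allocation only has to append at the end. The garbage entries are never touched.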

SLIDE 35

Copying Garbage Collection

[Figure: heap split into a Live Arena and a Free Arena.]

SLIDE 36

Copying Garbage Collection

[Figure: Live Arena and Free Arena; the Live Arena holds useful data.]

SLIDE 37

Copying Garbage Collection

[Figure: Live Arena and Free Arena; part of the Live Arena is garbage.]

SLIDE 38

Copying Garbage Collection

[Figure: Live Arena and Free Arena; an allocation (initially) fails.]

SLIDE 39

Copying Garbage Collection

[Figure: Copy GC. Before: Live Arena, Free Arena. After: Free Arena, Live Arena.]

SLIDE 40

Copying Garbage Collection, contd.

Garbage doesn’t need to be explicitly reclaimed.

SLIDE 41

Copying Garbage Collection, contd.

Very fast allocation: no searching for available space.

SLIDE 42

Copying Garbage Collection, contd.

Limitation: half of the heap is unused.

SLIDE 43

Generational GC

Generational Hypothesis.

➡In many programs there is high “infant mortality.”
➡Most objects are short-lived: they become garbage quickly after allocation.
➡Thus, “older” objects are less likely to become garbage.

SLIDE 44

Generational GC

Arenas for different “ages”.

➡Multiple allocation arenas.
➡The “generation 0 arena” (the “nursery”) is used for new allocations.
➡“Survivors” are copied to the next arena.
➡That arena is also GC’ed at some point, at which point generation 1 objects move to the generation 2 arena, etc.

SLIDE 45

Generational GC, contd.

➡Objects that are unlikely to be garbage are only examined infrequently: reduced GC runtime.
➡New objects can be allocated very cheaply from the nursery (simply increment the “end of last object” pointer).
➡Modern high-performance VMs often use this approach (e.g., Java Virtual Machine).
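
The nursery idea can be sketched as follows. This is a toy model, not a description of any particular VM: `GenerationalHeap` is a hypothetical class, only two generations are modelled, and the `is_live` callback stands in for a real root scan.

```python
class GenerationalHeap:
    """Toy two-generation heap (hypothetical names, fixed-size nursery)."""

    def __init__(self, nursery_size):
        self.nursery = [None] * nursery_size
        self.top = 0                 # "end of last object" pointer
        self.old = []                # generation 1 arena
        self.minor_collections = 0

    def alloc(self, value, is_live):
        """Allocate in the nursery; is_live(value) stands in for a root scan."""
        if self.top == len(self.nursery):
            self.minor_gc(is_live)
        self.nursery[self.top] = value
        self.top += 1                # allocation is a single pointer bump

    def minor_gc(self, is_live):
        """Examine only the nursery; survivors move to the older arena."""
        self.old.extend(v for v in self.nursery[:self.top] if is_live(v))
        self.top = 0
        self.minor_collections += 1


heap = GenerationalHeap(nursery_size=4)
for i in range(10):
    # Pretend that only one object in four is still referenced.
    heap.alloc(i, is_live=lambda v: v % 4 == 0)
```

Each minor collection looks at four nursery slots only. Objects already promoted to `heap.old` are not examined again in this sketch.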

SLIDE 46

GC vs. Manual Deallocation

Efficiency.

➡Correct manual heap management is more efficient than naive GC.
➡But software development cost considerations strongly favor GC.
➡GC can be faster than manual management due to reduced allocation costs (copying GC).

SLIDE 47

GC vs. Manual Deallocation, contd.

Finalizers and non-memory resources.

➡Languages such as Java use finalizers to free non-memory resources (such as file handles) when an object is freed.
➡Problem: may run out of non-memory resources before GC kicks in.
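
One common way around this problem is to release scarce resources deterministically and keep the finalizer only as a safety net. The sketch below shows the pattern with Python's `with` statement (Java offers try-with-resources for the same purpose). The `Handle` class and its limit of three handles are invented for illustration.

```python
class Handle:
    """Hypothetical wrapper around a scarce non-memory resource."""

    open_handles = 0
    LIMIT = 3   # pretend the OS allows only three handles

    def __init__(self):
        if Handle.open_handles >= Handle.LIMIT:
            raise OSError("out of handles")
        Handle.open_handles += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            Handle.open_handles -= 1

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()    # deterministic release, independent of GC timing
        return False

    def __del__(self):
        self.close()    # finalizer: only a safety net


for _ in range(10):     # never exceeds LIMIT: each handle is closed promptly
    with Handle() as h:
        pass
```

The loop opens ten handles in total but never holds more than one at a time, regardless of when the collector runs.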

SLIDE 48

Just-in-Time Compilation (JIT)

Static compilation.

➡Compile time vs. run time.
➡Compiler produces machine code once; resulting program is executed many times.

Pure interpretation.

➡Interpreter evaluates syntax tree directly.
➡Slow.

Bytecode interpretation.

➡Source compiled to bytecode.
➡Bytecode interpreted by VM.
➡Still slower than statically compiled programs.

[Figure: a Compiler translates the Source Program into an Intermediate Program; a Virtual Machine takes the Intermediate Program and Input and produces Output.]

SLIDE 49

Just-in-Time Compilation (JIT), contd.

JIT: compile byte code at run time to speed up overall program execution.

SLIDE 50

Just-in-Time Compilation (JIT), contd.

Static compilation is sometimes referred to as ahead-of-time compilation (AOT).
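
To make the bytecode route concrete, here is a minimal sketch of a compiler to stack byte code together with the VM loop that interprets it. The three opcodes (`PUSH`, `ADD`, `MUL`) and the tuple-based expression format are made up for this example and do not correspond to any real byte code.

```python
def compile_expr(expr):
    """Compile a nested tuple such as ("+", 2, ("*", 3, 4)) to stack byte code."""
    if isinstance(expr, tuple):
        op, left, right = expr
        opcode = "ADD" if op == "+" else "MUL"
        return compile_expr(left) + compile_expr(right) + [(opcode, None)]
    return [("PUSH", expr)]


def run(code):
    """Interpret the byte code on a simple operand stack."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()


code = compile_expr(("+", 2, ("*", 3, 4)))
result = run(code)   # 2 + 3 * 4 = 14
```

The dispatch on `opcode` inside the loop is paid for every instruction executed, which is the overhead a JIT compiler tries to remove for frequently executed code.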

SLIDE 51

Idea

“Write once, run anywhere.”

➡Combine efficiency of compilation with flexibility of interpretation.
➡“Late binding of machine code.”
➡Java: web applets, mobile phones, embedded systems…

SLIDE 52

Limitations and How They Are Addressed

Overheads.

➡Startup delay.
  • After a program starts, parts must be compiled before output is produced, which can result in a noticeable delay.
  • Hide by running interpreter and JIT compiler in parallel.
  • Avoid compiling whole program at once.

SLIDE 53

JIT Overhead Management

Piecewise compilation.

➡Program is compiled on demand in small chunks.
➡Subroutine at a time, maybe even only parts of a subroutine.

Tradeoff.

➡Compilation takes considerable time…
➡…but compiled code is faster.
➡Thus: compiled code must be executed many times to make tradeoff beneficial.

SLIDE 54

JIT Overhead Management

Threshold.

➡Practical JIT systems trigger compilation only for code fragments that are executed more often than some threshold (e.g., 100 times).
➡Intuition: focus on the common paths.
  • avoid initialization code and rare error paths
  • optimize main work loops

SLIDE 55

JIT Overhead Management, contd.

The exact threshold depends on the efficiency of the byte code interpreter and the JIT compilation speed and must be determined experimentally.
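
The threshold mechanism can be sketched with a call counter. Everything here is illustrative: `HotSpotCounter`, `fake_compile`, and the value 100 are assumptions, and "compilation" is simulated by swapping in a faster Python function rather than generating machine code.

```python
JIT_THRESHOLD = 100   # illustrative value; real systems tune this experimentally


class HotSpotCounter:
    """Hypothetical wrapper that 'compiles' a function once it becomes hot."""

    def __init__(self, interpreted, compile_fn):
        self.interpreted = interpreted
        self.compile_fn = compile_fn
        self.calls = 0
        self.compiled = None

    def __call__(self, *args):
        if self.compiled is not None:
            return self.compiled(*args)   # fast path after compilation
        self.calls += 1
        if self.calls > JIT_THRESHOLD:
            self.compiled = self.compile_fn(self.interpreted)
            return self.compiled(*args)
        return self.interpreted(*args)


def slow_square(x):       # stands in for interpreted byte code
    return sum(x for _ in range(x))


def fake_compile(fn):     # stands in for the JIT compiler
    return lambda x: x * x


square = HotSpotCounter(slow_square, fake_compile)
results = [square(7) for _ in range(150)]
```

Code that runs only a handful of times, such as initialization, never reaches the threshold and therefore never pays the compilation cost.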

SLIDE 56

Optimization vs. JIT Compilation

[Figure: tradeoff diagram. Labels: Fast Machine Code, Many Advanced Optimizations, Increased Total Runtime, Slower JIT C., Higher JIT Threshold. Goal: Lower Total Runtime.]

SLIDE 57

JIT Overhead Management

Simplicity wins.

➡Only simple transformations.
➡No “big picture” optimization.
➡Fast, non-optimal algorithms instead of slower, provably better algorithms.

SLIDE 58

Optimizations

[Figure: an AOT Compiler translates the Source Program into Byte Code; the VM + JIT takes the Byte Code and Input and produces Output.]

SLIDE 59

Optimizations, contd.

The “heavy lifting” (done by the AOT compiler): intra-procedural analysis, common sub-expression analysis, dead code elimination, flow analysis, polymorphism, etc.

SLIDE 60

Optimizations, contd.

Simple transformations (done by the VM + JIT): basic byte code blocks to equivalent machine code.

SLIDE 61

JIT Advantages

Trace collection.

➡Record execution statistics during interpretation.
➡Can (re-)optimize at run time.

JIT can outperform AOT.

➡Additional information available at run time.
  • Specific types (instead of interfaces), accurate branch prediction.
➡Can be used to generate specialized code.
  • E.g., suppress error checking that is not needed for a particular data set.
➡Additional inlining possibilities.

SLIDE 62

JIT Advantages, contd.

Tradeoff: long-running vs. short-running processes.

Example: Java VM has a server mode that spends more time on aggressive optimizations.
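
How run-time information leads to specialized code can be hinted at with a guarded fast path. The sketch is schematic: in Python the "specialized" version is not actually faster, and the function names and the recorded profile are assumptions. A real JIT would emit machine code for the guarded branch.

```python
def generic_add(a, b):
    """Generic path: works for any operands that support +."""
    return a + b


def specialize_for_ints():
    """Return code specialized for ints, protected by a cheap type guard."""
    def add(a, b):
        if type(a) is int and type(b) is int:   # guard
            return a + b                 # a JIT would emit a raw machine add here
        return generic_add(a, b)         # guard failed: fall back to generic code
    return add


observed = [(1, 2), (3, 4), (5, 6)]      # profile recorded during interpretation
only_ints = all(type(a) is int and type(b) is int for a, b in observed)
add = specialize_for_ints() if only_ints else generic_add
```

The guard keeps the specialized code correct when an unexpected type shows up later, for example `add("a", "b")`, which simply takes the generic path.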

SLIDE 63

JIT and Prototype-Based Languages

Challenges.

➡Java: JIT on class methods.
➡What if there are no classes?

Tracing JIT.

➡Derive “implicit” classes based on source code location where object was created (i.e., where the prototype was assigned).
➡Most prototypes are not changed during run time.
➡Must re-JIT an object if either
  • the object’s prototype is changed, or
  • a new prototype is assigned.

SLIDE 64

Binary Translation / Binary Rewriting

Compiling machine code to machine code.

➡Either AOT or JIT.
➡Basically a compiler without source code.

Uses.

➡Debugging, logging (add invariant checking, etc.).
➡Virtualization (e.g., VMware ESX Server, early IBM servers).
➡Performance analysis.
➡Adding security hooks.
  • Or exploits…
➡Legacy system emulation.
  • E.g.: Apple’s Rosetta.

SLIDE 65

Security Issues

Untrusted code.

➡Third party code that might be malicious.
➡Often downloaded automatically via Internet.
  • Embedded JavaScript, Java applets, Flash, etc.
  • Browser plugins.

SLIDE 66

Sandboxing

➡Idea: Do not give untrusted code direct system access.
➡Instead: Controlled environment: code must perform actions through the runtime system and request permission for resources it needs.
➡Examples: JRE, Android, iOS, JavaScript in browsers, etc.
➡Can be a hassle from the user’s standpoint, depending on the implementation.
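
The control flow of a sandbox can be illustrated with a small facade through which all actions must pass. This is a sketch of the idea only: the `Sandbox` class and the permission names are invented, no real file or network access takes place, and plain Python offers no actual isolation, so untrusted Python code could bypass such a facade.

```python
class PermissionDenied(Exception):
    pass


class Sandbox:
    """Hypothetical runtime facade: untrusted code only sees these methods."""

    def __init__(self, granted):
        self.granted = set(granted)
        self.log = []

    def request(self, permission):
        if permission not in self.granted:
            raise PermissionDenied(permission)

    def read_file(self, path):
        self.request("fs.read")
        self.log.append(("read", path))
        return f"<contents of {path}>"   # placeholder, no real file access

    def open_socket(self, host):
        self.request("net.connect")
        self.log.append(("connect", host))


def untrusted_plugin(rt):
    data = rt.read_file("notes.txt")
    rt.open_socket("example.org")        # this permission is not granted below
    return data


rt = Sandbox(granted={"fs.read"})
try:
    untrusted_plugin(rt)
    blocked = False
except PermissionDenied:
    blocked = True
```

The plugin can read the file it was allowed to read, but the attempt to open a network connection is stopped by the runtime before anything happens.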

SLIDE 67

Trusted Code

Byte code validation.

➡Proving arbitrary properties of arbitrary source code is impossible.
  • Halting problem…
➡Idea: allow only “known good” byte code.
  • Be conservative.

Alternative.

➡Code signing: attestation by trusted third party “this is ok.”
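
A conservative byte code check can be sketched for a hypothetical three-opcode stack machine. The opcode table and the acceptance rule are assumptions of this example; real verifiers, such as the one in the JVM, check far more properties.

```python
# opcode -> (values popped, values pushed); anything else is rejected
KNOWN_GOOD = {"PUSH": (0, 1), "ADD": (2, 1), "MUL": (2, 1)}


def verify(code):
    """Accept only code built from known opcodes that never underflows the stack."""
    depth = 0
    for opcode, _arg in code:
        if opcode not in KNOWN_GOOD:
            return False        # be conservative: unknown means rejected
        pops, pushes = KNOWN_GOOD[opcode]
        if depth < pops:
            return False        # would pop from an empty stack
        depth += pushes - pops
    return depth == 1           # exactly one result is left on the stack


good = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]
underflow = [("PUSH", 2), ("ADD", None)]
unknown = [("PUSH", 2), ("SYSCALL", 60)]
```

The check runs in one pass over straight-line code and never executes it. It rejects some harmless programs, which is the price of being conservative.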

SLIDE 68

Trusted Code, contd.

Java Track Record: Many bugs and thus security vulnerabilities over the years.

SLIDE 69

Trusted Code, contd.

Example: Microsoft-certified Windows device drivers.