CSCI 3136 Principles of Programming Languages: Data Types and Memory Management



SLIDE 1

CSCI 3136 Principles of Programming Languages

Data Types and Memory Management

Summer 2013 Faculty of Computer Science Dalhousie University

1 / 56

SLIDE 2

What is a Type System?

A type system is a mechanism for defining types and associating them with the operations that can be performed on objects of each type. A type system includes rules that specify

  • Type equivalence: Do two values have the same type?
  • Type compatibility: Can a value of a certain type be used in a certain context?
  • Type inference: How is the type of an expression computed from the types of its parts?
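As a small illustration of the type-inference rule, the following C sketch asks the compiler which type it computed for an expression from the types of its operands. The macro `TYPE_NAME` and both function names are made up for this example, and `_Generic` assumes a C11 compiler.

```c
/* Hypothetical helper macro: maps the static type of an expression to a
   printable name. _Generic is a C11 feature; the expression itself is
   not evaluated. */
#define TYPE_NAME(e) _Generic((e), \
    int: "int",                    \
    double: "double",              \
    char: "char",                  \
    default: "other")

/* int + double: the int operand is converted, so the sum has type double. */
const char *type_of_mixed_sum(void) {
    int i = 1;
    double d = 2.0;
    return TYPE_NAME(i + d);
}

/* char + char: both operands are promoted to int before the addition. */
const char *type_of_char_sum(void) {
    char a = 'a', b = 'b';
    return TYPE_NAME(a + b);
}
```

On common platforms `type_of_mixed_sum()` yields "double" and `type_of_char_sum()` yields "int", which shows that the type of an expression is not always the type of either operand.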

2 / 56

SLIDE 3

Types in a Language

  • Strongly typed: Prohibits application of an operation to any object not supporting this operation.
  • Statically typed: Strongly typed and type checking is performed at compile time (Pascal, C, Haskell, ...)
  • Dynamically typed: Types of operands of operations are checked at run time (LISP, Smalltalk, ...)
  • Programmer does not specify types at all; the compiler infers types from context (e.g., ML)

3 / 56

SLIDE 5

Definition of Types

Similar to subroutines in many languages, defining a type has two parts:

  • A type’s declaration introduces its name into the current scope.
  • A type’s definition describes the type (the simpler types it is composed of).

Three ways to think about types:

  • Denotational: A type is a set of values.
  • Constructive: A type is built-in or composite.
  • Abstraction-based: A type is defined by an interface, the set of operations it supports.

5 / 56

SLIDE 6

Classification of Types

Built-in types

  • Integers, Booleans, characters, real numbers, . . .

Enumeration and range types (neither built-in nor composite)

  • C:

    enum DAY            /* Defines an enumeration type    */
    {
        saturday,       /* Names day and declares a       */
        sunday = 0,     /* variable named workday with    */
        monday,         /* that type                      */
        tuesday,
        wednesday,      /* wednesday is associated with 3 */
        thursday,
        friday
    } workday;

  • Pascal: 0..100

Composite types

  • Records, arrays, files, lists, sets, pointers, . . .

6 / 56

SLIDE 7

Records

  • A nested record definition in Pascal:

    type ore = record
        name : short_string;
        element_yielded : record
            name : two_chars;
            atomic_n : integer;
            atomic_weight : real;
            metallic : Boolean
        end
    end;

  • Accessing fields

    ore.element_yielded.name
    name of element_yielded of ore

7 / 56

SLIDE 8

Memory Layout of Records

Aligned (fixed ordering)

  − Potential waste of space
  + One machine operation per element access
  + Guaranteed layout in memory (good for systems programming)

Packed

  + No waste of space
  − Multiple machine operations per memory access
  + Guaranteed layout in memory (good for systems programming)

Aligned (optimized ordering)

  ± Reduced space overhead
  + One machine operation per memory access
  − No guarantee of layout in memory (bad for systems programming)
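The space trade-off between the two aligned layouts can be observed in C, as in the sketch below. It rests on some assumptions: the struct and function names are invented for this example, the "optimized" ordering is applied by hand (C keeps fields in declaration order), and the amount of padding is implementation-defined.

```c
#include <stddef.h>

/* Fields in declaration order. A typical compiler inserts padding after
   'tag' and after 'flag' so that 'weight' stays aligned. */
struct fixed_order {
    char   tag;
    double weight;
    char   flag;
};

/* The same fields with the largest one first: the two chars can now
   share the space that would otherwise be padding. */
struct reordered {
    double weight;
    char   tag;
    char   flag;
};

/* Padding bytes = size of the struct minus the sizes of its fields. */
size_t padding_fixed(void) {
    return sizeof(struct fixed_order)
         - (sizeof(char) + sizeof(double) + sizeof(char));
}

size_t padding_reordered(void) {
    return sizeof(struct reordered)
         - (sizeof(char) + sizeof(double) + sizeof(char));
}
```

Comparing `padding_fixed()` with `padding_reordered()` on a given platform shows how much space the fixed ordering wastes there; `offsetof` from `<stddef.h>` can be used to inspect where each field ended up.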

8 / 56

SLIDE 9

Contiguous Memory Layouts for 2-d Arrays

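  • Row-major layout: the elements of each row are stored contiguously, one row after another.
  • Column-major layout: the elements of each column are stored contiguously, one column after another.

There are more sophisticated block-recursive layouts which, combined with the right algorithms, achieve much better cache efficiency than the above.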
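The two contiguous layouts differ only in the address calculation. The sketch below computes element offsets for both; the function names are illustrative, and the last function checks the row-major formula against C's own 2-d arrays, which are stored row by row.

```c
#include <stddef.h>

/* Offset (in elements) of a[i][j] in a rows x cols array, row-major. */
size_t row_major_offset(size_t i, size_t j, size_t rows, size_t cols) {
    (void)rows;                 /* the row count is not needed */
    return i * cols + j;
}

/* Offset (in elements) of a[i][j] in a rows x cols array, column-major. */
size_t col_major_offset(size_t i, size_t j, size_t rows, size_t cols) {
    (void)cols;                 /* the column count is not needed */
    return j * rows + i;
}

/* Compares the row-major formula with where C actually places a[1][2]
   inside an int[3][4]. Returns 1 if they agree. */
int c_uses_row_major(void) {
    int a[3][4];
    size_t byte_offset = (size_t)((char *)&a[1][2] - (char *)a);
    return byte_offset / sizeof(int) == row_major_offset(1, 2, 3, 4);
}
```

For a 3 x 4 array, element (1, 2) sits at offset 6 in row-major order and at offset 7 in column-major order.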

9 / 56

SLIDE 12

Contiguous and Row-Pointer Memory Layout

Contiguous layout:

    char days[][10] = {
        "Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"
    };
    days[2][3] == 'n';

Row-pointer layout:

    char *days[] = {
        "Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"
    };
    days[2][3] == 'n';

Memory layout determines space usage and the nature and efficiency of address calculations.
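To make the difference in space usage and address calculation concrete, here is a sketch that declares both layouts side by side. The names `days_contig`, `days_rows`, and the accessor functions are chosen for this example only.

```c
#include <stddef.h>

/* Contiguous layout: 7 rows of exactly 10 chars each; short strings are
   padded with '\0'. An element address is base + i * 10 + j. */
static const char days_contig[][10] = {
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday"
};

/* Row-pointer layout: 7 pointers, each to a separately stored string.
   An element access loads the row pointer first, then adds j. */
static const char *const days_rows[] = {
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday"
};

/* Space taken by the contiguous array: 7 * 10 bytes. */
size_t contig_bytes(void) { return sizeof days_contig; }

/* Space taken by the pointer table alone (the strings come on top). */
size_t row_pointer_table_bytes(void) { return sizeof days_rows; }

char contig_at(size_t i, size_t j) { return days_contig[i][j]; }
char rows_at(size_t i, size_t j)   { return days_rows[i][j]; }
```

Both `contig_at(2, 3)` and `rows_at(2, 3)` return 'n', the fourth character of "Wednesday"; the source syntax is identical even though the generated address calculations differ.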

12 / 56

SLIDE 15

Arrays

  • Time at which array shape is bound is important. For example,
      • the shape of a static array is bound at compile time (stored in static memory)
      • the shape of a local array is bound at either compile time or elaboration time (i.e., binding creation time) (stored on the stack)
      • ...
  • Issues
      • Memory allocation
      • Bounds checks
      • Index calculations (higher-dimensional arrays)
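The three issues listed above show up together when the shape of an array is known only at run time. The following sketch uses a hypothetical descriptor type `array3d` (often called a dope vector) that records the shape, allocates storage on the heap, and performs a bounds check before the index calculation. Overflow of the size computation is ignored to keep the example short.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor for a 3-d array of doubles whose shape is
   bound at run time. */
typedef struct {
    size_t  dim[3];   /* extent of each dimension */
    double *data;     /* dim[0] * dim[1] * dim[2] elements, row-major */
} array3d;

/* Memory allocation: returns 1 on success, 0 if allocation failed. */
int array3d_init(array3d *a, size_t d0, size_t d1, size_t d2) {
    a->dim[0] = d0;
    a->dim[1] = d1;
    a->dim[2] = d2;
    a->data = calloc(d0 * d1 * d2, sizeof *a->data);
    return a->data != NULL;
}

/* Bounds check followed by the row-major index calculation.
   Returns NULL if any index is out of range. */
double *array3d_at(array3d *a, size_t i, size_t j, size_t k) {
    if (i >= a->dim[0] || j >= a->dim[1] || k >= a->dim[2])
        return NULL;
    return &a->data[(i * a->dim[1] + j) * a->dim[2] + k];
}

/* Explicit release of the storage. */
void array3d_free(array3d *a) {
    free(a->data);
    a->data = NULL;
}
```

For a 2 x 3 x 4 array, `array3d_at(&a, 1, 2, 3)` refers to element 23 of the underlying storage, while `array3d_at(&a, 2, 0, 0)` fails the bounds check and returns NULL.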

15 / 56

SLIDE 19

Arrays, Lists, and Strings

[Figure: a list]

  • Most imperative languages provide excellent built-in support for array manipulation but not for operations on lists.
  • Most functional languages provide excellent built-in support for list manipulation but not for operations on arrays.
  • Arrays are a natural way to store sequences when manipulating individual elements in place (i.e., imperatively).
  • Lists are naturally recursive and thus fit extremely well into the recursive approach taken to most problems in functional programming.
  • Strings are arrays of characters in imperative languages and lists of characters in functional languages.
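The contrast between the two styles can be sketched even within C. The example below updates an array in place with a loop and sums a list with a function whose recursion follows the list's recursive structure. The `cell` type and the function names are assumptions made for this illustration.

```c
#include <stddef.h>

/* Array: individual elements are manipulated in place. */
void double_in_place(int *a, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] *= 2;
}

/* List: a cell holds one value and the rest of the list.
   NULL stands for the empty list. */
typedef struct cell {
    int head;
    const struct cell *tail;
} cell;

/* The sum of the empty list is 0; otherwise it is the head plus the
   sum of the tail. */
int list_sum(const cell *l) {
    return l == NULL ? 0 : l->head + list_sum(l->tail);
}
```

A functional language would express `list_sum` almost verbatim with pattern matching, whereas `double_in_place` relies on mutation and has no equally direct counterpart there.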

19 / 56

SLIDE 22

Pointers

  • Point to memory locations that store data (often of a specified type, e.g., int*)
  • Are not required in languages with reference model of variables (Lisp, ML, CLU, Java)
  • Are required for recursive types in languages with value model of variables (C, Pascal, Ada)

Storage reclamation

  • Explicit (manual)
  • Automatic (garbage collection)

Advantages and disadvantages of explicit reclamation

  + Avoids the run-time overhead that garbage collection can incur
  − Potential for memory leaks
  − Potential for dangling pointers and segmentation faults
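A binary tree is a typical recursive type. Under C's value model a node cannot contain another node, so the children have to be pointers. The sketch below, with invented names `node`, `node_new`, and `tree_free`, also shows explicit reclamation: every node obtained from `malloc` must eventually be passed to `free` exactly once.

```c
#include <stdlib.h>

/* Recursive type under the value model: the children are pointers. */
typedef struct node {
    int key;
    struct node *left;
    struct node *right;
} node;

/* Allocates one node; returns NULL if malloc fails. */
node *node_new(int key, node *left, node *right) {
    node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->key = key;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Explicit reclamation: frees the children before the node itself and
   returns the number of nodes released. */
size_t tree_free(node *n) {
    if (n == NULL)
        return 0;
    size_t freed = tree_free(n->left) + tree_free(n->right);
    free(n);
    return freed + 1;
}
```

Forgetting the call to `tree_free` leaks the whole tree, and using any node after the call is a dangling-pointer access; these are the two risks listed above.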

22 / 56

SLIDE 25

Pointer Allocation and Deallocation

C

  • p = (element *)malloc(sizeof(element))
  • free(p)
  • Explicit deallocation

Pascal

  • new(p)
  • dispose(p)
  • Explicit deallocation

Java/C++

  • p = new element() (semantics differ between Java and C++; how?)

  • delete p (in C++)
  • Explicit deallocation in C++, garbage collection in Java

25 / 56

SLIDE 28

Dangling References

  • A dangling reference is a pointer to an already reclaimed object.

Dangling references are notoriously hard to debug and a major source of program misbehaviour and security holes.

  • Techniques to catch them:
      • Tombstones
      • Locks and keys
28 / 56

SLIDE 35

Tombstones

    new(p)
    q := p
    delete(p)

Issues:

  • Space overhead
  • Runtime overhead
  • Easy to change location of object in heap (just change tombstone address)
  • Invalid tombstones (RIP = null pointer)
  • Deallocate the tombstones (need reference count or other garbage collection strategy)
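The mechanism behind these issues can be sketched in a few lines of C. This is an illustration under stated assumptions, not how any particular compiler implements tombstones: the types `tombstone` and `checked_ptr` and the three functions are invented here.

```c
#include <stdlib.h>

/* Hypothetical tombstone: the only place that stores the object's
   address. NULL means the object has been reclaimed ("RIP"). */
typedef struct {
    void *object;
} tombstone;

/* A checked pointer refers to the tombstone, never to the object. */
typedef tombstone *checked_ptr;

/* new(p): allocate the object together with its tombstone. */
checked_ptr checked_new(size_t size) {
    tombstone *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->object = malloc(size);
    if (t->object == NULL) {
        free(t);
        return NULL;
    }
    return t;
}

/* delete(p): reclaim the object but keep the tombstone, now invalid. */
void checked_delete(checked_ptr p) {
    free(p->object);
    p->object = NULL;
}

/* Dereference: costs one extra indirection. Returns NULL when the
   reference is dangling, so the caller can detect the error. */
void *checked_deref(checked_ptr p) {
    return p->object;
}
```

After `q = p` and `checked_delete(p)`, `checked_deref(q)` returns NULL instead of a stale address. The tombstone itself is deliberately not freed by `checked_delete`, which is exactly the last issue in the list: deciding when it can go requires a reference count or some other collection strategy.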

35 / 56

SLIDE 40

Locks and Keys

    new(p)
    q := p
    delete(p)

  • Pointer = address + key; object has lock
  • When reclaiming object, change the lock
  • Tombstones vs. locks/keys: Efficiency comparison (space overhead or run-time overhead) unclear

Most compilers do not by default generate code to check for dangling references. Most Pascal compilers allow the programmer to request dynamic checks, which are usually implemented with locks and keys.
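A minimal sketch of locks and keys follows. To keep every memory access well defined in C, it allocates from a small fixed pool instead of the real heap, so that a lock can still be read after its object has been reclaimed. The pool, the types `slot` and `keyed_ptr`, and the function names are all assumptions of this example.

```c
#include <stddef.h>

#define POOL_SIZE 8

/* Heap object with a lock in front of the payload. Lock 0 = free slot. */
typedef struct {
    unsigned long lock;
    int value;
} slot;

/* Pointer = address + key. */
typedef struct {
    slot *addr;
    unsigned long key;
} keyed_ptr;

static slot pool[POOL_SIZE];
static unsigned long next_key = 1;   /* 0 is reserved for "reclaimed" */

/* new(p): take a free slot, give it a fresh lock, copy it into the key.
   Returns a pointer with addr == NULL if the pool is full. */
keyed_ptr keyed_new(int value) {
    keyed_ptr p = { NULL, 0 };
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].lock == 0) {
            pool[i].lock = next_key++;
            pool[i].value = value;
            p.addr = &pool[i];
            p.key = pool[i].lock;
            break;
        }
    }
    return p;
}

/* delete(p): change the lock so that no outstanding key matches it. */
void keyed_delete(keyed_ptr p) {
    if (p.addr != NULL && p.addr->lock == p.key)
        p.addr->lock = 0;
}

/* Dereference: valid only while key and lock agree; NULL otherwise. */
int *keyed_deref(keyed_ptr p) {
    if (p.addr == NULL || p.addr->lock != p.key)
        return NULL;
    return &p.addr->value;
}
```

A copy `q` of a deleted pointer keeps failing the check even after the slot has been reused, because the new object receives a different lock. Each pointer grows by one key and each object by one lock, which is the space overhead to compare against tombstones.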

40 / 56

SLIDE 42

Garbage Collection

Automatic reclamation of space/objects

  • Essential for functional languages
  • Popular in imperative languages (CLU, Ada, Modula-3, Java)
  • Difficult to implement
  • Slower than manual reclamation

Garbage collection methods

  • Reference counts
  • Mark and sweep
  • Mark and sweep variants
  • Stop and copy
  • Generational technique

42 / 56

SLIDE 51

Reference Counts

    a = new Obj();
    b = new Obj();
    b = a;
    a = null;
    b = null;

  • Associate reference count with each object.
  • Set to 1 when object is allocated.
  • Adjust the count
      • when one pointer is assigned to another.
      • on subroutine return.

Pros/cons

  + Fairly simple to implement
  + Fairly low cost
  − Does not work when there are circular references.
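The five statements of the running example can be replayed in C with explicit count adjustments. In the sketch below, `obj`, `obj_new`, `assign`, and the `live_objects` counter are invented for illustration; an object is freed as soon as its count drops to 0.

```c
#include <stdlib.h>

typedef struct {
    int refcount;
    int payload;
} obj;

static int live_objects = 0;   /* bookkeeping for this example only */

/* a = new Obj(): the count starts at 1 for the receiving pointer. */
obj *obj_new(void) {
    obj *o = malloc(sizeof *o);
    if (o != NULL) {
        o->refcount = 1;
        o->payload = 0;
        live_objects++;
    }
    return o;
}

/* Drops one reference; reclaims the object when none are left. */
static void release(obj *o) {
    if (o != NULL && --o->refcount == 0) {
        free(o);
        live_objects--;
    }
}

/* *dst = src with count adjustment. The new target is incremented
   before the old one is released, so self-assignment is safe. */
void assign(obj **dst, obj *src) {
    if (src != NULL)
        src->refcount++;
    release(*dst);
    *dst = src;
}

int live_count(void) { return live_objects; }
```

Replaying the example: after `assign(&b, a)` the object first held by `b` is reclaimed and the remaining object has count 2; `assign(&a, NULL)` lowers it to 1; `assign(&b, NULL)` lowers it to 0 and reclaims it. Two objects that point at each other would never reach 0, which is the circular-reference weakness.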

51 / 56

SLIDE 52

Garbage

[Figure: four heap objects (object1 to object4). The roots are a local variable and a static variable. The three objects reachable from a root are live; the remaining object is garbage.]

52 / 56

SLIDE 55

Mark and Sweep

    for each root variable r
        mark (r);
    sweep ();

    void mark (Object p)
        if (!p.marked)
            p.marked = true;
            for each Object q referenced by p
                mark (q);

    void sweep ()
        for each Object p in the heap
            if (p.marked)
                p.marked = false
            else
                heap.release (p);
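The pseudocode can be turned into a small runnable C sketch. It assumes a toy heap of fixed size in which every object has at most two outgoing references; `gc_alloc`, `gc_collect`, and the `object` layout are choices made for this example, not part of the slides.

```c
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8
#define MAX_REFS  2

typedef struct object {
    bool in_use;                    /* slot currently allocated? */
    bool marked;                    /* set during the mark phase */
    struct object *refs[MAX_REFS];  /* outgoing references, may be NULL */
} object;

static object heap[HEAP_SIZE];

/* Allocates a cleared object from the toy heap; NULL if it is full. */
object *gc_alloc(void) {
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use) {
            heap[i] = (object){ .in_use = true };
            return &heap[i];
        }
    }
    return NULL;
}

/* Marks p and, recursively, everything reachable from p. */
static void mark(object *p) {
    if (p == NULL || p->marked)
        return;
    p->marked = true;
    for (size_t i = 0; i < MAX_REFS; i++)
        mark(p->refs[i]);
}

/* Releases every unmarked object and clears the marks of the rest.
   Returns the number of objects released. */
static size_t sweep(void) {
    size_t released = 0;
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use)
            continue;
        if (heap[i].marked) {
            heap[i].marked = false;
        } else {
            heap[i].in_use = false;
            released++;
        }
    }
    return released;
}

/* One full collection: mark from every root, then sweep. */
size_t gc_collect(object **roots, size_t nroots) {
    for (size_t i = 0; i < nroots; i++)
        mark(roots[i]);
    return sweep();
}
```

If objects `c` and `d` reference each other but neither is reachable from a root, `gc_collect` releases both, unlike reference counting. The recursion in `mark` can become as deep as the longest chain of references.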

55 / 56

SLIDE 56

Mark and Sweep

Pros/cons:

  − More complicated to implement
  − Requires inspection of all allocated blocks in a sweep: costly.
  − High space usage if the recursion is deep.
  − Requires type descriptor at the beginning of each block to know the size of the block and to find the pointers in the block.
  + Works with circular data structures.

56 / 56