slide-1
SLIDE 1

[Faculty of Science Information and Computing Sciences] 1

Concepts of programming languages

OCaml - low-level

Daan Knoope, Bas van Rooij, Jorrit Dorrestijn, Ivor van der Hoog, Wouter ten Bosch

slide-2
SLIDE 2

Content

▶ Introduction OCaml
▶ Interfacing
▶ Memory
▶ Garbage collection

slide-3
SLIDE 3

OCaml

▶ OCaml is an object-oriented, functional and imperative language.
▶ OCaml uses type inference.

slide-4
SLIDE 4

Operators

Let's start with a simple addition:

    1 + 2;;
    - : int = 3

    1.0 + 2.0;;
    File "", line 1, characters 0-3:
    Error: This expression has type float but an expression was expected of type int

    1.0 +. 2.0;;
    - : float = 3.
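
Because integer and float operators are separate, mixed arithmetic needs an explicit conversion. A minimal sketch using the standard float_of_int and int_of_float functions; the helper name average is made up for illustration:

```ocaml
(* Integer and float arithmetic use different operators: + versus +. *)
let average (a : int) (b : int) : float =
  (* OCaml never converts int to float implicitly, so convert by hand. *)
  (float_of_int a +. float_of_int b) /. 2.0

let () =
  Printf.printf "%.1f\n" (average 1 2);     (* prints 1.5 *)
  Printf.printf "%d\n" (int_of_float 3.7)   (* truncates towards zero: prints 3 *)
```
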
slide-7
SLIDE 7

Expressions

The let keyword is used to define named expressions.

    let x = 5;;
    val x : int = 5

    let add x y = x + y;;
    val add : int -> int -> int = <fun>

    add 1 2;;
    - : int = 3

slide-10
SLIDE 10

Variables

Bindings in OCaml are immutable by default. If you want a value that can change, declare a reference.

    let y = ref 5;;
    val y : int ref = {contents = 5}

    y := 4; !y;;
    - : int = 4
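
A reference is typically used for small pieces of mutable state. The following is only a sketch: the names counter and next are hypothetical, while ref, !, := and incr come from the standard library.

```ocaml
(* A reference is a record with a single mutable field, contents. *)
let counter = ref 0

(* incr is the standard shorthand for counter := !counter + 1. *)
let next () =
  incr counter;
  !counter

let () =
  let a = next () in
  let b = next () in
  Printf.printf "%d %d %d\n" a b !counter   (* prints 1 2 2 *)
```
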
slide-12
SLIDE 12

Polymorphic functions

    let justatwo x = 2;;
    val justatwo : 'a -> int = <fun>

'a is a type variable: the argument can be of any type.

    justatwo 3;;
    - : int = 2

    justatwo "Foo";;
    - : int = 2

slide-14
SLIDE 14

Classes

    class stack = object (self)
      ...
    end;;

slide-15
SLIDE 15

Classes

    class stack = object (self)
      val mutable list = ([] : int list)
      ...
    end;;

slide-16
SLIDE 16

Classes

    class stack = object (self)
      val mutable list = ([] : int list)
      method size = List.length list
      ...
    end;;

slide-17
SLIDE 17

Classes

    class stack = object (self)
      val mutable list = ([] : int list)
      method size = List.length list
      method push x = list <- x :: list
      method pop =
        let result = List.hd list in
        list <- List.tl list;
        result
    end;;

slide-18
SLIDE 18

Classes

Objects are created from a class using the ‘new’ keyword.

    let s = new stack;;
    val s : stack = <obj>

    s#push 1; s#push 2; s#push 3; s#size;;
    - : int = 3

slide-20
SLIDE 20

Classes

    class ['a] stack = object (self)
      val mutable list = ([] : 'a list)
      method size = List.length list
      method push x = list <- x :: list
      method pop =
        let result = List.hd list in
        list <- List.tl list;
        result
    end;;

slide-21
SLIDE 21

Classes

    let s = new stack;;
    val s : '_a stack = <obj>

    s#push 1.0; s;;
    - : float stack = <obj>

slide-22
SLIDE 22

Classes

    let clear_stack s = while s#size > 0 do s#pop done;;
    val clear_stack : < pop : 'a; size : int; .. > -> unit = <fun>

slide-24
SLIDE 24

Tuples and Records

▶ Tuples

    let cityPopulation = ("Utrecht", 312634);;
    val cityPopulation : string * int = ("Utrecht", 312634)

▶ Records: tuples with named elements

    type coordinates = {long : float; lat : float};;
    let uithof = {long = 5.2; lat = 52.1};;
    val uithof : coordinates = {long = 5.2; lat = 52.1}

slide-26
SLIDE 26

Variants

▶ Variants as enums:

    type building = BBG | KBG | EDUC;;

▶ Variants as qualified unions:

    type location =
      | Building of building
      | Coordinates of coordinates
      | City of string;;

slide-27
SLIDE 27

Polymorphic (parameterized) variants

Creating a polymorphic binary tree:

    type 'a binary_tree =
      | Leaf of 'a
      | Tree of 'a binary_tree * 'a binary_tree;;

    Tree (Leaf 4, Leaf 5);;

slide-28
SLIDE 28

Pattern matching

Turning a tree into a list:

    let rec asList tree =
      match tree with
      | Leaf l -> [l]
      | Tree (a, b) -> asList a @ asList b;;
    val asList : 'a binary_tree -> 'a list = <fun>
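
Other traversals follow the same shape, with one case per constructor. A small sketch that repeats the tree type so it stands alone; count_leaves and height are illustrative names, not part of the slides:

```ocaml
type 'a binary_tree =
  | Leaf of 'a
  | Tree of 'a binary_tree * 'a binary_tree

(* Number of leaves: one case per constructor. *)
let rec count_leaves = function
  | Leaf _ -> 1
  | Tree (l, r) -> count_leaves l + count_leaves r

(* Height of the tree, counting a single leaf as height 0. *)
let rec height = function
  | Leaf _ -> 0
  | Tree (l, r) -> 1 + max (height l) (height r)

let () =
  let t = Tree (Tree (Leaf 1, Leaf 2), Leaf 3) in
  (* prints: 3 leaves, height 2 *)
  Printf.printf "%d leaves, height %d\n" (count_leaves t) (height t)
```

The compiler checks that such matches are exhaustive, so a forgotten constructor is reported as a warning at compile time.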

slide-29
SLIDE 29

Partial application

Partial application is used to create more specific functions based on a more general function:

    let add x y = x + y;;
    val add : int -> int -> int = <fun>

    let increment = add 1;;
    val increment : int -> int = <fun>
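
Partial application is most useful together with higher-order functions. A short sketch, reusing add and increment from the slide and the standard List.map:

```ocaml
let add x y = x + y

(* Supplying only the first argument yields a function waiting for the second. *)
let increment = add 1

let () =
  assert (increment 41 = 42);
  (* add 10 is built on the spot and applied to every element. *)
  let bumped = List.map (add 10) [1; 2; 3] in
  List.iter (Printf.printf "%d ") bumped;   (* prints 11 12 13 *)
  print_newline ()
```
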

slide-31
SLIDE 31

Partial application - issues

This is very concise code, but

▶ in certain cases it might be much slower [1]
▶ it might lead to a loss of polymorphism

[1] Jane Street: The dangers of being too partial

slide-33
SLIDE 33

Problem: polymorphism and mutability

▶ OCaml supports both polymorphism and mutability.
▶ This causes trouble. For example writing to a device:

    write : 'a -> unit
    read : unit -> 'a

▶ This is too permissive: if we have written a string first, we cannot read an int later.
▶ Solution: weak polymorphism, i.e. polymorphic until the first use.

    write : '_a -> unit
    read : unit -> '_a
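
The write/read pair can be imitated with a reference cell. This is only a sketch: cell, write and read are hypothetical names standing in for the device, and newer compilers print the weak type variable as '_weak1 rather than '_a.

```ocaml
(* ref None is a function application, so its type is only weakly
   polymorphic: '_a option ref. *)
let cell = ref None

let write x = cell := Some x
let read () = !cell

let () =
  write "hello";   (* the first use fixes the type to string *)
  (match read () with
   | Some s -> print_endline s
   | None -> print_endline "empty")
  (* write 42 would now be rejected: cell is a string option ref *)
```
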

slide-35
SLIDE 35

Value restriction

▶ OCaml needs to decide when to make something either strongly or weakly polymorphic
▶ Decision: only fully evaluated or immutable expressions can be strongly polymorphic
▶ Functions are strongly polymorphic
▶ Applications of functions are not strongly polymorphic: they can be evaluated further
▶ Partial applications are also weakly polymorphic

slide-36
SLIDE 36

Value restriction on partial application

▶ This might be an issue:

    let id x = x;;
    let mapId = List.map id;;
    val mapId : '_a list -> '_a list = <fun>

    mapId [1;2;3];;
    mapId;;
    - : int list -> int list = <fun>

    mapId ["This won't work"; "anymore"];;
    Error: This expression has type string but an expression was expected of type int

slide-42
SLIDE 42

Solving value restriction on partial application

▶ To solve this, use eta-expansion to turn it into a function definition, which counts as a fully evaluated expression:

    let mapId = List.map id;;
    val mapId : '_a list -> '_a list = <fun>

    let mapId x = List.map id x;;
    val mapId : 'a list -> 'a list = <fun>

▶ Eta-expansion solves both the loss of polymorphism and the possible speed degradation…
▶ … but it is also less pretty.
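
The difference between the two definitions can be checked side by side. A minimal sketch; the names map_id_weak and map_id are chosen here only to keep both versions in one file:

```ocaml
let id x = x

(* Partial application: the result is only weakly polymorphic. *)
let map_id_weak = List.map id

(* Eta-expanded: a function definition, so fully polymorphic. *)
let map_id xs = List.map id xs

let () =
  (* This first use fixes the weak version to int list -> int list. *)
  assert (map_id_weak [1; 2; 3] = [1; 2; 3]);
  (* The eta-expanded version keeps working at several types. *)
  assert (map_id [1; 2; 3] = [1; 2; 3]);
  assert (map_id ["still"; "works"] = ["still"; "works"])
```

Applying map_id_weak to a string list after the first assertion would be rejected by the type checker, as on the previous slide.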

slide-43
SLIDE 43

Foreign Function Interface

At some point you might want to interact with non-OCaml code.

▶ Use libraries that cannot be written in the language
▶ Use high-performance libraries already implemented in another language

OCaml uses C as an interoperability language:

▶ It is a standardized language (ISO C)
▶ It is a popular language for operating systems
▶ A great many libraries are written in C
▶ Most programming languages offer a C interface

slide-44
SLIDE 44

Foreign Function Interface

There are two ways to interact with C code:

▶ Calling C functions from OCaml
▶ Calling OCaml functions from C

There are quite a lot of differences between C and OCaml: types, currying, memory allocation, pointers, …

slide-45
SLIDE 45

Example: Using <stdlib.h> rand()

    void srand(unsigned int seed);
    int rand(void);

Print 5 random numbers from 0 to 49 in C:

    srand((unsigned) 123);
    for (int i = 0; i < 5; i++) {
        printf("%d\n", rand() % 50);
    }

slide-46
SLIDE 46

Example: Using <stdlib.h> rand()

    void srand(unsigned int seed);
    int rand(void);

C's int, unsigned int and void types are different from the types used in OCaml: the types need to be mapped.

    (* open Ctypes *)
    val void : unit typ
    val int : int typ
    val uint : Unsigned.uint typ

slide-47
SLIDE 47

Example: Using <stdlib.h> rand()

    open Ctypes
    open Foreign

    let srand = foreign "srand" (uint @-> returning void)
    let rand = foreign "rand" (void @-> returning int)

Print 5 random numbers from 0 to 49 in OCaml:

    let () =
      (* srand expects an unsigned int, so convert the seed *)
      srand (Unsigned.UInt.of_int 123);
      for i = 1 to 5 do
        Printf.printf "%d\n" (rand () mod 50)
      done

slide-48
SLIDE 48

Example: Using Pointers

The time library of C uses pointers:

    time_t time(time_t *);
    double difftime(time_t, time_t);
    char *ctime(const time_t *timep);

These pointer types need to be translated to OCaml:

    type time_t = unit ptr
    let time_t : time_t typ = ptr void
    let time = foreign "time" (ptr time_t @-> returning time_t)

slide-50
SLIDE 50

Example: Using Pointers

Use from_voidp to create a null pointer:

    time (from_voidp time_t null);;
    (* val cur_time : time_t = <abstr> *)

    let time' () = time (from_voidp time_t null);;
    (* val time' : unit -> time_t = <fun> *)

    let t1 = time' () in
    Unix.sleep 2;
    let t2 = time' () in
    difftime t2 t1;;

slide-52
SLIDE 52

Example: Using Pointers

    char *ctime(const time_t *timep);

    let ctime = foreign "ctime" (ptr time_t @-> returning string);;

    ctime (time' ());;
    Error: This expression has type time_t but an expression was expected of type time_t ptr

Solution: ‘manually’ allocate a time_t to create a pointer:

    let t_ptr = allocate time_t (time' ());;
    ctime t_ptr;;
    (* string = "Tue Nov 5 08:51:55 2013\n" *)

slide-54
SLIDE 54

Foreign: @-> and returning

To bind a C function in OCaml you need to use the foreign function:

    foreign "srand" (uint @-> returning void)

    foreign : string -> ('a -> 'b) Ctypes.fn -> 'a -> 'b

▶ Functions are first-class citizens in OCaml, not in C
▶ OCaml functions are defined in a curried style

slide-55
SLIDE 55

Foreign: @-> and returning

In OCaml these two types are the same:

    val curried : int -> int -> int
    val curried : int -> (int -> int)

In C they are different functions:

    int uncurried_C(int, int);
    uncurried_C(3, 4);              (* int -> int -> int *)

    typedef int (function_t)(int);
    function_t *curried_C(int);
    curried_C(3)(4);                (* int -> (int -> int) *)

returning marks where the C argument list ends:

    int @-> int @-> returning int
    int @-> returning (int @-> returning int)
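
On the OCaml side the two calling styles can be compared directly. A sketch in plain OCaml (no Ctypes involved); curry and uncurry are helper names introduced here for illustration:

```ocaml
(* Curried: takes its arguments one at a time. *)
let curried (x : int) (y : int) : int = x + y

(* Uncurried: takes a single pair, much like a C argument list. *)
let uncurried ((x, y) : int * int) : int = x + y

(* Conversions between the two styles. *)
let curry f x y = f (x, y)
let uncurry f (x, y) = f x y

let () =
  assert (curried 3 4 = 7);
  assert ((curried 3) 4 = 7);   (* the same call, parenthesised explicitly *)
  assert (uncurried (3, 4) = 7);
  assert (curry uncurried 3 4 = uncurry curried (3, 4))
```
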

slide-56
SLIDE 56

Memory representation

Most data types are boxed: stored on the heap in a block. The layout of a block is as follows:

    +------+-------+----------+----------+----------+----
    | size | color | tag byte | value[0] | value[1] | ...
    +------+-------+----------+----------+----------+----

▶ Size gives the size of the block in words
▶ Color is used by the GC algorithm
▶ The tag gives the type of the stored data
▶ The remaining words are the actual values, which could be pointers to other blocks

slide-57
SLIDE 57

Memory representation

There are tags for most data types; some have the same tag:

▶ strings
▶ float (double-precision)
▶ closures
▶ tuples, records, arrays
▶ …
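
The tags can be inspected with the standard Obj module, which exposes the runtime representation. A sketch for exploration only (Obj is not meant for application code); tag_of, size_of and point are names made up here:

```ocaml
(* Read the header fields of the block behind a value. *)
let tag_of (v : 'a) : int = Obj.tag (Obj.repr v)
let size_of (v : 'a) : int = Obj.size (Obj.repr v)

type point = { x : int; y : int }

let () =
  (* Strings, floats and closures each have their own tag. *)
  assert (tag_of "hello" = Obj.string_tag);
  assert (tag_of 3.14 = Obj.double_tag);
  assert (tag_of (fun x -> x + 1) = Obj.closure_tag);
  (* Tuples, records and (non-float) arrays all share tag 0. *)
  assert (tag_of (1, 2) = 0);
  assert (tag_of { x = 1; y = 2 } = 0);
  assert (tag_of [| 1; 2; 3 |] = 0);
  (* The size field counts the words that follow the header. *)
  assert (size_of (1, 2, 3) = 3)
```
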

slide-58
SLIDE 58

Unboxed

Two main types are stored unboxed: integers and pointers.

▶ Each integer/pointer is stored in a word (64-bit or 32-bit)
▶ The lowest bit is 1 for integers and 0 for pointers
▶ Integers can effectively store only 63 (or 31) bits
▶ Pointers are word aligned so the missing bit does not matter

slide-59
SLIDE 59

Unboxed

A number of data types are actually stored as integers:

▶ chars
▶ bools
▶ variants without parameters
▶ unit
▶ the empty list
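
Obj.is_int reports whether a value is such an immediate integer. A sketch assuming the native or bytecode compiler (other backends may represent integers differently); is_immediate is a hypothetical helper name:

```ocaml
type building = BBG | KBG | EDUC

(* True when the value is a tagged integer rather than a pointer. *)
let is_immediate (v : 'a) : bool = Obj.is_int (Obj.repr v)

let () =
  assert (is_immediate 42);
  assert (is_immediate 'c');
  assert (is_immediate true);
  assert (is_immediate KBG);
  assert (is_immediate ());
  assert (is_immediate []);
  (* Constant constructors are numbered from 0 in declaration order. *)
  assert ((Obj.magic KBG : int) = 1);
  (* Boxed values are pointers to heap blocks. *)
  assert (not (is_immediate "string"));
  assert (not (is_immediate [1]));
  (* One bit goes to the tag: 63-bit ints on 64-bit systems, 31 on 32-bit. *)
  assert (Sys.int_size = Sys.word_size - 1)
```
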

slide-60
SLIDE 60

Unboxed trade-offs

There are two basic advantages to using unboxed data types:

▶ No more pointer to follow
▶ Smaller memory footprint

However, the disadvantage is that some extra arithmetic is required to manipulate the data. For example, to compute a+b:

    let c = (((a lsr 1) + (b lsr 1)) lsl 1) + 1;;
    (* c = (((a>>1) + (b>>1))<<1) | 1 *)

slide-61
SLIDE 61

Unboxed trade-offs

Actually this can be improved. Instead of:

    let c = (((a lsr 1) + (b lsr 1)) lsl 1) + 1;;
    (* c = (((a>>1) + (b>>1))<<1) | 1 *)

we can write

    let c = a + b - 1;;
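
The shortcut works because (2x + 1) + (2y + 1) - 1 = 2(x + y) + 1. A sketch that simulates the encoding with ordinary OCaml ints to check both versions agree; tag, untag, add_slow and add_fast are illustrative names, and untag uses the arithmetic shift asr so that negative numbers survive the round trip:

```ocaml
(* Simulate the runtime's encoding of an integer n as the word 2n + 1. *)
let tag (n : int) : int = (n lsl 1) lor 1
let untag (w : int) : int = w asr 1

(* Addition the long way: untag, add, retag. *)
let add_slow a b = tag (untag a + untag b)

(* Addition the short way: (2x + 1) + (2y + 1) - 1 = 2(x + y) + 1. *)
let add_fast a b = a + b - 1

let () =
  List.iter
    (fun (x, y) ->
      let a = tag x and b = tag y in
      assert (add_slow a b = add_fast a b);
      assert (untag (add_fast a b) = x + y))
    [ (1, 2); (0, 0); (-5, 3); (1000, -1000); (123456, 654321) ]
```
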

slide-62
SLIDE 62

Unboxed trade-offs

Why aren't all data types unboxed?

▶ Some data types have variable length
▶ The GC uses header information
▶ Some data types store other data types
▶ Floats specifically cannot be stored easily in 63 bits
▶ At least one attempt has been made
▶ Polymorphic functions can use the tag

slide-63
SLIDE 63

Float arrays

A solution for float arrays:

▶ Normally floats are boxed
▶ However within arrays floats are stored directly
▶ A drawback is that this requires many special cases in the compiler code

slide-64
SLIDE 64

Garbage Collection

Lots of C programmers don't like automated garbage collection…

▶ free() is a very expensive operation.
▶ Garbage collection can compact the heap and move memory.
▶ But if you do it, you must do it well.

slide-65
SLIDE 65

Garbage Collection

Generational GC.

▶ OCaml has a functional coding style.
▶ Most created values are discarded almost immediately.
▶ Others tend to live for a very long time.
▶ The young values land on the minor heap.
▶ Older ones on the major heap.
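
The generational behaviour can be observed with the standard Gc module. A sketch only: churn is a made-up workload, and the exact numbers depend on the OCaml version and heap settings, so no output is shown.

```ocaml
(* Allocate many short-lived lists; only the last one survives. *)
let churn (rounds : int) : int list =
  let last = ref [] in
  for i = 1 to rounds do
    last := List.init 100 (fun j -> i + j)
  done;
  !last

let () =
  let before = Gc.quick_stat () in
  let survivor = churn 100_000 in
  let after = Gc.quick_stat () in
  Printf.printf "minor collections: %d\n"
    (after.Gc.minor_collections - before.Gc.minor_collections);
  Printf.printf "words allocated on the minor heap: %.0f\n"
    (after.Gc.minor_words -. before.Gc.minor_words);
  Printf.printf "words promoted to the major heap:  %.0f\n"
    (after.Gc.promoted_words -. before.Gc.promoted_words);
  assert (List.length survivor = 100)
```

The promoted count is expected to be far smaller than the allocated count, since almost every list is dead by the next minor collection.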

slide-66
SLIDE 66

Allocating memory

▶ The minor heap is a block of virtual memory a few MB long, claimed with C's malloc function.
▶ Its range is defined by two pointers, start and end.
▶ start is aligned to the nearest word boundary (32/64 bit) after base.
▶ Two other pointers manage memory: limit and ptr.

              <----------- size ----------->
    base ... start                         end
             limit            ptr <--------

▶ ptr crosses limit? Stop the world.

slide-67
SLIDE 67

Allocating memory

▶ Next-fit allocation (default).
▶ First-fit allocation (possible option).

slide-68
SLIDE 68

Mark and Sweep

▶ Marking “stops the world”.
▶ The work is divided into slices (size determined by the runtime, can be set manually).
▶ Block only a slice at a time.

slide-69
SLIDE 69

Mark and Sweep

Each value has a header with a 2-bit color.

▶ White = not reached yet.
▶ Blue = on the free list and not in use.
▶ Gray = reachable but not fully scanned.
▶ Black = reachable and fully scanned.
▶ Take the always-reachable values (application stack) as roots and do a depth-first search (using a stack). White -> Gray -> Black.
▶ Remaining white values can be removed.
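
The marking phase can be sketched on a toy heap. This is a simulation of the tri-color idea, not the runtime's actual code; the edges graph, roots and mark are invented for illustration, and the blue color is left out because there is no free list here.

```ocaml
type color = White | Gray | Black

(* A toy heap: block i points to the blocks listed in edges.(i).
   Blocks 4 and 5 point to each other but are not reachable from the root. *)
let edges = [| [1; 2]; [3]; [3]; []; [5]; [4] |]
let roots = [0]

let mark () : color array =
  let colors = Array.make (Array.length edges) White in
  let stack = Stack.create () in
  (* Shade a white block gray and remember to scan it later. *)
  let shade i =
    if colors.(i) = White then begin
      colors.(i) <- Gray;
      Stack.push i stack
    end
  in
  List.iter shade roots;
  (* Depth-first search: scan gray blocks until none are left. *)
  while not (Stack.is_empty stack) do
    let i = Stack.pop stack in
    List.iter shade edges.(i);
    colors.(i) <- Black
  done;
  colors

let () =
  Array.iteri
    (fun i c ->
      Printf.printf "block %d: %s\n" i
        (match c with
         | Black -> "reachable"
         | White -> "garbage"
         | Gray -> "still gray (should not happen)"))
    (mark ())
```

Blocks 0 to 3 end up black, while the unreachable cycle of blocks 4 and 5 stays white and would be reclaimed by the sweep.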

slide-70
SLIDE 70

Mark and Sweep

▶ The search stack has finite size, so we limit the number of gray values.
▶ Crossing that limit marks the heap as impure.
▶ Impure -> process all existing gray values as normal, in order of memory address.
▶ This turns gray values into black values.
▶ Repeat until no more gray values are found.
▶ Sweep.
▶ Very inefficient, but I assume it banks on next-fit allocation.