[Faculty of Science Information and Computing Sciences]
Concepts of programming languages
OCaml - low-level
Daan Knoope, Bas van Rooij, Jorrit Dorrestijn, Ivor van der Hoog, Wouter ten Bosch
Content
▶ Introduction to OCaml
▶ Interfacing
▶ Memory
▶ Garbage collection
OCaml
▶ OCaml is an object-oriented, functional and imperative language.
▶ OCaml uses type inference.
Operators
Let's start with a simple addition:
1 + 2;;
- : int = 3
1.0 + 2.0;;
File "", line 1, characters 0-3:
Error: This expression has type float but an expression was expected of type int
1.0 +. 2.0;;
- : float = 3.
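The error above is a consequence of OCaml never converting between int and float implicitly. A minimal sketch of mixing the two by converting explicitly; the function name average is only an illustration and does not come from the slides:

```ocaml
(* int and float never mix implicitly: convert first, then use +. and /. *)
let average (a : int) (b : int) : float =
  (float_of_int a +. float_of_int b) /. 2.0

let () = Printf.printf "%.1f\n" (average 1 2)
```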
Expressions
The let keyword is used to define a named expression.
let x = 5;;
val x : int = 5
let add x y = x + y;;
val add : int -> int -> int = <fun>
add 1 2;;
- : int = 3
Variables
Values bound with let are immutable. If you want a variable that can change, declare a reference.
let y = ref 5;;
val y : int ref = {contents = 5}
y := 4; !y;;
- : int = 4
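As a small illustration of references in use, here is a sketch of a counter. The names counter and next are hypothetical; incr is the standard-library shorthand for r := !r + 1.

```ocaml
let counter = ref 0

(* Each call updates the reference in place and returns the new value *)
let next () =
  incr counter;
  !counter

let () =
  let a = next () in
  let b = next () in
  Printf.printf "%d %d\n" a b
```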
Polymorphic functions
let justatwo x = 2;;
val justatwo : 'a -> int = <fun>
'a is a type variable: the argument can be of any type.
justatwo 3;;
- : int = 2
justatwo "Foo";;
- : int = 2
Classes
class stack = object (self)
  val mutable list = ([] : int list)
  method size = List.length list
  method push x = list <- x :: list
  method pop =
    let result = List.hd list in
    list <- List.tl list;
    result
end;;
Classes
Classes are instantiated using the new keyword.
let s = new stack;;
val s : stack = <obj>
s#push 1; s#push 2; s#push 3; s#size;;
- : int = 3
Classes
class ['a] stack = object (self)
  val mutable list = ([] : 'a list)
  method size = List.length list
  method push x = list <- x :: list
  method pop =
    let result = List.hd list in
    list <- List.tl list;
    result
end;;
Classes
let s = new stack;;
val s : '_a stack = <obj>
s#push 1.0; s;;
- : float stack = <obj>
Classes
let clear_stack s =
  while s#size > 0 do
    s#pop
  done;;
val clear_stack : < pop : 'a; size : int; .. > -> unit = <fun>
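The inferred type of clear_stack mentions only the methods it calls, so any object with size and pop is accepted. The sketch below tries that out under a few assumptions: the stack class is restated with a pop that fails explicitly on an empty stack, the result of pop is discarded with ignore, and countdown is a hypothetical object that is not a stack at all.

```ocaml
(* A parameterised stack like the one on the slides, with an explicit empty case *)
class ['a] stack = object
  val mutable items : 'a list = []
  method size = List.length items
  method push (x : 'a) = items <- x :: items
  method pop =
    match items with
    | [] -> failwith "pop: empty stack"
    | x :: rest -> items <- rest; x
end

(* Accepts any object that has size and pop methods *)
let clear_stack s =
  while s#size > 0 do
    ignore (s#pop)
  done

(* Hypothetical object: not a stack, but it has the right methods *)
let countdown = object
  val mutable n = 3
  method size = n
  method pop = n <- n - 1
end

let () =
  let s = new stack in
  s#push 1; s#push 2; s#push 3;
  clear_stack s;
  clear_stack countdown;
  Printf.printf "%d %d\n" (s#size) (countdown#size)
```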
Tuples and Records
▶ Tuples
let cityPopulation = ("Utrecht", 312634);;
val cityPopulation : string * int = ("Utrecht", 312634)
▶ Records: tuples with named elements
type coordinates = {long : float; lat : float};;
let uithof = {long = 5.2; lat = 52.1};;
val uithof : coordinates = {long = 5.2; lat = 52.1}
Variants
▶ Variants as enums:
type building = BBG | KBG | EDUC;;
▶ Variants as qualified unions:
type location =
  | Building of building
  | Coordinates of coordinates
  | City of string;;
Polymorphic variants
Creating a polymorphic binary tree:
type 'a binary_tree =
  | Leaf of 'a
  | Tree of 'a binary_tree * 'a binary_tree;;
Tree (Leaf 4, Leaf 5);;
Pattern matching
Turning a tree into a list:
let rec asList tree =
  match tree with
  | Leaf l -> [l]
  | Tree (a, b) -> asList(a) @ asList(b);;
val asList : 'a binary_tree -> 'a list = <fun>
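To try the tree and the pattern match together, here is a self-contained sketch. The depth function and the example value are hypothetical additions that reuse the same recursion shape; they are not from the slides.

```ocaml
type 'a binary_tree =
  | Leaf of 'a
  | Tree of 'a binary_tree * 'a binary_tree

(* Left-to-right list of all leaves *)
let rec asList tree =
  match tree with
  | Leaf l -> [l]
  | Tree (a, b) -> asList a @ asList b

(* Hypothetical companion: number of nodes on the longest path to a leaf *)
let rec depth tree =
  match tree with
  | Leaf _ -> 1
  | Tree (a, b) -> 1 + max (depth a) (depth b)

let example = Tree (Tree (Leaf 1, Leaf 2), Leaf 3)

let () =
  List.iter (Printf.printf "%d ") (asList example);
  Printf.printf "(depth %d)\n" (depth example)
```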
Partial application
Partial application is used to create more specific functions from a more general function:
let add x y = x + y;;
val add : int -> int -> int = <fun>
let increment = add 1;;
val increment : int -> int = <fun>
Partial application - issues
This is very concise code, but
▶ in certain cases it might be much slower [1]
▶ it might lead to a loss of polymorphism
[1] Jane Street: The dangers of being too partial
Problem: polymorphism and mutability
▶ OCaml supports both polymorphism and mutability.
▶ This causes trouble. For example writing to a device:
write : 'a -> unit
read : unit -> 'a
▶ This is too permissive: if we have written a string first, we cannot read an int later.
▶ Solution: weak polymorphism, i.e. polymorphic until the first use.
write : '_a -> unit
read : unit -> '_a
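A minimal sketch of the device example, assuming the "device" is nothing more than one mutable cell. The names device, write and read are illustrative. The cell starts out weakly polymorphic and its type is fixed by the first write.

```ocaml
(* The "device" is a single mutable cell; None means nothing was written yet *)
let device = ref None

let write v = device := Some v
let read () = !device

let () =
  (* The first use fixes the weak type variable to string *)
  write "hello";
  (* From here on, write 42 would be rejected by the type checker *)
  match read () with
  | Some s -> print_endline s
  | None -> print_endline "(empty)"
```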
Value restriction
▶ OCaml needs to decide when to make something either strongly or weakly polymorphic
▶ Decision: only fully evaluated or immutable expressions can be strongly polymorphic
▶ Functions are strongly polymorphic
▶ Applications of functions are not strongly polymorphic: they can be evaluated further
▶ Partial applications are also weakly polymorphic
Value restriction on partial application
▶ This might be an issue:
let id x = x;;
let mapId = List.map(id);;
val mapId : '_a list -> '_a list = <fun>
mapId [1;2;3];;
- : int list = [1; 2; 3]
mapId;;
- : int list -> int list = <fun>
mapId ["This won't work"; "anymore"];;
Error: This expression has type string but an expression was expected of type int
Solving value restriction on partial application
▶ To solve this, use eta-expansion to turn it into a function definition, which counts as a fully evaluated expression:
let mapId = List.map(id);;
val mapId : '_a list -> '_a list = <fun>
let mapId x = List.map(id) x;;
val mapId : 'a list -> 'a list = <fun>
▶ Eta-expansion solves both the loss of polymorphism and the possible speed degradation…
▶ … but it is also less pretty.
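The eta-expanded definition can be checked directly: a sketch showing that the same mapId is usable at two different element types, which the partially applied version would not allow.

```ocaml
let id x = x

(* Eta-expanded: a function definition, so it is generalised to 'a list -> 'a list *)
let mapId xs = List.map id xs

let () =
  assert (mapId [1; 2; 3] = [1; 2; 3]);
  assert (mapId ["now"; "this"; "works"] = ["now"; "this"; "works"])
```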
Foreign Function Interface
At some point you might want to interact with non-OCaml code.
▶ Use libraries that cannot be written in the language
▶ Use high-performance libraries already implemented in another language
OCaml uses C as an interoperability language:
▶ It is a standardized language (ISO C)
▶ It is a popular language for operating systems
▶ A great many libraries are written in C
▶ Most programming languages offer a C interface
Foreign Function Interface
There are two ways to interact with C code:
▶ Calling C functions from OCaml
▶ Calling OCaml functions from C
There are quite a lot of differences between C and OCaml: types, currying, memory allocation, pointers, …
Example: Using <stdlib.h> rand()
void srand(unsigned int seed);
int rand(void);
Print 5 random numbers from 0 to 49 in C:
int i;
srand((unsigned) 123);
for (i = 0; i < 5; i++) {
    printf("%d\n", rand() % 50);
}
Example: Using <stdlib.h> rand()
void srand(unsigned int seed);
int rand(void);
C's int, unsigned int and void types are different from the types used in OCaml: the types need to be mapped.
(* open Ctypes *)
val void : unit typ
val int : int typ
val uint : uint typ
Example: Using <stdlib.h> rand()
open Ctypes
open Foreign
let srand = foreign "srand" (uint @-> returning void)
let rand = foreign "rand" (void @-> returning int)
Print 5 random numbers from 0 to 49 in OCaml:
let () =
  (* srand expects an unsigned int, so the seed is converted first *)
  srand (Unsigned.UInt.of_int 123);
  for i = 1 to 5 do
    Printf.printf "%d\n" (rand () mod 50)
  done
Example: Using Pointers
The time library of C uses pointers:
time_t time(time_t *);
double difftime(time_t, time_t);
char *ctime(const time_t *timep);
These pointer types need to be translated to OCaml:
type time_t = unit ptr
let time_t : time_t typ = ptr void
let time = foreign "time" (ptr time_t @-> returning time_t)
Example: Using Pointers
Use from_voidp to create a null pointer:
let cur_time = time (from_voidp time_t null);;
(* val cur_time : time_t = <abstr> *)
let time' () = time (from_voidp time_t null);;
(* val time' : unit -> time_t = <fun> *)
let t1 = time' () in
Unix.sleep 2;
let t2 = time' () in
difftime t2 t1;;
Example: Using Pointers
char *ctime(const time_t *timep);
let ctime = foreign "ctime" (ptr time_t @-> returning string);;
ctime (time' ());;
Error: This expression has type time_t but an expression was expected of type time_t ptr
Solution: 'manually' allocate a time_t to create a pointer:
let t_ptr = allocate time_t (time' ());;
ctime t_ptr;;
(* string = "Tue Nov 5 08:51:55 2013\n" *)
Foreign: @-> and returning
To bind a C function in OCaml you need to use the foreign function:
foreign "srand" (uint @-> returning void)
foreign : string -> ('a -> 'b) Ctypes.fn -> 'a -> 'b
▶ Functions are first-class citizens in OCaml, not in C
▶ OCaml functions are defined in a curried style
Foreign: @-> and returning
In OCaml these two signatures are the same type:
val curried : int -> int -> int
val curried : int -> (int -> int)
In C they are two different things:
int uncurried_C(int, int);
uncurried_C(3, 4);
(* int -> int -> int *)
typedef int (function_t)(int);
function_t *curried_C(int);
curried_C(3)(4);
(* int -> (int -> int) *)
returning marks where the C function ends, so Ctypes can tell them apart:
int @-> int @-> returning int
int @-> returning (int @-> returning int)
Memory representation
Most data types are boxed: stored on the heap in a block. The layout of a block is as follows:
+------+-------+----------+----------+----------+----
| size | color | tag byte | value[0] | value[1] | ...
+------+-------+----------+----------+----------+----
▶ Size determines the size of the block
▶ Color is used by the GC algorithm
▶ The tag gives the type of the stored data
▶ The remaining words are the actual values, which could be pointers to other blocks
Memory representation
There are tags for most data types; some have the same tag:
▶ strings
▶ float (double-precision)
▶ closures
▶ tuples, records, arrays
▶ …
Unboxed
Two main types are stored unboxed: integers and pointers.
▶ Each integer/pointer is stored in a word (64-bit or 32-bit)
▶ The lowest bit is 1 for integers and 0 for pointers
▶ Integers can effectively store only 63 (or 31) bits
▶ Pointers are word aligned so the missing bit does not matter
Unboxed
A number of data types are actually stored as integers:
▶ chars
▶ bools
▶ variants without parameters
▶ unit
▶ the empty list
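The boxed/unboxed distinction can be observed from within OCaml using the standard-library Obj module, which is meant for inspection only and should not appear in ordinary code. A sketch, where is_immediate and tag_of are hypothetical helper names:

```ocaml
type building = BBG | KBG | EDUC

(* true if the value is stored directly in a word, without a heap block *)
let is_immediate v = Obj.is_int (Obj.repr v)

(* tag byte from the block header of a boxed value *)
let tag_of v = Obj.tag (Obj.repr v)

let () =
  (* Stored as integers *)
  assert (is_immediate 42);
  assert (is_immediate 'c');
  assert (is_immediate true);
  assert (is_immediate ());
  assert (is_immediate []);
  assert (is_immediate KBG);
  (* Boxed: each kind of block carries its own tag *)
  assert (tag_of "text" = Obj.string_tag);
  assert (tag_of 3.14 = Obj.double_tag);
  assert (tag_of (fun x -> x + 1) = Obj.closure_tag);
  (* Tuples and arrays of integers share the same tag *)
  assert (tag_of (1, 2) = tag_of [| 1; 2 |])
```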
Unboxed trade-offs
There are two basic advantages of using unboxed data types:
▶ No more pointer to follow
▶ Smaller memory footprint
However, the disadvantage is that some extra arithmetic is required to manipulate the data. For example to compute a+b:
let c = (((a lsr 1) + (b lsr 1)) lsl 1) + 1;;
(* c = (((a>>1) + (b>>1))<<1) | 1 *)
Unboxed trade-offs
Actually this can be improved. Instead of:
let c = (((a lsr 1) + (b lsr 1)) lsl 1) + 1;;
(* c = (((a>>1) + (b>>1))<<1) | 1 *)
we can write
let c = a + b - 1;;
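The shortcut works because (2x + 1) + (2y + 1) - 1 = 2(x + y) + 1. The sketch below simulates tagged machine words with ordinary OCaml ints; encode, decode, add_slow and add_fast are hypothetical names, and asr is used so that negative numbers decode correctly.

```ocaml
(* Simulated machine words: the integer n is stored as the word 2n + 1 *)
let encode n = (n lsl 1) lor 1
let decode w = w asr 1

(* Untag both operands, add, then retag *)
let add_slow a b = (((a asr 1) + (b asr 1)) lsl 1) lor 1

(* Same result with a single subtraction *)
let add_fast a b = a + b - 1

let () =
  let a = encode 20 and b = encode 22 in
  assert (add_slow a b = add_fast a b);
  Printf.printf "%d\n" (decode (add_fast a b))
```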
Unboxed trade-offs
Why aren't all data types unboxed?
▶ Some data types have variable length
▶ The GC uses header information
▶ Some data types store other data types
▶ Floats specifically cannot be stored easily in 63 bits
▶ At least one attempt has been made
▶ Polymorphic functions can use the tag
Float arrays
A solution for float arrays:
▶ Normally floats are boxed
▶ However within arrays floats are stored directly
▶ A drawback is that this requires many exceptions in the compiler code
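The special case for float arrays shows up in the block tag. A sketch using the standard-library Obj module, assuming a default compiler build (a compiler configured without flat float arrays would behave differently):

```ocaml
let () =
  let boxed = Obj.repr 1.5 in
  let flat = Obj.repr [| 1.5; 2.5; 3.5 |] in
  let ints = Obj.repr [| 1; 2; 3 |] in
  (* A single float is a boxed double *)
  assert (Obj.tag boxed = Obj.double_tag);
  (* A float array is one block holding the raw doubles *)
  assert (Obj.tag flat = Obj.double_array_tag);
  (* An ordinary array is a regular block with tag 0 *)
  assert (Obj.tag ints = 0)
```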
Garbage Collection
Lots of C programmers don't like automated garbage collection, but:
▶ free() is a very expensive operation.
▶ Garbage collection can compact the heap and move memory.
▶ But if you do it, you must do it well.
Garbage Collection
Generational GC.
▶ OCaml has a functional coding style.
▶ Most values become garbage almost immediately after they are created.
▶ Others tend to live for a very long time.
▶ Young values land on the minor heap.
▶ Older values are moved to the major heap.
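Minor-heap allocation can be observed with the standard-library Gc module. A sketch, where allocated_by is a hypothetical helper that reports how many words a function allocated on the minor heap; the exact number depends on the compiler version, so only a lower bound is checked.

```ocaml
(* Runs f and returns its result together with the words it allocated on the minor heap *)
let allocated_by f =
  let before = Gc.minor_words () in
  let result = f () in
  let after = Gc.minor_words () in
  (result, after -. before)

let () =
  let (xs, words) = allocated_by (fun () -> List.init 1000 (fun i -> i)) in
  (* Each list cell is a header plus two fields, so at least 3 words per element *)
  assert (words >= 3000.0);
  Printf.printf "%d cells, about %.0f words\n" (List.length xs) words
```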
Allocating memory
▶ The minor heap is a region of virtual memory a few MB long, claimed with C's malloc.
▶ The range is defined by two pointers, start and end.
▶ start is placed at the nearest word boundary (32/64 bit).
▶ Two other pointers manage memory: limit and ptr.
          <----------- size ----------->
base ---- start                      end
          limit          ptr <----------
▶ ptr crosses limit? Stop the world.
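The pointer scheme amounts to bump allocation. Below is a toy model, not the runtime's implementation: the heap is just a range of word indices, limit is assumed to coincide with start, and a "collection" simply declares the whole minor heap empty. All names are hypothetical.

```ocaml
type minor_heap = {
  start : int;                (* lowest usable word index; also plays the role of limit *)
  finish : int;               (* one past the highest word index ("end" on the slide) *)
  mutable ptr : int;          (* the next block is placed just below this index *)
  mutable collections : int;  (* how often the world was stopped *)
}

let create size = { start = 0; finish = size; ptr = size; collections = 0 }

(* Pretend collection: everything in the toy heap is garbage *)
let minor_collect h =
  h.collections <- h.collections + 1;
  h.ptr <- h.finish

(* Allocate by moving ptr down; collect first if ptr would cross the limit *)
let alloc h words =
  if words > h.finish - h.start then invalid_arg "alloc: larger than the minor heap";
  if h.ptr - words < h.start then minor_collect h;
  h.ptr <- h.ptr - words;
  h.ptr

let () =
  let h = create 10 in
  let a = alloc h 4 in
  let b = alloc h 4 in
  let c = alloc h 4 in   (* does not fit any more: triggers a collection *)
  Printf.printf "%d %d %d, collections: %d\n" a b c h.collections
```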
Allocating memory
▶ Next-fit allocation (default).
▶ First-fit allocation (possible option).
Mark and Sweep
▶ Marking stops the world.
▶ The work is therefore divided into slices (size determined by the runtime, can be set manually).
▶ The program is blocked for only one slice at a time.
Mark and Sweep
Each value has a header with a 2-bit color.
▶ White = not reached yet.
▶ Blue = on the free list and not in use.
▶ Grey = reachable but not fully scanned.
▶ Black = reachable and fully scanned.
▶ Take the always-reachable values (application stack) as roots and do a depth-first search (using a stack): white -> grey -> black.
▶ Remaining white values can be removed.
Mark and Sweep
▶ The search stack has a finite size, so we limit the number of grey values.
▶ Crossing that limit makes the heap impure.
▶ Impure -> process all existing grey values as normal, in order of memory address.
▶ This turns grey values into black values.
▶ Repeat until no more grey values are found.
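The colouring scheme can be sketched on a toy heap. This is an illustration, not the runtime's algorithm: blocks are array indices, the mark stack is unbounded (so the impure case never arises), blue is left out, and the names mark and garbage are hypothetical.

```ocaml
type color = White | Grey | Black

(* Toy heap: heap.(i) lists the indices of the blocks that block i points to *)
let mark heap roots =
  let colors = Array.make (Array.length heap) White in
  let stack = Stack.create () in
  let reach i =
    if colors.(i) = White then begin
      colors.(i) <- Grey;          (* reachable, fields not scanned yet *)
      Stack.push i stack
    end
  in
  List.iter reach roots;
  while not (Stack.is_empty stack) do
    let i = Stack.pop stack in
    List.iter reach heap.(i);      (* scan the fields of block i *)
    colors.(i) <- Black            (* reachable and fully scanned *)
  done;
  colors

(* Sweep: every block that is still white can be freed *)
let garbage colors =
  List.filter (fun i -> colors.(i) = White)
    (List.init (Array.length colors) (fun i -> i))

let () =
  (* 0 -> 1 -> 2 is reachable from the root; 3 and 4 only point at each other *)
  let heap = [| [1]; [2]; []; [4]; [3] |] in
  let colors = mark heap [0] in
  assert (garbage colors = [3; 4])
```

Note that the cycle between blocks 3 and 4 is collected: reachability from the roots is what counts, not whether something still points at a block.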