Faculty of Science Information and Computing Sciences 1
Concepts of programming languages: Lecture 10
Wouter Swierstra
Last time
▶ How do other (typed) languages support metaprogramming?
▶ Case studies: when do people use reflection?
▶ Embedding DSLs using quasiquotation.
This time
Quasiquotation and DSLs
Concurrency and parallelism
▶ Why is this important?
▶ What are these two concepts?
▶ How does Erlang address these challenges?
Quasiquotation
As a last example, I want to briefly mention quasiquotation. We’ve seen how to embed domain specific languages in Haskell using deep/shallow embeddings. When we do so, we are constrained by Haskell’s syntax and static semantics. Racket shows how to use macros to write custom language dialects. Haskell’s quasiquotation framework borrows these ideas.
Quasiquotation: example
Suppose I’m writing a Haskell library for manipulating and generating C code. But working with C ASTs directly is pretty painful:
add n = Func (DeclSpec [ ] [ ] (Tint Nothing)) (Id "add") DeclRoot (Args [Arg (Just (Id "x")) ...
What I’d like to do is embed (a fragment of) C in my Haskell library.
Using quasiquotation
The quasiquoter allows me to do just that:
add n = [cfun| int add(int x) { return x + $int:n$; } |]
The cfun quasiquoter tells me how to turn a string into a suitable Exp.
Defining quasiquoters
A quasiquoter is nothing more than a series of parsers for expressions, patterns, types and declarations:
data QuasiQuoter = QuasiQuoter
  { quoteExp  :: String -> Q Exp
  , quotePat  :: String -> Q Pat
  , quoteType :: String -> Q Type
  , quoteDec  :: String -> Q [Dec]
  }
Whenever the Haskell parser encounters a quasiquotation [myQQ| ... |], it will run the parser associated with the quasiquoter myQQ to generate the quoted expression/pattern/type/declaration.
Multiline strings
As a simple example, suppose we want to have multi-line string literals. We can define a quasiquoter:
ml :: QuasiQuoter
ml = QuasiQuoter { quoteExp = \a -> return (LitE (StringL a)), ... }
And call it as follows:
example :: String
example = [ml|hello
beautiful
world|]
Quasiquoting
The quasiquoting mechanism allows you to embed arbitrary syntax within your Haskell program, while still using Template Haskell’s quotation and splicing to mix your object language with Haskell code.
This is a mix of the embedded and stand-alone approaches to domain specific languages that we saw over the last few weeks.
Parallelism and concurrency
Modern server software is demanding to develop and operate: It must be available at all times and in all locations; it must reply within milliseconds to user requests; it must respond quickly to capacity demands; it must process a lot of data and even more traffic; it must adapt quickly to changing product needs; and, in many cases, it must accommodate a large engineering organization, its many engineers the proverbial cooks in a big, messy kitchen.
Marius Eriksen, Twitter Principal Engineer, Functional at Scale, CACM, December 2016
Concurrency & parallelism
A parallel program is a program that uses a multiplicity of computer hardware to perform a computation more quickly.
A concurrent program has multiple threads of control. Conceptually, these threads execute ‘at the same time’, interleaving their effects.
These notions are not the same! Programs may be both parallel and concurrent, one of the two, or neither.
Determinism and non-determinism
A program that always returns the same result for the same input is said to be deterministic. When different runs of the program may yield different results, the program is non-deterministic.
Concurrent programs are necessarily non-deterministic: depending on how threads are interleaved, they may compute different results.
Concurrent and non-deterministic programs are much harder to test and verify.
Why care about concurrency and parallelism?
Computers are not getting much faster: instead they have more and more cores. To exploit this hardware we need parallelism. Programs no longer run in isolation. They are inter-connected – to different machines, to users, to subsystems. To structure such interconnected programs, we need concurrency.
Why concurrency?
First, scale implies concurrency. For example, a search engine may split its index into many small pieces (or shards) so the entire corpus can fit in main memory. To satisfy queries efficiently, all shards must be queried concurrently. Second, communication between servers is asynchronous and must be handled concurrently for efficiency and safety.
Marius Eriksen, Twitter Principal Engineer, Functional at Scale, CACM, December 2016
Concurrency models
Traditionally, concurrency is done through spawning or forking new threads. There are numerous synchronization primitives to co-ordinate between threads: locks, mutexes, semaphores, … If you’ve taken the course on Concurrency, you will have seen how to write algorithms using them.
Concurrency models
Programming with concurrency is really hard. In particular, once you have state that is mutable and shared between threads, it becomes almost impossible to reason about your program. How can both threads reach the critical section simultaneously?
a=0;
Thread one
a = a + 1; if (a == 1) { critical_section(); }
Thread two
a = a + 1; if (a == 1) { critical_section(); }
Question
What can go wrong?
Atomicity
Assignments are not atomic operations. Instead they read in the current value of a, increment that value, and store it again. If both threads read a while it is still 0, both store 1, both see a == 1, and both enter the critical section. This can cause all kinds of undesirable interaction!
Check out The Deadlock Empire (https://deadlockempire.github.io/) or our Concurrency course.
Programming with locks and threads directly is really hard.
Beyond locks and threads
Locks and threads are at the heart of any concurrent program. More recently, languages have started to offer different abstractions, often built on top of simple locks and threads:
▶ actor model/message passing
▶ software transactional memory
▶ exploiting data parallelism
▶ executing code on GPUs
These all represent different tools, specifically designed for different problems.
Erlang
Originally developed in 1986 at Ericsson, Erlang aimed to improve the software running on telephony switches. It is specifically designed for concurrent programming with many different processes that might fail at any time. It was released as open source together with OTP (Open Telecom Platform), which contains:
▶ Erlang compiler and interpreter;
▶ communication protocol between servers;
▶ a static analysis tool (Dialyzer);
▶ a distributed database server (Mnesia);
▶ …
Does anyone use Erlang?
▶ Social media:
    ▶ WhatsApp
    ▶ Grindr
▶ Betting websites:
    ▶ William Hill
    ▶ Bet365
▶ Telecoms:
    ▶ T-Mobile
    ▶ AT&T
▶ Much, much more…
Erlang for Haskell programmers
Erlang is:
▶ dynamically typed;
▶ strict rather than lazy;
▶ impure (there are no restrictions on effects and assignments);
▶ syntactically familiar, but slightly different.
Quicksort
qsort([]) -> [];
qsort([X|Xs]) ->
    qsort([Y || Y <- Xs, Y < X]) ++ [X] ++ qsort([Y || Y <- Xs, Y >= X]).
▶ No ‘equations’ but ‘arrows’ when pattern matching.
▶ Clauses are separated with semi-colons; a function definition is finished with a period.
▶ All Erlang variables start with capital letters; names starting with lower-case letters are atoms.
Modules
To compile the code, we need to include some module information:
-module(myModuleName).
-compile(export_all).
qsort([]) -> ...
This declares the module myModuleName and exports all its declarations.
Hello Erlang!
-module(hello).
-export([start/0]).
start() -> io:format("Hello Erlang!").
The export attribute can be used to expose certain functions: note that each entry records the function's arity (the number of arguments it expects), but not its type. We can call the format function from the module io using io:format.
Concurrency
Concurrent Erlang programs are organized into processes: lightweight virtual machines that can communicate with other processes. There are three important primitives for defining concurrent programs:
▶ spawn creates a new process;
▶ send (written Pid ! Message) sends a message to another process;
▶ receive receives a message sent by another process.
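A minimal sketch of how the three primitives fit together. The module name echo_demo and its functions are hypothetical and not part of the lecture: start/0 spawns a process, ping/2 sends it a message with !, and both sides use receive to wait for the other.

```erlang
-module(echo_demo).
-export([start/0, echo/0, ping/2]).

%% Spawn a new process running echo/0 and return its process id.
start() ->
    spawn(echo_demo, echo, []).

%% Wait for a {From, Msg} message, send Msg back to From, and repeat.
echo() ->
    receive
        {From, Msg} ->
            From ! {self(), Msg},
            echo()
    end.

%% Send Msg to the echo process Pid and wait for its reply.
ping(Pid, Msg) ->
    Pid ! {self(), Msg},
    receive
        {Pid, Reply} -> Reply
    end.
```

Calling echo_demo:ping(echo_demo:start(), hello) should hand back the atom hello, after one round trip between the two processes.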
Case study: a concurrent file server
As a simple example to illustrate this concurrency model, consider defining a simple concurrent file server and its client:
▶ the afile_server module waits for messages from the client;
▶ the afile_client module may send messages to the server requesting data.
Case study: the server
-module(afile_server).
-export([start/1, loop/1]).
start(Dir) -> spawn(afile_server, loop, [Dir]).
loop(Dir) -> ...
Modules and processes are a bit like classes and objects: there may be many processes running code defined in the same module. When a new afile_server is started using start, a new process is spawned. In this example, the process is spawned by calling the function loop from the module afile_server with the argument list [Dir].
Case study: the server
loop(Dir) ->
    receive
        {Client, list_dir} ->
            Client ! {self(), file:list_dir(Dir)};
        {Client, {get_file, File}} ->
            Full = filename:join(Dir, File),
            Client ! {self(), file:read_file(Full)}
    end,
    loop(Dir).
This code waits for a message from the client (receive …). Once the message has been received and processed, it calls loop(Dir) again to receive a new message, ad infinitum.
Case study: the server
receive
    {Client, list_dir} ->
        Client ! {self(), file:list_dir(Dir)};
    {Client, {get_file, File}} ->
        Full = filename:join(Dir, File),
        Client ! {self(), file:read_file(Full)}
end,
The receive ... end statement pattern matches on the message it receives from the client. Note: variables bound by a pattern start with a capital letter; atoms (constants) start with a lower-case letter.
Case study: the server
receive
    {Client, list_dir} ->
        Client ! {self(), file:list_dir(Dir)};
    {Client, {get_file, File}} ->
        Full = filename:join(Dir, File),
        Client ! {self(), file:read_file(Full)}
end,
Here we expect one of two commands:
▶ {Client, list_dir} – the client asks us to list the files in the directory Dir;
▶ {Client, {get_file, File}} – the client asks us to return the contents of the file File.
Case study: the server
receive
    ...
    {Client, {get_file, File}} ->
        Full = filename:join(Dir, File),
        Client ! {self(), file:read_file(Full)}
If we receive the {Client, {get_file, File}} message from the client Client, we:
▶ compute the full file name by joining Dir and File;
▶ send a new message to the client with the file contents;
▶ tag this message with its sender, self().
All messages are tagged with their sender. (The notation {X,Y} creates a tuple in Erlang.)
Recap: main server code
loop(Dir) ->
    receive
        {Client, list_dir} ->
            Client ! {self(), file:list_dir(Dir)};
        {Client, {get_file, File}} ->
            Full = filename:join(Dir, File),
            Client ! {self(), file:read_file(Full)}
    end,
    loop(Dir).
Question
How can we add a new command to change the current directory?
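One possible answer, as a sketch rather than the lecture's own solution: since the directory is the argument of loop, changing it means recursing with a different argument. The module name afile_server_cd and the message {change_dir, NewDir} are assumptions made for this illustration. Each branch now makes its own recursive call, so the new branch can pass NewDir instead of Dir.

```erlang
-module(afile_server_cd).
-export([start/1, loop/1]).

start(Dir) ->
    spawn(afile_server_cd, loop, [Dir]).

%% The server state is just the current directory, threaded through loop/1.
loop(Dir) ->
    receive
        {Client, list_dir} ->
            Client ! {self(), file:list_dir(Dir)},
            loop(Dir);
        {Client, {get_file, File}} ->
            Full = filename:join(Dir, File),
            Client ! {self(), file:read_file(Full)},
            loop(Dir);
        %% Hypothetical new command: acknowledge, then continue with NewDir.
        {Client, {change_dir, NewDir}} ->
            Client ! {self(), ok},
            loop(NewDir)
    end.
```

This sketch does not check that NewDir exists; a later list_dir on a missing directory simply returns the error tuple produced by file:list_dir.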
Case study: the client
-module(afile_client).
-export([ls/1, get_file/2]).
ls(Server) ->
    Server ! {self(), list_dir},
    receive
        {Server, FileList} -> FileList
    end.
get_file(Server, File) ->
    Server ! {self(), {get_file, File}},
    receive
        {Server, Content} -> Content
    end.
Case study: the client
This example shows that there is symmetry between the client and server. Whenever one sends a message, the other is expecting one and vice versa. Our code is structured in different modules, each of which may correspond to one or more processes at run-time. These processes communicate through message passing.
Connecting the two
We can test our server in isolation by sending it messages from the REPL. Or start a server and a client and connect the two:
> c(afile_server).
{ok,afile_server}
> c(afile_client).
{ok,afile_client}
> FileServer = afile_server:start(".").
<0.43.0>
> afile_client:get_file(FileServer,"missingfile").
{error,enoent}
> afile_client:get_file(FileServer,"hello.txt").
{ok, <<"hello there...}
Data types
Erlang lets you define algebraic data types and declare types for your functions:
-spec plan_route(point(), point()) -> route().
-type direction() :: north | east | south | west.
-type point() :: {integer(), integer()}.
-type route() :: [{go, direction(), integer()}].
The Erlang Dialyzer is a static-analysis tool that tries to detect whether or not a program will crash.
The Dialyzer
When the Erlang Dialyzer warns that your program will fail, you can be sure that it will fail: its warnings are sound. If the Dialyzer does not warn that your program will fail, you cannot be sure that it will not fail: the analysis is not complete. It works even if you do not provide type annotations (type inference), and uses what Erlang calls success typing: assume everything will work out, but raise a warning if it cannot.
Pattern matching
Note that a pattern may contain more than one occurrence of the same variable:
eq(X,X) -> "The arguments are equal";
eq(X,Y) -> "The arguments are different".
Erlang is dynamically typed: you can tag tuples with an atom rather than introduce an algebraic data type if you wish.
Error handling
Many languages encourage defensive programming – check all possible ways a function can fail. In Erlang, this idea is built into the language: when you call a function in an invalid way, it will crash. But it is also part of the Erlang philosophy: Let it crash!
▶ exit(Why) terminates the current process (and notifies any connected processes that this process has terminated);
▶ throw(Why) throws an exception that the caller may want to catch;
▶ error(Why) signals a run-time error, crashing the current process unless the error is caught.
Erlang philosophy: let it crash!
Never return a bogus value (unlike, say, JavaScript). Fail fast and noisily and let the caller handle the error, using a try-catch block:
try FunctionOrExpression of
    Pattern1 -> Expression1;
    Pattern2 -> Expression2;
    ...
catch
    ExceptionType1: ExPattern1 -> ExExpression1;
    ...
after
    AfterExpressions
end.
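To make the template concrete, here is a small sketch showing how the three kinds of failure (throw, exit and error) can be told apart in a catch clause. The module try_demo and its functions generate/1 and classify/1 are hypothetical names chosen for this illustration.

```erlang
-module(try_demo).
-export([generate/1, classify/1]).

%% Succeeds or fails in a different way depending on the argument.
generate(1) -> ok;
generate(2) -> throw(oops);
generate(3) -> exit(stopping);
generate(4) -> error(bad_thing).

%% Run generate/1 and report how it ended instead of crashing.
classify(N) ->
    try generate(N) of
        Value -> {returned, Value}
    catch
        throw:Why -> {thrown, Why};
        exit:Why  -> {exited, Why};
        error:Why -> {crashed, Why}
    end.
```

Calling classify(5) matches no clause of generate/1, which raises a function_clause error and therefore lands in the error branch.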
Erlang design principles
▶ Everything is a process.
▶ Processes are strongly isolated – they do not share memory.
▶ Process creation and destruction is a lightweight operation.
▶ Message passing is the only way for processes to interact.
▶ Processes have unique names.
▶ If you know the name of a process you can send it a message.
▶ Processes share no resources.
▶ Error handling is non-local.
▶ Processes do what they are supposed to do or fail.
Erlang philosophy: concurrency everywhere
Object-oriented programmers try to model the world using classes and objects.
Erlang takes a similar ‘philosophical position’: everything is a process; processes communicate through message passing. When a process fails, all connected processes are notified.
Erlang processes
Note: Erlang processes are not operating system processes. They are small, self-contained virtual machines running Erlang code. These processes are created (spawn) and can send (Pid ! Message) and receive (receive ... end) messages. Processes all have an associated ‘mailbox’ – even when they are not waiting to receive a message, incoming messages will be stored.
Processes vs functions
It’s trivial to turn any function into a ‘server’ handling that function:
area({rectangle, Width, Height}) -> Width * Height;
area({square, Side}) -> Side * Side.
Versus
loop() ->
    receive
        {rectangle, Width, Height} ->
            io:format("Area is ~p~n", [Width * Height]),
            loop();
        ...
Organizing processes
The lightweight nature of creating new processes raises a new design question: How do we organize our code into processes? Which processes communicate? And what messages do they send one another?
Client-server
Two processes:
▶ the client sends requests to the server;
▶ the server computes a suitable reply and sends a response to the client.
We saw this already in the file server example.
Client server
Instead of tagging all messages with process ids explicitly, we can define an auxiliary function rpc (remote procedure call):
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        Response -> Response
    end.
This tags every request with the pid of the sender, to which the server should respond.
Updating the loop function
The branches of our loop function send messages, instead of printing to the terminal:
loop() ->
    receive
        {From, {rectangle, Width, Height}} ->
            From ! Width * Height,
            loop();
        {From, {square, Width}} ->
            From ! Width * Width,
            loop()
    end.
But…
▶ If we receive an unexpected message, no branch of the receive matches: the message stays in the mailbox and the sender never gets a reply.
▶ The rpc function sends a request to the server and awaits a response. If some other process sends a message before the server responds, we may return an incorrect result, because this pattern accepts any message:
    receive Response -> Response end
▶ We need to explicitly spawn the server when we want to start it up.
Receiving unexpected message types
To handle unexpected messages, we can simply send an error back to the sender. We add a third branch to the loop function:
... {From, Other} -> From ! {error, Other}, loop()
If we receive an unexpected message, we reply by sending an error back to the caller.
Receiving messages from unexpected processes
We need to ensure that the response we get in the rpc function is really from the process we’re expecting:
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} -> Response
    end.
Question
Is the third Pid in the receive clause a binding or an applied occurrence?
The Pid in the receive clause is no longer a binding occurrence! It must be equal to the Pid function argument. Any messages from a different Pid are queued. For this to work, the server has to tag its replies with its own pid, for example From ! {self(), Width * Height}.
Creating a server
By adding start and area functions, we can make the interaction with the server a bit easier:
start() -> spawn(area_server, loop, []).
area(Pid, What) -> rpc(Pid, What).
> Pid = area_server:start().
<0.36.0>
> area_server:area(Pid, {rectangle, 10, 8}).
80
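Putting the pieces from the previous slides together gives a module along the following lines. This is a sketch, not the lecture's own listing: it assumes that the server tags every reply with self() so that the selective receive in rpc can match on Pid, and that unexpected requests are answered with an {error, Other} tuple.

```erlang
-module(area_server).
-export([start/0, area/2, loop/0]).

%% Spawn the server process and return its pid.
start() ->
    spawn(area_server, loop, []).

%% Client-side wrapper hiding the message protocol.
area(Pid, What) ->
    rpc(Pid, What).

%% Send a request tagged with our pid; accept only a reply tagged with Pid.
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} -> Response
    end.

%% Server loop: every reply is tagged with the server's own pid.
loop() ->
    receive
        {From, {rectangle, Width, Height}} ->
            From ! {self(), Width * Height},
            loop();
        {From, {square, Side}} ->
            From ! {self(), Side * Side},
            loop();
        {From, Other} ->
            From ! {self(), {error, Other}},
            loop()
    end.
```

Only start/0, area/2 and loop/0 are exported; loop/0 has to be exported because spawn is given the module and function name.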
Process features
▶ Creating Erlang processes is very cheap, in contrast to creating new OS processes!
▶ We can add an after clause to a receive statement to time out after 10 milliseconds:
receive
    ... -> ...
after 10 ->
    ...timeout code...
end
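As an illustration of the after clause, the following sketch defines two small helpers. The module timeout_demo and the function names wait_for/1 and flush/0 are hypothetical. The first waits a bounded time for any message; the second uses a timeout of 0 to empty the mailbox without ever blocking.

```erlang
-module(timeout_demo).
-export([wait_for/1, flush/0]).

%% Wait at most Millis milliseconds for any message.
wait_for(Millis) ->
    receive
        Msg -> {ok, Msg}
    after Millis ->
        timeout
    end.

%% Discard all waiting messages; 'after 0' fires as soon as the mailbox
%% holds no further message.
flush() ->
    receive
        _Any -> flush()
    after 0 ->
        ok
    end.
```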
Receiving messages
1. When entering a receive statement, we start a timer.
2. Match the first message in the mailbox with the patterns one by one. Enter the branch corresponding to the first match.
3. If no branch matches, add the message to the ‘save queue’. Try the next message until one does match.
4. If no message matches, suspend the process until a new message arrives.
5. If a new message matches, reset the timer and add the saved messages to the mailbox in the order that they arrived.
6. If the timer elapses before a message is matched, execute the corresponding timeout expression and restore the messages in the mailbox.
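The effect of the steps above can be observed with a small experiment, sketched here under the assumption that the calling process has an otherwise empty mailbox. The module selective_demo and the message tags low and high are made up for the illustration. The process sends itself two messages and then receives them in the opposite order: the first receive skips {low, 1} because it does not match, and that message is still available to the second receive.

```erlang
-module(selective_demo).
-export([run/0]).

run() ->
    Self = self(),
    Self ! {low, 1},
    Self ! {high, 2},
    %% {low, 1} arrives first but does not match, so it is set aside.
    First = receive {high, X} -> X end,
    %% The skipped message is back in the mailbox for this receive.
    Second = receive {low, Y} -> Y end,
    {First, Second}.   % returns {2, 1}
```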
Error handling
In a single-threaded program, crashing is bad. In a concurrent program, if a single process crashes, it is not the end of our program.
Instead of avoiding crashes, Erlang embraces the fact that any single process may crash. The question is: how do we detect crashes and restore crashed processes? The focus is on cure rather than prevention.
Two slogans
Let some other process fix it
Processes are arranged to monitor one another for health. When a process crashes, the observing process is informed and takes restorative action.
Let it crash
If necessary, let any individual process crash.
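One way to observe another process in Erlang is a monitor. The sketch below uses the built-in spawn_monitor/1, which starts a process and delivers a 'DOWN' message to the caller when that process terminates. The module monitor_demo and the function watch/1 are hypothetical names, and the "restorative action" is reduced here to reporting what happened.

```erlang
-module(monitor_demo).
-export([watch/1]).

%% Run Fun in a new, monitored process and report how it ended.
watch(Fun) ->
    {Pid, Ref} = spawn_monitor(Fun),
    receive
        {'DOWN', Ref, process, Pid, normal} -> finished;
        {'DOWN', Ref, process, Pid, Reason} -> {crashed, Reason}
    end.
```

A real supervisor would typically restart the crashed process at this point rather than just return a value.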
Let it crash
▶ No need for checking arguments – just crash.
▶ No need to perform additional calculations after things have gone wrong.
▶ Clean separation of concerns: other processes handle the crash; this process does computation only.
▶ Crashing early makes it easier to diagnose what went wrong.
▶ …
Erlang: recap
Erlang is a ‘concurrency oriented language’. Processes are (almost) as easy to work with as functions. Processes communicate through message passing, but do not share state. Processes are free to fail; other processes will clean up the mess. Processes can be run on a single machine, or distributed over many different machines.
Coming up
▶ Parallel programming with Erlang
▶ Concurrent and parallel programming with Haskell