Faculty of Science Information and Computing Sciences 1
Concepts of programming languages
Lecture 9
Wouter Swierstra
Last time
▶ How do other (typed) languages support this style of metaprogramming?
▶ Case studies: when do people use reflection?
▶ Embedding DSLs using quasiquotation.
This time
Concurrency and parallelism
▶ Why is this important?
▶ What are these two concepts?
▶ How does Erlang address these challenges?
Modern server software is demanding to develop and operate: It must be available at all times and in all locations; it must reply within milliseconds to user requests; it must respond quickly to capacity demands; it must process a lot of data and even more traffic; it must adapt quickly to changing product needs; and, in many cases, it must accommodate a large engineering organization, its many engineers the proverbial cooks in a big, messy kitchen.
Marius Eriksen, Twitter Principal Engineer, Functional at Scale, CACM December 2016
Concurrency & parallelism
A parallel program is a program that uses a multiplicity of computer hardware (for example, several cores) to perform a computation more quickly.
A concurrent program has multiple threads of control. Conceptually, these threads execute ‘at the same time’, interleaving their effects.
These notions are not the same! Programs may be both parallel and concurrent, one of the two, or neither.
Determinism and non-determinism
A program that always returns the same result is said to be deterministic. When different runs of the program may yield different results, the program is non-deterministic.
Concurrent programs are necessarily non-deterministic – depending on how threads are interleaved, they may compute different results.
Concurrent and non-deterministic programs are much harder to test and verify.
Why care about concurrency and parallelism?
Computers are not getting much faster: instead they have more and more cores. To exploit this hardware we need parallelism.
Programs no longer run in isolation. They are inter-connected – to different machines, to users, to subsystems. To structure such interconnected programs, we need concurrency.
Why concurrency?
First, scale implies concurrency. For example, a search engine may split its index into many small pieces (or shards) so the entire corpus can fit in main memory. To satisfy queries efficiently, all shards must be queried concurrently. Second, communication between servers is asynchronous and must be handled concurrently for efficiency and safety.
Marius Eriksen, Twitter Principal Engineer, Functional at Scale, CACM December 2016
Concurrency models
Traditionally, concurrency is done through spawning or forking new threads. There are numerous synchronization primitives to co-ordinate between threads: locks, mutexes, semaphores, … If you’ve taken the course on Concurrency, you will have seen how to write algorithms using them.
Concurrency models
Programming with concurrency is really hard. In particular, once you have state that is mutable and shared between threads, it becomes almost impossible to reason about your program. How can both threads reach the critical section simultaneously?
a = 0;

Thread one
a = a + 1;
if (a == 1) { critical_section(); }

Thread two
a = a + 1;
if (a == 1) { critical_section(); }
Atomicity
The assignment a = a + 1 is not an atomic operation. Instead, it reads the current value of a, increments that value, and stores it again. If both threads read a (still 0) before either one stores its result, both store 1, both see a == 1, and both enter the critical section.
This can cause all kinds of undesirable interaction!
Check out The Deadlock Empire (https://deadlockempire.github.io/) or our Concurrency course.
Programming with locks and threads directly is really hard.
Beyond locks and threads
Locks and threads are at the heart of any concurrent program. More recently, languages have started to offer different abstractions, often built on top of simple locks and threads:
▶ actor model/message passing
▶ software transactional memory
▶ exploiting data parallelism
▶ executing code on GPUs
These all represent different tools, specifically designed for different problems.
Erlang
Erlang was originally developed in 1986 at Ericsson, with the aim of improving the software running on telephony switches. It is specifically designed for concurrent programming with many different processes that might fail at any time.
The open-source release comes together with OTP (Open Telecom Platform), containing:
▶ Erlang compiler and interpreter;
▶ communication protocol between servers;
▶ a static analysis tool (Dialyzer);
▶ a distributed database server (Mnesia);
▶ …
Does anyone use Erlang?
▶ Social media:
  ▶ WhatsApp
  ▶ Grindr
▶ Betting websites:
  ▶ William Hill
  ▶ Bet365
▶ Telecoms:
  ▶ T-Mobile
  ▶ AT&T
▶ Much, much more…
Erlang for Haskell programmers
Erlang is:
▶ dynamically typed;
▶ strict rather than lazy;
▶ impure (there are no restrictions on effects and assignments);
▶ syntactically familiar, but slightly different.
Quicksort
qsort([]) -> [];
qsort([X|Xs]) ->
    qsort([Y || Y <- Xs, Y < X]) ++ [X] ++ qsort([Y || Y <- Xs, Y >= X]).

▶ No ‘equations’ but ‘arrows’ when pattern matching.
▶ Clauses are separated with semi-colons; a function definition is finished with a period.
▶ All Erlang variables start with capital letters; names starting with lower-case letters are atoms.
Modules
To compile the code, we need to include some module information:

-module(myModuleName).
-compile(export_all).

qsort([]) -> ...

This declares the module myModuleName and exports all its declarations.
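As an illustration, here is a minimal sketch of a complete source file. The module name sorting (and hence the file name sorting.erl) is an assumption made for this example, and it uses an explicit -export list rather than export_all.

```erlang
%% File: sorting.erl (hypothetical name; the module name must match the file name)
-module(sorting).
-export([qsort/1]).   % export qsort with arity 1 only

%% Quicksort on lists: smaller elements, then the pivot, then the rest.
qsort([]) -> [];
qsort([X|Xs]) ->
    qsort([Y || Y <- Xs, Y < X]) ++ [X] ++ qsort([Y || Y <- Xs, Y >= X]).
```

After compiling with c(sorting). in the Erlang shell, sorting:qsort([5,3,8,1,2]) evaluates to [1,2,3,5,8].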
Hello Erlang!
-module(hello).
-export([start/0]).

start() -> io:format("Hello Erlang!").

The export attribute can be used to expose certain functions: note that each entry records the function's arity (the number of arguments it expects), but not its type.
We can call the format function from the module io using io:format.
Concurrency
Concurrent Erlang programs are organized into processes: lightweight virtual machines that can communicate with other processes.
There are three important primitives to define concurrent programs:
▶ spawn creates a new process;
▶ send (written Pid ! Message) sends a message to another process;
▶ receive receives a message sent by another process.
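The three primitives can be seen together in a small sketch. The module name echo_demo and its functions are made up for this illustration: one process is spawned, receives a single message, and sends it back to the sender.

```erlang
-module(echo_demo).
-export([run/0, echo/0]).

%% Body of the spawned process: wait for one message and send it back.
echo() ->
    receive
        {From, Msg} -> From ! {self(), Msg}
    end.

%% Spawn the echo process, send it a message, and wait for its reply.
run() ->
    Pid = spawn(echo_demo, echo, []),
    Pid ! {self(), hello},
    receive
        {Pid, Reply} -> Reply
    end.
```

Calling echo_demo:run() returns the atom hello once the reply has arrived.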
Case study: a concurrent file server
As a simple example to illustrate this concurrency model, consider defining a simple concurrent file server and its client:
▶ the afile_server module waits for messages from the client;
▶ the afile_client module may send messages to the server requesting data.
Case study: the server
-module(afile_server).
-export([start/1, loop/1]).

start(Dir) -> spawn(afile_server, loop, [Dir]).

loop(Dir) -> ...

Modules and processes are a bit like classes and objects: there may be many processes running code defined in the same module.
When a new afile_server is started using start, a new process is spawned. In this example, the process is spawned by calling the function loop from the module afile_server with the argument list [Dir].
Case study: the server
loop(Dir) ->
    receive
        {Client, list_dir} ->
            Client ! {self(), file:list_dir(Dir)};
        {Client, {get_file, File}} ->
            Full = filename:join(Dir, File),
            Client ! {self(), file:read_file(Full)}
    end,
    loop(Dir).

This code waits for a message from the client (receive …). Once the message has been received and processed, it calls loop(Dir) again to receive a new message, ad infinitum.
Case study: the server
receive
    {Client, list_dir} ->
        Client ! {self(), file:list_dir(Dir)};
    {Client, {get_file, File}} ->
        Full = filename:join(Dir, File),
        Client ! {self(), file:read_file(Full)}
end,

The receive ... end expression pattern matches on the message it receives from the client.
Note: variables are written with capitals; constants (atoms) start with a lower-case letter.
Case study: the server
receive
    {Client, list_dir} ->
        Client ! {self(), file:list_dir(Dir)};
    {Client, {get_file, File}} ->
        Full = filename:join(Dir, File),
        Client ! {self(), file:read_file(Full)}
end,

Here we expect one of two commands:
▶ {Client, list_dir} – the client asks us to list the files in the directory Dir;
▶ {Client, {get_file, File}} – the client asks us to return the contents of the file File.
Case study: the server
receive
    ...
    {Client, {get_file, File}} ->
        Full = filename:join(Dir, File),
        Client ! {self(), file:read_file(Full)}

If we receive the {get_file, File} message from the client Client:
▶ compute the full file name by joining Dir and File;
▶ send a new message to the client with the file contents;
▶ the sender of this message is self().
All messages are tagged with their sender.
(The notation {X,Y} creates a tuple in Erlang.)
Case study: the client
-module(afile_client).
-export([ls/1, get_file/2]).

ls(Server) ->
    Server ! {self(), list_dir},
    receive
        {Server, FileList} -> FileList
    end.

get_file(Server, File) ->
    Server ! {self(), {get_file, File}},
    receive
        {Server, Content} -> Content
    end.
Case study: the client
This example shows that there is symmetry between the client and server. Whenever one sends a message, the other is expecting one and vice versa.
Our code is structured in different modules, each of which may correspond to one or more processes at run-time. These processes communicate through message passing.
Connecting the two
We can test our server in isolation by sending it messages from the REPL. Or start a server and a client and connect the two:

> c(afile_server).
{ok,afile_server}
> c(afile_client).
{ok,afile_client}
> FileServer = afile_server:start(".").
<0.43.0>
> afile_client:get_file(FileServer,"missingfile").
{error,enoent}
> afile_client:get_file(FileServer,"hello.txt").
{ok, <<"hello there...}
Data types
Erlang lets you define algebraic data types and declare types for your functions:

-spec plan_route(point(), point()) -> route().
-type direction() :: north | east | south | west.
-type point() :: {integer(), integer()}.
-type route() :: [{go, direction(), integer()}].

The Erlang Dialyzer is a static-analysis tool that tries to detect whether or not a program will crash.
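To show how such declarations sit next to ordinary code, here is a sketch of a module using these types. The slide gives no implementation of plan_route, so the body below (first move horizontally, then vertically), the helper leg/3, the choice of east and north as the positive directions, and the module name route_demo are all assumptions for illustration.

```erlang
-module(route_demo).
-export([plan_route/2]).
-export_type([direction/0, point/0, route/0]).

-type direction() :: north | east | south | west.
-type point()     :: {integer(), integer()}.
-type route()     :: [{go, direction(), integer()}].

%% Hypothetical implementation: walk along the x-axis, then along the y-axis.
-spec plan_route(point(), point()) -> route().
plan_route({X1, Y1}, {X2, Y2}) ->
    leg(east, west, X2 - X1) ++ leg(north, south, Y2 - Y1).

%% One straight leg: a positive distance goes in direction Pos, a negative one in Neg.
-spec leg(direction(), direction(), integer()) -> route().
leg(_Pos, _Neg, 0)           -> [];
leg(Pos, _Neg, D) when D > 0 -> [{go, Pos, D}];
leg(_Pos, Neg, D)            -> [{go, Neg, -D}].
```

For example, route_demo:plan_route({0,0}, {3,-2}) evaluates to [{go,east,3},{go,south,2}]. The types are not checked when the module is compiled; they are there for tools such as the Dialyzer and for documentation.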
The Dialyzer
When the Erlang Dialyzer warns that your program will fail, you can be sure that it will fail (soundness: there are no false alarms).
If the Dialyzer does not warn that your program will fail, you cannot be sure that it will not fail (the analysis is not complete).
It works even if you do not provide type annotations (type inference), and uses what Erlang calls success typing – assume everything will work out, but raise a warning if it cannot.
Pattern matching
Note that a pattern may contain more than one occurrence of the same variable:

eq(X,X) -> "The arguments are equal";
eq(X,Y) -> "The arguments are different".

Erlang is dynamically typed: you can tag tuples with an atom rather than introduce an algebraic data type if you wish.
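Both points can be tried out in a small sketch. The module name pattern_demo, the point tag, and the returned atoms are made up for this example. A repeated variable only matches when both values are exactly equal, so the integer 1 and the float 1.0 do not match.

```erlang
-module(pattern_demo).
-export([eq/2, is_origin/1]).

%% The second occurrence of X must be exactly equal to the first.
eq(X, X) -> equal;
eq(_, _) -> different.

%% A tuple tagged with the atom point plays the role of a constructor.
is_origin({point, 0, 0}) -> true;
is_origin({point, _, _}) -> false.
```

Here pattern_demo:eq(a, a) returns equal, while pattern_demo:eq(1, 1.0) returns different.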
Error handling
Many languages encourage defensive programming – check all possible ways a function can fail.
In Erlang, this idea is built into the language: when you call a function in an invalid way, it will crash.
But it is also part of the Erlang philosophy: Let it crash!
▶ exit(Why) terminates the current process (and notifies any connected processes that this process has terminated);
▶ throw(Why) throws an exception that the caller may want to catch;
▶ error(Why) crashes with an unexpected error.
Erlang philosophy: let it crash!
Never return a bogus value (as some other languages, such as JavaScript, do).
Fail fast and noisily and let the caller handle the error, using a try-catch block:

try FunctionOrExpression of
    Pattern1 -> Expression1;
    Pattern2 -> Expression2;
    ...
catch
    ExceptionType1:ExPattern1 -> ExExpression1;
    ...
after
    AfterExpressions
end.
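A concrete instance of this template may help. In the sketch below, the module name try_demo, the function classify, and the result tags are made up; the three exception classes throw, exit and error correspond to the three ways of failing listed earlier.

```erlang
-module(try_demo).
-export([classify/1]).

%% Run the zero-argument fun F and report how it finished.
classify(F) ->
    try F() of
        Value -> {ok, Value}
    catch
        throw:Why -> {thrown, Why};
        exit:Why  -> {exited, Why};
        error:Why -> {crashed, Why}
    after
        ok   % cleanup code would go here; its value is discarded
    end.
```

For example, try_demo:classify(fun() -> throw(oops) end) returns {thrown, oops}, and try_demo:classify(fun() -> 1 + 2 end) returns {ok, 3}.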
Erlang design principles
▶ Everything is a process.
▶ Processes are strongly isolated – they do not share memory.
▶ Process creation and destruction is a lightweight operation.
▶ Message passing is the only way for processes to interact.
▶ Processes have unique names.
▶ If you know the name of a process you can send it a message.
▶ Processes share no resources.
▶ Error handling is non-local.
▶ Processes do what they are supposed to do or fail.
Erlang philosophy: concurrency everywhere
Object-oriented programmers try to model the world using classes and objects.
Erlang takes a similar ‘philosophical position’: everything is a process; processes communicate through message passing. When a process fails, all connected processes are notified.
Erlang processes
Note: Erlang processes are not operating system processes. They are small, self-contained virtual machines running Erlang code. These processes are created (spawn) and can send (Pid ! Message) and receive (receive ... end) messages. Processes all have an associated ‘mailbox’ – even when they are not waiting to receive a message, incoming messages will be stored.
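The mailbox behaviour can be observed directly. The following sketch (module name mailbox_demo is made up) spawns a fresh process that sends itself two messages before it starts receiving, and uses the built-in process_info/2 to count the messages waiting in its mailbox.

```erlang
-module(mailbox_demo).
-export([run/0]).

run() ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() ->
        %% Send two messages to ourselves; we are not in a receive yet.
        self() ! first,
        self() ! second,
        %% Both messages are stored in the mailbox.
        {message_queue_len, N} = process_info(self(), message_queue_len),
        %% They are then received in the order in which they arrived.
        A = receive M1 -> M1 end,
        B = receive M2 -> M2 end,
        Parent ! {Ref, {N, A, B}}
    end),
    receive
        {Ref, Result} -> Result
    end.
```

Calling mailbox_demo:run() returns {2, first, second}.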
Processes vs functions
It’s trivial to turn any function into a ‘server’ handling that function:

area({rectangle, Width, Height}) -> Width * Height;
area({square, Side}) -> Side * Side.

Versus

loop() ->
    receive
        {rectangle, Width, Height} ->
            io:format("Area is ~p~n", [Width * Height]),
            loop();
        ...
Organizing processes
The lightweight nature of creating new processes raises a new design question: How do we organize our code into processes? Which processes communicate? And what messages do they send one another?
Client-server
Two processes:
▶ the client sends requests to the server;
▶ the server computes a suitable reply and sends a response to the client.
We saw this already in the file server example.
Client server
Instead of tagging all messages with process ids explicitly, we can define an auxiliary function rpc (remote procedure call):

rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        Response -> Response
    end.

This tags every request with the pid of the sender, to which the server should respond.
Updating the loop function
The branches of our loop function send messages, instead of printing to the terminal:

{From, {rectangle, Width, Height}} ->
    From ! Width * Height,
    loop();
But…
▶ If we receive an unexpected message, no branch of the receive matches: the message stays in the mailbox and its sender never gets a reply.
▶ The rpc function sends a request to the server and awaits a response. If some other process sends a message before the server responds, we may return an incorrect result.
▶ We need to explicitly spawn the server when we want to start it up.
Receiving unexpected message types
To handle unexpected messages, we can simply send an error back to the sender. We add a third branch to the loop function:

{From, Other} ->
    From ! {error, Other},
    loop()
Receiving messages from unexpected processes
We need to ensure that the response we get in the rpc function is really from the process we’re expecting:

rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} -> Response
    end.

The Pid in the receive clause is no longer a binding occurrence! It must be equal to the Pid function argument.
Any messages from a different Pid are queued. For this to work, the server must tag each reply with its own pid, for example From ! {self(), Width * Height}.
Creating a server
By adding start and area functions, we can make the interaction with the server a bit easier:

start() -> spawn(area_server, loop, []).
area(Pid, What) -> rpc(Pid, What).

> Pid = area_server:start().
<0.36.0>
> area_server:area(Pid, {rectangle, 10, 8}).
80
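Putting the pieces of the last few slides together gives the following sketch of a complete area_server module. How the fragments are combined is an assumption: in particular, the server tags every reply (including errors) with self(), so that the pid-matching rpc can recognise it, and rpc is kept private to the module.

```erlang
-module(area_server).
-export([start/0, area/2, loop/0]).

%% Start a new server process running loop/0.
start() -> spawn(area_server, loop, []).

%% Client-side interface: ask the server Pid for the area of What.
area(Pid, What) -> rpc(Pid, What).

%% Send a request tagged with our pid; accept only a reply tagged with Pid.
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Response} -> Response
    end.

%% Server loop: every reply is tagged with the server's own pid.
loop() ->
    receive
        {From, {rectangle, Width, Height}} ->
            From ! {self(), Width * Height},
            loop();
        {From, {square, Side}} ->
            From ! {self(), Side * Side},
            loop();
        {From, Other} ->
            From ! {self(), {error, Other}},
            loop()
    end.
```

With this module, area_server:area(Pid, {rectangle, 10, 8}) returns 80, and an unknown shape such as {circle, 1} yields {error, {circle, 1}} instead of leaving the client waiting.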
Process features
▶ Creating Erlang processes is very cheap – in contrast to creating new OS processes!
▶ We can add an after clause to a receive statement to time out after 10 milliseconds:

receive
    ... -> ...
after 10 ->
    ...timeout code...
end
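As a small sketch of the after clause in use (the module name timeout_demo, the function wait_for_reply and the {reply, Value} message shape are made up for this example):

```erlang
-module(timeout_demo).
-export([wait_for_reply/1]).

%% Wait at most Millis milliseconds for a {reply, Value} message.
wait_for_reply(Millis) ->
    receive
        {reply, Value} -> {ok, Value}
    after Millis ->
        timeout
    end.
```

If no {reply, Value} message is in the mailbox and none arrives in time, wait_for_reply(10) returns timeout after roughly 10 milliseconds; if one is already waiting, it returns {ok, Value} immediately.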
Receiving messages
1. When entering a receive statement, we start a timer (if the statement has an after clause);
2. Match the first message in the mailbox against the patterns, one by one. Enter the branch corresponding to the first match.
3. If no branch matches, add the message to the ‘save queue’. Try the next message until one does match.
4. If no message matches, suspend the process until a new message arrives.
5. If a new message matches, clear the timer and add the saved messages back to the mailbox in the order that they arrived.
6. If the timer elapses before a message is matched, execute the corresponding timeout expression and restore the saved messages to the mailbox.
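The steps above can be observed in a small sketch of selective receive. The module name selective_demo and the low/high message tags are made up: a fresh process sends itself three messages and first asks only for a high one, so the earlier low message is skipped and received later.

```erlang
-module(selective_demo).
-export([run/0]).

run() ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() ->
        self() ! {low, 1},
        self() ! {high, 2},
        self() ! {low, 3},
        %% {low, 1} does not match and is set aside; {high, 2} matches.
        First = receive {high, H} -> {high, H} end,
        %% The remaining messages are received in arrival order.
        Second = receive M2 -> M2 end,
        Third = receive M3 -> M3 end,
        Parent ! {Ref, [First, Second, Third]}
    end),
    receive
        {Ref, Result} -> Result
    end.
```

Calling selective_demo:run() returns [{high,2},{low,1},{low,3}].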
Error handling
In a single-threaded program, crashing is bad.
In a concurrent program, if a single process crashes, it is not the end of our program.
Instead of avoiding crashes, Erlang embraces the fact that any single process may crash.
The question is: how do we detect crashes and restore crashed processes? The focus is on cure rather than prevention.
Two slogans
Let some other process fix it
Processes are arranged to monitor one another for health. When a process crashes, the observing process is informed and takes restorative action.
Let it crash
If necessary, let any individual process crash.
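One way to let a process observe another is a monitor. The sketch below is an assumption about how these slogans could be realised, not code from the lecture: it uses the built-in spawn_monitor/1, and the module name monitor_demo and the exit reason out_of_coffee are made up.

```erlang
-module(monitor_demo).
-export([run/0]).

run() ->
    %% spawn_monitor/1 starts a process and monitors it in one step.
    {Pid, Ref} = spawn_monitor(fun() -> exit(out_of_coffee) end),
    %% When the monitored process terminates, the observer gets a 'DOWN' message.
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            {worker_crashed, Reason}
    end.
```

Calling monitor_demo:run() returns {worker_crashed, out_of_coffee}. An observer could react to the 'DOWN' message by starting a replacement process instead of just reporting the reason.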
Let it crash
▶ No need for checking arguments – just crash.
▶ No need to perform additional calculations after things have gone wrong.
▶ Clean separation of concerns: other processes handle the crash; this process does computation only.
▶ Crashing early makes it easier to diagnose what went wrong.
▶ …
Erlang: recap
Erlang is a ‘concurrency oriented language’.
Processes are (almost) as easy to work with as functions.
Processes communicate through message passing, but do not share state.
Processes are free to fail; other processes will clean up the mess.
Processes can be run on a single machine, or distributed over many different machines.
Coming up
▶ Parallel programming with Erlang
▶ Concurrent and parallel programming with Haskell
Don’t forget: intermediate project reports due after Christmas
Student assistants wanted!
Looking for active involvement in our teaching?
▶ Apply for a Teaching Assistantship!
▶ For period 3 and/or 4
▶ 383 ~ 446 euro per month (before tax)
▶ 8 hours a week (4 contact hours @UU, 4 @home)
▶ Use the form at:
https://wwwsec.cs.uu.nl/education/sollicitatie.php
▶ Deadline: January 9, 2017