Faculty of Science Information and Computing Sciences 1
Concepts of programming languages, Lecture 10, Wouter Swierstra
Project progress
Please send me a brief progress report no later than Friday! Each team will be given a 15-minute slot to present their results on 26/1 – we’ll use both the lecture and the lab slot. Hand in your code and report on Friday 27/1.
Last time
Concurrency and parallelism
▶ Why is this important?
▶ What are these two concepts?
▶ How does Erlang address these challenges?
Today
▶ Erlang’s error handling
▶ Parallel programming in Erlang
Erlang recap
Erlang is a concurrency-oriented programming language. It is cheap to spawn new processes. These processes do not share any state, but communicate through message passing: a process can send messages to any other process, and handle incoming messages using receive.
Example: area of a shape
area({square, X}) -> X * X;
area({rectangle, X, Y}) -> X * Y.

We can now test such functions:

8> test:area({rectangle,3,4}).
12
9> test:area({circle,2}).
** exception error: no function clause matching
   test:area({circle,2}) (test.erl, line 16)
Erlang
Instead of crashing, we could choose to adapt our function to try and cope with unexpected inputs:

area({square, X}) -> X * X;
area({rectangle, X, Y}) -> X * Y;
area(_) -> 0.

Now unexpected calls, such as test:area({circle, 2}), will no longer crash the process, but return a dummy result instead.

Question: Why is this a good idea? Why is it not?
Time passes…
You add many different new functions manipulating shapes. And you decide to extend the supported shapes with circles after all. At some point, we notice that things don’t work as expected for our new shapes – we are still silently producing a bogus answer. And we end up writing lots of code elsewhere to fix this.
Error handling
▶ Handling errors can account for more than 60% of the code base – it is expensive to develop and maintain.
▶ Typically, error-handling code is poorly tested, as most testing effort is spent on the success scenarios. About two-thirds of system crashes are caused by bugs in the error-handling code.
Let it crash!
The Erlang philosophy is that stopping a malfunctioning process is better than letting it continue to affect the overall system. As the system is composed of many different processes, terminating one process should (hopefully) not compromise the entire system’s integrity. Instead of trying to prevent errors, focus on how to detect and recover from errors.
Let it crash!
This sounds like a brazen strategy for error handling, but:

▶ there is no shared memory between processes;
▶ there is no mutable data.

We can isolate the failure to a single process. One user or client may observe the failed process, but the entire system should keep going.
Supervisor processes
One way to ensure that the system remains stable is through a supervisor process that detects the failure of other processes.

Crucially, when other processes fail, these failures should not cause the supervisor to fail. This is very ‘coarse-grained’ error handling at the process level (the big picture), rather than trying to fix smaller problems that might occur in individual functions.

What language features does Erlang provide for error handling at the process level?
Terminology
▶ Two processes may be linked. If one process crashes, its linked processes are notified.
▶ The set of processes linked to P is referred to as its link set.
▶ One-directional links are referred to as monitors.
▶ Communication is done through message passing or error signals. We’ve seen messages already; error signals are sent automatically to linked processes when a process terminates.
▶ Error signals are of the form {'EXIT', Pid, Why}, describing which process ended and why it ended.
▶ Any process can call exit(Why) to terminate and notify its linked processes.
Creating links
Links are created manually using the link function. The call link(Pid) links the current process to the process with id Pid. Processes may be linked with zero, one, or many other processes in this fashion.
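As a small sketch (the spawned fun is illustrative), linking means a crash in one process propagates to the linked process:

```erlang
%% Spawn a process that will crash when told to, and link to it.
Pid = spawn(fun() -> receive go -> exit(boom) end end),
link(Pid),
Pid ! go.
%% The linked process crashes; unless the current process traps
%% exits, it now terminates with reason boom as well.
```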
Error handling
When we receive an error signal from a linked process, two things may happen:

▶ if the process is not a system process, it dies and notifies all its linked processes;
▶ if the process is a system process, we can stop the propagation of the signal.

Any process can become a system process using the process_flag function; spawned processes are not system processes by default.
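A minimal sketch of becoming a system process: after trapping exits, the error signal arrives as an ordinary message instead of killing us.

```erlang
process_flag(trap_exit, true),                 %% become a system process
Pid = spawn_link(fun() -> exit(crashed) end),  %% linked process crashes
receive
    {'EXIT', Pid, Why} ->
        io:format("linked process exited: ~p~n", [Why])
end.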
Creating monitors and links
Simply calling spawn will not link the new process. There are variants spawn_link and spawn_monitor that create links/monitors.

▶ Using functions like link or monitor we can create bidirectional or unidirectional links between processes explicitly.
▶ We can destroy links using the unlink and demonitor functions.
▶ We can crash by calling exit.
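For example, a monitor delivers a 'DOWN' message rather than an exit signal; a rough sketch:

```erlang
%% spawn_monitor returns the new pid together with a monitor reference.
{Pid, Ref} = spawn_monitor(fun() -> exit(done) end),
receive
    {'DOWN', Ref, process, Pid, Reason} ->
        Reason   %% the monitoring process is notified, but does not die
end.
```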
Linking processes
We can trigger code to run when another process exits by calling on_exit(Pid, Function). That way, when a linked process dies, we can restart it or take suitable action.
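on_exit is not a built-in; one possible implementation (following the usual pattern) monitors the target process from a small helper process:

```erlang
%% Run Fun(Why) when the process Pid terminates with reason Why.
on_exit(Pid, Fun) ->
    spawn(fun() ->
        Ref = monitor(process, Pid),
        receive
            {'DOWN', Ref, process, Pid, Why} -> Fun(Why)
        end
    end).
```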
Example: division server
5> Pid = spawn(fun() -> receive N -> 1/N end end).
6> test:on_exit(Pid, fun(Why) -> io:format("***exit: ~p\n", [Why]) end).
7> Pid ! 1.
***exit: normal
1
8> Pid ! 0.
=ERROR REPORT==== 25-Apr-2012::19:57:07 ===
Error in process <0.60.0> with exit value:
  {badarith,[{erlang,'/',[1,0],[]}]}
***exit: {badarith,[{erlang,'/',[1,0],[]}]}
Live together, die together
If we want to split a problem into smaller pieces to achieve parallelism, we want the whole process to fail if the subprocesses do. This is easy to achieve by:

▶ spawning new worker processes;
▶ linking the new processes with the parent process.

If any worker process dies, all the processes die.
Eternal threads
Using functions like on_exit we can restart a process whenever it dies:

keep_alive(Fun) ->
    Pid = spawn(Fun),
    on_exit(Pid, fun(_) -> keep_alive(Fun) end).

This is particularly useful for servers that should never go down. There is one problem: upon restarting you may have a new pid – how will anyone ever communicate with the restarted server?
The process registry
There is a global process registry that lets you associate names (or rather, atoms) with pids.

▶ register(Name, Pid) – enters a process in the registry;
▶ whereis(Name) – looks up a process in the registry;
▶ unregister(Name) – removes a process from the registry.
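A brief sketch of the registry in use (the name my_server is an assumption of this example):

```erlang
Pid = spawn(fun() -> receive stop -> ok end end),
register(my_server, Pid),
Pid = whereis(my_server),   %% the name now maps to Pid
my_server ! stop.           %% send a message via the registered name
%% When a registered process dies, its name is removed automatically;
%% unregister(my_server) would remove it explicitly.
```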
Always alive divider process
divider() ->
    keep_alive(fun() ->
        register(divider, self()),
        receive N -> io:format(..., [1/N]) end
    end).

Now we can always send messages to this process, even after it crashes.

> divider ! 0.
=ERROR REPORT=== ....
> divider ! 2.
1 divided by 2 is 0.5
Process architecture
Deciding how to split up your problem across processes is not easy. It requires skill and experience – similar to splitting up code across objects, or choosing precise algebraic data types to model some domain. The good news: it is cheap (both in time and syntax) to create new processes in Erlang!
Case study: bank accounts
Suppose we want to write a process that handles bank account transactions. I’ll start with a simple file and layer error handling and recovery on top of it.
rpc and reply
I’ll use a variation of the rpc function we saw previously:

rpc(Pid, Request) ->
    Ref = make_ref(),
    Pid ! {{self(), Ref}, Request},
    receive
        {Ref, Response} -> Response
    end.

reply({ClientPid, Ref}, Response) ->
    ClientPid ! {Ref, Response}.

This tags each request with a unique value (make_ref) to ensure that each response is sent back to the right client.
account(Name, Balance) ->
    receive
        {Client, Msg} ->
            case Msg of
                {deposit, N} ->
                    reply(Client, ok),
                    account(Name, Balance+N);
                {withdraw, N} when N =< Balance ->
                    reply(Client, ok),
                    account(Name, Balance-N);
                {withdraw, N} when N > Balance ->
                    reply(Client, {error, insufficient_funds}),
                    account(Name, Balance)
            end
    end.
account(Name, Balance) ->
    receive
        {Client, Msg} ->
            case Msg of
                {deposit, N} ->
                    reply(Client, ok),
                    account(Name, Balance+N);
                {withdraw, N} when N =< Balance ->
                    reply(Client, ok),
                    account(Name, Balance-N);
        ...

▶ Upon receiving a message, reply to the client.
▶ Recursive calls update the current ‘state’.
Towards a generic server
We would like to decompose this into:

▶ a generic part that handles communication;
▶ a specific part that handles the domain logic.

This has the advantage that we can then add more complex error-handling code in the generic part, without having it appear in our domain logic.
Separating communication and logic
The server is responsible for communication:

server(State) ->
    receive
        {Client, Msg} ->
            {Reply, NewState} = handle(Msg, State),
            reply(Client, Reply),
            server(NewState)
    end.

The handle function does all the computation:

handle(Msg, Balance) ->
    case Msg of
        {deposit, N} -> {ok, Balance+N};
        ...
Separating communication and logic
Our server now calls a fixed handle function:

server(State) ->
    receive
        {Client, Msg} ->
            {Reply, NewState} = handle(Msg, State),
            ...

This is fine, but what if we want to use different handle functions for our server?
Modules in Erlang
In Erlang, we can call the function foo from the module myModule:

myModule:foo(atom, A, 3)

But myModule does not need to be known statically…

Module:foo(atom, A, 3)

will call the function foo associated with the module name bound by the variable Module.

Question: What are the pros and cons of this design?
A generic server
We can parametrize our server by the module name containing the domain logic:

server(Mod, State) ->
    receive
        {Client, Msg} ->
            {Reply, NewState} = Mod:handle(Msg, State),
            reply(Client, Reply),
            server(Mod, NewState)
    end.
Running forever
Provided our module defines a suitable initial value init, we can start up a server that will run forever:

new_server(Name, Mod) ->
    keep_alive(fun() ->
        register(Name, self()),
        server(Mod, Mod:init())
    end).
The Bank Account module
We can put all the domain logic in a separate module:

handle(Msg, Balance) ->
    case Msg of
        {deposit, N} ->
            {ok, Balance+N};
        {withdraw, N} when N =< Balance ->
            {ok, Balance-N};
        {withdraw, N} when N > Balance ->
            {{error, insufficient_funds}, Balance}
    end.

This code is entirely sequential – it doesn’t know anything about the surrounding server code. This is the only application code that you need to write.
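Wrapped up as a complete module (the module name, export list, and clause-per-request formulation are choices of this sketch), this might look like:

```erlang
-module(account).
-export([init/0, handle/2]).

init() -> 0.   %% initial balance

handle({deposit, N}, Balance) ->
    {ok, Balance + N};
handle({withdraw, N}, Balance) when N =< Balance ->
    {ok, Balance - N};
handle({withdraw, N}, Balance) when N > Balance ->
    {{error, insufficient_funds}, Balance}.
```

It could then be started with something like new_server(account, account).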
Fun with module names
The module storing the handler function is now simply a variable passed to the server when it is started:

server(Mod, State) ->
    receive
        {Client, Msg} -> ...
    end.

What if we want to update the handler code?
Fun with module names
We can extend our server with a new branch to do just that!

server(Mod, State) ->
    receive
        {Client, {new_code, NewMod}} ->
            reply(Client, ok),
            server(NewMod, State);
        {Client, Msg} ->
            ...
    end.

We can hot-swap new code into our server with zero downtime, without losing the server state.
Robustness
The handle function processes several different kinds of requests. The server passes the clients’ requests to the handle function. But what happens when a client makes an invalid request – such as close_account? The server passes it on to the bank account module, which raises an exception, which crashes the server. Let’s pass the error on to the client instead.
server(Mod, State) ->
    receive
        {Client, Msg} ->
            {Reply, NewState} = Mod:handle(Msg, State),
            reply(Client, Reply),
            server(Mod, NewState)
    end.

This was our server code so far.
server(Mod, State) ->
    receive
        {Client, Msg} ->
            case catch Mod:handle(Msg, State) of
                {'EXIT', Reason} -> ...;
                {Reply, NewState} ->
                    reply(Client, {ok, Reply}),
                    server(Mod, NewState)
            end
    end.

We can catch exceptions raised by our handle code. What should we do when this happens?
case catch Mod:handle(Msg, State) of
    {'EXIT', Reason} ->
        reply(Client, {crash, Reason}),
        server(Mod, State)
    ...

If the call to handle throws an exception, we tell the client to crash and continue with the old state.
Revising the rpc function
Of course, we need to update the rpc function to handle the crash message:

rpc(Name, Msg) ->
    ...
    receive
        {Ref, {crash, Reason}} -> exit(Reason);
        {Ref, {ok, Reply}} -> Reply
    end.
Transaction semantics
This server now has transaction semantics:

▶ each request is handled atomically; different requests are never interleaved;
▶ an invalid request crashes the client, but leaves the server state unaffected;
▶ other clients are not affected by such an invalid request.
Advantages of Erlang
The server code fits on a slide, yet it can:

▶ hot-swap new code with zero downtime without losing server state;
▶ implement transactional semantics.

This is really hard in most languages, but was easy in Erlang:

▶ Our server function was pure: all the state is captured in a single value;
▶ All the state updates are done by a single handler function.
Real servers
The Erlang library provides a more realistic gen_server behaviour, which handles:

▶ initialization;
▶ handle_call – messages expecting a reply;
▶ handle_cast – messages not expecting a reply;
▶ handle_info – timeouts or unexpected messages;
▶ termination;
▶ code_change – hot-swapping code.

And on top of that, various ways to trace, log, monitor, etc. the server and client processes. But the key concepts are in our little server function.
Writing distributed programs
So far, we have used functions such as spawn, link, or receive to communicate between processes. But where are these processes run? This could be on the same machine. But not necessarily! We can distribute our processes across many different physical machines.
Distributed programming
▶ Performance – more hardware means faster solutions.
▶ Reliability – less prone to a single machine failing.
▶ Scalability – we can add new machines without rewriting the architecture of our application.
▶ Intrinsic distribution – if you’re writing an MMOG/chat application/… you want to run code across different machines.
Flavours of distributed Erlang
Erlang code is organized in nodes, each with its own set of associated processes, address space, virtual machine, etc. We can run many Erlang nodes on a single machine, or run many Erlang nodes across different machines. This requires some trust – any node can perform any operation on any other Erlang node.

Instead, separate nodes can communicate through TCP/IP sockets. Or mix both communication flavours.
Easy to distribute
One of the key advantages of Erlang is that it is easy to write distributed code gradually:

▶ Start with many processes running on a single node on a single machine.
▶ Add new Erlang nodes on the same machine.
▶ Add new machines on the same local network.
▶ Add new machines anywhere on the internet.

Each step requires very little modification to the existing code.
Parallelism through message passing
We have seen how Erlang’s concurrency works very well for certain classes of problems. What about parallelism? How can we speed up a (sequential) computation using message passing?
Sequential quicksort
Previously we saw a sequential version of quicksort:

qsort([]) -> [];
qsort([X|Xs]) ->
    qsort([Y || Y <- Xs, Y < X]) ++ [X] ++
    qsort([Y || Y <- Xs, Y >= X]).

Why not spawn the recursive call in a separate process?
Parallel quicksort
pqsort([]) -> [];
pqsort([X|Xs]) ->
    spawn_link(fun() ->
        pqsort([Y || Y <- Xs, Y >= X])
    end),
    pqsort([Y || Y <- Xs, Y < X]) ++ [X] ++ ???.

It’s easy enough to spawn a process. Question: how do we get the result back?
Parallel quicksort
pqsort([]) -> [];
pqsort([X|Xs]) ->
    Parent = self(),
    spawn_link(fun() ->
        Parent ! pqsort([Y || Y <- Xs, Y >= X])
    end),
    pqsort([Y || Y <- Xs, Y < X]) ++ [X] ++
    receive Ys -> Ys end.

We spawn a new process that will send a message back to us with the second half of the list. Upon receiving that message, we are done sorting the list.
Benchmarking
Once we start benchmarking our implementation, we see that our parallel quicksort is slower – even on multi-core CPUs. Question: Why might that be? The overhead for spawning new processes is small, but can still outweigh the advantages of multiple CPUs. We need to control the granularity of our parallelism.
Controlling granularity
psort(Xs) -> pqsort(5, Xs).

pqsort(0, Xs) -> qsort(Xs);
pqsort(D, []) -> ...;
pqsort(D, [X|Xs]) ->
    ...
    spawn_link(fun() ->
        Parent ! pqsort(D-1, [Y || Y <- Xs, Y >= X])
    end),
    pqsort(D-1, [Y || Y <- Xs, Y < X]) ...

We can control the depth of the parallelism manually.
Correctness?
Unfortunately, this simple definition of parallel sorting is not correct. As we make recursive calls, we spawn new processes to sort sublists. But these processes may return their results in a different order. We need to tag our messages with references to disambiguate them.
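One way to do this, sketched on top of the depth-limited version (variable names are choices of this sketch), is to pair each reply with a fresh reference:

```erlang
pqsort(0, Xs) -> qsort(Xs);
pqsort(_, []) -> [];
pqsort(D, [X|Xs]) ->
    Parent = self(),
    Ref = make_ref(),                 %% unique tag for this reply
    spawn_link(fun() ->
        Parent ! {Ref, pqsort(D-1, [Y || Y <- Xs, Y >= X])}
    end),
    Smaller = pqsort(D-1, [Y || Y <- Xs, Y < X]),
    receive
        {Ref, Greater} -> Smaller ++ [X] ++ Greater
    end.
```

The receive now matches only the message carrying this call’s own reference, so replies from other spawned sorters can no longer be confused with it.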
No shared memory
Remember, Erlang processes do not share any memory. In the call

spawn_link(fun() ->
    Parent ! pqsort(D-1, [Y || Y <- Xs, Y >= X])
end),

we copy all of Xs to the new process.

Question: Can we do better?
No shared memory
We can copy only the larger elements instead:

Grtr = [Y || Y <- Xs, Y >= X],
spawn_link(fun() ->
    Parent ! pqsort(D-1, Grtr)
end),

This reduces the amount of data copied between processes.
Distributed sorting
We can even go so far as to distribute our sorting algorithm across different machines. Communication is slower – but our algorithm already doesn’t rely on any shared memory, so it requires little adaptation. But getting good speedups requires quite some work to manage the tasks done by every machine and to minimize communication overhead.
Today
▶ Erlang’s error handling
▶ Parallel programming in Erlang