CPL 2016, week 9
Erlang fault tolerance and distributed programming
Oleg Batrashev
Institute of Computer Science, Tartu, Estonia
April 4, 2016
Overview
Previous week
◮ Erlang: functional core and agents
Today
◮ Erlang fault tolerance and distributed programming
Next weeks
◮ Clojure language and asynchronous Javascript
Fault tolerance - General ideas
◮ system process – handles exit signals from linked processes
    ◮ traps exit messages {'EXIT',Pid,Why}
    ◮ default: a usual process dies if Why ≠ normal
◮ link – connection between 2 processes
    ◮ symmetric, must be set explicitly
◮ exit signal – sent to the set of linked processes
    ◮ when a process dies: {'EXIT',B,Why}
    ◮ when a process finishes: {'EXIT',B,normal}
Diagram: process B sends the exit signal {'EXIT',B,Why} to the linked process A
Fault tolerance - Links
◮ a link defines an error propagation path between 2 processes
◮ if one dies then the other gets an exit signal
◮ links are established with
    ◮ link(B) or
    ◮ spawn_link(Fun)
Fault tolerance - Signals
Exit signal
◮ generated when a process dies or finishes
◮ {'EXIT',Pid,Why}
    ◮ Why=normal if a process just finishes (i.e. recursion ends)
    ◮ Why=<exception desc> if there was a problem
    ◮ a process may call exit(Why) to stop itself
◮ sent to all linked processes
Faking death: exit(Pid2, Why)
◮ sends {'EXIT',Pid,Why} to process Pid2, where Pid is the caller
◮ the caller continues execution
Fault tolerance - System processes
◮ usual process
    ◮ dies if it receives an exit signal from any linked process where Why ≠ normal
◮ system process
    ◮ set with process_flag(trap_exit,true)
    ◮ traps exit signals from linked processes
    ◮ messages {'EXIT',Pid,Why} are added to its mailbox
◮ exit signals with Why=kill are not caught at all!
    ◮ the process is killed, even a system process
    ◮ {'EXIT',Pid,killed} is broadcast to all linked processes (notice that kill is propagated as killed)
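The two behaviours above can be tried out with a small module. This is only a sketch: the module name exit_demo and the message tags trapped and observed are made up for illustration.

```erlang
-module(exit_demo).
-export([trapped/0, killed/0]).

%% A system process receives the exit signal of a linked process
%% as an ordinary message instead of dying.
trapped() ->
    Parent = self(),
    spawn(fun() ->
        process_flag(trap_exit, true),
        Child = spawn_link(fun() -> exit(boom) end),
        receive
            {'EXIT', Child, Why} -> Parent ! {trapped, Why}
        end
    end),
    receive {trapped, _} = Msg -> Msg after 1000 -> timeout end.

%% exit(Pid, kill) cannot be trapped: the victim dies even though it
%% traps exits, and its linked processes see the reason killed.
killed() ->
    Parent = self(),
    spawn(fun() ->
        process_flag(trap_exit, true),
        Victim = spawn_link(fun() ->
            process_flag(trap_exit, true),
            receive never -> ok end
        end),
        exit(Victim, kill),
        receive
            {'EXIT', Victim, Why} -> Parent ! {observed, Why}
        end
    end),
    receive {observed, _} = Msg -> Msg after 1000 -> timeout end.
```

After c(exit_demo), calling exit_demo:trapped() should give {trapped,boom} and exit_demo:killed() should give {observed,killed}.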
Fault tolerance - Example (1)
    on_exit(Pid, Fun) ->
        spawn(fun() ->
                  process_flag(trap_exit, true),
                  link(Pid),
                  receive
                      {'EXIT', Pid, Why} -> Fun(Why)
                  end
              end).
◮ creates a process that “monitors” the process with the given Pid
◮ upon exit calls the given function Fun
Fault tolerance - Example (2)
    F = fun() -> receive X -> list_to_atom(X) end end.
    Pid = spawn(F).
    on_exit(Pid,
            fun(Why) -> io:format("~p died with ~p~n", [Pid, Why]) end).
◮ create a process that transforms lists to atoms
◮ add an error handler with on_exit
◮ now sending
    Pid ! hello.
◮ results in
    <0.61.0> died with:{badarg,[{· · ·
Fault tolerance - Summary of exit signals
What happens if
◮ a process with the given trap_exit setting (i.e. system or not)
◮ receives the given exit signal

trap_exit  Exit signal  Action
true       kill         Die: broadcast the exit signal killed to the link set
true       X            Add {'EXIT',Pid,X} to the mailbox
false      normal       Continue: do nothing, signal vanishes
false      kill         Die: broadcast the exit signal killed to the link set
false      X            Die: broadcast the exit signal X to the link set
Fault tolerance - Idioms for trapping exits
1. Don't care about the new process
    Pid = spawn(fun() -> ... end)
2. Want to die if the new process dies
    Pid = spawn_link(fun() -> ... end)
3. Want to handle errors if the new process dies
    ...
    process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> ... end),
    loop(...).

    loop(State) ->
        receive
            {'EXIT', SomePid, Reason} ->
                %% do something with the error
                loop(State1);
            ...
        end.
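Idiom 3 can be turned into a tiny restart loop. The following is a sketch under assumptions, not part of OTP: the module name keeper, the function start/2 and the retry limit are invented for illustration.

```erlang
-module(keeper).
-export([start/2]).

%% Run Fun in a linked worker from inside a system process.
%% Restart the worker after an abnormal exit, at most Retries times.
%% Returns {ok, Restarts} or {gave_up, Reason}.
start(Fun, Retries) ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() ->
        process_flag(trap_exit, true),
        Parent ! {Ref, loop(Fun, spawn_link(Fun), 0, Retries)}
    end),
    receive {Ref, Result} -> Result after 5000 -> timeout end.

loop(Fun, Pid, Restarts, Retries) ->
    receive
        {'EXIT', Pid, normal} ->
            %% worker finished on its own
            {ok, Restarts};
        {'EXIT', Pid, _Reason} when Restarts < Retries ->
            %% handle the error by starting a fresh worker
            loop(Fun, spawn_link(Fun), Restarts + 1, Retries);
        {'EXIT', Pid, Reason} ->
            {gave_up, Reason}
    end.
```

For example, keeper:start(fun() -> exit(bad) end, 2) restarts the worker twice and then returns {gave_up,bad}. OTP supervisors provide the production version of this idea.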
Distributed programming - Overview
◮ In a trusted environment – allow any code to be run remotely
    ◮ Distributed Erlang – all message passing and error handling work automatically
◮ In a non-trusted environment – restrict what can be run
    ◮ lib_chan library – RPC-like, Erlang specific, automates serialization of Erlang objects
    ◮ socket based programming – low-level, but can be used with other languages
◮ Binary data manipulation in Erlang – allows encoding/decoding messages for socket based programming
Distributed programming: Erlang specific
Outline
◮ Fault tolerance
◮ Distributed programming
    ◮ Erlang specific
        ◮ Distributed Erlang
        ◮ Chan library
    ◮ Binary data manipulation
    ◮ Socket programming
◮ Erlang support libraries
Distributed programming: Erlang specific - Distributed Erlang
Erlang node
◮ an Erlang node is a separate Erlang VM
    ◮ may be run on the same or a different host
◮ if it fails/exits then other VMs are not directly affected
◮ all communication between two or more nodes is transparent
◮ the programmer does not see a difference
    ◮ except when creating processes or making explicit remote calls
    ◮ values (numeric, tuples, etc) are copied
    ◮ agents are provided with proxies to communicate
Distributed programming: Erlang specific - Distributed Erlang
Running nodes
◮ run erl with the argument -sname <nodename>
    ◮ may run any code from another local node
◮ run erl with the argument -name <nodename@host>
    ◮ may run on a local or remote network
    ◮ use the same version of the code
    ◮ nodes must have the same cookie: -setcookie <abc>
    ◮ make sure the Erlang Port Mapper Daemon port (4369) is available
    ◮ choose a range of ports to be used (-kernel inet_dist_listen_min <port>)
Distributed programming: Erlang specific - Distributed Erlang
Example
◮ rmt.erl
    -module(rmt).
    -export([runme/0]).
    runme() -> io:format("Running on ~p~n", [node()]).
◮ Run node a
    erl -sname a
    1> c(rmt).
    2> rmt:runme().
◮ Run node b
    erl -sname b
    1> rpc:call(a@mycomputer, rmt, runme, []).
    Running on a@mycomputer
◮ replace mycomputer with your host name
Distributed programming: Erlang specific - Distributed Erlang
Distribution primitives
◮ built in rpc module
    rpc:call(Node, Mod, Function, Args) -> Result | {badrpc, Reason}
◮ built in global module
◮ extended spawn functions, etc
    spawn(Node, Fun) -> Pid
    spawn(Node, Mod, Func, ArgList) -> Pid
    spawn_link(Node, Fun) -> Pid
    spawn_link(Node, Mod, Func, ArgList) -> Pid
    disconnect_node(Node) -> bool() | ignored
    node() -> Node
    node(Arg) -> Node
    nodes() -> [Node]
    Pid ! Msg
    {RegName, Node} ! Msg
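The primitives can be tried without a second node, because they also accept the local node name. The sketch below assumes a single node; the module name dist_demo is made up. On a real cluster, Here would be replaced by a remote name such as a@mycomputer from the earlier example.

```erlang
-module(dist_demo).
-export([local/0]).

%% Exercise rpc:call/4, spawn/2 and node/1 against the local node.
local() ->
    Here = node(),
    %% remote procedure call, here addressed to ourselves
    Sum = rpc:call(Here, lists, sum, [[1, 2, 3]]),
    Parent = self(),
    %% spawn on a named node; the new process reports where it runs
    Pid = spawn(Here, fun() -> Parent ! {from, node()} end),
    Where = receive {from, N} -> N after 1000 -> timeout end,
    {Sum, node(Pid) =:= Here, Where =:= Here}.
```

The call dist_demo:local() should return {6,true,true}: the sum computed through rpc, and two checks that the spawned process lives on the addressed node.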
Distributed programming: Binary data manipulation
Binary data
◮ used with data from external programs/sources
◮ double bracket syntax to define binaries (arrays)
    2> R=31,G=0,B=0.
    3> Color = <<R:5, G:6, B:5>>.
    <<248,0>>
    ◮ Integer:NumOfBits
    ◮ tricky with endianness (big or little)
◮ shorthand for strings
    4> <<64,65,66>>.
    <<"@AB">>
◮ IoList is a list of integers (0..255), binaries, or IoLists
    5> Lst = [<<1,2,3>>,4,[<<5,6>>,7],8].
    ◮ convenient, if we want to defer combining into the final binary
    6> list_to_binary(Lst).
    <<1,2,3,4,5,6,7,8>>
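The same 5-6-5 bit layout works in both directions: as a constructor and as a pattern. A minimal sketch, where the module name rgb565 and its two functions are made up for illustration:

```erlang
-module(rgb565).
-export([pack/3, unpack/1]).

%% Pack three colour components into 16 bits (5 + 6 + 5).
pack(R, G, B) ->
    <<R:5, G:6, B:5>>.

%% The same layout used as a pattern extracts the components again.
unpack(<<R:5, G:6, B:5>>) ->
    {R, G, B}.
```

rgb565:pack(31,0,0) gives <<248,0>> as in the shell session above, and rgb565:unpack(<<248,0>>) gives {31,0,0}.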
Distributed programming: Binary data manipulation
BIFs for binaries
BIF – built-in function in Erlang
◮ list_to_binary(IoList) -> binary()
    ◮ many IO functions accept IoLists, so no need to transform
◮ split_binary(Bin,Pos) -> {Bin1,Bin2}
    ◮ pattern matching for binaries may be more convenient
    7> <<Z:3/binary, Rest/binary>> = <<1,2,3,4,5,6,7,8>>.
    <<1,2,3,4,5,6,7,8>>
    8> Z.
    <<1,2,3>>
    9> Rest.
    <<4,5,6,7,8>>
◮ term_to_binary(Term) -> Bin
◮ binary_to_term(Bin) -> Term
◮ size(Bin) -> Int
Distributed programming: Binary data manipulation
The bit syntax
Full syntax for binaries
◮ <<E1,E2,...,En>> where Ei are the elements
◮ each element has one of the forms
    Ei = Value | Value:Size | Value/TypeSpecifierList | Value:Size/TypeSpecifierList
◮ where TypeSpecifierList is a hyphen-separated list of
    ◮ endianness: big | little | native
    ◮ sign: signed | unsigned
    ◮ type: integer | float | binary
◮ Size is in bits for integer/float and in bytes for binaries
    13> <<3:16/big-unsigned-integer, 6,7>>.
    <<0,3,6,7>>
    14> <<3:16/integer, 6,7, 2.1415:64/float>>.
    <<0,3,6,7,64,1,33,202,192,131,18,111>>
    15> <<A:2/binary, B:16/little-integer>> = <<1,2,3,0>>.
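To see what the endianness specifier changes, the last match can be repeated with both byte orders. A small sketch; the module name endian_demo and the function parse/1 are made up for illustration:

```erlang
-module(endian_demo).
-export([parse/1]).

%% Split a 4-byte binary into a 2-byte binary and a 16-bit integer,
%% reading the integer once as little-endian and once as big-endian.
parse(<<A:2/binary, Little:16/little-integer>> = Bin) ->
    <<_:2/binary, Big:16/big-integer>> = Bin,
    {A, Little, Big}.
```

endian_demo:parse(<<1,2,3,0>>) returns {<<1,2>>,3,768}: the bytes 3,0 are 3 when the low byte comes first and 3*256 = 768 when the high byte comes first.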
Distributed programming: Binary data manipulation
Macros
◮ Erlang has macros that allow defining constants
    -define(BYTE, 8/signed-big-integer).
    -define(INT, 32/signed-big-integer).
    -define(LONG, 64/signed-big-integer).
◮ use them in the code with a question mark
    <<DLen:?INT, Salt:DLen/binary>> = Content,
    {params, binary_to_list(Salt)};
◮ notice how the length of the binary is read first from the same stream
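The length-prefix trick can be packaged as an encode/decode pair. This is a sketch only: the module name frame and the return values {ok, Payload, Rest} and more are assumptions for illustration, while the INT macro is the one defined above.

```erlang
-module(frame).
-export([encode/1, decode/1]).

-define(INT, 32/signed-big-integer).

%% Prefix the payload with its length in bytes.
encode(Payload) when is_binary(Payload) ->
    <<(byte_size(Payload)):?INT, Payload/binary>>.

%% Len is bound by the first segment and used as the size of the second.
decode(<<Len:?INT, Payload:Len/binary, Rest/binary>>) ->
    {ok, Payload, Rest};
decode(_Incomplete) ->
    %% not enough bytes yet for a whole frame
    more.
```

frame:decode(frame:encode(<<"hi">>)) gives {ok,<<"hi">>,<<>>}, and a truncated frame such as <<0,0,0,5,"hi">> gives more.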
Distributed programming: Socket programming
Sockets
Erlang gen_tcp library
◮ gen_tcp:connect(Host,Port,Options) -> {ok, Socket}
◮ list of options, some possible:
    ◮ binary – open in binary mode
    ◮ {packet, Len} – Len is the number of bytes before each packet that define the length of the packet; Erlang splits the stream and delivers whole packets
        ◮ 0 means deliver the stream unchanged
◮ the process that created the socket is the controlling process
    ◮ process exit closes the socket
    ◮ data from the socket is delivered as a {tcp,Socket,Bin} message
    ◮ on socket close the process gets {tcp_closed,Socket}
Distributed programming: Socket programming
Parallel server
    start_parallel_server() ->
        {ok, Listen} = gen_tcp:listen(...),
        spawn(fun() -> par_connect(Listen) end).

    par_connect(Listen) ->
        {ok, Socket} = gen_tcp:accept(Listen),
        spawn(fun() -> par_connect(Listen) end),
        loop(Socket).

    loop(...) -> ...
◮ be cautious about which process is the controlling process
    ◮ spawn a new process to accept the next connection
Distributed programming: Socket programming
Control issues
Different regimes for socket reception
◮ active – process gets {tcp, Socket, Bin} messages
    ◮ may be flooded with messages, but non-blocking
◮ passive – process must call gen_tcp:recv(Socket,N)
    ◮ may block the server (client) if buffers are empty (full)
◮ mixed – active for a single message
    ◮ create the socket with {active,once}
    ◮ re-enable after each message with inet:setopts(Socket, [{active,once}])
    ◮ best of two worlds
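The mixed regime can be sketched as a one-connection echo server on the loopback interface. Assumptions: the module name echo_once, the function roundtrip/1, the 4-byte packet header and the timeouts are all chosen for illustration. The client side uses the passive regime.

```erlang
-module(echo_once).
-export([roundtrip/1]).

%% Start an echo server on a free port, send Msg, return the reply.
roundtrip(Msg) when is_binary(Msg) ->
    Opts = [binary, {packet, 4}, {active, false}],
    {ok, Listen} = gen_tcp:listen(0, Opts),
    {ok, Port} = inet:port(Listen),
    %% the accepting process becomes the controlling process of Socket
    spawn(fun() ->
        {ok, Socket} = gen_tcp:accept(Listen),
        serve(Socket)
    end),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port, Opts),
    ok = gen_tcp:send(Client, Msg),
    {ok, Reply} = gen_tcp:recv(Client, 0, 2000),   % passive receive
    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(Listen),
    Reply.

%% Mixed regime: ask for exactly one message, handle it, ask again.
serve(Socket) ->
    inet:setopts(Socket, [{active, once}]),
    receive
        {tcp, Socket, Bin} ->
            gen_tcp:send(Socket, Bin),
            serve(Socket);
        {tcp_closed, Socket} ->
            ok
    after 5000 ->
        gen_tcp:close(Socket)
    end.
```

echo_once:roundtrip(<<"ping">>) should return <<"ping">>. Because of {packet,4}, each send is framed with a 4-byte length, so the receiver always gets whole messages.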
Erlang support libraries - List of Erlang libraries
◮ ETS and DETS: large data storage mechanisms
    ◮ ETS in memory, DETS on disk
    ◮ key-value; mutable unlike Erlang core!
    ◮ table types: sets, ordered sets, bags, duplicate bags
◮ OTP (Open Telecom Platform) is like J2EE to Java
    ◮ gen_server – transaction and hot-swap
    ◮ supervision trees (erl -man supervisor)
◮ Mnesia: the Erlang database
    ◮ relational, replication
◮ Crypto library, e.g. to make a SHA-1 hash
    ◮ Digest = crypto:hash(sha, Salt++"mypassword")
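The mutability of ETS is easy to observe in a set table, where a second insert with the same key replaces the first. A minimal sketch; the module name ets_demo and the table name colors are made up for illustration.

```erlang
-module(ets_demo).
-export([run/0]).

%% Insert twice under the same key into a set table and look it up.
run() ->
    T = ets:new(colors, [set]),
    true = ets:insert(T, {red, 1}),
    true = ets:insert(T, {red, 2}),   % overwrites the previous tuple
    Found = ets:lookup(T, red),
    Missing = ets:lookup(T, blue),
    true = ets:delete(T),             % drop the whole table
    {Found, Missing}.
```

ets_demo:run() returns {[{red,2}],[]}: the table was updated in place, unlike an immutable Erlang term.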