Erlang: Developing Scalable Systems
Rise of collaborative systems
Collaborative computing
- Sequential systems cannot scale; we need more concurrency.
- Parallel code can scale.
- Amdahl's law.
- Don't just scale up; scale out.
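Amdahl's law puts a number on how far parallel code can scale: if a fraction P of the work can run in parallel on N processors, the best possible speedup is S(N) = 1 / ((1 - P) + P / N). The sketch below is only an illustration of that formula; the module name `amdahl` and the function `speedup/2` are made up for this example.

```erlang
-module(amdahl).
-export([speedup/2]).

%% speedup(P, N): theoretical speedup by Amdahl's law.
%%   P = fraction of the work that can run in parallel (0..1)
%%   N = number of processors (integer, at least 1)
speedup(P, N) when is_number(P), P >= 0, P =< 1, is_integer(N), N >= 1 ->
    1 / ((1 - P) + P / N).
```

For example, `amdahl:speedup(0.9, 10)` is a little above 5: even with 90% of the work parallelised, ten cores give roughly a fivefold gain, because the sequential 10% dominates.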
Hitting the performance bottleneck
- Clock speed is not increasing.
- Not much gain in system performance.
- Demand for concurrency in applications.
- Demand for fault-tolerant technologies.
- Natural progression of existing technologies.
- Reaching the limits of sequential programming.
[Figure: trends in transistors, clock speed, power, and performance per clock.]
Scalability
Scalability is the ability of a system, network, or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth.
Scalability:
- 1. Distributed-ness of applications.
- 2. Fault tolerance and failure management.
- 3. Concurrency.
- 4. Performance.
So:
- 1. Go parallel.
- 2. Go asynchronous.
Background
Erlang is a programming language designed for developing robust systems of programs that can be distributed among different computers in a network. It is a functional, actor-based programming language. Named for the Danish mathematician Agner Krarup Erlang, the language was developed by the Ericsson Computer Science Lab to build software for its own telecommunication products.
Erlang has been used successfully in production systems for over 20 years (with reported uptimes of nine nines, which is about 31 ms of downtime a year). Ericsson itself has used Erlang extensively for many projects of varying sizes, both commercial and internal.
More on background
The AXD301 ATM switch, one of Ericsson's flagship products, may be the largest Erlang project in existence, at over 1.1 million lines of Erlang.
Guiding principles
- Handling a very large number of concurrent activities.
- Actions to be performed at a certain point in time or within a certain period of time.
- Systems distributed over several computers.
- Interaction with hardware.
- Very large software systems.
- Complex functionality such as feature interaction.
- Continuous operation over several years.
- Software maintenance (reconfiguration, etc.) without stopping the system.
- Stringent quality and reliability requirements.
- Fault tolerance both to hardware failures and software errors.
So what Erlang is not
- It is not an object-oriented programming language.
- No shared state.
- No concept of threads.
- No dedicated string type (strings are stored as lists of integers).
- No built-in looping constructs such as while or for.
- No C-style if .. then .. else statement (Erlang's if and case are expressions driven by guards and pattern matching).
Then what Erlang is
- A functional and concurrent programming language.
- Lightweight processes.
- Message passing (between processes).
- Recursion and tail recursion for looping.
- Pattern matching and guards for condition evaluation.
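As a rough illustration of looping by tail recursion and branching by guards, here is a minimal sketch. The module `loops_demo` and its functions are hypothetical names chosen for this example.

```erlang
-module(loops_demo).
-export([sum/1, classify/1]).

%% Public entry point: start the "loop" with an accumulator of 0.
sum(List) -> sum(List, 0).

%% Tail-recursive helper: the recursive call is the last thing done,
%% so the loop runs in constant stack space.
sum([], Acc) -> Acc;
sum([Head | Tail], Acc) -> sum(Tail, Acc + Head).

%% Guards and pattern matching take the place of if/else chains.
classify(N) when is_integer(N), N < 0 -> negative;
classify(0) -> zero;
classify(N) when is_integer(N), N > 0 -> positive;
classify(_) -> not_an_integer.
```

The clause that matches first wins, so the order of the `classify/1` clauses plays the role of the order of branches in an if/else chain.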
Erlang , brief overview
Constructs for sequential functional code.
- Variables, numbers, tuples, lists, atoms, functions, guards, records, BIFs (built-in functions).
Constructs for concurrent code.
- Processes/actors, OTP (Open Telecom Platform).
Inbuilt database / storage
- Mnesia, ETS (Erlang Term Storage), DETS (disk-based ETS).
Erlang Virtual Machine (on top of the underlying operating system)
- Memory management, portability, concurrency, garbage collection.
- Schedulers, I/O model (event based).
Erlang , variables and atoms …
- Variables
- In Erlang, variables are immutable.
- Once bound, a variable cannot be reassigned.
- Variable names start with a capital letter.
e.g. MyVariable = 54. MyVariable = 55. %% the second match fails with a badmatch error.
- Atoms
- They start with a lowercase letter.
- They must be enclosed in single quotes if they do not begin with a lowercase letter.
- They are literals whose value is their own name.
- They are constants that cannot be changed.
e.g. myatom, 'MyAtom'
- Tuples
- It is a way to organize data.
- They can contain any kind of data.
- Their structure cannot be altered once assigned.
e.g. Values = {10, 14}.
Erlang, tuples and lists
e.g. Values = {10, 14}. {Orange, Apple} = Values. %% pattern matching binds Orange to 10 and Apple to 14.
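The binding rules above can be tried out with a small module. This is only a sketch; `binding_demo` and its function names are invented for illustration.

```erlang
-module(binding_demo).
-export([rebind/1, unpack/0, atoms/0]).

%% Once MyVariable is bound to 54, "MyVariable = NewValue" is a match,
%% not an assignment: it succeeds only if NewValue is also 54.
rebind(NewValue) ->
    MyVariable = 54,
    try MyVariable = NewValue of
        _ -> {ok, MyVariable}
    catch
        error:{badmatch, Rejected} ->
            {error, {already_bound, MyVariable, Rejected}}
    end.

%% Destructure a tuple by pattern matching.
unpack() ->
    Values = {10, 14},
    {Orange, Apple} = Values,
    [{orange, Orange}, {apple, Apple}].

%% Quoted and unquoted forms are both atoms; 'myatom' and myatom are equal.
atoms() ->
    {is_atom(myatom), is_atom('MyAtom'), myatom =:= 'myatom'}.
```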
- Lists and list comprehensions
- Used as a collection.
- Built-in pattern matching on head and tail.
e.g. [Head | Tail] = [1,2,3,4,5]. %% Head is 1 and Tail is [2,3,4,5].
e.g. [X || X <- [1,2,3,4,5,6,7,8,9,10], X rem 2 =:= 0]. %% evaluates to [2,4,6,8,10].
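The two list idioms can be wrapped in functions as follows. This is a minimal sketch; `lists_demo`, `split/1`, and `evens/1` are hypothetical names.

```erlang
-module(lists_demo).
-export([split/1, evens/1]).

%% Head/tail pattern matching in a function head.
split([Head | Tail]) -> {Head, Tail};
split([]) -> empty.

%% List comprehension: a generator (X <- List) plus a filter.
evens(List) -> [X || X <- List, X rem 2 =:= 0].
```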
Erlang , functions and anonymous functions…
Functions
- Higher order functions.
- Can be passed around as Arguments.
- e.g. arg1() -> 1. arg2() -> 2. sum(X, Y) -> X() + Y(). %% mymod:sum(fun mymod:arg1/0, fun mymod:arg2/0) returns 3.
- Can be called recursively.
- Anonymous functions (funs).
- Similar in functionality to a normal function.
- Cannot refer to themselves by name, so they cannot recurse directly unless written as named funs (fun Name(Args) -> ... end, available since Erlang/OTP 17).
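A compilable version of the `mymod` example, extended with an anonymous fun and a named fun, might look like this. `sum_with_funs/0` and `factorial/1` are additions made for illustration and are not part of the original slide.

```erlang
-module(mymod).
-export([arg1/0, arg2/0, sum/2, sum_with_funs/0, factorial/1]).

arg1() -> 1.
arg2() -> 2.

%% Higher-order function: X and Y are funs that are called here.
sum(X, Y) -> X() + Y().

%% Mix an anonymous fun with a reference to an exported function.
sum_with_funs() ->
    Ten = fun() -> 10 end,
    sum(Ten, fun mymod:arg2/0).

%% A named fun (OTP 17 or later) can call itself through its own name.
factorial(N) when is_integer(N), N >= 0 ->
    Fact = fun F(0) -> 1;
               F(K) -> K * F(K - 1)
           end,
    Fact(N).
```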
Guards
Erlang , storage (ETS)
- Large key-value look up tables.
- ETS is memory resident.
- ETS stores tuples.
- Data stored is transient (cannot survive system crash).
- ETS data are not garbage collected.
- By default, the first element of the tuple is the key of the table.
Basic operations on ETS are:
- 1. Create a new table or open an existing table.
- 2. Insert a tuple or tuples into the table.
- 3. Look up a tuple in the table.
- 4. Dispose of/delete a table.
Types of table:
- Sets
- Ordered sets
- Bags
- Duplicate bags
[Figure: on insert, a tuple is copied from process memory into the ETS table.]
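The four basic operations can be sketched with the standard `ets` module. The module name `ets_demo` and the table names are placeholders chosen for this example.

```erlang
-module(ets_demo).
-export([run/0]).

run() ->
    %% 1. Create a table of type set (one tuple per key).
    Prices = ets:new(fruit_prices, [set]),
    %% 2. Insert a single tuple, then a list of tuples.
    true = ets:insert(Prices, {orange, 10}),
    true = ets:insert(Prices, [{apple, 14}, {pear, 7}]),
    %% 3. Look up by key (the first tuple element by default).
    Found = ets:lookup(Prices, apple),
    Missing = ets:lookup(Prices, banana),
    %% A bag may hold several different tuples under the same key.
    Tags = ets:new(fruit_tags, [bag]),
    true = ets:insert(Tags, [{apple, red}, {apple, green}]),
    AppleTags = lists:sort(ets:lookup(Tags, apple)),
    %% 4. Delete the tables; ETS data is not garbage collected.
    true = ets:delete(Prices),
    true = ets:delete(Tags),
    {Found, Missing, AppleTags}.
```

A lookup always returns a list, which is why a missing key yields `[]` rather than an error.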
Erlang , storage (DETS)
- Large key-value look up tables.
- DETS is disk resident (a DETS file can be at most 2 GB).
- DETS stores tuples.
- Data stored is persistent (can survive a system crash).
- DETS data are not garbage collected.
- By default, the first element of the tuple is the key of the table.
- A DETS file must be opened before use and closed properly afterwards.
- Processes can share a DETS table.
Basic operations on DETS are:
- 1. Create a new table or open an existing table.
- 2. Insert a tuple or tuples into the table.
- 3. Look up a tuple in the table.
- 4. Dispose/Delete a table.
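The same operations on DETS, using the standard `dets` module, could be sketched like this. The module name `dets_demo`, the table name, and the file path passed in by the caller are assumptions made for the example.

```erlang
-module(dets_demo).
-export([run/1]).

%% File is the path of the DETS file to create, e.g. "fruit.dets".
run(File) ->
    %% Open (and create if needed) the table backed by File.
    {ok, Table} = dets:open_file(fruit_prices_dets,
                                 [{file, File}, {type, set}]),
    ok = dets:insert(Table, [{orange, 10}, {apple, 14}]),
    %% Closing flushes the data to disk.
    ok = dets:close(Table),
    %% Reopen the same file: the tuples are still there.
    {ok, Reopened} = dets:open_file(fruit_prices_dets, [{file, File}]),
    Found = dets:lookup(Reopened, apple),
    ok = dets:close(Reopened),
    Found.
```

The file is left on disk after the call, which is the point of DETS; remove it with `file:delete/1` when it is no longer needed.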
Erlang , MNESIA
- DB tables can be stored in RAM for speed or on disk for persistence.
- Tables can be replicated on several machines for fault tolerance.
- Data can be selected conditionally from a table (using list comprehensions).
- Data can be selected from two tables with conditions applied.
- Provision for transactions (pessimistic locking).
When a process accesses a "locked table", the "fun" in the process may be retried several times, so it should avoid side effects.
- Mnesia supports fragmented tables (horizontal partitioning), which can be replicated across machines and can have indexes.
- Strategies for tables.
- RAM-resident table on a single node.
- RAM + disk copy on a single node.
- Disk-only copy on a single node.
- RAM-resident table on two or more nodes.
- Disk copies on two or more nodes.
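A minimal sketch of the "RAM-resident table on a single node" strategy, with a transaction and a conditional select written as a query list comprehension (QLC). The module `mnesia_demo` and the `fruit` record are hypothetical; the sketch assumes Mnesia is not already running with a `fruit` table.

```erlang
-module(mnesia_demo).
-export([run/0]).
-include_lib("stdlib/include/qlc.hrl").

-record(fruit, {name, price}).

run() ->
    %% Without a disk schema, Mnesia starts as a RAM-only node.
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(fruit,
        [{attributes, record_info(fields, fruit)},
         {ram_copies, [node()]}]),
    %% Writes go through a transaction; the fun may be retried,
    %% so it contains no side effects other than the writes.
    {atomic, ok} = mnesia:transaction(fun() ->
        ok = mnesia:write(#fruit{name = orange, price = 10}),
        ok = mnesia:write(#fruit{name = apple, price = 14})
    end),
    %% Conditional select using a query list comprehension.
    {atomic, Cheap} = mnesia:transaction(fun() ->
        qlc:e(qlc:q([F#fruit.name || F <- mnesia:table(fruit),
                                     F#fruit.price < 12]))
    end),
    {atomic, ok} = mnesia:delete_table(fruit),
    stopped = mnesia:stop(),
    Cheap.
```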
Erlang , actors.
- Lightweight processes.
- Isolated from other processes.
- No shared state between actors.
- Inherently concurrent.
- Communication uses asynchronous messaging.
- Messages are buffered in a mailbox.
- Massively scalable.
- Actors instead of objects.
- An actor can change only its own state.
- Actors send (immutable) messages to other actors.
Society of actors:
- 1. Fail fast (exit events, trap_exit).
- 2. Supervisors.
- Actors don't compete for shared data.
- Each actor manages its own heap and stack.
- Each actor has a mailbox (channel).
- Messages can be serialized to survive a crash.
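The actor properties listed above can be seen in a small counter process: its state lives only in the argument of its receive loop, and the only way to reach it is by sending messages. This is a sketch; `counter_actor` and its message formats are invented for the example.

```erlang
-module(counter_actor).
-export([start/0, increment/1, value/1, stop/1]).

%% Spawn a lightweight process whose private state is the count.
start() -> spawn(fun() -> loop(0) end).

%% Asynchronous send: returns immediately.
increment(Pid) -> Pid ! increment, ok.

%% Request/reply built from two asynchronous messages.
value(Pid) ->
    Ref = make_ref(),
    Pid ! {value, self(), Ref},
    receive
        {Ref, Count} -> Count
    after 1000 -> timeout
    end.

stop(Pid) -> Pid ! stop, ok.

%% The mailbox is drained one message at a time; the new state is
%% passed to the next iteration instead of being mutated in place.
loop(Count) ->
    receive
        increment -> loop(Count + 1);
        {value, From, Ref} -> From ! {Ref, Count}, loop(Count);
        stop -> ok
    end.
```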
Apache vs. Yaws
http://www.sics.se/~joe/apachevsyaws.html
Plotted as throughput (KBytes/second) vs. load. The red curve is Yaws (running on an NFS file system). The blue curve is Apache (running on an NFS file system). The green curve is Apache (running on a local file system).
Erlang and scalability.
Horizontal distribution (scaling) of actors
- The Erlang VM automatically distributes the workload across CPU resources.
[Figure: actors on a single core versus actors spread across Core-1, Core-2, and Core-3.]
Comparison: Actor / Thread
- Actor model: good for scaling out; non-blocking calls; pass by value; no state sharing (each action works on its own state).
- Container-based multi-threaded app servers (thread pool): good for scaling up; blocking calls; pass by reference; state sharing (actions hold references to shared state).
Exporting these paradigms to act inherently
[Figure: incoming requests and outgoing responses handled by threads with shared state references, compared with actors that each own their state.]
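One way to see the VM spreading work over cores is a parallel map that spawns one process per item and lets the schedulers place them. This is a simplified sketch with hypothetical names (`pmap_demo`, `pmap/2`); it has no error handling, so a crashing worker would leave the caller waiting.

```erlang
-module(pmap_demo).
-export([pmap/2, schedulers/0]).

%% Apply Fun to every item, each in its own process, and collect
%% the results in the original order.
pmap(Fun, Items) ->
    Parent = self(),
    Refs = [spawn_worker(Parent, Fun, Item) || Item <- Items],
    [receive {Ref, Result} -> Result end || Ref <- Refs].

%% Start one worker and return the reference that tags its reply.
spawn_worker(Parent, Fun, Item) ->
    Ref = make_ref(),
    spawn(fun() -> Parent ! {Ref, Fun(Item)} end),
    Ref.

%% Number of scheduler threads currently running Erlang processes.
schedulers() -> erlang:system_info(schedulers_online).
```

The code never mentions cores: the same `pmap/2` call uses however many schedulers the VM was started with.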
Erlang & OTP
- Productivity: using OTP makes it possible to produce production-quality systems by making use of the patterns given.
- Stability: code written on top of OTP can focus on the logic and avoid error-prone re-implementations of the typical things that every real-world system needs: process management, servers, state machines, and so on.
- Supervision: the application structure provided by the framework makes it simple to supervise and control the running systems, both automatically and through graphical user interfaces.
- Upgradability: the framework provides patterns for handling code upgrades in a systematic way.
- Reliable code base: the code for the OTP framework is rock solid and has been thoroughly battle tested.
OTP (Open Telecom Platform)
- Behaviour container: gen_server.
- Behaviour interface: init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2, code_change/3.
- Behaviour implementation: the user's application-specific implementation.
Behaviour containers handle much of what is challenging about writing lightweight, scalable, concurrent, fault-tolerant OTP code.
[Figure: three gen_server processes (process-1, process-2, process-3).]
- gen_server: for implementing the server of a client-server relation.
- gen_fsm: for implementing finite state machines (superseded by gen_statem in newer OTP releases).
- gen_event: for implementing event handling functionality.
- supervisor: for implementing a supervisor in a supervision tree.
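To show how the behaviour interface and the application-specific implementation fit together, here is a minimal gen_server sketch. The module name `counter_server` and its client API are assumptions for the example; only the six callbacks come from the behaviour itself.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API (application specific).
-export([start_link/0, increment/0, value/0, stop/0]).
%% Behaviour interface (callbacks invoked by the gen_server container).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).
increment()  -> gen_server:cast(?MODULE, increment).   % asynchronous
value()      -> gen_server:call(?MODULE, value).       % synchronous
stop()       -> gen_server:stop(?MODULE).

%% The server state is just the current count.
init(Initial) -> {ok, Initial}.

handle_call(value, _From, Count) -> {reply, Count, Count}.

handle_cast(increment, Count) -> {noreply, Count + 1}.

%% Ignore any message that did not come through call or cast.
handle_info(_Message, Count) -> {noreply, Count}.

terminate(_Reason, _Count) -> ok.

code_change(_OldVsn, Count, _Extra) -> {ok, Count}.
```

The receive loop, reply bookkeeping, and shutdown handling all live in the container; the module only says what each message means for the state.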
Supervisor.
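- A supervisor is responsible for starting, stopping, and monitoring its child processes. The basic idea of a supervisor is that it should keep its child processes alive by restarting them when necessary.
Restart strategies:
- 1. one_for_one: only the failed child is restarted, up to a maximum number of restarts within a time frame.
- 2. one_for_all: if one child fails, all children are restarted.
- 3. rest_for_one: if one child fails, it and the children started after it are restarted.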
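A one_for_one supervisor with a restart limit could be sketched as follows. The module `demo_sup`, the registered name `demo_worker`, and the limit of 3 restarts in 10 seconds are illustrative choices; the worker is a bare process started with `proc_lib` to keep the example in one module, whereas a real system would normally supervise gen_server children.

```erlang
-module(demo_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% one_for_one: restart only the child that died, but give up if it
%% dies more than 3 times within 10 seconds.
init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 3, period => 10},
    Child = #{id => demo_worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Called by the supervisor: start the worker linked to it and
%% register a name so the current worker can be found.
start_worker() ->
    Pid = proc_lib:spawn_link(fun worker_loop/0),
    true = register(demo_worker, Pid),
    {ok, Pid}.

%% The worker just waits for messages until it is told to stop.
worker_loop() ->
    receive
        stop -> ok;
        _Other -> worker_loop()
    end.
```

Killing the worker, for instance with `exit(whereis(demo_worker), kill)`, makes the supervisor start a fresh process under the same name.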
Distributed .. cache.
- Process communication by copying.
- Location transparency.
- Location information is fed during shell startup.
- Clusters don't connect automatically.
- Machine-to-machine (M2M) communication via a secret cookie.
Pid ! "message"
[Figure: two clusters of nodes (Node1@hostname, Node2@hostname, Node3@hostname, Node4@hostname); each host runs EPMD on TCP/IP port 4369.]
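Location transparency means the send operator looks the same whether the receiver is local or remote. The sketch below assumes nodes started with something like `erl -sname node1 -setcookie secret` on each machine; the module `dist_demo` and the registered name `dist_demo_inbox` are hypothetical. The round trip uses the local node so that it also runs on a single, non-distributed VM.

```erlang
-module(dist_demo).
-export([join/1, send_named/3, local_roundtrip/0]).

%% Nodes do not connect automatically: ping the other node first.
%% Returns pong on success and pang on failure (e.g. wrong cookie).
join(Node) -> net_adm:ping(Node).

%% Send to a registered process on any node; the syntax is the same
%% for the local node and for a remote one.
send_named(Node, Name, Message) ->
    {Name, Node} ! Message,
    ok.

%% Register the caller, send it a message through the node-qualified
%% address, and return the node name carried in the message.
local_roundtrip() ->
    true = register(dist_demo_inbox, self()),
    ok = send_named(node(), dist_demo_inbox, {hello, node()}),
    Reply = receive
                {hello, From} -> From
            after 1000 -> timeout
            end,
    true = unregister(dist_demo_inbox),
    Reply.
```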
Erlang , some perspective.
- Modules.
- Actors (processes).
- Tail recursion.
- Pattern matching.
- Selective receive.
- Supervisor.
- Let it crash.
- Process linking.
- Hot code deployment.
Reference Architecture
[Figure: reference architecture with the following components.]
- View portal layer: Mochiweb, YAWS.
- Provision portal layer: Mochiweb, YAWS.
- State management engines: Erlang OTP (gen_fsm).
- System of engagement and system of records.
- Display data: CouchBase.
- Business data records.
- Analytical data store.
- Cache layer.
- Master-Master | Master-Slave | Big Data | RDBMS | Data warehousing.
Erlang , who uses it ?
- Amazon uses Erlang to implement SimpleDB, providing database services as a part of the Amazon Elastic Compute Cloud (EC2).
- Yahoo! uses it in its social bookmarking service, Delicious, which has more than 5 million users and 150 million bookmarked URLs.
- Facebook uses Erlang to power the backend of its chat service, handling more than 100 million active users.
- T-Mobile uses Erlang in its SMS and authentication systems.
- Motorola is using Erlang in call processing products in the public-safety industry.
- Ericsson uses Erlang in its support nodes, used in GPRS and 3G mobile networks worldwide.
- The Ejabberd system, which provides an Extensible Messaging and Presence Protocol (XMPP) based instant messaging (IM) application server.
- The CouchDB "schema-less" document-oriented database, providing scalability across multicore and multiserver clusters.
- The MochiWeb library, which provides support for building lightweight HTTP servers. It is used to power services such as MochiBot and MochiAds, which serve dynamically generated content to millions of viewers daily.
- RabbitMQ, an implementation of the AMQP messaging protocol. AMQP is an emerging standard for high-performance enterprise messaging.
Learning Erlang and Actor principles
http://learnyousomeerlang.com
- Pearson: Building Scalable Applications with Erlang, Jerry Jackson.
- APress: Mastering Erlang: Writing Real World Applications, Geoff Cant.
Thank you ..
Email: vinoodas@gmail.com
Working in the areas of disruptive technology. Areas of interest: Erlang, AKKA, Node.js, RabbitMQ, NoSQL.