ERLANG/OTP - Torben Hoffmann, Erlang Solutions, @LeHoff



slide-1
SLIDE 1

ERLANG/OTP

presents

TORBEN HOFFMANN

Torben Hoffmann Erlang Solutions @LeHoff torben@erlang-solutions.com www.erlang-solutions.com

slide-2
SLIDE 2
slide-3
SLIDE 3

WHAT IS SCALABILITY?

Handle traffic spikes. Behave predictably under extended heavy load. Carry the traffic it was designed to handle.

slide-4
SLIDE 4


slide-5
SLIDE 5

WHAT IS (MASSIVE) CONCURRENCY?

Millions of simultaneous requests being handled. Requests running independently of each other. Example: an SMS TV-voting spike.

slide-6
SLIDE 6


slide-7
SLIDE 7

WHAT IS HIGH AVAILABILITY?

slide-8
SLIDE 8

No single point of failure. Two computers (Joe Armstrong); three if you ask Leslie Lamport. Redundant network: a sysadmin tripping on a network cable is not an excuse. Battery backup / generators against hardware failure. Distribute your software and data. Software is important, but it is not only about software.
slide-9
SLIDE 9

WHAT IS FAULT TOLERANCE?

slide-10
SLIDE 10

Even if things go wrong, continue working and do not affect other things in the system. Ability to isolate the error. Regain control.

slide-11
SLIDE 11

WHAT IS DISTRIBUTION

Simplicity in designing your system. Scalability and fault tolerance. Language with built in distribution.

slide-12
SLIDE 12

Source: http://www.tuvie.com

With Erlang you can hide how the distribution over machines takes place, or you can peek inside if you want to know more. Flexibility. Simplicity in designing your system. Scalability and fault tolerance. Language with built-in distribution.
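
As a rough illustration of built-in distribution, the sketch below spawns a process on a given node and asks it where it is running. The module and function names (where_am_i, run_on/1) are made up for this example; spawn/2 and node/0 are standard BIFs. The same call works for the local node and for any connected remote node.

```erlang
-module(where_am_i).
-export([run_on/1]).

%% Spawn a process on Node and ask it which node it runs on.
%% Node may be node() (local) or the name of a connected remote node.
run_on(Node) ->
    Parent = self(),
    Ref = make_ref(),
    spawn(Node, fun() -> Parent ! {Ref, node()} end),
    receive
        {Ref, Where} -> Where
    after 5000 ->
        %% No answer, e.g. because the node is unreachable.
        {error, timeout}
    end.
```

Calling where_am_i:run_on(node()) should return the local node name; the caller does not need different code for a remote node.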

slide-13
SLIDE 13

YES, PLEASE!!!

Do you need a distributed system? Do you need a scalable system? Do you need a reliable system? Do you need a fault-tolerant system? Do you need a massively concurrent system?

So, you have all these requirements. What is it you actually need?

slide-14
SLIDE 14

TO THE RESCUE

slide-15
SLIDE 15
  • OPEN SOURCE
  • CONCURRENCY-ORIENTED
  • LIGHTWEIGHT PROCESSES
  • ASYNCHRONOUS MESSAGE PASSING
  • SHARE-NOTHING MODEL
  • PROCESS LINKING / MONITORING
  • SUPERVISION TREES AND RECOVERY STRATEGIES

  • TRANSPARENT DISTRIBUTION MODEL
  • SOFT REAL-TIME
  • LET-IT-FAIL PHILOSOPHY
  • HOT-CODE UPGRADES

WHAT IS ERLANG
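
To make "lightweight processes" and "asynchronous message passing" from the list above concrete, here is a small sketch. The module name many_procs and the {double, X} message shape are assumptions for the example, not part of the talk.

```erlang
-module(many_procs).
-export([run/1]).

%% Spawn N processes, send each one a number, collect the doubled results.
run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> worker(Parent) end) || _ <- lists:seq(1, N)],
    %% Asynchronous sends: `!` returns without waiting for the receiver.
    lists:foreach(fun({I, Pid}) -> Pid ! {double, I} end,
                  lists:zip(lists:seq(1, N), Pids)),
    %% Collect one reply per worker, matching on the worker's pid.
    [receive {Pid, Result} -> Result end || Pid <- Pids].

%% Each worker shares nothing with the others; it only sees its mailbox.
worker(Parent) ->
    receive
        {double, X} -> Parent ! {self(), 2 * X}
    end.
```

The same function can be called with much larger values of N, since each process is cheap to create.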

slide-16
SLIDE 16

WELL, IN FACT YOU NEED MORE.

slide-17
SLIDE 17

ERLANG IS JUST A PROGRAMMING LANGUAGE.

If you need to develop a highly complex system which never goes down, has built in fault tolerance, distribution mechanisms and manages millions of simultaneous transactions, you need more than just a programming language.

slide-18
SLIDE 18

YOU NEED ARCHITECTURE PATTERNS. YOU NEED MIDDLEWARE.

Erlang solves many software-related problems, but it is still just a programming language. Lots of the problems you solve are the same, and you don't want to reinvent the wheel. You also need development, deployment and monitoring tools.

slide-19
SLIDE 19

YOU NEED OTP.

BOS (1993) was merged with Erlang in 1995. Erlang is only 33% of your strength: language, VM, OTP. What does OTP stand for? Rather not tell you. On The Phone, One True Pair, Oh, This is Perfect.

slide-20
SLIDE 20

SOME TEXT

Ministry of Propaganda at Ericsson.
Openness - JSON, XML, ASN.1, SNMP, Java, C, Ports.
Telecom - Distributed, massively concurrent soft real-time systems with requirements on scalability.
Platform -

slide-21
SLIDE 21

WHAT IS MIDDLEWARE?

  • A set of abstract principles and design rules
  • They describe the software architecture of an Erlang system
  • Needed so existing tools will be compatible with them
  • Facilitate the understanding of the system among teams
  • Leave architectural patterns to last

slide-22
SLIDE 22

MIDDLEWARE

DESIGN PATTERNS FAULT TOLERANCE DISTRIBUTION UPGRADES

Systems will do very different things, but the issues are still the same. Glue to manage your distribution and communication layers. Your fault tolerance layers. Deploy and upgrade your systems.

slide-23
SLIDE 23

WHAT ARE LIBRARIES?

Basic Applications: Erlang Runtime System, Kernel, Compiler, Standard Lib, System Architecture Support Libraries (SASL)

Database Applications: Mnesia (distributed relational database), ODBC (interface for accessing SQL databases)

slide-24
SLIDE 24

LIBRARIES

STORAGE O&M INTERFACES COMMUNICATION

Operations and Maintenance Applications: Operating System Monitor, SNMP, OTP MIBs

Interface and Communication Applications: Corba ORB, ASN.1 Compiler, Crypto, (Wx widgets), Inets (TCP, UDP, HTTP, FTP), Java Interface & Erlang to C Interface, SSH/SSL, XML Parsing

slide-25
SLIDE 25

WHAT TOOLS?

slide-26
SLIDE 26

OTP TOOLS

DEVELOPMENT TEST FRAMEWORKS RELEASE & DEPLOYMENT DEBUGGING &

EUnit, Common Test. No mocking frameworks, several OS.
Release and upgrade tools. Worth the hassle?
Low-level debugging tools: dbg, tracing local & global calls.
Percept - concurrency bottlenecks/profiling.
Observer - graphical front end to other tools, e.g. etop and the crash dump viewer.

slide-27
SLIDE 27

OTP IS

PART OF THE ERLANG DISTRIBUTION
OPEN SOURCE

slide-28
SLIDE 28

OTP

Servers, Finite State Machines, Event Handlers, Supervisors, Applications.
Less Code. Fewer Bugs. More Solid Code. More Tested Code. More Free Time.

Cons: steeper learning curve, affects performance.

slide-29
SLIDE 29

Your Heading

  • Fail safe, fail early
  • Hide the tricky parts of concurrency: mutexes, deadlocks, race conditions
  • Stress 9-5 programmers

slide-30
SLIDE 30

Let It Fail

convert(Day) ->
    case Day of
        monday    -> 1;
        tuesday   -> 2;
        wednesday -> 3;
        thursday  -> 4;
        friday    -> 5;
        saturday  -> 6;
        sunday    -> 7;
        Other     -> {error, unknown_day}
    end.
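
The catch-all clause returning {error, unknown_day} is the defensive style that "let it fail" argues against. A sketch of the alternative, assuming that is the point of the slide (the module name day is invented for the example): leave the clause out, so an unknown day raises a runtime error in the calling process and a supervisor can deal with it.

```erlang
-module(day).
-export([convert/1]).

%% No catch-all clause: an unknown day raises a function_clause error
%% in the caller instead of returning an error tuple to be checked.
convert(monday)    -> 1;
convert(tuesday)   -> 2;
convert(wednesday) -> 3;
convert(thursday)  -> 4;
convert(friday)    -> 5;
convert(saturday)  -> 6;
convert(sunday)    -> 7.
```

Callers no longer need to handle an error tuple on every call; a bad argument terminates the offending process at the point of the mistake.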

slide-31
SLIDE 31

BANG FOR THE BUCK

Source: http://www.slideshare.net/JanHenryNystrom/productivity-gains-in-erlang

You spend 3x the time on solving the actual problem (App) and much less on all sorts of other things.

slide-32
SLIDE 32

ISOLATE THE ERROR!

Runtime error. Do not use the word crash. No shared memory -> restart the process. Recreate the state.

slide-33
SLIDE 33

PROPAGATING EXIT SIGNALS

Exit Signals

Diagram: PidA, PidB and PidC are linked in a chain. PidB receives {'EXIT', PidA, Reason}; PidC receives {'EXIT', PidB, Reason}.

Explain Links, Exit Signals and trapping exits

slide-34
SLIDE 34

Trap Exit

TRAPPING AN EXIT SIGNAL

Diagram: PidA, PidB, PidC. The exit signal from PidA arrives as the message {'EXIT', PidA, Reason}.
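
A minimal sketch of links and exit trapping, under the assumption that a short self-contained demo is enough (the module name trap_demo and the exit reason boom are invented). process_flag/2 and spawn_link/1 are standard BIFs.

```erlang
-module(trap_demo).
-export([run/0]).

%% Link to a child that exits abnormally. Because this process traps
%% exits, the exit signal is delivered as an ordinary message instead
%% of terminating this process too.
run() ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> exit(boom) end),
    Result = receive
                 {'EXIT', Pid, Reason} -> {trapped, Reason}
             after 1000 ->
                 timeout
             end,
    %% Restore the previous flag value so the caller is left unchanged.
    process_flag(trap_exit, Old),
    Result.
```

Without trap_exit set to true, the same abnormal exit would propagate over the link and terminate the calling process as well.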

slide-35
SLIDE 35

Supervisors

PidA PidC PidB

Supervisor Workers Application

Handle dependencies.

  • An application is a logical unit of processes and modules grouped together to perform a given task
  • Application = Collection of resources loaded, started and stopped as one
  • Contains supervision tree. Workers can be implemented using generic behaviours
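
The supervisor/worker relationship can be sketched as follows. Everything named demo_sup or demo_worker is hypothetical, and the restart limits (3 restarts in 10 seconds) are arbitrary; the supervisor callbacks and map-based child specifications are standard OTP.

```erlang
-module(demo_sup).
-behaviour(supervisor).

-export([start_link/0, init/1]).
-export([start_worker/0, worker_loop/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Supervisor callback: one permanent worker, restarted on its own
%% (one_for_one) whenever it terminates.
init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 3, period => 10},
    Worker = #{id => demo_worker,
               start => {?MODULE, start_worker, []},
               restart => permanent},
    {ok, {SupFlags, [Worker]}}.

%% Called in the supervisor process, so the worker is linked to it.
start_worker() ->
    Pid = proc_lib:spawn_link(?MODULE, worker_loop, []),
    register(demo_worker, Pid),
    {ok, Pid}.

%% A trivial worker that just discards messages.
worker_loop() ->
    receive
        _Any -> worker_loop()
    end.
```

Killing the registered worker (for example with exit(whereis(demo_worker), kill)) should lead to a new worker process being started under the same name.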

slide-36
SLIDE 36

Releases

Diagram: a release containing MongooseIM, folsom, lager, snmp, mnesia, stdlib, SASL and kernel, on top of ERTS.

  • Complete Erlang systems are built as releases
  • A release is a version of the Erlang Run Time System (ERTS) plus a set of OTP applications that work together:
      • Applications which come as part of OTP
      • Applications the programmer writes
  • Releases allow applications to be started, stopped, and managed in a standard manner
  • Releases can be upgraded or downgraded as a unit
slide-37
SLIDE 37

BEHAVIOURS

  • OTP behaviours are a formalisation of design patterns
  • Processes share similar structures and life cycles: they are started, receive messages & send replies, and terminate
  • Even if they perform different tasks, they will perform them following a set of patterns
  • Each design pattern solves a specific problem

slide-38
SLIDE 38

SPECIFIC CALLBACK MODULE GENERIC BEHAVIOUR MODULE

Server

process

The idea is to split the code in two parts.

The generic part is called the generic behaviour, provided as library modules.

The specific part is called the callback module, implemented by the programmer.
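
As an illustration of the generic/specific split, here is a small callback module for the gen_server behaviour. The counter module and its API are invented for this sketch; gen_server itself supplies the generic part (process start, message loop, call/reply protocol).

```erlang
-module(counter).
-behaviour(gen_server).

%% Client API
-export([start_link/0, increment/0, value/0]).
%% gen_server callbacks (the specific part)
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).

%% Asynchronous request: returns ok immediately.
increment() ->
    gen_server:cast(?MODULE, increment).

%% Synchronous request: waits for the server's reply.
value() ->
    gen_server:call(?MODULE, value).

%% The server state is just the current count.
init(Start) ->
    {ok, Start}.

handle_call(value, _From, Count) ->
    {reply, Count, Count}.

handle_cast(increment, Count) ->
    {noreply, Count + 1}.
```

The callback module contains only the counter logic; timeouts, monitoring of the server and reply matching are handled by the generic module.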

slide-39
SLIDE 39

OTP

Servers, Finite State Machines, Event Handlers, Supervisors, Applications.
Less Code. Fewer Bugs. More Solid Code. More Tested Code. More Free Time.

Generic: start, stop, receive and send messages. Specific: server state, messages, handling requests (+ reply). The specific part knows nothing about the generic part. Generic servers, FSMs, event handlers, supervisors, or roll out your own.

slide-40
SLIDE 40

call(Name, Message) ->
    Name ! {request, self(), Message},
    receive
        {reply, Reply} -> Reply
    end.

reply(Pid, Reply) ->
    Pid ! {reply, Reply}.

Client -> Server: {request, Pid, Message}
Server -> Client: {reply, Reply}

A 9-5 programmer will not think of all the error cases. Concurrency is tricky: deadlocks, race conditions, mutexes, critical sections.
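
For context, the call/reply pair can be wired into a complete, if naive, server. This is a sketch only: the module name basic_server and the {echo, Message} reply are assumptions, and it deliberately keeps the weaknesses of the naive protocol (no reference, no monitor, no timeout).

```erlang
-module(basic_server).
-export([start/0, call/2]).

%% Start the server process and return its pid.
start() ->
    spawn(fun loop/0).

%% Client side: send a request and block until a reply arrives.
%% Any message shaped {reply, _} is accepted, whoever sent it.
call(Name, Message) ->
    Name ! {request, self(), Message},
    receive
        {reply, Reply} -> Reply
    end.

reply(Pid, Reply) ->
    Pid ! {reply, Reply}.

%% Server side: answer each request by echoing the message back.
loop() ->
    receive
        {request, Pid, Message} ->
            reply(Pid, {echo, Message}),
            loop()
    end.
```

If the server dies before replying, call/2 blocks forever, which is one of the error cases an ad hoc protocol like this tends to miss.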

slide-41
SLIDE 41

Client -> Server: {request, {Ref, self()}, Message}
Server -> Client: {reply, Ref, Reply}
Second server -> Client: {reply, ???, Reply}

call(Name, Msg) ->
    Ref = make_ref(),
    Name ! {request, {Ref, self()}, Msg},
    receive
        {reply, Ref, Reply} -> Reply
    end.

reply({Ref, Pid}, Reply) ->
    Pid ! {reply, Ref, Reply}.


slide-42
SLIDE 42

PidA -> PidB: {request, {Ref, PidA}, Msg}

call(Name, Msg) ->
    Ref = erlang:monitor(process, Name),
    Name ! {request, {Ref, self()}, Msg},
    receive
        {reply, Ref, Reply} ->
            erlang:demonitor(Ref),
            Reply;
        {'DOWN', Ref, process, _Name, _Reason} ->
            {error, no_proc}
    end.


slide-43
SLIDE 43

PidA -> PidB: {request, {Ref, PidA}, Msg}
PidB -> PidA: {reply, Ref, Reply}, possibly followed by {'DOWN', Ref, process, PidB, Reason}

call(Name, Msg) ->
    Ref = erlang:monitor(process, Name),
    Name ! {request, {Ref, self()}, Msg},
    receive
        {reply, Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, _Name, _Reason} ->
            {error, no_proc}
    end.

slide-44
SLIDE 44

BEHAVIOURS

TIMEOUTS DEADLOCKS

TRACING

MONITORING DISTRIBUTION
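
Timeouts are one of the cases the behaviours take care of. As a sketch of what that involves, the monitored call can be extended with an after clause. The module name safe_call, the extra Timeout argument and the {ok, Reply} wrapper are assumptions made for this example.

```erlang
-module(safe_call).
-export([call/3]).

%% Monitored request with a timeout in milliseconds.
call(Name, Msg, Timeout) ->
    Ref = erlang:monitor(process, Name),
    Name ! {request, {Ref, self()}, Msg},
    receive
        {reply, Ref, Reply} ->
            %% flush removes a 'DOWN' message that may already be queued.
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, _Pid, _Reason} ->
            {error, no_proc}
    after Timeout ->
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.
```

A late reply that arrives after the timeout stays in the caller's mailbox; because it carries the old reference it will not be mistaken for the answer to a later call.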

slide-45
SLIDE 45

AUTOMATIC TAKEOVER

slide-46
SLIDE 46

N1

{myApp, 2000, [n1@host, {n2@host, n3@host}]}

N2 N3

Application. Application Master. n1@host dies. Application Masters on failover nodes.

slide-47
SLIDE 47

N2 N3

Application is restarted on n2@host. n2@host dies. {myApp, 2000, [n1@host, {n2@host, n3@host}]}

slide-48
SLIDE 48

N1 N3

Application is restarted on n3@host. n1@host comes back up. {myApp, 2000, [n1@host, {n2@host, n3@host}]}

slide-49
SLIDE 49

N1 N3

N1 takes over from N3. {myApp, 2000, [n1@host, {n2@host, n3@host}]}
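
The failover tuple shown on these slides is normally supplied through the kernel application's distributed parameter, with the node list in square brackets. A sketch of what the configuration file for n1@host might look like; the sync_nodes_timeout value of 30000 ms is an arbitrary choice for this example, and each node would list the other two nodes in its own file.

```erlang
%% sys.config for n1@host (sketch).
%% myApp runs on n1@host by preference. If n1@host goes down, the other
%% nodes wait 2000 ms and then restart it on n2@host or n3@host.
[{kernel,
  [{distributed, [{myApp, 2000, [n1@host, {n2@host, n3@host}]}]},
   %% Nodes to wait for at start-up before starting applications.
   {sync_nodes_optional, [n2@host, n3@host]},
   %% How long to wait for them, in milliseconds (assumed value).
   {sync_nodes_timeout, 30000}
  ]}].
```

When n1@host comes back, it has the highest priority in the list, so the application is taken over by n1@host again.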

slide-50
SLIDE 50

RELEASE STATEMENT OF AIMS

“To scale the radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server- based systems, on massively parallel machines (10^5 cores).”

Until recently, computing power doubled every 18 months, as Moore’s law predicted. A million cores within our lifetime; 100,000s will become commonplace. Consortium of companies and universities. Bring OTP to the next level.

European Union Seventh Framework Programme (FP7/2007-2013), approx. 3.5 million Euro

slide-51
SLIDE 51

The Runtime Queues

Diagram: the Erlang VM with Scheduler #1, Scheduler #2, ..., Scheduler #N, each with its own run queue, and migration logic between the run queues.

1 scheduler per core. Effort in migration logic among the cores.
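
The scheduler set-up of a running VM can be inspected from Erlang itself. A small sketch (the module name sched_info is made up; erlang:system_info/1 and the three items used are standard):

```erlang
-module(sched_info).
-export([summary/0]).

%% Report how many scheduler threads the VM has, how many are online,
%% and how many logical processors it detected (may be 'unknown').
summary() ->
    #{schedulers => erlang:system_info(schedulers),
      schedulers_online => erlang:system_info(schedulers_online),
      logical_processors => erlang:system_info(logical_processors)}.
```

On a typical multicore machine with default settings, the number of schedulers usually matches the number of logical processors.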

slide-52
SLIDE 52

Heriot-Watt, University of Kent, Uppsala University, Institute of Communications & Computer Systems (Athens), Electricite de France, Erlang Solutions (Case Studies), Ericsson

slide-53
SLIDE 53

  • WP4 Scalable Infrastructure
  • WP3 SD Erlang Language
  • WP2 Virtual Machine
  • WP5 Tools
  • WP6 Case Studies

LIMITATIONS ARE PRESENT AT THREE LEVELS

  • Erlang is too focused on small clusters.
  • Cover / stretch across
  • There might be some overlap between layers

slide-54
SLIDE 54
  • PUSH THE RESPONSIBILITY FOR SCALABILITY FROM THE PROGRAMMER TO THE VM
  • ANALYZE PERFORMANCE AND SCALABILITY
  • IDENTIFY BOTTLENECKS AND PRIORITIZE CHANGES AND EXTENSIONS
  • TACKLE WELL-KNOWN SCALABILITY ISSUES
  • ETS TABLES (SHARED GLOBAL DATA STRUCTURE)
  • MESSAGE PASSING, COPYING AND FREQUENTLY COMMUNICATING PROCESSES

VM LANGUAGE INFRASTRUCTURE

Evolve the Erlang virtual machine – which implements Erlang on each core – so that it can work effectively in large-scale multicore systems. Percept2 - visualisation

slide-55
SLIDE 55

VM LANGUAGE INFRASTRUCTURE

  • TWO MAJOR ISSUES
  • FULLY CONNECTED CLUSTERS
  • EXPLICIT PROCESS PLACEMENT
  • SCALABLE DISTRIBUTED (SD) ERLANG
  • NODES GROUPING
  • NON-TRANSITIVE CONNECTIONS
  • IMPLICIT PROCESS PLACEMENT
  • PART OF THE STANDARD ERLANG/OTP PACKAGE
  • NEW CONCEPTS INTRODUCED
  • LOCALITY, AFFINITY AND DISTANCE

Scalable Distributed (SD) Erlang provides constructs to control how computations are spread across multicore platforms, and coordination patterns that allow SD Erlang to describe computations on large platforms effectively while preserving performance portability.

Tools - Scheduler, visualising process migration.

slide-56
SLIDE 56
  • MIDDLEWARE LAYER
  • SET OF ERLANG APPLICATIONS
  • CREATE AND MANAGE CLUSTERS OF (HETEROGENEOUS) ERLANG NODES
  • API TO MONITOR AND CONTROL ERLANG DISTRIBUTED SYSTEMS
  • EXISTING TRACING/LOGGING/DEBUGGING TOOLS PLUGGABLE
  • BROKER LAYER BETWEEN USERS AND CLOUD PROVIDERS

VM LANGUAGE INFRASTRUCTURE

WombatOAM

  • Basic Erlang has the ability to go in and monitor what is going on in any node you can attach yourself to.
  • But no tool exists to manage a large number of nodes in a coherent fashion.
  • Cloud providers analyse metrics at the OS level: CPU load, memory, etc.
  • Scaling should, however, be based on the application layer.
  • O&M which monitor. Hidden nodes.
  • Nagios & other tools with plugins.

slide-57
SLIDE 57

CONCLUSIONS

slide-58
SLIDE 58

USE ERLANG

Do you need a distributed system? Do you need a scalable system? Do you need a reliable system? Do you need a fault-tolerant system? Do you need a massively concurrent system?

slide-59
SLIDE 59

USE ERLANG/OTP

Do you need a distributed system? Do you need a scalable system? Do you need a reliable system? Do you need a fault-tolerant system? Do you need a massively concurrent system?

slide-60
SLIDE 60

@LeHoff

EVALUATE NOW!