SLIDE 1 %w(map reduce).first A Tale About Rabbits, Latency and Slim Crontabs
Paolo Negri
thanks to: www.autoscout24.de
SLIDE 2
Summary:
SLIDE 3 Map
http://www.matthiasdittrich.com/projekte/dliste/visualisations/index.html
SLIDE 4 http://www.flickr.com/photos/myxi/448253580
rabbitMQ
SLIDE 5 crontab diet
http://www.flickr.com/photos/tim_norris/2600843131/
SLIDE 6
Map Reduce
“Programming model for processing and generating large data sets” (Google)
SLIDE 7
Map Reduce "Map" step
the master node takes the input, chops it up into smaller sub-problems, and distributes those to worker nodes. (Wikipedia)
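As a rough illustration of the "chops it up into smaller sub-problems" part, here is a minimal sketch. It assumes the input is a plain array of client ids and picks an arbitrary batch size of 2; `split_into_subproblems` is a hypothetical helper name, not something from the talk or from any library.

```ruby
# Hypothetical helper: chop the input into fixed-size sub-problems.
def split_into_subproblems(input, batch_size)
  input.each_slice(batch_size).to_a
end

client_ids = [1, 2, 3, 4, 5]                   # assumed input
batches = split_into_subproblems(client_ids, 2)
p batches # each inner array would be handed to one worker node
```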
SLIDE 8
The problem Invoicing our clients
SLIDE 9
Is it as simple as...
clients.map do |client|
  client.invoice
end
SLIDE 10 No!
Because the process is:
SLIDE 11 Problems:
- How many nodes?
- How many workers?
- Distribution mechanism to feed the workers?
SLIDE 12 What about queuing?
- the master node takes the input, chops it up into smaller sub-problems, and publishes them in a queue
- workers independently consume the content of the queue
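The two bullets above can be sketched in-process with Ruby's standard `Queue` and a few threads. This only models the idea: the stdlib queue stands in for a real broker, and `run_invoicing`, the `:stop` marker and the invoice strings are assumptions made for the example.

```ruby
require "thread" # Queue lives in the stdlib; here it stands in for a real broker

# Hypothetical sketch: the master publishes one message per client,
# workers pull from the queue on their own until they see a stop marker.
def run_invoicing(client_ids, worker_count)
  queue   = Queue.new
  results = Queue.new

  workers = Array.new(worker_count) do |n|
    Thread.new do
      while (client_id = queue.pop) != :stop
        results << [n, "invoice for client #{client_id}"]
      end
    end
  end

  client_ids.each { |id| queue << id }   # master publishes the sub-problems
  worker_count.times { queue << :stop }  # one stop marker per worker
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

invoices = run_invoicing([1, 2, 3, 4], 2)
puts invoices.map(&:last).sort
```

Note that the master never decides which worker gets which client; changing `worker_count` is the only thing needed to scale up or down.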
SLIDE 13 Here comes RabbitMQ
- RabbitMQ is an implementation of AMQP, the emerging standard for high performance enterprise messaging
- It’s open source
- Can be used to manage queues
- Written in Erlang
SLIDE 14 Erlang?
- Erlang is a general-purpose concurrent programming language designed by Ericsson
- distributed
- fault tolerant
- soft real time
- high availability
SLIDE 15 Install it
- sudo apt-get install rabbitmq-server
- sudo gem install tmm1-amqp
SLIDE 16
Do it - master node
SLIDE 17
Use it - worker node
SLIDE 18 What and where
[Diagram: Master (ruby) and two Workers (ruby), each connected to RabbitMQ (Erlang) over TCP/IP]
SLIDE 19 Get for free
- Decoupling master/worker
- Workers take care of feeding themselves
- Flexible number of workers
SLIDE 20 Get for free
- RabbitMQ can be clustered
- Support of message acknowledgement
- Queues can be persisted on disk (at a price)
SLIDE 21 Queue
- Is an actual entity
- has a name
- can be inspected and managed
SLIDE 22
EventMachine
SLIDE 23 EventMachine
- Non-blocking IO and lightweight concurrency
- Eliminates the complexities of high-performance threaded network programming
- Is an implementation of the Reactor pattern
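To make the Reactor pattern a little more concrete, here is a minimal sketch built only on `IO.select` and pipes from the standard library. It is not how EventMachine is implemented; `TinyReactor` and `demo_reactor` are hypothetical names used only for this illustration.

```ruby
# Hypothetical, minimal reactor: one loop waits on many IO objects and
# dispatches to the handler registered for whichever becomes readable.
class TinyReactor
  def initialize
    @handlers = {}
  end

  def watch(io, &handler)
    @handlers[io] = handler
  end

  # Run until every watched IO has reached end of file.
  def run
    until @handlers.empty?
      readable, = IO.select(@handlers.keys)
      readable.each do |io|
        begin
          @handlers[io].call(io.read_nonblock(1024))
        rescue EOFError
          @handlers.delete(io)
          io.close
        end
      end
    end
  end
end

def demo_reactor
  received = []
  reactor  = TinyReactor.new
  2.times do |n|
    reader, writer = IO.pipe
    reactor.watch(reader) { |data| received << data }
    writer.write("message #{n}")
    writer.close
  end
  reactor.run
  received.sort
end

p demo_reactor
```

A single thread serves both pipes here; the handlers run only when data is ready, which is the property the slide is pointing at.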
SLIDE 24
EventMachine
SLIDE 25
EventMachine
amqp gem is built on EventMachine => you’re in a context where you can leverage concurrent programming
SLIDE 26
EM - Deferrables
SLIDE 27 EM - Deferrables
“The Deferrable pattern allows you to specify any number of Ruby code blocks that will be executed at some future time when the status of the Deferrable object changes”
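The quote can be illustrated with a small pure-Ruby stand-in. The method names below mirror those of `EventMachine::Deferrable` (`callback`, `errback`, `succeed`, `fail`), but `TinyDeferrable` is a hypothetical class written for this sketch and simplifies the real behaviour (for example, the status can only be set once).

```ruby
# Hypothetical stand-in for the pattern, not EventMachine's implementation.
class TinyDeferrable
  def initialize
    @status    = :unknown
    @args      = []
    @callbacks = []
    @errbacks  = []
  end

  # Register a block to run when the status becomes :succeeded.
  def callback(&block)
    if @status == :succeeded
      block.call(*@args)
    else
      @callbacks << block
    end
    self
  end

  # Register a block to run when the status becomes :failed.
  def errback(&block)
    if @status == :failed
      block.call(*@args)
    else
      @errbacks << block
    end
    self
  end

  def succeed(*args)
    settle(:succeeded, args, @callbacks)
  end

  def fail(*args)
    settle(:failed, args, @errbacks)
  end

  private

  def settle(status, args, blocks)
    return unless @status == :unknown # simplification: status is set once
    @status = status
    @args   = args
    blocks.each { |block| block.call(*args) }
  end
end

def demo_deferrable
  log  = []
  stat = TinyDeferrable.new
  stat.callback { |value| log << "got #{value}" }
  stat.errback  { |error| log << "failed: #{error}" }
  log << "registered"
  stat.succeed(42)
  log
end

p demo_deferrable
```

The blocks are registered first and run later, when `succeed` is called; the code that registers them does not wait for the result.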
SLIDE 28
EM - Deferrables
SLIDE 29
EM - Deferrables
SLIDE 30 Deferrables
[Figure: timelines of the ClientStat and Arrears calls, without deferrables vs. with deferrables; horizontal axis: Time]
SLIDE 31 Achieved so far
- Easy distribution of tasks
- Architecture that supports arbitrary number of workers (and masters)
- Concurrency within the single worker
SLIDE 32 More rabbits
Analogy with email system
SLIDE 33
Multicasting - producer
SLIDE 34
Multicasting - consumer
SLIDE 35 Multicasting
[Diagram: Publisher sends msg A to the Exchange; Queue1, Queue2 and Queue3 are bound to the Exchange and feed Cons1, Cons2 and Cons3]
SLIDE 36 Multicasting
[Diagram: the Exchange has placed a copy of msg A in each of Queue1, Queue2 and Queue3, for Cons1, Cons2 and Cons3]
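The multicasting picture can be modelled in a few lines of plain Ruby. This is an in-process sketch of what AMQP calls a fanout exchange, under the assumption that this is the exchange type the slides use; `TinyFanoutExchange` and its methods are hypothetical names, not the amqp gem's API.

```ruby
# Hypothetical in-process model of a fanout exchange: every queue bound
# to the exchange receives its own copy of each published message.
class TinyFanoutExchange
  def initialize
    @queues = {}
  end

  def bind(queue_name)
    @queues[queue_name] ||= []
  end

  def publish(message)
    @queues.each_value { |queue| queue << message.dup }
  end

  def messages_in(queue_name)
    @queues.fetch(queue_name)
  end
end

def demo_fanout
  exchange = TinyFanoutExchange.new
  %w[Queue1 Queue2 Queue3].each { |name| exchange.bind(name) }
  exchange.publish("msg A")
  %w[Queue1 Queue2 Queue3].map { |name| exchange.messages_in(name) }
end

p demo_fanout
```

The publisher only knows the exchange; adding a fourth consumer means binding one more queue, with no change on the publishing side.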
SLIDE 37 Not only queues then
- communication across hosts, heterogeneous systems
Use message distribution to build the nervous system of your app
SLIDE 38
Where to start? crontab -l
5 * * * * bin/do_the_quick_thing.rb
0 2 * * * bin/do_the_scary_thing.rb
SLIDE 39 Cron
- Simple
- Reliable
- No maintenance
- Status is not explicit
- Locking?
- Fire and forget
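The "Locking?" point is usually about two runs of the same cron job overlapping. One common way to guard against that, sketched here with `File#flock` from the standard library, is to take a non-blocking exclusive lock at the start of the script. `with_exclusive_lock` and the lock file location are assumptions made for this example, not something shown in the talk.

```ruby
require "tmpdir"

# Hypothetical helper: run the block only if no other run holds the lock.
# Returns false (and skips the work) when the lock is already taken.
def with_exclusive_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    return false unless file.flock(File::LOCK_EX | File::LOCK_NB)
    yield
    true
  end
end

lock_path = File.join(Dir.tmpdir, "do_the_scary_thing.lock") # assumed location
ran = with_exclusive_lock(lock_path) { puts "doing the scary thing" }
puts "skipped, previous run still active" unless ran
```

The lock is released when the file is closed, so a crashed run does not leave a stale lock behind.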
SLIDE 40 Queue
- Distributed easily
- Reliable
- Can be inspected
- Add/remove workers
- Makes you think!
- Adds more complexity
SLIDE 41 On github - Projects
- eventmachine/eventmachine
- tmm1/amqp
- macournoyer/thin
- famoseagle/carrot
- celldee/bunny
- ezmobius/nanite
SLIDE 42
Q&A
?
SLIDE 43
Thanks!
Paolo Negri / hungryblank