High-Performance Web Applications in Haskell
Gregory Collins Google Switzerland QCon, London, UK Friday, March 11, 2011
A little about me
My academic background was in type systems and functional programming, mostly in Standard ML.
When Google breaks, we fix it. Google is hiring!
map :: (a -> b) -> [a] -> [b]
map toUpper "hello, world!" == "HELLO, WORLD!"
foldl' (+) 0 [1..10] == 55
Iterator<Foo> it = l1.iterator();
ArrayList<Foo> l2 = new ArrayList<Foo>();
while (it.hasNext()) {
    l2.add(foo(it.next()));
}
l2 = map foo l1
If QuickCheck finds an input which breaks your invariant, it can quite often shrink the test case to find a minimal example.
take :: Int -> [a] -> [a]
∀ l . ∀ n | n >= 0 && length(l) >= n . length(take n l) == n
∀ l . ∀ n . isPrefixOf (take n l) l
myTake n _ | n <= 1 = []   -- "1" should be "0" here
myTake _ [] = []
myTake n (x:xs) = x : myTake (n-1) xs
prop_length (l,n) = length l >= n && n >= 0 ==> length (myTake n l) == n
prop_prefix (l,n) = myTake n l `isPrefixOf` l
> quickCheck prop_length
*** Failed! Falsifiable (after 6 tests and 4 shrinks):
([()],1)
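To see why ([()],1) is the minimal failing input, the length property can be checked exhaustively over small inputs without QuickCheck. This is only a sketch using base: fixedTake, propLength, smallCases and counterexamples are hypothetical names, and the enumeration is a stand-in for QuickCheck's random generation and shrinking, not a reimplementation of it.

```haskell
-- Buggy version from the slide: the guard should test n <= 0.
myTake :: Int -> [a] -> [a]
myTake n _ | n <= 1 = []
myTake _ [] = []
myTake n (x:xs) = x : myTake (n-1) xs

-- Hypothetical corrected version, for comparison.
fixedTake :: Int -> [a] -> [a]
fixedTake n _ | n <= 0 = []
fixedTake _ [] = []
fixedTake n (x:xs) = x : fixedTake (n-1) xs

-- The length property, with the precondition written as an implication.
propLength :: (Int -> [()] -> [()]) -> ([()], Int) -> Bool
propLength f (l, n) =
  not (length l >= n && n >= 0) || length (f n l) == n

-- Small inputs, shortest lists first, then smallest n first.
smallCases :: [([()], Int)]
smallCases = [ (replicate len (), n) | len <- [0 .. 4], n <- [-1 .. 5] ]

-- Every small input on which the property fails for the given function.
counterexamples :: (Int -> [()] -> [()]) -> [([()], Int)]
counterexamples f = filter (not . propLength f) smallCases
```

In GHCi, take 1 (counterexamples myTake) gives [([()],1)], the same input QuickCheck shrank to, while counterexamples fixedTake is empty.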
foreign import ccall unsafe "unistd.h read"
    c_read :: CInt -> Ptr a -> CSize -> IO (CSize)
foreign import ccall unsafe "unistd.h write"
    c_write :: CInt -> Ptr a -> CSize -> IO (CSize)
1. Separate processes (forking or pre-forking). Every connection served by a separate OS process, and no OS process serves more than one request at once. Blocking I/O.
2. OS-level threads. Every connection served by a separate OS thread, one process serves many requests at once. Blocking I/O.
3. Event-driven. Server has one or more “event loops”, each of which runs in a single thread, handling N active connections. Uses OS-level multiplexing (epoll() or kqueue()) to get notifications for sockets which are ready to be read or written to. Non-blocking I/O.
int s = accept(...);
pid_t pid = fork();
if (pid == 0) {
    /* handle request */
} else
    /* ... */
int s = accept(...);
int tid = pthread_create(...);
...
function read(callback) {
    request.addListener("response", function (response) {
        var responseBody = "";
        response.setBodyEncoding("utf8");
        response.addListener("data", function(chunk) {
            responseBody += chunk;
        });
        response.addListener("end", function() {
            callback(responseBody);
        })
    });
    request.close();
}
Per-connection overhead is very low: a little bit of per-connection state, no stack. Idle connections consume very few resources; modern syscalls like epoll() and kqueue() are O(k), where k is the number of active connections and n is the total number of connections, and usually k << n.
Kernel context switch: O(1–4 μs), so at most 250k–1M per second
Processor ring switch from user to kernel mode: O(50 ns)
All of the web server throughput champions (nginx, lighttpd, Cherokee, etc) use this model.
Thread-ring benchmark: 7.3M context switches per second per core (32-bit, Q6600); 4.2M context switches per second on x64. This is possible because scheduling the threads involves no process context switch and no processor mode switch.
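For readers who have not seen it, a thread ring is a circle of threads that pass a counter around until it reaches zero. The following is a minimal sketch using forkIO and MVar from base; it is not the benchmark program behind the numbers above, and worker and threadRing are hypothetical names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM)

-- One ring member: take a token from the inbox, then either report our
-- identifier (token exhausted) or pass the decremented token onwards.
worker :: MVar Int -> Int -> MVar Int -> MVar Int -> IO ()
worker done ident inbox next = loop
  where
    loop = do
      token <- takeMVar inbox
      if token <= 0
        then putMVar done ident
        else putMVar next (token - 1) >> loop

-- Build a ring of ringSize lightweight threads, hand the token to
-- thread 1, and return the identifier of the thread that sees it hit 0.
threadRing :: Int -> Int -> IO Int
threadRing ringSize hops = do
  boxes <- replicateM ringSize newEmptyMVar
  done <- newEmptyMVar
  case boxes of
    [] -> ioError (userError "ringSize must be at least 1")
    first : rest -> do
      -- Each thread's successor is the next inbox; the last wraps around.
      let nexts = rest ++ [first]
      forM_ (zip3 [1 ..] boxes nexts) $ \(ident, inbox, next) ->
        forkIO (worker done ident inbox next)
      putMVar first hops
      takeMVar done
```

Each hand-off blocks one thread and wakes the next, so every hop is one context switch between lightweight threads. With these definitions, threadRing 503 1000 returns 498, since 1000 mod 503 is 497 and thread numbering starts at 1.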
1. schedules an interest in reading on the socket file descriptor with the runtime system
2. waits on a lock which the RTS will twiddle when epoll() says the socket is readable.
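The two steps above are roughly what threadWaitRead from Control.Concurrent exposes in GHC's base library: it registers interest in a file descriptor and blocks only the calling lightweight thread until the descriptor is readable. The sketch below is an illustration under assumptions, not the code a real server uses: it runs on POSIX systems only, uses a pipe in place of a socket, leaves the descriptor in blocking mode (real code would set O_NONBLOCK), and expects a small, non-empty payload. waitThenRead and pipeRoundTrip are hypothetical names, and c_read returns CSsize here because POSIX read returns ssize_t.

```haskell
import Control.Concurrent (threadWaitRead)
import Control.Monad (when)
import Data.Word (Word8)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (allocaArray, peekArray, withArrayLen)
import Foreign.Ptr (Ptr)
import System.Posix.Types (CSsize (..), Fd (..))

foreign import ccall unsafe "unistd.h read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import ccall unsafe "unistd.h write"
  c_write :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import ccall unsafe "unistd.h pipe"
  c_pipe :: Ptr CInt -> IO CInt
foreign import ccall unsafe "unistd.h close"
  c_close :: CInt -> IO CInt

-- Park this lightweight thread until the RTS reports the descriptor
-- readable, then read up to maxBytes bytes from it.
waitThenRead :: CInt -> Int -> IO [Word8]
waitThenRead fd maxBytes = do
  threadWaitRead (Fd fd)
  allocaBytes maxBytes $ \buf -> do
    n <- c_read fd buf (fromIntegral maxBytes)
    when (n < 0) $ ioError (userError "read failed")
    peekArray (fromIntegral n) buf

-- Demo: write a small, non-empty payload into a pipe and read it back.
pipeRoundTrip :: [Word8] -> IO [Word8]
pipeRoundTrip payload = allocaArray 2 $ \fds -> do
  rc <- c_pipe fds
  when (rc /= 0) $ ioError (userError "pipe failed")
  ends <- peekArray 2 fds
  case ends of
    [readEnd, writeEnd] -> do
      _ <- withArrayLen payload $ \len buf ->
        c_write writeEnd buf (fromIntegral len)
      bytes <- waitThenRead readEnd (length payload)
      _ <- c_close readEnd
      _ <- c_close writeEnd
      pure bytes
    _ -> ioError (userError "unexpected pipe result")
```

The calling code reads as ordinary blocking I/O, while only the lightweight thread waits, not an OS thread.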
(PONG benchmark, y axis is requests/second)
(serving a 40kB file, y axis is requests/second)
API servers, compute-heavy workloads, hotspots
It’s close to as fast as C/C++/Java, but much much much nicer to program in.