Graph: composable production systems in Clojure
Jason Wolfe (@w01fe)
Strange Loop ’12

Motivation
• Interesting software has:
  • many components
  • complex web of dependencies
• Developers want:
  • simple, factored code
  • easy testability
  • tools for monitoring and debugging

Graph
• Graph is a simple, declarative way to express system composition
• A Graph is just a map of functions that can depend on previous outputs
• Graphs are easy to create, reason about, test, and build upon

    {:x (fnk [i] ...)
     :y (fnk [j x] ...)
     :z (fnk [x y] ...)}

[Diagram: nodes i, j, x, y, z; input {:i 1 :j 2} gives output {:x 2 :y 5 :z 12}]

Outline
• Prismatic
• Design Goals
• Graph: specs and compilation
• Applications
  • newsfeed generation
  • production services

Prismatic
• Personalized, interest-based newsfeeds
• Build crawlers, topic models, graph analysis, story clustering, ...
• Backend 99.9% Clojure
• Personalized ranked feeds in real-time (~200ms)
getprismatic.com

Prismatic’s production API service
• >100 components
  • storage systems
  • caches & indices
  • ranking algorithms
• Coordinate in intricate dance to serve feeds fast
• Relentlessly refactored
• Still dozens of top-level components in complex dependency network
[Diagram: service dependency graph; node labels include ec2-keys, doc index, index snapshots, feed-builder, top news, handlers, server, SQL, env, observer, log store, update index, pubsub, service-name, logger, service-info; legend: Parameters; Remote Storage; Caches, Indices; Fns, Other; Thread Pools]

The feed builder
• 20+ steps from query to personalized ranking, 20+ parameters
• Not a simple pipeline
• > 10 feed types with slightly different steps, configurations
• Support for early stopping
[Diagram: feed-builder graph from user query to response]

Theme: complexity of composition
• Previous implementations: defns with huge lets
• Unwieldy for large systems with complex or polymorphic dependencies
• Hard to test, debug, and monitor

The ‘monster let’
• Tens of parameters, not compositional
• Mocks/polymorphic flow difficult
• Ad hoc monitoring & shutdown logic per item
• Core issue: structure of (de)composition is locked up in an opaque function

    (defn start [{:keys [a,z]}]
      (let [s1 (store a ...)
            s2 (store b ...)
            db (sql-db c)
            t2 (cron s2 db...)
            ...
            srv (server ...)]
        (fn shutdown []
          (.stop srv)
          ...
          (.flush s1))))

Prismatic software engineering philosophy
• Fine-grained, composable abstractions (FCA)
  • Libraries >> Frameworks
• Strive for simplicity, work with the language
• Graph is an FCA for composition

Goal: declarative
• Declarative specifications fix ‘monster let’
  • Explicitly list components, dependencies
  • Enable abstractions over components, reasoning about composition
• Not new: Pregel, Dryad, Storm, ...

Goal: simple
• Distill this idea to its simplest, most idiomatic expression
  • a Graph spec is just a (Clojure) map
  • no XML files or interface hell
• Graphs are ordinary data
  • manipulate them ‘for free’
  • --> unexpected applications

“It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.” - Alan Perlis

From ‘let’ to Graph

    (defn stats [{:keys [xs]}]
      (let [n  (count xs)
            m  (/ (sum xs) n)
            m2 (/ (sum sq xs) n)
            v  (- m2 (* m m))]
        {:n n :m m :m2 m2 :v v}))

    {:n  (fn [xs] (count xs))
     :m  (fn [xs n] (/ (sum xs) n))
     :m2 (fn [xs n] (/ (sum sq xs) n))
     :v  (fn [m m2] (- m2 (* m m)))}

[Diagram: dependency graph over xs, n, m, m2, v]
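
The two forms on this slide can be made runnable as a rough sketch. The slide does not define `sum` or `sq`, so the helper definitions below are assumptions, and `stats-fns` and `stats-by-hand` are hypothetical names added only to show the wiring that plain `fn`s still need by hand.

```clojure
;; Assumed helpers: the slides use `sum` and `sq` without defining them.
(defn sum
  ([xs] (reduce + xs))
  ([f xs] (reduce + (map f xs))))

(defn sq [x] (* x x))

;; The 'let' version from the slide.
(defn stats [{:keys [xs]}]
  (let [n  (count xs)
        m  (/ (sum xs) n)
        m2 (/ (sum sq xs) n)
        v  (- m2 (* m m))]
    {:n n :m m :m2 m2 :v v}))

;; The same steps split into a map of plain functions.
(def stats-fns
  {:n  (fn [xs] (count xs))
   :m  (fn [xs n] (/ (sum xs) n))
   :m2 (fn [xs n] (/ (sum sq xs) n))
   :v  (fn [m m2] (- m2 (* m m)))})

;; Plain fns carry no information about their argument names,
;; so the dependencies still have to be wired manually.
(defn stats-by-hand [xs]
  (let [n  ((:n stats-fns) xs)
        m  ((:m stats-fns) xs n)
        m2 ((:m2 stats-fns) xs n)
        v  ((:v stats-fns) m m2)]
    {:n n :m m :m2 m2 :v v}))

;; Integer division in Clojure yields exact ratios:
(stats {:xs [1 2 3 6]})   ; => {:n 4, :m 3, :m2 25/2, :v 7/2}
```

Both versions compute the same map; the second only moves the steps into data. What is still missing is a way for each function to declare which keys it needs.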

Bring on the fnk
• fnk = keyword function
• Similar to {:keys []} destructuring
  • nicer optional-argument support
  • asserts that keys exist
  • metadata about args
• Quite useful in itself
• Only macros in Graph

    (defnk foo [x y [s 1]]
      (+ x (* y s)))

    (= 8 (foo {:x 2 :y 3 :s 2}))
    (= 5 (foo {:x 2 :y 3}))
    (thrown? Ex. (foo {:x 2}))
    (= (meta foo)
       {:req-ks #{:x :y}
        :opt-ks #{:s}})
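
For comparison, the behaviour listed on this slide can be approximated in plain Clojure without the `defnk` macro. This is only a sketch of what the macro might expand to, not its actual implementation; the error message and the use of `ex-info` are assumptions.

```clojure
;; A hand-written stand-in for (defnk foo [x y [s 1]] ...).
(def foo
  (with-meta
    (fn [{:keys [x y s] :or {s 1} :as args}]
      ;; Assert that the required keys exist (plain destructuring
      ;; would silently bind them to nil instead).
      (doseq [k [:x :y]]
        (when-not (contains? args k)
          (throw (ex-info (str "missing required key " k) {:missing k}))))
      (+ x (* y s)))
    ;; Metadata about the arguments, as shown on the slide.
    {:req-ks #{:x :y}
     :opt-ks #{:s}}))

(foo {:x 2 :y 3 :s 2})   ; => 8
(foo {:x 2 :y 3})        ; => 5, since s defaults to 1
(meta foo)               ; => {:req-ks #{:x :y}, :opt-ks #{:s}}
```

Writing this out by hand for every function is repetitive, which is presumably why the slide singles out `fnk` as the one place where Graph relies on macros.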

A Graph Specification
• A Graph is just a map from keywords to fnks
• Required keys of each fnk specify graph relationships
• Entire graph specifies a fnk to map of results

    {:n  (fnk [xs] (count xs))
     :m  (fnk [xs n] (/ (sum xs) n))
     :m2 (fnk [xs n] (/ (sum sq xs) n))
     :v  (fnk [m m2] (- m2 (* m m)))}

Example: for input {:xs [1 2 3 6]} the nodes evaluate to :n 4, :m 3, :m2 12.5, :v 3.5.
[Diagram: dependency graph over xs, n, m, m2, v]
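
To illustrate how required keys alone can determine evaluation order, here is a minimal self-contained sketch. `simple-fnk` and `run-graph` are hypothetical names, not the library's API: `simple-fnk` supports required keys only, and `run-graph` is a naive interpreter that repeatedly runs whichever nodes have all their inputs available. The `sum` and `sq` helpers are again assumed.

```clojure
(defn sum
  ([xs] (reduce + xs))
  ([f xs] (reduce + (map f xs))))

(defn sq [x] (* x x))

(defmacro simple-fnk
  "Hypothetical, simplified fnk: a fn of one map that checks its
  required keys and records them as metadata."
  [args & body]
  `(with-meta
     (fn [m#]
       (doseq [k# ~(mapv keyword args)]
         (when-not (contains? m# k#)
           (throw (ex-info (str "missing " k#) {:key k#}))))
       (let [{:keys ~args} m#]
         ~@body))
     {:req-ks ~(set (map keyword args))}))

(def stats-graph
  {:n  (simple-fnk [xs] (count xs))
   :m  (simple-fnk [xs n] (/ (sum xs) n))
   :m2 (simple-fnk [xs n] (/ (sum sq xs) n))
   :v  (simple-fnk [m m2] (- m2 (* m m)))})

(defn run-graph
  "Evaluates every node of graph against input, in dependency order."
  [graph input]
  (loop [result input
         pending graph]
    (if (empty? pending)
      (select-keys result (keys graph))
      (let [ready (filter (fn [[_ f]]
                            (every? #(contains? result %) (:req-ks (meta f))))
                          pending)]
        ;; Nothing runnable means an input is missing or there is a cycle.
        (when (empty? ready)
          (throw (ex-info "missing input or cyclic dependency"
                          {:stuck (set (keys pending))})))
        (recur (reduce (fn [acc [k f]] (assoc acc k (f acc))) result ready)
               (apply dissoc pending (map first ready)))))))

(run-graph stats-graph {:xs [1 2 3 6]})
;; => {:n 4, :m 3, :m2 25/2, :v 7/2}   (exact ratios for 12.5 and 3.5)
```

Because the graph is an ordinary map, the interpreter needs nothing beyond `meta`, `filter`, and `reduce`; no ordering is written down anywhere except in the argument lists.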

Compiling Graphs
• Compile graph to fnk that returns map of outputs
  • error checked
  • can return lazy map
  • can auto-parallelize

    (def g
      {:n  (fnk [xs] ...)
       :m  (fnk [xs n] ...)
       :m2 (fnk [xs n] ...)
       :v  (fnk [m m2] ...)})

    (def stats (compile g))
    (= (stats {:xs [1 2 3 6]})
       {:n 4 :m 3 :m2 12.5 :v 3.5})
    (thrown? (Ex. "missing :xs")
             (stats {:x 1}))

    (def stats (lazy-compile g))
    (= (:m (stats {:xs [1 5]})) 3)

    (def stats (par-compile g))
    (= (:v (stats {:xs [1 5]})) 4)

For {:xs [1 5]} the intermediate values are :n 2, :m 3, :m2 13, so :v is 13 - 9 = 4.
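
One way lazy compilation could work is to wrap each node in a `delay` that forces only the nodes it depends on. The sketch below is an assumption about the idea, not the library's implementation: `node` and `lazy-compile-sketch` are hypothetical names, the graph is assumed to be acyclic, and the result is a map of delays rather than a true lazy map, so values are read with `@`.

```clojure
(defn sum
  ([xs] (reduce + xs))
  ([f xs] (reduce + (map f xs))))

(defn sq [x] (* x x))

(defn node
  "Hypothetical helper: tags a map-taking fn with the keys it requires."
  [req-ks f]
  (with-meta f {:req-ks req-ks}))

(def g
  {:n  (node #{:xs}    (fn [{:keys [xs]}]   (count xs)))
   :m  (node #{:xs :n} (fn [{:keys [xs n]}] (/ (sum xs) n)))
   :m2 (node #{:xs :n} (fn [{:keys [xs n]}] (/ (sum sq xs) n)))
   :v  (node #{:m :m2} (fn [{:keys [m m2]}] (- m2 (* m m))))})

(defn lazy-compile-sketch
  "Returns a fn from an input map to a map of keyword -> delay."
  [graph]
  (fn [input]
    (let [cells  (promise)
          lookup (fn [k]
                   (cond
                     (contains? input k)  (get input k)
                     (contains? @cells k) @(get @cells k)   ; forces the dependency
                     :else (throw (ex-info (str "missing " k) {:key k}))))
          cell   (fn [f]
                   (delay (f (into {} (for [k (:req-ks (meta f))]
                                        [k (lookup k)])))))]
      (deliver cells (into {} (for [[k f] graph] [k (cell f)])))
      @cells)))

(def stats (lazy-compile-sketch g))
(def result (stats {:xs [1 5]}))

@(:m result)               ; => 3
(realized? (:n result))    ; => true, :m needed :n
(realized? (:m2 result))   ; => false, nothing has asked for :m2 yet
@(:v result)               ; => 4
```

Unlike the compiled function on the slide, this sketch reports a missing input only when a node that needs it is forced, not at call time.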