WAP5
Black-box Performance Debugging for Wide-Area Distributed Systems
Patrick Reynolds
reynolds@cs.duke.edu
With: Marcos Aguilera, Amin Vahdat, Janet Wiener, Jeffrey Mogul
http://www.hpl.hp.com/research/project5/
page 2 WAP5 - WWW'06
performance problems in large, wide-area systems
– One path per client request
– Discover timing at each step
that are problematic
– First step in performance debugging
[Figure: Client, Web proxy, Local DHT node, Remote DHT node, DNS, Origin web server]
processing, annotated with timing/delays
[Figure: causal path from a Client through several Proxy nodes to DNS and Origin servers, with delays of 250 ms and 500 ms marked]
– Performance bugs: too much or too little time at any point
– Structure bugs: incorrect ordering or placement of processing or communication
– Structure discovery
– Measure latency for processing and communication
– Unexpected structure or timing
– Do not require source code access
– Allow heterogeneity
– Debugging or optimizing his/her own system
– Inheriting a project or joining a programming team
– Discovery: learning how the system behaves
– Monitoring a running system for unexpected behavior
– Performing regression tests after a change
– Trace capture library
– Causal path analysis
– Visualization
– Coral and CoDeeN
[Figure: pipeline: Trace capture → Packet or socket traces → Reconciliation → Message traces → Causal analysis → Causal paths and timing → Visualization]
– Message linking algorithm
– May be many TCP or UDP packets
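[Figure: host foo.cs.duke.edu runs a Web proxy (pid=2297) and a DHT process (pid=2312); they exchange messages with a Client, a Web server, and a DHT node over ports 1025, 80, 1207, and 8080 and the UNIX socket /tmp/corald…]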
– Message into a process/host can cause messages out
– Calls to foo:8080 are different from calls to foo:53
– Client hosts and ports can be ignored
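The naming rule in these bullets can be pictured with a minimal sketch. It assumes each endpoint is a (host, port) pair and that the set of listening server endpoints is already known (for example, from bind/accept calls in the trace). The function name `node_label` and the label "client" are hypothetical, not taken from WAP5.

```python
def node_label(host, port, server_endpoints):
    """Return the name used for an endpoint when building causal paths."""
    if (host, port) in server_endpoints:
        # Servers keep host and port: foo:8080 and foo:53 are different services.
        return f"{host}:{port}"
    # Client hosts and ephemeral ports carry no structural meaning here,
    # so every client endpoint collapses to one label (an assumption).
    return "client"


# Hypothetical endpoints for illustration only.
servers = {("foo", 8080), ("foo", 53)}
labels = [
    node_label("foo", 8080, servers),
    node_label("foo", 53, servers),
    node_label("bar", 1025, servers),
    node_label("baz", 1207, servers),
]
```

Collapsing client endpoints this way lets requests from many different clients map to the same path pattern.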
interposition
– All three choices: no modifications to applications
– On PlanetLab: sniffing on host only, limited flexibility
– Captures all calls that create, modify, or use a socket
[Figure: capture points relative to program, libc, and kernel for Network sniffing, Host sniffing, and Library interposition]
client pid=5040
  bind(fd=6, addr={15.1.2.3:33250})
  connect(fd=6, addr={16.5.6.7:80})
  send(fd=6, len=10, time=0.592)
  recv(fd=6, len=12, time=2.033)
server pid=8712
  bind(fd=4, addr={16.5.6.7:80})
  accept(lfd=4, addr={15.1.2.3:33250}) = 5
  recv(fd=5, len=10, time=0.852)
  send(fd=5, len=12, time=1.705)
Resulting messages (sender, receiver, send time, receive time):
  client/5040  server/8712  0.592  0.852
  server/8712  client/5040  1.705  2.033
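As a rough illustration of how the two per-process traces on this slide could be reconciled into messages, here is a minimal sketch. It assumes socket calls have already been resolved to local and remote addresses, and that sends and receives on a connection line up one to one. The names `Call`, `Message`, and `reconcile` are hypothetical, and this is not presented as WAP5's actual reconciliation code.

```python
from collections import defaultdict, namedtuple

# One socket call from a per-process trace; addresses are "ip:port" strings.
Call = namedtuple("Call", "proc kind local remote length time")
Message = namedtuple("Message", "src dst send_time recv_time length")


def reconcile(calls):
    """Pair each send with the matching recv on the same connection."""
    sends, recvs = defaultdict(list), defaultdict(list)
    for c in calls:
        if c.kind == "send":
            sends[(c.local, c.remote)].append(c)
        elif c.kind == "recv":
            # Key by (sender address, receiver address) so both sides agree.
            recvs[(c.remote, c.local)].append(c)
    messages = []
    for flow, sent in sends.items():
        received = sorted(recvs.get(flow, []), key=lambda c: c.time)
        for s, r in zip(sorted(sent, key=lambda c: c.time), received):
            if s.length != r.length:
                # Differing buffer boundaries need byte-offset matching,
                # which this sketch does not attempt.
                raise ValueError(f"message boundaries differ on {flow}")
            messages.append(Message(s.proc, r.proc, s.time, r.time, s.length))
    return sorted(messages, key=lambda m: m.send_time)


CLIENT, SERVER = "15.1.2.3:33250", "16.5.6.7:80"
trace = [
    Call("client/5040", "send", CLIENT, SERVER, 10, 0.592),
    Call("server/8712", "recv", SERVER, CLIENT, 10, 0.852),
    Call("server/8712", "send", SERVER, CLIENT, 12, 1.705),
    Call("client/5040", "recv", CLIENT, SERVER, 12, 2.033),
]
messages = reconcile(trace)
```

With the slide's values, `messages` holds one client-to-server message (sent 0.592, received 0.852) and one server-to-client message (sent 1.705, received 2.033).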
– Detect dropped or reordered UDP packets
– Detect differing message (buffer) boundaries
– Could be spontaneous action
– May be ambiguous
[Figure: message linking pipeline: Message traces → Estimate average causal delays → Score possible parents for each message → Link-probability trees → Build and aggregate paths → Causal-path patterns]
messages
– Take smallest delay before each BC message
– Trace-specific upper limit
– Might underestimate D
– $f(t) = \lambda e^{-\lambda t}$
[Figure: smallest delay for BC]
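The delay-estimation bullets above can be sketched as follows. This assumes timestamps of messages arriving at B and of B-to-C messages leaving B are available as plain lists, and that the rate is set as λ = 1/D. That relation, along with the names `estimate_mean_delay` and `delay_density`, is an assumption for illustration rather than something stated on the slide.

```python
import math


def estimate_mean_delay(arrivals_at_b, bc_send_times, upper_limit):
    """Average, over BC messages, of the smallest gap to a prior arrival at B."""
    smallest = []
    for t_out in bc_send_times:
        # Only arrivals before the send and within the trace-specific limit count.
        gaps = [t_out - t_in for t_in in arrivals_at_b
                if 0 <= t_out - t_in <= upper_limit]
        if gaps:
            smallest.append(min(gaps))
    return sum(smallest) / len(smallest) if smallest else None


def delay_density(t, mean_delay):
    """Exponential density f(t) = lambda * exp(-lambda * t), lambda = 1/mean_delay."""
    lam = 1.0 / mean_delay
    return lam * math.exp(-lam * t)


# Hypothetical timestamps, in seconds.
d = estimate_mean_delay([0.0, 1.0, 5.0], [1.2, 5.5], upper_limit=2.0)
```

Taking the smallest gap picks the most recent arrival even when an earlier one was the true cause, which is one way the estimate can come out lower than the real D.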
– Spontaneous action has small probability, not shown
– Links to BD are slightly less likely
Possible parents of BC: ZB 0.64, YB 0.24, XB 0.09
Possible parents of BD: ZB 0.61, YB 0.22, XB 0.08
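One way to turn delays into link probabilities like those on this slide is to weight each candidate parent by an exponential density and normalize. The sketch below does this under stated assumptions: λ = 1/D, a fixed weight for spontaneous action, and made-up delays. It does not attempt to reproduce the slide's numbers, and `parent_probabilities` is a hypothetical name.

```python
import math


def parent_probabilities(candidate_delays, mean_delay, spontaneous_weight=0.0):
    """Score each candidate parent by f(delay) and normalize to probabilities."""
    lam = 1.0 / mean_delay
    weights = {parent: lam * math.exp(-lam * delay)
               for parent, delay in candidate_delays.items()}
    # The spontaneous-action weight takes a share of the total, so the
    # parent probabilities sum to slightly less than 1 when it is nonzero.
    total = sum(weights.values()) + spontaneous_weight
    return {parent: w / total for parent, w in weights.items()}


# Hypothetical delays (seconds) between each candidate parent and the child.
probs = parent_probabilities({"ZB": 0.1, "YB": 0.3, "XB": 0.5},
                             mean_delay=0.2, spontaneous_weight=0.05)
```

The most recent candidate (smallest delay) gets the highest score, matching the ordering ZB > YB > XB shown in the figure.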
[Figure: the same link probabilities regrouped by parent]
ZB: BC 0.64, BD 0.61
YB: BC 0.24, BD 0.22
XB: BC 0.09, BD 0.08
– Use probability to decide whether or not to keep child
– Some links are “try-both” and generate 2 trees
p = 0.8 * 0.9 * (1-0.2) * (1-0.1) * (1-0.48) ≈ 0.270
[Figure: tree rooted at message AB with link probabilities BC 0.8, BD 0.2, BE 0.1, BF 0.48, CG 0.9]
Path A, B, C, G: p = 0.270
Path A, B, C, G, F: p = 0.249
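The arithmetic on this slide can be checked with a short sketch: a tree's probability is the product of p for every kept link and (1 - p) for every dropped link. The assignment of probabilities to links (BC 0.8, BD 0.2, BE 0.1, BF 0.48, CG 0.9) is inferred from the figure labels and the two results, and `tree_probability` is a hypothetical name.

```python
def tree_probability(link_probs, kept):
    """Multiply p for each kept link and (1 - p) for each dropped link."""
    p = 1.0
    for link, prob in link_probs.items():
        p *= prob if link in kept else (1.0 - prob)
    return p


# Link probabilities as read from the slide (assignment to links is inferred).
links = {"BC": 0.8, "BD": 0.2, "BE": 0.1, "BF": 0.48, "CG": 0.9}

# BF (0.48) is the "try-both" link, so two trees are generated.
without_f = tree_probability(links, kept={"BC", "CG"})
with_f = tree_probability(links, kept={"BC", "CG", "BF"})
```

Here `without_f` is about 0.2696 and `with_f` about 0.2488, which round to the slide's 0.270 and 0.249.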
– Combine client names and ports for better aggregation
– Expected number of instances
– Highlights paths that appear many times with high confidence
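Aggregation by expected number of instances can be sketched as a sum of per-instance probabilities for each path pattern. The pattern tuples and probabilities below are made up for illustration, and `expected_instances` is a hypothetical name.

```python
from collections import defaultdict


def expected_instances(path_instances):
    """Sum instance probabilities per pattern and rank patterns by the total."""
    totals = defaultdict(float)
    for pattern, probability in path_instances:
        totals[pattern] += probability
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical path instances: (pattern, probability of that instance).
instances = [
    (("client", "proxy", "origin"), 0.9),
    (("client", "proxy", "origin"), 0.8),
    (("client", "proxy", "dht", "origin"), 0.3),
]
ranked = expected_instances(instances)
```

A pattern seen twice with high confidence (expected count about 1.7 here) ranks above one seen once with low confidence (0.3).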
[Figure: Coral path with Coral processing, Origin server, and Response]
– 0.20 to 4.86 ms in different proxies
3 calls in parallel
– Particularly PlanetLab applications
– We found paths that match authors’ descriptions of systems
– We characterized delays at each step and found outliers
http://www.hpl.hp.com/research/project5/
– Time-varying patterns
– Time-varying delays
– Coalesce similar path patterns
– Our LAN-based work in SOSP 2003
– Magpie: detailed picture per machine by using OS-level instrumentation (Event Tracing for Windows)
– Pinpoint: instrument middleware
– Others instrument applications
– SLIC uses statistical induction to correlate low-level metrics with SLO violations
– Trickle, ModelNet
– Not affected by clock offset
– Correct for clock offset [Paxson98]
[Figure: timeline of a message from A to B and a reply from B to A, with timestamps t1, t2, t3, t4]
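To make the clock-offset points concrete, the sketch below assumes t1 and t4 are read on A's clock (request sent, reply received) and t2 and t3 on B's clock (request received, reply sent). That labeling is a guess from the figure. The offset formula is the common NTP-style estimate, which assumes symmetric network delay; it is not necessarily the correction method of [Paxson98].

```python
def round_trip_network_delay(t1, t2, t3, t4):
    """Network time for request plus reply; each difference uses one clock only."""
    return (t4 - t1) - (t3 - t2)


def estimate_clock_offset(t1, t2, t3, t4):
    """Estimate how far B's clock is ahead of A's, assuming symmetric delays."""
    return ((t2 - t1) - (t4 - t3)) / 2.0


# Hypothetical example: B's clock is 5.0 s ahead and each direction takes 0.1 s.
t1, t2, t3, t4 = 0.0, 5.1, 5.3, 0.4
rtt = round_trip_network_delay(t1, t2, t3, t4)
offset = estimate_clock_offset(t1, t2, t3, t4)
```

The round-trip figure never mixes timestamps from the two clocks, so the offset cancels out. A one-way delay such as t2 - t1 does mix them and needs the offset subtracted.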
– Use existing framework to push out new software
– Restart process to begin/end trace
– Logical message semantics
– Per process, not per machine
– Capture UNIX, TCP, and UDP sockets
– Timestamps combine OS and network latency
– No control packets or fragments