Investigating System Performance for DevOps Using Kernel Tracing - - PowerPoint PPT Presentation
Investigating System Performance for DevOps Using Kernel Tracing - - PowerPoint PPT Presentation
LinuxCon North America 2016 LinuxCon North America 2016 Investigating System Performance for DevOps Using Kernel Tracing jeremie.galarneau@efficios.com @LeGalarneau Presenter Presenter Jrmie Galarneau EfficiOS Inc. Head
LinuxCon North America 2016 2
Presenter Presenter
Jérémie Galarneau EfficiOS Inc.
– Head of Support – http://www.efficios.com
Maintainer of
– Babeltrace – LTTng-Tools
LinuxCon North America 2016 3
Projects Projects
- LTTng-tools
– Session configuration daemon – Network streaming daemon – Command-line Interface
- Babeltrace
– CLI trace reader – CTF-Writer library – CTF reader/writer Python bindings
LinuxCon North America 2016 4
Previous Talks Previous Talks
- Have mostly focused on LTTng tracer internals
– Scalability, – Reducing per-event cost
- Today
– How traces can be collected in the real world, – Tools we have developed to make them useful
LinuxCon North America 2016 5
LTTng: Two Tracers, Two Groups of Users LTTng: Two Tracers, Two Groups of Users
- User space (LTTng-UST)
– Some open source projects are already instrumented (node.js, CoreCLR, QEMU, ...) – Manually instrument applications (requires knowledge of internals) – Often used as a flexible, high-speed logger – Understand interactions between user space and kernel space
- Kernel space (LTTng-modules)
– Out-of-tree kernel modules (no kernel patching required, your distro has packages) – Leverages mainline kernel instrumentation – Only useful to kernel experts... or is it?
LinuxCon North America 2016 6
Do we need tools? Do we need tools?
- Traces are huge
– Recording very fine-grained events
- b
l
- c
k , s c h e d , i r q , n e t , p
- w
e r , s y s c a l l s , . . .
– Manual inspection works, but requires knowledge of
application/kernel internals
– Not always clear what you are looking for
LinuxCon North America 2016 7
Yes, we do! Yes, we do!
- User space developers are building custom tools!
– Know exactly what they are looking for – Modelling their application – Tracking internal resources (worker threads, memory pools, connections, users, etc.)
- Originally text-based tools
– Piping hundreds of GBs of text traces through g
r e p , s e d , p e r l , a w k . . .
– Lots of one-off scripts being passed around – Unmanageable, hard to maintain, etc. – Break when Babeltrace’s text output changes (new event fields)
LinuxCon North America 2016 8
Babeltrace Python Bindings Babeltrace Python Bindings
- Introduced Python bindings to read traces (2013)
– Provide users with an easy way to “hack something together”
- Debugging
- Testing
– Reasonably efficient under most scenarios – Scripts are maintained as internal tools
- Could we do the same for kernel space?
LinuxCon North America 2016 9
LTTng analyses LTTng analyses
- Development started in early 2014
- Collection of utils
- Models some kernel subsystems to track their current state
– Latency statistics and distributions (IO, Scheduling, IRQ) – System call statistics – IRQ handler duration – Top resource users
https://github.com/lttng/lttng-analyses
LinuxCon North America 2016 10
LTTng analyses LTTng analyses
LinuxCon North America 2016 11
LTTng analyses LTTng analyses
LinuxCon North America 2016 12
LTTng analyses LTTng analyses
LinuxCon North America 2016 13
LTTng analyses LTTng analyses
LinuxCon North America 2016 14
LTTng analyses - Trace Compass Integration LTTng analyses - Trace Compass Integration
- Invoke custom analyses
- LAMI 1.0
–
Open Specification
–
JSON based
LinuxCon North America 2016 15
Principles – Tracking Resources Principles – Tracking Resources
s c h e d _ s w i t c h c p u _ i d = 3 n e x t _ t i d = 1 2 3 4 s y s c a l l _ e n t r y _
- p
e n t i d = 1 2 3 4 f i l e n a m e = / e t c / l d . s
- .
c a c h e f l a g s = 5 2 4 2 8 8 m
- d
e = 1 s y s c a l l _ e x i t _
- p
e n t i d = 1 2 3 4 r e t ( f d ) = 2 2 [ . . . ]
LinuxCon North America 2016 16
Principles – Tracking Latencies Principles – Tracking Latencies
read() duration
s y s c a l l _ e n t r y _ r e a d s c h e d _ s w i t c h s c h e d _ s w i t c h s y s c a l l _ e x i t _ r e a d
LinuxCon North America 2016 17
Custom Periods Custom Periods
$ l t t n g
- p
e r i
- d
l
- g
- p
e r i
- d
[ N A M E [ ( P A R E N T ) ] ] : B E G I N _ E X P R [ : E N D _ E X P R ]
- p
e r i
- d
- a
g g r e g a t e P E R I O D S
- p
e r i
- d
- a
g g r e g a t e
- b
y P E R I O D
- p
e r i
- d
- g
r
- u
p
- b
y F I E L D [ . . . ]
- Example for r
e a d ( ) durations
$ l t t n g
- p
e r i
- d
l
- g
- t
- p
- f
r e q
- s
t a t s
- p
e r i
- d
‘ r e a d _ d u r a t i
- n
: $ e v t . $ n a m e = = “ s y s c a l l _ e n t r y _ r e a d ” : $ e v t . $ n a m e = = “ s y s c a l l _ e x i t _ r e a d ” ’
- p
e r i
- d
- c
a p t u r e s ‘ r e a d _ d u r a t i
- n
: r e a d _ s i z e = $ e v t . c
- u
n t ’
LinuxCon North America 2016 18
Custom Periods Custom Periods
$ l t t n g
- p
e r i
- d
l
- g
- p
e r i
- d
‘ j
- b
_ d u r a t i
- n
: $ e v t . $ n a m e = = “ j
- b
_ s t a r t ” : $ e v t . $ n a m e = = “ j
- b
_ e n d ”
- p
e r i
- d
‘ f e t c h ( j
- b
_ l e n ) : $ e v t . $ n a m e = = “ j
- b
_ s t a r t ” : $ e v t . $ n a m e = = “ f e t c h _ e n d ” ’ [ . . . ]
j
- b
_ l e n
j
- b
_ s t a r t p r
- c
e s s i n g _ e n d f e t c h _ e n d j
- b
_ e n d
f e t c h p r
- c
e s s p
- s
t
LinuxCon North America 2016 19
Tackling Some “Real” Problems Tackling Some “Real” Problems
- Long Time to First Byte
- Demos on
–
WordPress
–
MariaDB/MySQL
–
Apache httpd
- The stack doesn’t matter...
as long as it runs on Linux!
LinuxCon North America 2016 20
Anatomy of a Request Anatomy of a Request
- request
–
f d = a c c e p t 4 ( . . . )
–
c l
- s
e ( f d )
LinuxCon North America 2016 21
Anatomy of a Request Anatomy of a Request
- url_rewrite
–
f d = a c c e p t 4 ( . . . )
–
n e w s t a t ( . . . )
LinuxCon North America 2016 22
Anatomy of a Request Anatomy of a Request
- db_queries
–
c
- n
n e c t ( . . . )
–
n e w s t a t ( . . . )
- >
PHP resumes execution
LinuxCon North America 2016 23
Anatomy of a Request Anatomy of a Request
- Coarse breakdown of periods
– The goal is not to benchmark – Want to know in which “phase” the problem happens
- r
e q u e s t ( p a r e n t )
– u
r l _ r e w r i t e
– d
b _ q u e r i e s
– r
e n d e r i n g
– s
e n d _ t
- _
c l i e n t
LinuxCon North America 2016 24
Deploying LTTng Deploying LTTng
Collector Apache httpd MariaDB l t t n g
- r
e l a y d l t t n g
- s
e s s i
- n
d l t t n g
- s
e s s i
- n
d
LTTng session profile
Kernel Traces
LinuxCon North America 2016 25
Case #1 Case #1
- Almost all requests are unacceptably slow
- No connectivity problems between clients and servers
- Capture a trace for a couple of seconds on both nodes
– Stream to my machine for storage and investigation
- No need to plan for storage in production
- Local buffering can be used
– No push-back against the kernel if the network bandwidth is too low
LinuxCon North America 2016 26
Case #1 Case #1
- Extract a log of all requests from the WordPress node’s trace
$ l t t n g
- p
e r i
- d
l
- g
- t
- p
- s
t a t s
- f
r e q
- p
e r i
- d
' r e q u e s t : $ e v t . $ n a m e = = " s y s c a l l _ e x i t _ a c c e p t 4 " & & $ e v t . p r
- c
n a m e = = " a p a c h e 2 " : $ e v t . p i d = = $ b e g i n . $ e v t . p i d & & $ e v t . $ n a m e = = " s y s c a l l _ e n t r y _ c l
- s
e " & & $ e v t . f d = = $ b e g i n . $ e v t . r e t '
LinuxCon North America 2016 27
Not a tracers’ forte Not a tracers’ forte
- Collecting long traces to compute CPU usage is inefficient
- Most monitoring tools would have found this right away
LinuxCon North America 2016 28
What Tracing is Good For What Tracing is Good For
- Most monitoring tools focus on trends and averages
–
High-level view of resource usage
–
Error rates
- Tracers are great to understand
–
Usage problems
- Lots of small r
e a d ( ) , contention on a shared system resource, etc.
- Race conditions
–
Spurious issues
- One slow request out of many million
LinuxCon North America 2016 29
Snapshot Mode Snapshot Mode
- Also known as “flight-recorder” mode
–
Trace to in-memory buffers
–
Configurable buffer sizes
–
Low overhead
- Save only when an “
l t t n g s n a p s h
- t
r e c
- r
d ” is issued
–
Save to disk
–
Stream through the network to the collector node
LinuxCon North America 2016 30
mod_trigger mod_trigger
- Proof-of-concept Apache httpd module inspired by mod_logslow
- Triggers a snapshot on all machines whenever a request takes longer than a set
threshold
- Limitations
– ssh into target machines can take too long and cause us to lose the information we were
looking for (overwritten buffers)
– No rate limiting, which can make things worse...
https://github.com/jgalar/mod_trigger
LinuxCon North America 2016 31
Case #2, #3 and #4 Case #2, #3 and #4
- Loaded the LTTng session profile on both machines
- Launch l
t t n g
- r
e l a y d
- n my machine
– Runs on Linux, BSD and OS X
- Wait until a request fails to meet our QOS-levels
– 1 second per request
LinuxCon North America 2016 32
Case #2 Case #2
- xz is clearly taking-up a lot of CPU time
– Trace Compass allows us to see which process launched it – Logs are being compressed and collected
LinuxCon North America 2016 33
Case #3 Case #3
- Long call to s
e l e c t ( . . . )
- UDP socket created
- c
- n
n e c t ( ) to 8 . 8 . 6 . 6
LinuxCon North America 2016 34
Case #3 Case #3
- a
c c e s s ( “ w p
- c
- n
t e n t / p l u g i n s / h e l l
- .
p h p ” )
LinuxCon North America 2016 35
Case #3 Case #3
- A WordPress plug-in is making DNS
requests to 8.8.6.6 ... which doesn’t exist!
LinuxCon North America 2016 36
Case #4 Case #4
- Limitations
–
IO at the block layer is hard to track
–
We don’t see the core dump file being created (no o p e n ( ) )
–
b l
- c
k _ d i r t y _ b u f f e r happens in the context of crashing m y _ s e r v i c e
–
b l
- c
k _ r q _ i s s u e and b l
- c
k _ r q _ c
- m
p l e t e happen in the context of k w
- r
k e r / : 2
LinuxCon North America 2016 37
Conclusion Conclusion
- Tools make traces useful in the real world
–
Manual inspection works in a very narrow window
–
You don’t need to be a kernel developer to make sense of them
- Almost automated...
–
Use snapshots to collect on failure
–
lttng-analyses + your scripts can be run on the traces to start the investigation
- See https://github.com/jgalar/LinuxConNA2016
LinuxCon North America 2016 38