Investigating System Performance for DevOps Using Kernel Tracing

SLIDE 1

Investigating System Performance for DevOps Using Kernel Tracing

jeremie.galarneau@efficios.com  @LeGalarneau 

LinuxCon North America 2016

SLIDE 2

Presenter

 Jérémie Galarneau  EfficiOS Inc.

– Head of Support – http://www.efficios.com

Maintainer of

– Babeltrace – LTTng-Tools

SLIDE 3

Projects

  • LTTng-tools

– Session configuration daemon
– Network streaming daemon
– Command-line interface

  • Babeltrace

– CLI trace reader
– CTF-Writer library
– CTF reader/writer Python bindings

SLIDE 4

Previous Talks

  • Have mostly focused on LTTng tracer internals

– Scalability, – Reducing per-event cost

  • Today

– How traces can be collected in the real world, – Tools we have developed to make them useful

SLIDE 5

LTTng: Two Tracers, Two Groups of Users

  • User space (LTTng-UST)

– Some open source projects are already instrumented (node.js, CoreCLR, QEMU, ...)
– Manually instrument applications (requires knowledge of internals)
– Often used as a flexible, high-speed logger
– Understand interactions between user space and kernel space

  • Kernel space (LTTng-modules)

– Out-of-tree kernel modules (no kernel patching required, your distro has packages)
– Leverages mainline kernel instrumentation
– Only useful to kernel experts... or is it?

SLIDE 6

Do we need tools?

  • Traces are huge

– Recording very fine-grained events: block, sched, irq, net, power, syscalls, ...

– Manual inspection works, but requires knowledge of application/kernel internals
– Not always clear what you are looking for

SLIDE 7

Yes, we do!

  • User space developers are building custom tools!

– Know exactly what they are looking for
– Modelling their application
– Tracking internal resources (worker threads, memory pools, connections, users, etc.)

  • Originally text-based tools

– Piping hundreds of GBs of text traces through grep, sed, perl, awk...

– Lots of one-off scripts being passed around
– Unmanageable, hard to maintain, etc.
– Break when Babeltrace’s text output changes (new event fields)

SLIDE 8

Babeltrace Python Bindings

  • Introduced Python bindings to read traces (2013)

– Provide users with an easy way to “hack something together”

  • Debugging
  • Testing

– Reasonably efficient under most scenarios – Scripts are maintained as internal tools

  • Could we do the same for kernel space?
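As a sketch of the kind of quick one-off analysis these bindings enable, here is a minimal pure-Python event tally. The event dicts below are stand-ins for what the Babeltrace bindings would yield while iterating a CTF trace; no real trace or `babeltrace` module is required.

```python
from collections import Counter

def count_events(events):
    """Tally occurrences of each event name: a typical first 'hack'."""
    return Counter(evt["name"] for evt in events)

# Stand-in events; a real script would iterate a trace here.
events = [
    {"name": "sched_switch"},
    {"name": "syscall_entry_read"},
    {"name": "syscall_exit_read"},
    {"name": "sched_switch"},
]
print(count_events(events).most_common(1))  # [('sched_switch', 2)]
```

The real bindings add trace opening and timestamp/field access on top of this loop, but the analysis logic stays this simple.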
SLIDE 9

LTTng analyses

  • Development started in early 2014
  • Collection of utils
  • Models some kernel subsystems to track their current state

– Latency statistics and distributions (IO, Scheduling, IRQ)
– System call statistics
– IRQ handler duration
– Top resource users

https://github.com/lttng/lttng-analyses

SLIDE 10

LTTng analyses

SLIDE 11

LTTng analyses

SLIDE 12

LTTng analyses

SLIDE 13

LTTng analyses

SLIDE 14

LTTng analyses – Trace Compass Integration

  • Invoke custom analyses
  • LAMI 1.0

– Open specification
– JSON-based

SLIDE 15

Principles – Tracking Resources

sched_switch        cpu_id=3 next_tid=1234
syscall_entry_open  tid=1234 filename=/etc/ld.so.cache flags=524288 mode=1
syscall_exit_open   tid=1234 ret(fd)=22
[...]
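A minimal sketch of this state-tracking principle: replay scheduling and syscall events to know which thread runs on each CPU and which file descriptor maps to which file. The event dicts mirror the slide's example and stand in for real trace events; this is illustrative, not the lttng-analyses implementation.

```python
def track_state(events):
    cpu_to_tid = {}    # current thread per CPU
    fd_tables = {}     # tid -> {fd: filename}
    pending_open = {}  # tid -> filename awaiting the syscall exit
    for evt in events:
        name = evt["name"]
        if name == "sched_switch":
            cpu_to_tid[evt["cpu_id"]] = evt["next_tid"]
        elif name == "syscall_entry_open":
            pending_open[evt["tid"]] = evt["filename"]
        elif name == "syscall_exit_open":
            tid, fd = evt["tid"], evt["ret"]
            if fd >= 0 and tid in pending_open:
                fd_tables.setdefault(tid, {})[fd] = pending_open.pop(tid)
    return cpu_to_tid, fd_tables

events = [
    {"name": "sched_switch", "cpu_id": 3, "next_tid": 1234},
    {"name": "syscall_entry_open", "tid": 1234,
     "filename": "/etc/ld.so.cache"},
    {"name": "syscall_exit_open", "tid": 1234, "ret": 22},
]
cpus, fds = track_state(events)
print(fds)  # {1234: {22: '/etc/ld.so.cache'}}
```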

SLIDE 16

Principles – Tracking Latencies

read() duration

syscall_entry_read → sched_switch → sched_switch → syscall_exit_read
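The latency principle above can be sketched in a few lines: a read() duration is the time between syscall_entry_read and the matching syscall_exit_read on the same thread, regardless of intervening sched_switch events. Event dicts and timestamps are stand-ins.

```python
def read_durations(events):
    entry_ts = {}  # tid -> timestamp of pending syscall_entry_read
    durations = []
    for evt in events:
        if evt["name"] == "syscall_entry_read":
            entry_ts[evt["tid"]] = evt["ts"]
        elif evt["name"] == "syscall_exit_read":
            start = entry_ts.pop(evt["tid"], None)
            if start is not None:
                durations.append(evt["ts"] - start)
    return durations

events = [
    {"name": "syscall_entry_read", "tid": 42, "ts": 1000},
    {"name": "sched_switch", "tid": 42, "ts": 1100},
    {"name": "sched_switch", "tid": 42, "ts": 1700},
    {"name": "syscall_exit_read", "tid": 42, "ts": 1800},
]
print(read_durations(events))  # [800]
```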

SLIDE 17

Custom Periods

$ lttng-periodlog
    --period '[NAME[(PARENT)]]:BEGIN_EXPR[:END_EXPR]'
    --period-aggregate PERIODS
    --period-aggregate-by PERIOD
    --period-group-by FIELD [...]

  • Example for read() durations

$ lttng-period{log,top,freq,stats}
    --period 'read_duration:$evt.$name=="syscall_entry_read":$evt.$name=="syscall_exit_read"'
    --period-captures 'read_duration:read_size=$evt.count'

SLIDE 18

Custom Periods

$ lttng-periodlog
    --period 'job_duration:$evt.$name=="job_start":$evt.$name=="job_end"'
    --period 'fetch(job_len):$evt.$name=="job_start":$evt.$name=="fetch_end"' [...]

[Timeline diagram: events job_start, fetch_end, processing_end, job_end delimit the
phases fetch, process and post; the job_len field is captured over the period.]
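The nested-period idea on this slide can be restated as a small Python sketch: a "job_duration" period spans job_start..job_end, a child "fetch" period spans job_start..fetch_end, and the job_len field is captured at the period's beginning. The event and field names are the slide's hypothetical application events; this is not the actual period evaluator.

```python
def job_periods(events):
    """Return (period_name, duration, captured_job_len) tuples."""
    periods = []
    job = None
    for evt in events:
        if evt["name"] == "job_start":
            job = {"begin": evt["ts"], "job_len": evt.get("job_len")}
        elif evt["name"] == "fetch_end" and job:
            periods.append(("fetch", evt["ts"] - job["begin"],
                            job["job_len"]))
        elif evt["name"] == "job_end" and job:
            periods.append(("job_duration", evt["ts"] - job["begin"],
                            job["job_len"]))
            job = None
    return periods

events = [
    {"name": "job_start", "ts": 0, "job_len": 4096},
    {"name": "fetch_end", "ts": 30},
    {"name": "processing_end", "ts": 70},
    {"name": "job_end", "ts": 100},
]
print(job_periods(events))
```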

SLIDE 19

Tackling Some “Real” Problems

  • Long Time to First Byte
  • Demos on

WordPress

MariaDB/MySQL

Apache httpd

  • The stack doesn’t matter...

as long as it runs on Linux!

SLIDE 20

Anatomy of a Request

  • request

fd = accept4(...)  →  close(fd)

SLIDE 21

Anatomy of a Request

  • url_rewrite

fd = accept4(...)  →  newstat(...)

SLIDE 22

Anatomy of a Request

  • db_queries

connect(...)  →  newstat(...)  →  PHP resumes execution

SLIDE 23

Anatomy of a Request

  • Coarse breakdown of periods

– The goal is not to benchmark – Want to know in which “phase” the problem happens

  • request (parent)

– url_rewrite
– db_queries
– rendering
– send_to_client
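The coarse breakdown above can be sketched as simple phase attribution: given the timestamp at which each phase of one request begins, compute how long the request spent in each phase. The phase names come from the slide; the timestamps are hypothetical.

```python
def phase_breakdown(marks, end_ts):
    """marks: ordered (phase, begin_ts) pairs; end_ts closes the request."""
    out = {}
    # Each phase runs until the next phase begins (or the request ends).
    for (phase, begin), (_next, next_begin) in zip(
            marks, marks[1:] + [(None, end_ts)]):
        out[phase] = next_begin - begin
    return out

# Hypothetical begin timestamps (ms since request start) for one request.
marks = [("url_rewrite", 0), ("db_queries", 5),
         ("rendering", 45), ("send_to_client", 55)]
print(phase_breakdown(marks, 60))
```

This immediately shows which "phase" dominates a slow request (here, db_queries at 40 ms) without benchmarking anything.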

SLIDE 24

Deploying LTTng

[Deployment diagram: the Apache httpd and MariaDB nodes each run lttng-sessiond,
configured from an LTTng session profile; kernel traces are streamed to
lttng-relayd on the Collector node.]

SLIDE 25

Case #1

  • Almost all requests are unacceptably slow
  • No connectivity problems between clients and servers
  • Capture a trace for a couple of seconds on both nodes

– Stream to my machine for storage and investigation

  • No need to plan for storage in production
  • Local buffering can be used

– No push-back against the kernel if the network bandwidth is too low

SLIDE 26

Case #1

  • Extract a log of all requests from the WordPress node’s trace

$ lttng-period{log,top,stats,freq}
    --period 'request:$evt.$name=="syscall_exit_accept4" && $evt.procname=="apache2"
              :$evt.pid==$begin.$evt.pid && $evt.$name=="syscall_entry_close" && $evt.fd==$begin.$evt.ret'
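The period definition on this slide can be restated as a small Python sketch over stand-in events (not the actual tool's expression evaluator): a "request" begins when apache2 returns from accept4() and ends when the same pid close()s that file descriptor.

```python
def request_durations(events):
    open_reqs = {}  # (pid, fd) -> timestamp of the accept4 return
    durations = []
    for evt in events:
        if (evt["name"] == "syscall_exit_accept4"
                and evt.get("procname") == "apache2"
                and evt["ret"] >= 0):
            # Begin expression: apache2 returns from accept4().
            open_reqs[(evt["pid"], evt["ret"])] = evt["ts"]
        elif evt["name"] == "syscall_entry_close":
            # End expression: same pid closes the accepted fd.
            start = open_reqs.pop((evt["pid"], evt["fd"]), None)
            if start is not None:
                durations.append(evt["ts"] - start)
    return durations

events = [
    {"name": "syscall_exit_accept4", "procname": "apache2",
     "pid": 10, "ret": 7, "ts": 100},
    {"name": "syscall_entry_close", "pid": 10, "fd": 7, "ts": 2600},
]
print(request_durations(events))  # [2500]
```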

SLIDE 27

Not a tracer’s forte

  • Collecting long traces to compute CPU usage is inefficient
  • Most monitoring tools would have found this right away
SLIDE 28

What Tracing is Good For

  • Most monitoring tools focus on trends and averages

High-level view of resource usage

Error rates

  • Tracers are great to understand

Usage problems

  • Lots of small read(), contention on a shared system resource, etc.

  • Race conditions

Spurious issues

  • One slow request out of many million
SLIDE 29

Snapshot Mode

  • Also known as “flight-recorder” mode

Trace to in-memory buffers

Configurable buffer sizes

Low overhead

  • Save only when an “lttng snapshot record” is issued

Save to disk

Stream through the network to the collector node

SLIDE 30

mod_trigger

  • Proof-of-concept Apache httpd module inspired by mod_logslow
  • Triggers a snapshot on all machines whenever a request takes longer than a set threshold

  • Limitations

– ssh into target machines can take too long and cause us to lose the information we were looking for (overwritten buffers)
– No rate limiting, which can make things worse...

https://github.com/jgalar/mod_trigger

SLIDE 31

Cases #2, #3 and #4

  • Loaded the LTTng session profile on both machines
  • Launch lttng-relayd on my machine

– Runs on Linux, BSD and OS X

  • Wait until a request fails to meet our QoS levels

– 1 second per request

SLIDE 32

Case #2

  • xz is clearly taking up a lot of CPU time

– Trace Compass allows us to see which process launched it – Logs are being compressed and collected

SLIDE 33

Case #3

  • Long call to select(...)
  • UDP socket created
  • connect() to 8.8.6.6

SLIDE 34

Case #3

  • access("wp-content/plugins/hello.php")

SLIDE 35

Case #3

  • A WordPress plug-in is making DNS requests to 8.8.6.6... which doesn’t exist!

SLIDE 36

Case #4

  • Limitations

IO at the block layer is hard to track

We don’t see the core dump file being created (no open())

block_dirty_buffer happens in the context of crashing my_service

block_rq_issue and block_rq_complete happen in the context of kworker/:2

SLIDE 37

Conclusion

  • Tools make traces useful in the real world

Manual inspection works in a very narrow window

You don’t need to be a kernel developer to make sense of them

  • Almost automated...

Use snapshots to collect on failure

lttng-analyses + your scripts can be run on the traces to start the investigation

  • See https://github.com/jgalar/LinuxConNA2016
SLIDE 38

Questions?

 lttng.org  lttng-dev@lists.lttng.org  @lttng_project

 www.efficios.com  @efficios