Linux Performance Profiling and Monitoring
Christoph Mitasch & Georg Schönberger
@cmitasch
Thomas-Krenn.AG
_ A server manufacturer in Bavaria, Germany
_ Well-visited knowledge base, Thomas-Krenn Wiki (parts in English)
Agenda
_ Collect Statistics
  _ Sysstat Package
  _ iostat
  _ pidstat
  _ sar, atop
  _ Percona Cacti Template
_ Watch online
  _ top
  _ iotop
  _ iftop
_ Tracing
  _ perf_events
  _ ftrace
  _ perf-tools
  _ Flame graphs
# find / -type f -name statistics
From Brendan Gregg, http://www.brendangregg.com/linuxperf.html
mpstat (part of sysstat)
_ Without Interval/Count → since system startup
_ CPU usage per Core
  _ Including Hyperthreading
_ Check how well usage is balanced
# mpstat -P ALL
Linux 3.13.0-48-generic (X22) 2015-04-14 _x86_64_ (4 CPU)
14:28:21 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
[...]
# lscpu | grep -E 'core|socket'
Thread(s) per core: 2
Core(s) per socket: 2
mpstat
# mpstat -P ALL 1 2
Linux 3.13.0-48-generic (X22) 2015-04-14 _x86_64_ (4 CPU)
15:24:44 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
[...]
15:24:45 1 11,88 0,00 23,76 64,36 0,00 0,00 0,00 0,00 0,00 0,00
[...]
Core 1 is not idle and also deals with %iowait
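The per-core balance check can be approximated from two snapshots of /proc/stat. The following is a minimal Python sketch, not mpstat's actual implementation; the function names and the two sample snapshots are invented for illustration.

```python
def parse_cpu_lines(stat_text):
    """Return {cpu_name: list of jiffy counters} for the per-core 'cpuN' lines."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 ..."; the aggregate line is just "cpu".
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            counters[fields[0]] = [int(v) for v in fields[1:]]
    return counters


def busy_percent(before, after):
    """Percentage of non-idle time per core between two snapshots.

    Field order in /proc/stat: user nice system idle iowait irq softirq steal.
    Only the idle column (index 3) counts as idle, so iowait is "not idle".
    """
    result = {}
    for cpu, new in after.items():
        deltas = [n - o for n, o in zip(new, before[cpu])]
        total = sum(deltas[:8])
        idle = deltas[3]
        result[cpu] = 100.0 * (total - idle) / total if total else 0.0
    return result


# Hypothetical snapshots (values are invented, not taken from the slides).
SNAP_1 = "cpu 0 0 0 0 0 0 0 0\ncpu0 100 0 50 800 50 0 0 0\ncpu1 100 0 50 800 50 0 0 0\n"
SNAP_2 = "cpu 0 0 0 0 0 0 0 0\ncpu0 110 0 55 880 55 0 0 0\ncpu1 112 0 74 800 114 0 0 0\n"

usage = busy_percent(parse_cpu_lines(SNAP_1), parse_cpu_lines(SNAP_2))
print(usage)
```

On a real system the two snapshots would come from reading /proc/stat twice with a pause in between; a large spread between cores hints at unbalanced load.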
vmstat
_ High Level Statistics about
  _ Virtual memory
  _ Swap/Paging
  _ I/O statistics
  _ System interrupts and context switches
  _ CPU statistics
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
[...]
vmstat
_ Memory statistics
  _ buff: Raw disk blocks like filesystem metadata (superblocks, inodes)
  _ cache: Memory used for data -> pages with actual contents
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
[...]
$ free
             total       used       free     shared    buffers     cached
Mem: [...]
-/+ buffers/cache: [...]
Swap: [...]
vmstat
_ Process related fields
  _ r: The number of runnable processes (running or waiting for run time)
  _ If high → indicator for saturation
  _ b: The number of processes in uninterruptible sleep
  _ Mostly waiting for I/O
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
[...]
$ ps -eo ppid,pid,user,stat,pcpu,comm,wchan:32 | grep ext4
[...]
[...] root Ds [...] fio ext4_file_write
[...] root Ds [...] fio ext4_file_write
[...] root Ds [...] fio ext4_file_write
Kernel function process is sleeping on
Processes doing I/O can be in waiting state
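The r and b columns correspond to the procs_running and procs_blocked counters in /proc/stat. A small Python sketch of reading them is shown here; the helper name and the sample text are made up, and details such as whether the reading process itself is counted may differ between vmstat versions.

```python
def run_queue_fields(stat_text):
    """Extract the counters behind vmstat's r and b columns from /proc/stat text."""
    values = {}
    for line in stat_text.splitlines():
        key, _, rest = line.partition(" ")
        if key in ("procs_running", "procs_blocked"):
            values[key] = int(rest)
    return values


# Hypothetical excerpt of /proc/stat (values are invented).
SAMPLE = "ctxt 12345\nprocs_running 3\nprocs_blocked 1\n"

fields = run_queue_fields(SAMPLE)
print("r =", fields["procs_running"], "b =", fields["procs_blocked"])
```

Comparing r against the number of CPUs is one common rule of thumb for spotting saturation; the threshold is a judgment call, not something vmstat reports.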
vmstat plots
https://clusterbuffer.wordpress.com/admin-tools/vmstat-plotter/
Drawing interrupts and context switches
But we are not satisfied with summaries and overviews...
What is PID 9059 doing?
pidstat (part of sysstat)
_ Report statistics for tasks being managed by kernel
_ CPU bound → identify peak activity
$ top -b -n 1 -d 2 -o %CPU | head
[...]
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9059 gschoenb [...] R [...] python
[...]
$ pidstat -p 9059 -u 1 -l
Linux 3.13.0-48-generic (X22) 2015-04-15 _x86_64_ (4 CPU)
[...] UID PID %usr %system %guest %CPU CPU Command
[...] 9059 [...] python ijk-matrix.py -i matrix.in
[...]
Even check command line arguments (“-l”)!
pidstat
_ I/O bound → device report
# mpstat -P ALL 1
[...] CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
[...]
# pidstat -d 1
Linux 3.13.0-48-generic (X22) 2015-04-15 _x86_64_ (4 CPU)
[...] UID PID kB_rd/s kB_wr/s kB_ccwr/s Command
[...] fio
[...] fio
[...] fio
[...] fio
Which process is causing %iowait?
Device report reveals command and I/O
pidstat
_ How much memory is PID 8461 using?
  _ Major faults require I/O operations, good indicator you need more RAM!
# pidstat -r -p 8461 1 3
Linux 3.13.0-49-generic (X22) 2015-04-21 _x86_64_ (4 CPU)
[...] UID PID minflt/s majflt/s VSZ RSS %MEM Command
[...] 8461 [...] firefox
[...]
Average: [...] 8461 [...] firefox
Current used share of physical memory
Minor and major page faults
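The fault counters behind minflt/s and majflt/s are exposed per process in /proc/<pid>/stat. The sketch below, in Python, shows one way to read them; the function names and the sample line are illustrative assumptions, and this is not how pidstat itself is implemented.

```python
def fault_counters(stat_line):
    """Return (minflt, majflt) from one line of /proc/<pid>/stat."""
    # The command name (field 2) sits in parentheses and may contain spaces,
    # so everything is split after the last ')'.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state); minflt is field 10, majflt is field 12.
    minflt = int(after_comm[10 - 3])
    majflt = int(after_comm[12 - 3])
    return minflt, majflt


def fault_rates(old, new, interval_s):
    """Per-second fault rates between two (minflt, majflt) readings."""
    return tuple((n - o) / interval_s for o, n in zip(old, new))


# Hypothetical, shortened stat line (values are invented).
SAMPLE = "8461 (Web Content) S 1 8461 8461 0 -1 4194304 1500 0 12 0"

print(fault_counters(SAMPLE))
print(fault_rates((1500, 12), (1523, 12), 1.0))
```

A majflt rate that stays above zero means pages are being fetched from disk, which matches the slide's hint about needing more RAM.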
iostat (part of sysstat)
_ I/O subsystem statistics
_ CPU or device utilization report
_ Without argument → summary since boot
  _ Skip that with -y option
# iostat
Linux 3.13.0-48-generic (X22) 2015-04-15 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
[...]
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda [...]
iostat
_ CPU util report → %iowait
_ Not really reliable → %iowait is some kind of %idle time
# taskset 1 fio --rw=randwrite [...] &
# iostat -y -c 1 3
[...]
avg-cpu: %user %nice %system %iowait %steal %idle
[...]
# taskset 1 sh -c "while true; do true; done" &
# iostat -y -c 1 3
avg-cpu: %user %nice %system %iowait %steal %idle
[...]
http://www.percona.com/blog/2014/06/03/trust-vmstat-iowait-numbers/
iostat
_ Extended device util report (“-x”) → %util
  _ man iostat → … for devices serving requests serially, for parallel processing (RAID arrays and SSDs), this number does not reflect their performance limits.
_ In theory
_ 94,4% util → 23032 IOPS
_ 99,6% util → 24300 IOPS
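The "in theory" figure follows from assuming that IOPS scale linearly with %util. A short Python sketch of that arithmetic, with a helper name chosen only for illustration:

```python
def expected_iops(iops, util_percent, new_util_percent):
    """IOPS expected at new_util_percent if throughput scaled linearly with %util."""
    return iops * new_util_percent / util_percent


# Figures from the slide: 23032 IOPS at 94.4 % util, extrapolated to 99.6 % util.
theory = expected_iops(23032, 94.4, 99.6)
print(int(theory))
```

If a device delivers far more than this extrapolated value at the higher %util, the linear assumption does not hold, which is the case for devices that process requests in parallel.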
iostat
# iostat -y -d -x 1 3
Linux 3.13.0-48-generic (X22) 2015-04-15 _x86_64_ (4 CPU)
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda [...] 94,40
# iostat -y -d -x 1 3
Linux 3.13.0-48-generic (X22) 2015-04-15 _x86_64_ (4 CPU)
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda [...] 99,60
Only 5% util increase, but IOPS nearly doubled!
https://www.thomas-krenn.com/de/wiki/Linux_Storage_Stack_Diagramm
iostat
_ avgqu-sz: Avg. queue length of requests issued
  _ (delta[time_in_queue] / interval) / 1000.0
  _ time_in_queue: Requests waiting for device, affected by in_flight
_ await: Avg. time requests being served
  _ delta[read_ticks + write_ticks] / delta[read_IOs + write_IOs]
  _ ticks also affected by in_flight
_ Therefore, serving more requests while await is not increasing is a good performance indicator
- Monitoring IO Performance using iostat and pt-diskstats
- Block layer statistics
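The two formulas above can be written down directly. This is a minimal Python sketch under the assumption that the deltas are taken from two readings of the block layer statistics, with time_in_queue and the tick counters in milliseconds; the function and parameter names are illustrative.

```python
def avgqu_sz(delta_time_in_queue_ms, interval_s):
    """Average queue length: (delta[time_in_queue] / interval) / 1000.0."""
    return (delta_time_in_queue_ms / interval_s) / 1000.0


def await_ms(delta_read_ticks, delta_write_ticks, delta_read_ios, delta_write_ios):
    """Average time per request: delta[ticks] / delta[IOs], in milliseconds."""
    ios = delta_read_ios + delta_write_ios
    if ios == 0:
        return 0.0  # no completed requests in this interval
    return (delta_read_ticks + delta_write_ticks) / ios


# Hypothetical deltas over a one second interval (values are invented).
print(avgqu_sz(2500, 1))
print(await_ms(0, 3000, 0, 20000))
```

With these invented numbers, 20000 writes completed in the interval while each one took 0.15 ms on average, the pattern the slide describes as a good sign.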
dstat
_ Combines several classical tools
  _ Prints metrics and uses colors
  _ Has a plugin concept
nicstat
_ Print network device statistics
  _ %Util depends on speed and duplex mode
  _ Sat also takes errors into account
# nicstat -l
Int Loopback Mbit/s Duplex State
[...]
# nicstat -i eth0 1 5
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
[...]
# nicstat -i eth0 -t 1 2
14:57:36 InKB OutKB InSeg OutSeg Reset AttF %ReTX InConn OutConn Drops
TCP [...]
14:57:37 InKB OutKB InSeg OutSeg Reset AttF %ReTX InConn OutConn Drops
TCP [...]
Check if your network is saturated, Drops can be an indicator!
Do you have a history of your system's performance data?
yes no
sar (part of sysstat)
_ It's easy with system activity reporter
  _ sar, sadc, sa1 and sa2, sadf
http://www.brendangregg.com/Perf/linux_observability_sar.png
ksar
_ LC_ALL=POSIX sar -A -f sa1 > ksar.out.txt
Mitigates character encoding and number format problems
atop
_ Sets up a cronjob per default
$ grep start -A 2 /etc/cron.d/atop
# start atop daily at midnight
0 0 * * * root invoke-rc.d atop _cron
$ ls /var/log/atop/atop_201504*
[...]
/var/log/atop/atop_20150427
$ atop -r /var/log/atop/atop_20150427
Toggle t to trigger a new sample or use as stopwatch if interval=0
Percona Cacti Template
_ Percona Linux Monitoring Template for Cacti
_ Generate many graphs easily
https://www.percona.com/doc/percona-monitoring-plugins/1.1/cacti/linux-templates.html
Agenda
_ Collect Statistics
  _ Sysstat Package
  _ iostat
  _ pidstat
  _ sar and sadc
  _ Percona Cacti Template
_ Watch online
  _ top
  _ iotop
  _ iftop
_ Tracing
  _ perf_events
  _ ftrace
  _ perf-tools
  _ Flame graphs
top
_ System summary at beginning
_ Per process metrics afterwards
  _ Default sorted by CPU usage
$ top -b -n 1 | head -15
top - [...] up 3 days, [...], 3 users, load average: 0.13, 0.51, 0.59
Tasks: 668 total, 1 running, 667 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.5%us, 0.3%sy, 0.1%ni, 98.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: [...]
Swap: [...]
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
[...] root [...] ossec-syscheckd
[...] gschoenb [...] top
[...] root [...] openvpn
[...] www-data [...] apache2
[...]
1, 5 and 15 min load average
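The three load averages in top's first line are also available in /proc/loadavg. The Python sketch below parses them and relates them to the CPU count; the helper names and the sample line are assumptions made for illustration.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    one, five, fifteen = (float(v) for v in text.split()[:3])
    return one, five, fifteen


def per_cpu_load(load, cpu_count):
    """Load average divided by the number of CPUs, as a rough saturation hint."""
    return load / cpu_count


# Hypothetical /proc/loadavg content (process counts and last PID are invented).
SAMPLE = "0.13 0.51 0.59 1/668 12345"

print(parse_loadavg(SAMPLE))
print(per_cpu_load(2.0, 4))
```

On a live system the text would come from reading /proc/loadavg; dividing by the CPU count makes values comparable across machines of different size.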
top
_ Memory usage
  _ VIRT: The total size of virtual memory for the process
  _ Also including e.g. shared libraries, not already mapped heap or swap
  _ how much memory the program is able to access at the moment
  _ RES: How many blocks are really allocated and mapped to address space → resident
  _ how much actual physical memory a process is consuming
  _ SHR
  _ how much of the VIRT size is actually sharable
- https://www.linux.com/learn/tutorials/42048-uncover-the-meaning-of-tops-statistics
- http://www.linuxdevcenter.com/pub/a/linux/2006/11/30/linux-out-of-memory.html
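Values comparable to VIRT, RES and SHR can be read from /proc/<pid>/statm, which reports them in pages. A minimal Python sketch, assuming a 4 KiB page size by default; the function name and sample line are invented, and top's own accounting may differ in detail.

```python
def memory_kib(statm_line, page_size=4096):
    """Convert the first three statm fields (size, resident, shared) to KiB."""
    size, resident, shared = (int(v) for v in statm_line.split()[:3])
    kib_per_page = page_size // 1024
    return {
        "VIRT": size * kib_per_page,      # total virtual size
        "RES": resident * kib_per_page,   # pages resident in physical memory
        "SHR": shared * kib_per_page,     # resident pages that are shareable
    }


# Hypothetical /proc/<pid>/statm content (values are invented).
SAMPLE = "1000 250 100 10 0 300 0"

print(memory_kib(SAMPLE))
```

The real page size can be queried with os.sysconf("SC_PAGE_SIZE") instead of relying on the default used here.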
top
_ Can consume resources on its own
_ Toggle f and select fields, e.g. SWAP
_ -u lets you see processes from a user
_ Toggle k to kill a PID
_ Toggle r to renice a PID
_ But
  _ top can miss short-living processes
  _ high %CPU → so what?
  _ Keep an eye on the tracing part
htop
_ „Super advanced“ top
  _ Uses colors, views can be customized
iotop
_ Simple top like I/O monitor
_ Which process is causing I/O
  _ Filtering specific PID is possible
# iotop -b
Total DISK READ: [...] | Total DISK WRITE: [...]
Actual DISK READ: [...] | Actual DISK WRITE: [...]
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
[...] be/4 root [...] fio --rw=randwrite --name=test --filename=test.fio --size=3M --direct=1 --bs=4k
[...] be/4 gschoenb [...] firefox [mozStorage #1]
# iotop -b
Total DISK READ: [...] | Total DISK WRITE: [...]
Actual DISK READ: [...] | Actual DISK WRITE: [...]
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
[...] be/4 root [...] fio --rw=read --name=test --filename=test.fio --size=3M --direct=1 --bs=8k
Show writes, reads and command in realtime
Bandwidth live usage
_ iftop
  _ Per interface usage
_ nethogs
  _ Per process
NetHogs version 0.8.0
PID USER PROGRAM DEV SENT RECEIVED
[...] gschoenb /usr/lib/firefox/firefox eth0 [...] KB/sec
[...] root /usr/bin/ssh eth0 [...] KB/sec
[...] gschoenb evolution eth0 [...] KB/sec
? root unknown TCP [...] KB/sec
TOTAL [...] KB/sec
Agenda
_ Collect Statistics
  _ Sysstat Package
  _ iostat
  _ pidstat
  _ sar and sadc
  _ Percona Cacti Template
_ Watch online
  _ atop
  _ top
  _ iotop
  _ iftop
_ Tracing
  _ perf_events
  _ ftrace
  _ perf-tools
  _ Flame graphs
# whereis tracing
Profiling
_ Create profile about usage characteristics
  _ Count specific samples/events
  _ Count objects
_ Collecting statistics about tracepoints
  _ Lines of kernel code with defined event
_ Next slides focus on system profiling
  _ ftrace
  _ perf_events and perf
ftrace
_ Part of the Linux kernel since 2.6.27 (2008)
_ What is going on inside the kernel
_ Common task is to trace events
_ With ftrace configured, only debugfs is required
# cat /proc/sys/kernel/ftrace_enabled
1
# mount | grep debug
none on /sys/kernel/debug type debugfs (rw)
/sys/kernel/debug/tracing# cat available_tracers
blk mmiotrace function_graph wakeup_rt wakeup function nop
Most widely applicable, traces kernel function calls
ftrace
_ Interact with files in sys
  _ Easier with trace-cmd → interface for sys files
#!/bin/bash
# Trace the kernel functions called while running the command given as arguments.
DEBUGFS=`grep debugfs /proc/mounts | awk '{ print $2; }'`
# Restrict tracing to this shell's PID, select the function tracer, switch tracing on.
echo $$ > $DEBUGFS/tracing/set_ftrace_pid
echo function > $DEBUGFS/tracing/current_tracer
echo 1 > $DEBUGFS/tracing/tracing_on
"$@"
echo 0 > $DEBUGFS/tracing/tracing_on
# less /sys/kernel/debug/tracing/trace
View the recorded trace
perf_events and perf
_ Used to be called performance counters for Linux
_ A lot of updates for kernel 4.1
  _ https://lkml.org/lkml/2015/4/14/264
_ CPU performance counters, tracepoints, kprobes and uprobes
_ Per package with linux-tools-common
# which perf
/usr/bin/perf
# dpkg -S /usr/bin/perf
linux-tools-common: /usr/bin/perf
perf list
_ perf list
  _ Shows supported events
# perf list | wc -l
1779
# perf list | grep Hardware
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
cache-references [Hardware event]
cache-misses [Hardware event]
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
stalled-cycles-backend OR idle-cycles-backend [Hardware event]
ref-cycles [Hardware event]
L1-dcache-loads [Hardware cache event]
L1-dcache-load-misses [Hardware cache event]
L1-dcache-stores [Hardware cache event]
L1-dcache-store-misses [Hardware cache event]
This also includes static tracepoints
Raw CPU counters
_ Each CPU has its own raw counters
  _ They should be documented by the hardware manufacturer
  _ https://download.01.org/perfmon/
_ libpfm4 is a nice way to find raw masks
# perf list | grep rNNN
rNNN [Raw hardware event descriptor]
# git clone git://perfmon2.git.sourceforge.net/gitroot/perfmon2/libpfm4
# cd libpfm4
# make
# cd examples/
# ./showevtinfo | grep LLC | grep MISSES
Name : LLC_MISSES
[...]
# ./check_events LLC_MISSES | grep Codes
Codes : 0x53412e
# perf stat -e r53412e sleep 5
Now we collect last level cache misses with the raw mask
Tracepoints
_ perf also has trace functionalities
  _ Filesystem
  _ Block layer
  _ Syscalls
# perf list | grep -i trace | wc -l
1716
# perf stat -e 'syscalls:sys_enter_mmap' ./helloWorld.out
Hello world!
Performance counter stats for './helloWorld.out':
8 syscalls:sys_enter_mmap
[...] seconds time elapsed
perf stat
_ Get a counter summary
# perf stat python numpy-matrix.py -i matrix.in
Performance counter stats for 'python numpy-matrix.py -i matrix.in':
[...] task-clock (msec) # [...] CPUs utilized
[...] context-switches # [...] K/sec
[...] cpu-migrations # [...] K/sec
[...] page-faults # [...] M/sec
[...] cycles # [...] GHz
[...] stalled-cycles-frontend # [...] frontend cycles idle
[...] stalled-cycles-backend # [...] backend cycles idle
[...] instructions # [...] insns per cycle
[...] branches # [...] M/sec
[...] branch-misses # [...] of all branches
[...] seconds time elapsed
A way to compare performance of different algorithms
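The derived figures in the right-hand column of perf stat are ratios of the raw counters. The Python sketch below recomputes three of them for comparison between runs; the function name and all counter values are hypothetical and not taken from the slide.

```python
def derived_metrics(cycles, instructions, stalled_frontend, branches, branch_misses):
    """Recompute a few of the ratios that perf stat prints next to raw counters."""
    return {
        "insns_per_cycle": instructions / cycles,
        "frontend_idle_pct": 100.0 * stalled_frontend / cycles,
        "branch_miss_pct": 100.0 * branch_misses / branches,
    }


# Invented counter values for one run of a program.
metrics = derived_metrics(
    cycles=2_000_000,
    instructions=3_200_000,
    stalled_frontend=500_000,
    branches=400_000,
    branch_misses=2_000,
)
print(metrics)
```

Comparing these ratios for two implementations of the same algorithm shows whether a speedup comes from fewer instructions or from better use of each cycle.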
perf record
_ Record samples to a file
  _ Can be post-processed with perf report
  _ -a records on all CPUs
  _ -g records call graphs
  _ Install debug symbols
# perf record -a -g sleep 5
[ perf record: Woken up [...] times to write data ]
[ perf record: Captured and wrote [...] MB perf.data ([...] samples) ]
Nice way to record what's currently running on all CPUs
perf report
_ Displays profile of a record
  _ Can be sorted and/or filtered
  _ Shows all samples
# perf report -i perf.data.dd --stdio --showcpuutilization --sort comm,dso
[...]
# Overhead sys usr Command Shared Object
95.00% 95.00% 0.00% dd [kernel.kallsyms]
 |
 |--33.22%-- _aesni_enc1
 |          __ablk_encrypt
 |          ablk_encrypt
 |          crypt_scatterlist
 |          crypt_extent
 |          ecryptfs_encrypt_page
 |          ecryptfs_write_end
 |          generic_file_buffered_write
 |          __generic_file_aio_write
 |          generic_file_aio_write
 |          do_sync_write
 |          vfs_write
 |          sys_write
 |          system_call_fastpath
 |          __GI___libc_write
 |          [...]
 |
 |--9.11%-- _cond_resched
 |          |
 |          |--57.94%-- ext4_dirty_inode
 |          |          __mark_inode_dirty
 |          |          generic_write_end
 |          |          ext4_da_write_end
 |          |          generic_file_buffered_write
Command and shared object
Traced method
dd writes data
perf-tools
_ By Brendan Gregg
  _ https://github.com/brendangregg/perf-tools
  _ Mostly quick hacks, read Warnings!
_ Using perf_events and ftrace
_ Good examples what can be done with perf and ftrace
  _ iosnoop: Shows I/O access for commands, including latency
  _ cachestat: Linux page cache hit/miss statistics
  _ functrace: Count kernel functions matching wildcards
Nice, these are simple bash scripts!
# view flamegraph
Flamegraph
_ Visualization how resources are distributed among code
Powered by @agentzh, http://agentzh.org/misc/slides/yapc-na-2013-flame-graphs.pdf
Flamegraph
# perf record -g dd if=/dev/zero of=test.data count=1 bs=1M
# mv perf.data perf.data.dd
# perf script -i perf.data.dd | ./FlameGraph/stackcollapse-perf.pl > out.dd.folded
# ./FlameGraph/flamegraph.pl out.dd.folded > out.perf.dd.svg
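The folded file in this pipeline holds one line per unique call stack, with frames joined by semicolons and followed by a sample count. The Python sketch below illustrates that folding idea on a tiny hand-made input; it is not a replacement for stackcollapse-perf.pl, and the sample stacks only borrow function names seen in the perf report output.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse sampled call stacks into 'root;...;leaf count' lines."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical samples, each listed from the outermost to the innermost frame.
SAMPLES = [
    ["dd", "sys_write", "vfs_write", "ext4_dirty_inode"],
    ["dd", "sys_write", "vfs_write", "_aesni_enc1"],
    ["dd", "sys_write", "vfs_write", "ext4_dirty_inode"],
]

folded = fold_stacks(SAMPLES)
for line in folded:
    print(line)
```

The width of each box in the resulting flame graph is proportional to these counts, so stacks that were sampled more often appear wider.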
Thanks for your attention!
_ cmitasch AT thomas-krenn.com
_ @cmitasch