Hardware-assisted software tracing



slide-1
SLIDE 1

Hardware-assisted software tracing

Adrien Vergé

adrienverge@gmail.com
slide-2
SLIDE 2

talk about tracing

slide-3
SLIDE 3

improve tracing using hardware

slide-4
SLIDE 4

1. Tracing
2. Hardware
3. Improvements

slide-5
SLIDE 5

1. Tracing

slide-6
SLIDE 6

“a technique used to understand what is going on in a system in order to debug or monitor it”

slide-7
SLIDE 7

recording events

from the kernel: IRQ handlers, system calls, scheduling activity, network activity, etc.

in user-space: tracepoints inside your application

slide-8
SLIDE 8

Why is my software crashing?

Where are the bottlenecks?

How to improve performance?
- use less resources
- run faster
- save battery

slide-9
SLIDE 9

example

a process spawns 2 threads: #1 produces chunks of data that #2 consumes

[Diagram: timeline of the process and its two threads (thread 1, thread 2) along a time axis]
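The producer/consumer scenario above can be reproduced in a few lines. The following is only a minimal sketch, not the program used in the talk: the `record` helper, the event names and the chunk size are assumptions made for illustration. Each thread appends timestamped events to an in-memory list, which is the kind of data a tracer collects for later analysis.

```python
import queue
import threading
import time

events = []                       # in-memory trace: (timestamp_ns, thread, event)
events_lock = threading.Lock()


def record(thread_name, event):
    """Hypothetical helper: append one timestamped event to the trace."""
    with events_lock:
        events.append((time.monotonic_ns(), thread_name, event))


def producer(chunks, n):
    for i in range(n):
        record("thread 1", "produce_begin")
        chunk = bytes([i % 256]) * 1024      # stand-in for real work
        record("thread 1", "produce_end")
        chunks.put(chunk)                    # blocks while the queue is full
    chunks.put(None)                         # sentinel: no more data


def consumer(chunks, consumed):
    while True:
        record("thread 2", "wait_begin")
        chunk = chunks.get()                 # blocks while the queue is empty
        record("thread 2", "wait_end")
        if chunk is None:
            break
        consumed.append(len(chunk))


def run(n=5):
    events.clear()
    chunks = queue.Queue(maxsize=1)
    consumed = []
    t1 = threading.Thread(target=producer, args=(chunks, n))
    t2 = threading.Thread(target=consumer, args=(chunks, consumed))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return consumed
```

Comparing the `wait_begin`/`wait_end` timestamps of thread 2 with the `produce_*` timestamps of thread 1 shows which thread is waiting on the other.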

slide-10
SLIDE 10

example: LTTng + TMF

[Screenshot: TMF timeline from 12:40:48.500 to 12:40:48.700. Process "bottleneck" with TIDs 26242, 26243, 26244 (PTIDs 26226, 26242, 26242). Resources: CPU 0 to CPU 7, IRQ 43, 44, 46, SOFT_IRQ 1, 4, 7, 9. States: WAIT_BLOCKED, WAIT_FOR_CPU, USERMODE, SYSCALL, INTERRUPTED]

slide-11
SLIDE 11

example: LTTng + TMF

[Screenshot: TMF timeline from 12:40:48.500 to 12:40:48.700. Process "bottleneck" with TIDs 26242, 26243, 26244 (PTIDs 26226, 26242, 26242). States: WAIT_BLOCKED, WAIT_FOR_CPU, USERMODE, SYSCALL, INTERRUPTED. Labeled system call: execve]

slide-12
SLIDE 12

example: LTTng + TMF

[Screenshot: TMF timeline from 12:40:48.500 to 12:40:48.700. Process "bottleneck" with TIDs 26242, 26243, 26244 (PTIDs 26226, 26242, 26242). States: WAIT_BLOCKED, WAIT_FOR_CPU, USERMODE, SYSCALL, INTERRUPTED. Labeled system calls: read, exec]

slide-13
SLIDE 13

example: LTTng + TMF

[Screenshot: TMF timeline from 12:40:48.500 to 12:40:48.700. Process "bottleneck" with TIDs 26242, 26243, 26244 (PTIDs 26226, 26242, 26242). States: WAIT_BLOCKED, WAIT_FOR_CPU, USERMODE, SYSCALL, INTERRUPTED. Labeled system calls: read, write, exec]

slide-14
SLIDE 14

tracing:

recording events for use in further analysis

slide-15
SLIDE 15

tracing:

recording events for use in further analysis

So it's just logging?

slide-16
SLIDE 16

tracing vs. logging

- compact binary trace format
- buffering: avoid disk IO
- lockless algorithms
- low-level optimizations

result: ~2 µs vs. ~2 ns/event
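Two of the points above, the compact binary format and buffering, can be illustrated with a small sketch. It is not LTTng's actual trace format: the record layout, the `TraceBuffer` class and the `decode` helper are assumptions chosen for illustration. Events are packed as fixed-size binary records into a preallocated buffer and only leave it when flushed, so the hot path does no text formatting and no disk IO.

```python
import struct

# Assumed record layout (illustrative only): 8-byte timestamp in ns,
# 2-byte event id, 4-byte payload, little-endian, no padding.
RECORD = struct.Struct("<QHI")


class TraceBuffer:
    """Preallocated buffer; events are packed in place, never formatted as text."""

    def __init__(self, capacity_bytes):
        self.buf = bytearray(capacity_bytes)
        self.offset = 0
        self.lost = 0

    def trace(self, timestamp_ns, event_id, payload):
        if self.offset + RECORD.size > len(self.buf):
            self.lost += 1                    # buffer full: count the lost event
            return False
        RECORD.pack_into(self.buf, self.offset, timestamp_ns, event_id, payload)
        self.offset += RECORD.size
        return True

    def flush(self):
        """Hand the recorded bytes over (e.g. to a consumer) and reset the buffer."""
        data = bytes(self.buf[:self.offset])
        self.offset = 0
        return data


def decode(data):
    """Offline decoding: turn the binary records back into tuples."""
    return [RECORD.unpack_from(data, i) for i in range(0, len(data), RECORD.size)]
```

Each record takes 14 bytes here, and decoding is deferred to analysis time instead of being paid at every event.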

slide-17
SLIDE 17

tracing users

heavy workload servers, real-time, intrusion detection

Google, IBM, Autodesk, CAE, OPAL-RT

slide-18
SLIDE 18

tracing users

heavy workload servers, embedded systems, real-time, intrusion detection

Google, IBM, Ericsson, Freescale, Montavista, Nokia, Siemens, STMicroelectronics, Wind River, Autodesk, CAE, OPAL-RT

slide-19
SLIDE 19

tracing users

heavy workload servers, embedded systems, real-time, intrusion detection

YOU!

Google, IBM, Ericsson, Freescale, Montavista, Nokia, Siemens, STMicroelectronics, Wind River, Autodesk, CAE, OPAL-RT

slide-20
SLIDE 20

Beyond Heisenberg: observe without altering

needs
- perform: light (size) and fast (time)
- don't pollute memory space
- thousands of events/s

slide-21
SLIDE 21

2. Hardware

slide-22
SLIDE 22

Microchips are no longer just CPUs

credit: ARM

slide-23
SLIDE 23

lots of tracing units

Intel (x86): BTS, LBR, PT...
Freescale (PowerPC): Nexus Program Trace, Data Acquisition...
ARM CoreSight: ETM, ETB, STM...

slide-24
SLIDE 24

lots of tracing units

STM (event tracing)
ETM (execution tracing)
BTS (execution tracing)

slide-25
SLIDE 25

lots of tracing units

supported by (probably good) proprietary software

slide-26
SLIDE 26

widely spread

Do you have one of these?

credit: Samsung, tabletolic.com, player.de, digitaltrends.com

slide-27
SLIDE 27

widely spread

Is your Intel CPU newer than this one?

credit: Intel

slide-28
SLIDE 28

3. Improvements

slide-29
SLIDE 29

3. Improvements

1/3: STM on ARM

slide-30
SLIDE 30

System Trace Module (STM)

Goal: help software recording events

slide-31
SLIDE 31

System Trace Module (STM)

Provides dedicated resources: bus, buffer, timestamping

Need to instrument software

slide-32
SLIDE 32

System Trace Module (STM)

[Diagram: system-on-chip with CPU, ETM, STM, ETB, system bus, and timestamping]

slide-33
SLIDE 33

implementation

“LTTng equivalent”

The traced process is instrumented: calling tracepoint() writes to the STM. Embedding payload is possible.

A consumer process retrieves generated traces and stores them.

slide-34
SLIDE 34

implementation

Traces are encoded in STP: optimized, compact but proprietary format.

slide-35
SLIDE 35

results

[Chart: time per iteration (µs), scale 5 to 20, for "only tracepoints" and "computation + tracepoints"; series: no tracing, LTTng-UST, STM + ETB]

indicative benchmark: overhead mostly depends on the traced application!

slide-36
SLIDE 36

results

[Chart: time per iteration (µs), scale 5 to 20, for "only tracepoints" and "computation + tracepoints"; series: no tracing, LTTng-UST, STM + ETB]

slide-37
SLIDE 37

3. Improvements

2/3: ETM on ARM

slide-38
SLIDE 38

Embedded Trace Macrocell (ETM)

Goal: trace execution

slide-39
SLIDE 39

Embedded Trace Macrocell (ETM)

Goal: trace execution, i.e. save every executed instruction address

slide-40
SLIDE 40

Embedded Trace Macrocell (ETM)

Provides dedicated resources: address comparators, buffer, timestamping

slide-41
SLIDE 41

Embedded Trace Macrocell (ETM)

Provides dedicated resources: address comparators, buffer, timestamping

Can focus on a specific process or function, triggers upon custom conditions

slide-42
SLIDE 42

Embedded Trace Macrocell (ETM)

Provides dedicated resources: address comparators, buffer, timestamping

No need to instrument software

Can focus on a specific process or function, triggers upon custom conditions

slide-43
SLIDE 43

Embedded Trace Macrocell (ETM)

[Diagram: system-on-chip with CPU, ETM, STM, ETB, system bus, and timestamping]

slide-44
SLIDE 44

implementation

ETM not meant to trace events

slide-45
SLIDE 45

implementation

ETM not meant to trace events

Idea: do execution tracing on event addresses

set address comparators to trigger in [event, event + 4]
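The idea of restricting execution tracing to small address windows can be modeled in software. This is only a sketch of the filtering logic, not ETM programming code: the `AddressComparators` class, the `trace` function and the default of 4 available ranges are assumptions for illustration. Only executed addresses that fall inside a watched window [event, event + 4] end up in the trace.

```python
class AddressComparators:
    """Software model of a small set of address-range comparators.

    The number of available ranges (4 by default) is an assumption; real
    hardware has a fixed, limited number of comparators.
    """

    def __init__(self, max_ranges=4):
        self.max_ranges = max_ranges
        self.ranges = []

    def watch_event(self, event_addr):
        """Reserve one range covering [event_addr, event_addr + 4]."""
        if len(self.ranges) >= self.max_ranges:
            raise RuntimeError("no free address comparator")
        self.ranges.append((event_addr, event_addr + 4))

    def matches(self, pc):
        return any(lo <= pc <= hi for lo, hi in self.ranges)


def trace(executed_addresses, comparators):
    """Keep only the executed addresses that hit a watched window."""
    return [pc for pc in executed_addresses if comparators.matches(pc)]
```

Because every watched event consumes one range, the number of events that can be traced this way is bounded by the number of comparators.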

slide-46
SLIDE 46

implementation

needed to write kernel support for process and function tracing

slide-47
SLIDE 47

results

[Chart: time per iteration (µs), scale 5 to 50, for "only computation" and "more computation"; series: no tracing, LTTng-UST, ETM + ETB; annotation: EVENT LOSS]

slide-48
SLIDE 48

3. Improvements

3/3: BTS on x86

slide-49
SLIDE 49

Branch Trace Store (BTS)

Goal: trace execution

slide-50
SLIDE 50

Branch Trace Store (BTS)

[Diagram: x86 host; the BTS unit of the CPU writes branch records to RAM, e.g. 4015a8, 7f2aac77e024, 7f2aac77e012, 40ef26, 4015b0, 4015b4]

slide-51
SLIDE 51

Branch Trace Store (BTS)

does not provide dedicated buffers

cannot focus on a specific process or function: traces every branch!

slide-52
SLIDE 52

$ perf record -e branches:u -c 1 -d ./myprogram
$ perf script -f time,ip,addr
101918.272364: ffffffff814a6f2c => 7f8d7b9b3180
101918.272364: ffffffff814a6f2c => 7f8d7b9b3180
101918.272364: 7f8d7b9b3183 => 7f8d7b9b6730
101918.272364: ffffffff814a6f2c => 7f8d7b9b6730
101918.272364: ffffffff814a6f2c => 7f8d7b9b674f
101918.272364: ffffffff814a6f2c => 7f8d7b9b6756
101918.272364: 7f8d7b9b67c2 => 7f8d7b9b67df
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67ef => 7f8d7b9b6a30
101918.272364: 7f8d7b9b6a38 => 7f8d7b9b6a58
101918.272364: 7f8d7b9b6a62 => 7f8d7b9b6bc0
101918.272364: 7f8d7b9b6bd7 => 7f8d7b9b67d3
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
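Branch records in the "time: from => to" form shown above are easy to post-process. The sketch below assumes exactly that three-field text layout (output from other perf versions or field selections may differ); `parse_branches`, `hottest_branches` and `SAMPLE` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Matches lines such as "101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8".
LINE = re.compile(r"^\s*(\d+\.\d+):\s+([0-9a-f]+)\s+=>\s+([0-9a-f]+)\s*$")


def parse_branches(text):
    """Return a list of (timestamp, from_address, to_address) tuples."""
    records = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:                                 # silently skip anything else
            records.append((float(m.group(1)),
                            int(m.group(2), 16),
                            int(m.group(3), 16)))
    return records


def hottest_branches(records, n=3):
    """Most frequently taken (from, to) pairs, e.g. to spot tight loops."""
    return Counter((src, dst) for _, src, dst in records).most_common(n)


SAMPLE = """\
101918.272364: 7f8d7b9b67c2 => 7f8d7b9b67df
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
101918.272364: 7f8d7b9b67e3 => 7f8d7b9b67c8
"""
```

On the sample, the repeated jump from 7f8d7b9b67e3 back to 7f8d7b9b67c8 comes out as the most frequent branch, which is what a loop looks like in such a trace.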

slide-53
SLIDE 53

“Is hardware-assisted branch tracing faster than pure-software event tracing?”

BTS not meant to trace events

if enabled, traces every branch

slide-54
SLIDE 54

implementation

hardware-traced with BTS: simple program, every branch recorded

software-traced with LTTng: same program, add a tracepoint() at every branch

slide-55
SLIDE 55

results

[Chart: time per iteration (µs), scale 0.1 to 0.8, against program branching rate (branch/s); series: no tracing, LTTng-UST, BTS with perf]

slide-56
SLIDE 56

original perf

[Diagram: one 64K buffer and one 512K buffer per core (up to core 7), then user-space, a system buffer, and the disk]

- BTS writes trace to a dedicated buffer
- trace is copied to a bigger memory zone upon buffer full or context switch
- user stores trace to disk using the write system call
- possible copy in another buffer because no O_SYNC flag

slide-57
SLIDE 57

new “spliced” perf

[Diagram: one 64K buffer per core (up to core 7), a buffer of 512K × number of cores, and the disk]

- BTS writes trace to a dedicated buffer
- upon buffer full or context switch, move to the next sub-buffer
- filled sub-buffers are labeled to be written to disk later
- writing is done by a kernel task in user context
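The sub-buffer scheme described above can be modeled with a short sketch. It is a toy model in user-space Python, not the kernel patch: the `SubBufferRing` class, its method names and the behavior when no sub-buffer is free (the record is counted as lost) are assumptions made for illustration.

```python
from collections import deque


class SubBufferRing:
    """Toy model: the writer fills one sub-buffer at a time; filled ones are
    queued and written out later by a separate flusher."""

    def __init__(self, count, size):
        self.size = size
        self.free = deque([] for _ in range(count))   # empty sub-buffers
        self.ready = deque()                          # filled, waiting for the flusher
        self.current = self.free.popleft()
        self.lost = 0

    def switch(self):
        """On 'buffer full' or 'context switch': move to the next sub-buffer."""
        if not self.current:
            return True                 # nothing recorded yet, keep using it
        if not self.free:
            return False                # flusher is late: no sub-buffer available
        self.ready.append(self.current)
        self.current = self.free.popleft()
        return True

    def write(self, record):
        if len(self.current) >= self.size and not self.switch():
            self.lost += 1              # assumption: drop the record and count it
            return False
        self.current.append(record)
        return True

    def flush_one(self, sink):
        """Flusher side: write one filled sub-buffer to the sink, then recycle it."""
        if not self.ready:
            return False
        sub = self.ready.popleft()
        sink.extend(sub)
        sub.clear()
        self.free.append(sub)
        return True
```

The writer never copies data: it only changes which sub-buffer it fills, while the copy to the sink happens later and outside the recording path.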

slide-58
SLIDE 58

results

[Chart: time per iteration (µs), scale 0.1 to 0.8, against program branching rate (branch/s); series: no tracing, LTTng-UST, BTS with perf, BTS with "spliced" perf]

slide-59
SLIDE 59

Results

STM: -75% overhead compared to LTTng-UST; needs post-decoding

ETM: -3% to 5% overhead; limited number of tracepoints; no payload

BTS: not suited for event tracing (not flexible); compared to vanilla perf, 2× faster

slide-60
SLIDE 60

other hardware

Freescale: Data Acquisition, Program Trace

Intel: Processor Trace

slide-61
SLIDE 61

last words

slide-62
SLIDE 62

tracing helps you build efficient software

slide-63
SLIDE 63

using LTTng: very low footprint

Cortex-A9: ~5 µs/event
Core i7: ~2 ns/event

slide-64
SLIDE 64

using hardware: almost zero footprint

trace in production!

slide-65
SLIDE 65

Links

LTTng and TMF: https://lttng.org/

STM libraries: https://github.com/adrienverge/libcoresight-omap443

ETM patch: https://lkml.org/lkml/214/1/3/259

BTS patch: https://github.com/adrienverge/linux/tree/patch_perf_bts_splice

slide-66
SLIDE 66

Thank you

Questions?