perf scripts jiri olsa

SLIDE 1

PERF SCRIPTS | JIRI OLSA

perf scripts, Jiri Olsa

SLIDE 2

HI

  • basics
  • perf in python
  • post process scripts
SLIDE 3

COUNTING

  • perf stat

[Diagram: WORKLOAD running on CPU 0, CPU 1, CPU 2, counted from start to stop]

$ perf stat -e 'cycles,instructions' WORKLOAD

 Performance counter stats for 'find ..':

       104,142,555      cycles
        64,785,445      instructions
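The two counters above can also be collected from a script. The following is a minimal sketch, assuming `perf stat` is run with its `-x,` option (CSV output, written to stderr) and that each data line begins with the counter value, the unit, and the event name. The helper names `parse_perf_stat_csv` and `run_perf_stat` are illustrative and not part of perf, and the sample text is built from the numbers on the slide rather than captured from a real run.

```python
import subprocess


def parse_perf_stat_csv(text):
    """Parse `perf stat -x,` output into {event_name: count}.

    Assumes each data line starts with: value,unit,event,...
    Lines whose value is not an integer (e.g. "<not counted>") are skipped.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.strip().split(",")
        if len(fields) < 3 or line.startswith("#"):
            continue
        try:
            counts[fields[2]] = int(fields[0])
        except ValueError:
            continue  # "<not counted>", "<not supported>", or a non-integer value
    return counts


def run_perf_stat(events, workload):
    """Run the workload under perf stat and return the parsed counters."""
    cmd = ["perf", "stat", "-x", ",", "-e", events, "--"] + list(workload)
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    return parse_perf_stat_csv(result.stderr)


if __name__ == "__main__":
    # Illustrative input only, not real perf output.
    sample = "104142555,,cycles,1000,100.00,,\n64785445,,instructions,1000,100.00,,\n"
    print(parse_perf_stat_csv(sample))
```

On a machine with perf installed, `run_perf_stat("cycles,instructions", ["find", ".."])` would be the scripted counterpart of the command on the slide.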

SLIDE 4

SAMPLING

  • perf record

[Diagram: WORKLOAD running on CPU 0, CPU 1, CPU 2 between start and stop; each SAMPLE (ID, PID, CPU, ADDRESS, CALLCHAIN, BRANCHES, MEMORY, TRACEPOINT) is stored in perf.data]

$ perf record -e 'cycles' WORKLOAD
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.048 MB perf.data
SLIDE 5

PERF INTERFACE IN NUTSHELL

$ perf stat -e 'cycles' WORKLOAD

 Performance counter stats for 'find ..':

       104,142,555      cycles

SLIDE 14

PERF INTERFACE IN NUTSHELL

[Diagram: user/kernel split. Labels: EVENT; TASK 1, CPU 0, CPU 1, CGROUP; SYS_PERF_EVENT_OPEN, SYS_READ, SYS_MMAP; sample fields ID, PID, CPU, ADDRESS, CALLCHAIN, BRANCHES, MEMORY, TRACEPOINT; SAMPLE records stored in perf.data]

$ perf stat -e 'cycles' WORKLOAD

 Performance counter stats for 'find ..':

       104,142,555      cycles

$ perf record -e 'cycles' WORKLOAD
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.048 MB

SLIDE 15

PERF SCRIPTS

  • 2 areas of script support
  • use perf in python scripts
  • post process perf data via python/perl

SLIDE 16

PYTHON SCRIPTS

  • use perf in python scripts
  • perf module

SLIDE 19

PYTHON SCRIPTS

[Diagram: user/kernel split. Labels: EVENT; TASK 1, CPU 0, CPU 1, CGROUP; SYS_PERF_EVENT_OPEN, SYS_READ, SYS_MMAP; sample fields ID, PID, CPU, ADDRESS, CALLCHAIN, BRANCHES, MEMORY, TRACEPOINT]

    #!/usr/bin/python
    import perf

    def main():
        cpus = perf.cpu_map()
        threads = perf.thread_map()
        evsel = perf.evsel(task = 1, wakeup_eve..
                           sample_id_
        evsel.open(cpus = cpus, threads ..)

        while True:
            evlist.poll(timeout = -1)
            for cpu in cpus:
                event = evlist.read_on_cpu(cpu)
                if not event:
                    continue
                print event

    while True:
        print "nobody likes python anyway.."

SLIDE 20

PYTHON SCRIPTS

[Diagram: user/kernel split. Labels: EVENT; TASK 1, CPU 0, CPU 1, CGROUP; SYS_PERF_EVENT_OPEN, SYS_READ, SYS_MMAP; sample fields ID, PID, CPU, ADDRESS, CALLCHAIN, BRANCHES, MEMORY, TRACEPOINT]

    #!/usr/bin/python
    import perf

    def main():
        cpus = perf.cpu_map()
        threads = perf.thread_map()
        evsel = perf.evsel(task = 1, wakeup_eve..
                           sample_id_
        evsel.open(cpus = cpus, threads ..)

        while True:
            evlist.poll(timeout = -1)
            for cpu in cpus:
                event = evlist.read_on_cpu(cpu)
                if not event:
                    continue
                print event

    while True:
        print "every1 likes python anyway.."

perf.so python extension

# yum install python-perf

SLIDE 21

PYTHON PERF MODULE

  • only sampling interface atm
  • quite simple one:

    cpu_map
    thread_map
    evsel (open)
    evlist (open, mmap, poll, add, read_on_cpu)
    (mmap|task|comm|lost|read|sample|throttle)_event

SLIDE 25

PYTHON PERF MODULE

    import perf

    def main():
        # CPU and thread maps, built with their default arguments.
        cpus = perf.cpu_map()
        threads = perf.thread_map()

        # Event that reports task and comm records, waking the reader per event.
        evsel = perf.evsel(task = 1, comm = 1, mmap = 0,
                           wakeup_events = 1, watermark = 1,
                           sample_id_all = 1,
                           sample_type = perf.SAMPLE_PERIOD | perf.SAMPLE_TID | perf.SAMPLE_CPU)
        evsel.open(cpus = cpus, threads = threads)

        # Add the event to an event list and mmap its buffers.
        evlist = perf.evlist(cpus, threads)
        evlist.add(evsel)
        evlist.mmap()

        # Block until data arrives, then read one event from each CPU's buffer.
        while True:
            evlist.poll(timeout = -1)
            for cpu in cpus:
                event = evlist.read_on_cpu(cpu)
                if not event:
                    continue
                print(event)

    if __name__ == '__main__':
        main()
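The loop above prints each event as-is. As a hedged sketch of how that final print could carry a little more context, the helper below prefixes the event with its CPU, PID and TID. The attribute names `sample_cpu`, `sample_pid` and `sample_tid` are an assumption (modeled on the `twatch.py` example in the kernel tree), `format_event` is a hypothetical name, and a stand-in object is used so the sketch runs without the perf module installed.

```python
from types import SimpleNamespace


def format_event(event):
    """Return a one-line description of a perf event object.

    The sample_cpu/sample_pid/sample_tid attribute names are an assumption;
    any attribute that is missing is shown as -1.
    """
    cpu = getattr(event, "sample_cpu", -1)
    pid = getattr(event, "sample_pid", -1)
    tid = getattr(event, "sample_tid", -1)
    return "cpu: %2d, pid: %4d, tid: %4d  %s" % (cpu, pid, tid, event)


if __name__ == "__main__":
    # Stand-in for an object returned by evlist.read_on_cpu(cpu).
    fake = SimpleNamespace(sample_cpu=1, sample_pid=1234, sample_tid=1234)
    print(format_event(fake))
```

In the loop on the slide, `print(format_event(event))` would replace the bare print of the event.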

SLIDE 26

PYTHON PERF MODULE

  • needs some love

    counting interface
    stabilize

  • volunteers welcome ;-)

$KERNEL/tools/perf/util/python.c

SLIDE 27

POST PROCESS SCRIPTING

  • interface for processing perf data from:

    perf record
    perf stat

SLIDE 28

POST PROCESS SCRIPTING - SAMPLING

[Diagram: SAMPLE records in perf.data are processed by perf report]

# Children      Self  Command  Shared Object     Symbol
# ........  ........  .......  ................  ...............................
#
    51.40%     0.00%  ls       [kernel.vmlinux]  [k] system_call
     9.71%     0.00%  ls       [kernel.vmlinux]  [k] __alloc_pages_nodemask
     9.71%     9.71%  ls       [kernel.vmlinux]  [k] clear_page
                |
                ---clear_page
                   __alloc_pages_nodemask
                   alloc_pages_vma
                   handle_mm_fault
                   __do_page_fault
                   do_page_fault
                   page_fault
                   _int_malloc

     8.73%     8.30%  ls       [kernel.vmlinux]  [k] perf_event_context_sched_in
                |
                ---perf_event_context_sched_in
                   |
                   |--95.07%-- __perf_event_task_sched_in
                   |          finish_task_switch
                   |          __schedule
                   |          _cond_resched
                   |          sys_write
                   |          system_call
                   |          __GI___libc_write
                   |          0x2d646c6975622d66
                   |
                    --4.93%-- perf_event_exec
                              setup_new_exec

SLIDE 31

POST PROCESS SCRIPTING - SAMPLING

  • perf script wrapper

    perf script -s <script>

  • python/perl support

SLIDE 32

POST PROCESS SCRIPTING - SAMPLING

[Diagram: SAMPLE records in perf.data are processed by perf script]

    #!/usr/bin/python

    # Called for events that are not tracepoints; d is a dictionary.
    def process_event(d):
        for k,v in d.items():
            print("%s = %s" % (k, v))

    # Tracepoint handlers. insert_stat() is a helper that the slide does not show.
    def kmem__kmalloc(event, ctxt, cpu, s, ns, tid, comm, callchain,
                      call_site, ptr, bytes_req, bytes_alloc, gfp_flags):
        insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu)

    def kmem__kmem_cache_alloc(event, ctxt, cpu, s, ns, tid, comm, callchain,
                               call_site, ptr, bytes_req, bytes_alloc, gfp_flags):
        insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu)

    def kmem__kmalloc_node(event, ctxt, cpu, s, ns, tid, comm, callchain,
                           call_site, ptr, bytes_req, bytes_alloc, gfp_flags, node):
        insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu)

    def kmem__kfree(event, ctxt, cpu, s, ns, tid, comm, callchain,
                    call_site, ptr):
        pass

    def kmem__kmem_cache_free(event, ctxt, cpu, s, ns, tid, comm, callchain,
                              call_site, ptr):
        pass

    def trace_begin():
        print("start")
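The allocation handlers above call `insert_stat()`, which the slide does not define. The following is one hypothetical way such a helper could look, assuming the goal is to total requested and allocated bytes per call site. The `stats` table and the `trace_end` report are illustrative choices, not the author's code.

```python
from collections import defaultdict

# call_site -> [number of allocations, bytes requested, bytes allocated]
stats = defaultdict(lambda: [0, 0, 0])


def insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu):
    """Hypothetical helper: accumulate allocation totals per call site.

    ptr and cpu are accepted to match the calls on the slide but are unused.
    """
    entry = stats[call_site]
    entry[0] += 1
    entry[1] += bytes_req
    entry[2] += bytes_alloc


def trace_end():
    """Print call sites sorted by wasted bytes (allocated minus requested)."""
    by_waste = sorted(stats.items(),
                      key=lambda kv: kv[1][2] - kv[1][1], reverse=True)
    for call_site, (hits, req, alloc) in by_waste:
        print("%#x hits=%d requested=%d allocated=%d"
              % (call_site, hits, req, alloc))
```

Keeping the totals in a module-level table lets every tracepoint handler share them, and `trace_end` is a natural place to print the summary once all events have been processed.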

SLIDE 36

POST PROCESS SCRIPTING - INTERFACE

  • set of callbacks:

    trace_begin/trace_end
    process_event (non tracepoint)
    $(SUBSYSTEM)__$(EVENT) (tracepoint)

SLIDE 37

POST PROCESS SCRIPTING - INTERFACE

  • process_event (args)

    args: dictionary with arguments

    #!/usr/bin/python
    def process_event(d):
        for k,v in d.items():
            print("%s = %s" % (k, v))

  • $(SUBSYSTEM)__$(EVENT)(...)

    event, ctxt, cpu, s, ns, tid, comm, callchain + tracepoint specific arguments

    def kmem__kmalloc(event, ctxt, cpu, s, ns, tid, comm, callchain,
                      call_site, ptr, bytes_req, bytes_alloc, gfp_flags):
        pass

    def kmem__kfree(event, ctxt, cpu, s, ns, tid, comm, callchain,
                    call_site, ptr):
        pass
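To make the `$(SUBSYSTEM)__$(EVENT)` convention concrete, here is a small simulation of the name lookup. It is a sketch of the naming rule only, not perf's actual dispatch code; `handler_name`, `dispatch` and the sample argument values are all hypothetical.

```python
def handler_name(subsystem, event):
    """Build the callback name for a tracepoint such as kmem:kmalloc."""
    return "%s__%s" % (subsystem, event)


def dispatch(handlers, subsystem, event, common, fields):
    """Simulate the lookup: call the tracepoint handler if the script
    defines one, otherwise report the event as unhandled."""
    func = handlers.get(handler_name(subsystem, event))
    if func is None:
        return "unhandled %s:%s" % (subsystem, event)
    # Common arguments come first, tracepoint specific ones follow.
    return func(*(common + fields))


def kmem__kfree(event, ctxt, cpu, s, ns, tid, comm, callchain, call_site, ptr):
    return "kfree on cpu %d by %s" % (cpu, comm)


if __name__ == "__main__":
    handlers = {"kmem__kfree": kmem__kfree}
    # event, ctxt, cpu, s, ns, tid, comm, callchain
    common = ["kmem__kfree", None, 2, 10, 500, 77, "ls", []]
    print(dispatch(handlers, "kmem", "kfree", common, [0xffff0000, 0x1234]))
    print(dispatch(handlers, "kmem", "kmalloc", common, []))
```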

SLIDE 38

POST PROCESS SCRIPTING

  • perf record ...
  • perf script -g lang
  • perf script -s <script.py>
  • perf script -l
  • perf script record|report <script>
  • man perf-script ;-)

SLIDE 39

POST PROCESS SCRIPTING - COUNTING

  • native perf language
  • post process perf stat output

SLIDE 41

POST PROCESS SCRIPTING - COUNTING

  • native perf language
  • post process perf stat output

$ perf stat -e 'cycles,instructions' WORKLOAD

 Performance counter stats for 'find ..':

       104,142,555      cycles
        64,785,445      instructions

        1.34357914      cpi

cpi: cycles per instruction ratio

SLIDE 47

POST PROCESS SCRIPTING - COUNTING

cpi {
    events {
        CY = cycles:u
        IN = instructions:u
    }
    cpi = CY / IN
    print cpi
}

  • formula name: cpi
  • events: CY, IN
  • calculation formula: cpi = CY / IN
  • variable to print: print cpi

$ perf stat -f formula.conf -e formula-cpi -a
^C
 Performance counter stats for 'system wide':

       739,225,320      cycles:u                  [100.00%]
       620,227,854      instructions:u            #    0.84  insns per cycle

       1.674587325 seconds time elapsed

        1.19186089      cpi
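The cpi value in the output is simply the first counter divided by the second. A minimal sketch that redoes the arithmetic with the counts from the slide (the helper names `parse_count` and `cpi` are illustrative):

```python
def parse_count(text):
    """Convert a perf-style count such as '739,225,320' to an int."""
    return int(text.replace(",", ""))


def cpi(cycles, instructions):
    """Cycles per instruction, the inverse of perf's 'insns per cycle'."""
    return cycles / instructions


if __name__ == "__main__":
    cy = parse_count("739,225,320")   # cycles:u from the slide
    ins = parse_count("620,227,854")  # instructions:u from the slide
    print("%.8f cpi" % cpi(cy, ins))
    print("%.2f insns per cycle" % (ins / cy))
```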

SLIDE 48

POST PROCESS SCRIPTING - COUNTING

  • branch example
  • branch-rate/miss/ratio

branch {
    events {
        IN = instructions:u
        BI = branch-instructions:u
        BM = branch-misses:u
    }
    branch-rate = BI / IN
    branch-miss-rate = BM / IN
    branch-miss-ratio = BM / BI
    print branch-rate
    print branch-miss-rate
    print branch-miss-ratio
}

$ perf stat -f formula.conf -e formula-branch du -sh /
^Cdu: Interrupt

 Performance counter stats for 'du -sh /':

        39,285,799      instructions:u
         8,865,310      branch-instructions:u
           273,038      branch-misses:u           #    3.08% of all branches

       0.923258595 seconds time elapsed

        0.22566195      branch-rate
        0.00695004      branch-miss-rate
        0.03079847      branch-miss-ratio
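The branch formulas are three ratios over the same set of named events. The sketch below evaluates them in plain Python from the counts on the slide. It is not the formula parser from the perf/formula branch: the `(name, numerator, denominator)` table and the `evaluate` function are assumptions made for illustration, and only simple A / B ratios are supported.

```python
def evaluate(formulas, counts):
    """Evaluate (name, numerator, denominator) ratio formulas over counters.

    Only simple A / B ratios are handled, which is all the branch example uses.
    A zero denominator yields NaN instead of raising.
    """
    results = {}
    for name, num, den in formulas:
        if counts[den]:
            results[name] = counts[num] / counts[den]
        else:
            results[name] = float("nan")
    return results


# The three formulas from the branch example.
BRANCH_FORMULAS = [
    ("branch-rate", "BI", "IN"),
    ("branch-miss-rate", "BM", "IN"),
    ("branch-miss-ratio", "BM", "BI"),
]

if __name__ == "__main__":
    counts = {"IN": 39285799, "BI": 8865310, "BM": 273038}  # from the slide
    for name, value in evaluate(BRANCH_FORMULAS, counts).items():
        print("%.8f %s" % (value, name))
```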
SLIDE 50

POST PROCESS SCRIPTING - COUNTING

  • needs more testing/users
  • not upstream yet

https://git.kernel.org/cgit/linux/kernel/git/jolsa/perf.git/
perf/formula

SLIDE 53

POST PROCESS SCRIPTING - COUNTING

  • store stat data into perf.data

$ perf stat -e 'cycles,instructions' record WORKLOAD

 Performance counter stats for 'find ..':

       104,142,555      cycles
        64,785,445      instructions

[Diagram: STAT EVENT records stored in perf.data]

$ perf stat report

 Performance counter stats for 'find ..':

       104,142,555      cycles
        64,785,445      instructions

SLIDE 54

POST PROCESS SCRIPTING - COUNTING

[Diagram: SAMPLE records in perf.data are processed by perf script]

    # Called once before the first event is processed.
    def trace_begin():
        print("in trace_begin")

    # Called once after the last event is processed.
    def trace_end():
        print("in trace_end")

    # Called for each stat record of the 'cycles' event.
    def stat__cycles(cpu, sec, nsec, val, ena, run):
        print("%6d.%09d CPU%d %d cycles" % (sec, nsec, cpu, val))
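The handler above receives `ena` and `run` but prints only `val`. Assuming these are the counter's time-enabled and time-running values (the usual meaning in perf's read format), a script can extrapolate a multiplexed count to the full enabled time. A minimal sketch, with `scaled_count` as a hypothetical helper name:

```python
def scaled_count(val, ena, run):
    """Extrapolate a raw count to the full enabled time.

    If the event was multiplexed it ran for only part of the time it was
    enabled (run < ena), so the raw value is scaled by ena / run.
    """
    if run == 0:
        return 0  # the event never ran, so there is nothing to scale
    return int(val * ena / run)


if __name__ == "__main__":
    # Counter that ran for half of its enabled time: the estimate doubles.
    print(scaled_count(1000, 200, 100))
    # Counter that ran the whole time: the value is unchanged.
    print(scaled_count(1000, 100, 100))
```

Inside `stat__cycles`, `scaled_count(val, ena, run)` could be printed in place of `val` when multiplexing is expected.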

SLIDE 58

POST PROCESS SCRIPTING - COUNTING

  • not upstream, work in progress
SLIDE 59

THANKS, QUESTIONS?

Jiri Olsa <jolsa@redhat.com>