PERF SCRIPTS | JIRI OLSA 1
HI
- basics
- perf in python
- post process scripts

COUNTING
- perf stat
(diagram: WORKLOAD runs on CPU 0, CPU 1 and CPU 2; events are counted from start to stop)
$ perf stat -e 'cycles,instructions' WORKLOAD
 Performance counter stats for 'find ..':
       104,142,555      cycles
        64,785,445      instructions
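
The counter lines above are easy to pick apart in a script. A minimal sketch, assuming each counter line has the form "count with thousands separators, then event name" as on the slide; the regular expression and the `parse_counters` name are illustrative, and real `perf stat` output carries more columns than shown here.

```python
import re

# Sample text copied from the slide; real perf stat output has more columns.
SAMPLE_OUTPUT = """\
 Performance counter stats for 'find ..':

       104,142,555      cycles
        64,785,445      instructions
"""

# Assumption: a counter line is "<count with commas> <event name>".
COUNTER_LINE = re.compile(r"^\s*([\d,]+)\s+([\w:.-]+)")


def parse_counters(text):
    """Return {event_name: count} for every counter line found in text."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1).replace(",", ""))
    return counters


counters = parse_counters(SAMPLE_OUTPUT)
print(counters)
```

The header line and blank lines do not start with a digit, so they are skipped without special handling.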

SAMPLING
- perf record
(diagram: WORKLOAD runs on CPU 0, CPU 1 and CPU 2; between start and stop, SAMPLEs are written to perf.data)
$ perf record -e 'cycles' WORKLOAD
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.048 MB perf.data
sample: ID PID CPU ADDRESS CALLCHAIN BRANCHES MEMORY TRACEPOINT
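
To make the difference from counting concrete, here is a toy model of sampling: instead of reporting one total, a sample is emitted every time the event count crosses a period. This is a sketch under loud assumptions: a single shared counter, a fixed period, and a made-up `TRACE`; none of this is perf code.

```python
from collections import Counter


def sample_by_period(trace, period):
    """Yield one sample each time the running event count crosses `period`.

    `trace` is an iterable of (cpu, pid, address, events) tuples: the code at
    `address` consumed `events` events (e.g. cycles) on `cpu` for `pid`.
    """
    pending = 0
    for cpu, pid, address, events in trace:
        pending += events
        while pending >= period:
            pending -= period
            yield {"cpu": cpu, "pid": pid, "address": address}


# Hypothetical workload: three code locations with different costs.
TRACE = [
    (0, 100, 0x1000, 2500),
    (1, 100, 0x2000, 500),
    (0, 100, 0x1000, 3000),
    (2, 100, 0x3000, 1000),
]

samples = list(sample_by_period(TRACE, period=1000))
histogram = Counter(sample["address"] for sample in samples)
for address, hits in histogram.most_common():
    print("%#x %d samples" % (address, hits))
```

The histogram of sample addresses is the kind of summary a report tool builds from the recorded samples: locations that consume more events collect more samples.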

PERF INTERFACE IN NUTSHELL
(diagram: perf tool in user space, EVENT objects in the kernel)
- SYS_PERF_EVENT_OPEN creates an EVENT; an EVENT is attached to a task (TASK 1), a CPU (CPU 0, CPU 1) or a CGROUP
- SYS_READ returns the counter value (counting)
    $ perf stat -e 'cycles' WORKLOAD
     Performance counter stats for 'find ..':
           104,142,555      cycles
- SYS_MMAP maps the buffer that samples are read from (sampling)
    $ perf record -e 'cycles' WORKLOAD
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.048 MB
- sample: ID PID CPU ADDRESS CALLCHAIN BRANCHES MEMORY TRACEPOINT
- SAMPLEs are stored in perf.data

PERF SCRIPTS
- 2 areas of script support
    - use perf in python scripts
    - post process perf data via python/perl

PYTHON SCRIPTS
- use perf in python scripts
    - perf module

PYTHON SCRIPTS
(diagram: a python script takes the place of the perf tool in user space; the kernel side with EVENT, TASK 1, CPU 0, CPU 1, CGROUP, SYS_PERF_EVENT_OPEN, SYS_READ, SYS_MMAP and samples is unchanged)
#!/usr/bin/python
import perf

def main():
    cpus = perf.cpu_map()
    threads = perf.thread_map()
    evsel = perf.evsel(task = 1, wakeup_eve..
                       sample_id_
    evsel.open(cpus = cpus, threads ..)
    while True:
        evlist.poll(timeout = -1)
        for cpu in cpus:
            event = evlist.read_on_cpu(cpu)
            if not event:
                continue
            print(event)

while True:
    print("every1 likes python anyway..")

- perf.so python extension
    # yum install python-perf

PYTHON PERF MODULE
- only sampling interface atm
- quite simple one:
    cpu_map
    thread_map
    evsel (open)
    evlist (open, mmap, poll, add, read_on_cpu)
    (mmap|task|comm|lost|read|sample|throttle)_event

PYTHON PERF MODULE
import perf

def main():
    # CPUs and threads to monitor
    cpus = perf.cpu_map()
    threads = perf.thread_map()

    # One event: task and comm records on, mmap records off
    evsel = perf.evsel(task = 1, comm = 1, mmap = 0,
                       wakeup_events = 1, watermark = 1,
                       sample_id_all = 1,
                       sample_type = perf.SAMPLE_PERIOD | perf.SAMPLE_TID | perf.SAMPLE_CPU)
    evsel.open(cpus = cpus, threads = threads)

    # Put the event into an event list and map its buffers
    evlist = perf.evlist(cpus, threads)
    evlist.add(evsel)
    evlist.mmap()

    # Wait for data, then print what each CPU has to offer
    while True:
        evlist.poll(timeout = -1)
        for cpu in cpus:
            event = evlist.read_on_cpu(cpu)
            if not event:
                continue
            print(event)

if __name__ == '__main__':
    main()
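
The poll/read loop above can be tried without the perf module or root access by passing in the event list as a parameter. A minimal sketch: `drain`, `FakeEvlist` and the canned event strings are hypothetical names invented for this illustration; only `poll(timeout = ...)` and `read_on_cpu(cpu)` come from the slides.

```python
def drain(evlist, cpus, handle, max_polls):
    """Poll `evlist` up to `max_polls` times and pass each event to `handle`."""
    seen = 0
    for _ in range(max_polls):
        evlist.poll(timeout=-1)
        for cpu in cpus:
            event = evlist.read_on_cpu(cpu)
            if not event:
                continue
            handle(cpu, event)
            seen += 1
    return seen


class FakeEvlist:
    """Stand-in for perf.evlist that replays canned events (illustration only)."""

    def __init__(self, queued):
        self.queued = queued  # {cpu: [event, ...]}

    def poll(self, timeout):
        return 0

    def read_on_cpu(self, cpu):
        events = self.queued.get(cpu)
        return events.pop(0) if events else None


received = []
fake = FakeEvlist({0: ["comm: ls"], 1: ["exit: ls", "fork: sh"]})
count = drain(fake, cpus=[0, 1],
              handle=lambda cpu, event: received.append((cpu, event)),
              max_polls=2)
print(count, received)
```

With the real module, an object built as on the slide would presumably be passed in place of `FakeEvlist`; the bounded `max_polls` replaces the endless `while True` so the sketch terminates.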

PYTHON PERF MODULE
- needs some love
    counting interface
    stabilize
- volunteers welcome ;-)
    $KERNEL/tools/perf/util/python.c

POST PROCESS SCRIPTING
- interface for processing perf data from:
    perf record
    perf stat

POST PROCESS SCRIPTING - SAMPLING
(diagram: SAMPLEs in perf.data are read by perf report)
# Children      Self  Command  Shared Object     Symbol
# ........  ........  .......  ................  ...............................
#
    51.40%     0.00%  ls       [kernel.vmlinux]  [k] system_call
     9.71%     0.00%  ls       [kernel.vmlinux]  [k] __alloc_pages_nodemask
     9.71%     9.71%  ls       [kernel.vmlinux]  [k] clear_page
            |
            ---clear_page
               __alloc_pages_nodemask
               alloc_pages_vma
               handle_mm_fault
               __do_page_fault
               do_page_fault
               page_fault
               _int_malloc
     8.73%     8.30%  ls       [kernel.vmlinux]  [k] perf_event_context_sched_in
            |
            ---perf_event_context_sched_in
               |
               |--95.07%-- __perf_event_task_sched_in
               |          finish_task_switch
               |          __schedule
               |          _cond_resched
               |          sys_write
               |          system_call
               |          __GI___libc_write
               |          0x2d646c6975622d66
               |
                --4.93%-- perf_event_exec
                          setup_new_exec

POST PROCESS SCRIPTING - SAMPLING
- perf script wrapper
    perf script -s <script>
- python/perl support

POST PROCESS SCRIPTING - SAMPLING
(diagram: SAMPLEs in perf.data are passed by perf script to the script below)
#!/usr/bin/python

# insert_stat() is used below but is not shown on the slide

def process_event(d):
    for k, v in d.items():
        print("%s = %s" % (k, v))

def kmem__kmalloc(event, ctxt, cpu, s, ns, tid, comm, callchain,
                  call_site, ptr, bytes_req, bytes_alloc, gfp_flags):
    insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu)

def kmem__kmem_cache_alloc(event, ctxt, cpu, s, ns, tid, comm, callchain,
                           call_site, ptr, bytes_req, bytes_alloc, gfp_flags):
    insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu)

def kmem__kmalloc_node(event, ctxt, cpu, s, ns, tid, comm, callchain,
                       call_site, ptr, bytes_req, bytes_alloc, gfp_flags, node):
    insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu)

def kmem__kfree(event, ctxt, cpu, s, ns, tid, comm, callchain,
                call_site, ptr):
    pass

def kmem__kmem_cache_free(event, ctxt, cpu, s, ns, tid, comm, callchain,
                          call_site, ptr):
    pass

def trace_begin():
    print("start")

POST PROCESS SCRIPTING - INTERFACE
- set of callbacks:
    trace_begin/trace_end
    process_event (non tracepoint)
    $(SUBSYSTEM)__$(EVENT) (tracepoint)

POST PROCESS SCRIPTING - INTERFACE
- process_event(args)
    args – dictionary with arguments

    #!/usr/bin/python
    def process_event(d):
        for k, v in d.items():
            print("%s = %s" % (k, v))

- $(SUBSYSTEM)__$(EVENT)(...)
    event, ctxt, cpu, s, ns, tid, comm, callchain + tracepoint specific arguments

    def kmem__kmalloc(event, ctxt, cpu, s, ns, tid, comm, callchain,
                      call_site, ptr, bytes_req, bytes_alloc, gfp_flags):
        pass

    def kmem__kfree(event, ctxt, cpu, s, ns, tid, comm, callchain,
                    call_site, ptr):
        pass
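
The callback interface can be exercised by hand, which helps when developing a script before any perf.data exists. The sketch below fills in one possible `insert_stat` (the slides do not show it, so this body is an assumption) and feeds two made-up kmalloc events through the tracepoint handler. The `demo` function and its event values are hypothetical and stand in for perf script.

```python
from collections import defaultdict

# call_site -> [number of allocations, bytes requested, bytes allocated]
stats = defaultdict(lambda: [0, 0, 0])


def insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu):
    """Hypothetical helper: accumulate allocation sizes per call site."""
    entry = stats[call_site]
    entry[0] += 1
    entry[1] += bytes_req
    entry[2] += bytes_alloc


def kmem__kmalloc(event, ctxt, cpu, s, ns, tid, comm, callchain,
                  call_site, ptr, bytes_req, bytes_alloc, gfp_flags):
    insert_stat(call_site, ptr, bytes_req, bytes_alloc, cpu)


def trace_begin():
    stats.clear()


def trace_end():
    # Report call sites ordered by wasted bytes (allocated - requested).
    by_waste = sorted(stats.items(),
                      key=lambda item: item[1][2] - item[1][1],
                      reverse=True)
    for call_site, (hits, req, alloc) in by_waste:
        print("%#x hits=%d requested=%d allocated=%d"
              % (call_site, hits, req, alloc))


def demo():
    """Stand-in for perf script: call the callbacks with made-up events."""
    trace_begin()
    kmem__kmalloc("kmem__kmalloc", None, 0, 1, 0, 42, "ls", [],
                  0xffff0001, 0x1, 100, 128, 0)
    kmem__kmalloc("kmem__kmalloc", None, 1, 1, 5, 42, "ls", [],
                  0xffff0001, 0x2, 60, 64, 0)
    trace_end()


# Standalone demonstration only; drop this call before running under perf script.
demo()
```

Keeping the aggregation in `insert_stat` and the reporting in `trace_end` mirrors the callback split listed above: per-event work in the tracepoint handler, summary work once at the end.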

POST PROCESS SCRIPTING
- perf record ...
- perf script -g lang
- perf script -s <script.py>
- perf script -l
- perf script record|report <script>
- man perf-script ;-)

POST PROCESS SCRIPTING - COUNTING
- native perf language
- post process perf stat output
$ perf stat -e 'cycles,instructions' WORKLOAD
 Performance counter stats for 'find ..':
       104,142,555      cycles
        64,785,445      instructions
        1.34357914      cpi
cpi: cycles per instruction ratio

POST PROCESS SCRIPTING - COUNTING
cpi {                          <- formula name
    events {                   <- events
        CY = cycles:u
        IN = instructions:u
    }
    cpi = CY / IN              <- calculation formula
    print cpi                  <- variable to print
}
$ perf stat -f formula.conf -e formula-cpi -a
^C
 Performance counter stats for 'system wide':
       739,225,320      cycles:u                  [100.00%]
       620,227,854      instructions:u            #    0.84  insns per cycle
       1.674587325 seconds time elapsed
        1.19186089 cpi

POST PROCESS SCRIPTING - COUNTING
- branch example
- branch-rate/miss/ratio
branch {
    events {
        IN = instructions:u
        BI = branch-instructions:u
        BM = branch-misses:u
    }
    branch-rate = BI / IN
    branch-miss-rate = BM / IN
    branch-miss-ratio = BM / BI
    print branch-rate
    print branch-miss-rate
    print branch-miss-ratio
}
$ perf stat -f formula.conf -e formula-branch du -sh /
^Cdu: Interrupt
 Performance counter stats for 'du -sh /':
        39,285,799      instructions:u
         8,865,310      branch-instructions:u
           273,038      branch-misses:u           #    3.08% of all branches
       0.923258595 seconds time elapsed
        0.22566195 branch-rate
        0.00695004 branch-miss-rate
        0.03079847 branch-miss-ratio
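
The formulas are plain ratios of counter values, so the printed results can be cross-checked outside perf. A minimal sketch, assuming each formula is a single numerator/denominator pair; the `evaluate` helper and the dictionary layout are illustrative and say nothing about how the formula parser itself works. Counter values are copied from the outputs shown on the slides.

```python
def evaluate(counts, formulas):
    """Return {name: value} for each formula, skipping zero denominators."""
    results = {}
    for name, (numerator, denominator) in formulas.items():
        if counts[denominator]:
            results[name] = counts[numerator] / counts[denominator]
    return results


# Counter values copied from the two perf stat outputs on the slides.
CPI_COUNTS = {"CY": 739225320, "IN": 620227854}
BRANCH_COUNTS = {"IN": 39285799, "BI": 8865310, "BM": 273038}

# Each formula is a (numerator, denominator) pair of event aliases.
CPI_FORMULAS = {"cpi": ("CY", "IN")}
BRANCH_FORMULAS = {
    "branch-rate": ("BI", "IN"),
    "branch-miss-rate": ("BM", "IN"),
    "branch-miss-ratio": ("BM", "BI"),
}

for counts, formulas in ((CPI_COUNTS, CPI_FORMULAS),
                         (BRANCH_COUNTS, BRANCH_FORMULAS)):
    for name, value in evaluate(counts, formulas).items():
        print("%.8f %s" % (value, name))
```

Skipping a formula whose denominator is zero avoids a division error when an event did not count at all, for example when the workload exits immediately.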

POST PROCESS SCRIPTING - COUNTING
- needs more testing/users
- not upstream yet
    https://git.kernel.org/cgit/linux/kernel/git/jolsa/perf.git/
    perf/formula

POST PROCESS SCRIPTING - COUNTING
- store stat data into perf.data
$ perf stat -e 'cycles,instructions' record WORKLOAD
 Performance counter stats for 'find ..':
       104,142,555      cycles
        64,785,445      instructions
(diagram: STAT EVENTs are stored in perf.data)
$ perf stat report
 Performance counter stats for 'find ..':
       104,142,555      cycles
        64,785,445      instructions

POST PROCESS SCRIPTING - COUNTING
(diagram: perf script reads perf.data and calls the script below)
def trace_begin():
    print("in trace_begin")

def trace_end():
    print("in trace_end")

def stat__cycles(cpu, sec, nsec, val, ena, run):
    print("%6d.%09d CPU%d %d cycles" % (sec, nsec, cpu, val))
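
A stat callback can do more than print each reading. The sketch below sums cycles per CPU and scales each raw value by enabled/running time. It assumes that `ena` and `run` carry the time the event was enabled and the time it was actually running on the PMU, as in perf's counter read format; the slides do not spell this out, and the interface is described as work in progress. The readings passed in at the end are made up.

```python
def scaled_count(val, ena, run):
    """Scale a raw count by enabled/running time; None if it never ran."""
    if run == 0:
        return None
    return val * ena // run


totals = {}  # cpu -> scaled cycles


def stat__cycles(cpu, sec, nsec, val, ena, run):
    count = scaled_count(val, ena, run)
    if count is not None:
        totals[cpu] = totals.get(cpu, 0) + count


def trace_end():
    for cpu in sorted(totals):
        print("CPU%d %d cycles" % (cpu, totals[cpu]))


# Stand-in for perf script: made-up readings; the second one ran for only
# half of its enabled time, the third one never ran.
stat__cycles(0, 1, 0, 1000, 200, 200)
stat__cycles(0, 2, 0, 500, 200, 100)
stat__cycles(1, 2, 0, 700, 200, 0)
trace_end()
```

Scaling matters when more events are requested than the hardware has counters for: each event then runs only part of the time, and the raw value underestimates the true count.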

POST PROCESS SCRIPTING - COUNTING
- not upstream, work in progress

THANKS, QUESTIONS?
Jiri Olsa <jolsa@redhat.com>