ftrace profiling
play

Ftrace Profiling Presenter: Steven Rostedt rostedt@goodmis.org - PowerPoint PPT Presentation

Ftrace Profiling Presenter: Steven Rostedt rostedt@goodmis.org Red Hat What do you want to profile? Application Cache misses Memory locality Page faults Finding bad algorithms O(n^2) CPU cycles I/O usage What do


  1. Ftrace Profiling Presenter: Steven Rostedt rostedt@goodmis.org Red Hat

  2. What do you want to profile? ● Application – Cache misses – Memory locality – Page faults – Finding bad algorithms – O(n^2) – CPU cycles – I/O usage

  3. What do you want to profile? ● Kernel – Cache misses – Memory locality – Page faults – Finding bad algorithms – O(n^2) – CPU cycles – I/O usage – Different than user space – Locking – Interrupts (Disabling, latency, etc) – Scheduling ● Latency, fairness, RT, Dead Line, etc

  4. Profiling tools ● perf ● oprofile ● strace ● gdb ● trace-cmd

  5. Perf ● stat ● top ● record ● report ● trace

  6. Perf Stat ● Great for comparing versions of tools ● Perhaps not the best for the kernel analysis perf stat -e cycles --repeat 100 -- ● ● Gives average cycles with standard deviation

  7. Perf Record / Report ● Profile the system ● Works for both kernel and userspace ● Shows where in code time is spent ● Can break down to assembly ● Beware, can be off by one – Due to latching instructions – Irq enabling

  8. Perf Record / Report [root@bxtest ~]# perf record -g /work/c/hackbench 10 Time: 0.365 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.592 MB perf.data (14360 samples) ]

  9. Perf Record / Report

  10. Perf Record / Report

  11. Perf Record / Report ● Very powerful ● Traces both user space and kernel ● I love the UI ● But! – Perf starts to show its overhead with heavy tracing – Function tracing is still a weak

  12. Perf Record / Report ● Very powerful ● Traces both user space and kernel ● I love the UI ● But! – Perf starts to show its overhead with heavy tracing – Function tracing is still a weak – I program in ftrace ;-)

  13. Ftrace / trace-cmd ● trace-cmd – A front end interface to ftrace. ● Requires root privilege to start tracing ● Traces the kernel (not user space) ● Kernel buffer is optimized for tracing ● Does not do periodic profiling

  14. trace-cmd 2.5 ● New profiling feature ● Allows you to connect events ● Timings: Total, average, max, min ● Shows where events occur and frequency

  15. trace-cmd profile ● By default enables: – Function graph tracer ● Depth of 1 – All irq events – All raw system call events – Schedule events: ● sched_wakeup – Sets stack trace ● sched_switch – Sets stack trace ● sched_process_exec – Page fault event

  16. trace-cmd profile Event Hooks ● sched_wakeup → sched_switch ● sched_switch → sched_wakeup – Sleeping ● sched_switch → sched_switch – Preempted ● softirq_raise → softirq_entry ● softirq_entry → softirq_exit ● irq_handler_entry → irq_handler_exit ● sys_enter → sys_exit

  17. trace-cmd profile # trace-cmd profile --stderr hackbench 10 2> out # cat out

  18. trace-cmd profile Kernel buffer statistics: Note: "entries" are the entries left in the kernel ring buffer and are not recorded in the trace data. They should all be zero. CPU: 0 entries: 0 overrun: 1271072 commit overrun: 0 bytes: 644 oldest event ts: 13166.516390 now ts: 13166.927822 dropped events: 0 read events: 41271 [...]

  19. FYI - coming in 2.6 ● New output CPU 0: 74560 bytes lost CPU 1: 59253 bytes lost CPU 2: 427979 bytes lost CPU 3: 99514 bytes lost CPU 4: 47925 bytes lost CPU 5: 42843 bytes lost CPU 6: 70676 bytes lost CPU 7: 676606 bytes lost

  20. trace-cmd profile task: hackbench-2658 Event: sched_switch:R (4) Total: 824611 Avg: 206152 Max: 637060(ts:13164.420597) Min:38201(ts:13166.515937) | + ftrace_raw_event_sched_switch (0xffffffff810ad610) 100% (4) time:824611 max:637060(ts:13164.419988) min:38201(ts:13166.515926) avg:206152 __schedule (0xffffffff81770450) preempt_schedule (0xffffffff81770d2e) ___preempt_schedule (0xffffffff813b80ce) | + cpu_stop_queue_work (0xffffffff81133e26) | 77% (1) time:637060 max:637060(ts:13164.419988) | min:637060(ts:13164.419988) avg:637060 | stop_one_cpu (0xffffffff81134130) | sched_exec (0xffffffff810b0adb) | do_execveat_common.isra.32 | (0xffffffff8121e436) | do_execve (0xffffffff8121eb2c) | SyS_execve (0xffffffff8121ee0e) | return_to_handler (0xffffffff81779468) | stub_execve (0xffffffff81777669)

  21. trace-cmd profile Event: sched_switch:S (44) Total: 71376940 Avg: 1622203 Max: 21473090(ts:13166.461946) Min:25558(ts:13166.514547) | + ftrace_raw_event_sched_switch (0xffffffff810ad610) 100% (44) time:71376940 max:21473090(ts:13166.440494) min:25558(ts:13166.514544) avg:1622203 __schedule (0xffffffff81770450) schedule (0xffffffff81770d79) do_wait (0xffffffff810832ac) SyS_wait4 (0xffffffff81084723) return_to_handler (0xffffffff81779468) tracesys_phase2 (0xffffffff81777289)

  22. trace-cmd profile Event: sched_switch:D (1) Total: 2421103 Avg: 2421103 Max: 2421103(ts:453.664398) Min:2421103(ts:453.664398) | + ftrace_raw_event_sched_switch (0xffffffff8109fbb0) 100% (1) time:2421103 max:2421103(ts:453.662003) min:2421103(ts:453.662003) avg:2421103 __schedule (0xffffffff816b7bb9) schedule (0xffffffff816b8139) schedule_timeout (0xffffffff816bab35) io_schedule_timeout (0xffffffff816b8551) wait_for_completion_io (0xffffffff816b8cf1) blk_execute_rq (0xffffffff81315d39) scsi_execute (0xffffffff81454b67) scsi_execute_req_flags (0xffffffff814564ec) sr_check_events (0xffffffff81466ee9) cdrom_check_events (0xffffffff81492f3c) sr_block_check_events (0xffffffff81467381) disk_check_events (0xffffffff8131f55b) disk_events_workfn (0xffffffff8131f666) process_one_work (0xffffffff8109149b) return_to_handler (0xffffffff816be158) worker_thread (0xffffffff81091c3b) kthread (0xffffffff810976b9) ret_from_fork (0xffffffff816bc02c)

  23. trace-cmd profile Event: sched_wakeup:0xa62 (53) Total: 3752286 Avg: 70797 Max: 641910(ts:13166.503267) Min:17215(ts:13166.514565) | + ftrace_raw_event_sched_wakeup_template (0xffffffff810abbf0) 100% (53) time:3752286 max:641910(ts:13166.502654) min:17215(ts:13166.514563) avg:70797 ttwu_do_wakeup (0xffffffff810af0f2) ttwu_do_activate.constprop.120 (0xffffffff810af2e6) try_to_wake_up (0xffffffff810b4a9b) default_wake_function (0xffffffff810b4d72) | + child_wait_callback (0xffffffff81081822) | 95% (52) time:3582236 | max:641910(ts:13166.502654) | min:17215(ts:13166.514563) avg:68889 | __wake_up_common (0xffffffff810c8cb8) | __wake_up_sync_key (0xffffffff810c8f74) | __wake_up_parent (0xffffffff810844c6) | do_notify_parent (0xffffffff81091552) | do_exit (0xffffffff810841a5) | do_group_exit (0xffffffff81084411) | SyS_exit_group (0xffffffff81084497) | return_to_handler (0xffffffff81779468) | tracesys_phase2 (0xffffffff81777289)

  24. trace-cmd profile Event: func: do_notify_resume() (1) Total: 3823 Avg: 3823 Max: 3823(ts:13164.442512) Min:3823(ts:13164.442512) Event: func: __do_page_fault() (83) Total: 877552 Avg: 10572 Max: 101563(ts:13164.442393) Min:1943(ts:13164.442602) Event: func: __fsnotify_parent() (1) Total: 655 Avg: 655 Max: 655(ts:13164.419758) Min:655(ts:13164.419758) Event: func: preempt_count_add() (166) Total: 77873 Avg: 469 Max: 12562(ts:13164.443360) Min:207(ts:13164.442078) Event: func: rcu_lockdep_current_cpu_online() (166) Total: 52312 Avg: 315 Max: 502(ts:13164.443588) Min:262(ts:13164.443169) Event: func: syscall_trace_enter_phase1() (159) Total: 44922 Avg: 282 Max: 558(ts:13166.440430) Min:149(ts:13166.506518) Event: func: SyS_close() (14) Total: 21622 Avg: 1544 Max: 2598(ts:13164.419772) Min:1286(ts:13164.444957) Event: func: mutex_unlock() (2) Total: 128686 Avg: 64343 Max: 127977(ts:13164.419757) Min:709(ts:13164.419763) Event: func: SyS_munmap() (1) Total: 37085 Avg: 37085 Max: 37085(ts:13164.442506) Min:37085(ts:13164.442506) Event: func: SyS_newfstat() (3) Total: 5724 Avg: 1908 Max: 2498(ts:13166.515650) Min:1511(ts:13164.441994) Event: func: SyS_socketpair() (12) Total: 360633 Avg: 30052 Max: 45273(ts:13164.444105) Min:22783(ts:13164.442701) Event: func: syscall_trace_leave() (159) Total: 229978 Avg: 1446 Max: 19880(ts:13166.504255) Min:909(ts:13166.506518) Event: func: SyS_wait4() (86) Total: 77870280 Avg: 905468 Max: 21639490(ts:13166.462072) Min:11999(ts:13166.503385)

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend