The basic way to trace all events:
# trace-cmd record -e all ls > /dev/null
# trace-cmd report
trace-cmd-13541 [003] 106260.693809: filemap_fault: address=0x128122 offset=0xce
trace-cmd-13543 [001] 106260.693809: kmalloc: call_site=81128dd4 ptr=0xffff88003dd83800 bytes_req=768 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_ZERO
ls-13545 [002] 106260.693809: kfree: call_site=810a7abb ptr=0x0
ls-13545 [002] 106260.693818: sys_exit_write: 0x1
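Tracing all events produces a large amount of data. If only one subsystem is of interest, its name can be given to -e instead of "all" (a minimal variation of the command above, assuming only the scheduler events are wanted):
# trace-cmd record -e sched ls > /dev/null
# trace-cmd report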
To use the function tracer with sched switch tracing:
# trace-cmd record -p function -e sched_switch ls > /dev/null
# trace-cmd report
ls-13587 [002] 106467.860310: function: hrtick_start_fair <-- pick_next_task_fair
ls-13587 [002] 106467.860313: sched_switch: prev_comm=trace-cmd prev_pid=13587 prev_prio=120 prev_state=R ==> next_comm=trace-cmd next_pid=13583 next_prio=120
trace-cmd-13585 [001] 106467.860314: function: native_set_pte_at <-- __do_fault
trace-cmd-13586 [003] 106467.860314: function: up_read <-- do_page_fault
ls-13587 [002] 106467.860317: function: __phys_addr <-- schedule
trace-cmd-13585 [001] 106467.860318: function: _raw_spin_unlock <-- __do_fault
ls-13587 [002] 106467.860320: function: native_load_sp0 <-- __switch_to
trace-cmd-13586 [003] 106467.860322: function: down_read_trylock <-- do_page_fault
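The function tracer can be narrowed down as well with the -l option, which limits tracing to functions matching the given filter (the 'sched*' glob here is only an illustration; any pattern accepted by set_ftrace_filter may be used):
# trace-cmd record -p function -l 'sched*' -e sched_switch ls > /dev/null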
Here is a nice way to find which interrupts have the highest latency:

# trace-cmd record -p function_graph -e irq_handler_entry -l do_IRQ sleep 10
# trace-cmd report
<idle>-0 [000] 157412.933969: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157412.933974: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157412.934004: funcgraph_exit: + 36.358 us | }
<idle>-0 [000] 157413.895004: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157413.895011: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157413.895026: funcgraph_exit: + 24.014 us | }
<idle>-0 [000] 157415.891762: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157415.891769: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157415.891784: funcgraph_exit: + 22.928 us | }
<idle>-0 [000] 157415.934869: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157415.934874: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157415.934906: funcgraph_exit: + 37.512 us | }
<idle>-0 [000] 157417.888373: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157417.888381: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157417.888398: funcgraph_exit: + 25.943 us | }
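To also see when each handler returns, the whole irq subsystem can be enabled instead of just the entry event (a variation on the command above that adds irq_handler_exit and the softirq events to the trace):
# trace-cmd record -p function_graph -e irq -l do_IRQ sleep 10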
An example of using the --profile option:
# trace-cmd record --profile sleep 1
# trace-cmd report --profile --comm sleep
task: sleep-21611
Event: sched_switch:R (1) Total: 99442 Avg: 99442 Max: 99442 Min:99442
<stack> 1 total:99442 min:99442 max:99442 avg=99442
=> ftrace_raw_event_sched_switch (0xffffffff8105f812)
=> __schedule (0xffffffff8150810a)
=> preempt_schedule (0xffffffff8150842e)
=> ___preempt_schedule (0xffffffff81273354)
=> cpu_stop_queue_work (0xffffffff810b03c5)
=> stop_one_cpu (0xffffffff810b063b)
=> sched_exec (0xffffffff8106136d)
=> do_execve_common.isra.27 (0xffffffff81148c89)
=> do_execve (0xffffffff811490b0)
=> SyS_execve (0xffffffff811492c4)
=> return_to_handler (0xffffffff8150e3c8)
=> stub_execve (0xffffffff8150c699)
Event: sched_switch:S (1) Total: 1000506680 Avg: 1000506680 Max: 1000506680 Min:1000506680
<stack> 1 total:1000506680 min:1000506680 max:1000506680 avg=1000506680
=> ftrace_raw_event_sched_switch (0xffffffff8105f812)
=> __schedule (0xffffffff8150810a)
=> schedule (0xffffffff815084b8)
=> do_nanosleep (0xffffffff8150b22c)
=> hrtimer_nanosleep (0xffffffff8108d647)
=> SyS_nanosleep (0xffffffff8108d72c)
=> return_to_handler (0xffffffff8150e3c8)
=> tracesys_phase2 (0xffffffff8150c304)
Event: sched_wakeup:21611 (1) Total: 30326 Avg: 30326 Max: 30326 Min:30326
<stack> 1 total:30326 min:30326 max:30326 avg=30326
=> ftrace_raw_event_sched_wakeup_template (0xffffffff8105f653)
=> ttwu_do_wakeup (0xffffffff810606eb)
=> ttwu_do_activate.constprop.124 (0xffffffff810607c8)
=> try_to_wake_up (0xffffffff8106340a)
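The --comm option limits the profile output to the named task; leaving it off reports every task that ran while the profile was being recorded:
# trace-cmd report --profile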