对于 ftrace 的function tracer/function graph tracer 用的比较多,本文介绍 wakeup tracer/wakeup_rt tracer。
开始介绍之前,先扯两句别的,在一开始想了解wakeup/wakeup_rt tracer 的时候,因为内核Documents 中有ftrace的文档介绍,
直接在工程中的ftrace.txt 看了一下wakeup/wakeup_rt tracer 的介绍,当场看糊涂了,写的完全驴唇不对马嘴,工程中的kernel 是3.4的,
当时就怀疑Documents 应该是没跟上代码更新,看了一下linus torvals tree, 果然更新了。建议直接看最新版本的介绍:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/trace/ftrace.txt
1.wakeup/wakeup_rt tracer 是什么?
下面有详细的英文介绍,简单来说,wakeup/wakeup_rt tracer 是ftrace 的一个tracing utilities,是两种tracer.
区别在于wakeup tracer 是Trace 所有的进程中(实时/非实时进程)的schedule latency, 而wakeup_rt 只Trace 实时进程的最大schedule latency.
"wakeup"
Traces and records the max latency that it takes for
the highest priority task to get scheduled after
it has been woken up.
Traces all tasks as an average developer would expect.
"wakeup_rt"
Traces and records the max latency that it takes for just
RT tasks (as the current "wakeup" does). This is useful
for those interested in wake up timings of RT tasks.
2. wakeup tracer How-to
$ adb shell cat /sys/kernel/debug/tracing/available_tracers
function_graph wakeup_rt wakeup preemptirqsoff preemptoff irqsoff function nop
$ adb shell "echo wakeup > /sys/kernel/debug/tracing/current_tracer"
$ adb shell "echo 0 > /sys/kernel/debug/tracing/tracing_max_latency"
$ adb shell "echo 1 > /sys/kernel/debug/tracing/tracing_on"
$ adb shell cat /sys/kernel/debug/tracing/trace
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 3.4.0-gd26777b-00428-gd66afe1-dirty
# --------------------------------------------------------------------
# latency: 21865 us, #11408/11408, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)//wakeup latency 是21865us
# -----------------
# | task: PowerManagerSer-943 (uid:1000 nice:-4 policy:0 rt_prio:0)//trace 的进程是PowerManagerSer-943
# -----------------
#//下面是详细的trace,向你解释为什么PowerManagerSer-943 进程从wake up 到真正scheduled in 得以执行用了21865us.
//第一列是执行wakeup PowerManagerSer-943 进程的时候正在运行的进程,后面是运行于哪个CPU,中断开关情况,need-resched flag 的情况,
//中断服务程序中,还是软中断中,抢占深度
//当你看到第一列中的进程优先级并不高时,PowerManagerSer-943 为啥得不到执行,原因无非就是中断被关了,无法调度,或是抢占深度count 置位,无法
//抢占。
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# / ||||| | /
<...>-1108 0d..5 6us+: 1108:120:R + [000] 943:116:R PowerManagerSer
<...>-1108 0d..5 9us+: 0
<...>-1108 0d..4 11us : check_preempt_curr <-ttwu_do_wakeup
<...>-1108 0d..4 12us : check_preempt_wakeup <-check_preempt_curr
<...>-1108 0d..4 13us : update_curr <-check_preempt_wakeup
<...>-1108 0d..4 15us : wakeup_preempt_entity <-check_preempt_wakeup
<...>-1108 0d..4 16us : _raw_spin_unlock <-try_to_wake_up
<...>-1108 0d..4 17us+: sub_preempt_count <-_raw_spin_unlock
<...>-1108 0d..3 19us+: ttwu_stat <-try_to_wake_up
...
kworker/-31782 0d..2 21843us+: update_stats_wait_end <-set_next_entity
kworker/-31782 0d..2 21849us : _raw_spin_unlock_irq <-__schedule
kworker/-31782 0...2 21850us+: sub_preempt_count <-_raw_spin_unlock_irq
kworker/-31782 0...2 21852us : __schedule
kworker/-31782 0...2 21853us : 31782:120:S ==> [000] 943:116:R PowerManagerSer
没一列的详述:
cmd: The name of the process in the trace.
pid: The PID of that process.
CPU#: The CPU which the process was running on.
irqs-off: 'd' interrupts are disabled. '.' otherwise.
Note: If the architecture does not support a way to
read the irq flags variable, an 'X' will always
be printed here.
need-resched:
'N' both TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED is set,
'n' only TIF_NEED_RESCHED is set,
'p' only PREEMPT_NEED_RESCHED is set,
'.' otherwise.
hardirq/softirq:
'H' - hard irq occurred inside a softirq.
'h' - hard irq is running
's' - soft irq is running
'.' - normal context.
preempt-depth: The level of preempt_disabled
The above is mostly meaningful for kernel developers.
time: When the latency-format option is enabled, the trace file
output includes a timestamp relative to the start of the
trace. This differs from the output when latency-format
is disabled, which includes an absolute timestamp.
delay: This is just to help catch your eye a bit better. And
needs to be fixed to be only relative to the same CPU.
The marks are determined by the difference between this
current trace and the next trace.
'$' - greater than 1 second
'#' - greater than 1000 microsecond
'!' - greater than 100 microsecond
'+' - greater than 10 microsecond
' ' - less than or equal to 10 microsecond.
The rest is the same as the 'trace' file.
Note, the latency tracers will usually end with a back trace
to easily find where the latency occurred.
3.wakeup_rt tracer How-to
$ adb shell cat /sys/kernel/debug/tracing/available_tracers
function_graph wakeup_rt wakeup preemptirqsoff preemptoff irqsoff function nop
$ adb shell "echo wakeup_rt > /sys/kernel/debug/tracing/current_tracer"
$ adb shell "echo 0 > /sys/kernel/debug/tracing/tracing_max_latency"
$ adb shell "echo 1 > /sys/kernel/debug/tracing/tracing_on"
$ adb shell cat /sys/kernel/debug/tracing/trace
# tracer: wakeup_rt
#
# wakeup_rt latency trace v1.1.5 on 3.4.0-gd26777b-00428-gd66afe1-dirty
# --------------------------------------------------------------------
# latency: 1973 us, #967/967, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:1)
# -----------------
# | task: mdss_fb0-319 (uid:0 nice:0 policy:1 rt_prio:16)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# / ||||| | /
surfacef-253 2d..4 2us+: 253:112:R + [000] 319: 83:R mdss_fb0
surfacef-253 2d..4 8us+: 0
RenderTh-2982 0dN.3 16us+: _raw_spin_unlock <-task_sched_runtime
RenderTh-2982 0dN.3 19us+: sub_preempt_count <-_raw_spin_unlock
RenderTh-2982 0dN.2 21us+: _raw_spin_unlock_irqrestore <-task_sched_runtime
RenderTh-2982 0dN.2 23us+: sub_preempt_count <-_raw_spin_unlock_irqrestore
RenderTh-2982 0dN.1 25us+: task_sched_runtime <-thread_group_cputime
RenderTh-2982 0dN.1 27us+: task_rq_lock <-task_sched_runtime
RenderTh-2982 0dN.1 29us+: _raw_spin_lock_irqsave <-task_rq_lock
RenderTh-2982 0dN.1 31us+: add_preempt_count <-_raw_spin_lock_irqsave
RenderTh-2982 0dN.2 33us+: _raw_spin_lock <-task_rq_lock
RenderTh-2982 0dN.2 34us+: add_preempt_count <-_raw_spin_lock
RenderTh-2982 0dN.3 36us+: _raw_spin_unlock <-task_sched_runtime
RenderTh-2982 0dN.3 38us+: sub_preempt_count <-_raw_spin_unlock
RenderTh-2982 0dN.2 40us+: _raw_spin_unlock_irqrestore <-task_sched_runtime
RenderTh-2982 0dN.2 42us+: sub_preempt_count <-_raw_spin_unlock_irqrestore
RenderTh-2982 0dN.1 44us+: task_sched_runtime <-thread_group_cputime
RenderTh-2982 0dN.1 45us+: task_rq_lock <-task_sched_runtime
RenderTh-2982 0dN.1 47us+: _raw_spin_lock_irqsave <-task_rq_lock
RenderTh-2982 0dN.1 49us+: add_preempt_count <-_raw_spin_lock_irqsave
RenderTh-2982 0dN.2 51us+: _raw_spin_lock <-task_rq_lock
RenderTh-2982 0dN.2 53us+: add_preempt_count <-_raw_spin_lock
RenderTh-2982 0dN.3 55us+: _raw_spin_unlock <-task_sched_runtime
RenderTh-2982 0dN.3 57us+: sub_preempt_count <-_raw_spin_unlock
RenderTh-2982 0dN.2 59us+: _raw_spin_unlock_irqrestore <-task_sched_runtime
RenderTh-2982 0dN.2 60us+: sub_preempt_count <-_raw_spin_unlock_irqrestore
RenderTh-2982 0dN.1 63us+: task_sched_runtime <-thread_group_cputime
RenderTh-2982 0dN.1 64us+: task_rq_lock <-task_sched_runtime
RenderTh-2982 0dN.1 66us+: _raw_spin_lock_irqsave <-task_rq_lock
RenderTh-2982 0dN.1 68us+: add_preempt_count <-_raw_spin_lock_irqsave
RenderTh-2982 0dN.2 70us+: _raw_spin_lock <-task_rq_lock
RenderTh-2982 0dN.2 71us+: add_preempt_count <-_raw_spin_lock
RenderTh-2982 0dN.3 73us+: _raw_spin_unlock <-task_sched_runtime
RenderTh-2982 0dN.3 76us+: sub_preempt_count <-_raw_spin_unlock
RenderTh-2982 0dN.2 77us+: _raw_spin_unlock_irqrestore <-task_sched_runtime
RenderTh-2982 0dN.2 79us+: sub_preempt_count <-_raw_spin_unlock_irqrestore
RenderTh-2982 0dN.1 81us+: task_sched_runtime <-thread_group_cputime
from: https://blog.csdn.net/fervor_heart/article/details/46968087