  • SystemTap, a Powerful Kernel Debugging Tool: More Features and Internals (Part 3)

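    A Linux trace/probe tool.

    Official site: https://sourceware.org/systemtap/

    User Space

    To probe user-space programs, SystemTap needs kernel support for user-space tracing. Older kernels need utrace; mainline kernels from 3.5 onward ship the built-in uprobes facility, which SystemTap uses instead.

    For kernels older than 3.5 you have to apply the utrace patches yourself, unless your distribution's kernel already carries them.

    More information: http://sourceware.org/systemtap/wiki/utrace

    Requirements:

    debugging information for the named program

    utrace (or uprobes) support in the kernel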

    (1) Begin/end

    Probe points:

    when a process/thread is created

    when a process/thread exits

    process.begin

    process("PATH").begin

    process(PID).begin

    process.thread.begin

    process("PATH").thread.begin

    process(PID).thread.begin

    process.end

    process("PATH").end

    process(PID).end

    process.thread.end

    process("PATH").thread.end

    process(PID).thread.end

    (2) Syscall

    Probe points:

    system call entry

    system call return

    process.syscall

    process("PATH").syscall

    process(PID).syscall

    process.syscall.return

    process("PATH").syscall.return

    process(PID).syscall.return

    Available context variables:

    $syscall // system call number

    $argN ($arg1~$arg6) // system call arguments

    $return // system call return value (in .syscall.return probes)
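    To make these variables concrete, here is a minimal C sketch of a fragment a target program might contain. The helper name raw_write is hypothetical and does not come from the original article; Linux and glibc's syscall() wrapper are assumed. A process.syscall probe on a program that calls it would be expected to see the values noted in the comment.

```c
#define _GNU_SOURCE          /* exposes the syscall() declaration in glibc */
#include <string.h>
#include <sys/syscall.h>     /* SYS_write */
#include <unistd.h>          /* syscall() */

/*
 * Hypothetical helper: issue write(2) through the generic syscall() entry
 * point so that the system call number and its arguments are explicit.
 *
 *   $syscall -> SYS_write
 *   $arg1    -> fd
 *   $arg2    -> address of msg
 *   $arg3    -> strlen(msg)
 *   $return  -> number of bytes written (the raw kernel return value)
 */
long raw_write(int fd, const char *msg)
{
    return syscall(SYS_write, fd, msg, strlen(msg));
}
```

    Calling raw_write(1, "hi\n") writes three bytes to standard output, so the matching .syscall.return probe should report 3 in $return when the write succeeds.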

    (3) Function/statement

    Probe points:

    function entry

    function return

    a specific line in a source file

    a label inside a function

    process("PATH").function("NAME")

    process("PATH").statement("*@FILE.c:123")

    process("PATH").function("*").return

    process("PATH").function("myfunc").label("foo")
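    As an illustration of what these probe points attach to, here is a minimal C sketch of a function a target program could contain. The names myfunc and foo are taken from the last probe point above; the function body is invented for illustration. A main() that calls myfunc is assumed and omitted, and the program is assumed to be built with -g and without optimisation so that the label and the local variables survive in the debugging information.

```c
/*
 * Hypothetical probe target. The names myfunc and foo match
 * process("PATH").function("myfunc").label("foo"); the logic is made up.
 */
int myfunc(int n)
{
    int total = 0;   /* local variable, a candidate for $total in a probe */
    int i = 0;

foo:                 /* label that a .label("foo") probe could target */
    if (i < n) {
        total += i;
        i++;
        goto foo;
    }
    return total;    /* value a .return probe would see as $return */
}
```

    With this layout, a function("myfunc") probe would be expected to fire once per call, while a label("foo") probe would fire every time control reaches foo, including each jump back from the goto.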

    (4) Absolute variant

    Probe point:

    a virtual address in a process

    process(PID).statement(ADDRESS).absolute

    A non-symbolic probe point uses raw, unverified virtual addresses and provides no $variables.

    The target PID parameter must identify a running process and ADDRESS must identify a valid instruction address.

    This is a guru mode probe.

    (5) Target process

    Probe points:

    functions in shared libraries (for example glibc)

    Target process mode (invoked with stap -c CMD or -x PID) implicitly restricts all process.* probes to the given child process.

    If PATH names a shared library, all processes that map that shared library can be probed.

    If dwarf debugging information is installed, try using a command with this syntax:

    probe process("/lib64/libc-2.8.so").function("...") { ... }

    (6) Instruction probes

    Probe points:

    single instructions

    instruction blocks

    process("PATH").insn

    process(PID).insn

    process("PATH").insn.block

    process(PID).insn.block

    The .insn probe is called for every single-stepped instruction of the process described by PID or PATH.

    The .insn.block probe is called for every block-stepped instruction of the process described by PID or PATH.

    Using this feature will significantly slow process execution.

    Count how many instructions a process executes:

    stap -e 'global steps; probe process("/bin/ls").insn {steps++}; probe end {printf("Total instructions: %d\n", steps)}' \
        -c /bin/ls

    (7) Usage

    gcc -g3 -o test test.c

    stap -L 'process("./test").function("*")' // list the functions in the program and their variables

    Debug levels:

    gcc's -gLEVEL option requests debugging information and uses LEVEL to specify how much. The default level is 2.

    Level 0 produces no debug information at all. Thus, -g0 negates -g.

    Level 1 produces minimal information, enough for making backtraces in parts of the program that you don't plan to debug. This includes descriptions of functions and external variables, but no information about local variables and no line numbers.

    Level 3 includes extra information, such as all the macro definitions present in the program.

    Advanced Features

    (1) Building your own tapset library

    A tapset is just a script that is designed for reuse by installation into a special directory.

    SystemTap attempts to resolve references to global symbols (probes, functions, variables) that are not defined within the script by a systematic search through the tapset library for scripts that define those symbols.

    A user may give additional directories with the -I DIR option.

    Building your own library:

    1. Create a library directory mylib and add two library files

    time-default.stp

    function __time_value() {
    	return gettimeofday_us()
    }

    time-common.stp

    global __time_vars
    
    function timer_begin(name) {
    	__time_vars[name] = __time_value()
    }
    
    function timer_end(name) {
    	return __time_value() - __time_vars[name]
    }

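    2. Write a script that uses the library

    tapset-time-user.stp

    probe begin {
    	timer_begin("bench")
    	for(i=0; i<1000; i++) ;
    	// __time_value() wraps gettimeofday_us(), so the result is in microseconds
    	printf("%d us\n", timer_end("bench"))
    	exit()
    }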

    3. Run it

    stap -I mylib/ tapset-time-user.stp

    (2) Probe point aliases

    Aliases are mainly used to provide a layer of abstraction on top of existing probe points.

    Probe point aliases allow creation of new probe points from existing ones.

    This is useful if the new probe points are named to provide a higher level of abstraction.

    Syntax:

    probe new_name = existing_name1, existing_name2[, ..., existing_nameN]
    {
        prepending behavior
    }

    Example:

    probe syscallgroup.io = syscall.open, syscall.close,
    	  	     syscall.read, syscall.write
    {
    	groupname = "io"
    }
    
    probe syscallgroup.process = syscall.fork, syscall.execve
    {
    	groupname = "process"
    }
    
    probe syscallgroup.*
    {
    	groups[execname() . "/" . groupname]++
    }
    
    global groups
    
    probe end
    {
    	foreach (eg in groups+)
    		printf("%s: %d\n", eg, groups[eg])
    }


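    (3) Embedded C

    SystemTap provides an "escape hatch" to go beyond what the language can safely offer.

    Embedded C code is enclosed in %{ and %}, and the script must be run with the -g (guru mode) option.

    Older releases provide a THIS macro for reading function arguments and storing the function's return value (THIS->arg, THIS->__retvalue); newer releases use STAP_ARG_arg and STAP_RETVALUE for the same purpose.

    Example: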

    %{
    #include <linux/sched.h>
    #include <linux/list.h>
    %}
    
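    function process_list()
    %{
    	struct task_struct *p;
    	struct list_head *_p, *_n;
    
    	/* Header row; the output goes to the kernel log, not to stap's stdout. */
    	printk("%-20s%-10s\n", "program", "pid");
    
    	/*
    	 * Walk the task list starting from the current task's list node.
    	 * The current task itself is therefore not printed, and no lock is
    	 * taken, which is acceptable only for a quick demonstration.
    	 */
    	list_for_each_safe(_p, _n, &current->tasks) {
    		p = list_entry(_p, struct task_struct, tasks);
    		printk("%-20s%-10d\n", p->comm, p->pid);
    	}
    %}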
    
    probe begin {
    	process_list()
    	exit()
    }

    stap -g embeded-c.stp

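    The printed process list can then be seen in the dmesg output.

    C code enclosed in %{ ... %} can be a standalone top-level block, the body of a function, or just an expression.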

    (4) Existing tapset library

    SystemTap ships with an extensive tapset library by default. The main categories are:

    Context Functions

    Timestamp Functions

    Time utility functions

    Shell command functions

    Memory Tapset

    Task Time Tapset

    Scheduler Tapset

    IO Scheduler and block IO Tapset

    SCSI Tapset

    TTY Tapset

    Interrupt Request (IRQ) Tapset

    Networking Tapset

    Socket Tapset

    SNMP Information Tapset

    Kernel Process Tapset

    Signal Tapset

    Errno Tapset

    Device Tapset

    Directory-entry (dentry) Tapset

    Logging Tapset

    Queue Statistics Tapset

    Random functions Tapset

    String and data retrieving functions Tapset

    String and data writing functions Tapset

    Guru tapsets

    A collection of standard string functions

    Utility functions for using ansi control chars in logs

    SystemTap Translator Tapset

    Network File Storage Tapsets

    Speculation

    How It Works

    (1) How a SystemTap script is executed

    pass1

    During the parsing of the code, it is represented internally in a parse tree.

    Preprocessing is performed during this step, and the code is checked for semantic and syntax errors.

    pass2

    During the elaboration step, the symbols and references in the SystemTap script are resolved.

    Also, any tapsets that are referenced in the SystemTap script are imported.

    Debug data read from the DWARF information (DWARF is a widely used, standardized debugging data format), which is produced during kernel compilation, is used to find the addresses of functions and variables referenced in the script, and allows probes to be placed inside functions.

    pass3

    Takes the output from the elaboration phase and converts it into C source code.

    Variables used by multiple probes are protected by locks. Safety checks, and any necessary locking, are handled during the translation. The code is also converted to use the Kprobes API for inserting probe points into the kernel.

    pass4

    Once the SystemTap script has been translated into a C source file, the code is compiled into a module that can be dynamically loaded and executed in the kernel.

    pass5

    Once the module is built, SystemTap loads the module into the kernel.

    When the module loads, an init routine in the module starts running and begins inserting probes into their proper locations. Hitting a probe causes execution to stop while the handler for that probe is called. When the handler exits, normal execution continues. The module continues waiting for probes and executing handler code until the script exits, or until the user presses Ctrl-c, at which time SystemTap removes the probes, unloads the module, and exits.

    Output from SystemTap is transferred from the kernel through a mechanism called relayfs, and sent to STDOUT.

    (2) SystemTap script execution as seen from user space and kernel space

    (3) kprobes

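    Breakpoint instruction: __asm INT 3, machine code 0xCC.

    The breakpoint interrupt (INT 3) is a software interrupt. When the CPU executes an INT 3 instruction, it saves the current program pointer (CS and EIP) by pushing it onto the stack, then calls the interrupt routine registered for INT 3 through the interrupt vector table.

    INT is the software interrupt instruction, and the interrupt vector table maps interrupt numbers to the addresses of their handler functions.

    INT 3 raises software interrupt 3. The address of its handler is stored in the table entry at: interrupt vector table address + 4 * 3 (with the 4-byte entries of the x86 real-mode table).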
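    The "table address + 4 * 3" arithmetic can be written out as a small C sketch. The function and macro names are hypothetical, and the 4-byte entry size assumes the x86 real-mode interrupt vector table, where each entry is a 16-bit offset followed by a 16-bit segment.

```c
#include <stdint.h>

/* Size of one real-mode interrupt vector: 16-bit offset + 16-bit segment. */
#define IVT_ENTRY_SIZE 4u

/*
 * Address of the table slot that holds the handler pointer for `vector`.
 * `ivt_base` is conventionally 0 in real mode; it is a parameter here only
 * to mirror the "table address + 4 * n" formula in the text.
 */
uint32_t ivt_entry_address(uint32_t ivt_base, uint8_t vector)
{
    return ivt_base + IVT_ENTRY_SIZE * (uint32_t)vector;
}
```

    For vector 3 and a table at address 0 this gives 12 (0x0C). In protected mode the CPU uses an interrupt descriptor table with 8-byte gate descriptors (16 bytes in 64-bit mode) instead, so the multiplier differs there.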

    A Kprobe is a general purpose hook that can be inserted almost anywhere in the kernel code. To allow it to probe an instruction, the first byte of the instruction is replaced with the breakpoint instruction for the architecture being used. When this breakpoint is hit, Kprobe takes over execution, executes its handler code for the probe, and then continues execution at the next instruction.
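    The byte-replacement step can be imitated in user space on an ordinary byte array. The sketch below is a simulation under that assumption, not kernel code and not the Kprobes API; the names sim_probe, sim_probe_arm and sim_probe_disarm are hypothetical. It only shows saving the original first byte, writing the x86 breakpoint opcode over it, and restoring it later. Executing the displaced instruction after the handler runs is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define BREAKPOINT_OPCODE 0xCCu   /* x86 INT 3 */

/* Hypothetical bookkeeping for one simulated probe. */
struct sim_probe {
    uint8_t *addr;          /* first byte of the probed instruction */
    uint8_t  saved_opcode;  /* original byte, kept so it can be restored */
    int      armed;         /* non-zero while the breakpoint is in place */
};

/* Save the original first byte and overwrite it with the breakpoint. */
void sim_probe_arm(struct sim_probe *p, uint8_t *insn)
{
    p->addr = insn;
    p->saved_opcode = *insn;
    *insn = BREAKPOINT_OPCODE;
    p->armed = 1;
}

/* Put the original byte back, as happens when a probe is removed. */
void sim_probe_disarm(struct sim_probe *p)
{
    if (p->armed) {
        *p->addr = p->saved_opcode;
        p->armed = 0;
    }
}
```

    Arming a probe on a buffer that starts with 0x55 leaves 0xCC in the first byte and 0x55 in saved_opcode; disarming restores the buffer to its original contents. Only the first byte is touched, which matches the description above.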

    (4) Kernel features it depends on

    kprobes/jprobes

    return probes

    reentrancy

    colocated (multiple)

    relayfs

    scalability (unlocked handlers)

    user-space probes

  • Original article: https://www.cnblogs.com/aiwz/p/6333278.html