zoukankan      html  css  js  c++  java
  • 关于strace的一点东西

           好久没写博客了,感觉有点羞愧,认为自己也应该静下心来利用自己可分配的时间去提升自己.
           尽管近期在看一些Python的东西,但是认为自己还是不能忘记本行啊,Linux C的一些东西必须一直好好温习.
           不废话,今天打算写一个简单的东西, strace , 应该都对它或多或少的知道一些.非常多时候除了core dump, backtrace的方式去排查程序的问题外, strace 跟踪调试也是不错的选择.

      看一小段測试code:

      /*
    ======================================================================
       *
       *       Filename:  strace_use_test.c
       *        Version:  1.0
       *        Created:  2014年07月19日 09时40分49秒
       *       Revision:  none
       *       Compiler:  clang
       *         Author:  sim szm, xianszm007@gmail.com
       *        Company:  Class 1107 of Computer Science and Technology
       *
       * ======================================================================
       */
      
      #include <unistd.h>
      #include <assert.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/types.h>
      #include <fcntl.h>
      #include <sys/stat.h>
      
      int main(int argc, char *argv[])
      {
              int fd,i=0;
              fd=open("/tmp/non_file",O_RDONLY);
              if(fd<0)
                    i=2;
              else
                    i=1;
              return i;
      }

    编译: gcc strace_use_test.c -o strace_use
      生成跟踪文件: strace -o strace_use.txt ./strace_use (事实上就是跟踪结果重定向到文件)
      我们来看下相应代码生成的strace_dump信息:

      szm@szm-Lenovo:~/some_code$ cat strace_use.txt 
      (1) execve("./strace_use", ["./strace_use"], [/* 74 vars */]) = 0
      (2) brk(0)                                  = 0x8a92000
      (3) uname({sys="Linux", node="szm-Lenovo", ...}) = 0
      (4) access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
      (5) mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775d000
      (6) access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
      (7) open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
      (8) fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0
      (9)mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb773c000
      (10) close(3)                                = 0
      (11) access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
      (12) open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
      (13) read(3, "177ELF1113310002261004"..., 512) = 512
      (14) fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
      (15) mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7592000
      (16) mprotect(0xb7735000, 4096, PROT_NONE)   = 0
      (17) mmap2(0xb7736000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7736000
      (18) mmap2(0xb7739000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7739000
      (19) close(3)                                = 0
      (20) mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7591000
      (21) set_thread_area({entry_number:-1 -> 6, base_addr:0xb75916c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
      (22) mprotect(0xb7736000, 8192, PROT_READ)   = 0
      (23) mprotect(0x8049000, 4096, PROT_READ)    = 0
      (24) mprotect(0xb777f000, 4096, PROT_READ)   = 0
      (25) munmap(0xb773c000, 131876)              = 0
      (26) open("/tmp/non_file", O_RDONLY)         = -1 ENOENT (No such file or directory)
      (27) exit_group(2)                           = ?

          (当然,标号是我自己编辑上去的,哈哈)

          看到这一堆,或许你会看到一些熟悉的函数调用,可是每行干嘛的预计是不是有点迷惑.所以我们来说说每一行都在干什么.事实上很多其它的就是每一行相应了一个系统调用:(以下的一些系统地址是我自己机子,换台机子肯定不同),几乎相同格式就是[系统调用的名( 參数... ) = 返回值 错误标志和描写叙述] 这样的样子,还是看看啥意思吧.
       1: 对于命令行下运行的程序,execve(或exec系列调用中的某一个)均为strace输出系统调用中的第一个。strace首先调用fork或clone函数新建一个子进程。然后在子进程中调用exec加载须要运行的程序(这里为./strace_use)
       2: 以0作为參数调用brk。返回值为内存管理的起始地址(若在子进程中调用malloc,则从0x8a92000地址開始分配空间)
       3: 调用access函数检验/etc/ld.so.nohwcap是否存在
       5: 使用mmap2函数进行匿名内存映射,以此来获取8192bytes内存空间。该空间起始地址为0xb775d000,匿名内存映射就是不涉及详细的文件名称,避免了文件的创建及打开,非常显然仅仅能用于具有亲缘关系的进程间通信(这个解释或许有点简单,google一下吧)
       7: 调用open函数尝试打开/etc/ld.so.cache文件。返回文件描写叙述符为3
       8: fstat64函数获取/etc/ld.so.cache文件信息
       9: 调用mmap2函数将/etc/ld.so.cache文件映射至内存
       10: close关闭文件描写叙述符为3指向的/etc/ld.so.cache文件
       13: 调用read,从/lib/i386-linux-gnu/libc.so.6该libc库文件里读取512bytes,即读取ELF头信息
       16: 使用mprotect函数对0xb7735000,起始的4096bytes空间进行保护(PROT_NONE參数就是不能訪问,相应还有PROT_READ表示能够读取)
       25: 调用munmap函数。将/etc/ld.so.cache文件从内存中去映射,与9行的mmap2函数相应
       26: 相应源代码中使用到的唯一的系统调用——open函数。使用其打开/tmp/non_file文件
       27: 子进程结束。退出码为2(源代码返回值)
      从之前的源代码看来,真正能与源代码相应上的仅仅有open这一个系统调用(Line 26)。其它系统调用差点儿都用于进行进程初始化工作:装载被运行程序、加载libc函数库、设置内存映射等。并且源代码中的if语句或其它代码在相应strace输出中并没有体现,由于它们并没有唤起系统调用。所以strace仅仅关心程序与系统之间产生的交互,进而strace不适用于程序逻辑代码的排错和分析。

    当然对于Linux中几百个系统调用,上面strace输出的几个仅仅是非常小一块啦.
      
      以下说说strace一般经常使用的几点吧:
      首先那就是跟踪进程喽, 默认情况下。strace仅仅跟踪指定的进程。而不正确指定进程中新建的子进程进行跟踪。使用-f选项,可对进程中新建的子进程进行跟踪。并在输出结果中打印对应进程PID:
      我们吧strace_use_test.c的代码改为例如以下:      

      /*
    ======================================================================
       *
       *       Filename:  strace_use_test.c
       *        Version:  1.0
       *        Created:  2014年07月19日 09时40分49秒
       *       Revision:  none
       *       Compiler:  clang
       *         Author:  sim szm, xianszm007@gmail.com
       *        Company:  Class 1107 of Computer Science and Technology
       *
       * ======================================================================
       */
      
      #include <unistd.h>
      #include <assert.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/types.h>
      #include <fcntl.h>
      #include <sys/stat.h>
      
      int main(int argc, char *argv[])
      {
              pid_t pid;
              pid=fork();
              if(pid>0){
                      printf("child do_something here !
    ");
              }else if(pid==0){
                      printf("parent area print!
    ");
              }else{
                      perror("fork error !");
                      _exit(0);
              }
              return 0;
      }
      

    相应加上 -f 选项跟踪父子进程, dump 信息例如以下:
      szm@szm-Lenovo:~/some_code$ strace -o strace_use.txt -f ./strace_use 
      parent area print!
      child do_something here !
      szm@szm-Lenovo:~/some_code$ cat strace_use.txt 
      9256  execve("./strace_use", ["./strace_use"], [/* 74 vars */]) = 0
      9256  brk(0)                            = 0x8d8d000
      9256  uname({sys="Linux", node="szm-Lenovo", ...}) = 0
      9256  access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
      9256  mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7799000
      9256  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
      9256  open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
      9256  fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0
      9256  mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7778000
      9256  close(3)                          = 0
      9256  access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
      9256  open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
      9256  read(3, "177ELF1113310002261004"..., 512) = 512
      9256  fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
      9256  mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75ce000
      9256  mprotect(0xb7771000, 4096, PROT_NONE) = 0
      9256  mmap2(0xb7772000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7772000
      9256  mmap2(0xb7775000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7775000
      9256  close(3)                          = 0
      9256  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75cd000
      9256  set_thread_area({entry_number:-1 -> 6, base_addr:0xb75cd6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
      9256  mprotect(0xb7772000, 8192, PROT_READ) = 0
      9256  mprotect(0x8049000, 4096, PROT_READ) = 0
      9256  mprotect(0xb77bb000, 4096, PROT_READ) = 0
      9256  munmap(0xb7778000, 131876)        = 0
      9256  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75cd728) = 9257
      9257  fstat64(1,  <unfinished ...>
      9256  fstat64(1,  <unfinished ...>
      9257  <... fstat64 resumed> {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
      9256  <... fstat64 resumed> {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
      9257  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
      9256  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
      9257  <... mmap2 resumed> )             = 0xb7798000
      9256  <... mmap2 resumed> )             = 0xb7798000
      9257  write(1, "parent area print!
    ", 19 <unfinished ...>
      9256  write(1, "child do_something here !
    ", 26 <unfinished ...>
      9257  <... write resumed> )             = 19
      9256  <... write resumed> )             = 26
      9257  exit_group(0)                     = ?
      9256  exit_group(0)                     = ?

           9256   clone(child_stack=0,flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75cd728) = 9257
    即9256这个进程内核进行clone()产生9527.

           所以非常多时候多进程的一些程序strace跟踪时,一般打开 -f 选项, 使用strace对执行中的程序进行跟踪。-p 參数就可以,命令执行之后。被跟踪的进程照常执行,strace的其它选项也适用于执行中的进程跟踪。有时间要求的时候,strace还能够记录程序与系统交互时,各个系统调用发生时的时间信息,有r、t、tt、ttt、T等几个选项,它们记录时间的方式为:
      -T: 记录各个系统调用花费的时间,精确到微秒
      -r: 以第一个系统调用(通常为execve)计时,精确到微秒
      -t: 时:分:秒
      -tt: 时:分:秒. 微秒
      -ttt: 计算机纪元以来的秒数. 微秒
      
            Strace 还能够用来推断程序挂起的问题,继续改动程序例如以下:

    /*
    ======================================================================
     *
     *       Filename:  strace_use_test.c
     *        Version:  1.0
     *        Created:  2014年07月19日 09时40分49秒
     *       Revision:  none
     *       Compiler:  clang
     *         Author:  sim szm, xianszm007@gmail.com
     *        Company:  Class 1107 of Computer Science and Technology
     *
     * ======================================================================
     */
    
    #include <unistd.h>
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <sys/stat.h>
    
    int main(int argc, char *argv[])
    {
            printf("flag see !
    ");
            if(strcmp(argv[1],"flag1")==0){
                    while(1);
            }else if(strcmp(argv[1],"flag2")==0){
                    sleep(400);
            }
            return 0;
    }
    

            flag1 能够用来模拟用户态死循环挂死状态,flag2 能够用来模拟内核态挂死状态.由于sleep()是陷入内核的.
    看下相应的 strace dump结果:
          首先相应的是flag1

    szm@szm-Lenovo:~/some_code$ strace ./strace_use flag1
    execve("./strace_use", ["./strace_use", "flag1"], [/* 74 vars */]) = 0
    brk(0)                                  = 0x8cff000
    uname({sys="Linux", node="szm-Lenovo", ...}) = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77a8000
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0
    mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7787000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "177ELF1113310002261004"..., 512) = 512
    fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
    mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75dd000
    mprotect(0xb7780000, 4096, PROT_NONE)   = 0
    mmap2(0xb7781000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7781000
    mmap2(0xb7784000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7784000
    close(3)                                = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75dc000
    set_thread_area({entry_number:-1 -> 6, base_addr:0xb75dc6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
    mprotect(0xb7781000, 8192, PROT_READ)   = 0
    mprotect(0x8049000, 4096, PROT_READ)    = 0
    mprotect(0xb77ca000, 4096, PROT_READ)   = 0
    munmap(0xb7787000, 131876)              = 0
    fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77a7000
    write(1, "flag see !
    ", 11flag see !
    )            = 11
    

          一直卡在这里循环

          再看看flag2:

    szm@szm-Lenovo:~/some_code$ strace ./strace_use flag2
    execve("./strace_use", ["./strace_use", "flag2"], [/* 74 vars */]) = 0
    brk(0)                                  = 0x83c0000
    uname({sys="Linux", node="szm-Lenovo", ...}) = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb777a000
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0
    mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7759000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "177ELF1113310002261004"..., 512) = 512
    fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
    mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75af000
    mprotect(0xb7752000, 4096, PROT_NONE)   = 0
    mmap2(0xb7753000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7753000
    mmap2(0xb7756000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7756000
    close(3)                                = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75ae000
    set_thread_area({entry_number:-1 -> 6, base_addr:0xb75ae6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
    mprotect(0xb7753000, 8192, PROT_READ)   = 0
    mprotect(0x8049000, 4096, PROT_READ)    = 0
    mprotect(0xb779c000, 4096, PROT_READ)   = 0
    munmap(0xb7759000, 131876)              = 0
    fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7779000
    write(1, "flag see !
    ", 11flag see !
    )            = 11
    rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
    rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    nanosleep({400, 0},
      陷入sleep

      从输出的跟踪信息能够看出两中情况的不同, 当用户态挂死情况下。strace在write(由调用的printf产生write进描写叙述符1,也就是标准输出) 一行输出之后没有其它系统调用输出。进程在内核态挂死,最后一行的系统调用nanosleep不能完整显示 (这里nanosleep没有返回值表示该调用尚未完毕 ) 所以我们就能够知道:使用strace跟踪挂死程序,假设最后一行系统调用显示完整,程序在逻辑代码处挂死。假设最后一行系统调用显示不完整。程序在该系统调用处挂死。



      当程序挂死在系统调用处,我们能够查看对应系统调用的man手冊。了解在什么情况下该系统调用会出现挂死情况。
      
      当然strace 还有其它一堆參数,就不说了,事实上我也没看,哈哈.经常使用跟踪进程信息使用方法的就是strace -o output.txt -T -tt -e trace=all -f -p pid (-e trace=all 意思是跟踪全部系统调用)

           Ok, 周末一上午写篇Blog 总比睡觉好.Over .

  • 相关阅读:
    多说社交评论插件学习《一》
    《转》每天起床时,优秀创业者都会问自己这3个问题
    老赵面试题参考答案(二)《转》
    老赵面试题参考答案(一)《转》
    20个开源项目托管站点推荐
    .NET面试题(二)
    .NET面试题(一)
    Python笔记(28)-----继承
    pytorch实战(2)-----回归例子
    深度学习之入门Pytorch(1)------基础
  • 原文地址:https://www.cnblogs.com/llguanli/p/6941585.html
Copyright © 2011-2022 走看看