zoukankan      html  css  js  c++  java
  • 关于strace的一点东西

           好久没写博客了,感觉有点羞愧,认为自己也应该静下心来利用自己可分配的时间去提升自己.
           尽管近期在看一些Python的东西,但是认为自己还是不能忘记本行啊,Linux C的一些东西必须一直好好温习.
           不废话,今天打算写一个简单的东西, strace , 应该都对它或多或少的知道一些.非常多时候除了core dump, backtrace的方式去排查程序的问题外, strace 跟踪调试也是不错的选择.

      看一小段測试code:

      /*
    ======================================================================
       *
       *       Filename:  strace_use_test.c
       *        Version:  1.0
       *        Created:  2014年07月19日 09时40分49秒
       *       Revision:  none
       *       Compiler:  clang
       *         Author:  sim szm, xianszm007@gmail.com
       *        Company:  Class 1107 of Computer Science and Technology
       *
       * ======================================================================
       */
      
      #include <unistd.h>
      #include <assert.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/types.h>
      #include <fcntl.h>
      #include <sys/stat.h>
      
      int main(int argc, char *argv[])
      {
              int fd,i=0;
              fd=open("/tmp/non_file",O_RDONLY);
              if(fd<0)
                    i=2;
              else
                    i=1;
              return i;
      }

    编译: gcc strace_use_test.c -o strace_use
      生成跟踪文件: strace -o strace_use.txt ./strace_use (事实上就是跟踪结果重定向到文件)
      我们来看下相应代码生成的strace_dump信息:

      szm@szm-Lenovo:~/some_code$ cat strace_use.txt 
      (1) execve("./strace_use", ["./strace_use"], [/* 74 vars */]) = 0
      (2) brk(0)                                  = 0x8a92000
      (3) uname({sys="Linux", node="szm-Lenovo", ...}) = 0
      (4) access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
      (5) mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775d000
      (6) access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
      (7) open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
      (8) fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0
      (9)mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb773c000
      (10) close(3)                                = 0
      (11) access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
      (12) open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
      (13) read(3, "177ELF1113310002261004"..., 512) = 512
      (14) fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
      (15) mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7592000
      (16) mprotect(0xb7735000, 4096, PROT_NONE)   = 0
      (17) mmap2(0xb7736000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7736000
      (18) mmap2(0xb7739000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7739000
      (19) close(3)                                = 0
      (20) mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7591000
      (21) set_thread_area({entry_number:-1 -> 6, base_addr:0xb75916c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
      (22) mprotect(0xb7736000, 8192, PROT_READ)   = 0
      (23) mprotect(0x8049000, 4096, PROT_READ)    = 0
      (24) mprotect(0xb777f000, 4096, PROT_READ)   = 0
      (25) munmap(0xb773c000, 131876)              = 0
      (26) open("/tmp/non_file", O_RDONLY)         = -1 ENOENT (No such file or directory)
      (27) exit_group(2)                           = ?

          (当然,标号是我自己编辑上去的,哈哈)

          看到这一堆,或许你会看到一些熟悉的函数调用,可是每行干嘛的预计是不是有点迷惑.所以我们来说说每一行都在干什么.事实上很多其它的就是每一行相应了一个系统调用:(以下的一些系统地址是我自己机子,换台机子肯定不同),几乎相同格式就是[系统调用的名( 參数... ) = 返回值 错误标志和描写叙述] 这样的样子,还是看看啥意思吧.
       1: 对于命令行下运行的程序,execve(或exec系列调用中的某一个)均为strace输出系统调用中的第一个。strace首先调用fork或clone函数新建一个子进程。然后在子进程中调用exec加载须要运行的程序(这里为./strace_use)
       2: 以0作为參数调用brk。返回值为内存管理的起始地址(若在子进程中调用malloc,则从0x8a92000地址開始分配空间)
       3: 调用access函数检验/etc/ld.so.nohwcap是否存在
       5: 使用mmap2函数进行匿名内存映射,以此来获取8192bytes内存空间。该空间起始地址为0xb775d000,匿名内存映射就是不涉及详细的文件名称,避免了文件的创建及打开,非常显然仅仅能用于具有亲缘关系的进程间通信(这个解释或许有点简单,google一下吧)
       7: 调用open函数尝试打开/etc/ld.so.cache文件。返回文件描写叙述符为3
       8: fstat64函数获取/etc/ld.so.cache文件信息
       9: 调用mmap2函数将/etc/ld.so.cache文件映射至内存
       10: close关闭文件描写叙述符为3指向的/etc/ld.so.cache文件
       13: 调用read,从/lib/i386-linux-gnu/libc.so.6该libc库文件里读取512bytes,即读取ELF头信息
       16: 使用mprotect函数对0xb7735000,起始的4096bytes空间进行保护(PROT_NONE參数就是不能訪问,相应还有PROT_READ表示能够读取)
       25: 调用munmap函数。将/etc/ld.so.cache文件从内存中去映射,与9行的mmap2函数相应
       26: 相应源代码中使用到的唯一的系统调用——open函数。使用其打开/tmp/non_file文件
       27: 子进程结束。退出码为2(源代码返回值)
      从之前的源代码看来,真正能与源代码相应上的仅仅有open这一个系统调用(Line 26)。其它系统调用差点儿都用于进行进程初始化工作:装载被运行程序、加载libc函数库、设置内存映射等。并且源代码中的if语句或其它代码在相应strace输出中并没有体现,由于它们并没有唤起系统调用。所以strace仅仅关心程序与系统之间产生的交互,进而strace不适用于程序逻辑代码的排错和分析。

    当然对于Linux中几百个系统调用,上面strace输出的几个仅仅是非常小一块啦.
      
      以下说说strace一般经常使用的几点吧:
      首先那就是跟踪进程喽, 默认情况下。strace仅仅跟踪指定的进程。而不正确指定进程中新建的子进程进行跟踪。使用-f选项,可对进程中新建的子进程进行跟踪。并在输出结果中打印对应进程PID:
      我们吧strace_use_test.c的代码改为例如以下:      

      /*
    ======================================================================
       *
       *       Filename:  strace_use_test.c
       *        Version:  1.0
       *        Created:  2014年07月19日 09时40分49秒
       *       Revision:  none
       *       Compiler:  clang
       *         Author:  sim szm, xianszm007@gmail.com
       *        Company:  Class 1107 of Computer Science and Technology
       *
       * ======================================================================
       */
      
      #include <unistd.h>
      #include <assert.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/types.h>
      #include <fcntl.h>
      #include <sys/stat.h>
      
      int main(int argc, char *argv[])
      {
              pid_t pid;
              pid=fork();
              if(pid>0){
                      printf("child do_something here !
    ");
              }else if(pid==0){
                      printf("parent area print!
    ");
              }else{
                      perror("fork error !");
                      _exit(0);
              }
              return 0;
      }
      

    相应加上 -f 选项跟踪父子进程, dump 信息例如以下:
      szm@szm-Lenovo:~/some_code$ strace -o strace_use.txt -f ./strace_use 
      parent area print!
      child do_something here !
      szm@szm-Lenovo:~/some_code$ cat strace_use.txt 
      9256  execve("./strace_use", ["./strace_use"], [/* 74 vars */]) = 0
      9256  brk(0)                            = 0x8d8d000
      9256  uname({sys="Linux", node="szm-Lenovo", ...}) = 0
      9256  access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
      9256  mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7799000
      9256  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
      9256  open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
      9256  fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0
      9256  mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7778000
      9256  close(3)                          = 0
      9256  access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
      9256  open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
      9256  read(3, "177ELF1113310002261004"..., 512) = 512
      9256  fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
      9256  mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75ce000
      9256  mprotect(0xb7771000, 4096, PROT_NONE) = 0
      9256  mmap2(0xb7772000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7772000
      9256  mmap2(0xb7775000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7775000
      9256  close(3)                          = 0
      9256  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75cd000
      9256  set_thread_area({entry_number:-1 -> 6, base_addr:0xb75cd6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
      9256  mprotect(0xb7772000, 8192, PROT_READ) = 0
      9256  mprotect(0x8049000, 4096, PROT_READ) = 0
      9256  mprotect(0xb77bb000, 4096, PROT_READ) = 0
      9256  munmap(0xb7778000, 131876)        = 0
      9256  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75cd728) = 9257
      9257  fstat64(1,  <unfinished ...>
      9256  fstat64(1,  <unfinished ...>
      9257  <... fstat64 resumed> {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
      9256  <... fstat64 resumed> {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
      9257  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
      9256  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
      9257  <... mmap2 resumed> )             = 0xb7798000
      9256  <... mmap2 resumed> )             = 0xb7798000
      9257  write(1, "parent area print!
    ", 19 <unfinished ...>
      9256  write(1, "child do_something here !
    ", 26 <unfinished ...>
      9257  <... write resumed> )             = 19
      9256  <... write resumed> )             = 26
      9257  exit_group(0)                     = ?
      9256  exit_group(0)                     = ?

           9256   clone(child_stack=0,flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75cd728) = 9257
    即9256这个进程内核进行clone()产生9527.

           所以非常多时候多进程的一些程序strace跟踪时,一般打开 -f 选项, 使用strace对执行中的程序进行跟踪。-p 參数就可以,命令执行之后。被跟踪的进程照常执行,strace的其它选项也适用于执行中的进程跟踪。有时间要求的时候,strace还能够记录程序与系统交互时,各个系统调用发生时的时间信息,有r、t、tt、ttt、T等几个选项,它们记录时间的方式为:
      -T: 记录各个系统调用花费的时间,精确到微秒
      -r: 以第一个系统调用(通常为execve)计时,精确到微秒
      -t: 时:分:秒
      -tt: 时:分:秒. 微秒
      -ttt: 计算机纪元以来的秒数. 微秒
      
            Strace 还能够用来推断程序挂起的问题,继续改动程序例如以下:

    /*
    ======================================================================
     *
     *       Filename:  strace_use_test.c
     *        Version:  1.0
     *        Created:  2014年07月19日 09时40分49秒
     *       Revision:  none
     *       Compiler:  clang
     *         Author:  sim szm, xianszm007@gmail.com
     *        Company:  Class 1107 of Computer Science and Technology
     *
     * ======================================================================
     */
    
    #include <unistd.h>
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <sys/stat.h>
    
    int main(int argc, char *argv[])
    {
            printf("flag see !
    ");
            if(strcmp(argv[1],"flag1")==0){
                    while(1);
            }else if(strcmp(argv[1],"flag2")==0){
                    sleep(400);
            }
            return 0;
    }
    

            flag1 能够用来模拟用户态死循环挂死状态,flag2 能够用来模拟内核态挂死状态.由于sleep()是陷入内核的.
    看下相应的 strace dump结果:
          首先相应的是flag1

    szm@szm-Lenovo:~/some_code$ strace ./strace_use flag1
    execve("./strace_use", ["./strace_use", "flag1"], [/* 74 vars */]) = 0
    brk(0)                                  = 0x8cff000
    uname({sys="Linux", node="szm-Lenovo", ...}) = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77a8000
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0
    mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7787000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "177ELF1113310002261004"..., 512) = 512
    fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
    mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75dd000
    mprotect(0xb7780000, 4096, PROT_NONE)   = 0
    mmap2(0xb7781000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7781000
    mmap2(0xb7784000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7784000
    close(3)                                = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75dc000
    set_thread_area({entry_number:-1 -> 6, base_addr:0xb75dc6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
    mprotect(0xb7781000, 8192, PROT_READ)   = 0
    mprotect(0x8049000, 4096, PROT_READ)    = 0
    mprotect(0xb77ca000, 4096, PROT_READ)   = 0
    munmap(0xb7787000, 131876)              = 0
    fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77a7000
    write(1, "flag see !
    ", 11flag see !
    )            = 11
    

          一直卡在这里循环

          再看看flag2:

    szm@szm-Lenovo:~/some_code$ strace ./strace_use flag2
    execve("./strace_use", ["./strace_use", "flag2"], [/* 74 vars */]) = 0
    brk(0)                                  = 0x83c0000
    uname({sys="Linux", node="szm-Lenovo", ...}) = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb777a000
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0
    mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7759000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "177ELF1113310002261004"..., 512) = 512
    fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
    mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75af000
    mprotect(0xb7752000, 4096, PROT_NONE)   = 0
    mmap2(0xb7753000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7753000
    mmap2(0xb7756000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7756000
    close(3)                                = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75ae000
    set_thread_area({entry_number:-1 -> 6, base_addr:0xb75ae6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
    mprotect(0xb7753000, 8192, PROT_READ)   = 0
    mprotect(0x8049000, 4096, PROT_READ)    = 0
    mprotect(0xb779c000, 4096, PROT_READ)   = 0
    munmap(0xb7759000, 131876)              = 0
    fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7779000
    write(1, "flag see !
    ", 11flag see !
    )            = 11
    rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
    rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    nanosleep({400, 0},
      陷入sleep

      从输出的跟踪信息能够看出两中情况的不同, 当用户态挂死情况下。strace在write(由调用的printf产生write进描写叙述符1,也就是标准输出) 一行输出之后没有其它系统调用输出。进程在内核态挂死,最后一行的系统调用nanosleep不能完整显示 (这里nanosleep没有返回值表示该调用尚未完毕 ) 所以我们就能够知道:使用strace跟踪挂死程序,假设最后一行系统调用显示完整,程序在逻辑代码处挂死。假设最后一行系统调用显示不完整。程序在该系统调用处挂死。



      当程序挂死在系统调用处,我们能够查看对应系统调用的man手冊。了解在什么情况下该系统调用会出现挂死情况。
      
      当然strace 还有其它一堆參数,就不说了,事实上我也没看,哈哈.经常使用跟踪进程信息使用方法的就是strace -o output.txt -T -tt -e trace=all -f -p pid (-e trace=all 意思是跟踪全部系统调用)

           Ok, 周末一上午写篇Blog 总比睡觉好.Over .

  • 相关阅读:
    PAT (Advanced Level) 1114. Family Property (25)
    PAT (Advanced Level) 1113. Integer Set Partition (25)
    PAT (Advanced Level) 1112. Stucked Keyboard (20)
    PAT (Advanced Level) 1111. Online Map (30)
    PAT (Advanced Level) 1110. Complete Binary Tree (25)
    PAT (Advanced Level) 1109. Group Photo (25)
    PAT (Advanced Level) 1108. Finding Average (20)
    PAT (Advanced Level) 1107. Social Clusters (30)
    PAT (Advanced Level) 1106. Lowest Price in Supply Chain (25)
    PAT (Advanced Level) 1105. Spiral Matrix (25)
  • 原文地址:https://www.cnblogs.com/llguanli/p/6941585.html
Copyright © 2011-2022 走看看