zoukankan      html  css  js  c++  java
  • execsnoop-短时进程追踪工具

    在实际工作中,偶尔会遇到系统的CPU使用率和系统平均负载很高,但却找不到高CPU的应用;

    产生这个问题的原因:进程有可能在不断的崩溃、重启

    通过uptime发现系统负载很高,但是通过top,mpstat,pidstat,perf等工具很难发现是什么进程导致了系统负载和CPU使用率很高;

    注:通过上面工具的判断,即不是CPU密集型,也不存在IO等待,也不存在进程、线程争用的情况

    execsnoop-专门用于为追踪短时进程(瞬时进程)设计的工具;

    它通过 ftrace 实时监控进程的 exec() 行为,并输出短时进程的基本信息,包括进程 PID、父进程 PID、命令行参数以及执行的结果。

    github地址:https://github.com/brendangregg/perf-tools/blob/master/execsnoop

    如何安装使用:将上面的github的内容复制,然后写入execsnoop文件,并且加上x权限即可;

    [root@localhost ~]# cat execsnoop 
    #!/bin/bash
    #
    # execsnoop - trace process exec() with arguments.
    #             Written using Linux ftrace.
    #
    # This shows the execution of new processes, especially short-lived ones that
    # can be missed by sampling tools such as top(1).
    #
    # USAGE: ./execsnoop [-hrt] [-n name]
    #
    # REQUIREMENTS: FTRACE and KPROBE CONFIG, sched:sched_process_fork tracepoint,
    # and either the sys_execve, stub_execve or do_execve kernel function. You may
    # already have these on recent kernels. And awk.
    #
    # This traces exec() from the fork()->exec() sequence, which means it won't
    # catch new processes that only fork(). With the -r option, it will also catch
    # processes that re-exec. It makes a best-effort attempt to retrieve the program
    # arguments and PPID; if these are unavailable, 0 and "[?]" are printed
    # respectively. There is also a limit to the number of arguments printed (by
    # default, 8), which can be increased using -a.
    #
    # This implementation is designed to work on older kernel versions, and without
    # kernel debuginfo. It works by dynamic tracing an execve kernel function to
    # read the arguments from the %si register. The sys_execve function is tried
    # first, then stub_execve and do_execve. The sched:sched_process_fork
    # tracepoint is used to get the PPID. This program is a workaround that should be
    # improved in the future when other kernel capabilities are made available. If
    # you need a more reliable tool now, then consider other tracing alternatives
    # (eg, SystemTap). This tool is really a proof of concept to see what ftrace can
    # currently do.
    #
    # From perf-tools: https://github.com/brendangregg/perf-tools
    #
    # See the execsnoop(8) man page (in perf-tools) for more info.
    #
    # COPYRIGHT: Copyright (c) 2014 Brendan Gregg.
    #
    #  This program is free software; you can redistribute it and/or
    #  modify it under the terms of the GNU General Public License
    #  as published by the Free Software Foundation; either version 2
    #  of the License, or (at your option) any later version.
    #
    #  This program is distributed in the hope that it will be useful,
    #  but WITHOUT ANY WARRANTY; without even the implied warranty of
    #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    #  GNU General Public License for more details.
    #
    #  You should have received a copy of the GNU General Public License
    #  along with this program; if not, write to the Free Software Foundation,
    #  Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
    #
    #  (http://www.gnu.org/copyleft/gpl.html)
    #
    # 07-Jul-2014   Brendan Gregg   Created this.
    
    ### default variables
    tracing=/sys/kernel/debug/tracing
    flock=/var/tmp/.ftrace-lock; wroteflock=0
    opt_duration=0; duration=; opt_name=0; name=; opt_time=0; opt_reexec=0
    opt_argc=0; argc=8; max_argc=16; ftext=
    trap ':' INT QUIT TERM PIPE HUP # sends execution to end tracing section
    
    function usage {
            cat <<-END >&2
            USAGE: execsnoop [-hrt] [-a argc] [-d secs] [name]
                             -d seconds      # trace duration, and use buffers
                             -a argc         # max args to show (default 8)
                             -r              # include re-execs
                             -t              # include time (seconds)
                             -h              # this usage message
                             name            # process name to match (REs allowed)
              eg,
                   execsnoop                 # watch exec()s live (unbuffered)
                   execsnoop -d 1            # trace 1 sec (buffered)
                   execsnoop grep            # trace process names containing grep
                   execsnoop 'udevd$'        # process names ending in "udevd"
            See the man page and example file for more info.
    END
            exit
    }
    
    function warn {
            if ! eval "$@"; then
                    echo >&2 "WARNING: command failed "$@""
            fi
    }
    
    function end {
            # disable tracing
            echo 2>/dev/null
            echo "Ending tracing..." 2>/dev/null
            cd $tracing
            warn "echo 0 > events/kprobes/$kname/enable"
            warn "echo 0 > events/sched/sched_process_fork/enable"
            warn "echo -:$kname >> kprobe_events"
            warn "echo > trace"
            (( wroteflock )) && warn "rm $flock"
    }
    
    function die {
            echo >&2 "$@"
            exit 1
    }
    
    function edie {
            # die with a quiet end()
            echo >&2 "$@"
            exec >/dev/null 2>&1
            end
            exit 1
    }
    
    ### process options
    while getopts a:d:hrt opt
    do
            case $opt in
            a)      opt_argc=1; argc=$OPTARG ;;
            d)      opt_duration=1; duration=$OPTARG ;;
            r)      opt_reexec=1 ;;
            t)      opt_time=1 ;;
            h|?)    usage ;;
            esac
    done
    shift $(( $OPTIND - 1 ))
    if (( $# )); then
            opt_name=1
            name=$1
            shift
    fi
    (( $# )) && usage
    
    ### option logic
    (( opt_pid && opt_name )) && die "ERROR: use either -p or -n."
    (( opt_pid )) && ftext=" issued by PID $pid"
    (( opt_name )) && ftext=" issued by process name "$name""
    (( opt_file )) && ftext="$ftext for filenames containing "$file""
    (( opt_argc && argc > max_argc )) && die "ERROR: max -a argc is $max_argc."
    if (( opt_duration )); then
            echo "Tracing exec()s$ftext for $duration seconds (buffered)..."
    else
            echo "Tracing exec()s$ftext. Ctrl-C to end."
    fi
    
    ### select awk
    if (( opt_duration )); then
            [[ -x /usr/bin/mawk ]] && awk=mawk || awk=awk
    else
            # workarounds for mawk/gawk fflush behavior
            if [[ -x /usr/bin/gawk ]]; then
                    awk=gawk
            elif [[ -x /usr/bin/mawk ]]; then
                    awk="mawk -W interactive"
            else
                    awk=awk
            fi
    fi
    
    ### check permissions
    cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE?
        debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)"
    
    ### ftrace lock
    [[ -e $flock ]] && die "ERROR: ftrace may be in use by PID $(cat $flock) $flock"
    echo $$ > $flock || die "ERROR: unable to write $flock."
    wroteflock=1
    
    ### build probe
    if [[ -x /usr/bin/getconf ]]; then
            bits=$(getconf LONG_BIT)
    else
            bits=64
            [[ $(uname -m) == i* ]] && bits=32
    fi
    (( offset = bits / 8 ))
    function makeprobe {
            func=$1
            kname=execsnoop_$func
            kprobe="p:$kname $func"
            i=0
            while (( i < argc + 1 )); do
                    # p:kname do_execve +0(+0(%si)):string +0(+8(%si)):string ...
                    kprobe="$kprobe +0(+$(( i * offset ))(%si)):string"
                    (( i++ ))
            done
    }
    # try in this order: sys_execve, stub_execve, do_execve
    makeprobe sys_execve
    
    ### setup and begin tracing
    echo nop > current_tracer
    if ! echo $kprobe >> kprobe_events 2>/dev/null; then
            makeprobe stub_execve
            if ! echo $kprobe >> kprobe_events 2>/dev/null; then
                makeprobe do_execve
                if ! echo $kprobe >> kprobe_events 2>/dev/null; then
                        edie "ERROR: adding a kprobe for execve. Exiting."
            fi
            fi
    fi
    if ! echo 1 > events/kprobes/$kname/enable; then
            edie "ERROR: enabling kprobe for execve. Exiting."
    fi
    if ! echo 1 > events/sched/sched_process_fork/enable; then
            edie "ERROR: enabling sched:sched_process_fork tracepoint. Exiting."
    fi
    echo "Instrumenting $func"
    (( opt_time )) && printf "%-16s " "TIMEs"
    printf "%6s %6s %s
    " "PID" "PPID" "ARGS"
    
    #
    # Determine output format. It may be one of the following (newest first):
    #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
    #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
    # To differentiate between them, the number of header fields is counted,
    # and an offset set, to skip the extra column when needed.
    #
    offset=$($awk 'BEGIN { o = 0; }
            $1 == "#" && $2 ~ /TASK/ && NF == 6 { o = 1; }
            $2 ~ /TASK/ { print o; exit }' trace)
    
    ### print trace buffer
    warn "echo > trace"
    ( if (( opt_duration )); then
            # wait then dump buffer
            sleep $duration
            cat -v trace
    else
            # print buffer live
            cat -v trace_pipe
    fi ) | $awk -v o=$offset -v opt_name=$opt_name -v name=$name 
        -v opt_duration=$opt_duration -v opt_time=$opt_time -v kname=$kname 
        -v opt_reexec=$opt_reexec '
            # common fields
            $1 != "#" {
                    # task name can contain dashes
                    comm = pid = $1
                    sub(/-[0-9][0-9]*/, "", comm)
                    sub(/.*-/, "", pid)
            }
            $1 != "#" && $(4+o) ~ /sched_process_fork/ {
                    cpid=$0
                    sub(/.* child_pid=/, "", cpid)
                    sub(/ .*/, "", cpid)
                    getppid[cpid] = pid
                    delete seen[pid]
            }
            $1 != "#" && $(4+o) ~ kname {
                    if (seen[pid])
                            next
                    if (opt_name && comm !~ name)
                            next
                    #
                    # examples:
                    # ... arg1="/bin/echo" arg2="1" arg3="2" arg4="3" ...
                    # ... arg1="sleep" arg2="2" arg3=(fault) arg4="" ...
                    # ... arg1="" arg2=(fault) arg3="" arg4="" ...
                    # the last example is uncommon, and may be a race.
                    #
                    if ($0 ~ /arg1=""/) {
                            args = comm " [?]"
                    } else {
                            args=$0
                            sub(/ arg[0-9]*=(fault).*/, "", args)
                            sub(/.*arg1="/, "", args)
                            gsub(/" arg[0-9]*="/, " ", args)
                            sub(/"$/, "", args)
                            if ($0 !~ /(fault)/)
                                    args = args " [...]"
                    }
                    if (opt_time) {
                            time = $(3+o); sub(":", "", time)
                            printf "%-16s ", time
                    }
                    printf "%6s %6d %s
    ", pid, getppid[pid], args
                    if (!opt_duration
                            fflush()
                    if (!opt_reexec) {
                            seen[pid] = 1
                            delete getppid[pid]
                    }
            }
            $0 ~ /LOST.*EVENT[S]/ { print "WARNING: " $0 > "/dev/stderr" }
    '
    
    ### end tracing
    end
    [root@localhost ~]# ./execsnoop
    Tracing exec()s. Ctrl-C to end.
    Instrumenting sys_execve
       PID   PPID ARGS
     27570  27566 gawk -v o=1 -v opt_name=0 -v name= -v opt_duration=0 [...]
     27571  27569 cat -v trace_pipe
  • 相关阅读:
    获取aspx页面执行时间完全解决方案
    WebForm中DataGrid的20篇经典文章
    不走寻常路 设计ASP.NET应用程序的七大绝招
    动态绑定dropdownlist 开始拣.NET
    Notes中几个处理多值域的通用函数
    Lotus开发之Lotus Notes中域的验证
    Undokumentierte @Formeln/LotusScript im Lotus Notes Client/Server
    domino server端的Notes.ini详解
    Lotus开发基本性能优化
    以Ajax方式显示Lotus Notes视图的javasript类库NotesView2
  • 原文地址:https://www.cnblogs.com/liujunjun/p/12248023.html
Copyright © 2011-2022 走看看