  • Finding the source of signals on Linux with strace, auditd, or systemtap

    Linux and UNIX®-like operating systems commonly use signals to communicate between processes. The command-line kill utility is the best-known way to send them. By default, WebSphere Application Server on Linux and UNIX responds to kill -3 by producing a javacore, and to kill -11 by creating a system core and exiting. Many other signals can be sent and acted on as well.

    In some cases, a signal reaches a WebSphere Application Server unexpectedly and we need to determine which process and user sent it. For kill -3 this is usually possible with the strace command, but kill -9 and kill -11 are not usually reported.

    The strace utility is available on nearly every Linux system. Attaching it to the process with the following command will generally reveal the source of kill -3 and similar signals:

    strace -tt -o /tmp/traceit -p <pid> &

     

    This produces a large volume of output, which includes the source of most signals:

    strace -tt -o /tmp/traceit -p <pid> &
    16:08:45.388961 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    16:08:45.389113 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21398, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---

     

    16:09:01.210200 --- SIGTTOU {si_signo=SIGTTOU, si_code=SI_USER, si_pid=829, si_uid=1000} ---
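    Because the trace file grows quickly, it can help to filter it down to the signal-delivery lines. The following is a minimal sketch, assuming the trace was written with -tt and without -f (so each line starts with a timestamp and has no pid prefix). The function names parse_signal_line and signals_in_trace are hypothetical helpers, not part of strace.

```python
import re

# Matches strace signal-delivery lines such as:
# 16:09:01.210200 --- SIGTTOU {si_signo=SIGTTOU, si_code=SI_USER, si_pid=829, si_uid=1000} ---
SIGNAL_LINE = re.compile(
    r"^(?P<time>\S+)\s+--- (?P<name>SIG\w+) \{(?P<fields>.*)\} ---\s*$"
)


def parse_signal_line(line):
    """Return a dict describing one strace signal line, or None for any other line."""
    match = SIGNAL_LINE.match(line.strip())
    if match is None:
        return None
    # Split "si_signo=SIGTTOU, si_code=SI_USER, ..." into a key/value mapping.
    fields = {}
    for item in match.group("fields").split(", "):
        key, _, value = item.partition("=")
        fields[key] = value
    return {
        "time": match.group("time"),
        "signal": match.group("name"),
        "code": fields.get("si_code"),
        "sender_pid": fields.get("si_pid"),  # absent for some kernel-generated signals
        "sender_uid": fields.get("si_uid"),
    }


def signals_in_trace(path):
    """Yield one parsed record per signal line found in an strace output file."""
    with open(path, encoding="utf-8", errors="replace") as trace:
        for line in trace:
            record = parse_signal_line(line)
            if record is not None:
                yield record
```

    Iterating over signals_in_trace("/tmp/traceit") then yields only the signal records, each with the sender's process ID and user ID where strace reported them.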

    If you do not recognize SIGTTOU, use kill -l to list the signals available in your environment:

     kill -l

     1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP
     6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1
    11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
    16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
    21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
    26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR
    31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3
    38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
    43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
    48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
    53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7
    58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
    63) SIGRTMAX-1 64) SIGRTMAX

     

    This may be a surprise if you have never used anything but 3, 9, and 11. kill -22 is SIGTTOU, and the strace output lists the process ID and user ID of the sender. Unfortunately, most of the time strace does not show the sender of kill -9 and kill -11, because they are not trapped, and all you get is this line:

    +++ killed by SIGKILL +++
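    The numbers passed to kill can also be translated to names programmatically. This is a small sketch that assumes a Linux host, where Python's standard signal module mirrors the platform's numbering; signal_name is a hypothetical helper introduced only for illustration.

```python
import signal


def signal_name(number):
    """Return the symbolic name for a signal number, or None if the platform has none."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # Raised for numbers that are not named signals on this platform, such as 0.
        return None


# The signal numbers discussed in the text.
for number in (3, 9, 11, 22):
    print(number, signal_name(number))
```

    On Linux the names printed should agree with the kill -l listing for the same numbers; other platforms may number some signals differently.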

     

    Two other tools are not usually installed or active on Linux by default, but they offer so much functionality that they arguably should be. Both are included in the repositories for the RHEL, SUSE, and Fedora distributions and are installed like any other software package with the usual Linux install tools. Because they operate at the system level, root or elevated access rights are needed. However, the install process is quite simple and the functionality is worthwhile.

     

    AUDIT

    Auditd is a daemon (service) that, as the name implies, produces audit logs of system-level activity. It is installed from the usual repository as the audit package. The daemon is configured in /etc/audit/auditd.conf, and the rules are in /etc/audit/audit.rules.

    Example entry for kill signal logging (newer audit releases deprecate the entry list and expect the equivalent exit list instead):

    -a entry,always -F arch=b64 -S kill -k kill_signals

    then the command: service auditd start

    will log every kill system call to /var/log/audit/audit.log with a key of kill_signals. You can search the log with your favorite editor, or use ausearch -k kill_signals

    Of course, this example captures all signals and is quite verbose. The usual output will look like this:

    time->Wed Jun  3 16:34:08 2015
    type=SYSCALL msg=audit(1433363648.091:6342): arch=c000003e syscall=62 success=no exit=-3 a0=1e06 a1=0 a2=1e06 a3=fffffffffffffff0 items=0 ppid=10044 pid=10140 auid=500 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm=4174746163682041504920696E6974 exe="/opt/ibm/WebSphere/AppServer/java/jre/bin/java" subj=unconfined_u:unconfined_r:unconfined_java_t:s0-s0:c0.c1023 key="kill_signals"
    ----
    time->Wed Jun  3 16:34:08 2015
    type=OBJ_PID msg=audit(1433363648.130:6343): opid=27307 oauid=-1 ouid=0 oses=-1 obj=system_u:system_r:initrc_t:s0 ocomm="symcfgd"
    type=SYSCALL msg=audit(1433363648.130:6343): arch=c000003e syscall=62 success=yes exit=0 a0=6aab a1=12 a2=f a3=50d items=0 ppid=1 pid=27214 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sav-limitcpu" exe="/usr/bin/sav-limitcpu" subj=system_u:system_r:initrc_t:s0 key="kill_signals"
    ----
    time->Wed Jun  3 16:34:08 2015
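    The SYSCALL records are easier to read once the arguments are decoded. In these records a0 and a1 are the hexadecimal arguments of kill(2): the target pid and the signal number. In the second sample record above, a0=6aab is 27307, which matches opid=27307 in the accompanying OBJ_PID record, and a1=12 is 18, which is SIGCONT in the kill -l listing. In the first record a1=0 is signal 0, which only tests whether the target exists. The sketch below assumes x86-64 records (syscall=62 is kill) and positive target pids; the helper names are hypothetical.

```python
import re
import signal

# key=value pairs; a value is either a quoted string or a run of non-space characters.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit_record(line):
    """Split one audit log line into a dict of fields, with surrounding quotes removed."""
    return {key: value.strip('"') for key, value in FIELD.findall(line)}


def decode_kill_record(line):
    """Summarize a type=SYSCALL record for kill(2); return None for any other record."""
    fields = parse_audit_record(line)
    if fields.get("type") != "SYSCALL" or fields.get("syscall") != "62":
        return None
    signum = int(fields["a1"], 16)  # a0..a3 are logged in hexadecimal
    try:
        signame = signal.Signals(signum).name
    except ValueError:
        signame = str(signum)  # e.g. 0, which has no symbolic name
    return {
        # Negative targets (process groups) show up as large unsigned values
        # and are not handled by this sketch.
        "target_pid": int(fields["a0"], 16),
        "signal": signame,
        "sender_pid": int(fields["pid"]),
        "sender_uid": int(fields["uid"]),
        "sender_exe": fields.get("exe"),
        "success": fields.get("success") == "yes",
    }
```

    Feeding each line of the ausearch -k kill_signals output through decode_kill_record gives one summary per kill call, so the sender (pid, uid, exe) can be read next to the decoded target and signal.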

     

    Stop the logging with the service auditd stop command. For more information, see this article from Red Hat: How to use audit to monitor a specific SYSCALL

     

    System Tap

    This tool is more complex and more flexible than the audit tool. It provides probes and taps that are written in a scripting language that is remarkably C-like. In that regard it is similar to DTrace on Solaris, and like DTrace it offers many probes for looking at performance and memory as well as network activity. It too is easily installed (for example, on RHEL, yum install systemtap does it). The good news is that it comes with a set of taps that perform a comprehensive set of tracing; these live in /usr/share/systemtap. Root access is required, or you may be a member of a group with the necessary privileges.

    The basic command:

      stap sigkill.stp

    gets very verbose, even on lab systems, but the same script can be filtered. An example to trace kill commands for a specific pid and a specific signal:

    stap sigkill.stp -x <pid> SIGKILL

    which logs:

    SIGKILL was sent to java (pid:<pid>) by bash uid:0

    when tested with a kill command sent from the command line.

     

    You do need the script sigkill.stp, which was written by Red Hat and looks like this:

    #! /usr/bin/env stap
    # sigkill.stp
    # Copyright (C) 2007 Red Hat, Inc., Eugene Teo <eteo@redhat.com>
    #
    # This program is free software; you can redistribute it and/or modify
    # it under the terms of the GNU General Public License version 2 as
    # published by the Free Software Foundation.
    #
    # /usr/share/systemtap/tapset/signal.stp:
    # [...]
    # probe signal.send = _signal.send.*
    # {
    #     sig=$sig
    #     sig_name = _signal_name($sig)
    #     sig_pid = task_pid(task)
    #     pid_name = task_execname(task)
    # [...]
    probe signal.send {
      if (sig_name == "SIGKILL")
        printf("%s was sent to %s (pid:%d) by %s uid:%d\n",
               sig_name, pid_name, sig_pid, execname(), uid())
    }

     

    Here is a very useful link for SystemTap, which rounds out the set of tools shown here for tracking down most signals (strace) or all of them (audit and SystemTap):
    Red Hat Enterprise Linux 6 SystemTap Beginners Guide: Introduction to SystemTap

     

    https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/Finding_the_source_of_signals_on_Linux_with_strace_auditd_or_Systemtap?lang=en

  • Original article: https://www.cnblogs.com/DataArt/p/10176473.html