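I. Relevant data structures and where they live (skim for now; come back to them when the flow is described)
1.1 The process descriptor struct task_struct is defined in include/linux/sched.h
The following task_struct fields are of interest:
struct sigpending pending; holds the signals that are pending privately for this task. struct sigpending is defined in include/linux/signal.h and has the following fields:
struct sigpending {
    struct list_head list;   /* list head of the queued sigqueue entries */
    sigset_t signal;         /* bitmap of pending signals */
};
The list field links entries of another structure, sigqueue:
struct sigqueue {
    struct list_head list;
    int flags;
    siginfo_t info;
    struct user_struct *user;
};
The siginfo structure embedded in sigqueue is defined in include/asm-generic/siginfo.h. It carries the information describing one particular signal occurrence and is not covered in detail here.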
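struct signal_struct *signal; a pointer to the process's signal descriptor, defined in sched.h alongside task_struct. It holds the signal state shared by all threads of a thread group. It contains a further struct sigpending (shared_pending), which in turn heads a list of sigqueue entries. The important fields are:
struct signal_struct {
    atomic_t count;                     /* usage counter of the signal descriptor */
    atomic_t live;                      /* number of live processes in the thread group */
    wait_queue_head_t wait_chldexit;    /* wait queue of processes sleeping in wait4 */
    struct task_struct *curr_target;    /* descriptor of the last thread in the group that received a signal */
    struct sigpending shared_pending;   /* shared pending signals */
    int group_exit_code;                /* exit code of the thread group */
    struct task_struct *group_exit_task; /* used when killing the whole thread group */
    int notify_count;                   /* used when killing the whole thread group */
    int group_stop_count;               /* used when stopping the whole thread group */
    unsigned int flags;                 /* flags used when delivering signals that change the process state */
};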
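struct sighand_struct *sighand; a pointer to the signal handler descriptor, which records how the thread group handles each signal. Like task_struct, it is defined in sched.h:
struct sighand_struct {
    atomic_t count;                     /* usage counter */
    struct k_sigaction action[_NSIG];   /* per-signal action, see below */
    spinlock_t siglock;                 /* lock */
};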
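Most of the data lives in the k_sigaction structure, which is defined in include/asm-i386/signal.h (i386 is one specific architecture; any other can be chosen). That header also defines the signal bitmap, the bitmap operations, and the signal numbers (signals are architecture dependent). The important structures are:
struct k_sigaction {
    struct sigaction sa;
};

struct sigaction {
    __sighandler_t sa_handler;   /* entry address of the signal handler */
    unsigned long sa_flags;      /* flags; the values are likewise defined in the architecture's signal.h */
    sigset_t sa_mask;            /* signals to block while the handler runs */
};
The possible values of sa_handler are defined in include/asm-generic/signal.h:
#define SIG_DFL ((__force __sighandler_t)0)   /* default signal handling */
#define SIG_IGN ((__force __sighandler_t)1)   /* ignore signal */
#define SIG_ERR ((__force __sighandler_t)-1)  /* error return from signal */
The possible values of sa_flags are defined in include/asm-i386/signal.h.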
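sigset_t blocked, real_blocked; masks of blocked signals. sigset_t is defined in include/asm-<arch>/signal.h, as is struct sigaction.
Several bitmap helpers are worth knowing:
sigaddset(set, sig) sets bit sig in the bitmap set
sigdelset(set, sig) clears bit sig in the bitmap set
sigismember(set, sig) tests whether bit sig is set in the bitmap set
sigisemptyset(sigset_t *set) tests whether the bitmap is empty (sigemptyset(sigset_t *set) is the helper that clears it)
sigfillset(sigset_t *set) sets every bit of the bitmap
siginitset(sigset_t *set, unsigned long mask) initializes the low 32 bits of set from mask and the high 32 bits to 0.
These are the main ones; the remaining bitmap operations can be found in the header itself.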
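The helpers above can be modelled in user space to make the bit layout concrete. This is only a sketch, assuming the i386 layout of two 32-bit words and the convention that signal n occupies bit n-1; the model_ names are invented here and are not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NSIG_WORDS 2   /* assumption: 64 signals, 32 bits per word (i386) */

typedef struct {
    uint32_t sig[MODEL_NSIG_WORDS];
} model_sigset_t;

/* Signal numbers start at 1, so signal n lives in bit n-1. */
void model_sigaddset(model_sigset_t *set, int sig)
{
    unsigned int bit = (unsigned int)sig - 1;
    set->sig[bit / 32] |= (uint32_t)1 << (bit % 32);
}

void model_sigdelset(model_sigset_t *set, int sig)
{
    unsigned int bit = (unsigned int)sig - 1;
    set->sig[bit / 32] &= ~((uint32_t)1 << (bit % 32));
}

int model_sigismember(const model_sigset_t *set, int sig)
{
    unsigned int bit = (unsigned int)sig - 1;
    return (int)((set->sig[bit / 32] >> (bit % 32)) & 1u);
}

/* Tests for emptiness; it does not modify the set. */
int model_sigisemptyset(const model_sigset_t *set)
{
    return (set->sig[0] | set->sig[1]) == 0;
}

void model_sigemptyset(model_sigset_t *set)
{
    memset(set, 0, sizeof(*set));
}

void model_sigfillset(model_sigset_t *set)
{
    memset(set, 0xff, sizeof(*set));
}

/* Low word comes from mask, the high word is cleared. */
void model_siginitset(model_sigset_t *set, uint32_t mask)
{
    set->sig[0] = mask;
    set->sig[1] = 0;
}
```

Under this layout a real-time signal such as 33 lands in the second word, which is why the old-style 32-bit mask used by sys_sigaction can only express the first 32 signals.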
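sigset_t saved_sigmask;
Backup copy of the blocked-signal bitmap
unsigned long sas_ss_sp;
Address of the alternate stack for signal handlers; used where the signal stack frame is set up
size_t sas_ss_size;
Size of the alternate signal handler stack
int (*notifier)(void *priv);
Pointer to a function that device drivers use to block some of the process's signals
void *notifier_data;
Pointer to data that the notifier function may use
sigset_t *notifier_mask;
Bitmask of the signals a device driver blocks through the notifier function
Centered on the process descriptor, these structures are organized as follows (the diagram from the original is not available here): task_struct.pending heads the private sigqueue list; task_struct.signal points to the signal_struct, whose shared_pending heads the shared sigqueue list; task_struct.sighand points to the sighand_struct with its action[] array; task_struct.blocked holds the blocked mask.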
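II. Analysis of the main flows
2.1 Changing how a signal is handled
From the data structures in the previous part we know that signal actions are recorded in sighand_struct, where each k_sigaction entry describes how the process handles one signal. sa_handler is the entry address of the signal handler, sa_flags is a set of flags specifying how the signal must be handled (the values are in include/asm-<arch>/signal.h), and sa_mask is the set of signals to block while the handler runs.
2.1.1 The sigaction system call takes three parameters: (1) the signal number, (2) the address of the new sigaction passed in by the user, (3) the address at which to store the old sigaction currently held in the process descriptor.
Its service routine, sys_sigaction, is in arch/<arch>/kernel/signal.c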
asmlinkage int
sys_sigaction(int sig, const struct old_sigaction __user *act,
              struct old_sigaction __user *oact)
{
    struct k_sigaction new_ka, old_ka;
    int ret;

    if (act) {
        old_sigset_t mask;
        if (!access_ok(VERIFY_READ, act, sizeof(*act)) ||
            __get_user(new_ka.sa.sa_handler, &act->sa_handler) ||
            __get_user(new_ka.sa.sa_restorer, &act->sa_restorer))
            return -EFAULT;
        __get_user(new_ka.sa.sa_flags, &act->sa_flags);
        __get_user(mask, &act->sa_mask);
        siginitset(&new_ka.sa.sa_mask, mask);
    }

    ret = do_sigaction(sig, act ? &new_ka : NULL, oact ? &old_ka : NULL);

    if (!ret && oact) {
        if (!access_ok(VERIFY_WRITE, oact, sizeof(*oact)) ||
            __put_user(old_ka.sa.sa_handler, &oact->sa_handler) ||
            __put_user(old_ka.sa.sa_restorer, &oact->sa_restorer))
            return -EFAULT;
        __put_user(old_ka.sa.sa_flags, &oact->sa_flags);
        __put_user(old_ka.sa.sa_mask.sig[0], &oact->sa_mask);
    }

    return ret;
}
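access_ok verifies that a user address range may be read or written; __get_user reads a value from user space into the kernel and __put_user writes a value from the kernel to user space.
The function does roughly three things:
a. It copies the new sigaction passed in by the user from user space into the local variable new_ka.
b. It calls do_sigaction. The arguments show that when the user passes a NULL new sigaction, the call can still be used just to fetch the old sigaction.
c. It copies the old sigaction back to user space.
Let us look at the source of do_sigaction in kernel/signal.c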
int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
{
    struct k_sigaction *k;
    sigset_t mask;

    if (!valid_signal(sig) || sig < 1 || (act && sig_kernel_only(sig)))
        return -EINVAL;

    k = &current->sighand->action[sig-1];

    spin_lock_irq(&current->sighand->siglock);
    if (signal_pending(current)) {
        /*
         * If there might be a fatal signal pending on multiple
         * threads, make sure we take it before changing the action.
         */
        spin_unlock_irq(&current->sighand->siglock);
        return -ERESTARTNOINTR;
    }

    if (oact)
        *oact = *k;

    if (act) {
        sigdelsetmask(&act->sa.sa_mask,
                      sigmask(SIGKILL) | sigmask(SIGSTOP));
        *k = *act;
        /*
         * POSIX 3.3.1.3:
         *  "Setting a signal action to SIG_IGN for a signal that is
         *   pending shall cause the pending signal to be discarded,
         *   whether or not it is blocked."
         *
         *  "Setting a signal action to SIG_DFL for a signal that is
         *   pending and whose default action is to ignore the signal
         *   (for example, SIGCHLD), shall cause the pending signal to
         *   be discarded, whether or not it is blocked"
         */
        if (act->sa.sa_handler == SIG_IGN ||
            (act->sa.sa_handler == SIG_DFL && sig_kernel_ignore(sig))) {
            struct task_struct *t = current;
            sigemptyset(&mask);
            sigaddset(&mask, sig);
            rm_from_queue_full(&mask, &t->signal->shared_pending);
            do {
                rm_from_queue_full(&mask, &t->pending);
                recalc_sigpending_tsk(t);
                t = next_thread(t);
            } while (t != current);
        }
    }

    spin_unlock_irq(&current->sighand->siglock);
    return 0;
}
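a. First the signal number is validated. valid_signal (in include/linux/signal.h) checks that it does not exceed 64, and sig < 1 rejects zero and negative numbers. The following (act && sig_kernel_only(sig)) test guarantees that no handler can be registered for SIGKILL or SIGSTOP.
b. signal_pending, defined in include/linux/sched.h, checks whether the process still has unhandled signals. Their actions must not be changed before they are handled, so the function returns -ERESTARTNOINTR right away; the pending signal is taken first and the system call is then restarted.
c. Next, the old k_sigaction is copied into *oact.
d. If act is not NULL, it replaces the sigaction for this signal in sighand_struct. SIGKILL and SIGSTOP are first removed from sa_mask so that they can never be blocked, then act overwrites sighand->action[sig-1]. If the new action amounts to ignoring the signal, the signal is also removed from the shared queue and from every thread's private queue. recalc_sigpending_tsk then re-evaluates the pending flag: the queues have changed, so it is no longer certain that a signal still needs handling. The helpers it uses are not expanded here; in brief, freezing tests whether the task is being frozen, the PENDING macro compares the shared and private pending bitmaps against the blocked bitmap (in the process descriptor) to see whether any unblocked signal is pending, set_tsk_thread_flag sets the flag and clear_tsk_thread_flag clears it.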
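Two of the rules enforced by do_sigaction can be observed from user space. The following is a minimal sketch for a POSIX system, not kernel code; the function names are invented for illustration. The first function tries to install a handler for SIGKILL, the second makes SIGUSR1 pending while blocked and then switches its action to SIG_IGN.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>

static void demo_handler(int sig)
{
    (void)sig;
}

/* Returns the errno produced by trying to catch SIGKILL (0 if it succeeded). */
int errno_for_sigkill_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_handler;
    sigemptyset(&sa.sa_mask);
    errno = 0;
    if (sigaction(SIGKILL, &sa, NULL) == 0)
        return 0;
    return errno;
}

/* Returns 1 if SIGUSR1 is still pending after its action became SIG_IGN. */
int pending_after_ignore(void)
{
    sigset_t block, old, pend;
    struct sigaction ign, saved;
    int still_pending;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old);
    raise(SIGUSR1);                      /* blocked, so it stays pending */

    memset(&ign, 0, sizeof(ign));
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    sigaction(SIGUSR1, &ign, &saved);    /* should discard the pending signal */

    sigpending(&pend);
    still_pending = sigismember(&pend, SIGUSR1);

    sigaction(SIGUSR1, &saved, NULL);    /* restore the previous state */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return still_pending;
}
```

The first call is expected to fail with EINVAL, matching the sig_kernel_only test, and the second is expected to report that nothing is pending, matching the rm_from_queue_full calls.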
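2.1.2 The signal system call takes the signal number as its first parameter and the entry address of the signal handler as its second
Its service routine, sys_signal, is in kernel/signal.c. The signal system call is really a reduced version of sigaction
asmlinkage unsigned long
sys_signal(int sig, __sighandler_t handler)
{
    struct k_sigaction new_sa, old_sa;
    int ret;

    new_sa.sa.sa_handler = handler;
    new_sa.sa.sa_flags = SA_ONESHOT | SA_NOMASK;
    sigemptyset(&new_sa.sa.sa_mask);

    ret = do_sigaction(sig, &new_sa, &old_sa);

    return ret ? ret : (unsigned long)old_sa.sa.sa_handler;
}
As can be seen, it simply wraps the handler address from the second parameter in a sigaction structure and sets sa_flags to SA_ONESHOT | SA_NOMASK. SA_ONESHOT means the action is reset to the default once the handler has been invoked, and SA_NOMASK means the signal itself is not blocked while its handler runs. It then calls do_sigaction. The meaning and values of the data involved are covered in Part I, and do_sigaction was described above.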
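The one-shot behaviour can be reproduced from user space with sigaction. This sketch assumes a Linux/POSIX system, where SA_RESETHAND and SA_NODEFER are the standard names for SA_ONESHOT and SA_NOMASK; the function names are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t oneshot_calls;

static void oneshot_handler(int sig)
{
    (void)sig;
    oneshot_calls++;
}

/*
 * Installs a handler the way sys_signal does, delivers SIGUSR1 once and
 * returns 1 if the handler ran exactly once and the action fell back to
 * SIG_DFL afterwards. Returns -1 if a call failed.
 */
int oneshot_resets_to_default(void)
{
    struct sigaction sa, cur;
    sigset_t unblock;

    /* Make sure SIGUSR1 is deliverable in this thread. */
    sigemptyset(&unblock);
    sigaddset(&unblock, SIGUSR1);
    sigprocmask(SIG_UNBLOCK, &unblock, NULL);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = oneshot_handler;
    sa.sa_flags = SA_RESETHAND | SA_NODEFER;
    sigemptyset(&sa.sa_mask);

    oneshot_calls = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    raise(SIGUSR1);   /* returns only after the handler has run */

    if (sigaction(SIGUSR1, NULL, &cur) != 0)
        return -1;
    return oneshot_calls == 1 && cur.sa_handler == SIG_DFL;
}
```

A second raise(SIGUSR1) after this point would take the default action and terminate the process, which is why programs that use signal() with these semantics have to reinstall the handler inside the handler.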
2.1.3 The rt_sigaction system call: the first three parameters are the same as for sigaction, and the fourth is the size of the signal bitmap
asmlinkage long
sys_rt_sigaction(int sig,
                 const struct sigaction __user *act,
                 struct sigaction __user *oact,
                 size_t sigsetsize)
{
    struct k_sigaction new_sa, old_sa;
    int ret = -EINVAL;

    /* XXX: Don't preclude handling different sized sigset_t's. */
    if (sigsetsize != sizeof(sigset_t))
        goto out;

    if (act) {
        if (copy_from_user(&new_sa.sa, act, sizeof(new_sa.sa)))
            return -EFAULT;
    }

    ret = do_sigaction(sig, act ? &new_sa : NULL, oact ? &old_sa : NULL);

    if (!ret && oact) {
        if (copy_to_user(oact, &old_sa.sa, sizeof(old_sa.sa)))
            return -EFAULT;
    }
out:
    return ret;
}
This routine checks that the fourth parameter equals sizeof(sigset_t) and copies the whole struct sigaction, including the full sigset_t mask, to and from user space. Otherwise it works just like the sigaction system call.
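2.2 Sending signals (unless stated otherwise, the functions below are in kernel/signal.c; i386 is used as the example architecture)
2.2.1 The kill system call sends a signal to a thread group
kill is the system call a user process uses to send a signal; its service routine is sys_kill.
The siginfo structure is in include/asm-i386/siginfo.h. That file contains nothing but an include, so the real definition is in include/asm-generic/siginfo.h
sys_kill calls kill_something_info.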
static int kill_something_info(int sig, struct siginfo *info, int pid)
{
    if (!pid) {
        return kill_pg_info(sig, info, process_group(current));
    } else if (pid == -1) {
        int retval = 0, count = 0;
        struct task_struct * p;

        read_lock(&tasklist_lock);
        for_each_process(p) {
            if (p->pid > 1 && p->tgid != current->tgid) {
                int err = group_send_sig_info(sig, info, p);
                ++count;
                if (err != -EPERM)
                    retval = err;
            }
        }
        read_unlock(&tasklist_lock);
        return count ? retval : -ESRCH;
    } else if (pid < 0) {
        return kill_pg_info(sig, info, -pid);
    } else {
        return kill_proc_info(sig, info, pid);
    }
}
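This function shows the rules for the pid argument of the kill system call
Value of pid | Meaning |
pid > 0 | send the signal to the process (thread group) whose ID is pid |
pid == 0 | send the signal to every process in the caller's own process group |
pid < -1 | send the signal to every process in the process group whose ID is -pid |
pid == -1 | send the signal to every process in the system except pid 1 and the caller's own thread group |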
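The dispatch performed by kill_something_info can be condensed into a small pure function. This is only a model for illustration: the enum, the struct and route_kill_pid are hypothetical names, and caller_pgrp stands in for process_group(current).

```c
/* Which kernel path a given pid argument selects (illustrative model). */
enum kill_target {
    KILL_ONE_PROCESS,     /* kill_proc_info(sig, info, pid)           */
    KILL_OWN_GROUP,       /* kill_pg_info(sig, info, caller's pgrp)   */
    KILL_EVERYONE,        /* loop over all processes                  */
    KILL_PROCESS_GROUP    /* kill_pg_info(sig, info, -pid)            */
};

struct kill_route {
    enum kill_target target;
    int id;               /* pid or process-group id; 0 when unused */
};

struct kill_route route_kill_pid(int pid, int caller_pgrp)
{
    struct kill_route r;

    if (pid == 0) {
        r.target = KILL_OWN_GROUP;
        r.id = caller_pgrp;
    } else if (pid == -1) {           /* must be tested before pid < 0 */
        r.target = KILL_EVERYONE;
        r.id = 0;
    } else if (pid < 0) {
        r.target = KILL_PROCESS_GROUP;
        r.id = -pid;
    } else {
        r.target = KILL_ONE_PROCESS;
        r.id = pid;
    }
    return r;
}
```

The order of the tests matters: -1 is also negative, so the broadcast case has to be recognised before the process-group case.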
Next we look at the functions called from kill_something_info
kill_pg_info, used above in kill_something_info, sends a signal to a process group; its third parameter is the group ID.
int
kill_pg_info(int sig, struct siginfo *info, pid_t pgrp)
{
    int retval;

    read_lock(&tasklist_lock);
    retval = __kill_pg_info(sig, info, pgrp);
    read_unlock(&tasklist_lock);

    return retval;
}
After taking the lock it simply calls __kill_pg_info, shown here
int __kill_pg_info(int sig, struct siginfo *info, pid_t pgrp)
{
    struct task_struct *p = NULL;
    int retval, success;

    if (pgrp <= 0)
        return -EINVAL;

    success = 0;
    retval = -ESRCH;
    do_each_task_pid(pgrp, PIDTYPE_PGID, p) {
        int err = group_send_sig_info(sig, info, p);
        success |= !err;
        retval = err;
    } while_each_task_pid(pgrp, PIDTYPE_PGID, p);
    return success ? 0 : retval;
}
The do_each_task_pid and while_each_task_pid macros are in include/linux/pid.h, which also contains the pid_type enumeration and, together with kernel/pid.c, the rest of the pid machinery. The details are not covered here.
Next comes group_send_sig_info
int group_send_sig_info(int sig, struct siginfo *info, struct task_struct *p)
{
    unsigned long flags;
    int ret;

    ret = check_kill_permission(sig, info, p);

    if (!ret && sig) {
        ret = -ESRCH;
        if (lock_task_sighand(p, &flags)) {
            ret = __group_send_sig_info(sig, info, p);
            unlock_task_sighand(p, &flags);
        }
    }

    return ret;
}
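This function first performs a permission check and then calls __group_send_sig_info. The permission check mainly verifies that:
the signal number is valid, that is, not greater than 64 (0 is accepted; as the code above shows, signal 0 only runs the checks and sends nothing)
and that at least one of the following holds:
- the sender has the required capability
- the signal is SIGCONT and the target process is in the same session as the sender
- both processes belong to the same user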
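The "if (!ret && sig)" test in group_send_sig_info is what makes signal 0 a pure existence and permission probe. A small user-space sketch for a POSIX system (the function names are invented here) shows both the probe and the rejection of an invalid signal number:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Signal 0: the checks run, nothing is delivered. Returns kill()'s result. */
int probe_self(void)
{
    return kill(getpid(), 0);
}

/* A signal number far outside the valid range; returns the resulting errno. */
int errno_for_bad_signal(void)
{
    errno = 0;
    if (kill(getpid(), 4096) == 0)
        return 0;
    return errno;
}
```

A process is always allowed to signal itself, so the probe succeeds, while the out-of-range number is rejected before any permission logic runs.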
Next comes __group_send_sig_info (group_send_sig_info takes the sighand lock described earlier before calling it)
int
__group_send_sig_info(int sig, struct siginfo *info, struct task_struct *p)
{
    int ret = 0;

    assert_spin_locked(&p->sighand->siglock);
    handle_stop_signal(sig, p);

    /* Short-circuit ignored signals. */
    if (sig_ignored(p, sig))
        return ret;

    if (LEGACY_QUEUE(&p->signal->shared_pending, sig))
        /* This is a non-RT signal and we already have one queued. */
        return ret;

    /*
     * Put this signal on the shared-pending queue, or fail with EAGAIN.
     * We always use the shared queue for process-wide signals,
     * to avoid several races.
     */
    ret = send_signal(sig, info, p, &p->signal->shared_pending);
    if (unlikely(ret))
        return ret;

    __group_complete_signal(sig, p);
    return 0;
}
The first function to note in __group_send_sig_info is handle_stop_signal
/*
 * Handle magic process-wide effects of stop/continue signals.
 * Unlike the signal actions, these happen immediately at signal-generation
 * time regardless of blocking, ignoring, or handling. This does the
 * actual continuing for SIGCONT, but not the actual stopping for stop
 * signals. The process stop is done as a signal action for SIG_DFL.
 */
static void handle_stop_signal(int sig, struct task_struct *p)
{
    struct task_struct *t;

    if (p->signal->flags & SIGNAL_GROUP_EXIT)
        /*
         * The process is in the middle of dying already.
         */
        return;

    if (sig_kernel_stop(sig)) {
        /*
         * This is a stop signal. Remove SIGCONT from all queues.
         */
        rm_from_queue(sigmask(SIGCONT), &p->signal->shared_pending);
        t = p;
        do {
            rm_from_queue(sigmask(SIGCONT), &t->pending);
            t = next_thread(t);
        } while (t != p);
    } else if (sig == SIGCONT) {
        /*
         * Remove all stop signals from all queues,
         * and wake all threads.
         */
        if (unlikely(p->signal->group_stop_count > 0)) {
            /*
             * There was a group stop in progress. We'll
             * pretend it finished before we got here. We are
             * obliged to report it to the parent: if the
             * SIGSTOP happened "after" this SIGCONT, then it
             * would have cleared this pending SIGCONT. If it
             * happened "before" this SIGCONT, then the parent
             * got the SIGCHLD about the stop finishing before
             * the continue happened. We do the notification
             * now, and it's as if the stop had finished and
             * the SIGCHLD was pending on entry to this kill.
             */
            p->signal->group_stop_count = 0;
            p->signal->flags = SIGNAL_STOP_CONTINUED;
            spin_unlock(&p->sighand->siglock);
            do_notify_parent_cldstop(p, CLD_STOPPED);
            spin_lock(&p->sighand->siglock);
        }
        rm_from_queue(SIG_KERNEL_STOP_MASK, &p->signal->shared_pending);
        t = p;
        do {
            unsigned int state;
            rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);

            /*
             * If there is a handler for SIGCONT, we must make
             * sure that no thread returns to user mode before
             * we post the signal, in case it was the only
             * thread eligible to run the signal handler--then
             * it must not do anything between resuming and
             * running the handler. With the TIF_SIGPENDING
             * flag set, the thread will pause and acquire the
             * siglock that we hold now and until we've queued
             * the pending signal.
             *
             * Wake up the stopped thread _after_ setting
             * TIF_SIGPENDING
             */
            state = TASK_STOPPED;
            if (sig_user_defined(t, SIGCONT) && !sigismember(&t->blocked, SIGCONT)) {
                set_tsk_thread_flag(t, TIF_SIGPENDING);
                state |= TASK_INTERRUPTIBLE;
            }
            wake_up_state(t, state);

            t = next_thread(t);
        } while (t != p);

        if (p->signal->flags & SIGNAL_STOP_STOPPED) {
            /*
             * We were in fact stopped, and are now continued.
             * Notify the parent with CLD_CONTINUED.
             */
            p->signal->flags = SIGNAL_STOP_CONTINUED;
            p->signal->group_exit_code = 0;
            spin_unlock(&p->sighand->siglock);
            do_notify_parent_cldstop(p, CLD_CONTINUED);
            spin_lock(&p->sighand->siglock);
        } else {
            /*
             * We are not stopped, but there could be a stop
             * signal in the middle of being processed after
             * being removed from the queue. Clear that too.
             */
            p->signal->flags = 0;
        }
    } else if (sig == SIGKILL) {
        /*
         * Make sure that any pending stop signal already dequeued
         * is undone by the wakeup for SIGKILL.
         */
        p->signal->flags = 0;
    }
}
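handle_stop_signal performs the following steps:
a. If the thread group is already being killed, that is, (p->signal->flags & SIGNAL_GROUP_EXIT) is nonzero, it returns.
b. If sig_kernel_stop(sig) is true, it calls rm_from_queue to remove SIGCONT from signal->shared_pending and from the pending queue of every thread in the group (these two structures, the shared and the private pending queue, are described in Part I). sig_kernel_stop is a macro defined at the top of the file; it is true when sig is one of SIGSTOP (stop execution), SIGTSTP (stop requested from the tty), SIGTTIN (background process requests input) or SIGTTOU (background process requests output).
c. If sig == SIGCONT, it does the opposite of b: SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU are removed from signal->shared_pending and from the pending queue of every thread in the group. It also wakes the stopped threads and, if the group was stopped or in the middle of stopping, notifies the parent.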
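The mutual cancellation of stop signals and SIGCONT happens when the signal is generated, even if the signals are blocked, so it can be observed from user space through sigpending. The following is a sketch for Linux; the function name is invented, and the cleanup at the end matters because an unblocked pending SIGTSTP would stop the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns a bitmask of unexpected leftovers:
 *   bit 0: SIGTSTP was still pending after SIGCONT had been sent
 *   bit 1: SIGCONT was still pending after SIGTSTP had been sent
 * A return value of 0 means both cancellations were observed.
 */
int stop_cont_cancel_check(void)
{
    sigset_t block, old, pend;
    struct sigaction ign, saved_tstp;
    int result = 0;

    sigemptyset(&block);
    sigaddset(&block, SIGTSTP);
    sigaddset(&block, SIGCONT);
    sigprocmask(SIG_BLOCK, &block, &old);

    kill(getpid(), SIGTSTP);          /* blocked, so it is queued */
    kill(getpid(), SIGCONT);          /* should remove the pending SIGTSTP */
    sigpending(&pend);
    if (sigismember(&pend, SIGTSTP))
        result |= 1;

    kill(getpid(), SIGTSTP);          /* should remove the pending SIGCONT */
    sigpending(&pend);
    if (sigismember(&pend, SIGCONT))
        result |= 2;

    /* Discard the pending SIGTSTP before unblocking, so we never stop. */
    memset(&ign, 0, sizeof(ign));
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    sigaction(SIGTSTP, &ign, &saved_tstp);
    sigaction(SIGTSTP, &saved_tstp, NULL);

    sigprocmask(SIG_SETMASK, &old, NULL);
    return result;
}
```

The cleanup relies on the rule from do_sigaction that setting an action to SIG_IGN discards a pending instance of that signal.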
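The second function to note in __group_send_sig_info is sig_ignored, which decides whether the signal is ignored
static int sig_ignored(struct task_struct *t, int sig)
{
    void __user * handler;

    /*
     * Tracers always want to know about signals..
     */
    if (t->ptrace & PT_PTRACED)
        return 0;

    /*
     * Blocked signals are never ignored, since the
     * signal handler may change by the time it is
     * unblocked.
     */
    if (sigismember(&t->blocked, sig))
        return 0;

    /* Is it explicitly or implicitly ignored? */
    handler = t->sighand->action[sig-1].sa.sa_handler;
    return handler == SIG_IGN ||
           (handler == SIG_DFL && sig_kernel_ignore(sig));
}
A traced process never ignores a signal. The function then checks whether sig is in the blocked bitmap: a blocked signal is never ignored, because it is simply held back for now and handled once it is unblocked (the blocked bitmap is described in Part I). Finally it looks at sighand->action[sig-1].sa.sa_handler, the sigaction field that records how this signal is handled; its possible values are listed in Part I. sig_kernel_ignore is a macro defined at the top of the file that identifies the signals ignored by default: SIGCONT, SIGCHLD (a child stopped, terminated, or received a signal while traced), SIGWINCH (window resized) and SIGURG (urgent condition on a socket)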
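The third thing to note in __group_send_sig_info is the LEGACY_QUEUE macro
It tests whether a non-real-time signal is already present in the shared pending queue. If it is, nothing more is done and the function returns
The fourth thing to note in __group_send_sig_info is send_signal, which inserts a new element into the pending signal queue.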
static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
                       struct sigpending *signals)
{
    struct sigqueue * q = NULL;
    int ret = 0;

    /*
     * fast-pathed signals for kernel-internal things like SIGSTOP
     * or SIGKILL.
     */
    if (info == SEND_SIG_FORCED)
        goto out_set;

    /* Real-time signals must be queued if sent by sigqueue, or
       some other real-time mechanism. It is implementation
       defined whether kill() does so. We attempt to do so, on
       the principle of least surprise, but since kill is not
       allowed to fail with EAGAIN when low on memory we just
       make sure at least one signal gets delivered and don't
       pass on the info struct. */

    q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
                                         (is_si_special(info) ||
                                          info->si_code >= 0)));
    if (q) {
        list_add_tail(&q->list, &signals->list);
        switch ((unsigned long) info) {
        case (unsigned long) SEND_SIG_NOINFO:
            q->info.si_signo = sig;
            q->info.si_errno = 0;
            q->info.si_code = SI_USER;
            q->info.si_pid = current->pid;
            q->info.si_uid = current->uid;
            break;
        case (unsigned long) SEND_SIG_PRIV:
            q->info.si_signo = sig;
            q->info.si_errno = 0;
            q->info.si_code = SI_KERNEL;
            q->info.si_pid = 0;
            q->info.si_uid = 0;
            break;
        default:
            copy_siginfo(&q->info, info);
            break;
        }
    } else if (!is_si_special(info)) {
        if (sig >= SIGRTMIN && info->si_code != SI_USER)
            /*
             * Queue overflow, abort. We may abort if the signal was rt
             * and sent by user using something other than kill().
             */
            return -EAGAIN;
    }

out_set:
    sigaddset(&signals->signal, sig);
    return ret;
}
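a. If info == SEND_SIG_FORCED (the special pointer value 2), the signal is a kernel-internal one such as SIGSTOP or SIGKILL. No sigqueue entry is allocated; the signal's bit is set directly in the bitmap of the pending queue that was passed in, and the function returns.
b. Otherwise __sigqueue_alloc allocates a sigqueue, its siginfo is filled in, and it is appended to the tail of the queue.
Even when no sigqueue can be allocated, the bit in the sigpending bitmap is still set (except for real-time signals not sent through kill(), which fail with -EAGAIN), so important signals are not lost.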
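The combined effect of LEGACY_QUEUE and send_signal is that a standard signal is queued at most once, while each real-time signal gets its own queue entry. A user-space sketch for Linux can count this by keeping the signal blocked, sending it several times, and dequeuing synchronously with sigtimedwait. The function name is invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Blocks sig, sends it to this process `sends` times, then dequeues
 * with a zero timeout and returns how many instances were queued.
 */
int count_dequeued(int sig, int sends)
{
    sigset_t set, old;
    struct timespec zero = { 0, 0 };
    int n = 0;
    int i;

    sigemptyset(&set);
    sigaddset(&set, sig);
    sigprocmask(SIG_BLOCK, &set, &old);

    for (i = 0; i < sends; i++)
        kill(getpid(), sig);

    /* sigtimedwait returns the signal number, or -1 once nothing is pending. */
    while (sigtimedwait(&set, NULL, &zero) == sig)
        n++;

    sigprocmask(SIG_SETMASK, &old, NULL);
    return n;
}
```

Sending SIGUSR1 three times is expected to leave a single pending instance, whereas three SIGRTMIN signals are expected to come back as three separate entries, as long as the per-user queue limit is not exhausted.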
The fifth thing to note in __group_send_sig_info is __group_complete_signal, which scans the threads of the group looking for one that can take the new signal
static void
__group_complete_signal(int sig, struct task_struct *p)
{
    struct task_struct *t;

    /*
     * Now find a thread we can wake up to take the signal off the queue.
     *
     * If the main thread wants the signal, it gets first crack.
     * Probably the least surprising to the average bear.
     */
    if (wants_signal(sig, p))
        t = p;
    else if (thread_group_empty(p))
        /*
         * There is just one thread and it does not need to be woken.
         * It will dequeue unblocked signals before it runs again.
         */
        return;
    else {
        /*
         * Otherwise try to find a suitable thread.
         */
        t = p->signal->curr_target;
        if (t == NULL)
            /* restart balancing at this thread */
            t = p->signal->curr_target = p;

        while (!wants_signal(sig, t)) {
            t = next_thread(t);
            if (t == p->signal->curr_target)
                /*
                 * No thread needs to be woken.
                 * Any eligible threads will see
                 * the signal in the queue soon.
                 */
                return;
        }
        p->signal->curr_target = t;
    }

    /*
     * Found a killable thread. If the signal will be fatal,
     * then start taking the whole group down immediately.
     */
    if (sig_fatal(p, sig) && !(p->signal->flags & SIGNAL_GROUP_EXIT) &&
        !sigismember(&t->real_blocked, sig) &&
        (sig == SIGKILL || !(t->ptrace & PT_PTRACED))) {
        /*
         * This signal will be fatal to the whole group.
         */
        if (!sig_kernel_coredump(sig)) {
            /*
             * Start a group exit and wake everybody up.
             * This way we don't have other threads
             * running and doing things after a slower
             * thread has the fatal signal pending.
             */
            p->signal->flags = SIGNAL_GROUP_EXIT;
            p->signal->group_exit_code = sig;
            p->signal->group_stop_count = 0;
            t = p;
            do {
                sigaddset(&t->pending.signal, SIGKILL);
                signal_wake_up(t, 1);
                t = next_thread(t);
            } while (t != p);
            return;
        }

        /*
         * There will be a core dump. We make all threads other
         * than the chosen one go into a group stop so that nothing
         * happens until it gets scheduled, takes the signal off
         * the shared queue, and does the core dump. This is a
         * little more complicated than strictly necessary, but it
         * keeps the signal state that winds up in the core dump
         * unchanged from the death state, e.g. which thread had
         * the core-dump signal unblocked.
         */
        rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
        rm_from_queue(SIG_KERNEL_STOP_MASK, &p->signal->shared_pending);
        p->signal->group_stop_count = 0;
        p->signal->group_exit_task = t;
        t = p;
        do {
            p->signal->group_stop_count++;
            signal_wake_up(t, 0);
            t = next_thread(t);
        } while (t != p);
        wake_up_process(p->signal->group_exit_task);
        return;
    }

    /*
     * The signal is already in the shared-pending queue.
     * Tell the chosen thread to wake up and dequeue it.
     */
    signal_wake_up(t, sig == SIGKILL);
    return;
}
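The function finds a thread that can take the signal and then checks whether the signal is fatal. If it is fatal and does not produce a core dump, SIGKILL is added to every thread's private pending set and all threads are woken. If it produces a core dump, all threads are put into a group stop and the chosen thread is woken to do the dump. If the signal is not fatal, only the chosen thread is notified. Threads are woken with signal_wake_up.
void signal_wake_up(struct task_struct *t, int resume)
{
    unsigned int mask;

    set_tsk_thread_flag(t, TIF_SIGPENDING);

    /*
     * For SIGKILL, we want to wake it up in the stopped/traced case.
     * We don't check t->state here because there is a race with it
     * executing another processor and just now entering stopped state.
     * By using wake_up_state, we ensure the process will wake up and
     * handle its death signal.
     */
    mask = TASK_INTERRUPTIBLE;
    if (resume)
        mask |= TASK_STOPPED | TASK_TRACED;
    if (!wake_up_state(t, mask))
        kick_process(t);
}
This function sets the TIF_SIGPENDING flag and calls wake_up_state to wake the thread. If nothing was woken (for example because the thread is already running, possibly on another CPU), kick_process sends an inter-processor interrupt to make sure the thread notices the signal.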
That completes the kill_pg_info branch of kill_something_info. Next comes kill_proc_info
int
kill_proc_info(int sig, struct siginfo *info, pid_t pid)
{
    int error;
    int acquired_tasklist_lock = 0;
    struct task_struct *p;

    rcu_read_lock();
    if (unlikely(sig_needs_tasklist(sig))) {
        read_lock(&tasklist_lock);
        acquired_tasklist_lock = 1;
    }
    p = find_task_by_pid(pid);
    error = -ESRCH;
    if (p)
        error = group_send_sig_info(sig, info, p);
    if (unlikely(acquired_tasklist_lock))
        read_unlock(&tasklist_lock);
    rcu_read_unlock();
    return error;
}
After taking the locks it looks up the task by pid and calls group_send_sig_info
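2.2.2 The tkill system call sends a signal to one specific thread (not to a whole thread group, which is what distinguishes it from kill). The tgkill system call sends a signal to one thread inside a specific thread group
tkill has the same prototype as kill, while tgkill takes three parameters: the thread group ID, the pid, and the signal number. The prototypes are
tkill(int pid, int sig)
tgkill(int tgid, int pid, int sig)
Following the system call naming convention, their service routines are sys_tkill and sys_tgkill
asmlinkage long
sys_tkill(int pid, int sig)
{
    /* This is only valid for single tasks */
    if (pid <= 0)
        return -EINVAL;

    return do_tkill(0, pid, sig);
}
asmlinkage long sys_tgkill(int tgid, int pid, int sig)
{
    /* This is only valid for single tasks */
    if (pid <= 0 || tgid <= 0)
        return -EINVAL;

    return do_tkill(tgid, pid, sig);
}
After a simple sanity check on pid and tgid, both sys_tkill and sys_tgkill call do_tkill; only the first argument differs. Here is do_tkill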
static int do_tkill(int tgid, int pid, int sig)
{
    int error;
    struct siginfo info;
    struct task_struct *p;

    error = -ESRCH;
    info.si_signo = sig;
    info.si_errno = 0;
    info.si_code = SI_TKILL;
    info.si_pid = current->tgid;
    info.si_uid = current->uid;

    read_lock(&tasklist_lock);
    p = find_task_by_pid(pid);
    if (p && (tgid <= 0 || p->tgid == tgid)) {
        error = check_kill_permission(sig, &info, p);
        /*
         * The null signal is a permissions and process existence
         * probe. No signal is actually delivered.
         */
        if (!error && sig && p->sighand) {
            spin_lock_irq(&p->sighand->siglock);
            handle_stop_signal(sig, p);
            error = specific_send_sig_info(sig, &info, p);
            spin_unlock_irq(&p->sighand->siglock);
        }
    }
    read_unlock(&tasklist_lock);

    return error;
}
In do_tkill, find_task_by_pid does what its name says, and check_kill_permission and handle_stop_signal were already covered in the kill flow. That leaves the core step,
the specific_send_sig_info function
static int
specific_send_sig_info(int sig, struct siginfo *info, struct task_struct *t)
{
    int ret = 0;

    BUG_ON(!irqs_disabled());
    assert_spin_locked(&t->sighand->siglock);

    /* Short-circuit ignored signals. */
    if (sig_ignored(t, sig))
        goto out;

    /* Support queueing exactly one non-rt signal, so that we
       can get more detailed information about the cause of
       the signal. */
    if (LEGACY_QUEUE(&t->pending, sig))
        goto out;

    ret = send_signal(sig, info, t, &t->pending);
    if (!ret && !sigismember(&t->blocked, sig))
        signal_wake_up(t, sig == SIGKILL);
out:
    return ret;
}
sig_ignored checks whether the signal should be ignored. LEGACY_QUEUE checks whether the signal is a non-real-time signal that is already present in the private queue, in which case the function returns immediately. Otherwise send_signal is called, and if it succeeds and the signal is not blocked, signal_wake_up wakes the thread (signal_wake_up and send_signal were described in the kill section)
sig_kernel_ignore, used by sig_ignored, checks that the signal number is below 32 (a non-real-time signal) and that it is one of SIGCONT, SIGCHLD, SIGWINCH or SIGURG.
2.2.3 The rt_sigqueueinfo system call (service routine sys_rt_sigqueueinfo) sends a real-time signal to a thread group; its third parameter is a pointer to a siginfo.
asmlinkage long
sys_rt_sigqueueinfo(int pid, int sig, siginfo_t __user *uinfo)
{
    siginfo_t info;

    if (copy_from_user(&info, uinfo, sizeof(siginfo_t)))
        return -EFAULT;

    /* Not even root can pretend to send signals from the kernel.
       Nor can they impersonate a kill(), which adds source info. */
    if (info.si_code >= 0)
        return -EPERM;
    info.si_signo = sig;

    /* POSIX.1b doesn't mention process groups. */
    return kill_proc_info(sig, &info, pid);
}
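a. The siginfo structure is copied in from user space through the pointer.
b. According to the description of si_code in include/asm-generic/siginfo.h, a value greater than 0 means the signal comes from the kernel and 0 means it was sent by kill, sigsend or raise. The check rejects any si_code >= 0, so a user can impersonate neither the kernel nor kill().
c. kill_proc_info is called.
Note that sys_rt_sigqueueinfo follows the same route as kill and sends the signal to a thread group. The LEGACY_QUEUE test in __group_send_sig_info only filters out duplicate non-real-time signals (see above), so real-time signals are queued every time.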
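From user space this system call is normally reached through sigqueue(3). The sketch below, for a POSIX system with real-time signals, queues a value with SIGRTMIN while the signal is blocked and reads the siginfo back with sigtimedwait; the function name is invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Queues SIGRTMIN carrying `value` to this process and dequeues it again.
 * Returns the received value (or -1 on failure) and stores the si_code.
 */
int queue_and_read_value(int value, int *si_code_out)
{
    sigset_t set, old;
    siginfo_t info;
    union sigval v;
    struct timespec zero = { 0, 0 };
    int got = -1;

    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &set, &old);

    v.sival_int = value;
    if (sigqueue(getpid(), SIGRTMIN, v) == 0 &&
        sigtimedwait(&set, &info, &zero) == SIGRTMIN) {
        got = info.si_value.sival_int;
        *si_code_out = info.si_code;
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return got;
}
```

The si_code seen by the receiver is SI_QUEUE, a negative value, which is exactly the kind of code that passes the "si_code >= 0" rejection test in the service routine.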
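2.3 Delivering signals
2.3.1 Determining how a signal is handled
When an interrupt handler returns, the TIF_SIGPENDING flag of the process is checked to see whether a signal has arrived. If so, do_signal is called; this function is in arch/<arch>/kernel/signal.c.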
static void fastcall do_signal(struct pt_regs *regs)
{
    siginfo_t info;
    int signr;
    struct k_sigaction ka;
    sigset_t *oldset;

    /*
     * We want the common case to go fast, which
     * is why we may in certain cases get here from
     * kernel mode. Just return without doing anything
     * if so. vm86 regs switched out by assembly code
     * before reaching here, so testing against kernel
     * CS suffices.
     */
    if (!user_mode(regs))
        return;

    if (test_thread_flag(TIF_RESTORE_SIGMASK))
        oldset = &current->saved_sigmask;
    else
        oldset = &current->blocked;

    signr = get_signal_to_deliver(&info, &ka, regs, NULL);
    if (signr > 0) {
        /* Reenable any watchpoints before delivering the
         * signal to user space. The processor register will
         * have been cleared if the watchpoint triggered
         * inside the kernel.
         */
        if (unlikely(current->thread.debugreg[7]))
            set_debugreg(current->thread.debugreg[7], 7);

        /* Whee! Actually deliver the signal. */
        if (handle_signal(signr, &info, &ka, oldset, regs) == 0) {
            /* a signal was successfully delivered; the saved
             * sigmask will have been stored in the signal frame,
             * and will be restored by sigreturn, so we can simply
             * clear the TIF_RESTORE_SIGMASK flag */
            if (test_thread_flag(TIF_RESTORE_SIGMASK))
                clear_thread_flag(TIF_RESTORE_SIGMASK);
        }

        return;
    }

    /* Did we come from a system call? */
    if (regs->orig_eax >= 0) {
        /* Restart the system call - no handlers present */
        switch (regs->eax) {
        case -ERESTARTNOHAND:
        case -ERESTARTSYS:
        case -ERESTARTNOINTR:
            regs->eax = regs->orig_eax;
            regs->eip -= 2;
            break;

        case -ERESTART_RESTARTBLOCK:
            regs->eax = __NR_restart_syscall;
            regs->eip -= 2;
            break;
        }
    }

    /* if there's no signal to deliver, we just put the saved sigmask
     * back */
    if (test_thread_flag(TIF_RESTORE_SIGMASK)) {
        clear_thread_flag(TIF_RESTORE_SIGMASK);
        sigprocmask(SIG_SETMASK, &current->saved_sigmask, NULL);
    }
}
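The parameter of do_signal is a struct pt_regs, defined in include/asm-i386/ptrace.h, which holds the contents of the CPU registers.
First, user_mode (also in ptrace.h) uses the lowest two bits of the xcs register to decide whether the process was running in user mode. If it was in kernel mode, do_signal returns immediately: the signal check sits on the interrupt return path, and being in kernel mode means this is not a transition from kernel mode back to user mode.
The TIF_RESTORE_SIGMASK flag is defined in include/asm-i386/thread_info.h, which also defines TIF_SIGPENDING, the flag used to check for signals. TIF_RESTORE_SIGMASK goes together with the saved_sigmask bitmap in the process descriptor. As I understand it, after the set of blocked signals has been changed, the original bitmap is kept in saved_sigmask and this flag is set at the same time.
Next comes get_signal_to_deliver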
int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
                          struct pt_regs *regs, void *cookie)
{
    sigset_t *mask = &current->blocked;
    int signr = 0;

    try_to_freeze();

relock:
    spin_lock_irq(&current->sighand->siglock);
    for (;;) {
        struct k_sigaction *ka;

        if (unlikely(current->signal->group_stop_count > 0) &&
            handle_group_stop())
            goto relock;

        signr = dequeue_signal(current, mask, info);

        if (!signr)
            break; /* will return 0 */

        if ((current->ptrace & PT_PTRACED) && signr != SIGKILL) {
            ptrace_signal_deliver(regs, cookie);

            /* Let the debugger run. */
            ptrace_stop(signr, signr, info);

            /* We're back. Did the debugger cancel the sig? */
            signr = current->exit_code;
            if (signr == 0)
                continue;

            current->exit_code = 0;

            /* Update the siginfo structure if the signal has
               changed. If the debugger wanted something
               specific in the siginfo structure then it should
               have updated *info via PTRACE_SETSIGINFO. */
            if (signr != info->si_signo) {
                info->si_signo = signr;
                info->si_errno = 0;
                info->si_code = SI_USER;
                info->si_pid = current->parent->pid;
                info->si_uid = current->parent->uid;
            }

            /* If the (new) signal is now blocked, requeue it. */
            if (sigismember(&current->blocked, signr)) {
                specific_send_sig_info(signr, info, current);
                continue;
            }
        }

        ka = &current->sighand->action[signr-1];
        if (ka->sa.sa_handler == SIG_IGN) /* Do nothing. */
            continue;
        if (ka->sa.sa_handler != SIG_DFL) {
            /* Run the handler. */
            *return_ka = *ka;

            if (ka->sa.sa_flags & SA_ONESHOT)
                ka->sa.sa_handler = SIG_DFL;

            break; /* will return non-zero "signr" value */
        }

        /*
         * Now we are doing the default action for this signal.
         */
        if (sig_kernel_ignore(signr)) /* Default is nothing. */
            continue;

        /* Init gets no signals it doesn't want. */
        if (current == child_reaper)
            continue;

        if (sig_kernel_stop(signr)) {
            /*
             * The default action is to stop all threads in
             * the thread group. The job control signals
             * do nothing in an orphaned pgrp, but SIGSTOP
             * always works. Note that siglock needs to be
             * dropped during the call to is_orphaned_pgrp()
             * because of lock ordering with tasklist_lock.
             * This allows an intervening SIGCONT to be posted.
             * We need to check for that and bail out if necessary.
             */
            if (signr != SIGSTOP) {
                spin_unlock_irq(&current->sighand->siglock);

                /* signals can be posted during this window */

                if (is_orphaned_pgrp(process_group(current)))
                    goto relock;

                spin_lock_irq(&current->sighand->siglock);
            }

            if (likely(do_signal_stop(signr))) {
                /* It released the siglock. */
                goto relock;
            }

            /*
             * We didn't actually stop, due to a race
             * with SIGCONT or something like that.
             */
            continue;
        }

        spin_unlock_irq(&current->sighand->siglock);

        /*
         * Anything else is fatal, maybe with a core dump.
         */
        current->flags |= PF_SIGNALED;
        if (sig_kernel_coredump(signr)) {
            /*
             * If it was able to dump core, this kills all
             * other threads in the group and synchronizes with
             * their demise. If we lost the race with another
             * thread getting here, it set group_exit_code
             * first and our do_group_exit call below will use
             * that value and ignore the one we pass it.
             */
            do_coredump((long)signr, signr, regs);
        }

        /*
         * Death signals, no core dump.
         */
        do_group_exit(signr);
        /* NOTREACHED */
    }
    spin_unlock_irq(&current->sighand->siglock);
    return signr;
}
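try_to_freeze is in include/linux/sched.h. In this kernel version it has no real content yet, and the reference books all describe older versions, so it is skipped here.
Next, the sighand lock is taken and a loop begins.
The loop first looks at the group_stop_count field. How it relates to handle_group_stop is not yet clear to me and needs another look later.
Then dequeue_signal is called. It considers the private queue first and the shared queue second, picks one signal and returns its number. This involves updating both queue structures and calling recalc_sigpending to refresh the flag. The function also deals with a number of other flags, some of which I do not fully understand yet; to be revisited.
Back in get_signal_to_deliver, once dequeue_signal has returned a signal, the matching sigaction structure is fetched
The sa_handler field of that sigaction determines how the signal is handled. SIG_IGN means the signal is ignored, SIG_DFL means the default action is taken, and any other value is the entry address of a signal handler that must be run, in which case the sigaction structure is passed back to the caller. (The effect of SA_ONESHOT is visible here; that flag is set by the signal system call.)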
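The decision tree in get_signal_to_deliver can be summarised as a pure function. This is a simplified model, not the kernel's code: the enum and function names are invented, the ptrace and init special cases are left out, and the list of core-dumping signals is an assumption based on the usual Linux defaults since sig_kernel_coredump is not shown in this text.

```c
#include <signal.h>

enum delivery {
    DELIVER_IGNORE,      /* nothing happens                          */
    DELIVER_HANDLER,     /* return to do_signal and run the handler  */
    DELIVER_STOP,        /* do_signal_stop: stop the thread group    */
    DELIVER_COREDUMP,    /* do_coredump, then do_group_exit          */
    DELIVER_TERMINATE    /* do_group_exit only                       */
};

typedef void (*model_handler_t)(int);

/* Placeholder handler used only to have a non-special address. */
void sample_handler(int sig)
{
    (void)sig;
}

enum delivery classify_delivery(int sig, model_handler_t handler)
{
    if (handler == SIG_IGN)
        return DELIVER_IGNORE;
    if (handler != SIG_DFL)
        return DELIVER_HANDLER;

    /* From here on the default action applies. */
    switch (sig) {
    case SIGCONT: case SIGCHLD: case SIGWINCH: case SIGURG:
        return DELIVER_IGNORE;            /* sig_kernel_ignore */
    case SIGSTOP: case SIGTSTP: case SIGTTIN: case SIGTTOU:
        return DELIVER_STOP;              /* sig_kernel_stop */
    case SIGQUIT: case SIGILL: case SIGTRAP: case SIGABRT:
    case SIGFPE: case SIGSEGV: case SIGBUS: case SIGSYS:
    case SIGXCPU: case SIGXFSZ:
        return DELIVER_COREDUMP;          /* assumed sig_kernel_coredump set */
    default:
        return DELIVER_TERMINATE;
    }
}
```

Only the DELIVER_HANDLER case returns a nonzero signal number to do_signal; every other outcome is handled entirely inside the loop.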
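2.3.2 Default handling of signals
The default actions are to ignore the signal, to stop the thread group, to kill the process, and to kill it after writing a core dump with do_coredump. Default handling is done inside get_signal_to_deliver
2.3.3 Catching signals
When a signal is caught, get_signal_to_deliver returns the contents of the sigaction structure and of the siginfo to do_signal, so we now return to do_signal.
Catching the signal is actually done by handle_signal, found in arch/i386/kernel/signal.c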
static int
handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka,
              sigset_t *oldset, struct pt_regs * regs)
{
    int ret;

    /* Are we from a system call? */
    if (regs->orig_eax >= 0) {
        /* If so, check system call restarting.. */
        switch (regs->eax) {
        case -ERESTART_RESTARTBLOCK:
        case -ERESTARTNOHAND:
            regs->eax = -EINTR;
            break;

        case -ERESTARTSYS:
            if (!(ka->sa.sa_flags & SA_RESTART)) {
                regs->eax = -EINTR;
                break;
            }
        /* fallthrough */
        case -ERESTARTNOINTR:
            regs->eax = regs->orig_eax;
            regs->eip -= 2;
        }
    }

    /*
     * If TF is set due to a debugger (PT_DTRACE), clear the TF flag so
     * that register information in the sigcontext is correct.
     */
    if (unlikely(regs->eflags & TF_MASK)
        && likely(current->ptrace & PT_DTRACE)) {
        current->ptrace &= ~PT_DTRACE;
        regs->eflags &= ~TF_MASK;
    }

    /* Set up the stack frame */
    if (ka->sa.sa_flags & SA_SIGINFO)
        ret = setup_rt_frame(sig, ka, info, oldset, regs);
    else
        ret = setup_frame(sig, ka, oldset, regs);

    if (ret == 0) {
        spin_lock_irq(&current->sighand->siglock);
        sigorsets(&current->blocked,&current->blocked,&ka->sa.sa_mask);
        if (!(ka->sa.sa_flags & SA_NODEFER))
            sigaddset(&current->blocked,sig);
        recalc_sigpending();
        spin_unlock_irq(&current->sighand->siglock);
    }

    return ret;
}
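The function first deals with restarting an interrupted system call, which is discussed later.
handle_signal then uses setup_rt_frame or setup_frame to build the stack frame. After that it updates the blocked bitmap and some flags: sa_mask is OR-ed into blocked, SA_NODEFER decides whether the signal itself is blocked while its handler runs, and recalc_sigpending re-examines the pending signals to set the TIF_SIGPENDING flag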
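The restart logic at the top of handle_signal, together with the no-handler branch of do_signal, can be modelled as two small functions. This is an illustrative sketch: the enum and function names are invented, and the numeric values of the ERESTART* codes are the kernel-internal constants as I recall them from include/linux/errno.h, so treat them as assumptions.

```c
/* Assumed kernel-internal return codes (never seen by user space). */
#define MODEL_ERESTARTSYS           512
#define MODEL_ERESTARTNOINTR        513
#define MODEL_ERESTARTNOHAND        514
#define MODEL_ERESTART_RESTARTBLOCK 516

enum restart_outcome {
    OUTCOME_UNCHANGED,        /* eax was not a restart code            */
    OUTCOME_EINTR,            /* system call fails with EINTR          */
    OUTCOME_RESTART,          /* same system call is re-executed       */
    OUTCOME_RESTART_SYSCALL   /* restart through __NR_restart_syscall  */
};

/* A handler is about to run (handle_signal). eax holds -errno. */
enum restart_outcome restart_with_handler(long eax, int sa_restart)
{
    switch (-eax) {
    case MODEL_ERESTART_RESTARTBLOCK:
    case MODEL_ERESTARTNOHAND:
        return OUTCOME_EINTR;
    case MODEL_ERESTARTSYS:
        return sa_restart ? OUTCOME_RESTART : OUTCOME_EINTR;
    case MODEL_ERESTARTNOINTR:
        return OUTCOME_RESTART;
    default:
        return OUTCOME_UNCHANGED;
    }
}

/* No handler runs (tail of do_signal). */
enum restart_outcome restart_without_handler(long eax)
{
    switch (-eax) {
    case MODEL_ERESTARTNOHAND:
    case MODEL_ERESTARTSYS:
    case MODEL_ERESTARTNOINTR:
        return OUTCOME_RESTART;
    case MODEL_ERESTART_RESTARTBLOCK:
        return OUTCOME_RESTART_SYSCALL;
    default:
        return OUTCOME_UNCHANGED;
    }
}
```

In the kernel the restart itself is done by putting orig_eax back into eax and moving eip back by 2, the length of the int $0x80 instruction, so that the system call instruction executes again when the process returns to user mode.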
Now the setup_frame function
static int setup_frame(int sig, struct k_sigaction *ka,
                       sigset_t *set, struct pt_regs * regs)
{
    void __user *restorer;
    struct sigframe __user *frame;
    int err = 0;
    int usig;

    frame = get_sigframe(ka, regs, sizeof(*frame));

    if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
        goto give_sigsegv;

    usig = current_thread_info()->exec_domain
        && current_thread_info()->exec_domain->signal_invmap
        && sig < 32
        ? current_thread_info()->exec_domain->signal_invmap[sig]
        : sig;

    err = __put_user(usig, &frame->sig);
    if (err)
        goto give_sigsegv;

    err = setup_sigcontext(&frame->sc, &frame->fpstate, regs, set->sig[0]);
    if (err)
        goto give_sigsegv;

    if (_NSIG_WORDS > 1) {
        err = __copy_to_user(&frame->extramask, &set->sig[1],
                             sizeof(frame->extramask));
        if (err)
            goto give_sigsegv;
    }

    restorer = (void *)VDSO_SYM(&__kernel_sigreturn);
    if (ka->sa.sa_flags & SA_RESTORER)
        restorer = ka->sa.sa_restorer;

    /* Set up to return from userspace. */
    err |= __put_user(restorer, &frame->pretcode);

    /*
     * This is popl %eax ; movl $,%eax ; int $0x80
     *
     * WE DO NOT USE IT ANY MORE! It's only left here for historical
     * reasons and because gdb uses it as a signature to notice
     * signal handler stack frames.
     */
    err |= __put_user(0xb858, (short __user *)(frame->retcode+0));
    err |= __put_user(__NR_sigreturn, (int __user *)(frame->retcode+2));
    err |= __put_user(0x80cd, (short __user *)(frame->retcode+6));

    if (err)
        goto give_sigsegv;

    /* Set up registers for signal handler */
    regs->esp = (unsigned long) frame;
    regs->eip = (unsigned long) ka->sa.sa_handler;
    regs->eax = (unsigned long) sig;
    regs->edx = (unsigned long) 0;
    regs->ecx = (unsigned long) 0;

    set_fs(USER_DS);
    regs->xds = __USER_DS;
    regs->xes = __USER_DS;
    regs->xss = __USER_DS;
    regs->xcs = __USER_CS;

    /*
     * Clear TF when entering the signal handler, but
     * notify any tracer that was single-stepping it.
     * The tracer may want to single-step inside the
     * handler too.
     */
    regs->eflags &= ~TF_MASK;
    if (test_thread_flag(TIF_SINGLESTEP))
        ptrace_notify(SIGTRAP);

#if DEBUG_SIG
    printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
           current->comm, current->pid, frame, regs->eip, frame->pretcode);
#endif

    return 0;

give_sigsegv:
    force_sigsegv(sig, current);
    return -EFAULT;
}
It first calls the get_sigframe function:
static inline void __user *
get_sigframe(struct k_sigaction *ka, struct pt_regs * regs, size_t frame_size)
{
    unsigned long esp;

    /* Default to using normal stack */
    esp = regs->esp;

    /* This is the X/Open sanctioned signal stack switching. */
    if (ka->sa.sa_flags & SA_ONSTACK) {
        if (sas_ss_flags(esp) == 0)
            esp = current->sas_ss_sp + current->sas_ss_size;
    }

    /* This is the legacy signal stack switching. */
    else if ((regs->xss & 0xffff) != __USER_DS &&
             !(ka->sa.sa_flags & SA_RESTORER) &&
             ka->sa.sa_restorer) {
        esp = (unsigned long) ka->sa.sa_restorer;
    }

    esp -= frame_size;
    /* Align the stack pointer according to the i386 ABI,
     * i.e. so that on function entry ((sp + 4) & 15) == 0. */
    esp = ((esp + 4) & -16ul) - 4;
    return (void __user *) esp;
}
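a. If sa_flags has SA_ONSTACK set, the alternate signal stack is used; the fields involved (sas_ss_sp, sas_ss_size) are described with the process descriptor fields in Part 1.
b. Otherwise there is a check on the ss register and the SA_RESTORER flag, which involves the sa_restorer member of struct sigaction (the member not listed in Part 1).
c. esp is then moved down by the size of one frame (the stack grows downward), which makes room for the signal frame.
d. Finally, one statement aligns esp as the i386 ABI requires. (esp + 4) & -16ul rounds esp + 4 down to a multiple of 16, and subtracting 4 again yields an esp for which ((esp + 4) & 15) == 0. That is the alignment a function sees on entry, right after a call has pushed its 4-byte return address.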
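The last two statements of get_sigframe (steps c and d) can be checked with plain arithmetic. The sketch below is user-space C, not kernel code; place_frame is a hypothetical helper that repeats the same computation on a 32-bit value, with ~15 standing in for -16ul.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the tail of get_sigframe for a 32-bit
 * stack pointer: reserve frame_size bytes, then align so that
 * ((result + 4) & 15) == 0.
 */
static uint32_t place_frame(uint32_t esp, uint32_t frame_size)
{
    esp -= frame_size;                       /* step c: make room for the frame */
    esp = ((esp + 4) & ~(uint32_t)15) - 4;   /* step d: i386 ABI alignment      */
    return esp;
}
```

Because the alignment only ever rounds down, the returned address is never above esp - frame_size, so the frame still fits below the original stack pointer.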
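In short, get_sigframe reserves room for one frame on the stack and returns the resulting esp. Now back to setup_frame.
The long stretch of code that follows initializes the fields of the sigframe; along the way it calls setup_sigcontext to fill in the sigcontext.
sigframe is defined in arch/i386/kernel/sigframe.h and sigcontext in include/asm-i386/sigcontext.h (i386 being the specific architecture).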
struct sigframe
{
    char __user *pretcode;
    int sig;
    struct sigcontext sc;
    struct _fpstate fpstate;
    unsigned long extramask[_NSIG_WORDS-1];
    char retcode[8];
};

struct rt_sigframe
{
    char __user *pretcode;
    int sig;
    struct siginfo __user *pinfo;
    void __user *puc;
    struct siginfo info;
    struct ucontext uc;
    struct _fpstate fpstate;
    char retcode[8];
};

struct sigcontext {
    unsigned short gs, __gsh;
    unsigned short fs, __fsh;
    unsigned short es, __esh;
    unsigned short ds, __dsh;
    unsigned long edi;
    unsigned long esi;
    unsigned long ebp;
    unsigned long esp;
    unsigned long ebx;
    unsigned long edx;
    unsigned long ecx;
    unsigned long eax;
    unsigned long trapno;
    unsigned long err;
    unsigned long eip;
    unsigned short cs, __csh;
    unsigned long eflags;
    unsigned long esp_at_signal;
    unsigned short ss, __ssh;
    struct _fpstate __user * fpstate;
    unsigned long oldmask;
    unsigned long cr2;
};
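The three __put_user calls that fill retcode[8] in setup_frame are easier to read as bytes. The sketch below is a user-space illustration, not kernel code: build_retcode is a hypothetical helper, and NR_SIGRETURN_I386 is a local constant set to 119, the i386 system call number of sigreturn. On little-endian x86 the 16-bit stores 0xb858 and 0x80cd land in memory as 58 b8 and cd 80.

```c
#include <stdint.h>

#define NR_SIGRETURN_I386 119u   /* i386 system call number of sigreturn */

/*
 * Hypothetical helper: writes the same 8 bytes that setup_frame stores
 * in frame->retcode, one byte at a time so the layout is explicit.
 */
static void build_retcode(uint8_t retcode[8])
{
    uint32_t nr = NR_SIGRETURN_I386;

    retcode[0] = 0x58;                  /* popl %eax                      */
    retcode[1] = 0xb8;                  /* movl $imm32, %eax (opcode)     */
    retcode[2] = (uint8_t)(nr & 0xff);  /* imm32, least significant first */
    retcode[3] = (uint8_t)((nr >> 8) & 0xff);
    retcode[4] = (uint8_t)((nr >> 16) & 0xff);
    retcode[5] = (uint8_t)((nr >> 24) & 0xff);
    retcode[6] = 0xcd;                  /* int $0x80 (opcode)             */
    retcode[7] = 0x80;                  /* int $0x80 (vector)             */
}
```

As the kernel comment says, this stub is no longer executed; the handler actually returns through pretcode, and the bytes remain only as a signature that gdb recognizes.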
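I won't go into the design rationale of this code or the values of the individual fields here; the book describes them clearly.
This code also involves force_sig (the give_sigsegv path calls force_sigsegv), which is described below.
At this point handle_signal is done and control returns to do_signal. What remains is the restarting of system calls, which is handled according to the return value in eax; I won't go into that here either.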
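2.4 Execution of sigreturn
When the signal handler finishes, it returns to the address stored in frame->pretcode, which setup_frame pointed at __kernel_sigreturn (or at sa_restorer when SA_RESTORER is set). That code invokes the sigreturn system call: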
asmlinkage int sys_sigreturn(unsigned long __unused)
{
    struct pt_regs *regs = (struct pt_regs *) &__unused;
    struct sigframe __user *frame = (struct sigframe __user *)(regs->esp - 8);
    sigset_t set;
    int eax;

    if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
        goto badframe;
    if (__get_user(set.sig[0], &frame->sc.oldmask)
        || (_NSIG_WORDS > 1
            && __copy_from_user(&set.sig[1], &frame->extramask,
                                sizeof(frame->extramask))))
        goto badframe;

    sigdelsetmask(&set, ~_BLOCKABLE);
    spin_lock_irq(&current->sighand->siglock);
    current->blocked = set;
    recalc_sigpending();
    spin_unlock_irq(&current->sighand->siglock);

    if (restore_sigcontext(regs, &frame->sc, &eax))
        goto badframe;
    return eax;

badframe:
    force_sig(SIGSEGV, current);
    return 0;
}
The main job of this function is to restore the blocked-signal mask and the register values. One more function worth noting is force_sig:
void
force_sig(int sig, struct task_struct *p)
{
    force_sig_info(sig, SEND_SIG_PRIV, p);
}

int
force_sig_info(int sig, struct siginfo *info, struct task_struct *t)
{
    unsigned long int flags;
    int ret, blocked, ignored;
    struct k_sigaction *action;

    spin_lock_irqsave(&t->sighand->siglock, flags);
    action = &t->sighand->action[sig-1];
    ignored = action->sa.sa_handler == SIG_IGN;
    blocked = sigismember(&t->blocked, sig);
    if (blocked || ignored) {
        action->sa.sa_handler = SIG_DFL;
        if (blocked) {
            sigdelset(&t->blocked, sig);
            recalc_sigpending_tsk(t);
        }
    }
    ret = specific_send_sig_info(sig, info, t);
    spin_unlock_irqrestore(&t->sighand->siglock, flags);

    return ret;
}
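What it does is clear from its name: it forces a signal onto the process, regardless of whether the process currently ignores or blocks it. If the signal is ignored or blocked, the handler is reset to SIG_DFL, and if it is blocked it is also removed from the blocked mask, before the signal is sent. The functions it calls have all been described earlier.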
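The decision logic of force_sig_info can be restated as a small stand-alone model. This is only a sketch under simplifying assumptions: struct toy_task, the toy_handler values and toy_force_sig are all hypothetical, a single 64-bit word replaces sigset_t, and locking and the real queueing done by specific_send_sig_info are left out.

```c
#include <stdint.h>

/* Hypothetical stand-ins for SIG_DFL, SIG_IGN and a user-installed handler. */
enum toy_handler { TOY_SIG_DFL = 0, TOY_SIG_IGN = 1, TOY_SIG_USER = 2 };

/* Hypothetical, heavily reduced model of the per-task signal state. */
struct toy_task {
    uint64_t blocked;               /* bit (sig - 1) set: sig is blocked */
    uint64_t pending;               /* bit (sig - 1) set: sig is pending */
    enum toy_handler action[64];    /* action[sig - 1]                   */
};

/* Models force_sig_info: make sure sig can be neither ignored nor blocked. */
static void toy_force_sig(struct toy_task *t, int sig)
{
    uint64_t bit = (uint64_t)1 << (sig - 1);
    int ignored = (t->action[sig - 1] == TOY_SIG_IGN);
    int blocked = (t->blocked & bit) != 0;

    if (blocked || ignored) {
        t->action[sig - 1] = TOY_SIG_DFL;   /* fall back to the default action */
        if (blocked)
            t->blocked &= ~bit;             /* unblock the signal              */
    }
    t->pending |= bit;                      /* stands in for the actual send   */
}
```

A signal that is neither blocked nor ignored keeps its handler in this model, just as in the kernel code, where the reset to SIG_DFL happens only inside the if branch.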
After sigreturn completes, the process resumes execution from where it was before the signal was handled. That concludes the whole signal handling path.
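The mask restoration done by sigreturn is visible from user space as well. The following is a minimal sketch, assuming a POSIX system and a single-threaded caller; round_trip_handler and mask_round_trip are hypothetical names. The handler is installed with SIGUSR2 in sa_mask, so SIGUSR2 should be blocked while the handler runs and unblocked again once it has returned.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>

/* Set by the handler: 1 if SIGUSR2 is blocked while the handler runs. */
static volatile sig_atomic_t usr2_blocked_inside = -1;

static void round_trip_handler(int sig)
{
    sigset_t cur;

    (void)sig;
    sigprocmask(SIG_BLOCK, NULL, &cur);   /* query only */
    usr2_blocked_inside = sigismember(&cur, SIGUSR2);
}

/*
 * Hypothetical helper: returns 1 if SIGUSR2 was blocked inside the
 * SIGUSR1 handler and is unblocked again after the handler returned,
 * 0 if not, -1 on error.
 */
static int mask_round_trip(void)
{
    struct sigaction sa;
    sigset_t set, after;

    /* Start from a state where neither signal is blocked. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigaddset(&set, SIGUSR2);
    if (sigprocmask(SIG_UNBLOCK, &set, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = round_trip_handler;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGUSR2);      /* block SIGUSR2 during the handler */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    if (raise(SIGUSR1) != 0)              /* returns after the handler has run */
        return -1;

    sigprocmask(SIG_BLOCK, NULL, &after);
    return usr2_blocked_inside == 1 && sigismember(&after, SIGUSR2) == 0;
}
```

The mask seen after raise() returns is the one saved in the signal frame (oldmask and extramask in the i386 code above) and written back to current->blocked by sys_sigreturn.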