  • How the select function works

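    First, a brief recap of the basic idea behind I/O multiplexing: build a table of the descriptors of interest, then call a function that does not return until at least one of those descriptors is ready for I/O. On return, the function tells the process which descriptors are ready.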

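    The arguments to select tell the kernel:

    (1) Which descriptors we are interested in.

    (2) Which conditions we care about for each descriptor (do we want to read from it, write to it, or be told about an exceptional condition on it?).

    (3) How long we are willing to wait (forever, for a fixed amount of time, or not at all).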

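    When select returns, the kernel tells us:

    (1) The number of descriptors that are ready.

    (2) Which descriptors are ready for reading, for writing, or have an exceptional condition pending.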
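
    Seen from user space, the three inputs and two outputs above can be sketched as follows. This is a minimal sketch, not part of the original article: wait_readable is a hypothetical helper name, and it watches only one descriptor for readability.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the article): wait until fd is readable.
 * timeout_ms < 0 waits forever, 0 polls without waiting, > 0 waits that long.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv, *tvp = NULL;
    int nready;

    assert(fd >= 0 && fd < FD_SETSIZE);

    /* (1) and (2): the descriptor we care about, and the condition (read). */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    /* (3): how long to wait; a NULL timeout means "wait forever". */
    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }

    /* The first argument is the highest descriptor number plus 1. */
    nready = select(fd + 1, &rfds, NULL, NULL, tvp);
    if (nready < 0)
        return -1;
    if (nready == 0)
        return 0;

    /* On return, the set holds only the descriptors that are ready. */
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

    For example, calling wait_readable on the read end of an empty pipe with a timeout of 0 should report a timeout, and after one byte is written to the pipe it should report that the descriptor is readable.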

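    select queries the state of a device so that a user program can find out whether it can access the device without blocking. This requires support from the poll method in the device driver. The main API used inside a driver's poll method is poll_wait, whose prototype is:

    void poll_wait(struct file *filp, wait_queue_head_t *queue, poll_table *wait);

    poll_wait adds the current process to the wait list (poll_table) specified by the wait parameter.

    Note that poll_wait itself does not block. A call such as poll_wait(filp, &outq, wait) in a driver does not mean "wait until the outq wait queue is signaled"; the actual blocking happens in the upper-level select/poll implementation. select/poll loops over every monitored device and calls that device's own poll method, so that the current process is added to the wait list of each device. If none of the monitored devices is ready, the kernel reschedules (calls schedule), gives up the CPU, and the process blocks; when schedule returns, the loop checks again whether any operation can proceed, and so on. If, on the other hand, any device is ready, select/poll returns immediately.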

    When an application calls select(), the system call traps into the kernel and follows this path:

    SYSCALL_DEFINE5 (sys_select)----> core_sys_select -----> do_select()

    SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
                    fd_set __user *, exp, struct timeval __user *, tvp) // n is the highest-numbered file descriptor plus 1
    {
        struct timespec end_time, *to = NULL;
        struct timeval tv;
        int ret;

        if (tvp) {
            if (copy_from_user(&tv, tvp, sizeof(tv)))
                return -EFAULT;

            to = &end_time;
            if (poll_select_set_timeout(to,
                    tv.tv_sec + (tv.tv_usec / USEC_PER_SEC),
                    (tv.tv_usec % USEC_PER_SEC) * NSEC_PER_USEC))
                return -EINVAL;
        }

        ret = core_sys_select(n, inp, outp, exp, to);
        ret = poll_select_copy_remaining(&end_time, tvp, 1, ret);

        return ret;
    }
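
    The two arguments passed to poll_select_set_timeout above normalize the user's timeval: whole seconds hidden in tv_usec are carried into the seconds field, and the remainder is converted to nanoseconds. The following user-space sketch reproduces only that arithmetic. The function name timeval_to_timespec and the -1 return value are assumptions made for illustration, and the constants are redefined here because the kernel headers are not available in user space.

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

#define USEC_PER_SEC  1000000L  /* microseconds per second */
#define NSEC_PER_USEC 1000L     /* nanoseconds per microsecond */

/*
 * Hypothetical helper (not a kernel function): convert a timeval to a
 * normalized timespec using the same expressions as the syscall entry code.
 * Returns 0 on success, -1 if the result would be negative (the kernel
 * path reports an invalid timeout as -EINVAL).
 */
int timeval_to_timespec(const struct timeval *tv, struct timespec *ts)
{
    /* Carry any whole seconds contained in tv_usec into tv_sec. */
    ts->tv_sec = tv->tv_sec + tv->tv_usec / USEC_PER_SEC;
    /* Convert the sub-second remainder from microseconds to nanoseconds. */
    ts->tv_nsec = (tv->tv_usec % USEC_PER_SEC) * NSEC_PER_USEC;

    if (ts->tv_sec < 0 || ts->tv_nsec < 0)
        return -1;
    return 0;
}
```

    With tv = { 1, 2500000 }, the 2,500,000 microseconds contribute 2 whole seconds and 500,000 leftover microseconds, so the result is 3 seconds and 500,000,000 nanoseconds.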

    core_sys_select() in turn calls do_select:

     (I find that code formatting actually looks worse here.)

    int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
    {
        ktime_t expire, *to = NULL;
        struct poll_wqueues table;
        poll_table *wait;
        int retval, i, timed_out = 0;
        unsigned long slack = 0;

        rcu_read_lock();
        retval = max_select_fd(n, fds);
        rcu_read_unlock();

        if (retval < 0)
            return retval;
        n = retval;

        /* Initialize the structure; mainly sets the poll_wait callback to __pollwait. */
        poll_initwait(&table);
        wait = &table.pt;
        if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
            wait = NULL;
            timed_out = 1;
        }

        if (end_time && !timed_out)
            slack = estimate_accuracy(end_time);

        retval = 0;
        for (;;) {
            unsigned long *rinp, *routp, *rexp, *inp, *outp, *exp;

            inp = fds->in; outp = fds->out; exp = fds->ex;
            rinp = fds->res_in; routp = fds->res_out; rexp = fds->res_ex;

            for (i = 0; i < n; ++rinp, ++routp, ++rexp) {
                unsigned long in, out, ex, all_bits, bit = 1, mask, j;
                unsigned long res_in = 0, res_out = 0, res_ex = 0;
                const struct file_operations *f_op = NULL;
                struct file *file = NULL;

                in = *inp++; out = *outp++; ex = *exp++;
                all_bits = in | out | ex;
                if (all_bits == 0) {
                    i += __NFDBITS;
                    continue;
                }

                for (j = 0; j < __NFDBITS; ++j, ++i, bit <<= 1) {
                    int fput_needed;
                    if (i >= n)
                        break;
                    if (!(bit & all_bits))
                        continue;
                    file = fget_light(i, &fput_needed);
                    if (file) {
                        f_op = file->f_op;
                        mask = DEFAULT_POLLMASK;
                        if (f_op && f_op->poll) {
                            wait_key_set(wait, in, out, bit);
                            /*
                             * Call the driver's poll method, which calls poll_wait:
                             * the driver's wait queue head is recorded in an entry of
                             * poll_wqueues, and a wait queue entry pointing to the
                             * current process is added to that wait queue head.
                             * Each wait queue head occupies one entry.
                             */
                            mask = (*f_op->poll)(file, wait);
                        }
                        fput_light(file, fput_needed);
                        /*
                         * If the descriptor is ready, record its bit so it is written
                         * back to the matching result set, and bump retval, which
                         * makes the outer loop exit.
                         */
                        if ((mask & POLLIN_SET) && (in & bit)) {
                            res_in |= bit;
                            retval++;
                            wait = NULL;
                        }
                        if ((mask & POLLOUT_SET) && (out & bit)) {
                            res_out |= bit;
                            retval++;
                            wait = NULL;
                        }
                        if ((mask & POLLEX_SET) && (ex & bit)) {
                            res_ex |= bit;
                            retval++;
                            wait = NULL;
                        }
                    }
                }
                if (res_in)
                    *rinp = res_in;
                if (res_out)
                    *routp = res_out;
                if (res_ex)
                    *rexp = res_ex;
                /* Add a preemption point so that other processes can be scheduled if needed. */
                cond_resched();
            }
            wait = NULL;
            /*
             * Leave the loop here if something is ready, the timeout has expired,
             * or the current process has a signal pending.
             */
            if (retval || timed_out || signal_pending(current))
                break;
            if (table.error) {
                retval = table.error;
                break;
            }

            /*
             * If this is the first loop and we have a timeout
             * given, then we convert to ktime_t and set the to
             * pointer to the expiry value.
             */
            /* Read the time to wait, then sleep until woken or timed out. */
            if (end_time && !to) {
                expire = timespec_to_ktime(*end_time);
                to = &expire;
            }

            if (!poll_schedule_timeout(&table, TASK_INTERRUPTIBLE, to, slack))
                timed_out = 1;
        }

        /* Remove the wait queue entries added via poll_wait from the wait queue heads and free resources. */
        poll_freewait(&table);

        /* The outcome of the call is this value: the number of ready descriptors, or a negative error code. */
        return retval;
    }

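    The general idea of do_select is this: when an application calls select(), the kernel calls poll_wait() (through each driver's poll method) to add the current process to the wait queue of each monitored device, and then puts the process to sleep. When data becomes available on one of those devices, the driver calls wake_up to wake the process.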
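
    The inner scan of do_select can be imitated in user space to make the bitmap handling easier to follow. The sketch below is a simplification under stated assumptions, not kernel code: it handles only the read set, performs a single pass with no sleeping or wait queues, and replaces the driver's poll method with a hypothetical callback (ready_fn). The names scan_once and even_fd_is_ready are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of descriptor bits per bitmap word (plays the role of __NFDBITS). */
#define BITS_PER_WORD ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical stand-in for a driver's poll method: nonzero means "ready". */
typedef int (*ready_fn)(int fd);

/* Example callback used for demonstration only: even descriptors are ready. */
int even_fd_is_ready(int fd)
{
    return fd % 2 == 0;
}

/*
 * One pass over descriptors 0..n-1. 'in' marks the watched descriptors,
 * 'res_in' receives the ready ones. Returns the number of ready descriptors.
 */
int scan_once(int n, const unsigned long *in, unsigned long *res_in,
              ready_fn is_ready)
{
    int i = 0, retval = 0;
    size_t w;

    for (w = 0; i < n; ++w) {
        unsigned long bits = in[w], res = 0, bit = 1;
        int j;

        /* Nothing watched in this word: skip all of its descriptors at once. */
        if (bits == 0) {
            res_in[w] = 0;
            i += BITS_PER_WORD;
            continue;
        }

        for (j = 0; j < BITS_PER_WORD && i < n; ++j, ++i, bit <<= 1) {
            if (!(bits & bit))
                continue;           /* descriptor i is not being watched */
            if (is_ready(i)) {
                res |= bit;         /* record it in the result bitmap */
                retval++;
            }
        }
        res_in[w] = res;
    }
    return retval;
}
```

    For example, watching descriptors 1, 2 and 4 with n = 5 and the even_fd_is_ready callback reports two ready descriptors, and only bits 2 and 4 are set in the result word. In the real do_select, a pass that finds nothing ready is followed by poll_schedule_timeout, and the scan repeats after the process is woken.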

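    Note: Source Insight is indispensable for reading kernel code. I recommend the English version, though; I could not change the font in my Chinese version, which made it awkward to read. You can download the source from http://kernel.org/ and add it to a Source Insight project, then use Linux Cross Reference for lookups.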

  • Original article: https://www.cnblogs.com/zhuyp1015/p/2529079.html