  • Tracking down the mongodb error "failed to create thread after accepting new connection, closing connection"

    Incident

    All mongodb data nodes on the lxc host 10.11.164.28 reported the same error at the same moment: failed to create thread after accepting new connection, closing connection

    Host OS: Oracle Linux 6.5; lxc version: 1.0.6

    Database version: mongodb 3.2.11

    Error log:

    2017-07-26T10:23:18.734+0800 I NETWORK  [initandlisten] failed to create thread after accepting new connection, closing connection
    2017-07-26T10:23:23.781+0800 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:44334 #5874 (14 connections now open)
    2017-07-26T10:23:23.781+0800 I NETWORK  [initandlisten] pthread_create failed: errno:11 Resource temporarily unavailable
    2017-07-26T10:23:23.781+0800 I NETWORK  [initandlisten] failed to create thread after accepting new connection, closing connection
    2017-07-26T10:23:26.670+0800 I NETWORK  [initandlisten] connection accepted from 192.168.4.206:54601 #5875 (14 connections now open)
    2017-07-26T10:23:26.670+0800 I NETWORK  [initandlisten] pthread_create failed: errno:11 Resource temporarily unavailable

    Identifying the problem

    Source file that emits these messages:

    ./mongodb-src-r3.2.16/src/mongo/util/net//message_server_port.cpp:            log() << "failed to create thread after accepting new connection, closing connection";

    ./mongodb-src-r3.2.16/src/mongo/util/net//message_server_port.cpp:            log() << "pthread_create failed: " << errnoWithDescription(failed) << endl;

        virtual void accepted(std::shared_ptr<Socket> psocket, long long connectionId) {
            ScopeGuard sleepAfterClosingPort = MakeGuard(sleepmillis, 2);
            std::unique_ptr<MessagingPortWithHandler> portWithHandler(
                new MessagingPortWithHandler(psocket, _handler, connectionId));
            if (!Listener::globalTicketHolder.tryAcquire()) {
                log() << "connection refused because too many open connections: "
                      << Listener::globalTicketHolder.used() << endl;
                return;
            }
            try {
    #ifndef __linux__  // TODO: consider making this ifdef _WIN32
                {
                    stdx::thread thr(stdx::bind(&handleIncomingMsg, portWithHandler.get()));
                    thr.detach();
                }
    #else
                pthread_attr_t attrs;  // declare the thread attribute object attrs
                pthread_attr_init(&attrs);  // initialize attrs
                pthread_attr_setdetachstate(&attrs, PTHREAD_CREATE_DETACHED);  // detached: the thread frees its own resources when it exits
                static const size_t STACK_SIZE =  // static constant of type size_t (an unsigned integer)
                    1024 * 1024;  // if we change this we need to update the warning
                struct rlimit limits;  // declare a struct rlimit named limits (explained below)
                verify(getrlimit(RLIMIT_STACK, &limits) == 0);  // read RLIMIT_STACK (maximum process stack size) into limits; the call must succeed
                if (limits.rlim_cur > STACK_SIZE) {  // if the soft stack limit is larger than STACK_SIZE, use the recommended 1 MB thread stack
                    size_t stackSizeToSet = STACK_SIZE;
    #if !__has_feature(address_sanitizer)
                    if (kDebugBuild)  // debug builds use half the stack size
                        stackSizeToSet /= 2;
    #endif
                    pthread_attr_setstacksize(&attrs, stackSizeToSet);  // set the thread stack size in attrs to stackSizeToSet
                } else if (limits.rlim_cur < 1024 * 1024) {  // if the soft stack limit is below 1 MB, log a warning
                    warning() << "Stack size set to " << (limits.rlim_cur / 1024)
                              << "KB. We suggest 1MB" << endl;
                }
     
                pthread_t thread;  // declare the thread handle
                int failed = pthread_create(&thread, &attrs, &handleIncomingMsg, portWithHandler.get());  // create the thread (handle, attributes, start routine, argument); returns 0 on success or an error number on failure
                pthread_attr_destroy(&attrs);  // release the resources held by attrs
                if (failed) {  // creation failed: log the error
                    log() << "pthread_create failed: " << errnoWithDescription(failed) << endl;  // errnoWithDescription(failed) is "errno:11 Resource temporarily unavailable" here
                    throw std::system_error(
                        std::make_error_code(std::errc::resource_unavailable_try_again));
                }
    #endif  // __linux__
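                portWithHandler.release();  // give up unique_ptr ownership; the new thread now owns the port
                sleepAfterClosingPort.Dismiss();  // cancel the guard so the 2 ms sleep is skipped on success
            } catch (...) {  // on any exception, release the connection ticket and log the failure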
                Listener::globalTicketHolder.release();
                log() << "failed to create thread after accepting new connection, closing connection";
            }
        }

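    Here limits is a struct rlimit, defined below. rlimit is a Linux system structure that describes the maximum amount of a resource a process may use while running, expressed as a soft limit and a hard limit.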

    struct rlimit {
    rlim_t rlim_cur;                       //soft limit
    rlim_t rlim_max;                       //hard limit
    };

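    Two functions read and set these limits:

    int getrlimit(int resource, struct rlimit *rlim);                                 // read the soft and hard limit of a resource for the calling process
    int setrlimit(int resource, const struct rlimit *rlim);                           // set the soft and hard limit of a resource for the calling process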
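    To make the getrlimit call concrete, here is a minimal sketch that reads the two limits relevant to this incident for the calling process. The helper names formatLimit, readLimit and printThreadRelatedLimits are made up for illustration and are not part of mongodb. Note that getrlimit reports RLIMIT_STACK in bytes, while ulimit -s and limits.conf use KB.

```cpp
#include <sys/resource.h>

#include <cstdio>
#include <string>

// Format one limit value, printing RLIM_INFINITY as "unlimited".
// (Hypothetical helper, not taken from the mongodb source.)
std::string formatLimit(rlim_t value) {
    if (value == RLIM_INFINITY) return "unlimited";
    return std::to_string(static_cast<unsigned long long>(value));
}

// Read the soft and hard limit of one resource for the calling process.
// Returns false if getrlimit fails.
bool readLimit(int resource, struct rlimit* out) {
    return getrlimit(resource, out) == 0;
}

// Print the stack and process-count limits of the calling process.
void printThreadRelatedLimits() {
    struct rlimit lim{};
    if (readLimit(RLIMIT_STACK, &lim)) {
        std::printf("stack (bytes): soft=%s hard=%s\n",
                    formatLimit(lim.rlim_cur).c_str(),
                    formatLimit(lim.rlim_max).c_str());
    }
    if (readLimit(RLIMIT_NPROC, &lim)) {
        std::printf("nproc:         soft=%s hard=%s\n",
                    formatLimit(lim.rlim_cur).c_str(),
                    formatLimit(lim.rlim_max).c_str());
    }
}
```

    Calling printThreadRelatedLimits() from a small program started as the mongodb user inside the container shows the limits a process really inherits, which can differ from what the configuration files suggest.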

    The error means that mongod could not create a new thread to serve the newly accepted connection.
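    The failure path can be reproduced in isolation. The sketch below is not mongodb code: it starts one thread with an explicit stack size, as the excerpt above does, but joins it instead of detaching so that the result can be observed. tryCreateThread, noopWorker and describeError are hypothetical names. pthread functions return the error number directly rather than setting errno, which is why the log shows errno:11 (EAGAIN on Linux).

```cpp
#include <pthread.h>

#include <cstddef>
#include <cstring>
#include <string>

// Thread body for the sketch: does nothing and exits immediately.
static void* noopWorker(void*) {
    return nullptr;
}

// Try to start one thread with the given stack size and wait for it.
// Returns 0 on success, otherwise the error number returned by the
// failing pthread call (for example EAGAIN when a limit is reached).
int tryCreateThread(std::size_t stackSize) {
    pthread_attr_t attrs;
    int rc = pthread_attr_init(&attrs);
    if (rc != 0) return rc;

    rc = pthread_attr_setstacksize(&attrs, stackSize);
    if (rc == 0) {
        pthread_t thread;
        rc = pthread_create(&thread, &attrs, &noopWorker, nullptr);
        if (rc == 0) rc = pthread_join(thread, nullptr);
    }
    pthread_attr_destroy(&attrs);
    return rc;
}

// Render an error number in the same shape as the mongod log line,
// e.g. "errno:11 Resource temporarily unavailable".
std::string describeError(int err) {
    return "errno:" + std::to_string(err) + " " + std::strerror(err);
}
```

    Calling tryCreateThread(1024 * 1024) in a loop without joining, as a user that is close to its nproc limit, should eventually return a non-zero value that describeError turns into the same text as in the log.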

    Confirming the cause

    Stack-related settings inside the lxc container:

    ulimit -s    
    stack size                        The maximum stack size
     
    cat /etc/security/limits.conf
    mongodb           soft    nproc   4096            max number of processes
    mongodb           hard    nproc   16384           max number of processes
    mongodb           soft    nofile  131072          max number of open file descriptors
    mongodb           hard    nofile  131072          max number of open file descriptors
    mongodb           soft    stack   1024            max stack size (KB)
    mongodb           hard    stack   1024            max stack size (KB)

    Stack-related settings on the host:

    stack size              (kbytes, -s) 8192
     
    # End of file
    *           soft    nproc   65536
    *           hard    nproc   65536
    *           soft    nofile  131072
    *           hard    nofile  131072

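    I tried changing the container's nproc limit, but it had no effect.

    In addition, the 6.x Linux releases (Oracle Linux 6.5 here) introduced another configuration file, /etc/security/limits.d/90-nproc.conf.

    On the host: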

    # cat /etc/security/limits.d/90-nproc.conf
    # Default limit for number of user's processes to prevent
    # accidental fork bombs.
    # See rhbz #432903 for reasoning.
    *          soft    nproc     65536
    root       soft    nproc     unlimited

    In the container:

    #/etc/security/limits.d/90-nproc.conf
    # Default limit for number of user's processes to prevent
    # accidental fork bombs.
    # See rhbz #432903 for reasoning.
    *          soft    nproc     1024
    root       soft    nproc     unlimited

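    After raising the container's soft nproc limit for * (all users) to 10240, mongodb accepted new connections normally again.

    Conclusion: thread creation is constrained by two configuration files, /etc/security/limits.conf and /etc/security/limits.d/90-nproc.conf, and of course by the host's configuration as well. On Linux the nproc limit counts threads as well as processes, which is why reaching it makes pthread_create fail with errno 11.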
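    Because two files contribute to the final value, it can help to check the limits a running mongod actually has rather than reading the configuration. On Linux these are exposed in /proc/<pid>/limits, a file the original investigation does not mention. The following sketch parses that layout; LimitRow, findLimit and readProcessLimit are hypothetical names.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Soft and hard column of one row; values stay strings because a
// column may contain the word "unlimited".
struct LimitRow {
    bool found = false;
    std::string soft;
    std::string hard;
};

// Scan text laid out like /proc/<pid>/limits for the row whose line
// starts with `name` (e.g. "Max processes") and return its soft and
// hard columns. `found` stays false if the row is missing or malformed.
LimitRow findLimit(std::istream& in, const std::string& name) {
    LimitRow row;
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, name.size(), name) != 0) continue;
        std::istringstream rest(line.substr(name.size()));
        if (rest >> row.soft >> row.hard) row.found = true;
        return row;
    }
    return row;
}

// Convenience wrapper: read one limit of a live process by pid.
// If the file cannot be opened, the result has found == false.
LimitRow readProcessLimit(long pid, const std::string& name) {
    std::ifstream in("/proc/" + std::to_string(pid) + "/limits");
    return findLimit(in, name);
}
```

    For example, readProcessLimit(pid, "Max processes") with the pid of a mongod process returns the nproc values that process is really running with, which makes it easy to see whether a change to either file has taken effect after a restart.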

  • Original article: https://www.cnblogs.com/wyett/p/7458651.html