  • Zabbix error messages and problems encountered

    1. Zabbix alert: icmp pinger processes more than 75% busy

    [root@localhost zabbix]#  vi /etc/zabbix/zabbix_server.conf
    Set StartPingers=5, then restart the zabbix-server service.
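The same edit can be scripted instead of done by hand in vi. The following is a minimal sketch, not part of the original notes: `set_conf_value` is a hypothetical helper name, and it assumes the value contains no `|` or `&` characters. It rewrites an active `KEY=...` line or appends one, and leaves commented-out defaults such as `# StartPingers=1` alone.

```shell
# set_conf_value FILE KEY VALUE  (hypothetical helper, for illustration only)
# Replaces an active "KEY=..." line, or appends "KEY=VALUE" if none exists.
# Commented-out defaults like "# StartPingers=1" are not touched.
set_conf_value() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        # Go through a temp file so the sketch does not depend on "sed -i".
        tmp=$(mktemp) || return 1
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
        status=$?
        rm -f "$tmp"
        return $status
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (run as root), followed by the restart described above:
#   set_conf_value /etc/zabbix/zabbix_server.conf StartPingers 5
#   systemctl restart zabbix-server
```

Writing back with `cat "$tmp" > "$file"` rather than `mv` keeps the original file's owner and permissions.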
     

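    2. Zabbix unreachable poller processes more than 75% busy
    The unreachable poller processes stay busy all the time. What does that mean? According to the official documentation on Zabbix internal processes, "unreachable poller - poller for unreachable devices" is the process type that polls devices that are currently unreachable.

    Possible causes:
    1. A host monitored through the Zabbix agent is in the monitored state, but the machine has crashed or the agent has died for some other reason, so the server gets no data and unreachable poller usage rises.
    2. A host monitored through the Zabbix agent is in the monitored state, but the agent takes too long to answer and often exceeds the Timeout configured on the server, so unreachable poller usage rises.

    3. The MySQL database behind Zabbix is stalled, I/O on the Zabbix server is stalled, or the Zabbix processes do not have enough memory.

    A simple fix is to increase the number of processes that Zabbix Server pre-forks at startup. This adds polling capacity, so the share of time each process is busy goes down.

    [root@localhost zabbix]#  vi /etc/zabbix/zabbix_server.conf
    Set StartPollers=500, then restart the zabbix-server service. The number of unreachable pollers themselves is set by StartPollersUnreachable. Restarting the zabbix service on a schedule is another option.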


    Starting zabbix-server fails with the following errors:

    29171:20180714:084911.367 cannot start alert manager service: Cannot bind socket to "/var/run/zabbix/zabbix_server_alerter.sock": [13] Permission denied.
    29142:20180714:084911.368 One child process died (PID:29171,exitcode/signal:1). Exiting ...
    29225:20180714:084923.611 cannot start preprocessing service: Cannot bind socket to "/var/run/zabbix/zabbix_server_preprocessing.sock": [13] Permission denied.
    29213:20180714:084923.613 server #18 started [poller #2]
    29195:20180714:084923.614 One child process died (PID:29225,exitcode/signal:1). Exiting ...
    29195:20180714:084925.615 syncing history data...
    29195:20180714:084925.615 syncing history data done
    29195:20180714:084925.615 syncing trend data...
    29195:20180714:084925.615 syncing trend data done
    29195:20180714:084925.615 Zabbix Server stopped. Zabbix 3.4.10 (revision 81503).
       
       
      The cause is that SELinux is enabled and enforcing:
      sestatus
      SELinux status: enabled
      SELinuxfs mount: /sys/fs/selinux
      SELinux root directory: /etc/selinux
      Loaded policy name: targeted
      Current mode: enforcing
      Mode from config file: disabled
      Policy MLS status: enabled
      Policy deny_unknown status: allowed
      Max kernel policy version: 28

      Solution:
      vim /etc/selinux/config

      # This file controls the state of SELinux on the system.
      # SELINUX= can take one of these three values:
      # enforcing - SELinux security policy is enforced.
      # permissive - SELinux prints warnings instead of enforcing.
      # disabled - No SELinux policy is loaded.
      SELINUX=disabled
      # SELINUXTYPE= can take one of three two values:
      # targeted - Targeted processes are protected,
      # minimum - Modification of targeted policy. Only selected processes are protected.
      # mls - Multi Level Security protection.
      SELINUXTYPE=targeted

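      Setting SELINUX=disabled in the configuration file turns SELinux off permanently; the change takes effect after a reboot.
      setenforce 0 switches SELinux to permissive mode until the next reboot.
      You can also configure SELinux to allow Zabbix this access, which is not much trouble. SELinux is rarely used in this setup, so that approach is not covered here; search for it if you need it.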
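The sestatus output in this section shows "Current mode: enforcing" next to "Mode from config file: disabled", which is what you see when the config file has been edited but the machine has not been rebooted yet. As a rough sketch (the function name `selinux_mode_check` is made up for this example), the comparison can be automated by piping `sestatus` into a small awk filter:

```shell
# selinux_mode_check (hypothetical helper): reads `sestatus` output on stdin.
# Prints "mismatch" when the running mode differs from the configured one,
# "consistent" when they agree, and "unknown" when either line is missing
# (for example when SELinux is already disabled).
selinux_mode_check() {
    awk -F': *' '
        /^Current mode/          { current = $2 }
        /^Mode from config file/ { config = $2 }
        END {
            if (current == "" || config == "") print "unknown"
            else if (current == config)        print "consistent"
            else                               print "mismatch"
        }'
}

# Example:
#   sestatus | selinux_mode_check
```

A "mismatch" result means the edit to /etc/selinux/config is still pending, so zabbix-server will keep failing with Permission denied until you reboot or run setenforce 0.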






     

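    3. Zabbix alerter processes more than 75% busy
    Several hundred Zabbix alert messages arrived:
    Zabbix alerter processes more than 75% busy

    Possible causes:
    A problem with the Zabbix database
    I/O load on the Zabbix server
    Not enough memory for the Zabbix processes
    Network latency or loss of connectivity

    Fix:

    [root@localhost zabbix]# vim /etc/zabbix/zabbix_server.conf
    Raise StartPollers from its default of 5 (for example to 20; the value used here is 500):
    StartPollers=500
    The lines changed:
    # StartDiscoverers=1
    StartDiscoverers=100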
     

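    4. The zabbix-server service went down and stopped again by itself right after being started, with many of the following errors in the log

    Alerts raised:

    Zabbix value cache working in low memory mode
    Less than 25% free in the configuration cache

    [root@localhost zabbix]# cat /var/log/zabbix/zabbix_server.log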
    6278:20180320:190117.775 using configuration file: /etc/zabbix/zabbix_server.conf
    6278:20180320:190117.807 current database version (mandatory/optional): 03020000/03020001
    6278:20180320:190117.807 required mandatory version: 03020000
    6278:20180320:190118.378 __mem_malloc: skipped 0 asked 136 skip_min 4294967295 skip_max 0
    6278:20180320:190118.378 [file:dbconfig.c,line:653] zbx_mem_malloc(): out of memory (requested 136 bytes)
    6278:20180320:190118.378 [file:dbconfig.c,line:653] zbx_mem_malloc(): please increase CacheSize configuration parameter
    6354:20180320:190128.632 Starting Zabbix Server. Zabbix 3.2.10 (revision 74337).
     
    [root@localhost zabbix]# vi /etc/zabbix/zabbix_server.conf
    ### Option: CacheSize
    #       Size of configuration cache, in bytes.
    #       Shared memory size for storing host, item and trigger data.
    #
    # Mandatory: no
    # Range: 128K-8G
    # Default:
    # CacheSize=8M
    CacheSize=2048M
    
    [root@localhost zabbix]# systemctl restart zabbix-server
    Note: 700 hosts were added in bulk that day, which exhausted the configuration cache.
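The log excerpt in this section names the parameter to raise ("please increase CacheSize configuration parameter"). A small sketch for pulling that hint out of a long log follows; `cache_hints` is a hypothetical name, and it assumes the message wording matches the excerpt above.

```shell
# cache_hints LOGFILE (hypothetical helper): prints each distinct parameter
# that the server log asks to increase, one per line.
cache_hints() {
    sed -n 's/.*please increase \([A-Za-z]*\) configuration parameter.*/\1/p' "$1" | sort -u
}

# Example:
#   cache_hints /var/log/zabbix/zabbix_server.log
```

If the function prints nothing, the log contains no such hint and the restart loop probably has a different cause.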
     

     

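    5. The zabbix-server log reports connection to database 'zabbix' failed: [1040] Too many connections while MariaDB itself is running normally. This points to the MySQL maximum connection limit (max_connections).

    How to raise the MySQL maximum connection count: http://blog.51cto.com/net881004/2089198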
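Before picking a new max_connections value it helps to know roughly how many connections the server might open. The sketch below is only a first estimate and rests on an assumption not stated in the original notes: that each pre-forked server process may hold its own database connection. The function name `count_start_processes` is made up. It sums the active Start*=N lines in the config file; process types left at their commented-out defaults are not counted, nor are connections from the web frontend.

```shell
# count_start_processes FILE (hypothetical helper): sums the numeric values
# of all active "Start<Name>=N" lines. Commented-out lines are ignored.
count_start_processes() {
    awk -F= '/^Start[A-Za-z]+=[0-9]+/ { total += $2 } END { print total + 0 }' "$1"
}

# Example:
#   count_start_processes /etc/zabbix/zabbix_server.conf
```

If the result is close to or above the database's max_connections (151 by default in MySQL and MariaDB), either raise that limit or lower the Start* values.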

     

    6. Alerts "More than 100 items having missing data for more than 10 minutes" and "Zabbix poller processes more than 75% busy".

    Increase the number of processes and the cache sizes in the configuration file:

    [root@localhost zabbix]#  vim /usr/local/zabbix/etc/zabbix_server.conf
    StartPollers=500
    StartPollersUnreachable=50
    StartTrappers=30
    StartDiscoverers=6
    CacheSize=1G
    CacheUpdateFrequency=300
    StartDBSyncers=20
    HistoryCacheSize=512M
    TrendCacheSize=256M
    # HistoryTextCacheSize was removed in Zabbix 3.0; keep the next line only on older versions.
    HistoryTextCacheSize=80M
    ValueCacheSize=1G
     

    7. The server log is full of "first network error, wait for 15 seconds" messages

    Raise Timeout in the server configuration file. I set it to 30 seconds, the maximum allowed.

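    8. Zabbix alert "Zabbix poller processes more than 75% busy" (from another user)
    Causes:
    1. A process is stuck.
    2. Too many zombie processes are slowing things down.
    3. Network latency (negligible).
    4. Zabbix is using too much memory.

    Impact:
    An ordinary alert with no immediate harm, but it is best dealt with.

    Fix:
    Option 1: simple and crude (restart zabbix-server, optionally from a scheduled job)
    service zabbix-server restart
    Run crontab -e to open the cron editor and add a schedule:
    @daily service zabbix-server restart > /dev/null 2>&1

    Option 2: edit the Zabbix Server configuration file /etc/zabbix/zabbix_server.conf and find the StartPollers section: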
    ### Option: StartPollers
    #       Number of pre-forked instances of pollers.
    #
    # Mandatory: no
    # Range: 0-1000
    # Default:
    # StartPollers=5
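    Uncomment the StartPollers= line, or add a new line directly after it:
    StartPollers=10
    The right value for StartPollers depends on the server's performance and the number of monitored items. After setting StartPollers to 12 the alert never came back. With enough memory it can be set higher.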

  • Original article: https://www.cnblogs.com/liulj0713/p/10018921.html