  • (Repost) Setting Up Nagios Monitoring on CentOS

    A. Nagios server
    1. Install packages

        yum install -y httpd

    2. Download Nagios

        wget http://syslab.comsenz.com/downloads/linux/nagios-3.0.5.tar.gz
        wget http://syslab.comsenz.com/downloads/linux/nagios-plugins-1.4.13.tar.gz
        wget http://syslab.comsenz.com/downloads/linux/nrpe-2.12.tar.gz

    3. Add the nagios account

        useradd nagios

    4. Compile and install Nagios

        mkdir -p /opt/hadoop/
        tar -xzvf nagios-3.0.5.tar.gz
        cd nagios-3.0.5
        ./configure --prefix=/opt/hadoop/nagios
        make all
        make fullinstall
        mkdir -p /opt/hadoop/nagios/etc
        mkdir -p /opt/hadoop/nagios/etc/objects
        cp ./sample-config/cgi.cfg /opt/hadoop/nagios/etc/
        cp ./sample-config/nagios.cfg /opt/hadoop/nagios/etc/
        cp ./sample-config/resource.cfg /opt/hadoop/nagios/etc/
        cp ./sample-config/template-object/commands.cfg /opt/hadoop/nagios/etc/objects/
        cp ./sample-config/template-object/contacts.cfg /opt/hadoop/nagios/etc/objects/
        cp ./sample-config/template-object/timeperiods.cfg /opt/hadoop/nagios/etc/objects/
        cp ./sample-config/template-object/templates.cfg /opt/hadoop/nagios/etc/objects/
        cp ./sample-config/template-object/localhost.cfg /opt/hadoop/nagios/etc/objects/
        touch /opt/hadoop/nagios/var/nagios.log
        chmod -R 755 /opt/hadoop/nagios/etc/
        chown -R nagios:nagios /opt/hadoop/nagios
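    The five object-config copy commands above can be folded into a loop. The following is a minimal sketch, not part of the original procedure; the function name copy_object_configs is hypothetical, and it assumes the same sample-config layout used above.

```shell
#!/bin/bash
# Hypothetical helper: copy the sample object configs named in the steps above.
# $1 = template directory (e.g. ./sample-config/template-object)
# $2 = destination directory (e.g. /opt/hadoop/nagios/etc/objects)
copy_object_configs() {
    local src=$1 dst=$2 name
    mkdir -p "$dst" || return 1
    for name in commands contacts timeperiods templates localhost; do
        # Stop at the first missing file so a partial copy is noticed.
        cp "$src/$name.cfg" "$dst/" || return 1
    done
}
```

    From inside the nagios-3.0.5 source directory it would be called as: copy_object_configs ./sample-config/template-object /opt/hadoop/nagios/etc/objects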

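    5. Compile and install nagios-plugins

        cd ..    # return to the directory that holds the downloaded tarballs
        tar zxvf nagios-plugins-1.4.13.tar.gz
        cd nagios-plugins-1.4.13
        ./configure --prefix=/opt/hadoop/nagios --with-nagios-user=nagios --with-nagios-group=nagios
        make && make install

    To check that the installation succeeded, confirm that this directory contains the plugin files:

        ls /opt/hadoop/nagios/libexec/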

    6. Install NRPE

        cd ..    # return to the directory that holds the downloaded tarballs
        tar zxvf nrpe-2.12.tar.gz
        cd nrpe-2.12
        ./configure --prefix=/opt/hadoop/nagios --enable-ssl --enable-command-args
        make all
        make install-plugin
        make install-daemon
        make install-daemon-config

    7. Configure httpd
    Add a web account:

        htpasswd -c /opt/hadoop/nagios/etc/htpasswd.users nagiosadmin

    B. Nagios client
    1. Prepare the packages

        wget http://syslab.comsenz.com/downloads/linux/nagios-plugins-1.4.13.tar.gz
        wget http://syslab.comsenz.com/downloads/linux/nrpe-2.12.tar.gz

    2. Add the nagios account and prepare the installation directory

        mkdir -p /opt/hadoop/nagios
        useradd nagios

    3. Compile and install NRPE

        tar -xzvf nrpe-2.12.tar.gz
        cd nrpe-2.12
        ./configure --prefix=/opt/hadoop/nagios --enable-ssl --enable-command-args
        make all
        make install-plugin
        make install-daemon
        make install-daemon-config

    4. Install nagios-plugins

        cd ..    # return to the directory that holds the downloaded tarballs
        tar -xzvf nagios-plugins-1.4.13.tar.gz
        cd nagios-plugins-1.4.13
        ./configure --prefix=/opt/hadoop/nagios --with-nagios-user=nagios --with-nagios-group=nagios
        make && make install

    To check that the installation succeeded, confirm that this directory contains the plugin files:

        ls /opt/hadoop/nagios/libexec/

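    5. Configure NRPE

        vim /opt/hadoop/nagios/etc/nrpe.cfg

    Find allowed_hosts=127.0.0.1 and change it to allowed_hosts=127.0.0.1,10.130.2.72; the second address is the IP of the Nagios server.
    Find dont_blame_nrpe=0 and change it to dont_blame_nrpe=1, which lets the server pass arguments to NRPE commands.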
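    The two edits above can also be applied without opening an editor. The following is a minimal sketch under stated assumptions: the function name configure_nrpe is hypothetical, and it assumes both settings already appear uncommented at the start of a line, as they do in the stock nrpe.cfg. It keeps a backup in nrpe.cfg.bak.

```shell
#!/bin/bash
# Hypothetical helper: apply the two nrpe.cfg edits described above.
# $1 = path to nrpe.cfg, $2 = IP address of the Nagios server
configure_nrpe() {
    local cfg=$1 server_ip=$2
    # Keep a backup, then rewrite both settings from the backup into the
    # original file so its ownership and permissions are preserved.
    cp "$cfg" "$cfg.bak" || return 1
    sed -e "s/^allowed_hosts=.*/allowed_hosts=127.0.0.1,$server_ip/" \
        -e "s/^dont_blame_nrpe=.*/dont_blame_nrpe=1/" \
        "$cfg.bak" > "$cfg"
}
```

    For the setup in this article the call would be: configure_nrpe /opt/hadoop/nagios/etc/nrpe.cfg 10.130.2.72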

    6. An NRPE start/stop script, saved as /etc/init.d/nrpe

        #!/bin/bash
        #
        # chkconfig: 2345 55 25
        # description: NRPE Daemon
        #

        # Source the init function library (provides success, failure, status)
        . /etc/rc.d/init.d/functions

        RETVAL=0

        prog='nrpe'
        NRPE_CFG='/opt/hadoop/nagios/etc/nrpe.cfg'
        NRPE_PRG='/opt/hadoop/nagios/bin/nrpe'
        NRPE_OPT='-d'
        PID_FILE='/var/run/nrpe.pid'

        start()
        {
            echo -n $"Starting $prog: "
            # Remove a stale PID file before starting
            [ -f "$PID_FILE" ] && rm -f "$PID_FILE"
            "$NRPE_PRG" -c "$NRPE_CFG" $NRPE_OPT
            # Record the PID of the running daemon
            pid=$(ps aux | grep -v grep | grep "$NRPE_PRG" | awk '{print $2}')
            echo $pid > "$PID_FILE"

            if ps aux | grep -v grep | grep -q "$NRPE_PRG"; then
                RETVAL=0
                success
            else
                RETVAL=1
                failure
            fi
            echo
        }

        stop()
        {
            echo -n $"Stopping $prog: "
            # Kill the daemon only if the recorded PID is still alive
            ps --pid=$(cat "$PID_FILE" 2>/dev/null) &>/dev/null
            if [ $? -eq 0 ]; then
                kill -9 $(cat "$PID_FILE")
                RETVAL=0
            fi
            success
            echo
            RETVAL=0
        }

        case "$1" in
            start)
                start
                ;;
            stop)
                stop
                ;;
            restart)
                stop
                start
                ;;
            status)
                status -p "$PID_FILE" $prog
                RETVAL=$?
                ;;
            *)
                echo $"Usage: $0 {start|stop|restart|status}"
                RETVAL=1
                ;;
        esac
        exit $RETVAL
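    The stop function in the script relies on ps to decide whether the recorded PID is still alive. A common alternative is to send signal 0 with kill, which delivers nothing and only reports whether the process exists. The following is a minimal sketch; the function name pid_alive is hypothetical and is not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the PID recorded in a PID file is alive.
# $1 = PID file path (the script above uses /var/run/nrpe.pid)
pid_alive() {
    local pid_file=$1 pid
    [ -f "$pid_file" ] || return 1
    # read fails on a file without a trailing newline, so also accept
    # the case where it still managed to fill the variable.
    read -r pid < "$pid_file" || [ -n "$pid" ] || return 1
    # Reject empty or non-numeric content before signalling.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 sends nothing; it only checks that the process exists
    # and that the caller is allowed to signal it.
    kill -0 "$pid" 2>/dev/null
}
```

    Inside stop, a call such as pid_alive "$PID_FILE" could replace the ps test followed by the $? comparison.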

    7. Start NRPE

        chmod +x /etc/init.d/nrpe
        /etc/init.d/nrpe start

    C. Adding a monitored host on the Nagios server
    1. Configure the directory for monitored hosts

        mkdir /opt/hadoop/nagios/etc/servers
        vim /opt/hadoop/nagios/etc/nagios.cfg    # append: cfg_dir=/opt/hadoop/nagios/etc/servers
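    If this step is scripted, the cfg_dir line should be appended only once, however many times the script runs. The following is a minimal sketch; the function name append_once is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a file unless it is already present.
# $1 = file, $2 = exact line to add
append_once() {
    local file=$1 line=$2
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

    For this article the call would be: append_once /opt/hadoop/nagios/etc/nagios.cfg 'cfg_dir=/opt/hadoop/nagios/etc/servers'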

    2. Add the host to be monitored

        vim /opt/hadoop/nagios/etc/servers/10.130.2.22.cfg

        define host{
            use         linux-server
            host_name   10.130.2.22
            alias       10.130.2.22
            address     10.130.2.22
        }
        define service{
            use                     generic-service
            host_name               10.130.2.22
            service_description     check_ping
            check_command           check_ping!100.0,20%!200.0,50%
            max_check_attempts      5
            normal_check_interval   1
        }
        define service{
            use                     generic-service
            host_name               10.130.2.22
            service_description     check_ssh
            check_command           check_ssh
            max_check_attempts      5
            normal_check_interval   1
        }
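    When several hosts need the same ping and SSH checks, the file above can be generated instead of typed by hand. The following is a minimal sketch; the function name print_host_cfg is hypothetical, and it reuses the host IP as host_name and alias exactly as the file above does.

```shell
#!/bin/bash
# Hypothetical helper: print a host definition plus ping and SSH checks
# for one IP address, mirroring the file shown above.
# $1 = IP address of the monitored host
print_host_cfg() {
    local ip=$1
    cat <<EOF
define host{
    use         linux-server
    host_name   $ip
    alias       $ip
    address     $ip
}
define service{
    use                     generic-service
    host_name               $ip
    service_description     check_ping
    check_command           check_ping!100.0,20%!200.0,50%
    max_check_attempts      5
    normal_check_interval   1
}
define service{
    use                     generic-service
    host_name               $ip
    service_description     check_ssh
    check_command           check_ssh
    max_check_attempts      5
    normal_check_interval   1
}
EOF
}
```

    Redirecting the output, for example print_host_cfg 10.130.2.22 > /opt/hadoop/nagios/etc/servers/10.130.2.22.cfg, produces the same definitions as the hand-written file.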

    3. Reload the Nagios server so the configuration takes effect

        service nagios reload
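    A reload with a broken configuration can leave monitoring down, so it can help to run the Nagios pre-flight check (nagios -v) first. The following is a minimal sketch; the function name reload_if_valid is hypothetical, and the default paths are assumptions based on the install prefix used in this article.

```shell
#!/bin/bash
# Paths follow the install prefix used above; override via the environment.
NAGIOS_BIN=${NAGIOS_BIN:-/opt/hadoop/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/opt/hadoop/nagios/etc/nagios.cfg}

# Hypothetical helper: reload only if the configuration verifies cleanly.
reload_if_valid() {
    # nagios -v parses the whole configuration and exits non-zero on errors.
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" > /dev/null; then
        service nagios reload
    else
        echo "Configuration check failed; not reloading." >&2
        return 1
    fi
}
```

    Running the verification command by hand without the redirect shows the detailed error messages when the check fails.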

    After Nagios reloads, the newly monitored host appears in the Nagios web interface.
    4. Add NRPE-based monitoring

    Add the following lines to /opt/hadoop/nagios/etc/objects/commands.cfg:

        define command{
            command_name    check_nrpe
            command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

    Add the following lines to the host's monitoring configuration file, and make sure the NRPE service on the monitored host is running:

        define service{
            use                     generic-service
            host_name               10.130.2.22
            service_description     check_load
            check_command           check_nrpe!check_load
            max_check_attempts      5
            normal_check_interval   1
        }
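    Every NRPE-backed check follows the same pattern: the service's check_command is check_nrpe! followed by a command name defined in the monitored host's nrpe.cfg. The following is a minimal sketch that prints such a block; the function name print_nrpe_service is hypothetical, and the attempt and interval values are copied from the definition above.

```shell
#!/bin/bash
# Hypothetical helper: print one NRPE-backed service definition.
# $1 = host_name, $2 = NRPE command name defined in the host's nrpe.cfg
print_nrpe_service() {
    local host=$1 nrpe_cmd=$2
    cat <<EOF
define service{
    use                     generic-service
    host_name               $host
    service_description     $nrpe_cmd
    check_command           check_nrpe!$nrpe_cmd
    max_check_attempts      5
    normal_check_interval   1
}
EOF
}
```

    For example, print_nrpe_service 10.130.2.22 check_load prints the same definition as the one written by hand above.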

    Reload Nagios so the configuration takes effect:

        service nagios reload

    5. Custom monitoring script
    Write the script check_diskmount.sh:

        vim /opt/hadoop/nagios/libexec/check_diskmount.sh

        #!/bin/bash
        # Count mounted filesystems whose entry contains '/disk'
        num=$(grep -c '/disk' /proc/mounts)
        if [ "$num" -eq 12 ]; then
            echo "OK - mount disk is $num"
            exit 0
        else
            echo "CRITICAL - mount disk is $num"
            # Nagios treats exit status 2 as CRITICAL (1 would be WARNING)
            exit 2
        fi
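    Nagios reads a plugin's state from its exit status: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The following is a minimal sketch of the same check as a function, so it can be tried against sample data; the function name check_mount_count and its three parameters are hypothetical and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical variant of the check above, with the mounts file, pattern,
# and expected count passed as arguments.
# $1 = mounts file (normally /proc/mounts)
# $2 = pattern to count (the script above uses /disk)
# $3 = expected number of matching mounts (the script above uses 12)
check_mount_count() {
    local mounts_file=$1 pattern=$2 expected=$3 num
    if [ ! -r "$mounts_file" ]; then
        echo "UNKNOWN - cannot read $mounts_file"
        return 3
    fi
    # grep -c prints 0 (and exits non-zero) when nothing matches.
    num=$(grep -c -- "$pattern" "$mounts_file")
    if [ "$num" -eq "$expected" ]; then
        echo "OK - mount disk is $num"
        return 0
    fi
    echo "CRITICAL - mount disk is $num"
    return 2
}
```

    As a plugin, the script would end with check_mount_count /proc/mounts /disk 12 followed by exit $? so that the status reaches NRPE.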

    Make the script executable:

        chmod +x /opt/hadoop/nagios/libexec/check_diskmount.sh

    Register the custom script in nrpe.cfg on the monitored host:

        vim /opt/hadoop/nagios/etc/nrpe.cfg

        command[check_diskmount]=/opt/hadoop/nagios/libexec/check_diskmount.sh

    Restart NRPE:

        /etc/init.d/nrpe restart

    Add the service definition on the Nagios server:

        vim /opt/hadoop/nagios/etc/servers/10.130.2.22.cfg

        define service{
            use                     generic-service
            host_name               10.130.2.22
            service_description     check_diskmount
            check_command           check_nrpe!check_diskmount
            max_check_attempts      3
            normal_check_interval   1
        }

    Reload Nagios so the configuration takes effect:

        service nagios reload

    Excerpted from: http://www.opstool.com/article/236

  • Original article: https://www.cnblogs.com/newmanzhang/p/3270094.html