  • Distributed deployment of the Nagios monitoring system with Puppet 3.7.4 on CentOS 6.6

    Test environment

    CentOS-6.6-x86_64 (minimal)
    puppet-3.7.4
    nagios-4.0.8.tar.gz
    nagios-plugins-2.0.3.tar.gz
    nrpe-2.15.tar.gz

    192.168.188.10 mirrors.redking.com
    192.168.188.20 master.redking.com
    192.168.188.20 nagios.redking.com
    192.168.188.31 agent1.redking.com
    192.168.188.32 agent2.redking.com
    192.168.188.33 agent3.redking.com

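    Puppet requires every machine to have a fully qualified domain name (FQDN). If no DNS server provides the names, set the hostname on each machine yourself. Set the hostname before installing Puppet, because Puppet embeds the hostname in the SSL certificate it generates, and the client and server need that certificate to communicate. I have DNS configured, so I did not need to touch the hosts file; without DNS, the names must be added to /etc/hosts.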
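
    Where no DNS server is available, the addresses listed in the test environment can be added to the hosts file on every machine. The following is a minimal sketch, not part of the original setup: `add_host_entry` is a hypothetical helper, and the demonstration writes to a scratch file instead of the real /etc/hosts. It appends an entry only when the name is not already present, so it is safe to run repeatedly.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Appends "IP NAME" to FILE unless NAME already appears as a host name
# (any field after the address) on some line.
# Hypothetical helper for illustration; pass /etc/hosts as FILE (as root) for real use.
add_host_entry() {
    file=$1 ip=$2 name=$3
    if ! awk -v n="$name" '
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
         ' "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Demonstration on a scratch file
hosts_file=$(mktemp)
add_host_entry "$hosts_file" 192.168.188.20 master.redking.com
add_host_entry "$hosts_file" 192.168.188.31 agent1.redking.com
add_host_entry "$hosts_file" 192.168.188.31 agent1.redking.com   # no-op: already present
cat "$hosts_file"
rm -f "$hosts_file"
```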

    1. Disable SELinux and iptables, and set up NTP

    CentOS was installed from CentOS-6.6-x86_64.iso using the minimal installation option.

    Disable SELinux

    [root@master ~]# cat /etc/selinux/config
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    # enforcing - SELinux security policy is enforced.
    # permissive - SELinux prints warnings instead of enforcing.
    # disabled - No SELinux policy is loaded.
    SELINUX=enforcing
    # SELINUXTYPE= can take one of these two values:
    # targeted - Targeted processes are protected,
    # mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    [root@master ~]# sed -i '/SELINUX/ s/enforcing/disabled/g' /etc/selinux/config
    [root@master ~]# cat /etc/selinux/config
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    # enforcing - SELinux security policy is enforced.
    # permissive - SELinux prints warnings instead of enforcing.
    # disabled - No SELinux policy is loaded.
    SELINUX=disabled
    # SELINUXTYPE= can take one of these two values:
    # targeted - Targeted processes are protected,
    # mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    [root@master ~]# setenforce 0
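
    The sed expression used here rewrites `enforcing` on any line that contains `SELINUX`. A slightly stricter variant is sketched below: it anchors on the `SELINUX=` assignment itself, so comment lines and `SELINUXTYPE=` can never be touched. The function name `set_selinux_mode` is an assumption made for this example, and the demonstration runs on a scratch file rather than the real /etc/selinux/config.

```shell
#!/bin/sh
# set_selinux_mode FILE MODE
# Rewrites only the line that starts with "SELINUX=";
# comments and the SELINUXTYPE= line are left alone.
# Uses GNU sed's -i option, as the original command does.
set_selinux_mode() {
    sed -i "s/^SELINUX=.*/SELINUX=$2/" "$1"
}

# Demonstration on a scratch copy
cfg=$(mktemp)
printf '%s\n' \
    '# SELINUX= can take one of these three values:' \
    'SELINUX=enforcing' \
    'SELINUXTYPE=targeted' > "$cfg"
set_selinux_mode "$cfg" disabled
cat "$cfg"
rm -f "$cfg"
```

    Note that `setenforce 0` only switches the running system to permissive mode; the `disabled` setting in the file takes effect at the next reboot.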

    Stop iptables

    [root@node1 ~]# chkconfig --list |grep tables
    ip6tables 0:off 1:off 2:on 3:on 4:on 5:on 6:off
    iptables 0:off 1:off 2:on 3:on 4:on 5:on 6:off
    [root@node1 ~]# chkconfig ip6tables off
    [root@node1 ~]# chkconfig iptables off
    [root@node1 ~]# service ip6tables stop
    ip6tables: Setting chains to policy ACCEPT: filter [ OK ]
    ip6tables: Flushing firewall rules: [ OK ]
    ip6tables: Unloading modules: [ OK ]
    [root@node1 ~]# service iptables stop
    iptables: Setting chains to policy ACCEPT: filter [ OK ]
    iptables: Flushing firewall rules: [ OK ]
    iptables: Unloading modules: [ OK ]
    [root@node1 ~]#

    Set up NTP

    [root@master ~]# ntpdate pool.ntp.org
    [root@master ~]# chkconfig --list|grep ntp
    ntpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
    ntpdate 0:off 1:off 2:off 3:off 4:off 5:off 6:off
    [root@master ~]# chkconfig ntpd on
    [root@master ~]# service ntpd start
    Starting ntpd: [ OK ]
    [root@master ~]#

    2. Install the Puppet services

    Puppet is not in the CentOS base repositories, so add the official repository provided by PuppetLabs:

    [root@master ~]# wget http://yum.puppetlabs.com/el/6/products/x86_64/puppetlabs-release-6-7.noarch.rpm
    [root@master ~]# rpm -ivh puppetlabs-release-6-7.noarch.rpm
    [root@master ~]# yum update -y

    Install and enable the Puppet services on the master:

    [root@master ~]# yum install -y puppet-server
    [root@master ~]# chkconfig puppet on
    [root@master ~]# chkconfig puppetmaster on
    [root@master ~]# service puppet start
    Starting puppet agent:                                     [  OK  ]
    [root@master ~]# service puppetmaster start
    Starting puppetmaster:                                     [  OK  ]
    [root@master ~]#

    Install the Puppet client on each client:

    [root@agent1 ~]# yum install -y puppet
    [root@agent1 ~]# chkconfig puppet on
    [root@agent1 ~]# service puppet start

    3. Configure Puppet

    On each Puppet client, edit /etc/puppet/puppet.conf to point it at the master server, then restart the Puppet service:

    [root@agent1 ~]# service puppet restart
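
    As a rough illustration of that edit, the sketch below adds a `server` line under the `[agent]` section of a puppet.conf file. This is an assumption about what the change typically looks like, not the author's exact file: `set_agent_server` is a hypothetical helper, it expects an `[agent]` section to exist already, and the demonstration works on a scratch file instead of /etc/puppet/puppet.conf.

```shell
#!/bin/sh
# set_agent_server CONF SERVER
# Removes any existing "server = ..." line from CONF and adds a fresh one
# directly under the [agent] header. If CONF has no [agent] section,
# nothing is added. Hypothetical helper for illustration only.
set_agent_server() {
    conf=$1 server=$2 tmp=$(mktemp)
    awk -v s="$server" '
        /^[ \t]*server[ \t]*=/ { next }            # drop a previous server line
        { print }
        /^\[agent\]/ { print "    server = " s }   # add ours under [agent]
    ' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Demonstration on a scratch file
demo_conf=$(mktemp)
printf '%s\n' \
    '[main]' \
    '    logdir = /var/log/puppet' \
    '[agent]' \
    '    classfile = $vardir/classes.txt' > "$demo_conf"
set_agent_server "$demo_conf" master.redking.com
cat "$demo_conf"
rm -f "$demo_conf"
```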

    4. Client certificate requests

    Automatic certificate signing on the server

    To have the master sign every certificate automatically, it is enough to create an autosign.conf file in /etc/puppet. (There is no need to edit /etc/puppet/puppet.conf, because I have not changed the default location of autosign.conf.)

    [root@master ~]# cat > /etc/puppet/autosign.conf <<EOF
    > *.redking.com
    > EOF
    [root@master ~]# service puppetmaster restart
    Stopping puppetmaster:                                     [  OK  ]
    Starting puppetmaster:                                     [  OK  ]
    [root@master ~]#

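    With this in place, requests from every machine in redking.com are signed automatically.

    The client has to ask the server to manage it, which is really a certificate-signing process. The first time the Puppet client runs, it generates an SSL certificate request and sends it to the Puppet server; if the server agrees to manage the client, it signs the certificate. The following command submits the request. Because the server address is already set on the client, it does not need to be given on the command line.

    [root@agent1 ~]# puppet agent --test

    This requests the certificate. Because I configured automatic signing, it is signed immediately. To list the certificates, run the following on the server:

    [root@master ~]# puppet cert list --all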

    Nagios server installation

    1. Install the packages Nagios depends on

    [root@master ~]# yum install -y httpd php gcc glibc glibc-common gd gd-devel openssl-devel

    2. Create the Nagios user and groups

    [root@master ~]# useradd -m nagios
    [root@master ~]# passwd nagios

    Create the nagcmd group, which is allowed to run commands submitted through the web interface, and add the nagios and apache users to it:

    [root@master ~]# groupadd nagcmd
    [root@master ~]# usermod -a -G nagcmd nagios
    [root@master ~]# usermod -a -G nagcmd apache

    3. Download the Nagios and Plugins packages

    Download Nagios Core and Nagios Plugins from http://www.nagios.org/download/

    4. Compile and install Nagios

    [root@master tmp]# tar zxf nagios-4.0.8.tar.gz
    [root@master tmp]# cd nagios-4.0.8

    # Run the Nagios configure script, passing the nagcmd group created earlier as the command group

    [root@master nagios-4.0.8]# ./configure --with-command-group=nagcmd

    # Compile the Nagios source code

    [root@master nagios-4.0.8]# make all

    # Install the binaries, the init script, and the sample configuration files, and set permissions on the external command directory

    [root@master nagios-4.0.8]# make install
    [root@master nagios-4.0.8]# make install-init
    [root@master nagios-4.0.8]# make install-config
    [root@master nagios-4.0.8]# make install-commandmode

    5. Edit the configuration files

    The sample configuration files are installed in /usr/local/nagios/etc. You can change the contact email address here:

    [root@master nagios-4.0.8]# vim /usr/local/nagios/etc/objects/contacts.cfg

    6. Configure the web interface

    Install the Nagios web configuration file into Apache's conf.d directory:

    [root@master nagios-4.0.8]# make install-webconf

    Create the nagiosadmin account for logging in to the Nagios web interface, then start httpd so the configuration takes effect and enable it at boot:

    [root@master nagios-4.0.8]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
    [root@master nagios-4.0.8]# service httpd start
    Starting httpd:                                            [  OK  ]
    [root@master nagios-4.0.8]# chkconfig httpd on

    7. Compile and install Nagios Plugins

    [root@master tmp]# tar zxvf nagios-plugins-2.0.3.tar.gz
    [root@master tmp]# cd nagios-plugins-2.0.3
    [root@master nagios-plugins-2.0.3]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    [root@master nagios-plugins-2.0.3]# make && make install

    8. Compile and install NRPE

    [root@master tmp]# tar zxvf nrpe-2.15.tar.gz
    [root@master tmp]# cd nrpe-2.15
    [root@master nrpe-2.15]# ./configure
    [root@master nrpe-2.15]# make all
    [root@master nrpe-2.15]# make install-plugin
    [root@master nrpe-2.15]# make install-daemon
    [root@master nrpe-2.15]# make install-daemon-config

    9. Start Nagios

    On the local host, the HTTP and SSH services show a warning about notifications. To fix it, enable notifications for both services and give Apache a page to serve:

    [root@master ~]# vim /usr/local/nagios/etc/objects/localhost.cfg

    # Define a service to check SSH on the local machine.
    # Disable notifications for this service by default, as not all users may have SSH enabled.
    define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           1                     ; change this value to 1
    }

    # Define a service to check HTTP on the local machine.
    # Disable notifications for this service by default, as not all users may have HTTP enabled.
    define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           1                     ; change this value to 1
    }

    [root@master ~]# touch /var/www/html/index.html

    Validate the configuration files before starting Nagios:

    [root@master ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

    Start Nagios and NRPE, and enable both at boot:

    [root@master ~]# chkconfig nagios --add
    [root@master ~]# chkconfig --list |grep nagios
    nagios          0:off   1:off   2:off   3:on    4:on    5:on    6:off
    [root@master ~]# chkconfig nagios on
    [root@master ~]# service nagios start
    Starting nagios: done.
    [root@master ~]# echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.d/rc.local
    [root@master nrpe-2.15]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
    [root@master nrpe-2.15]# netstat -tunpl |grep nrpe
    tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 70100/nrpe
    tcp 0 0 :::5666 :::* LISTEN 70100/nrpe
    [root@master nrpe-2.15]#

    Run /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 to check that the connection works.

    Log in to Nagios with the nagiosadmin account and password defined earlier, at http://192.168.188.20/nagios/

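    Setting up monitoring of the Nagios clients

    1. Install the required modules on the Puppet master

    Nagios currently has no official package repository, so for bulk deployment you can use the third-party EPEL repository together with the puppet-nrpe module from Example42 to roll NRPE out to Linux servers. Deploying the clients uses three modules: epel, nrpe, and puppi.

    The epel module sets up the EPEL repository from which the NRPE package is installed, the nrpe module deploys the agent that collects host information, and puppi is a component of the Example42 module set that must be loaded whenever an Example42 module is used.

    Puppi is a Puppet module and CLI command that standardizes and automates rapid application deployment, and provides quick, standard commands for querying and checking system resources.

    [root@master ~]# git clone https://github.com/puppetlabs/puppetlabs-stdlib /etc/puppet/modules/stdlib
    [root@master ~]# git clone https://github.com/example42/puppi /etc/puppet/modules/puppi
    [root@master ~]# git clone https://github.com/example42/puppet-nrpe /etc/puppet/modules/nrpe
    [root@master ~]# puppet module install stahnma/epel
    [root@master ~]# vim /etc/puppet/puppet.conf

    [master]
    modulepath = /etc/puppet/modules/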

    2. Create the node configuration file for the agent group

    [root@master ~]# mkdir /etc/puppet/manifests/nodes
    [root@master ~]# vim /etc/puppet/manifests/nodes/agentgroup.pp

    node /^agent\d+\.redking\.com$/ {
      include stdlib
      include epel
      class { 'puppi': }
      class { 'nrpe':
        require       => Class['epel'],
        allowed_hosts => ['127.0.0.1', $::ipaddress, '192.168.188.20'],
        template      => 'nrpe/nrpe.cfg.erb',
      }
    }

    [root@master ~]# vim /etc/puppet/manifests/site.pp

    import "nodes/agentgroup.pp"
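
    Before relying on a node pattern of the form /^agent\d+\.redking\.com$/, it can help to check which certnames it actually matches. The sketch below is a rough equivalent using `grep -E`, not a Puppet tool: POSIX extended regular expressions have no `\d`, so `[0-9]` stands in for it, and the function name `matches_agent_node` is made up for this example.

```shell
#!/bin/sh
# matches_agent_node NAME
# Succeeds (exit status 0) when NAME looks like agent<digits>.redking.com.
matches_agent_node() {
    printf '%s\n' "$1" | grep -Eq '^agent[0-9]+\.redking\.com$'
}

for name in agent1.redking.com agent12.redking.com master.redking.com agent1xredking.com; do
    if matches_agent_node "$name"; then
        echo "$name: matched"
    else
        echo "$name: not matched"
    fi
done
```

    The escaped dots matter: with an unescaped `.`, a name such as agent1xredking.com would also match.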

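    3. Configure Nagios to monitor the agent*.redking.com hosts

    Edit /usr/local/nagios/etc/objects/commands.cfg

    command_name check_nrpe
    This defines the command name as check_nrpe; the service definitions (services.cfg) must refer to it by this name.

    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    Here $USER1$ stands for /usr/local/nagios/libexec.

    The command_line defines the plugin program that is actually run. It must follow the usage of the check_nrpe command exactly; if you are unsure of the usage, run check_nrpe -h. The $ARG1$ parameter after -c is the check command handed to the NRPE daemon for execution, and it must be one of the commands defined in nrpe.cfg (five in the stock sample file).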

    [root@master ~]# vim /usr/local/nagios/etc/objects/commands.cfg

    # 'check_nrpe' command definition
    define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }
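
    To make the macro substitution concrete, the sketch below mimics how a service's check_command is split on `!` and how `$ARG1$` and `$HOSTADDRESS$` are filled into the command_line above. It is an illustration only, not Nagios code: the function name `expand_check_command` is hypothetical, and `$USER1$` is assumed to be /usr/local/nagios/libexec.

```shell
#!/bin/sh
# expand_check_command CHECK_COMMAND HOSTADDRESS
# Prints the command line that results from the definition
#   command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
expand_check_command() {
    check=$1 addr=$2
    rest=${check#*!}                      # text after the command name
    [ "$rest" = "$check" ] && rest=""     # no "!" at all: no arguments
    arg1=${rest%%!*}                      # first argument only ($ARG1$)
    printf '%s/check_nrpe -H %s -c %s\n' /usr/local/nagios/libexec "$addr" "$arg1"
}

expand_check_command 'check_nrpe!check_users!10!5' 192.168.188.31
```

    Because this command_line references only `$ARG1$`, any further `!`-separated values in a service's check_command are not forwarded to the agent; the thresholds come from the command definition in the client's nrpe.cfg.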

    Edit /usr/local/nagios/etc/nagios.cfg

    [root@master ~]# vim /usr/local/nagios/etc/nagios.cfg

    cfg_file=/usr/local/nagios/etc/objects/agent1.redking.com.cfg
    cfg_file=/usr/local/nagios/etc/objects/agent2.redking.com.cfg
    cfg_file=/usr/local/nagios/etc/objects/agent3.redking.com.cfg

    Add the configuration files agent1.redking.com.cfg through agent3.redking.com.cfg

    # vim /usr/local/nagios/etc/objects/agent1.redking.com.cfg

    define host{
        use             linux-server
        host_name       agent1.redking.com
        alias           agent1.redking.com
        address         192.168.188.31
    }

    define service{
        use                     generic-service
        host_name               agent1.redking.com
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
    }

    define service{
        use                     generic-service
        host_name               agent1.redking.com
        service_description     Current Users
        check_command           check_nrpe!check_users!10!5
    }

    define service{
        use                     generic-service
        host_name               agent1.redking.com
        service_description     Current Load
        check_command           check_nrpe!check_load!15,10,5!30,25,20
    }

    define service{
        use                     generic-service
        host_name               agent1.redking.com
        service_description     Swap Usage
        check_command           check_nrpe!check_swap!20!40
    }
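
    The files for agent2 and agent3 differ from agent1's only in host name and address, so they can be derived from it. The following is a minimal sketch under that assumption: `clone_host_cfg` is a hypothetical helper, and the demonstration runs in a scratch directory rather than /usr/local/nagios/etc/objects.

```shell
#!/bin/sh
# clone_host_cfg TEMPLATE OLD_NAME OLD_ADDR NEW_NAME NEW_ADDR
# Writes NEW_NAME.cfg next to TEMPLATE, with every occurrence of the
# old host name and old address replaced by the new ones.
clone_host_cfg() {
    dir=$(dirname "$1")
    old_name=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots for sed
    old_addr=$(printf '%s' "$3" | sed 's/\./\\./g')
    sed -e "s/$old_name/$4/g" -e "s/$old_addr/$5/g" "$1" > "$dir/$4.cfg"
}

# Demonstration in a scratch directory
objects=$(mktemp -d)
cat > "$objects/agent1.redking.com.cfg" <<'EOF'
define host{
    use             linux-server
    host_name       agent1.redking.com
    alias           agent1.redking.com
    address         192.168.188.31
}
EOF
clone_host_cfg "$objects/agent1.redking.com.cfg" agent1.redking.com 192.168.188.31 agent2.redking.com 192.168.188.32
clone_host_cfg "$objects/agent1.redking.com.cfg" agent1.redking.com 192.168.188.31 agent3.redking.com 192.168.188.33
cat "$objects/agent2.redking.com.cfg"
rm -rf "$objects"
```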

    Validate the Nagios configuration, then restart the services so the changes take effect:

    [root@master ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    [root@master ~]# service nagios restart
    [root@master ~]# service puppetmaster restart

    Test on the client

    [root@agent1 ~]# puppet agent --test

    This run deploys NRPE on the client automatically.

    Once NRPE has been deployed automatically on the clients, the Nagios monitoring interface shows the information collected from them.

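    The nrpe.cfg defined by the NRPE module contains a large number of check commands that can be used as they are, and you can also edit the nrpe.cfg.erb template yourself. For bulk deployment you can use either modules you write yourself or existing ones. Existing modules cover roughly 90% of day-to-day system administration tasks, and the remaining 10% can be customized to fit the production workload.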

    ========================END=================================

    http://redking.blog.51cto.com/27212/1612136

  • Original article: https://www.cnblogs.com/chen110xi/p/4290344.html