  • open-falcon v0.2 monitoring deployment notes

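    Preface

    I am not entirely sure why I am writing this, since the official documentation is already quite detailed. Still, I wanted to write the steps down so that I do not forget them. While deploying, I found that the order of the official documentation did not match the way I prefer to work, so this record walks through the deployment again in the sequence I actually followed. The official documentation also leaves out a few details, which I note here. The architecture and design principles of open-falcon are not covered, because the official documentation already explains them well.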

    1. Environment preparation

    1.1 Base environment

    ### System environment
    [root@localhost ~]# cat /etc/redhat-release 
    CentOS Linux release 7.3.1611 (Core)
    
    ### Disable firewalld and SELinux
    setenforce 0
    sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
    systemctl stop firewalld
    systemctl disable firewalld
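
Note that `setenforce 0` only switches the running system to permissive mode, while the `sed` edit takes effect after the next reboot. A minimal sketch for confirming what the config file will apply; `selinux_config_mode` is a hypothetical helper name, not part of any package:

```shell
# Hypothetical helper: print the SELINUX= value from a config file
# (defaults to /etc/selinux/config). Comment lines and SELINUXTYPE= are ignored.
selinux_config_mode() {
    conf=${1:-/etc/selinux/config}
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "$conf"
}

# Example: selinux_config_mode    # expected to print "disabled" after the sed edit
```

The runtime mode can be checked separately with `getenforce`.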
    

    1.2 Install dependencies

    yum -y install wget  git net-tools deltarpm epel-release gcc*
    yum makecache

    1.3 Install the MySQL database

    wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm
    rpm -ivh mysql-community-release-el7-5.noarch.rpm
    yum install mysql-community-server
    systemctl start mysqld
    systemctl enable mysqld

    Tip:

      For production, tune MySQL appropriately, prefer MySQL over MariaDB where possible, and configure sensible user accounts and security settings.
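
As one way to follow the tip about user accounts, the sketch below prints SQL that creates a dedicated account for the five databases referenced by the configuration files later in this post (uic, falcon_portal, dashboard, graph, alarms). The helper name `falcon_grant_sql`, the account name `falcon`, and the password are all placeholders chosen for illustration:

```shell
# Hypothetical helper: print SQL that creates a dedicated account and grants
# it access to the five open-falcon databases used in this post.
# Arguments: user, host, password (passwords containing quotes are not handled).
falcon_grant_sql() {
    user=$1
    host=$2
    password=$3
    printf "CREATE USER '%s'@'%s' IDENTIFIED BY '%s';\n" "$user" "$host" "$password"
    for db in uic falcon_portal dashboard graph alarms; do
        printf "GRANT ALL PRIVILEGES ON \`%s\`.* TO '%s'@'%s';\n" "$db" "$user" "$host"
    done
    printf 'FLUSH PRIVILEGES;\n'
}

# Example: review the statements, then pipe them into the mysql client.
# falcon_grant_sql falcon 127.0.0.1 'change-me' | mysql -h 127.0.0.1 -u root -p
```

If such an account is used, the `root:@tcp(...)` part of each DSN in the cfg.json files would need to change to the new user and password.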

    1.4 Install Redis

    yum install redis -y
    systemctl start redis
    systemctl enable redis

    Tip:

      Setting a password for Redis is recommended; it can be skipped only if the LAN is guaranteed to be completely secure.

    1.5 Import the MySQL schema

    # Create the working directory

    mkdir -p /opt/openfalcon
    cd /opt/openfalcon
    git clone https://github.com/open-falcon/falcon-plus.git
    

    # Import the schema

    mysql -h 127.0.0.1 -u root -p < falcon-plus/scripts/mysql/db_schema/1_uic-db-schema.sql
    mysql -h 127.0.0.1 -u root -p < falcon-plus/scripts/mysql/db_schema/2_portal-db-schema.sql
    mysql -h 127.0.0.1 -u root -p < falcon-plus/scripts/mysql/db_schema/3_dashboard-db-schema.sql
    mysql -h 127.0.0.1 -u root -p < falcon-plus/scripts/mysql/db_schema/4_graph-db-schema.sql
    mysql -h 127.0.0.1 -u root -p < falcon-plus/scripts/mysql/db_schema/5_alarms-db-schema.sql
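
The five imports above can also be written as a loop. This is only a sketch: `import_falcon_schemas` is a hypothetical helper that feeds each schema file, in numeric order, to whatever command is passed after the directory argument.

```shell
# Hypothetical helper: feed every *-db-schema.sql file in a directory, in
# numeric order, to the command given after the directory argument.
import_falcon_schemas() {
    schema_dir=$1
    shift
    for f in "$schema_dir"/[0-9]_*-db-schema.sql; do
        if [ ! -f "$f" ]; then
            echo "no schema files found in $schema_dir" >&2
            return 1
        fi
        echo "importing $(basename "$f")"
        "$@" < "$f" || return 1
    done
}

# Example (prompts for the password once per file, like the commands above):
# import_falcon_schemas falcon-plus/scripts/mysql/db_schema mysql -h 127.0.0.1 -u root -p
```

The loop stops at the first failing import, so a broken file does not go unnoticed.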

    2. Distributed deployment

    Download the release package

    cd /opt/openfalcon
    wget https://github.com/open-falcon/falcon-plus/releases/download/v0.2.1/open-falcon-v0.2.1.tar.gz
    tar zxf open-falcon-v0.2.1.tar.gz
    rm -rf open-falcon-v0.2.1.tar.gz
    

    2.1 HBS (Heartbeat Server)

    # Edit the configuration file

    cd /opt/openfalcon/hbs/
    vim config/cfg.json
    {
        "debug": true,
        "database": "root:@tcp(127.0.0.1:3306)/falcon_portal?loc=Local&parseTime=true",
        "hosts": "",
        "maxConns": 20,
        "maxIdle": 100,
        "listen": ":6030",
        "trustable": [""],
        "http": {
            "enabled": true,
            "listen": "0.0.0.0:6031"
        }
    }

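    # Process management

    cd /opt/openfalcon           # the open-falcon launcher is in the top-level directory
    ./open-falcon start hbs      # start
    ./open-falcon stop hbs       # stop
    ./open-falcon monitor hbs    # view the log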
    

    2.2 judge

    # Edit the configuration file

    cd /opt/openfalcon/judge
    vim config/cfg.json
    {
        "debug": true,
        "debugHost": "nil",
        "remain": 11,
        "http": {
            "enabled": true,
            "listen": "0.0.0.0:6081"
        },
        "rpc": {
            "enabled": true,
            "listen": "0.0.0.0:6080"
        },
        "hbs": {
            "servers": ["127.0.0.1:6030"], # ideally hbs sits behind an LVS VIP, in which case use vip:port here
            "timeout": 300,
            "interval": 60
        },
        "alarm": {
            "enabled": true,
            "minInterval": 300, # minimum number of seconds between two consecutive alarms; the default is fine
            "queuePattern": "event:p%v",
            "redis": {
                "dsn": "127.0.0.1:6379", # the same Redis instance used by alarm and sender
                "maxIdle": 5,
                "connTimeout": 5000,
                "readTimeout": 5000,
                "writeTimeout": 5000
            }
        }
    }
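
The `#` notes in the listing above are annotations for the reader. Standard JSON has no comment syntax, so it is safest to assume they must be removed from the real cfg.json before a module is started. A minimal sketch for checking a file, assuming a Python interpreter is available (CentOS 7 ships one); `check_json` is a hypothetical helper name:

```shell
# Hypothetical helper: return 0 if the file parses as strict JSON.
# Uses Python's json.tool module; returns 2 if no interpreter is found.
check_json() {
    for py in python3 python; do
        if command -v "$py" >/dev/null 2>&1; then
            "$py" -m json.tool "$1" >/dev/null 2>&1
            return $?
        fi
    done
    echo "no python interpreter found" >&2
    return 2
}

# Example: check_json /opt/openfalcon/judge/config/cfg.json && echo "cfg.json is valid JSON"
```

The same check applies to the `//` annotations shown in the other modules' configuration listings.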

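    # Process management

    cd /opt/openfalcon                   # the open-falcon launcher is in the top-level directory
    ./open-falcon start judge            # start
    ./open-falcon stop judge             # stop
    ./open-falcon monitor judge          # view the log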
    

    2.3 Graph

    # Edit the configuration file

    cd /opt/openfalcon/graph/
    vim config/cfg.json
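    {
        "debug": false, // true or false: whether to enable debug logging
        "http": {
            "enabled": true, // true or false: whether to open the HTTP port; this is the control port, used to send control, statistics, and debug commands to graph
            "listen": "0.0.0.0:6071" // HTTP listen address
        },
        "rpc": {
            "enabled": true, // true or false: whether to open the RPC port, which receives data
            "listen": "0.0.0.0:6070" // RPC listen address
        },
        "rrd": {
            "storage": "./data/6070" // storage path for historical data files (change to a suitable path if necessary)
        },
        "db": {
            "dsn": "root:@tcp(127.0.0.1:3306)/graph?loc=Local&parseTime=true", // MySQL connection string; defaults: user root, empty password, host 127.0.0.1, database graph (change if necessary)
            "maxIdle": 4  // MySQL connection pool setting (maximum number of connections the pool allows); keep the default
        },
        "callTimeout": 5000,  // RPC call timeout, in ms
        "migrate": {  // automatic migration of historical data when scaling out graph
            "enabled": false,  // true or false: whether graph is in data-migration mode
            "concurrency": 2, // number of concurrent connections during migration; keeping the default is recommended
            "replicas": 500, // number of node replicas required by the consistent-hashing algorithm; do not change it (must match the transfer configuration)
            "cluster": { // list of the old graph instances before scaling out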
                "graph-00" : "127.0.0.1:6070"
            }
        }
    }
    

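    # Process management

    cd /opt/openfalcon              # the open-falcon launcher is in the top-level directory
    ./open-falcon start graph       # start the service
    ./open-falcon stop graph        # stop the service
    ./open-falcon monitor graph     # view the log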

    2.4 API

    # Edit the configuration file

    cd /opt/openfalcon/api/
    vim config/cfg.json
    {
        "log_level": "debug",
        "db": {  // database connection settings
            "faclon_portal": "root:@tcp(127.0.0.1:3306)/falcon_portal?charset=utf8&parseTime=True&loc=Local",
            "graph": "root:@tcp(127.0.0.1:3306)/graph?charset=utf8&parseTime=True&loc=Local",
            "uic": "root:@tcp(127.0.0.1:3306)/uic?charset=utf8&parseTime=True&loc=Local",
            "dashboard": "root:@tcp(127.0.0.1:3306)/dashboard?charset=utf8&parseTime=True&loc=Local",
            "alarms": "root:@tcp(127.0.0.1:3306)/alarms?charset=utf8&parseTime=True&loc=Local",
            "db_bug": true
        },
        "graphs": {  // deployment list of the graph module instances
            "cluster": {
                "graph-00": "127.0.0.1:6070"
            },
            "max_conns": 100,
            "max_idle": 100,
            "conn_timeout": 1000,
            "call_timeout": 5000,
            "numberOfReplicas": 500
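        },
        "metric_list_file": "./api/data/metric",
        "web_port": ":8080",  // HTTP listen port
        "access_control": true, // if set to false, any user gets administrator privileges
        "salt": "pleaseinputwhichyouareusingnow",  // salt used when encrypting passwords stored in the database
        "skip_auth": false, // if set to true, the API can be called without authentication
        "default_token": "default-token-used-in-server-side",  // token used to authorize calls between server-side modules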
        "gen_doc": false,
        "gen_doc_path": "doc/module.html"
    }
    

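    # Process management

    cd /opt/openfalcon             # the open-falcon launcher is in the top-level directory
    ./open-falcon start api        # start the service
    ./open-falcon stop api         # stop the service
    ./open-falcon monitor api      # view the log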
    

    2.5 transfer

    # Edit the configuration file

    cd /opt/openfalcon/transfer/
    vim config/cfg.json
    {
        "debug": true,
        "minStep": 30,
        "http": {
            "enabled": true,
            "listen": "0.0.0.0:6060"
        },
        "rpc": {
            "enabled": true,
            "listen": "0.0.0.0:8433"
        },
        "socket": {
            "enabled": true,
            "listen": "0.0.0.0:4444",
            "timeout": 3600
        },
        "judge": {
            "enabled": true,
            "batch": 200,
            "connTimeout": 1000,
            "callTimeout": 5000,
            "maxConns": 32,
            "maxIdle": 32,
            "replicas": 500,
            "cluster": {
                "judge-00" : "127.0.0.1:6080"
            }
        },
        "graph": {
            "enabled": true,
            "batch": 200,
            "connTimeout": 1000,
            "callTimeout": 5000,
            "maxConns": 32,
            "maxIdle": 32,
            "replicas": 500,
            "cluster": {
                "graph-00" : "127.0.0.1:6070"
            }
        },
        "tsdb": {
            "enabled": false,
            "batch": 200,
            "connTimeout": 1000,
            "callTimeout": 5000,
            "maxConns": 32,
            "maxIdle": 32,
            "retry": 3,
            "address": "127.0.0.1:8088"
        }
    }
    

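    # Process management

    cd /opt/openfalcon                     # the open-falcon launcher is in the top-level directory
    ./open-falcon start transfer           # start the service
    ./open-falcon stop transfer            # stop the service
    ./open-falcon monitor transfer         # view the log
    curl -s "127.0.0.1:6060/health"        # verify the service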
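
Right after a start, the health check above can fail simply because the process is still coming up. A small retry loop helps in scripts; this is a sketch, and `wait_for_http` is a hypothetical helper name:

```shell
# Hypothetical helper: poll a URL until curl gets a successful response,
# giving up after a number of attempts (default 10, one second apart).
wait_for_http() {
    url=$1
    attempts=${2:-10}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if curl -s -f -o /dev/null --max-time 2 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example: wait_for_http "127.0.0.1:6060/health" && echo "transfer is up"
```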
    

    2.6 Alarm

    # Edit the configuration file

    cd /opt/openfalcon/alarm/
    vim config/cfg.json
    {
        "log_level": "debug",
        "http": {
            "enabled": true,
            "listen": "0.0.0.0:9912"
        },
        "redis": {
            "addr": "127.0.0.1:6379",
            "maxIdle": 5,
            "highQueues": [
                "event:p0",
                "event:p1",
                "event:p2"
            ],
            "lowQueues": [
                "event:p3",
                "event:p4",
                "event:p5",
                "event:p6"
            ],
            "userIMQueue": "/queue/user/im",
            "userSmsQueue": "/queue/user/sms",
            "userMailQueue": "/queue/user/mail"
        },
        "api": {
            "im": "http://127.0.0.1:10086/wechat",  // WeChat sending gateway address
            "sms": "http://127.0.0.1:10086/sms",  // SMS sending gateway address
            "mail": "http://127.0.0.1:10086/mail", // mail sending gateway address
            "dashboard": "http://127.0.0.1:8081",  // address where the dashboard module runs
            "plus_api":"http://127.0.0.1:8080",   // address where the falcon-plus api module runs
            "plus_api_token": "default-token-used-in-server-side" // token used to authenticate with the falcon-plus api module
        },
        "falcon_portal": {
            "addr": "root:@tcp(127.0.0.1:3306)/alarms?charset=utf8&loc=Asia%2FChongqing",
            "idle": 10,
            "max": 100
        },
        "worker": {
            "im": 10,
            "sms": 10,
            "mail": 50
        },
        "housekeeper": {
            "event_retention_days": 7,  // number of days to keep alarm history
            "event_delete_batch": 100
        }
    }
    

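    # Process management

    cd /opt/openfalcon              # the open-falcon launcher is in the top-level directory
    ./open-falcon start alarm       # start
    ./open-falcon stop alarm        # stop
    ./open-falcon monitor alarm     # view the log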
    

    2.7 task

    # Download the package

    mkdir /opt/openfalcon/task
    cd /opt/openfalcon/task
    wget https://github.com/open-falcon/task/releases/download/v0.0.10/falcon-task-0.0.10.tar.gz
    tar zxf falcon-task-0.0.10.tar.gz
    rm -rf falcon-task-0.0.10.tar.gz
    

    # Edit the configuration file

    vim /opt/openfalcon/task/cfg.json 
    {
        "debug": false,
        "http": {
            "enable": true,
            "listen": "0.0.0.0:8002"
        },
        "index": {
            "enable": true,
            "dsn": "root:root@tcp(127.0.0.1:3306)/graph?loc=Local&parseTime=true",
            "maxIdle": 4,
            "autoDelete": false,
            "cluster":{
                "test.hostname01:6071" : "0 0 0 ? * 0-5",
                "test.hostname02:6071" : "0 30 0 ? * 0-5"
            }
        },
        "collector" : {
            "enable": true,
            "destUrl" : "http://127.0.0.1:1988/v1/push",
            "srcUrlFmt" : "http://%s/statistics/all",
            "cluster" : [
                "transfer,test.hostname:6060",
                "graph,test.hostname:6071",
                "task,test.hostname:8002"
            ]
        }
    }
    

    2.8 Nodata

    # Edit the configuration file

    cd /opt/openfalcon/nodata/
    vim config/cfg.json
    {
        "debug": true,
        "http": {
            "enabled": true,
            "listen": "0.0.0.0:6090"
        },
        "plus_api":{
            "connectTimeout": 500,
            "requestTimeout": 2000,
            "addr": "http://127.0.0.1:8080",  # address where the falcon-plus api module runs
            "token": "default-token-used-in-server-side"  # token used to authenticate with the falcon-plus api module
        },
        "config": {
            "enabled": true,
            "dsn": "root:@tcp(127.0.0.1:3306)/falcon_portal?loc=Local&parseTime=true&wait_timeout=604800",
            "maxIdle": 4
        },
        "collector":{
            "enabled": true,
            "batch": 200,
            "concurrent": 10
        },
        "sender":{
            "enabled": true,
            "connectTimeout": 500,
            "requestTimeout": 2000,
            "transferAddr": "127.0.0.1:6060",  # transfer's HTTP listen address, typically of the form "domain.transfer.service:6060"
            "batch": 500
        }
    }
    

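    # Process management

    cd /opt/openfalcon                 # the open-falcon launcher is in the top-level directory
    ./open-falcon start nodata         # start the service
    ./open-falcon stop nodata          # stop the service
    ./open-falcon monitor nodata       # view the log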
    

    2.9 Aggregator

    # Edit the configuration file

    cd /opt/openfalcon/aggregator/
    vim config/cfg.json
    {
        "debug": true,
        "http": {
            "enabled": true,
            "listen": "0.0.0.0:6055"
        },
        "database": {
            "addr": "root:@tcp(127.0.0.1:3306)/falcon_portal?loc=Local&parseTime=true",
            "idle": 10,
            "ids": [1, -1],
            "interval": 55
        },
        "api": {
            "connect_timeout": 500,
            "request_timeout": 2000,
            "plus_api": "http://127.0.0.1:8080",  # address where the falcon-plus api module runs
            "plus_api_token": "default-token-used-in-server-side", # token used to authenticate with the falcon-plus api module
            "push_api": "http://127.0.0.1:1988/v1/push"  # HTTP endpoint for pushing data; it is provided by the agent
        }
    }
    

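    # Process management

    cd /opt/openfalcon                     # the open-falcon launcher is in the top-level directory
    ./open-falcon start aggregator         # start the service
    ./open-falcon monitor aggregator       # view the log
    ./open-falcon stop aggregator          # stop the service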
    

    2.10 agent

    # Edit the configuration file

    cd /opt/openfalcon/agent/
    vim config/cfg.json
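    {
        "debug": true,  # controls debug output; usually set to false in production
        "hostname": "", # the agent sends collected data to transfer with the endpoint set to the hostname; by default it is obtained via `hostname`, and a value configured here takes precedence
        "ip": "", # during heartbeats the agent reports its IP address to hbs; the agent detects the local IP automatically, so set this only to override detection
        "plugin": {
            "enabled": false, # the plugin mechanism is disabled by default
            "dir": "./plugin",  # the git repo containing the plugin scripts is cloned into this directory
            "git": "https://github.com/open-falcon/plugin.git", # address of the git repo containing the plugin scripts
            "logs": "./logs" # plugin execution logs; look here if a plugin misbehaves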
        },
        "heartbeat": {
            "enabled": true,  # must be set to true
            "addr": "127.0.0.1:6030", # hbs address; the port is hbs's RPC port
            "interval": 60, # heartbeat interval, in seconds
            "timeout": 1000 # timeout for connecting to hbs, in milliseconds
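        },
        "transfer": {
            "enabled": true, 
            "addrs": [
                "127.0.0.1:8433"
            ],  # transfer address; the port is transfer's RPC port (8433 in section 2.5). Multiple transfer addresses can be listed, and the agent takes care of HA
            "interval": 60, # collection interval in seconds, i.e. the agent collects data once a minute and sends it to transfer
            "timeout": 1000 # timeout for connecting to transfer, in milliseconds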
        },
        "http": {
            "enabled": true,  # whether to listen on the HTTP port
            "listen": ":1988",
            "backdoor": false
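        },
        "collector": {
            "ifacePrefix": ["eth", "em", "ens"], # the default configuration only collects traffic for interfaces whose names start with eth or em; an empty list collects all interfaces, including lo. Per-interface traffic can be seen in /proc/net/dev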
            "mountPoint": []
        },
        "default_tags": {
        },
        "ignore": {  # more than 200 metrics are collected by default; metrics listed here are not collected
            "cpu.busy": true,
            "df.bytes.free": true,
            "df.bytes.total": true,
            "df.bytes.used": true,
            "df.bytes.used.percent": true,
            "df.inodes.total": true,
            "df.inodes.free": true,
            "df.inodes.used": true,
            "df.inodes.used.percent": true,
            "mem.memtotal": true,
            "mem.memused": true,
            "mem.memused.percent": true,
            "mem.memfree": true,
            "mem.swaptotal": true,
            "mem.swapused": true,
            "mem.swapfree": true
        }
    }
    

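    # Process management

    cd /opt/openfalcon             # the open-falcon launcher is in the top-level directory
    ./open-falcon start agent      # start the process
    ./open-falcon stop agent       # stop the process
    ./open-falcon monitor agent    # view the log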
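
Once the agent is running, one way to confirm the agent, transfer, and graph chain is to push a test data point through the agent's `/v1/push` endpoint on port 1988 (the same endpoint the aggregator and task configs point at). The sketch below builds the payload; the field names follow the push format as I understand it from the open-falcon documentation, and the helper name and the metric name `deploy.test` are hypothetical:

```shell
# Hypothetical helper: print a one-item JSON array for the agent push API.
# Arguments: endpoint, metric, value, unix timestamp (defaults to now).
# Field names are assumed from the open-falcon push documentation.
falcon_push_payload() {
    endpoint=$1
    metric=$2
    value=$3
    ts=${4:-$(date +%s)}
    printf '[{"endpoint": "%s", "metric": "%s", "timestamp": %s, "step": 60, "value": %s, "counterType": "GAUGE", "tags": ""}]\n' \
        "$endpoint" "$metric" "$ts" "$value"
}

# Example: send one data point through the local agent.
# falcon_push_payload "$(hostname)" deploy.test 1 | curl -s -X POST -d @- http://127.0.0.1:1988/v1/push
```

The `step` of 60 matches the 60-second collection interval in the agent configuration above.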
    

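    Summary:

      At this point the deployment is basically usable. For multiple data centers or for alarm delivery, you still need to deploy "gateway" and the mail, SMS, and WeChat sending interfaces. There is also agent-updater, a tool for managing falcon-agent. agent-updater has an agent of its own, ops-updater, which can be thought of as a super agent that manages the other agents. Installing ops-updater when a machine is provisioned is recommended, and ops-updater itself is normally never upgraded. Deploy these add-ons according to your own situation.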
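
For the modules deployed in sections 2.1 to 2.10, the per-module commands can be wrapped in a loop. This is only a sketch: `falcon_all` and the `FALCON_BIN` variable are hypothetical names, and task is left out because it was installed separately.

```shell
# Hypothetical helper: run one open-falcon action (start, stop, ...) for every
# module deployed above. FALCON_BIN can be overridden, for example with
# "echo", to do a dry run without touching any process.
falcon_all() {
    action=$1
    for module in hbs judge graph api transfer alarm nodata aggregator agent; do
        "${FALCON_BIN:-./open-falcon}" "$action" "$module" || return 1
    done
}

# Example: cd /opt/openfalcon && falcon_all start
```

The loop stops at the first module whose command fails, which makes a broken configuration easier to spot.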

    3. Dashboard deployment

    3.1 Create the directories

    export HOME=/home/work
    export WORKSPACE=$HOME/open-falcon
    mkdir -p $WORKSPACE
    cd $WORKSPACE
    

    3.2 Download the code

    cd $WORKSPACE
    git clone https://github.com/open-falcon/dashboard.git
    

    3.3 Install dependencies

    yum install -y python-virtualenv
    yum install -y python-devel
    yum install -y openldap-devel
    yum install -y mysql-devel
    yum groupinstall "Development tools"
    
    cd $WORKSPACE/dashboard/
    virtualenv ./env
    
    ./env/bin/pip install -r pip_requirements.txt -i https://pypi.douban.com/simple
    

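    3.4 Edit the configuration file

    The dashboard configuration file is 'rrd/config.py'; adjust it to your environment.
    
    ## API_ADDR is the address of the backend api component
    API_ADDR = "http://127.0.0.1:8080/api/v1" 
    
    ## Adjust PORTAL_DB_* as needed; the default user is root and the default password is ""
    ## Adjust ALARM_DB_* as needed; the default user is root and the default password is ""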
    

    3.5 Start and access

    Start in production mode
    ./control start
    Then open http://127.0.0.1:8081 in your browser.
    
    View the log
    ./control tail
    

    Note: the open-falcon dashboard has no default user or password; you need to register an account yourself.

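    Summary: that completes the deployment. For usage and for monitoring other services, plugins are available in Xiaomi's GitHub repositories and from third parties. Overall, open-falcon is solid, and I personally consider it the best alternative to Zabbix. As for choosing between Zabbix and open-falcon: if you are drawn to newer technology or want to switch to a different tool (as I did), open-falcon is a good choice. If your company's technical support staff or you yourself already know Zabbix well and do not want to change monitoring systems, Zabbix remains a good choice.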

      

  • Original article: https://www.cnblogs.com/yangxiaoyi/p/7495218.html