  • Building Flink monitoring with Grafana + Prometheus

    First, an architecture diagram:

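    Flink App: sends its metrics out through a metrics reporter

    Pushgateway: an important tool in the Prometheus ecosystem; it accepts pushed metrics and exposes them for Prometheus to scrape

    Prometheus: an open-source system monitoring and alerting framework (see "Prometheus: Getting Started and Practice")

    Grafana: a cross-platform, open-source metrics analysis and visualization tool; it queries the collected data, visualizes it, and sends timely notifications (see "Visualization tool Grafana: introduction and installation")

    Node_exporter: like Pushgateway, a component of the Prometheus ecosystem; it collects host runtime metrics such as CPU, memory and disk usage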

    The installation below mostly follows this blog post: https://www.cnblogs.com/xiao987334176/p/9930517.html#autoid-0-0-0

    1. Pull the Docker images

    docker pull prom/node-exporter
    docker pull prom/pushgateway
    docker pull prom/prometheus
    docker pull grafana/grafana

    List the downloaded images

    $ docker images
    REPOSITORY           TAG                 IMAGE ID            CREATED             SIZE
    prom/prometheus      latest              d5b9d7ed160a        2 weeks ago         138MB
    grafana/grafana      latest              a6e14b4109af        2 weeks ago         253MB
    prom/pushgateway     latest              20e6dcae675f        4 weeks ago         19.2MB
    prom/node-exporter   latest              e5a616e4b9cf        2 months ago        22.9MB

    2. Edit prometheus.yml and create the Grafana data storage directory

    $ mkdir /opt/grafana-storage  # Grafana data storage directory

    $ cat /opt/prometheus/prometheus.yml # Prometheus configuration
    global:
      scrape_interval:     60s
      evaluation_interval: 60s
     
    scrape_configs:
      - job_name: prometheus
        static_configs:
          - targets: ['localhost:9090']
            labels:
              instance: prometheus
     
      - job_name: linux
        static_configs:
          - targets: ['venn:9100']
            labels:
              instance: localhost
      - job_name: 'pushgateway'
        static_configs:
          - targets: ['venn:9091']
            labels:
              instance: 'pushgateway'
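The configuration above only lists `host:port` targets; Prometheus fills in its default scheme (`http`) and metrics path (`/metrics`) when scraping. As a minimal sketch, the following Python shows which URLs those three jobs resolve to. The `SCRAPE_TARGETS` dictionary and the helper names are illustrative and simply mirror the YAML by hand; nothing here reads the real file.

```python
# Scrape targets copied by hand from the prometheus.yml above (illustrative).
SCRAPE_TARGETS = {
    "prometheus": ["localhost:9090"],
    "linux": ["venn:9100"],
    "pushgateway": ["venn:9091"],
}


def scrape_url(target, scheme="http", metrics_path="/metrics"):
    """Return the URL scraped for a 'host:port' target.

    The defaults match Prometheus' default scheme and metrics_path.
    """
    host, sep, port = target.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {target!r}")
    return f"{scheme}://{host}:{port}{metrics_path}"


def all_scrape_urls(targets=SCRAPE_TARGETS):
    """Map each job name to the list of URLs it would be scraped at."""
    return {job: [scrape_url(t) for t in ts] for job, ts in targets.items()}


if __name__ == "__main__":
    for job, urls in all_scrape_urls().items():
        for url in urls:
            print(f"{job}: {url}")
```

Opening the printed URLs in a browser (with `venn` replaced by your own host) is a quick way to confirm that each target really serves metrics before Prometheus is started.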

    3. Start the components

    docker run -d -p 3000:3000   --name=grafana   -v /opt/grafana-storage:/var/lib/grafana   grafana/grafana
    # node-exporter runs on the host network, where Docker discards published ports, so -p 9100:9100 is omitted
    docker run -d  -v "/proc:/host/proc:ro"  -v "/sys:/host/sys:ro"  -v "/:/rootfs:ro"  --net="host"  prom/node-exporter
    docker run -d -p 9090:9090  -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml  prom/prometheus
    docker run -d -p 9091:9091 prom/pushgateway

    Check the running Docker containers

    $ docker ps
    CONTAINER ID        IMAGE                COMMAND                  CREATED             STATUS              PORTS                    NAMES
    4a689cf48e10        prom/pushgateway     "/bin/pushgateway"       5 days ago          Up 5 days           0.0.0.0:9091->9091/tcp   infallible_goldstine
    fcc40433bf75        grafana/grafana      "/run.sh"                5 days ago          Up 5 days           0.0.0.0:3000->3000/tcp   grafana
    8ba942d0cf35        prom/prometheus      "/bin/prometheus --c…"   5 days ago          Up 5 days           0.0.0.0:9090->9090/tcp   quizzical_colden
    b84b0f4be2b2        prom/node-exporter   "/bin/node_exporter"     5 days ago          Up 5 days                                    fervent_poitras

    Check the ports

    $ netstat -apn | grep -E '9091|3000|9090|9100'
    (Not all processes could be identified, non-owned process info
     will not be shown, you would have to be root to see it all.)
    tcp        0      0 172.17.0.1:39028        172.17.0.4:9091         ESTABLISHED -                   
    tcp6       0      0 :::9100                 :::*                    LISTEN      -                   
    tcp6       0      0 :::3000                 :::*                    LISTEN      -                   
    tcp6       0      0 :::9090                 :::*                    LISTEN      -                   
    tcp6       0      0 :::9091                 :::*                    LISTEN      -                   
    tcp6       0      0 192.168.229.129:45864   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45856   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45824   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45874   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45854   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45836   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45814   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.128:9100    192.168.229.1:13405     ESTABLISHED -                   
    tcp6       0      0 192.168.229.129:45826   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45844   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.128:9091    172.17.0.2:53930        ESTABLISHED -                   
    tcp6       0      0 192.168.229.129:45846   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.128:9100    172.17.0.2:54776        ESTABLISHED -                   
    tcp6       0      0 192.168.229.129:45816   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45876   192.168.229.128:9091    ESTABLISHED 40846/java          
    tcp6       0      0 192.168.229.129:45834   192.168.229.128:9091    TIME_WAIT   -                   
    tcp6       0      0 192.168.229.129:45866   192.168.229.128:9091    TIME_WAIT   -   
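The same port check can be scripted instead of reading netstat output by eye. This is a minimal sketch using only the Python standard library; the `PORTS` mapping repeats the ports from the docker commands above, and the helper names are made up for illustration.

```python
import socket

# Ports used by the containers started above.
PORTS = {
    "grafana": 3000,
    "prometheus": 9090,
    "pushgateway": 9091,
    "node-exporter": 9100,
}


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_ports(host, ports=PORTS):
    """Return {component name: True/False} for every port in the mapping."""
    return {name: is_listening(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    # "venn" is the hostname used in this post; replace it with your own host.
    for name, ok in check_ports("venn").items():
        print(f"{name}: {'listening' if ok else 'not reachable'}")
```

A successful TCP connection only shows that something is listening on the port, not that the component behind it is healthy, so the page checks in the next step are still worth doing.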

    4. Check each component's page

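    node_exporter: ip:9100/metrics

    Prometheus: ip:9090/targets

    If a target's State is not UP yet, wait a moment: with the 60s scrape_interval configured above, the first scrape can take up to about a minute.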
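The information on the targets page is also available from the Prometheus HTTP API at `/api/v1/targets`. The sketch below, assuming Prometheus is reachable at `http://venn:9090` (adjust to your host), fetches that endpoint and reports which targets are not up. The function names are illustrative.

```python
import json
import urllib.request


def fetch_targets(base_url, timeout=5.0):
    """GET /api/v1/targets from a Prometheus server and decode the JSON."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)


def target_health(payload):
    """Map each active target's scrape URL to its reported health."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return {t["scrapeUrl"]: t["health"] for t in targets}


def targets_not_up(payload):
    """Return the sorted scrape URLs whose health is anything other than 'up'."""
    return sorted(url for url, health in target_health(payload).items() if health != "up")


if __name__ == "__main__":
    try:
        # Assumed address; "venn" is the hostname used in this post.
        payload = fetch_targets("http://venn:9090")
    except OSError as exc:
        print(f"could not reach Prometheus: {exc}")
    else:
        for url, health in target_health(payload).items():
            print(f"{health:8s} {url}")
```

Running it a second time after one scrape interval should show the same transition to `up` that the web page shows.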

    Grafana: ip:3000

      Default username/password: admin/admin

    I won't go into detail here; for configuring the data source and creating a system load dashboard, see this blog post: https://www.cnblogs.com/xiao987334176/p/9930517.html#autoid-0-0-0

    5. Configure the Flink metrics reporter:

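    Add the following to the Flink configuration file flink-conf.yaml (in Flink releases of this era, the flink-metrics-prometheus jar also has to be copied from opt/ into lib/ so that the reporter class can be loaded):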

    ##metrics
    metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
    metrics.reporter.promgateway.host: venn
    metrics.reporter.promgateway.port: 9091
    metrics.reporter.promgateway.jobName: myJob
    metrics.reporter.promgateway.randomJobNameSuffix: true
    metrics.reporter.promgateway.deleteOnShutdown: false
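Before starting a Flink job, it can help to confirm that the Pushgateway at the configured host and port accepts pushes at all. The sketch below pushes a single gauge in the Prometheus text format to the Pushgateway's `/metrics/job/<job>` endpoint. The job name `reporter_smoke_test` and the metric name `pushgateway_smoke_test` are hypothetical names chosen for this example; they are not anything Flink itself sends.

```python
import urllib.request
from urllib.parse import quote


def push_url(host, port, job):
    """Build the Pushgateway push URL for a job (job names should not contain '/')."""
    return f"http://{host}:{port}/metrics/job/{quote(job, safe='')}"


def exposition_body(name, value, help_text=""):
    """Render one gauge sample in the Prometheus text exposition format."""
    lines = []
    if help_text:
        lines.append(f"# HELP {name} {help_text}")
    lines.append(f"# TYPE {name} gauge")
    lines.append(f"{name} {value}")
    # The Pushgateway expects the body to end with a newline.
    return "\n".join(lines) + "\n"


def push_gauge(host, port, job, name, value, timeout=5.0):
    """POST one gauge sample to the Pushgateway and return the HTTP status code."""
    req = urllib.request.Request(
        push_url(host, port, job),
        data=exposition_body(name, value).encode("utf-8"),
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    try:
        # Host and port match metrics.reporter.promgateway.host/port above.
        status = push_gauge("venn", 9091, "reporter_smoke_test", "pushgateway_smoke_test", 1)
        print(f"pushed test metric, HTTP status {status}")
    except OSError as exc:
        print(f"push failed: {exc}")
```

If the push succeeds, the test metric should be visible on the Pushgateway page at ip:9091, and it can be deleted there once it has served its purpose.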

    Start a job (the late-data processing example from the previous post):

    flink run -m yarn-cluster -ynm LateDataProcess -yn 1 -c com.venn.stream.api.sideoutput.lateDataProcess.LateDataProcess jar/flinkDemo-1.0.jar

    View the job's web UI:

    PS: the job has already been running for a while

    6. Configure Flink monitoring in Grafana

    Because the Flink reporter, Pushgateway and Prometheus have already been configured above, and the Prometheus data source has been added in Grafana, Grafana can pick up the Flink job's metrics automatically.
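To see which Flink metrics have actually reached Prometheus before building panels, the metric names can be listed through the Prometheus HTTP API endpoint `/api/v1/label/__name__/values`. This is a minimal sketch that assumes Prometheus is at `http://venn:9090` and that the Flink metrics carry a `flink_` prefix; check the prefix against what your own setup reports.

```python
import json
import urllib.request


def fetch_metric_names(base_url, timeout=5.0):
    """GET the list of all metric names known to a Prometheus server."""
    url = f"{base_url}/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def flink_metric_names(payload, prefix="flink_"):
    """Return the sorted metric names that start with the given prefix.

    The "flink_" prefix is an assumption of this sketch.
    """
    return sorted(name for name in payload.get("data", []) if name.startswith(prefix))


if __name__ == "__main__":
    try:
        # Assumed address; "venn" is the hostname used in this post.
        payload = fetch_metric_names("http://venn:9090")
    except OSError as exc:
        print(f"could not reach Prometheus: {exc}")
    else:
        for name in flink_metric_names(payload):
            print(name)
```

An empty result usually means the job has not pushed yet or Prometheus has not scraped the Pushgateway since the push, so waiting one scrape interval and retrying is a reasonable first step.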

    On the Grafana home page, click New dashboard to create a new dashboard

    Once a metric is selected, the corresponding graph appears

    At this point, the Flink metrics are displayed in Grafana

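    Flink metric names are rather long, so you can configure what is displayed in the Legend field: replace key in {{key}} with the label you want to show, for example {{job_name}} or {{operator_name}}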
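To make the effect of the Legend setting concrete, here is a small Python sketch of the substitution idea: every `{{label}}` placeholder is replaced by that label's value from the series. It only illustrates the behavior and is not Grafana's implementation; rendering a missing label as an empty string is a choice made for this sketch, and the label values in the example are made up.

```python
import re

# Matches {{label_name}}, allowing optional spaces inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_legend(template, labels):
    """Replace each {{label}} in the template with the label's value.

    Labels missing from the series are rendered as an empty string
    (an assumption of this sketch).
    """
    return _PLACEHOLDER.sub(lambda m: labels.get(m.group(1), ""), template)


if __name__ == "__main__":
    # Hypothetical label values for one Flink operator series.
    series_labels = {"job_name": "demo_job", "operator_name": "Map"}
    print(render_legend("{{job_name}} / {{operator_name}}", series_labels))
```

With a legend such as `{{job_name}} / {{operator_name}}`, each series is named by those two labels instead of the full metric name with all of its labels.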

    The result is displayed as follows:

    Save, and that's it.

  • Original article: https://www.cnblogs.com/Springmoon-venn/p/11445023.html