  • Prometheus Monitoring (1)

    1. Installing the Prometheus server

    # cd /usr/local/
    # wget https://github.com/prometheus/prometheus/releases/download/v2.21.0/prometheus-2.21.0.linux-amd64.tar.gz
    # tar xf prometheus-2.21.0.linux-amd64.tar.gz
    # ln -sv prometheus-2.21.0.linux-amd64 prometheus
    # cd /usr/local/prometheus
    # /usr/local/prometheus/promtool check config prometheus.yml
    # /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml &

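    Started this way, Prometheus writes its log to the terminal. Once it runs as a systemd service (set up at the end of this post), the log output can be followed with: tail -f /var/log/messages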

    The Prometheus process and its port (lsof prints port 9090 under its /etc/services name, websm):

    [root@master prometheus]# ps -ef|grep prometheus
    root      10630  10412  0 17:12 pts/0    00:00:06 /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml
    root      10848  10412  0 17:46 pts/0    00:00:00 grep --color=auto prometheus
    
    [root@master prometheus]# lsof -i :9090
    COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    prometheu 10630 root    3u  IPv6  34901      0t0  TCP localhost:59090->localhost:websm (ESTABLISHED)
    prometheu 10630 root    8u  IPv6  34894      0t0  TCP *:websm (LISTEN)
    prometheu 10630 root    9u  IPv4  34268      0t0  TCP localhost:52242->localhost:websm (ESTABLISHED)
    prometheu 10630 root   11u  IPv6  34897      0t0  TCP localhost:websm->localhost:52242 (ESTABLISHED)
    prometheu 10630 root   13u  IPv6  33279      0t0  TCP localhost:websm->localhost:59090 (ESTABLISHED)

    Once the Prometheus server is confirmed to be running, open its built-in web console at http://<server-ip>:9090.

    The collected data can also be queried from the console's expression browser.
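
The same data can be pulled from the shell. A minimal sketch, assuming the server from step 1 listens on localhost:9090 and curl is installed. The helper name prom_query and the PROM_URL variable are made up for this example; /api/v1/query is the instant-query endpoint of the Prometheus HTTP API.

```shell
# Base URL of the Prometheus server (assumption: the local install from step 1).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query EXPR: run an instant query and print the raw JSON response.
# (hypothetical helper name, not part of Prometheus)
prom_query() {
    # -G turns the url-encoded data into a GET query string.
    curl -s -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, prom_query 'up' returns one sample per scraped target; the value is 1 when the last scrape succeeded.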

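    Prometheus calls each source it can pull metrics from an endpoint; an endpoint can be an exporter or an instrumented application. To scrape an endpoint, Prometheus uses a configuration entry called a target, which tells it how to connect. Several targets that serve the same role are grouped into a job, for example the node_exporter instances that monitor a group of hosts with the same purpose, or the mysqld_exporter instances that monitor MySQL databases.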
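
To make the job/target relationship concrete, here is a small sketch that prints a scrape_configs entry grouping several targets into one job. The helper name make_job is hypothetical, and the second address in the usage line below is invented for illustration.

```shell
# make_job NAME TARGET...: print a scrape_configs entry that groups
# the given host:port targets into one job. (hypothetical helper)
make_job() {
    job="$1"
    shift
    printf "  - job_name: '%s'\n" "$job"
    printf "    static_configs:\n"
    printf "    - targets:\n"
    # Each remaining argument becomes one target of the job.
    for target in "$@"; do
        printf "      - '%s'\n" "$target"
    done
}
```

Running make_job linux_node 172.16.23.120:9100 172.16.23.121:9100 prints a stanza that can be pasted under scrape_configs in prometheus.yml; every target listed there is scraped with the label job="linux_node".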

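    By default Prometheus stores the time series it collects in a local TSDB and keeps only 15 days of data. It can also be configured to send the data to other time series databases (remote write).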
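
To see which retention settings a running server was started with, the flags can be read back over the HTTP API. A rough sketch, assuming the server listens on localhost:9090 and that curl and a grep with -o support are available; show_retention and PROM_URL are names invented here, while /api/v1/status/flags is an endpoint of the Prometheus HTTP API.

```shell
# Base URL of the Prometheus server (assumption: local install).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# show_retention: print the retention-related flags reported by the server.
# (hypothetical helper name)
show_retention() {
    # The response is one JSON object; pick out the "storage.tsdb.retention*" pairs.
    curl -s "${PROM_URL}/api/v1/status/flags" |
        grep -o '"storage\.tsdb\.retention[^,}]*'
}
```

The retention period itself is changed with the --storage.tsdb.retention.time flag, as in the systemd unit at the end of this post.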

    2. Installing node_exporter to monitor a Linux host

    # cd /usr/local/
    # wget https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
    # tar xf node_exporter-1.0.1.linux-amd64.tar.gz
    # ln -sv node_exporter-1.0.1.linux-amd64 node_exporter

    Create a systemd unit for node_exporter and start it:

    [root@master local]# cat /usr/lib/systemd/system/node_exporter.service
    [Unit]
    Description=node_exporter
    
    [Service]
    ExecStart=/usr/local/node_exporter/node_exporter \
        --web.listen-address=:9100 \
        --collector.systemd \
        --collector.systemd.unit-whitelist="(ssh|docker|rsyslog|redis-server).service"
    Restart=on-failure
    
    [Install]
    WantedBy=multi-user.target
    
    # systemctl daemon-reload
    # systemctl enable node_exporter
    # systemctl start node_exporter
    # ps -ef|grep node
    # lsof -i:9100
    # tail -f /var/log/messages
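
Besides checking the process and the port, it helps to confirm that the exporter actually serves metrics. A minimal sketch, assuming curl is installed; node_metric is a hypothetical helper name, and /metrics is the path Prometheus scrapes by default.

```shell
# node_metric HOST:PORT NAME: print the samples of one metric from an exporter.
# (hypothetical helper name)
node_metric() {
    # Match the metric name followed by a label set "{" or a space,
    # so that node_load1 does not also match node_load15.
    curl -s "http://$1/metrics" | grep -E "^$2(\{| )"
}
```

For example, node_metric localhost:9100 node_load1 prints the 1-minute load average, and node_metric localhost:9100 node_systemd_unit_state lists the units selected by the systemd collector.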

    Now edit the Prometheus server configuration file to add the node_exporter host as a job:

    # cp prometheus.yml prometheus.yml.bak20200920
    [root@master prometheus]# cat prometheus.yml
    # my global config
    global:
      scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).
    
    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          # - alertmanager:9093
    
    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"
    
    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus'
    
        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.
    
        static_configs:
        - targets: ['localhost:9090']
    
      - job_name: 'linux_node'       # Added: Prometheus will now pull data from node_exporter into its TSDB automatically
        static_configs:
        - targets: ['172.16.23.120:9100']
          labels:
            nodename: master
            role: master

    Check the syntax:

      [root@master prometheus]# ./promtool check config prometheus.yml
      Checking prometheus.yml
      SUCCESS: 0 rule files found

    Restart the Prometheus server:

    [root@master prometheus]# ps -ef|grep prometheus
    root      10630  10412  0 17:12 pts/0    00:00:12 /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml
    root      11167  10412  0 18:26 pts/0    00:00:00 grep --color=auto prometheus
    [root@master prometheus]# kill 10630
    level=warn ts=2020-09-20T10:27:01.529Z caller=main.go:551 msg="Received SIGTERM, exiting gracefully..."
    [root@master prometheus]# level=info ts=2020-09-20T10:27:01.529Z caller=main.go:574 msg="Stopping scrape discovery manager..."
    level=info ts=2020-09-20T10:27:01.529Z caller=main.go:588 msg="Stopping notify discovery manager..."
    level=info ts=2020-09-20T10:27:01.529Z caller=main.go:610 msg="Stopping scrape manager..."
    level=info ts=2020-09-20T10:27:01.529Z caller=main.go:584 msg="Notify discovery manager stopped"
    level=info ts=2020-09-20T10:27:01.529Z caller=main.go:570 msg="Scrape discovery manager stopped"
    level=info ts=2020-09-20T10:27:01.529Z caller=manager.go:908 component="rule manager" msg="Stopping rule manager..."
    level=info ts=2020-09-20T10:27:01.529Z caller=manager.go:918 component="rule manager" msg="Rule manager stopped"
    level=info ts=2020-09-20T10:27:01.529Z caller=main.go:604 msg="Scrape manager stopped"
    level=info ts=2020-09-20T10:27:01.532Z caller=notifier.go:601 component=notifier msg="Stopping notification manager..."
    level=info ts=2020-09-20T10:27:01.532Z caller=main.go:778 msg="Notifier manager stopped"
    level=info ts=2020-09-20T10:27:01.533Z caller=main.go:790 msg="See you next time!"
    
    [1]+  Done                    /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml
    [root@master prometheus]# /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml &
    [1] 11174
    [root@master prometheus]# level=info ts=2020-09-20T10:28:01.627Z caller=main.go:310 msg="No time or size retention was set so using the default time retention" duration=15d
    level=info ts=2020-09-20T10:28:01.628Z caller=main.go:346 msg="Starting Prometheus" version="(version=2.21.0-rc.0, branch=HEAD, revision=1195cc24e3c8b9af8aeafcfc46473f6486ca3f64)"
    level=info ts=2020-09-20T10:28:01.628Z caller=main.go:347 build_context="(go=go1.15, user=root@1e754dfec932, date=20200827-23:23:27)"
    level=info ts=2020-09-20T10:28:01.628Z caller=main.go:348 host_details="(Linux 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 master (none))"
    level=info ts=2020-09-20T10:28:01.628Z caller=main.go:349 fd_limits="(soft=1024, hard=4096)"
    level=info ts=2020-09-20T10:28:01.628Z caller=main.go:350 vm_limits="(soft=unlimited, hard=unlimited)"
    level=info ts=2020-09-20T10:28:01.630Z caller=main.go:701 msg="Starting TSDB ..."
    level=info ts=2020-09-20T10:28:01.630Z caller=web.go:523 component=web msg="Start listening for connections" address=0.0.0.0:9090
    level=info ts=2020-09-20T10:28:01.636Z caller=head.go:644 component=tsdb msg="Replaying on-disk memory mappable chunks if any"
    level=info ts=2020-09-20T10:28:01.636Z caller=head.go:658 component=tsdb msg="On-disk memory mappable chunks replay completed" duration=229.637µs
    level=info ts=2020-09-20T10:28:01.636Z caller=head.go:664 component=tsdb msg="Replaying WAL, this may take a while"
    level=info ts=2020-09-20T10:28:01.647Z caller=head.go:716 component=tsdb msg="WAL segment loaded" segment=0 maxSegment=1
    level=info ts=2020-09-20T10:28:01.648Z caller=head.go:716 component=tsdb msg="WAL segment loaded" segment=1 maxSegment=1
    level=info ts=2020-09-20T10:28:01.648Z caller=head.go:719 component=tsdb msg="WAL replay completed" checkpoint_replay_duration=28.179µs wal_replay_duration=10.994542ms total_replay_duration=11.274975ms
    level=info ts=2020-09-20T10:28:01.649Z caller=main.go:721 fs_type=XFS_SUPER_MAGIC
    level=info ts=2020-09-20T10:28:01.649Z caller=main.go:724 msg="TSDB started"
    level=info ts=2020-09-20T10:28:01.649Z caller=main.go:850 msg="Loading configuration file" filename=/usr/local/prometheus/prometheus.yml
    level=info ts=2020-09-20T10:28:01.650Z caller=main.go:881 msg="Completed loading of configuration file" filename=/usr/local/prometheus/prometheus.yml totalDuration=691.925µs remote_storage=8.401µs web_handler=412ns query_engine=975ns scrape=292.136µs scrape_sd=62.841µs notify=25.538µs notify_sd=9.268µs rules=2.757µs
    level=info ts=2020-09-20T10:28:01.650Z caller=main.go:673 msg="Server is ready to receive web requests."

    Then refresh the Prometheus console. The new linux_node job should now be listed on the Targets page.
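
The same check can be scripted. A small sketch, assuming the server listens on localhost:9090 and curl is installed; job_is_up and PROM_URL are names invented for this example. It asks the query API for up samples of a job whose value is 1, that is, targets whose last scrape succeeded.

```shell
# Base URL of the Prometheus server (assumption: local install).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# job_is_up JOB: succeed if at least one target of JOB was scraped successfully.
# (hypothetical helper name)
job_is_up() {
    # An empty result set contains no "value":[...] entry, so grep fails.
    curl -s -G "${PROM_URL}/api/v1/query" \
        --data-urlencode "query=up{job=\"$1\"} == 1" |
        grep -q '"value":\['
}
```

Usage: job_is_up linux_node && echo "linux_node is being scraped". Note that the first scrape only happens after the server has been running for up to one scrape_interval (15s in the config above).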

    Run the Prometheus server as a system service:

    [root@master prometheus]# cat /usr/lib/systemd/system/prometheus.service
    [Unit]
    Description=prometheus
    
    
    [Service]
    ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/data/prometheus --web.enable-lifecycle --storage.tsdb.retention.time=180d
    
    Restart=on-failure
    
    [Install]
    WantedBy=multi-user.target
    
    # systemctl enable prometheus

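    Then stop the Prometheus process that is still running in the background (kill it as shown above) and start it through systemd: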

    # systemctl start prometheus
    [root@master prometheus]# ps -ef|grep prometheus
    root      11252      1  5 18:34 ?        00:00:00 /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/data/prometheus --web.enable-lifecycle --storage.tsdb.retention.time=180d
    root      11263  10412  0 18:34 pts/0    00:00:00 grep --color=auto prometheus
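
Because the unit starts Prometheus with --web.enable-lifecycle, later configuration changes no longer need a kill and restart: the server re-reads prometheus.yml when it receives an HTTP POST on /-/reload. A minimal sketch, assuming the paths used in this post and a server on localhost:9090; reload_prometheus, PROM_DIR and PROM_URL are names made up for this example.

```shell
# Install directory and server URL (assumptions: the layout used in this post).
PROM_DIR="${PROM_DIR:-/usr/local/prometheus}"
PROM_URL="${PROM_URL:-http://localhost:9090}"

# reload_prometheus: validate the config, then ask the server to re-read it.
# (hypothetical helper name)
reload_prometheus() {
    # Do not reload if promtool reports a problem in the config file.
    "${PROM_DIR}/promtool" check config "${PROM_DIR}/prometheus.yml" || return 1
    # -f makes curl exit non-zero when the server answers with an HTTP error.
    curl -fsS -X POST "${PROM_URL}/-/reload"
}
```

Without the lifecycle flag the POST is rejected; sending SIGHUP to the Prometheus process is the other way to trigger a reload.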
  • Original article: https://www.cnblogs.com/jsonhc/p/13701342.html