  • Adding custom monitoring and alerting to Prometheus (etcd as an example)

    I. Steps and notes (prerequisite: Prometheus is already deployed; see the deployment post)

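    1. etcd clusters usually enable HTTPS client authentication, so accessing etcd requires the matching certificates.
    2. Create a secret for etcd from those certificates.
    3. Mount the etcd secret into Prometheus.
    4. Create a ServiceMonitor object for etcd (it matches Services in the kube-system namespace that carry the label k8s-app=etcd).
    5. Create a Service that points at the monitored etcd members.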

    II. Procedure (default etcd certificate path: /etc/kubernetes/pki/etcd/)

    1. Create the etcd secret

    cd /etc/kubernetes/pki/etcd/
    kubectl create secret generic etcd-certs --from-file=healthcheck-client.crt --from-file=healthcheck-client.key --from-file=ca.crt -n monitoring
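    If any of the three files is missing, kubectl create secret fails, so it can help to check them first. The following is a minimal sketch; the helper name missing_etcd_certs is hypothetical and not part of the original post.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the original post):
# print the name of each certificate file that is missing or unreadable
# in the given directory. Prints nothing when all three are present.
missing_etcd_certs() {
    dir="$1"
    for f in healthcheck-client.crt healthcheck-client.key ca.crt; do
        # -r: the file exists and is readable by the current user
        [ -r "$dir/$f" ] || echo "$f"
    done
}

# Usage (run on a control-plane node):
#   missing=$(missing_etcd_certs /etc/kubernetes/pki/etcd)
#   [ -z "$missing" ] || echo "missing: $missing"
```

    An empty result means the kubectl create secret command above has everything it needs.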

    2. Add the secret to the Prometheus object named k8s (run kubectl edit prometheus k8s -n monitoring, or edit the YAML file and re-apply it)

    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      labels:
        prometheus: k8s
      name: k8s
      namespace: monitoring
    spec:
      alerting:
        alertmanagers:
        - name: alertmanager-main
          namespace: monitoring
          port: web
      baseImage: quay.io/prometheus/prometheus
      nodeSelector:
        kubernetes.io/os: linux
      podMonitorNamespaceSelector: {}
      podMonitorSelector: {}
      replicas: 2
      secrets:
      - etcd-certs
      resources:
        requests:
          memory: 400Mi
      ruleSelector:
        matchLabels:
          prometheus: k8s
          role: alert-rules
      securityContext:
        fsGroup: 2000
        runAsNonRoot: true
        runAsUser: 1000
      serviceAccountName: prometheus-k8s
      serviceMonitorNamespaceSelector: {}
      serviceMonitorSelector: {}
      version: v2.11.0
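
    Secrets listed under spec.secrets are mounted into the Prometheus container under /etc/prometheus/secrets/<secret-name>/, which is the path the ServiceMonitor in the next step refers to. The sketch below builds that path and shows, in a comment, how the mount might be inspected. The helper name secret_file_path is hypothetical, and the pod name prometheus-k8s-0 and container name prometheus are assumptions based on how the operator usually names things for a Prometheus object called k8s.

```shell
#!/bin/sh
# Hypothetical helper: print the in-pod path of a file that comes from a
# secret listed under spec.secrets of the Prometheus object.
#   $1 = secret name, $2 = file name (key) inside the secret
secret_file_path() {
    printf '/etc/prometheus/secrets/%s/%s\n' "$1" "$2"
}

# Usage (pod and container names are assumptions, adjust to your cluster):
#   kubectl exec -n monitoring prometheus-k8s-0 -c prometheus -- \
#       ls -l "$(secret_file_path etcd-certs ca.crt)"
```

    If the file is not listed, the Prometheus pods may not have been recreated yet after the edit.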

    3. Create the ServiceMonitor object

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: etcd-k8s
      namespace: monitoring
      labels:
        k8s-app: etcd-k8s
    spec:
      jobLabel: k8s-app
      endpoints:
      - port: port
        interval: 30s
        scheme: https
        tlsConfig:
          caFile: /etc/prometheus/secrets/etcd-certs/ca.crt
          certFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.crt
          keyFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.key
          insecureSkipVerify: true
      selector:
        matchLabels:
          k8s-app: etcd
      namespaceSelector:
        matchNames:
        - kube-system

    4. Create the Service and a custom Endpoints object (etcd may be deployed outside the Kubernetes cluster, so the endpoints are defined by hand)

    apiVersion: v1
    kind: Service
    metadata:
      name: etcd-k8s
      namespace: kube-system
      labels:
        k8s-app: etcd
    spec:
      type: ClusterIP
      clusterIP: None
      ports:
      - name: port
        port: 2379
        protocol: TCP
    
    ---
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: etcd-k8s
      namespace: kube-system
      labels:
        k8s-app: etcd
    subsets:
    - addresses:
      - ip: 1.1.1.11
      - ip: 1.1.1.12
      - ip: 1.1.1.13
        nodeName: etcd-master
      ports:
      - name: port
        port: 2379
        protocol: TCP
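
    Each member can also be probed by hand with the same client certificate, which separates certificate or network problems from Prometheus configuration problems. This is a sketch only: the function names are hypothetical, and it assumes it runs on a host that can read /etc/kubernetes/pki/etcd/ and that the etcd server certificate is valid for the IP being probed.

```shell
#!/bin/sh
# Build the metrics URL for one etcd member (client port 2379, as in the Service).
etcd_metrics_url() {
    printf 'https://%s:2379/metrics\n' "$1"
}

# Fetch the first few metric lines from one member, authenticating with the
# healthcheck client certificate used for the secret above.
probe_etcd_metrics() {
    pki=/etc/kubernetes/pki/etcd
    curl --silent --show-error --max-time 5 \
        --cacert "$pki/ca.crt" \
        --cert "$pki/healthcheck-client.crt" \
        --key "$pki/healthcheck-client.key" \
        "$(etcd_metrics_url "$1")" | head -n 5
}

# Usage:
#   for ip in 1.1.1.11 1.1.1.12 1.1.1.13; do probe_etcd_metrics "$ip"; done
```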

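    At this point the etcd metrics should be visible in the Prometheus web UI.

    If the target reports the error "connection refused", edit etcd.yaml under /etc/kubernetes/manifests so that etcd listens on an address Prometheus can reach:

    Option 1: --listen-client-urls=https://0.0.0.0:2379

    Option 2: --listen-client-urls=https://127.0.0.1:2379,https://1.1.1.11:2379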
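
    Before editing, it may be worth checking what the manifest currently contains. The sketch below reports whether --listen-client-urls already covers a given node IP or 0.0.0.0; the function name etcd_listens_on is hypothetical.

```shell
#!/bin/sh
# Succeed (exit status 0) if the --listen-client-urls flag in the manifest
# includes either 0.0.0.0 or the given IP, fail (status 1) otherwise.
#   $1 = path to etcd.yaml, $2 = node IP that Prometheus scrapes
etcd_listens_on() {
    manifest="$1"
    ip="$2"
    # "--" ends grep's options so the pattern may start with dashes.
    urls=$(grep -- '--listen-client-urls=' "$manifest" | sed 's/.*--listen-client-urls=//')
    case "$urls" in
        *"//0.0.0.0:"*|*"//$ip:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   etcd_listens_on /etc/kubernetes/manifests/etcd.yaml 1.1.1.11 \
#       || echo "etcd does not listen on 1.1.1.11"
```

    etcd runs as a static pod here, so the kubelet recreates it when the manifest file changes.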

    III. Create a custom alert

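    1. After a PrometheusRule resource is created, the corresponding alert rule file is generated inside the Prometheus pod.
    2. Note: the labels on the PrometheusRule must match the ruleSelector of the Prometheus object (here prometheus: k8s and role: alert-rules).
    3. Alert condition: an etcd cluster remains available while more than half of its members are up. The rule below fires once so many members are down that losing one more would break that majority.
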
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      labels:
        prometheus: k8s
        role: alert-rules
      name: etcd-rules
      namespace: monitoring
    spec:
      groups:
      - name: etcd-exporter.rules
        rules:
        - alert: EtcdClusterUnavailable
          annotations:
            summary: etcd cluster small
            description: If one more etcd peer goes down the cluster will be unavailable
          expr: |
            count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2-1)
          for: 3m
          labels:
            severity: critical
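
    To see when the expression fires, the comparison can be replayed with plain numbers. The sketch below is an illustration only: the function name etcd_alert_fires is hypothetical, and it rewrites down > total/2 - 1 as 2*down > total - 2 so that integer arithmetic is enough.

```shell
#!/bin/sh
# Replay the rule's comparison for a cluster of $1 members with $2 of them down.
# Succeeds (status 0) when the alert condition holds.
etcd_alert_fires() {
    total="$1"
    down="$2"
    # When no target is down, count(up == 0) returns no series in PromQL,
    # so the expression is empty and the alert cannot fire.
    [ "$down" -gt 0 ] && [ $((2 * down)) -gt $((total - 2)) ]
}

# Usage:
#   etcd_alert_fires 3 1 && echo "3 members, 1 down: alert"
#   etcd_alert_fires 5 1 || echo "5 members, 1 down: no alert"
```

    With three members the alert condition already holds when one is down, because a second failure would lose the majority; with five members it takes two. The for: 3m clause means the condition has to hold for three minutes before the alert is sent.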
  • Original article: https://www.cnblogs.com/jayce9102/p/12073559.html