  • Kubernetes in Practice (20): One-Click Deployment of Highly Available Prometheus on k8s with Email Alerting

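    1. Basic Concepts

      This deployment uses CoreOS's prometheus-operator.

      It includes monitoring of the etcd cluster.

      It works for clusters installed from binaries as well as clusters installed with kubeadm.

      It targets k8s v1.10 and later; test other versions yourself.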
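
Since the deployment assumes k8s v1.10 or later, it can help to check the cluster version before installing. The sketch below is only an illustration: the function names `version_ge` and `check_k8s_version` are made up for this example, and it relies on `sort -V` from GNU coreutils. The version string is passed in by hand (for instance the server version reported by `kubectl version`).

```shell
#!/usr/bin/env bash
# Succeed when version $1 is greater than or equal to version $2.
# Assumes GNU sort, which supports version ordering via -V.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical helper: compare a cluster version such as "v1.13.1"
# against the minimum of 1.10 stated in the article.
check_k8s_version() {
  local current="${1#v}" minimum="1.10"
  if version_ge "$current" "$minimum"; then
    echo "ok: $current >= $minimum"
  else
    echo "too old: $current < $minimum"
    return 1
  fi
}
```

Calling `check_k8s_version v1.9.6` returns a non-zero status, so the check can gate a wrapper script around `./deploy`.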

      Project: https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus

      Installing with Helm: https://github.com/helm/charts/tree/master/stable/prometheus-operator

     

    2. Installation

      Download the installation files:

    [root@k8s-master01 ~]# git clone https://github.com/dotbalo/k8s.git
    Cloning into 'k8s'...
    remote: Enumerating objects: 373, done.
    remote: Counting objects: 100% (373/373), done.
    remote: Compressing objects: 100% (264/264), done.
    remote: Total 373 (delta 127), reused 349 (delta 103), pack-reused 0
    Receiving objects: 100% (373/373), 4.92 MiB | 553.00 KiB/s, done.
    Resolving deltas: 100% (127/127), done.

    [root@k8s-master01 prometheus-operator]# ls
    alertmanager-config.yam.bak bundle.yaml mail-template.tmpl README.md
    alertmanager.yaml deploy manifests teardown

     

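      Adjust the configuration:

      1) Update the etcd certificate files referenced in the deploy script (not needed for kubeadm installations).

      2) In manifests/prometheus/prometheus-etcd.yaml, update tlsConfig (not needed for kubeadm installations) and addresses (the etcd endpoints).

      3) In alertmanager.yaml, update the email alerting settings and the recipients.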
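
For step 3, the settings to change are typically the SMTP values under `global` and the recipient under `receivers`. The sketch below writes a sample Alertmanager configuration to a file so the fields can be compared with the repository's alertmanager.yaml. The field names are standard Alertmanager configuration keys, but every host, address, and password is a placeholder, the function name `write_sample_alertmanager_config` is invented for this example, and the repository's actual file may be laid out differently.

```shell
#!/usr/bin/env bash
# Write a sample Alertmanager config with email settings to the path in $1.
# All hosts, addresses, and credentials below are placeholders.
write_sample_alertmanager_config() {
  local out="$1"
  cat > "$out" <<'EOF'
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'   # placeholder SMTP server
  smtp_from: 'alerts@example.com'          # placeholder sender
  smtp_auth_username: 'alerts@example.com'
  smtp_auth_password: 'CHANGE_ME'
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'email'
receivers:
- name: 'email'
  email_configs:
  - to: 'ops@example.com'                  # placeholder recipient
EOF
}
```

For example, `write_sample_alertmanager_config /tmp/alertmanager-sample.yaml` produces a file that can be diffed against the real one before running the deploy script.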

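      One-click install. (Note: on a cluster installed from binaries, the registration step of the first install can take a very long time; kubeadm installations are much faster.)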

    [root@k8s-master01 prometheus-operator]# ./deploy 
    namespace/monitoring created
    secret/alertmanager-main created
    secret/etcd-certs created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
    clusterrole.rbac.authorization.k8s.io/prometheus-operator created
    serviceaccount/prometheus-operator created
    service/prometheus-operator created
    deployment.apps/prometheus-operator created
    Waiting for Operator to register custom resource definitions...done!
    clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
    clusterrole.rbac.authorization.k8s.io/node-exporter created
    daemonset.extensions/node-exporter created
    serviceaccount/node-exporter created
    service/node-exporter created
    clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
    clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
    deployment.extensions/kube-state-metrics created
    rolebinding.rbac.authorization.k8s.io/kube-state-metrics created
    role.rbac.authorization.k8s.io/kube-state-metrics-resizer created
    serviceaccount/kube-state-metrics created
    service/kube-state-metrics created
    secret/grafana-credentials created
    secret/grafana-credentials unchanged
    configmap/grafana-dashboard-definitions-0 created
    configmap/grafana-dashboards created
    configmap/grafana-datasources created
    deployment.apps/grafana created
    service/grafana created
    service/etcd-k8s created
    endpoints/etcd-k8s created
    servicemonitor.monitoring.coreos.com/etcd-k8s created
    configmap/prometheus-k8s-rules created
    serviceaccount/prometheus-k8s created
    servicemonitor.monitoring.coreos.com/alertmanager created
    servicemonitor.monitoring.coreos.com/kube-apiserver created
    servicemonitor.monitoring.coreos.com/kube-controller-manager created
    servicemonitor.monitoring.coreos.com/kube-scheduler created
    servicemonitor.monitoring.coreos.com/kube-state-metrics created
    servicemonitor.monitoring.coreos.com/kubelet created
    servicemonitor.monitoring.coreos.com/node-exporter created
    servicemonitor.monitoring.coreos.com/prometheus-operator created
    servicemonitor.monitoring.coreos.com/prometheus created
    service/prometheus-k8s created
    prometheus.monitoring.coreos.com/k8s created
    role.rbac.authorization.k8s.io/prometheus-k8s created
    role.rbac.authorization.k8s.io/prometheus-k8s created
    role.rbac.authorization.k8s.io/prometheus-k8s created
    clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
    rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
    rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
    rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
    service/alertmanager-main created
    alertmanager.monitoring.coreos.com/main created
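
The "Waiting for Operator to register custom resource definitions" line is where slow clusters spend their time. The repository's deploy script is not reproduced here, so the following is only a sketch of how such a wait could be written with a timeout instead of an open-ended loop. The helper name `wait_until` is hypothetical, and both the timeout and the interval are assumed to be whole seconds.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the timeout (in seconds) is reached.
# Usage: wait_until <timeout> <interval> <command> [args...]
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local waited=0
  until "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1            # gave up: the command never succeeded in time
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Example (assumed, not taken from the deploy script): wait up to 10 minutes
# for the ServiceMonitor CRD to be registered.
#   wait_until 600 5 kubectl get customresourcedefinitions \
#     servicemonitors.monitoring.coreos.com
```

A bounded wait makes it easier to tell a slow registration from an Operator pod that never started.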

     

    3. Verifying the Installation

      Check the pods:

    [root@k8s-master01 prometheus-operator]# kubectl get po -n monitoring
    NAME                                   READY     STATUS    RESTARTS   AGE
    alertmanager-main-0                    2/2       Running   0          2m
    alertmanager-main-1                    2/2       Running   0          1m
    alertmanager-main-2                    2/2       Running   0          1m
    grafana-59f56c4789-dzvgf               1/1       Running   0          2m
    kube-state-metrics-575464c49c-m8w4w    4/4       Running   0          2m
    node-exporter-5kvxf                    2/2       Running   0          2m
    node-exporter-66p7h                    2/2       Running   0          2m
    node-exporter-clxzk                    2/2       Running   0          2m
    node-exporter-hsgm8                    2/2       Running   0          2m
    node-exporter-m5l24                    2/2       Running   0          2m
    prometheus-k8s-0                       2/2       Running   0          2m
    prometheus-k8s-1                       2/2       Running   0          2m
    prometheus-operator-8597f9b976-2hvd5   1/1       Running   0          2m
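
With this many pods, scanning the READY and STATUS columns by eye is error-prone. As a minimal sketch, the filter below reads the table printed by `kubectl get po` on standard input and prints only the pods that are not fully ready. The function name `not_ready_pods` is made up for this example, and it assumes the default column layout shown above (NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Print the names of pods whose READY count is incomplete or whose STATUS
# is not "Running". Expects the default `kubectl get po` table on stdin.
not_ready_pods() {
  awk 'NR > 1 {
         split($2, r, "/")                       # READY column, e.g. "2/2"
         if (r[1] != r[2] || $3 != "Running") print $1
       }'
}

# Example:
#   kubectl get po -n monitoring | not_ready_pods
```

Empty output means every pod in the listing is Running with all of its containers ready.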

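      Check the services (`!$` is bash history expansion for the last argument of the previous command, here `monitoring`):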

    [root@k8s-master01 prometheus-operator]# kubectl get svc -n !$
    kubectl get svc -n monitoring
    NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
    alertmanager-main       NodePort    10.106.201.155   <none>        9093:30903/TCP      2m
    alertmanager-operated   ClusterIP   None             <none>        9093/TCP,6783/TCP   2m
    etcd-k8s                ClusterIP   None             <none>        2379/TCP            2m
    grafana                 NodePort    10.99.143.133    <none>        3000:30902/TCP      2m
    kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP   2m
    node-exporter           ClusterIP   None             <none>        9100/TCP            2m
    prometheus-k8s          NodePort    10.101.175.59    <none>        9090:30900/TCP      2m
    prometheus-operated     ClusterIP   None             <none>        9090/TCP            2m
    prometheus-operator     ClusterIP   10.107.31.10     <none>        8080/TCP            2m

      Three NodePorts are now exposed:

    •   alertmanager UI: 30903
    •   grafana: 30902
    •   prometheus UI: 30900
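
The port list above can also be derived from the service table rather than read off by hand. The sketch below takes the `kubectl get svc` table on standard input plus a node IP and prints one URL per NodePort service. The function name `nodeport_urls` is invented for this example, `192.0.2.10` is a documentation placeholder address, and only the first port of each service is considered.

```shell
#!/usr/bin/env bash
# Print "<service> http://<node-ip>:<nodePort>" for each NodePort service.
# Expects the default `kubectl get svc` table on stdin; $1 is a node IP.
nodeport_urls() {
  local node_ip="$1"
  awk -v ip="$node_ip" 'NR > 1 && $2 == "NodePort" {
        # PORT(S) looks like 9093:30903/TCP; keep what sits between ":" and "/".
        split($5, a, ":")
        split(a[2], b, "/")
        printf "%s http://%s:%s\n", $1, ip, b[1]
      }'
}

# Example (192.0.2.10 stands in for a real node address):
#   kubectl get svc -n monitoring | nodeport_urls 192.0.2.10
```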

     

    4. Access Test

      alertmanager: http://<node-ip>:30903

      prometheus: http://<node-ip>:30900

      grafana: http://<node-ip>:30902
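
Besides opening the three UIs in a browser, the endpoints can be probed from a shell. This is a sketch under a few assumptions: the NodePorts are the ones listed in section 3, Prometheus and Alertmanager answer on `/-/healthy`, Grafana answers on `/api/health` (availability may depend on the deployed versions), and the function names `health_url` and `check_all` are made up for this example.

```shell
#!/usr/bin/env bash
# Build the health-check URL for one component on the given node IP.
# Ports are the NodePorts from section 3; the paths are assumed endpoints.
health_url() {
  local component="$1" node_ip="$2"
  case "$component" in
    alertmanager) echo "http://${node_ip}:30903/-/healthy" ;;
    prometheus)   echo "http://${node_ip}:30900/-/healthy" ;;
    grafana)      echo "http://${node_ip}:30902/api/health" ;;
    *) echo "unknown component: $component" >&2; return 1 ;;
  esac
}

# Probe all three components and report ok/FAILED for each.
check_all() {
  local node_ip="$1" c
  for c in alertmanager prometheus grafana; do
    if curl -fsS --max-time 5 -o /dev/null "$(health_url "$c" "$node_ip")"; then
      echo "$c: ok"
    else
      echo "$c: FAILED"
    fi
  done
}
```

Running `check_all <node-ip>` from any machine that can reach the node gives a quick pass/fail summary before logging in to the UIs.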

     

     

      Viewing the alert email: (screenshot not included)

     

    5. Uninstall

    [root@k8s-master01 prometheus-operator]# ./teardown 
    clusterrolebinding.rbac.authorization.k8s.io "node-exporter" deleted
    clusterrole.rbac.authorization.k8s.io "node-exporter" deleted
    daemonset.extensions "node-exporter" deleted
    serviceaccount "node-exporter" deleted
    service "node-exporter" deleted
    clusterrolebinding.rbac.authorization.k8s.io "kube-state-metrics" deleted
    clusterrole.rbac.authorization.k8s.io "kube-state-metrics" deleted
    deployment.extensions "kube-state-metrics" deleted
    rolebinding.rbac.authorization.k8s.io "kube-state-metrics" deleted
    role.rbac.authorization.k8s.io "kube-state-metrics-resizer" deleted
    serviceaccount "kube-state-metrics" deleted
    service "kube-state-metrics" deleted
    secret "grafana-credentials" deleted
    configmap "grafana-dashboard-definitions-0" deleted
    configmap "grafana-dashboards" deleted
    configmap "grafana-datasources" deleted
    deployment.apps "grafana" deleted
    service "grafana" deleted
    service "etcd-k8s" deleted
    servicemonitor.monitoring.coreos.com "etcd-k8s" deleted
    ......

  • Original article: https://www.cnblogs.com/dukuan/p/10177757.html