  • Prometheus operator

    1. Introduction

    Repository: https://github.com/prometheus-operator/kube-prometheus

    https://blog.csdn.net/choerodon/article/details/98587027

    [Figure: Prometheus Operator architecture diagram]

       

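    • Operator: deploys and manages Prometheus Server according to custom resources (defined through Custom Resource Definitions, CRDs), and watches events on those custom resources so it can react to changes. It is the control center of the whole system.
    • Prometheus: a custom resource that declares the desired state of a Prometheus deployment; the Operator ensures the running deployment always matches the definition.
    • Prometheus Server: the Prometheus Server cluster that the Operator deploys based on the contents of the Prometheus custom resource. The custom resource can be regarded as the way to manage the StatefulSets that run the Prometheus Server cluster.
    • ServiceMonitor: declares which services to monitor, describing a list of targets for Prometheus to scrape. It uses labels to select the corresponding Service Endpoints, so that Prometheus Server fetches metrics through the selected Services.
    • Service: simply put, the object that Prometheus monitors.
    • Alertmanager: a custom resource that declares the desired state of an Alertmanager deployment; the Operator ensures the running deployment always matches the definition.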

     

    2. Deployment

    Deploying Prometheus Operator is straightforward:

    # Download
    # git clone https://github.com/prometheus-operator/kube-prometheus.git

    # cd kube-prometheus

    # Install the operator
    # kubectl create -f manifests/setup

    # Install Prometheus
    # kubectl create -f manifests/
    • The number of instances to start can be set with the replicas field
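
    The two create steps above can race: the second kubectl create may run before the API server has finished registering the CRDs from manifests/setup. A minimal sketch of a guarded install, assuming it is run from inside the kube-prometheus checkout. The function name install_kube_prometheus and the 120s timeout are illustrative choices, not part of the project:

```shell
#!/bin/sh
# install_kube_prometheus DIR: hypothetical helper; DIR is the manifests directory.
install_kube_prometheus() {
    dir="${1:-manifests}"
    # Step 1: the setup manifests (the "install the operator" step above).
    kubectl create -f "$dir/setup" || return 1
    # Step 2: block until every CRD reports Established (timeout is an assumption).
    kubectl wait --for=condition=Established --all CustomResourceDefinition --timeout=120s || return 1
    # Step 3: the remaining manifests (the "install Prometheus" step above).
    kubectl create -f "$dir/"
}
```

    Calling install_kube_prometheus manifests runs the same two commands as above with a wait in between, and stops at the first step that fails.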

    Check the result:

    # kubectl get pods -n monitoring
    NAME                                   READY   STATUS    RESTARTS   AGE
    alertmanager-main-0                    2/2     Running   10         8d
    blackbox-86b7486879-w6n22              1/1     Running   0          18h
    grafana-5cb8d5c55b-wplg4               1/1     Running   5          8d
    kafka-exporter-5cf8fdd8f8-c4j5t        1/1     Running   0          20h
    kube-state-metrics-65f69f9759-spcr6    3/3     Running   27         8d
    node-exporter-rdjl9                    2/2     Running   2          24h
    prometheus-adapter-865cc8dbcd-bc7v6    1/1     Running   34         8d
    prometheus-k8s-0                       2/2     Running   3          76m
    prometheus-operator-56d44459f7-vt2l9   2/2     Running   15         8d
    # kubectl get svc -n monitoring
    NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
    alertmanager-main       ClusterIP   10.99.189.210   <none>        9093/TCP                     8d
    alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   8d
    blackbox                ClusterIP   10.108.47.141   <none>        9115/TCP                     18h
    grafana                 ClusterIP   10.104.30.183   <none>        3000/TCP                     8d
    kafka-exporter          ClusterIP   10.98.228.115   <none>        9308/TCP                     20h
    kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            8d
    node-exporter           ClusterIP   None            <none>        9100/TCP                     8d
    prometheus-adapter      ClusterIP   10.108.67.0     <none>        443/TCP                      8d
    prometheus-k8s          ClusterIP   10.96.50.138    <none>        9090/TCP                     8d
    prometheus-operated     ClusterIP   None            <none>        9090/TCP                     16h
    prometheus-operator     ClusterIP   None            <none>        8443/TCP                     8d

    Define an Ingress to expose Alertmanager, Grafana, and Prometheus:

    prom-monitor.yaml

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: prom-monitor
      namespace: monitoring
    spec:
      rules:
      - host: alert.test.com
        http:
          paths:
          - backend:
              serviceName: alertmanager-main
              servicePort: 9093
            path: /
      - host: grafana.test.com
        http:
          paths:
          - backend:
              serviceName: grafana
              servicePort: 3000
            path: /
      - host: prom.test.com
        http:
          paths:
          - backend:
              serviceName: prometheus-k8s
              servicePort: 9090
            path: /
    • Hostnames used: grafana.test.com, prom.test.com, alert.test.com
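
    The manifest above uses the extensions/v1beta1 Ingress API, which was removed in Kubernetes 1.22. On newer clusters the same rules would be written against networking.k8s.io/v1. A sketch, assuming the same hostnames and Services as above; the helper names ingress_rule and render_ingress are hypothetical, pathType: Prefix is an assumed choice for the "/" path, and an ingressClassName may also be needed depending on the ingress controller installed:

```shell
#!/bin/sh
# ingress_rule HOST SERVICE PORT: print one host rule in networking.k8s.io/v1 form.
ingress_rule() {
    cat <<EOF
  - host: $1
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: $2
            port:
              number: $3
EOF
}

# render_ingress: print the full manifest for the three hosts used in this article.
render_ingress() {
    cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prom-monitor
  namespace: monitoring
spec:
  rules:
EOF
    ingress_rule alert.test.com alertmanager-main 9093
    ingress_rule grafana.test.com grafana 3000
    ingress_rule prom.test.com prometheus-k8s 9090
}
```

    Running render_ingress only prints the manifest for review; render_ingress | kubectl apply -f - would submit it to the cluster.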

    Add these hostnames to the hosts file on your local machine.
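
    All three names map to the address where the ingress controller is reachable. A small sketch of the hosts entry; hosts_entry is a hypothetical helper, and 192.0.2.10 in the usage comment is only a placeholder, since the real address depends on how the ingress controller is exposed:

```shell
#!/bin/sh
# hosts_entry IP: print one hosts-file line mapping the three Ingress hostnames to IP.
hosts_entry() {
    printf '%s grafana.test.com prom.test.com alert.test.com\n' "$1"
}

# Example with a placeholder address; appending to /etc/hosts requires root:
#   hosts_entry 192.0.2.10 | sudo tee -a /etc/hosts
```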

    Visit grafana.test.com; Grafana comes with many dashboards already provisioned.

      

     

    3. Fixing controller-manager monitoring

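      On a Kubernetes cluster installed from binaries, a Prometheus deployed through the operator cannot monitor controller-manager and scheduler by default; both components need extra configuration. The reason is that a ServiceMonitor adds scrape targets by matching labels on Services, and a binary-installed cluster has no Service for controller-manager or scheduler in the kube-system namespace.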

      Check:

    # List the ServiceMonitors
    # kubectl get servicemonitor -n monitoring
    NAME                      AGE
    alertmanager              7d2h
    coredns                   7d2h
    grafana                   7d2h
    kube-apiserver            7d2h
    kube-controller-manager   7d2h
    kube-scheduler            7d2h
    kube-state-metrics        7d2h
    kubelet                   7d2h
    node-exporter             7d2h
    prometheus                7d2h
    prometheus-adapter        7d2h
    prometheus-operator       7d2h

      Inspect the kube-controller-manager ServiceMonitor:

    # kubectl get servicemonitor kube-controller-manager -n monitoring -o yaml | tail -15
    ...
        port: http-metrics
        scheme: http
        tlsConfig:
          insecureSkipVerify: false
      jobLabel: k8s-app
      namespaceSelector:
        matchNames:
        - kube-system
      selector:
        matchLabels:
          k8s-app: kube-controller-manager
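    • It needs to match a Service in kube-system that carries the label k8s-app=kube-controller-manager
    • Change its scheme to http; the default is https

      The kube-system namespace has no Service or Endpoints with the k8s-app=kube-controller-manager label, so Prometheus cannot scrape controller-manager. A Service and matching Endpoints therefore have to be created for it.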

      controller-endpoint.yaml

    apiVersion: v1
    kind: Endpoints
    metadata:
      name: kube-controller-manager-monitoring
      namespace: kube-system
      labels:
        k8s-app: kube-controller-manager
    subsets:
      - addresses:
        - ip: 192.168.10.240
        - ip: 192.168.10.241
        - ip: 192.168.10.242
        ports:
        - name: http-metrics
          port: 10252
          protocol: TCP

      controller-service.yaml

    apiVersion: v1
    kind: Service
    metadata:
      name: kube-controller-manager-monitoring
      namespace: kube-system
      labels:
        k8s-app: kube-controller-manager
    spec:
      ports:
      - port: 10252
        name: http-metrics
        protocol: TCP
      type: ClusterIP
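
    A Service without a selector gets no Endpoints generated for it, which is why the Endpoints object is written by hand and shares the Service's name. With several control-plane nodes, the address list can be generated rather than typed. A sketch; render_endpoints is a hypothetical helper, and the port name http-metrics is taken from the manifests above:

```shell
#!/bin/sh
# render_endpoints NAME LABEL PORT IP...: print an Endpoints manifest for kube-system.
# NAME must equal the Service name so Kubernetes associates the two objects.
render_endpoints() {
    name="$1"; label="$2"; port="$3"
    shift 3
    cat <<EOF
apiVersion: v1
kind: Endpoints
metadata:
  name: $name
  namespace: kube-system
  labels:
    k8s-app: $label
subsets:
  - addresses:
EOF
    # One address entry per remaining argument.
    for ip in "$@"; do
        printf '    - ip: %s\n' "$ip"
    done
    cat <<EOF
    ports:
    - name: http-metrics
      port: $port
      protocol: TCP
EOF
}
```

    For example, render_endpoints kube-controller-manager-monitoring kube-controller-manager 10252 192.168.10.240 192.168.10.241 192.168.10.242 prints a manifest equivalent to controller-endpoint.yaml.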

    Create them:

    # kubectl create -f .

    Verify:

    # kubectl get svc,ep -n kube-system -l k8s-app=kube-controller-manager
    NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
    service/kube-controller-manager-monitoring   ClusterIP   10.102.204.13   <none>        10252/TCP   44m
    
    NAME                                           ENDPOINTS                                                        AGE
    endpoints/kube-controller-manager-monitoring   192.168.10.240:10252,192.168.10.241:10252,192.168.10.242:10252   44m


    Also edit the controller-manager startup configuration, here the systemd unit file:

    /usr/lib/systemd/system/kube-controller-manager.service

    # Change the listen address
    --address=0.0.0.0

    Restart controller-manager.
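
    Because the flag lives in a systemd unit file, systemd has to reload its unit definitions before the restart picks up the change. A sketch, assuming the unit is named kube-controller-manager as the file path above suggests; restart_controller_manager is a hypothetical helper name:

```shell
#!/bin/sh
# restart_controller_manager: reload unit files, restart, then confirm the unit is up.
restart_controller_manager() {
    # Re-read unit files so the edited --address flag takes effect.
    systemctl daemon-reload || return 1
    systemctl restart kube-controller-manager || return 1
    # Exit status is 0 only if the unit is active after the restart.
    systemctl is-active --quiet kube-controller-manager
}
```

    Run it as root on each node where controller-manager runs.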

     

    Test:

    # curl 127.0.0.1:10252
    404 page not found
    
    # curl 10.102.204.13:10252
    404 page not found

    Requests to the local port and to the controller-manager Service port return the same result.
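
    The 404 only shows that the port answers; nothing is served at the root path. The metrics themselves are exposed under /metrics, the default path a ServiceMonitor endpoint scrapes. A sketch for checking that metrics are actually returned, assuming the plain-HTTP port 10252 is enabled as in this setup; count_metric_families is a hypothetical helper:

```shell
#!/bin/sh
# count_metric_families HOST PORT: print how many metric families /metrics exposes.
count_metric_families() {
    # -s: no progress output; -f: treat HTTP errors such as 404 as failures.
    # Each metric family in the Prometheus text format has one "# TYPE" line.
    curl -sf "http://$1:$2/metrics" | grep -c '^# TYPE '
}
```

    count_metric_families 127.0.0.1 10252 and count_metric_families 10.102.204.13 10252 should print the same non-zero number if the Service forwards to a working controller-manager.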

    Check the result in Prometheus:

      

    Applying the same changes to the scheduler configuration makes the scheduler's metrics available as well.
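
    For the scheduler, the analogous objects might look as follows. This is a sketch under assumptions not stated in the article: that the kube-scheduler ServiceMonitor selects k8s-app: kube-scheduler with port name http-metrics (verify with kubectl get servicemonitor kube-scheduler -n monitoring -o yaml), that the scheduler serves plain-HTTP metrics on its legacy default port 10251, and that it runs on the same three nodes. The object name kube-scheduler-monitoring and the helper name are made up for illustration:

```shell
#!/bin/sh
# render_scheduler_monitoring: print a Service and Endpoints pair for kube-scheduler.
# Label, port name, port 10251 and the node IPs are assumptions; adjust to the cluster.
render_scheduler_monitoring() {
    cat <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
spec:
  type: ClusterIP
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-scheduler-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
subsets:
  - addresses:
    - ip: 192.168.10.240
    - ip: 192.168.10.241
    - ip: 192.168.10.242
    ports:
    - name: http-metrics
      port: 10251
      protocol: TCP
EOF
}
```

    As with controller-manager, the scheduler's listen address also has to be changed from the loopback default before the Endpoints become reachable.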

  • Original article: https://www.cnblogs.com/bigberg/p/14009940.html