  • Prometheus Metrics Collection Tools

    Monitoring Tools

    cAdvisor

    The recommended tool for monitoring containers, open-sourced by Google. In Kubernetes it does not need to be installed separately: cAdvisor is built into the kubelet and can be used directly. It mainly collects container CPU, memory, disk, network, load, and similar metrics.

    node-exporter

    A host monitoring tool that collects host CPU, memory, disk, network, availability, and similar metrics.

    kube-state-metrics

    It listens to the API Server and generates state metrics about resource objects, for example: how many Pod replicas a Deployment has scheduled, how many are currently available, how many Pods are in the Running, stopped, or terminated state, how many times a Pod has restarted, and so on.
    Note that kube-state-metrics only exposes metrics data; it does not store it, so a backend database is needed to persist it. Also, the names and labels of the metrics it collects are not fixed and may change, so configure them flexibly according to your environment.

    metrics-server: one of the core monitoring components

    metrics-server is an aggregator of cluster-wide resource usage data and implements the Resource Metrics API. It collects metrics from the Summary API exposed by each kubelet. As of Kubernetes 1.16, Heapster, the old cluster resource monitoring component, has been retired, and metrics-server is used instead.

    It collects resource metrics from the kubelet, aggregates them (relying on kube-aggregator), and exposes them through the Kubernetes API server via the Metrics API (/apis/metrics.k8s.io/). Note that metrics-server stores only the latest metric values (CPU/memory) and does not forward metrics to any third-party sink. Using metrics-server requires some cluster configuration that is not set up by default at installation time, so your cluster must meet the following requirements:

    • Your kube-apiserver must be able to reach metrics-server

    • The aggregation layer must be enabled on kube-apiserver

    • Components must have authentication configured and be bound to metrics-server

    • Pod/Node metrics must be exposed by the kubelet via the Summary API

    git clone https://github.com/kubernetes-sigs/metrics-server.git
    

    Enabling the aggregation layer requires changing several kube-apiserver configuration options; see the official documentation on enabling the aggregation layer:

    --requestheader-client-ca-file=<path to aggregator CA cert>
    --requestheader-allowed-names=front-proxy-client
    --requestheader-extra-headers-prefix=X-Remote-Extra-
    --requestheader-group-headers=X-Remote-Group
    --requestheader-username-headers=X-Remote-User
    --proxy-client-cert-file=<path to aggregator proxy cert>
    --proxy-client-key-file=<path to aggregator proxy key>
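
    To double-check that these flags are actually in effect on a running control-plane node, you can inspect the live kube-apiserver process; a small sketch:

    # List the aggregation-layer flags of the running kube-apiserver
    ps -ef | grep '[k]ube-apiserver' | tr ' ' '\n' | grep -E 'requestheader|proxy-client|aggregator'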
    

    Several components in a Kubernetes cluster depend on the resource metrics API (Metrics API), such as kubectl top, HPA, and VPA. Without the resource metrics API, these components cannot work.
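
    For example, both of the following rely on the Metrics API and will fail until metrics-server is running (the Deployment name web is a hypothetical placeholder):

    # Read current usage through the Metrics API
    kubectl top node

    # Create an HPA that scales on CPU usage reported by the Metrics API
    kubectl autoscale deployment web --cpu-percent=50 --min=1 --max=5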

    Deploying metrics-server in a Kubernetes cluster

    # mkdir ./metrics-server  
    # cd $_
    # for file in aggregated-metrics-reader.yaml auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml; do  wget https://raw.githubusercontent.com/kubernetes-sigs/metrics-server/master/deploy/kubernetes/$file;done
    

    Modify the metrics-server-deployment.yaml manifest

     containers:
     - name: metrics-server
       image: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6
       command:
          - /metrics-server
          - --v=4   # verbose logging for debugging; you can lower it to 2
          - --kubelet-insecure-tls
          - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
       imagePullPolicy: Always
    

    Apply the modified metrics-server manifests

    # kubectl apply -f .

    Verification (see the detailed metrics-server section below)

    Dedicated third-party exporters

    There are also many dedicated exporters, such as the MySQL exporter, the Redis exporter, and so on.

    cAdvisor

    Overview

    Prometheus offers several ways to monitor Docker containers, including some custom exporters, but these are generally not used; the recommended tool is Google's cAdvisor, an open-source monitoring and performance-analysis tool built specifically for containers. It can be run as a standalone container to collect metrics, but in a Kubernetes cluster there is no need to install it separately: cAdvisor is built into the kubelet and can be used directly to collect all container runtime metrics. Through the API server proxy, the scrape path is /api/v1/nodes/[node name]/proxy/metrics/cadvisor; when scraping the kubelet directly, the path is https://127.0.0.1:10250/metrics/cadvisor.
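
    For instance, the proxy path can be queried with kubectl get --raw; a sketch using the master01 node of this cluster (requires sufficient RBAC permissions):

    # Fetch one node's cAdvisor metrics through the API server proxy
    kubectl get --raw /api/v1/nodes/master01/proxy/metrics/cadvisor | head -20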

    The demonstration below targets Kubernetes. Because the kubelet serves over HTTPS, an authenticated account is required to access it, so we first create a ServiceAccount:

    # Create a dedicated monitoring namespace: monitor
    [root@master01 ~]# kubectl create ns monitor
    namespace/monitor created
    
    # Create a ServiceAccount
    [root@master01 ~]# kubectl create serviceaccount monitor -n monitor
    serviceaccount/monitor created
    
    # Check the secrets generated after creating the SA
    [root@master01 ~]# kubectl get secret -n monitor
    NAME                  TYPE                                  DATA   AGE
    default-token-kdrzm   kubernetes.io/service-account-token   3      34s
    monitor-token-2ktr2   kubernetes.io/service-account-token   3      18s
    [root@master01 ~]#
    
    # Bind the monitor SA to the highest cluster role (cluster-admin)
    [root@master01 ~]# kubectl create clusterrolebinding monitor-cluster -n monitor --clusterrole=cluster-admin --serviceaccount=monitor:monitor
    clusterrolebinding.rbac.authorization.k8s.io/monitor-cluster created
    

    Verification

    Use the token of the ServiceAccount monitor created above to access the kubelet on port 10250 and verify:

    [root@master01 ~]# kubectl describe secret monitor-token-2ktr2 -n monitor
    Name: monitor-token-2ktr2
    Namespace: monitor
    Labels:       <none>
    Annotations:  kubernetes.io/service-account.name: monitor
                  kubernetes.io/service-account.uid: 718326e6-57ec-490c-9fcb-60698acca518
    
    Type: kubernetes.io/service-account-token
    
    Data
    ====
    ca.crt:     1025 bytes
    namespace: 7 bytes
    token: eyJhbGciOiJSUzI1NiIsImtpZCI6IlZ2bGJjaEN2MjFwazRmLUNWdkxBYVoxUHBleTBCUFBzWW0xU25uMGM1Y3MifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJtb25pdG9yIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6Im1vbml0b3ItdG9rZW4tMmt0cjIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoibW9uaXRvciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjcxODMyNmU2LTU3ZWMtNDkwYy05ZmNiLTYwNjk4YWNjYTUxOCIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDptb25pdG9yOm1vbml0b3IifQ.cVml5Of1fZxyv-hRUKnqWWNK_52_btbdISvmP1Fw6Um-D9kqq5CieymC4f5KHVdxdJnA_-54ih3No5VUfetefBryh06yX_Qr01k0TGKKU_MwXcTgKgKs1Ydet7cS3VTBgZHNERdvHmK_phSnwEA87zJUkQNIMWPjTzsAUVlk0nve60MF-EohI_RqxILntlSKRpI5X5WG1p_IT7NebA5UYeKDYoabI9-YqoEPQd6XQ6Lfc5nf_tC1gUMExyaczVZTrsxjnpsZl5cFpAGg1b4NNixTLRbqWdeuu1uV5i_WJTlYMsfPNCvb2eP8KC9d0DE8UMSDNMwrehYyrmviAGqKVQ
    [root@master01 ~]#
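
    Rather than copying the token by hand, it can be extracted into a shell variable; a sketch assuming the ServiceAccount has exactly one token secret:

    # Look up the SA's token secret name, then decode the bearer token
    SECRET=$(kubectl -n monitor get sa monitor -o jsonpath='{.secrets[0].name}')
    TOKEN=$(kubectl -n monitor get secret "$SECRET" -o jsonpath='{.data.token}' | base64 -d)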
    

    Access port 10250 exposed by the kubelet:

    [root@master01 ~]# curl https://127.0.0.1:10250/metrics/cadvisor -k -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlZ2bGJjaEN2MjFwazRmLUNWdkxBYVoxUHBleTBCUFBzWW0xU25uMGM1Y3MifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJtb25pdG9yIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6Im1vbml0b3ItdG9rZW4tMmt0cjIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoibW9uaXRvciIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjcxODMyNmU2LTU3ZWMtNDkwYy05ZmNiLTYwNjk4YWNjYTUxOCIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDptb25pdG9yOm1vbml0b3IifQ.cVml5Of1fZxyv-hRUKnqWWNK_52_btbdISvmP1Fw6Um-D9kqq5CieymC4f5KHVdxdJnA_-54ih3No5VUfetefBryh06yX_Qr01k0TGKKU_MwXcTgKgKs1Ydet7cS3VTBgZHNERdvHmK_phSnwEA87zJUkQNIMWPjTzsAUVlk0nve60MF-EohI_RqxILntlSKRpI5X5WG1p_IT7NebA5UYeKDYoabI9-YqoEPQd6XQ6Lfc5nf_tC1gUMExyaczVZTrsxjnpsZl5cFpAGg1b4NNixTLRbqWdeuu1uV5i_WJTlYMsfPNCvb2eP8KC9d0DE8UMSDNMwrehYyrmviAGqKVQ" | more
      % Total % Received % Xferd Average Speed Time Time Time Current
                                     Dload Upload Total Spent Left Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP cadvisor_version_info A metric with a constant '1' value labeled by kernel version, OS version, docker version, cadvisor version & cadvisor revision.
    # TYPE cadvisor_version_info gauge
    cadvisor_version_info{cadvisorRevision="",cadvisorVersion="",dockerVersion="19.03.8",kernelVersion="3.10.0-1062.12.1.el7.x86_64",osVersion="CentOS Linux 7 (Core)"} 1
    # HELP container_cpu_load_average_10s Value of container cpu load average over the last 10 seconds.
    # TYPE container_cpu_load_average_10s gauge
    container_cpu_load_average_10s{container="",id="/",image="",name="",namespace="",pod=""} 0 1585634068599
    container_cpu_load_average_10s{container="",id="/kubepods",image="",name="",namespace="",pod=""} 0 1585634068611
    container_cpu_load_average_10s{container="",id="/kubepods/besteffort",image="",name="",namespace="",pod=""} 0 1585634073752
    ...
    

    The operations above show that container metrics can now be accessed. There are many metrics, and each metric is preceded by two comment lines, for example:

    # HELP container_cpu_load_average_10s Value of container cpu load average over the last 10 seconds.

    # TYPE container_cpu_load_average_10s gauge

    The first line is the HELP text describing the metric;

    The second line is the metric TYPE: gauge, histogram, summary, counter, and so on.
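
    With the token from earlier in a $TOKEN variable, a single metric family can be pulled together with its HELP and TYPE comment lines; a quick sketch:

    # Show one metric family plus its HELP/TYPE comments
    curl -sk -H "Authorization: Bearer $TOKEN" https://127.0.0.1:10250/metrics/cadvisor \
      | grep container_cpu_load_average_10s | head -5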

    node-exporter

    Installation

    Here node-exporter is deployed as a Pod using a DaemonSet, so one instance runs on every Kubernetes node and it is easy to maintain. The manifest is as follows:

    [root@master01 monitor]# cat node-exporter.yaml
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: node-exporter
      namespace: monitor
      labels:
        name: node-exporter
    spec:
      selector:
        matchLabels:
          name: node-exporter
      template:
        metadata:
          labels:
            name: node-exporter
        spec:
          hostPID: true
          hostIPC: true
          hostNetwork: true
          containers:
          - name: node-exporter
            image: prom/node-exporter:latest
            ports:
            - containerPort: 9100
            resources:
              requests:
                cpu: 0.15
            securityContext:
              privileged: true
            args:
            - --path.procfs
            - /host/proc
            - --path.sysfs
            - /host/sys
            - --collector.filesystem.ignored-mount-points
            - '"^/(sys|proc|dev|host|etc)($|/)"'
            volumeMounts:
            - name: dev
              mountPath: /host/dev
            - name: proc
              mountPath: /host/proc
            - name: sys
              mountPath: /host/sys
            - name: rootfs
              mountPath: /rootfs
          tolerations:
          - key: "node-role.kubernetes.io/master"
            operator: "Exists"
            effect: "NoSchedule"
          volumes:
            - name: proc
              hostPath:
                path: /proc
            - name: dev
              hostPath:
                path: /dev
            - name: sys
              hostPath:
                path: /sys
            - name: rootfs
              hostPath:
                path: /
    

    Since hostNetwork is set to true, the Pod uses the host's network namespace, so node-exporter listens on port 9100 on each host.

    Verification

    # Create the DaemonSet
    [root@master01 monitor]# kubectl apply -f node-exporter.yaml
    daemonset.apps/node-exporter created
    [root@master01 monitor]#
    
    # Verify
    [root@master01 monitor]# curl http://127.0.0.1:9100/metrics|more
      % Total % Received % Xferd Average Speed Time Time Time Current
                                     Dload Upload Total Spent Left Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP go_gc_duration_seconds A summary of the GC invocation durations.
    # TYPE go_gc_duration_seconds summary
    go_gc_duration_seconds{quantile="0"} 0
    go_gc_duration_seconds{quantile="0.25"} 0
    go_gc_duration_seconds{quantile="0.5"} 0
    go_gc_duration_seconds{quantile="0.75"} 0
    go_gc_duration_seconds{quantile="1"} 0
    go_gc_duration_seconds_sum 0
    go_gc_duration_seconds_count 0
    # HELP go_goroutines Number of goroutines that currently exist.
    # TYPE go_goroutines gauge
    go_goroutines 6
    

    Check the Pods

    [root@master01 monitor]# kubectl get pods -n monitor -o wide
    NAME                  READY   STATUS    RESTARTS   AGE   IP               NODE       NOMINATED NODE   READINESS GATES
    node-exporter-c67rd   1/1     Running   0          11m   172.31.117.228   node01     <none>           <none>
    node-exporter-jrzfx   1/1     Running   0          11m   172.31.117.227   master03   <none>           <none>
    node-exporter-mqsw5   1/1     Running   0          11m   172.31.117.225   master01   <none>           <none>
    node-exporter-zhnl4   1/1     Running   0          11m   172.31.117.226   master02   <none>           <none>
    

    As shown above, host CPU, memory, load, network traffic, filesystem, and other metrics are now being exposed on every node, ready for Prometheus to collect later.
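
    When Prometheus is set up later, a static scrape job like the following sketch would collect from these endpoints; the target IPs are the node addresses shown above, the job name is arbitrary, and the snippet is assumed to sit under an existing scrape_configs: section of prometheus.yml:

    # Append a static node-exporter scrape job (sketch)
    cat >> prometheus.yml <<'EOF'
      - job_name: 'node-exporter'
        static_configs:
        - targets: ['172.31.117.225:9100','172.31.117.226:9100','172.31.117.227:9100','172.31.117.228:9100']
    EOF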

    kube-state-metrics

    kube-state-metrics listens to the kube-apiserver and generates metrics about the state of resource objects, mainly Node, Pod, Service, Endpoint, Namespace, and other resources. Note that kube-state-metrics only exposes the metrics; it does not store them, and Prometheus can later scrape and store this data. Its focus is the metadata of workload resources.

    A ServiceAccount with the appropriate authorization bindings is needed here as well.

    [root@master01 monitor]# cat kube-state-metrics-rbac.yaml
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: kube-state-metrics
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: kube-state-metrics
    rules:
    - apiGroups: [""]
      resources: ["nodes", "pods", "services", "resourcequotas", "replicationcontrollers", "limitranges", "persistentvolumeclaims", "persistentvolumes", "namespaces", "endpoints"]
      verbs: ["list", "watch"]
    - apiGroups: ["extensions"]
      resources: ["daemonsets", "deployments", "replicasets"]
      verbs: ["list", "watch"]
    - apiGroups: ["apps"]
      resources: ["statefulsets"]
      verbs: ["list", "watch"]
    - apiGroups: ["batch"]
      resources: ["cronjobs", "jobs"]
      verbs: ["list", "watch"]
    - apiGroups: ["autoscaling"]
      resources: ["horizontalpodautoscalers"]
      verbs: ["list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: kube-state-metrics
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: kube-state-metrics
    subjects:
    - kind: ServiceAccount
      name: kube-state-metrics
      namespace: kube-system
    

    Create the Deployment and Service manifest

    [root@master01 monitor]# cat kube-state-metrics-deployment-svc.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: kube-state-metrics
      namespace: kube-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: kube-state-metrics
      template:
        metadata:
          labels:
            app: kube-state-metrics
        spec:
          serviceAccountName: kube-state-metrics
          containers:
          - name: kube-state-metrics
            image: quay.io/coreos/kube-state-metrics:v1.9.5
            ports:
            - containerPort: 8080
    
    ---
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
      name: kube-state-metrics
      namespace: kube-system
      labels:
        app: kube-state-metrics
    spec:
      ports:
      - name: kube-state-metrics
        port: 8080
        protocol: TCP
      selector:
        app: kube-state-metrics
    

    Deploy and check

    [root@master01 monitor]# kubectl apply -f kube-state-metrics-rbac.yaml
    serviceaccount/kube-state-metrics created
    clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
    clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
    [root@master01 monitor]# kubectl apply -f kube-state-metrics-deployment-svc.yaml
    deployment.apps/kube-state-metrics created
    service/kube-state-metrics created
    [root@master01 monitor]#
    
    # Check the deployment
    [root@master01 monitor]# kubectl get clusterrolebinding |grep kube-state
    kube-state-metrics ClusterRole/kube-state-metrics 4m4s
    [root@master01 monitor]#
    [root@master01 monitor]# kubectl get pods -n kube-system |grep kube-state-metrics
    kube-state-metrics-84b8477f75-65gcg 1/1     Running 0          4m26s
    

    Verification

    After Prometheus is installed later, many metrics beginning with kube_ will show up among the collected data; they are all scraped from kube-state-metrics.
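
    The endpoint can also be checked directly before Prometheus is in place; a quick sketch using a temporary port-forward:

    # Forward the Service locally and sample a few kube_ metrics
    kubectl -n kube-system port-forward svc/kube-state-metrics 8080:8080 &
    sleep 2
    curl -s http://127.0.0.1:8080/metrics | grep '^kube_deployment' | head -5
    kill $!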

    metrics-server

    Preparation

    In earlier versions, cluster monitoring used Heapster, whose metrics powered HPA, VPA, kubectl top, and the like; in newer versions it has been replaced by metrics-server (the rationale is easy to look up). metrics-server is one of the core components of the Kubernetes monitoring stack: it collects Pod/Node resource metrics from the kubelet, aggregates them, and finally exposes them through the kube-apiserver via the Metrics API (/apis/metrics.k8s.io/). metrics-server stores only the latest metric values (CPU/memory) and does not forward them to any third-party sink. To use metrics-server, the cluster needs some configuration that is not applied by default:

    • kube-apiserver must be able to reach metrics-server;

    • the aggregation layer must be enabled among the kube-apiserver startup flags;

    • components must have authentication configured and be bound to metrics-server;

    • Pod/Node metrics must be exposed by the kubelet via the Summary API.

    [root@master01 ~]# cd /etc/kubernetes/manifests/
    [root@master01 manifests]# ls
    kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml
    [root@master01 manifests]# pwd
    /etc/kubernetes/manifests
    

    If your cluster was installed this way, go into the directory above and edit kube-apiserver.yaml. The main change is adding - --enable-aggregator-routing=true; the other flags should already be present by default:

    ...
        - --requestheader-allowed-names=front-proxy-client
        - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
        - --requestheader-extra-headers-prefix=X-Remote-Extra-
        - --requestheader-group-headers=X-Remote-Group
        - --requestheader-username-headers=X-Remote-User
        - --enable-aggregator-routing=true
    ...
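
    kube-apiserver runs as a static Pod here, so the kubelet restarts it automatically once the manifest is saved. A quick sanity check (the Pod name kube-apiserver-master01 follows the usual kube-apiserver-<node> convention, and the component label is standard for static Pods):

    # Confirm the apiserver came back and the flag is present
    kubectl -n kube-system get pods -l component=kube-apiserver
    kubectl -n kube-system get pod kube-apiserver-master01 -o yaml | grep enable-aggregator-routing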
    

    Download the source

    git clone https://github.com/kubernetes-sigs/metrics-server.git
    

    Installation

    # The download directory contains the following files; have a look
    [root@master01 kubernetes]# pwd
    /root/monitor/metrics-server/deploy/kubernetes
    [root@master01 kubernetes]# ll
    total 28
    -rw-r--r-- 1 root root  397 Mar 31 14:23 aggregated-metrics-reader.yaml
    -rw-r--r-- 1 root root  303 Mar 31 14:23 auth-delegator.yaml
    -rw-r--r-- 1 root root  324 Mar 31 14:23 auth-reader.yaml
    -rw-r--r-- 1 root root  298 Mar 31 14:23 metrics-apiservice.yaml
    -rw-r--r-- 1 root root 1184 Mar 31 14:23 metrics-server-deployment.yaml
    -rw-r--r-- 1 root root  297 Mar 31 14:23 metrics-server-service.yaml
    -rw-r--r-- 1 root root  532 Mar 31 14:23 resource-reader.yaml
    [root@master01 kubernetes]#
    
    # Deploy
    [root@master01 kubernetes]# kubectl apply -f .
    clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
    clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
    rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
    apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
    serviceaccount/metrics-server created
    deployment.apps/metrics-server created
    service/metrics-server created
    clusterrole.rbac.authorization.k8s.io/system:metrics-server created
    clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
    [root@master01 kubernetes]#
    

    Pitfall 1

    [root@master01 kubernetes]# kubectl top node
    error: metrics not available yet
    [root@master01 kubernetes]#
    
    # Check the error log
    unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:master01: unable to fetch metrics from Kubelet master01 (master01): Get https://master01:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup master01 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:master03: unable to fetch metrics from Kubelet master03 (master03): Get https://master03:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup master03 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:master02: unable to fetch metrics from Kubelet master02 (master02): Get https://master02:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup master02 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:node01: unable to fetch metrics from Kubelet node01 (node01): Get https://node01:10250/stats/summary?only_cpu_and_memory=true: dial tcp: lookup node01 on 10.96.0.10:53: no such host]
    

    The fix for this pitfall is the - --kubelet-insecure-tls flag: edit metrics-server-deployment.yaml to add it, then delete and recreate the Deployment, as sketched below.
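
    A minimal sketch of the delete-and-recreate step, run from the manifest directory used earlier:

    # Re-create metrics-server with the new kubelet flag
    kubectl delete -f metrics-server-deployment.yaml
    kubectl apply -f metrics-server-deployment.yaml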

    Pitfall 2

    [root@master01 kubernetes]# kubectl top node
    Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
    [root@master01 kubernetes]#
    
    [root@master01 kubernetes]# kubectl logs -f metrics-server-64b57fd654-bt6fx -n kube-system
    E0331 07:03:59.658787       1 reststorage.go:135] unable to fetch node metrics for node "master03": no metrics known for node
    E0331 07:03:59.658793       1 reststorage.go:135] unable to fetch node metrics for node "node01": no metrics known for node
    ...
    
    [root@master01 kubernetes]#
    

    The fix is to add the - --kubelet-preferred-address-types=InternalIP startup flag: modify metrics-server-deployment.yaml so the args end up as shown below, then delete and recreate.

    ...
              - --cert-dir=/tmp
              - --secure-port=4443
              - --kubelet-preferred-address-types=InternalIP
              - --kubelet-insecure-tls
    ...
    

    Verification

    Note that it may error out at first; wait a moment and try again, and check the logs promptly whenever something fails.
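
    Two quick checks when kubectl top keeps failing; the APIService name below is the v1beta1.metrics.k8s.io object created by the manifests above:

    # Is the Metrics API registered and marked Available?
    kubectl get apiservice v1beta1.metrics.k8s.io

    # Tail the metrics-server logs for scrape errors
    kubectl -n kube-system logs deploy/metrics-server --tail=20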

    [root@master01 kubernetes]# kubectl top pods
    W0331 15:07:34.977285   30613 top_pod.go:274] Metrics not available for pod default/default-deployment-nginx-fffdfd45-vh8sc, age: 3h45m6.977273348s
    error: Metrics not available for pod default/default-deployment-nginx-fffdfd45-vh8sc, age: 3h45m6.977273348s
    [root@master01 kubernetes]#
    
    [root@master01 ~]# kubectl top node
    NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    master01   179m         8%     2263Mi          61%
    master02   139m         6%     2184Mi          59%
    master03   146m         7%     2280Mi          61%
    node01     107m         5%     1825Mi          49%
    [root@master01 ~]# kubectl top pods
    NAME                                      CPU(cores)   MEMORY(bytes)
    default-deployment-nginx-fffdfd45-vh8sc   0m           1Mi
    [root@master01 ~]#
    




