  • k8s prometheus api

    Rancher API (proxy URLs into a downstream cluster):

    https://10.10.10.90/k8s/clusters/c-x985z/api/v1/namespaces/cattle-prometheus/services/expose-kubernetes-metrics:8080/proxy/

    https://10.11.30.119/k8s/clusters/c-k8598/api/v1/namespaces/ilsuat/pods/http:ils-aiservices-6bfbcbff85-s64hx:5000/proxy/ai/
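The first URL above follows the Kubernetes API-server service proxy pattern, prefixed with Rancher's per-cluster path. A minimal sketch of how such a URL is assembled; the function name `rancher_service_proxy_url` is hypothetical, and the argument values are simply the ones from the first URL:

```shell
# Hypothetical helper: print the Rancher proxy URL for a service.
# Arguments: rancher-host cluster-id namespace service port
rancher_service_proxy_url() {
  printf 'https://%s/k8s/clusters/%s/api/v1/namespaces/%s/services/%s:%s/proxy/\n' \
    "$1" "$2" "$3" "$4" "$5"
}

rancher_service_proxy_url 10.10.10.90 c-x985z cattle-prometheus expose-kubernetes-metrics 8080
```

The pod variant in the second URL has the same shape, with `pods/http:<pod-name>:<port>` in place of `services/<service>:<port>` and the application path appended after `proxy/`.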

    Prometheus config, example 1 (scrape cAdvisor on every node through the kubelet):

    global:
      scrape_interval:     15s
      evaluation_interval: 15s
    
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
    
    rule_files:
    
    scrape_configs:
    - job_name: alexcadvisor
      static_configs:
        - targets: ["10.10.10.68:10250","10.10.10.95:10250","10.10.10.96:10250","10.10.10.211:10250","10.10.10.212:10250","10.10.10.217:10250","10.10.10.216:10250","10.10.10.18:10250","10.10.10.19:10250","10.10.10.20:10250"]
      scrape_interval: 15s
      scrape_timeout: 10s
      metrics_path: /metrics/cadvisor
      scheme: https
      bearer_token_file: /root/prometheus-2.22.0.linux-amd64/token
      tls_config:
        insecure_skip_verify: true
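The static `targets` list in this job is written out by hand. As a rough sketch, it could be generated from a list of node IPs; the helper name `kubelet_targets` is hypothetical, and only two of the IPs from the config are used to keep the example short:

```shell
# Hypothetical helper: format node IPs as a Prometheus static targets list.
# Arguments: port ip [ip ...]
kubelet_targets() {
  port="$1"; shift
  out=""
  for ip in "$@"; do
    # Append a comma only when the list already has an entry.
    out="${out:+$out,}\"${ip}:${port}\""
  done
  printf '[%s]\n' "$out"
}

kubelet_targets 10250 10.10.10.68 10.10.10.95
# prints ["10.10.10.68:10250","10.10.10.95:10250"]
```

On a machine with `kubectl` access, the IPs could come from something like `kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'`; that step is an assumption and is not part of the original setup.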

    Example 2: scrape node_exporter on every node (discovered through Kubernetes endpoints)

    - job_name: alexnodeexporter 
      scrape_interval: 1m
      scrape_timeout: 10s
      metrics_path: /metrics
      bearer_token_file: /root/prometheus-2.22.0.linux-amd64/token
      scheme: http
      kubernetes_sd_configs:
      - role: endpoints
        api_server: 'https://10.10.10.68:6443'
        namespaces:
          names:
          - cattle-prometheus
        tls_config:
          insecure_skip_verify: true
        bearer_token_file: /root/prometheus-2.22.0.linux-amd64/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_label_app]
        separator: ;
        regex: exporter-node
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_chart]
        separator: ;
        regex: exporter-node-0.0.1
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_monitoring_coreos_com]
        separator: ;
        regex: "true"
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_release]
        separator: ;
        regex: cluster-monitoring
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        separator: ;
        regex: metrics
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Node;(.*)
        target_label: node
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Pod;(.*)
        target_label: pod
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_namespace]
        separator: ;
        regex: (.*)
        target_label: namespace
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_name]
        separator: ;
        regex: (.*)
        target_label: service
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_pod_name]
        separator: ;
        regex: (.*)
        target_label: pod
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_name]
        separator: ;
        regex: (.*)
        target_label: job
        replacement: ${1}
        action: replace
      - separator: ;
        regex: (.*)
        target_label: endpoint
        replacement: metrics
        action: replace
      - source_labels: [__meta_kubernetes_pod_host_ip]
        separator: ;
        regex: (.+)
        target_label: host_ip
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_pod_node_name]
        separator: ;
        regex: (.+)
        target_label: node
        replacement: $1
        action: replace

    curl -k https://10.10.10.68:6443/api/v1/nodes -H "Authorization: Bearer token-mj6c7:8m2zlxp5qhr25hh8dtzlrl8cn472wws94m9ntbkggqt8x9sfg7q4w4"

    curl -k https://10.10.10.68:6443/api/v1/nodes  --cacert kube-ca.pem --cert alex.pem --key alexkey.pem

    curl -k https://10.10.10.68:6443/api/v1/nodes  --cacert kube-ca.pem --cert kube-node.pem --key kube-node-key.pem

    curl -k https://10.10.10.68:6443/api/v1/nodes  --cacert kube-ca.pem --cert kube-controller-manager.pem  --key kube-controller-manager-key.pem 

    curl -k https://10.10.10.68:6443/api/v1/nodes  --cacert kube-ca.pem --cert  kube-scheduler.pem  --key  kube-scheduler-key.pem

    curl -k https://10.10.10.96:10250/metrics  --cacert kube-ca.pem --cert kube-service-account-token.pem --key kube-service-account-token-key.pem 

    curl -k https://10.10.10.68:10250/metrics/cadvisor --cacert kube-ca.pem -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJjYXR0bGUtcHJvbWV0aGV1cyIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJjbHVzdGVyLW1vbml0b3JpbmctdG9rZW4tbjlmbjIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiY2x1c3Rlci1tb25pdG9yaW5nIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiOTcxNjY1NmQtNWRmMS0xMWVhLTk1YzktMDAxNTVkMGEzNjAxIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmNhdHRsZS1wcm9tZXRoZXVzOmNsdXN0ZXItbW9uaXRvcmluZyJ9.KcO1v8qfhCeXRT3zSG2lckl3bqzofFFhM2pZEum02u3PS7m4anQw6ldP806ncme21JH0Hq0SrjscFxvrkDaKnOPR3eX2dqoQxyXyN-t7jJ9B1YHAAOanLVYfUiXUm7EJMekgsAVac9aueAIwzfFtkERK-kvHYsHvSC0nOIBUxSjZs4YfhZbf3ys-tyZB5sspM5_P_P54NQQJD2B-sn-3VJuFWTE2Wy_pa3D6kdjywG_9_T5yBHFXAQ2dneLOcqfUUoox2q-4gRWslv0Dziy1DwwAQiZA6uMZYkIKN_ngueynoxKg4d2OIVYGiHqzzBFllAKysvKIZ7uVPs4RLkqPqA"
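Pasting a long token on the command line is error-prone. A small sketch that reads it from the same file the Prometheus jobs reference via `bearer_token_file`; the helper name `bearer_header` is hypothetical:

```shell
# Hypothetical helper: build an Authorization header from a token file,
# stripping any trailing newline or carriage return.
bearer_header() {
  printf 'Authorization: Bearer %s' "$(tr -d '\r\n' < "$1")"
}

# Usage (not executed here; host and path are the ones used above):
#   curl -k -H "$(bearer_header /root/prometheus-2.22.0.linux-amd64/token)" \
#     https://10.10.10.68:10250/metrics/cadvisor
```

Stripping the line ending matters because a stray newline inside the header value can make the request fail.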

    container_memory_cache{namespace="local-ils",image!~".+pause.+",container_name=~"ils-system"}

    Sort by memory usage, descending:

    sort_desc(container_memory_usage_bytes{namespace="local-boss",image!~".+pause.+",namespace=~".+boss.*" ,image=~".+",container!~"filebeat"})

    Memory used per pod:

    container_memory_working_set_bytes{container!='POD',namespace=~'.*local.*|.*rc.*'}

    Pod network traffic:

    sum (rate (container_network_receive_bytes_total{image!="",name=~"^k8s_.*",node=~"^$Node$"}[5m])) by (pod_name)    # receive

    - sum (rate (container_network_transmit_bytes_total{image!="",name=~"^k8s_.*",node=~"^$Node$"}[5m])) by (pod_name)    # transmit

    Number of running pods:

    sum(kube_pod_status_phase{namespace=~".*", phase="Running"})

    STO (申通), one-year token, valid until 202112130:

    token-4xncr:lkpddtfmmkpskqm52mpwn94cg68vxmrtfd7s8sllr84d5z87c5q97n

    Grafana configuration:

    Final result (screenshot):

    Two variables:

    Memory usage: ram

    Real-time CPU load:

    Grafana CPU utilization:

    100 - (avg by (instance) (rate(node_cpu_seconds_total{ mode="idle"}[5m])) * 100)
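For the `idle` mode, `rate(...)` is the fraction of each second a core spent idle, so the expression above is 100 minus the average idle percentage across cores. A worked example of the arithmetic, with made-up sample numbers and a hypothetical helper name `cpu_usage_pct`:

```shell
# Hypothetical helper: CPU usage percentage from idle time.
# $1: increase of the idle counter in seconds over the window (averaged over cores)
# $2: window length in seconds (300 for [5m])
cpu_usage_pct() {
  awk -v idle="$1" -v win="$2" 'BEGIN { printf "%.1f\n", 100 - (idle / win) * 100 }'
}

cpu_usage_pct 264 300   # idle for 264 of 300 seconds, prints 12.0
cpu_usage_pct 60 300    # idle for 60 of 300 seconds, prints 80.0
```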

    Per-pod CPU usage:

    sum (rate (container_cpu_usage_seconds_total{image!="",name=~"^k8s_.*",namespace=~"$stNs"}[5m])) by (pod_name)

    alertmanager.yml

    global:
      resolve_timeout: 5m
    
    route:
      group_by: ['alertname']
      group_wait: 10s
      group_interval: 10s
      repeat_interval: 1m
      receiver: 'web.hook'
    receivers:
    - name: 'web.hook'
      webhook_configs:
      - url: 'http://127.0.0.1:5001/'

    prometheus.yml

    global:
      scrape_interval:     15s
      evaluation_interval: 15s
    
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          - localhost:9093
    
    rule_files:
      - alexrules.yml 
    
    scrape_configs:
    - job_name: alexcadvisor
      static_configs:
        - targets: ["10.10.10.68:10250","10.10.10.95:10250","10.10.10.96:10250","10.10.10.211:10250","10.10.10.212:10250","10.10.10.217:10250","10.10.10.216:10250","10.10.10.18:10250","10.10.10.19:10250","10.10.10.20:10250"]
      scrape_interval: 30s
      scrape_timeout: 10s
      metrics_path: /metrics/cadvisor
      scheme: https
      bearer_token_file: /root/prometheus-2.22.0.linux-amd64/token
      tls_config:
        insecure_skip_verify: true
    - job_name: alexnodeexporter 
      scrape_interval: 1m
      scrape_timeout: 10s
      metrics_path: /metrics
      bearer_token_file: /root/prometheus-2.22.0.linux-amd64/token
      scheme: http
      kubernetes_sd_configs:
      - role: endpoints
        api_server: 'https://10.10.10.68:6443'
        namespaces:
          names:
          - cattle-prometheus
        tls_config:
          insecure_skip_verify: true
        bearer_token_file: /root/prometheus-2.22.0.linux-amd64/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_label_app]
        separator: ;
        regex: exporter-node
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_chart]
        separator: ;
        regex: exporter-node-0.0.1
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_monitoring_coreos_com]
        separator: ;
        regex: "true"
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_service_label_release]
        separator: ;
        regex: cluster-monitoring
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        separator: ;
        regex: metrics
        replacement: $1
        action: keep
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Node;(.*)
        target_label: node
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
        separator: ;
        regex: Pod;(.*)
        target_label: pod
        replacement: ${1}
        action: replace
      - source_labels: [__meta_kubernetes_namespace]
        separator: ;
        regex: (.*)
        target_label: namespace
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_name]
        separator: ;
        regex: (.*)
        target_label: service
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_pod_name]
        separator: ;
        regex: (.*)
        target_label: pod
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_service_name]
        separator: ;
        regex: (.*)
        target_label: job
        replacement: ${1}
        action: replace
      - separator: ;
        regex: (.*)
        target_label: endpoint
        replacement: metrics
        action: replace
      - source_labels: [__meta_kubernetes_pod_host_ip]
        separator: ;
        regex: (.+)
        target_label: host_ip
        replacement: $1
        action: replace
      - source_labels: [__meta_kubernetes_pod_node_name]
        separator: ;
        regex: (.+)
        target_label: node
        replacement: $1
        action: replace

    alexrules.yml

    groups:
    - name: alexexample
      rules:
      - alert: node cpu higher than 12%
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{ mode="idle"}[5m])) * 100) > 12
        for: 10s
        labels:
          severity: critical
        annotations:
          description: "CPU usage of the {{$labels.job}} component on {{$labels.instance}} is above 12%"
          value: "{{ $value }}%"
          threshold: "12%"
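To check the Alertmanager route and the `web.hook` receiver without waiting for a real CPU alert, a hand-made alert can be posted to Alertmanager. A sketch, assuming an Alertmanager version that exposes the v2 API on `localhost:9093`; the helper name `test_alert_payload`, the alert name `manual-test`, and the instance value are placeholders:

```shell
# Hypothetical helper: build a minimal JSON alert list for POST /api/v2/alerts.
# Arguments: alertname severity instance
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"%s"}}]\n' "$1" "$2" "$3"
}

test_alert_payload manual-test critical 10.10.10.68:9100

# Usage (not executed here):
#   curl -s -XPOST -H 'Content-Type: application/json' \
#     -d "$(test_alert_payload manual-test critical 10.10.10.68:9100)" \
#     http://localhost:9093/api/v2/alerts
```

If the route works, the webhook at `http://127.0.0.1:5001/` should receive a notification after `group_wait` (10s in the config above).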
  • Original article: https://www.cnblogs.com/alexhjl/p/14182588.html