  • Lesson 6: Deploying a Cluster Monitoring System

    15. Deploy the Monitoring System

    Components: node-exporter, alertmanager, grafana, kube-state-metrics, prometheus.

    Component overview
    MetricsServer: aggregates resource usage across the Kubernetes cluster and serves it to in-cluster consumers such as kubectl, the HPA, and the scheduler.
    NodeExporter: collects the key metrics describing each node's state.
    KubeStateMetrics: collects data about the resource objects inside the Kubernetes cluster, used when defining alerting rules.
    Prometheus-adapter: exposes custom monitoring metrics alongside the container metrics.
    Prometheus: collects data from the apiserver, scheduler, controller-manager, and kubelet by pulling over HTTP (a quick check of this pull model is sketched right after this list).
    Grafana: data visualization and monitoring platform.
    Alertmanager: sends alerts via SMS or email.
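    A quick way to see the pull model in action, once the stack from section 15.3 is running: every component simply serves plain-text metrics over HTTP, and Prometheus exposes the same kind of endpoint itself. The node IP and NodePort below are the ones that appear later in 15.3.3/15.3.4.

    # Prometheus's own metrics endpoint (same format it scrapes from the other components)
    curl -s http://192.168.68.149:38883/metrics | head -n 5
    # list the scrape targets Prometheus is currently pulling from
    curl -s http://192.168.68.149:38883/api/v1/targets | head -c 300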

    15.1 Install the NFS Server

    NFS is installed here only to provide storage for the lab environment; choose storage for production according to your own situation.

    15.1.1 Install NFS on the master node

    yum -y install nfs-utils

    15.1.2 Create the NFS directory

    mkdir -p /ifs/kubernetes

    15.1.3 Relax the directory permissions

    chmod -R 777 /ifs/kubernetes

    15.1.4 Edit the exports file
    vim /etc/exports
    /ifs/kubernetes *(rw,no_root_squash,sync)
    (rw = read-write, no_root_squash = remote root keeps root privileges on the share, sync = writes are committed to disk before the server replies.)
    
    15.1.5 Replace the rpcbind socket unit file
    cat >/usr/lib/systemd/system/rpcbind.socket<<EOF
    [Unit]
    Description=RPCbind Server Activation Socket
    [Socket]
    ListenStream=/var/run/rpcbind.sock
    ListenStream=0.0.0.0:111
    ListenDatagram=0.0.0.0:111
    [Install]
    WantedBy=sockets.target
    EOF
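    Since the unit file was just rewritten, reload systemd before starting the services in 15.1.7 (an extra step worth running explicitly):

    systemctl daemon-reload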
    
    15.1.6 Apply the export configuration

    exportfs -f
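    Once the services in 15.1.7 are up, two related exportfs flags are handy as optional checks (not part of the original steps):

    exportfs -v      # lists the active exports; /ifs/kubernetes should appear with rw,no_root_squash,sync
    exportfs -r      # re-reads /etc/exports after any later edit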

    15.1.7 Start the services
    systemctl start rpcbind
    systemctl status rpcbind
    systemctl enable rpcbind
    systemctl start nfs
    systemctl status nfs
    
    15.1.8 Test with showmount (master01)
    showmount -e 192.168.68.146
    [root@master01 sockets.target.wants]# showmount -e 192.168.68.146
    Export list for 192.168.68.146:
    /ifs/kubernetes *
    
    15.1.9 Install the NFS client on all nodes

    yum -y install nfs-utils

    15.1.10 Check from the nodes (node01, node02)

    Every node must be able to see the server's export list; otherwise the mount will fail.

    [root@node01 cfg]# showmount -e 192.168.68.146
    Export list for 192.168.68.146:
    /ifs/kubernetes *
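    To go one step beyond showmount, a throwaway manual mount from a node confirms that mounting really works (the /mnt mountpoint and test file are just examples; clean up afterwards):

    mount -t nfs 192.168.68.146:/ifs/kubernetes /mnt
    touch /mnt/nfs-write-test && ls /mnt
    rm -f /mnt/nfs-write-test
    umount /mnt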
    

    15.2 Deploy the PVC (NFS client provisioner)

    The NFS server address must be changed to match your environment: 192.168.68.146 here.

    kubectl apply -f nfs-class.yaml
    # update the NFS server IP address in nfs-deployment.yaml
    kubectl apply -f nfs-deployment.yaml
    kubectl apply -f nfs-rabc.yaml
    

    Check the status of the nfs provisioner pod

    [root@master01 nfs]# kubectl get pod
    NAME                                     READY   STATUS    RESTARTS   AGE
    nfs-client-provisioner-7d4864f8f-wbbsq   1/1     Running   0          12s
    
    15.2.1 Verify the deployment
    kubectl get StorageClass
    [root@master01 nfs]# kubectl get StorageClass
    NAME                  PROVISIONER      AGE
    managed-nfs-storage   fuseim.pri/ifs   2m14s
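    To prove that dynamic provisioning works end to end, you can create and then delete a throwaway PVC against the new StorageClass (a minimal sketch; the claim name and size are arbitrary):

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: nfs-test-claim
    spec:
      storageClassName: managed-nfs-storage
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 1Mi
    EOF
    kubectl get pvc nfs-test-claim      # STATUS should turn Bound within a few seconds
    kubectl delete pvc nfs-test-claim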
    
    15.2.2 Check it in the dashboard

    The StorageClass page of the Kubernetes dashboard now shows the NFS storage class we just deployed.

    15.3 Deploy the Monitoring Stack

    Configuration files that need to be modified
    Update the IPs in:
    serviceMonitor/prometheus-EtcdService.yaml
    serviceMonitor/prometheus-kubeControllerManagerService.yaml
    serviceMonitor/prometheus-kubeSchedulerService.yaml
    serviceMonitor/prometheus-KubeProxyService.yaml

    [root@master01 serviceMonitor]# ls | xargs grep 68
    prometheus-EtcdService.yaml:  - ip: 192.168.68.146
    prometheus-EtcdService.yaml:  - ip: 192.168.68.147
    prometheus-EtcdService.yaml:  - ip: 192.168.68.148
    prometheus-kubeControllerManagerService.yaml:  - ip: 192.168.68.146
    prometheus-kubeControllerManagerService.yaml:  - ip: 192.168.68.147
    prometheus-kubeControllerManagerService.yaml:  - ip: 192.168.68.148
    prometheus-KubeProxyService.yaml:  - ip: 192.168.68.149
    prometheus-KubeProxyService.yaml:  - ip: 192.168.68.151
    prometheus-kubeSchedulerService.yaml:  - ip: 192.168.68.146
    prometheus-kubeSchedulerService.yaml:  - ip: 192.168.68.147
    prometheus-kubeSchedulerService.yaml:  - ip: 192.168.68.148
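    These files follow the usual kube-prometheus pattern for components that run outside the pod network: a headless Service plus a hand-maintained Endpoints object listing the host IPs, which the ServiceMonitors then select by label. As a trimmed sketch (pattern only; the exact labels and port names in your copies may differ), prometheus-EtcdService.yaml looks roughly like this:

    apiVersion: v1
    kind: Service
    metadata:
      name: kube-etcd
      namespace: kube-system
      labels:
        k8s-app: kube-etcd
    spec:
      clusterIP: None
      ports:
      - name: http-metrics
        port: 2379
    ---
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: kube-etcd
      namespace: kube-system
      labels:
        k8s-app: kube-etcd
    subsets:
    - addresses:
      - ip: 192.168.68.146      # these IP lines are the ones to change for your cluster
      - ip: 192.168.68.147
      - ip: 192.168.68.148
      ports:
      - name: http-metrics
        port: 2379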
    

    Apply the manifests:
    kubectl apply -f setup/               # creates the monitoring namespace and deploys the prometheus-operator
    kubectl apply -f alertmanager/        # deploys alertmanager
    kubectl apply -f node-exporter/       # deploys node-exporter
    kubectl apply -f kube-state-metrics/
    kubectl apply -f grafana/
    kubectl apply -f prometheus/
    kubectl apply -f serviceMonitor/

    Walkthrough:

    cd /root/monitor/prometheus
    kubectl apply -f setup/
    [root@master01 prometheus]# kubectl apply -f setup/
    namespace/monitoring created
    customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
    clusterrole.rbac.authorization.k8s.io/prometheus-operator created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
    deployment.apps/prometheus-operator created
    service/prometheus-operator created
    serviceaccount/prometheus-operator created
    
    kubectl apply -f alertmanager/
    #edit alertmanager/alertmanager-alertmanager.yaml and set the replica count to 2 (replicas: 2)
    kubectl apply -f alertmanager/
    alertmanager.monitoring.coreos.com/main created
    secret/alertmanager-main created
    service/alertmanager-main created
    serviceaccount/alertmanager-main created
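    The replica change mentioned above goes into the Alertmanager custom resource; the relevant part of alertmanager-alertmanager.yaml looks roughly like this (other fields in your copy stay as shipped):

    apiVersion: monitoring.coreos.com/v1
    kind: Alertmanager
    metadata:
      name: main
      namespace: monitoring
    spec:
      replicas: 2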
    
    cd node-exporter
    ls | xargs grep image
    node-exporter-daemonset.yaml:        image: prom/node-exporter:v0.18.1
    node-exporter-daemonset.yaml:        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
    #you can pull the required images onto each node first and then deploy the service
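    # for example, on each node (assuming Docker is the container runtime):
    docker pull prom/node-exporter:v0.18.1
    docker pull quay.io/coreos/kube-rbac-proxy:v0.4.1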
    [root@master01 prometheus]# kubectl apply -f node-exporter/
    clusterrole.rbac.authorization.k8s.io/node-exporter created
    clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
    daemonset.apps/node-exporter created
    service/node-exporter created
    serviceaccount/node-exporter created
    
    cd kube-state-metrics
    [root@master01 kube-state-metrics]# ls | xargs grep image
    kube-state-metrics-deployment.yaml:        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
    kube-state-metrics-deployment.yaml:        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
    kube-state-metrics-deployment.yaml:        image: quay.io/coreos/kube-state-metrics:v1.8.0
    [root@master01 prometheus]# kubectl apply -f kube-state-metrics/
    clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
    clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
    deployment.apps/kube-state-metrics created
    clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics-rbac created
    role.rbac.authorization.k8s.io/kube-state-metrics created
    rolebinding.rbac.authorization.k8s.io/kube-state-metrics created
    service/kube-state-metrics created
    serviceaccount/kube-state-metrics created
    
    #the NFS storage service installed earlier backs the grafana-pvc and prometheus-pvc used here, via storageClassName: managed-nfs-storage
    [root@master01 prometheus]# kubectl get storageClass
    NAME                  PROVISIONER      AGE
    managed-nfs-storage   fuseim.pri/ifs   3h28m
    The StorageClass name here must match the one referenced in those PVCs.
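    For reference, the piece that ties the PVCs to the NFS provisioner is just the storageClassName field; a trimmed sketch of what the grafana PVC spec should contain (the size and other fields follow your own grafana-pvc.yaml):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: grafana
      namespace: monitoring
    spec:
      storageClassName: managed-nfs-storage   # must match the StorageClass name above
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 5Gi                         # example size; use whatever your file specifies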
    
    [root@master01 prometheus]# kubectl apply -f grafana/
    secret/grafana-datasources created
    configmap/grafana-dashboard-apiserver created
    configmap/grafana-dashboard-cluster-total created
    configmap/grafana-dashboard-controller-manager created
    configmap/grafana-dashboard-k8s-resources-cluster created
    configmap/grafana-dashboard-k8s-resources-namespace created
    configmap/grafana-dashboard-k8s-resources-node created
    configmap/grafana-dashboard-k8s-resources-pod created
    configmap/grafana-dashboard-k8s-resources-workload created
    configmap/grafana-dashboard-k8s-resources-workloads-namespace created
    configmap/grafana-dashboard-kubelet created
    configmap/grafana-dashboard-namespace-by-pod created
    configmap/grafana-dashboard-namespace-by-workload created
    configmap/grafana-dashboard-node-cluster-rsrc-use created
    configmap/grafana-dashboard-node-rsrc-use created
    configmap/grafana-dashboard-nodes created
    configmap/grafana-dashboard-persistentvolumesusage created
    configmap/grafana-dashboard-pod-total created
    configmap/grafana-dashboard-pods created
    configmap/grafana-dashboard-prometheus-remote-write created
    configmap/grafana-dashboard-prometheus created
    configmap/grafana-dashboard-proxy created
    configmap/grafana-dashboard-scheduler created
    configmap/grafana-dashboard-statefulset created
    configmap/grafana-dashboard-workload-total created
    configmap/grafana-dashboards created
    deployment.apps/grafana created
    persistentvolumeclaim/grafana created
    clusterrolebinding.rbac.authorization.k8s.io/grafana-rbac created
    service/grafana created
    serviceaccount/grafana created
    
    [root@master01 prometheus]# kubectl apply -f prometheus/
    clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
    prometheus.monitoring.coreos.com/k8s created
    persistentvolumeclaim/prometheus-data created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus-rbac created
    rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
    rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
    rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
    rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
    role.rbac.authorization.k8s.io/prometheus-k8s-config created
    role.rbac.authorization.k8s.io/prometheus-k8s created
    role.rbac.authorization.k8s.io/prometheus-k8s created
    role.rbac.authorization.k8s.io/prometheus-k8s created
    prometheusrule.monitoring.coreos.com/prometheus-k8s-rules created
    service/prometheus-k8s created
    serviceaccount/prometheus-k8s created
    
    [root@master01 prometheus]# kubectl apply -f serviceMonitor/
    servicemonitor.monitoring.coreos.com/alertmanager created
    servicemonitor.monitoring.coreos.com/grafana created
    servicemonitor.monitoring.coreos.com/kube-state-metrics created
    servicemonitor.monitoring.coreos.com/node-exporter created
    service/kube-etcd created
    endpoints/kube-etcd created
    service/kube-proxy created
    endpoints/kube-proxy created
    service/kube-controller-manager created
    Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
    endpoints/kube-controller-manager configured
    service/kube-scheduler created
    Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
    endpoints/kube-scheduler configured
    servicemonitor.monitoring.coreos.com/prometheus-operator created
    servicemonitor.monitoring.coreos.com/prometheus created
    servicemonitor.monitoring.coreos.com/kube-apiserver created
    servicemonitor.monitoring.coreos.com/coredns created
    servicemonitor.monitoring.coreos.com/kube-etcd created
    servicemonitor.monitoring.coreos.com/kube-controller-manager created
    servicemonitor.monitoring.coreos.com/kube-proxy created
    servicemonitor.monitoring.coreos.com/kube-scheduler created
    servicemonitor.monitoring.coreos.com/kubelet created
    

    If you run into permission errors, resolve them as follows:

    kubectl create serviceaccount kube-state-metrics -n monitoring
    kubectl create serviceaccount grafana -n monitoring
    kubectl create serviceaccount prometheus-k8s -n monitoring
    

    Create the RBAC binding files

    #kube-state-metrics
    [root@master01 prometheus]# cat kube-state-metrics/kube-state-metrics-rabc.yaml   
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: kube-state-metrics-rbac
    subjects:
      - kind: ServiceAccount
        name: kube-state-metrics
        namespace: monitoring
    roleRef:
      kind: ClusterRole
      name: cluster-admin
      apiGroup: rbac.authorization.k8s.io
      
    #grafana
    [root@master01 prometheus]# cat grafana/grafana-rabc.yaml 
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: grafana-rbac
    subjects:
      - kind: ServiceAccount
        name: grafana
        namespace: monitoring
    roleRef:
      kind: ClusterRole
      name: cluster-admin
      apiGroup: rbac.authorization.k8s.io
    
    #prometheus
    [root@master01 prometheus]# cat prometheus/prometheus-rabc.yaml 
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: prometheus-rbac
    subjects:
      - kind: ServiceAccount
        name: prometheus-k8s
        namespace: monitoring
    roleRef:
      kind: ClusterRole
      name: cluster-admin
      apiGroup: rbac.authorization.k8s.io
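    After creating the three binding files, apply them (the paths match the listings above). Binding cluster-admin is a quick fix that is fine for this lab; a production cluster would grant narrower roles.

    kubectl apply -f kube-state-metrics/kube-state-metrics-rabc.yaml
    kubectl apply -f grafana/grafana-rabc.yaml
    kubectl apply -f prometheus/prometheus-rabc.yaml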
      
    
    15.3.1 Get the grafana pod and svc information

    Get grafana's NodePort address and port for web access.

    [root@master01 prometheus]# kubectl get pod,svc -A -o wide | grep grafana
    monitoring             pod/grafana-5dc77ff8cb-9lcgc                     1/1     Running   0          14m     172.17.82.8      192.168.68.151   <none>           <none>
    
    monitoring             service/grafana                     NodePort    10.0.0.56    <none>        3000:30093/TCP               14m     app=grafana
    
    15.3.2 Log in to the Grafana dashboard

    Default username/password: admin/admin
    http://192.168.68.151:30093/

    15.3.3 Get the prometheus pod and svc information
    [root@master01 prometheus]# kubectl get pod,svc -A -o wide | grep prometheus
    
    monitoring             pod/prometheus-k8s-0                             3/3     Running   1          29m     172.17.15.9      192.168.68.149   <none>           <none>
    monitoring             pod/prometheus-k8s-1                             3/3     Running   1          29m     172.17.82.9      192.168.68.151   <none>           <none>
    monitoring             pod/prometheus-operator-6685db5c6-hszsn          1/1     Running   0          3h29m   172.17.82.5      192.168.68.151   <none>           <none>
    monitoring             service/prometheus-k8s              NodePort    10.0.0.65    <none>        9090:38883/TCP               29m     app=prometheus,prometheus=k8s
    
    15.3.4 Log in to the Prometheus dashboard

    http://192.168.68.149:38883/

    Detailed usage of Grafana and Prometheus is not covered here; the deployment already ships a set of dashboards for the Kubernetes cluster, and you can also import additional dashboard templates yourself.
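    If the NodePorts are not reachable from your workstation, kubectl port-forward is an alternative way in (the service names and ports come from the svc output above):

    kubectl -n monitoring port-forward svc/grafana 3000:3000
    kubectl -n monitoring port-forward svc/prometheus-k8s 9090:9090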
