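Introduction to Prometheus
Prometheus is a monitoring system originally built at SoundCloud. Since its inception in 2012 it has grown into a community open-source project with a very active developer and user community. To emphasize that it is open source and independently maintained, Prometheus joined the Cloud Native Computing Foundation (CNCF) in 2016 as the second hosted project, after Kubernetes. Prometheus has since become a de facto standard on Kubernetes (K8s) platforms, and many private-cloud K8s deployments choose it as their monitoring and alerting platform, not least because it is free.
Does Prometheus have drawbacks? It does. Although it is free, it is relatively hard to get started with: the project is still evolving quickly, so users have to adapt through trial and error, and operators need to learn its query language, PromQL (Prometheus Query Language), before they can really use it. On both counts Prometheus still needs time to prove itself: the product has room to mature, and users need to become familiar with it.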
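Prometheus Architecture
Prometheus Server: scrapes metrics, stores them as time series data, and provides a query interface
Push Gateway: holds metrics temporarily; mainly used by short-lived jobs
Exporters: collect metrics from existing third-party services and expose them on a metrics endpoint
Alertmanager: handles alerts
Web UI: a simple web console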
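Summary:
Prometheus actively pulls performance data from the monitored targets (an exporter has to be deployed on each monitored node)
Short-lived jobs that cannot be scraped directly push their metrics to the Pushgateway, and Prometheus scrapes them from there
Through the Alertmanager component, Prometheus alerts can be delivered to users by email or an SMS platform
Prometheus ships only a simple built-in web UI, so Grafana is usually integrated for dashboards and configuration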
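To make the Pushgateway's role more concrete, here is a minimal sketch of how a short-lived job might push one sample. It assumes a Pushgateway listening on its default port 9091 on localhost; the function names `pushgateway_url` and `push_metric` are hypothetical helpers, not part of any Prometheus tooling.

```shell
#!/bin/sh
# Base URL of the Pushgateway (assumption: default port 9091 on localhost).
PUSHGATEWAY_URL="${PUSHGATEWAY_URL:-http://localhost:9091}"

# pushgateway_url JOB: build the push endpoint for the given job name.
pushgateway_url() {
  printf '%s/metrics/job/%s' "$PUSHGATEWAY_URL" "$1"
}

# push_metric JOB NAME VALUE: send one sample in the text exposition format.
push_metric() {
  printf '%s %s\n' "$2" "$3" | curl -s --data-binary @- "$(pushgateway_url "$1")"
}
```

A batch job could then call something like `push_metric nightly_backup backup_duration_seconds 42` (job and metric names are made up for illustration). Prometheus still pulls: it scrapes the Pushgateway like any other target.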
Using Prometheus
Deploying Prometheus with Docker
### Deploy Prometheus
docker run -d --name=prometheus -p 9090:9090 prom/prometheus
URL: http://IP:9090
### Deploy Grafana
docker run -d --name=grafana -p 3000:3000 grafana/grafana
URL: http://IP:3000
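Adding the Prometheus Data Source in Grafana
Enter the IP address of the Prometheus endpoint
The data source is added successfully
Enter a metric name to display its data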
Monitoring a Linux Machine with Prometheus
Deploy node_exporter on the monitored host
wget https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
tar xvfz node_exporter-*.*-amd64.tar.gz
cd node_exporter-*.*-amd64
./node_exporter
Open the monitored host at http://172.16.0.12:9100
Enter the Prometheus container and edit the configuration file to add the endpoint
[root@k8s-master01 ~]# docker exec -it prometheus sh
/prometheus $ vi /etc/prometheus/prometheus.yml
Add the monitored host
  - job_name: 'linux server'
    static_configs:
      - targets: ['172.16.0.12:9100']
[root@k8s-master01 ~]# docker restart prometheus
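Changes made with `vi` inside the container disappear if the container is ever removed and recreated. One possible alternative, sketched here under assumptions, is to keep `prometheus.yml` on the host and bind-mount it into the container. The `scrape_interval` value and the host-side file location are assumptions; the `linux server` job is the one from the text. The script only writes the file and prints the `docker run` command, so it can be tried without Docker.

```shell
#!/bin/sh
# Write a Prometheus config on the host (sketch; the location is an assumption).
cat > prometheus.yml <<'EOF'
global:
  scrape_interval: 15s   # assumed value; tune as needed

scrape_configs:
  # Prometheus scraping itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # The Linux host running node_exporter (address from the text)
  - job_name: 'linux server'
    static_configs:
      - targets: ['172.16.0.12:9100']
EOF

# Print (not run) the command that mounts the file into the container.
echo "docker run -d --name=prometheus -p 9090:9090 -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus"
```

After later edits to the host file, `docker restart prometheus` works as before; sending SIGHUP with `docker kill --signal=HUP prometheus` should also make Prometheus reload its configuration without a restart.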
Open Prometheus again and confirm that the new endpoint has been added
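The same check can be made from the command line through the Prometheus HTTP API. The following is a small sketch, assuming Prometheus is reachable at `http://localhost:9090` (override `PROM_URL` otherwise); `prom_query` is a hypothetical helper name.

```shell
#!/bin/sh
# Base URL of the Prometheus server (assumption: reachable on localhost:9090).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query EXPR: run an instant PromQL query via /api/v1/query.
# -G sends the URL-encoded query as a GET parameter.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, `prom_query 'up{job="linux server"}'` returns JSON in which a sample value of 1 means the last scrape of that target succeeded. An expression such as `100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100` would give an approximate CPU utilization per host from node_exporter data.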
Importing a Dashboard into Grafana
Change the group to 'linux server'
Monitoring K8s with Prometheus
Prometheus monitors K8s from two main perspectives: the cluster and the applications
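Monitoring Kubernetes Itself
•Node resource utilization
•Number of Nodes
•Number of Pods running on each Node
•Status of resource objects
Pod Monitoring
•Total number of Pods and the expected number per controller
•Pod status
•Container resource utilization: CPU, memory, network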
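Monitoring Architecture
Pod
The kubelet on each node embeds cAdvisor and exposes, through its metrics endpoint, performance metrics for all Pods and containers on that node.
Metrics endpoint: https://NodeIP:10250/metrics/cadvisor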
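The author's `prometheus-configmap.yaml` is not shown, so the following is only a sketch of how a scrape job for this endpoint is commonly written when Prometheus runs inside the cluster. The job name, the output file name, and the decision to skip certificate verification are assumptions. The script writes the fragment to a file so it can be reviewed before being merged into a real `scrape_configs` section.

```shell
#!/bin/sh
# Write a scrape-job fragment for the kubelet's cAdvisor endpoint.
# Sketch only: names are assumptions, not taken from the author's ConfigMap.
cat > cadvisor-scrape.yml <<'EOF'
- job_name: 'kubernetes-nodes-cadvisor'
  scheme: https
  metrics_path: /metrics/cadvisor
  kubernetes_sd_configs:
    - role: node                 # one target per node, on the kubelet port
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true   # assumption: kubelet certificate is not verified
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
    # Copy Kubernetes node labels onto the scraped series
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
EOF
```

The service account that Prometheus runs under also needs RBAC permission to list nodes and read node metrics, otherwise discovery or scraping is likely to fail with authorization errors.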
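Node
node_exporter collects resource utilization for each node.
Project: https://github.com/prometheus/node_exporter
K8s Resource Objects
kube-state-metrics collects state information for the various resource objects in K8s.
Project: https://github.com/kubernetes/kube-state-metrics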
Preparing the YAML Files
-rw-r--r-- 1 root root  550 Jan 1 16:26 alertmanager-configmap.yaml
-rw-r--r-- 1 root root 2518 Jan 1 16:26 alertmanager-deployment.yaml
drwxr-xr-x 2 root root  129 Jan 1 16:27 dashboard
-rw-r--r-- 1 root root 1262 Jan 1 16:26 grafana.yaml
-rw-r--r-- 1 root root 4358 Jan 1 16:26 kube-state-metrics.yaml
-rw-r--r-- 1 root root 1648 Jan 1 16:25 node-exporter.yml
-rw-r--r-- 1 root root 4815 Jan 1 16:25 prometheus-configmap.yaml
-rw-r--r-- 1 root root 3922 Jan 1 16:25 prometheus-deployment.yaml
-rw-r--r-- 1 root root 4840 Jan 1 16:25 prometheus-rules.yaml
kubectl create namespace ops
for f in prometheus-*.yaml; do kubectl apply -f "$f"; done
[root@k8s-master03 promethues]# kubectl get pod -n ops
NAME                          READY   STATUS    RESTARTS   AGE
prometheus-859dbbc5f7-qgpwb   2/2     Running   0          76s
Open the Prometheus UI through the NodePort to check that the deployment succeeded
Node information is already being collected through cAdvisor
Deploying Grafana
[root@k8s-master03 promethues]# kubectl apply -f grafana.yaml
deployment.apps/grafana created
persistentvolumeclaim/grafana created
service/grafana created
[root@k8s-master03 promethues]# kubectl get pod -n ops
NAME                          READY   STATUS    RESTARTS   AGE
grafana-757fcd5f7c-wdnt4      1/1     Running   0          78s
prometheus-859dbbc5f7-qgpwb   2/2     Running   0          15m
[root@k8s-master03 promethues]# kubectl get svc -n ops
NAME         TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
grafana      NodePort   10.100.91.227    <none>        80:30030/TCP     86s
prometheus   NodePort   10.108.216.228   <none>        9090:30090/TCP   15m
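Grafana is also reachable through its NodePort, and you can log in
First add the Prometheus data source: http://172.16.0.21:30090
Import the cluster monitoring dashboard
Deploying a DaemonSet to Monitor Nodes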
[root@k8s-master03 promethues]# kubectl apply -f node-exporter.yml
daemonset.apps/node-exporter created
service/node-exporter created
[root@k8s-master03 promethues]# kubectl get pod -n ops
NAME                          READY   STATUS    RESTARTS   AGE
grafana-757fcd5f7c-wdnt4      1/1     Running   0          12m
node-exporter-rsbzk           1/1     Running   0          44s
node-exporter-wmd65           1/1     Running   0          44s
prometheus-859dbbc5f7-qgpwb   2/2     Running   0          26m
Import the Node monitoring dashboard
Deploying kube-state-metrics to Collect Metrics for Kubernetes API Resources
[root@k8s-master03 promethues]# kubectl apply -f kube-state-metrics.yaml
deployment.apps/kube-state-metrics created
configmap/kube-state-metrics-config created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
role.rbac.authorization.k8s.io/kube-state-metrics-resizer created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
rolebinding.rbac.authorization.k8s.io/kube-state-metrics created
[root@k8s-master03 promethues]# kubectl get svc -n ops
NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
grafana              NodePort    10.100.91.227    <none>        80:30030/TCP        19m
kube-state-metrics   ClusterIP   10.106.88.221    <none>        8080/TCP,8081/TCP   21s
node-exporter        ClusterIP   None             <none>        9100/TCP            7m33s
prometheus           NodePort    10.108.216.228   <none>        9090:30090/TCP      32m
[root@k8s-master03 promethues]# kubectl get pod -n ops
NAME                                  READY   STATUS    RESTARTS   AGE
grafana-757fcd5f7c-wdnt4              1/1     Running   0          19m
kube-state-metrics-667bc48f47-b5hb4   2/2     Running   0          47s
node-exporter-rsbzk                   1/1     Running   0          8m
node-exporter-wmd65                   1/1     Running   0          8m
prometheus-859dbbc5f7-qgpwb           2/2     Running   0          33m
Import the kube-state-metrics dashboard
Configuring Alerts
[root@k8s-master03 promethues]# kubectl apply -f alertmanager-configmap.yaml
configmap/alertmanager-config created
[root@k8s-master03 promethues]# kubectl apply -f alertmanager-deployment.yaml
deployment.apps/alertmanager created
persistentvolumeclaim/alertmanager created
service/alertmanager created
[root@k8s-master03 promethues]# kubectl get pod -n ops
NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-7d5fb96b7b-8zfdk         1/2     Running   0          34s
grafana-757fcd5f7c-wdnt4              1/1     Running   0          37m
kube-state-metrics-667bc48f47-b5hb4   2/2     Running   0          18m
node-exporter-rsbzk                   1/1     Running   0          25m
node-exporter-wmd65                   1/1     Running   0          25m
prometheus-859dbbc5f7-qgpwb           2/2     Running   0          51m
[root@k8s-master03 promethues]# kubectl get svc -n ops
NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
alertmanager         NodePort    10.108.223.228   <none>        80:30093/TCP        41s
grafana              NodePort    10.100.91.227    <none>        80:30030/TCP        37m
kube-state-metrics   ClusterIP   10.106.88.221    <none>        8080/TCP,8081/TCP   18m
node-exporter        ClusterIP   None             <none>        9100/TCP            25m
prometheus           NodePort    10.108.216.228   <none>        9090:30090/TCP      51m
Log in to the recipient's mailbox to check for alert emails
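The contents of `alertmanager-configmap.yaml` are not shown, so as a rough illustration, an Alertmanager configuration that routes every alert to one email receiver might look like the sketch below. The SMTP host, addresses, password, receiver name, and timing values are all placeholders and assumptions; in the Kubernetes deployment this content would sit inside the ConfigMap rather than in a standalone file.

```shell
#!/bin/sh
# Write a minimal Alertmanager config with a single email receiver.
# Every host name, address, and credential here is a placeholder.
cat > alertmanager.yml <<'EOF'
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'   # placeholder SMTP server
  smtp_from: 'alerts@example.com'          # placeholder sender
  smtp_auth_username: 'alerts@example.com'
  smtp_auth_password: 'CHANGE_ME'

route:
  receiver: 'ops-mail'        # default receiver for all alerts
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 1h

receivers:
  - name: 'ops-mail'
    email_configs:
      - to: 'ops@example.com'              # placeholder recipient
EOF
```

If `amtool` is available, `amtool check-config alertmanager.yml` should catch syntax mistakes before the configuration is loaded.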
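All alert rules are defined in the prometheus-rules.yaml file
They can also be viewed in the Prometheus UI
Manually create a Pod that stays Pending (the manifest requests 8 CPUs, presumably more than any node in this cluster can provide)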
[root@k8s-master03 promethues]# cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources:
      requests:
        cpu: 8
[root@k8s-master03 promethues]# kubectl get pod
NAME                                           READY   STATUS    RESTARTS   AGE
delightful-produce-mariadb-0                   1/1     Running   0          3h34m
delightful-produce-wordpress-cf45b6d99-tptzf   1/1     Running   0          3h34m
nfs-client-provisioner-7b87dc5c48-rx7zd        1/1     Running   0          4h2m
nginx                                          0/1     Pending   0          3s
The alert fires
The alert email arrives as well
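The rule that fired here lives in the author's `prometheus-rules.yaml`, which is not reproduced in this post. A rule with a similar effect could be sketched as follows, using the `kube_pod_status_phase` metric exposed by kube-state-metrics. The group name, alert name, file name, `for` duration, and severity label are assumptions.

```shell
#!/bin/sh
# Write an example alerting rule that fires when a Pod stays Pending.
# The heredoc is quoted so that $labels is not expanded by the shell.
cat > pod-pending.rules.yml <<'EOF'
groups:
  - name: pod-status
    rules:
      - alert: PodPending
        # kube_pod_status_phase is 1 for the phase a Pod is currently in
        expr: sum by (namespace, pod) (kube_pod_status_phase{phase="Pending"}) > 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} is Pending"
          description: "The Pod has been in the Pending phase for more than 1 minute."
EOF
```

Running `promtool check rules pod-pending.rules.yml` is a quick way to validate the file. With a rule like this, the `nginx` Pod above would trigger the alert about a minute after it was created.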