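Prometheus PromQL Query Language
Prometheus provides a functional query language called PromQL (Prometheus Query Language) that lets users select and aggregate time series data in real time. The result of an expression can be shown as a graph, viewed as tabular data in the Prometheus expression browser, or consumed by external systems through the HTTP API.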
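As a rough illustration of the HTTP API route mentioned above, the sketch below builds the URL for an instant query against the `/api/v1/query` endpoint. It assumes Prometheus is reachable at `http://localhost:9090`; the helper name `instant_query_url` is made up for this example, and no request is actually sent.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, expr, time=None):
    """Build the URL for a PromQL instant query (hypothetical helper)."""
    params = {"query": expr}
    if time is not None:
        # Optional evaluation timestamp (Unix seconds).
        params["time"] = str(time)
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


# Assumed address; adjust to wherever your Prometheus server listens.
url = instant_query_url("http://localhost:9090", 'node_forks_total{job="node"}')
print(url)
```

The expression has to be URL-encoded because PromQL selectors contain braces, quotes, and equals signs.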
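Preparation
Before running any queries, here are the configuration files used in this walkthrough. Labels whose names begin with a double underscore are removed once target relabeling completes, which is why the query results below carry only the instance and job labels.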
[root@node00 prometheus]# cat prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']
  - job_name: "node"
    file_sd_configs:
    - refresh_interval: 1m
      files:
      - "/usr/local/prometheus/prometheus/conf/node*.yml"
remote_write:
  - url: "http://localhost:8086/api/v1/prom/write?db=prometheus"
remote_read:
  - url: "http://localhost:8086/api/v1/prom/read?db=prometheus"
[root@node00 prometheus]# cat conf/node-dis.yml
- targets:
  - "192.168.100.10:20001"
  labels:
    __datacenter__: dc0
    __hostname__: node00
    __businees_line__: "line_a"
    __region_id__: "cn-beijing"
    __availability_zone__: "a"
- targets:
  - "192.168.100.11:20001"
  labels:
    __datacenter__: dc1
    __hostname__: node01
    __businees_line__: "line_a"
    __region_id__: "cn-beijing"
    __availability_zone__: "a"
- targets:
  - "192.168.100.12:20001"
  labels:
    __datacenter__: dc0
    __hostname__: node02
    __businees_line__: "line_c"
    __region_id__: "cn-beijing"
    __availability_zone__: "b"
Simple time series queries
Query a specific metric_name directly
# Total number of forks on each node
node_forks_total
# Result:
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 201518 |
node_forks_total{instance="192.168.100.11:20001",job="node"} | 23951 |
node_forks_total{instance="192.168.100.12:20001",job="node"} | 24127 |
Query with a label
node_forks_total{instance="192.168.100.10:20001"}
# Result:
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 201816 |
Query with multiple labels
node_forks_total{instance="192.168.100.10:20001",job="node"}
# Result:
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 201932 |
Query the samples from the last 2 minutes
node_forks_total{instance="192.168.100.10:20001",job="node"}[2m]
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 201932 @1569492864.036 201932 @1569492879.036 201932 @1569492894.035 201932 @1569492909.036 201985 @1569492924.036 201989 @1569492939.036 201993 @1569492954.036 |
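To make the shape of a range vector concrete, the sketch below parses the `value @timestamp` pairs from the result above and checks the spacing between samples. The parsing code is only an illustration of how to read the table; it is not how Prometheus returns data over the API.

```python
# The "value @timestamp" pairs copied from the [2m] result above.
raw = (
    "201932 @1569492864.036 201932 @1569492879.036 201932 @1569492894.035 "
    "201932 @1569492909.036 201985 @1569492924.036 201989 @1569492939.036 "
    "201993 @1569492954.036"
)

tokens = raw.split()
# Each sample is (value, unix_timestamp).
samples = [
    (float(tokens[i]), float(tokens[i + 1].lstrip("@")))
    for i in range(0, len(tokens), 2)
]

# Gap between consecutive samples, in seconds.
gaps = [round(b[1] - a[1], 3) for a, b in zip(samples, samples[1:])]
# Raw growth of the counter across the returned samples.
growth = samples[-1][0] - samples[0][0]

print(len(samples), gaps, growth)
```

The gaps come out at roughly 15 seconds, matching the `scrape_interval` in the configuration.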
Regex matching
node_forks_total{instance=~"192.168.*:20001",job="node"}
Element | Value |
---|---|
node_forks_total{instance="192.168.100.10:20001",job="node"} | 202107 |
node_forks_total{instance="192.168.100.11:20001",job="node"} | 24014 |
node_forks_total{instance="192.168.100.12:20001",job="node"} | 24186 |
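PromQL regex matchers are anchored, so the pattern must match the whole label value. The sketch below approximates that with Python's `re.fullmatch`; Python's regex engine is not the one Prometheus uses, but it behaves the same for simple patterns like these. The extra label values `localhost:9090` and `192x168y100.10:20001` are invented to show the edge cases.

```python
import re

instances = [
    "192.168.100.10:20001",
    "192.168.100.11:20001",
    "192.168.100.12:20001",
    "localhost:9090",          # the Prometheus self-scrape target
    "192x168y100.10:20001",    # invented value to show the unescaped-dot effect
]

loose = r"192.168.*:20001"        # pattern from the query above
strict = r"192\.168\..*:20001"    # dots escaped so they match only a literal "."

# fullmatch mimics the anchoring: the whole label value must match.
loose_hits = [i for i in instances if re.fullmatch(loose, i)]
strict_hits = [i for i in instances if re.fullmatch(strict, i)]

print(loose_hits)
print(strict_hits)
```

Because an unescaped `.` matches any character, the loose pattern also accepts the invented value; escaping the dots is the safer habit.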
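Common functions
Prometheus ships a large number of functions; the full list is in the official reference: https://prometheus.io/docs/prometheus/latest/querying/functions/
Only the most commonly used ones are demonstrated here.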
irate
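irate computes the per-second instant rate of increase of a counter, based on the last two samples in the range vector.
# Select by label: the per-second rate of idle CPU time for a specific instance, job, and CPU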
irate(node_cpu_seconds_total{cpu="0",instance="192.168.100.10:20001",job="node",mode="idle"}[1m])
Element | Value |
---|---|
{cpu="0",instance="192.168.100.10:20001",job="node",mode="idle"} | 0.9833988932595507 |
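The arithmetic behind irate can be sketched in a few lines. The function below is a simplified model, not the Prometheus implementation, and the sample values are invented: it takes the last two samples, treats a decrease as a counter reset, and divides by the elapsed time.

```python
def irate(samples):
    """Simplified irate: samples is a list of (timestamp_seconds, value)."""
    (t_prev, v_prev), (t_last, v_last) = samples[-2:]
    # A drop in a counter is treated as a reset, so the last value itself
    # is taken as the increase since the reset.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)


# Invented idle-CPU counter samples, 15 seconds apart.
idle = [(0.0, 985.50), (15.0, 1000.00), (30.0, 1014.75)]
print(irate(idle))  # only the last two samples are used

# Invented example of a counter reset between two scrapes.
after_reset = [(0.0, 500.0), (15.0, 3.0)]
print(irate(after_reset))
```

A value close to 1 for `mode="idle"` means the core spent almost every second of the interval idle, which is what the result above shows.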
count_over_time
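Counts the number of samples each series has within the given time range.
# The count depends on the scrape interval. With our 15s interval, a 1-minute window holds 4 samples.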
count_over_time(node_boot_time_seconds[1m])
Element | Value |
---|---|
{instance="192.168.100.10:20001",job="node"} | 4 |
{instance="192.168.100.11:20001",job="node"} | 4 |
{instance="192.168.100.12:20001",job="node"} | 4 |
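The relationship between window size and sample count can be checked against the timestamps from the 2-minute query earlier. In this sketch the evaluation time `1569492960` is an assumption, and the window is treated as excluding its left edge, which is also an assumption made for simplicity.

```python
# Scrape timestamps taken from the [2m] range-vector result earlier.
timestamps = [
    1569492864.036, 1569492879.036, 1569492894.035, 1569492909.036,
    1569492924.036, 1569492939.036, 1569492954.036,
]


def count_over_time(ts, eval_time, window_s):
    """Count samples with eval_time - window_s < t <= eval_time."""
    return sum(1 for t in ts if eval_time - window_s < t <= eval_time)


eval_time = 1569492960.0  # assumed evaluation time, shortly after the last scrape
one_minute = count_over_time(timestamps, eval_time, 60)
two_minutes = count_over_time(timestamps, eval_time, 120)
print(one_minute, two_minutes)
```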
Subquery
# Over the past 10 minutes, evaluate the 5-minute rate once per minute. 10m / 1m gives 10 values in total.
rate(node_cpu_seconds_total{cpu="0",instance="192.168.100.10:20001",job="node",mode="idle"}[5m])[10m:1m]
Element | Value |
---|---|
{cpu="0",instance="192.168.100.10:20001",job="node",mode="idle"} | 0.9865228543057867 @1569494040 0.9862807017543735 @1569494100 0.9861087231885309 @1569494160 0.9864946894550303 @1569494220 0.9863192502430038 @1569494280 0.9859649122807017 @1569494340 0.9859298245613708 @1569494400 0.9869122807017177 @1569494460 0.9867368421052672 @1569494520 0.987438596491273 @1569494580 |
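The ten timestamps in the result are multiples of the 60-second step. The sketch below reproduces them under the assumption that the query was evaluated at `1569494600`; the helper name `subquery_steps` is made up for this example.

```python
def subquery_steps(eval_time, range_s, step_s):
    """Evaluation timestamps for a [range:step] subquery, aligned to the step."""
    start = eval_time - range_s
    # Round the start up to the next multiple of the step.
    first = -(-start // step_s) * step_s
    return list(range(first, eval_time + 1, step_s))


# Assumed evaluation time; range 10m and step 1m come from [10m:1m].
steps = subquery_steps(1569494600, 600, 60)
print(len(steps), steps[0], steps[-1])
```

At each of these timestamps the inner expression `rate(...[5m])` is evaluated, which is why every value in the result is itself a 5-minute average rate.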
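Complex queries
Compute the percentage of free memory (the expression divides MemFree by MemTotal, so a higher value means more memory is free)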
node_memory_MemFree_bytes / node_memory_MemTotal_bytes * 100
Element | Value |
---|---|
{instance="192.168.100.10:20001",job="node"} | 9.927579722322251 |
{instance="192.168.100.11:20001",job="node"} | 59.740727403673034 |
{instance="192.168.100.12:20001",job="node"} | 63.2080982675149 |
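Dividing one instant vector by another pairs up series whose label sets are identical, which is why each instance gets its own percentage. The sketch below models that one-to-one matching with dictionaries keyed by labels; the byte values and the fourth instance are invented for illustration.

```python
GIB = 1024 ** 3

# Invented samples keyed by (instance, job); the metric name is dropped
# from the result of an arithmetic operation, so it is not part of the key.
mem_free = {
    ("192.168.100.10:20001", "node"): 1 * GIB,
    ("192.168.100.11:20001", "node"): 4 * GIB,
    ("192.168.100.12:20001", "node"): 5 * GIB,
}
mem_total = {
    ("192.168.100.10:20001", "node"): 8 * GIB,
    ("192.168.100.11:20001", "node"): 8 * GIB,
    ("192.168.100.12:20001", "node"): 8 * GIB,
    ("192.168.100.13:20001", "node"): 8 * GIB,  # no partner on the left side
}

# Only label sets present on both sides produce a result.
free_percent = {
    labels: mem_free[labels] / mem_total[labels] * 100
    for labels in mem_free.keys() & mem_total.keys()
}
print(free_percent)
```

Series without a matching partner on the other side simply drop out of the result.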
Get the top 2 instances by free memory percentage
topk(2,node_memory_MemFree_bytes / node_memory_MemTotal_bytes * 100 )
Element | Value |
---|---|
{instance="192.168.100.12:20001",job="node"} | 63.20129636298163 |
{instance="192.168.100.11:20001",job="node"} | 59.50586164125955 |
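In effect, `topk(2, ...)` keeps the two series with the largest values. A minimal sketch using the free-memory percentages from the earlier result table:

```python
# Values copied from the free-memory percentage result above.
free_percent = {
    "192.168.100.10:20001": 9.927579722322251,
    "192.168.100.11:20001": 59.740727403673034,
    "192.168.100.12:20001": 63.2080982675149,
}


def topk(k, vector):
    """Return the k (label, value) pairs with the largest values."""
    return sorted(vector.items(), key=lambda item: item[1], reverse=True)[:k]


top2 = topk(2, free_percent)
print(top2)
```

The values in the topk table above differ slightly from these because that query was evaluated at a later moment.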
Practical query examples
Get the number of CPU cores
# CPU core count for every instance
count by (instance) ( count by (instance,cpu) (node_cpu_seconds_total{mode="system"}) )
# CPU core count for a single instance
count by (instance) ( count by (instance,cpu) (node_cpu_seconds_total{mode="system",instance="192.168.100.11:20001"}) )
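The nested aggregation can be modelled as two grouping passes: the inner `count by (instance,cpu)` collapses the series into one entry per core, and the outer `count by (instance)` counts those entries. The series below are invented (4 cores on one host, 2 on another).

```python
from collections import Counter

# Invented node_cpu_seconds_total series: one per (instance, cpu, mode).
series = [
    {"instance": inst, "cpu": str(cpu), "mode": mode}
    for inst, cores in [("192.168.100.10:20001", 4), ("192.168.100.11:20001", 2)]
    for cpu in range(cores)
    for mode in ("system", "user", "idle")
]

# Selector: keep only mode="system", leaving one series per core.
selected = [s for s in series if s["mode"] == "system"]

# Inner aggregation: count by (instance, cpu).
per_core = Counter((s["instance"], s["cpu"]) for s in selected)

# Outer aggregation: count by (instance) over the inner result.
cores_per_instance = Counter(instance for instance, _cpu in per_core)
print(dict(cores_per_instance))
```

Filtering on a single mode first matters: without it, each core would contribute one series per mode and a plain count would overstate the core count.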
Compute memory utilization
(1 - (node_memory_MemAvailable_bytes{instance=~"192.168.100.10:20001"} / (node_memory_MemTotal_bytes{instance=~"192.168.100.10:20001"})))* 100
Element | Value |
---|---|
{instance="192.168.100.10:20001",job="node"} | 87.09358620413717 |
Compute root partition usage
100 - ((node_filesystem_avail_bytes{instance="192.168.100.10:20001",mountpoint="/",fstype=~"ext4|xfs"} * 100) / node_filesystem_size_bytes {instance=~"192.168.100.10:20001",mountpoint="/",fstype=~"ext4|xfs"})
Element | Value |
---|---|
{device="/dev/mapper/centos-root",fstype="xfs",instance="192.168.100.10:20001",job="node",mountpoint="/"} | 4.175111443575972 |
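Predict disk space
# The expression has two parts joined by `and`. The first part selects root partitions whose usage is at least 85%. The second part uses the last 6 hours of data to predict whether the available space will drop below 0 within the next 24 hours.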
(1- node_filesystem_avail_bytes{fstype=~"ext4|xfs",mountpoint="/"} / node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint="/"}) * 100 >= 85 and (predict_linear(node_filesystem_avail_bytes[6h],3600 * 24) < 0)
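predict_linear is documented as using simple linear regression. The sketch below is a simplified least-squares model of that idea, not the Prometheus implementation, and the samples are invented: a filesystem losing 1 GB of free space per hour over the last 6 hours.

```python
def predict_linear(samples, eval_time, horizon_s):
    """Least-squares fit over (timestamp, value) samples, extrapolated
    horizon_s seconds past eval_time (simplified model)."""
    n = len(samples)
    xs = [t - eval_time for t, _ in samples]  # time relative to evaluation
    ys = [v for _, v in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x  # fitted value at eval_time
    return intercept + slope * horizon_s


# Invented data: hourly samples from 20 GB down to 14 GB of available space.
samples = [(h * 3600, 20e9 - h * 1e9) for h in range(7)]
predicted = predict_linear(samples, eval_time=6 * 3600, horizon_s=3600 * 24)
print(predicted, predicted < 0)
```

With this trend the fit predicts a negative amount of free space 24 hours out, so the right-hand side of the `and` would hold; the alert-style expression above only returns a series when the 85% usage condition is true as well.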