  • Better Prometheus rate() Function with VictoriaMetrics

    Reposted from: https://www.percona.com/blog/2020/02/28/better-prometheus-rate-function-with-victoriametrics/

    There are a lot of things I love about Prometheus: its data model is fantastic for monitoring applications, and the PromQL language is often more expressive than SQL for the data retrieval needs you have in the observability space. One thing I hate about Prometheus with a deep passion, though, is the behavior of its rate() and similar functions. That behavior is deeply rooted in the Prometheus computational model, which the development team told me is not likely to change.

    So What’s the Problem, and Why is it Such a Big Deal?

    First, the problem. rate() and similar functions give you the rate of change of a time series over the interval supplied, so rate(mysql_global_status_questions[10s]) will basically give us the average number of MySQL questions per second over the last 10 seconds. Everything is great so far.

    But what if the resolution of this time series is coarser than 10 seconds, for example, if we take the mysql_global_status_questions measurement only once a minute? In this case, the 10-second window never contains the two samples rate() needs, so the function returns nothing and the data disappears from the graph.
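    To make the failure mode concrete, here is a minimal Python sketch of a window-based rate calculation. It is not the Prometheus implementation: it ignores counter resets and the extrapolation Prometheus applies at the window boundaries, and the name window_rate and the sample data are made up for illustration. It only shows why a window narrower than the scrape interval yields no result.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def window_rate(samples: Sequence[Sample], now: float, window: float) -> Optional[float]:
    """Simplified stand-in for rate(): per-second increase between the first
    and last samples inside (now - window, now].

    Assumes samples are sorted by timestamp. Returns None when fewer than
    two samples fall inside the window, which is the "no data" case.
    """
    in_window = [(t, v) for t, v in samples if now - window < t <= now]
    if len(in_window) < 2:
        return None
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter scraped once a minute, growing by 6000 per scrape.
scraped = [(0.0, 0.0), (60.0, 6000.0), (120.0, 12000.0)]

narrow = window_rate(scraped, now=120.0, window=10.0)   # one sample in window
wide = window_rate(scraped, now=120.0, window=120.0)    # two samples in window

print(narrow)  # None
print(wide)    # 100.0
```

    With a 10-second window only the sample at t=120 is visible, so there is nothing to compute a rate from; widening the window to cover two scrapes gives 100 questions per second.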

    What would I like to see instead? The common sense answer! If I tell you the MySQL Questions counter was 1M at 0:00 and 2M at 10:00, and ask you what the average number of queries per second was from 4:00 to 5:00, you will just use the best estimate you can and give the average based on the data available.
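    The arithmetic behind that common sense answer can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular database computes it. It assumes the timestamps above are minutes:seconds (so 10:00 is 600 seconds), and the function name estimated_rate is hypothetical.

```python
from bisect import bisect_left, bisect_right


def estimated_rate(samples, start, end):
    """Estimate the per-second rate over [start, end] from the nearest
    sample at or before `start` and the nearest sample at or after `end`.

    `samples` is a list of (timestamp_seconds, counter_value) pairs sorted
    by timestamp. Returns None if the interval is not bracketed by samples.
    """
    times = [t for t, _ in samples]
    i = bisect_right(times, start) - 1  # last sample at or before start
    j = bisect_left(times, end)         # first sample at or after end
    if i < 0 or j >= len(samples):
        return None
    (t0, v0), (t1, v1) = samples[i], samples[j]
    if t1 == t0:
        return None
    return (v1 - v0) / (t1 - t0)


# 1M questions at 0:00 and 2M at 10:00 (600 seconds later).
samples = [(0, 1_000_000), (600, 2_000_000)]

# Average queries per second between 4:00 (240 s) and 5:00 (300 s).
qps = estimated_rate(samples, 240, 300)
print(round(qps, 2))  # 1666.67
```

    No sample falls inside the 4:00 to 5:00 interval, yet the two surrounding samples still give a usable estimate of roughly 1667 queries per second.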

    Of course, such an approach is not without its problems. For example, it is possible that MySQL actually reached 10M queries by 5:00, was then restarted, and climbed back to 2M by 10:00, in which case the estimate will be wrong. Yet I believe that in most cases having such data is preferable to having no data at all.

    Existing “Solutions”

    One “solution” Prometheus provides to this problem is the irate() function, which gives you the “instant rate” based on the last two data points in the time series. If you use irate() with a large enough interval, you can avoid getting “no data”, but you run into another problem: you’ll be getting very volatile data based on just two measurements, whereas a less volatile value, smoothed over a longer period of time, might be what you actually want.
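    The volatility is easy to see with a small Python sketch that contrasts a "last two points" rate with a rate averaged over the whole window. Both functions are simplified stand-ins (no counter-reset handling, no extrapolation), their names are made up, and the bursty counter data is hypothetical.

```python
def _samples_in_window(samples, now, window):
    """Samples with timestamps in (now - window, now], assumed sorted."""
    return [(t, v) for t, v in samples if now - window < t <= now]


def instant_rate(samples, now, window):
    """irate()-like: per-second rate from the last two samples in the window."""
    pts = _samples_in_window(samples, now, window)
    if len(pts) < 2:
        return None
    (t0, v0), (t1, v1) = pts[-2], pts[-1]
    return (v1 - v0) / (t1 - t0)


def average_rate(samples, now, window):
    """rate()-like: per-second rate from the first and last samples in the window."""
    pts = _samples_in_window(samples, now, window)
    if len(pts) < 2:
        return None
    (t0, v0), (t1, v1) = pts[0], pts[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter scraped every 60 s: steady growth, a pause, then a burst.
bursty = [(0, 0), (60, 600), (120, 1200), (180, 1800), (240, 1800), (300, 7800)]

print(instant_rate(bursty, now=240, window=300))  # 0.0
print(average_rate(bursty, now=240, window=300))  # 7.5
print(instant_rate(bursty, now=300, window=300))  # 100.0
print(average_rate(bursty, now=300, window=300))  # 30.0
```

    Between two consecutive evaluation points the two-sample value jumps from 0 to 100 per second, while the window average moves from 7.5 to 30.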

    Another problem is that only rate() has such a counterpart in irate(); other functions, such as avg_over_time() or max_over_time(), do not have any great options.

    One solution, which is often recommended, is to just build your dashboards to match the data capture resolution so you can’t get into such situations.

    This is a non-starter for our use case at Percona.

    We use Prometheus as a key component in Percona Monitoring and Management (PMM), where the data capture resolution is highly configurable, so it can differ between periods of time and between time series in the system. Additionally, most of the dashboards we provide are dynamic, using a shorter averaging period as you “zoom in” to the data.

    VictoriaMetrics to the Rescue

    VictoriaMetrics is a time series database which can be connected to Prometheus using the RemoteWrite backend. It implements a Read API, which is mostly compatible with Prometheus, as well as MetricsQL, which is mostly compatible with PromQL and offers some additional language features.

    VictoriaMetrics has other advantages compared to Prometheus, including massively parallel operation for scalability, better performance, and better data compression, though what we focus on in this blog post is its handling of the rate() function.

    VictoriaMetrics handles the rate() function in the common sense way I described earlier!

    Let’s take a look at the difference in practice. Here I am using a prototype build of Percona Monitoring and Management with VictoriaMetrics. In the “Questions” panel we use the needlessly complicated formula:

    Which we can simplify to the “common sense” formula we’d like to use, without workarounds required:

    Let’s now compare the graphs from Prometheus (left) and VictoriaMetrics (right).

    1h Range

    Prometheus vs VictoriaMetrics

    For the 1-hour range, the resolution is high enough for both Prometheus and VictoriaMetrics to display data. The differences between the graphs come from the fact that these are two separate instances running similar workloads rather than the same data in both data stores.

    5min Range

    Prometheus vs VictoriaMetrics

    In this case, as you can see, Prometheus shows no data, while VictoriaMetrics provides data even though the requested resolution is 1 second and the data is available only at 5-second resolution.

    Summary

    We’re very early in the process of evaluating VictoriaMetrics, but I’m thrilled that it solves this very annoying problem we have with Prometheus query handling. I wonder if this is a problem for you as well, and whether you too find VictoriaMetrics’ behavior more user-friendly or whether Prometheus’ behavior is preferred in your environment.

  • Original post: https://www.cnblogs.com/rongfengliang/p/12793491.html