zoukankan      html  css  js  c++  java
  • hystrix熔断器之metrics

    Metric概述

      HystrixCommands和HystrixObservableCommands执行过程中,会产生执行的数据,这些数据对于观察调用的性能表现非常有用。

      命令产生数据后,Metrics会根据不同纬度进行统计,主要有一下三个纬度:一段时间内(窗口期)的累计统计数据、持续的累计统计数据、一段时间内(窗口期)的数据分布。

    Metric实现

      Metrics实现主要的流程如下:

      1.命令在开始执行前会向开始消息流(HystrixCommandStartStream)发送开始消息(HystrixCommandExecutionStarted)。  

      2.如果是线程池执行,执行前会向线程池开始消息流(HystrixThreadPoolStartStream)发送开始消息(HystrixCommandExecutionStarted)。

      3.如果是线程池执行,执行后会向线程池结束消息流(HystrixThreadPoolCompletionStream)发送完成消息(HystrixCommandCompletion)。

      4.命令在结束执行前会向完成消息流(HystrixCommandCompletionStream)发送完成消息(HystrixCommandCompletion)。

      5.不同类型的统计流,会监听开始消息流或完成消息流,根据接受到的消息内容,进行统计。

    Hystrix消息类型

      HystrixCommandCompletion有一下消息类型

     一段时间内统计

       统计流首先监听一个消息流(开始消息流或者完成消息流),统计一段时间内各个类型消息的累计数据(时间为:metrics.rollingStats.timeInMilliseconds/metrics.rollingStats.numBuckets)。然后再对累计的数据进行累加(个数为:metrics.rollingStats.numBuckets),即为最终累计数据。

      RollingCommandEventCounterStream消息流监听了HystrixCommandCompletionStream消息流,并统计各种消息类型次数。      

      RollingCollapserEventCounterStream消息流监听了HystrixCollapserEventStream消息流,并统计各种消息类型次数。 

      RollingThreadPoolEventCounterStream消息流监听了HystrixThreadPoolCompletionStream消息流,并统计各种消息类型次数。 

      HealthCountsStream消息流监听了HystrixThreadPoolCompletionStream消息流,并统计成功次数,失败次数,失败率。 

    持续统计

       统计流首先监听一个消息流(开始消息流或者完成消息流),统计一段时间内各个类型消息的累计数据(时间为:metrics.rollingStats.timeInMilliseconds/metrics.rollingStats.numBuckets)。然后不断的累加累计数据。

      CumulativeCommandEventCounterStream监听了HystrixCommandCompletionStream消息流,并统计各种消息类型次数。    

      CumulativeCollapserEventCounterStream监听了HystrixCollapserEventStream消息流,并统计各种消息类型次数。    

      CumulativeThreadPoolEventCounterStream监听了HystrixThreadPoolCompletionStream消息流,并统计各种消息类型次数。    

    一段时间内分布统计

      RollingDistributionStream监听一个消息流,例如HystrixCommandStartStream,然后通过RX java对一段时间内的数值进行运算操作,生成统计值放在Histogram对象中,然后重新发射,对窗口期内的Histogram对象进行运算操作,并生成统计值重新发射。

        子类RollingCommandLatencyDistributionStream监听了HystrixCommandCompletionStream消息流,并且通过RX java监听窗口期内的executelatency,通过Histogram计算窗口期内延时的分布。

        子类RollingCommandUserLatencyDistributionStream监听了HystrixCommandCompletionStream消息流,并且通过RX java监听窗口期内的totalLatency,通过Histogram计算窗口期内延时的分布。

        子类RollingCollapserBatchSizeDistributionStream监听了HystrixCollapserEventStream消息流,并且通过RX java监听窗口期内的ADDED_TO_BATCH消息类型次数,通过Histogram计算窗口期内延时的分布。

      RollingConcurrencyStream监听一个消息流,例如HystrixCommandStartStream,然后通过RX java对一段时间内的执行并发量取最大值,重新发射,对窗口期内的执行并发量取最大值,重新发射。

        子类RollingCommandMaxConcurrencyStream监听了HystrixCommandStartStream,然后通过RX java对窗口期内的执行并发量取最大值。

        子类RollingThreadPoolMaxConcurrencyStream监听了HystrixThreadPoolStartStream,然后通过RX java对窗口期内的执行并发量取最大值。

    其他数据流

      还有一些独立于消息流的数据流,对于理解系统信息也非常有帮助。

      配置流HystrixConfigurationStream,通过该数据流可以定时获取hystrix最新的properties配置信息,com.netflix.hystrix.contrib.sample.stream.HystrixConfigSseServlet就是用该流来获取配置信息。

      数据格式:

    data: {"type":"HystrixConfig","commands":{"CreditCardCommand":{"threadPoolKey":"CreditCard","groupKey":"CreditCard","execution":{"isolationStrategy":"THREAD","threadPoolKeyOverride":null,"requestCacheEnabled":true,"requestLogEnabled":true,"timeoutEnabled":true,"fallbackEnabled":true,"timeoutInMilliseconds":3000,"semaphoreSize":10,"fallbackSemaphoreSize":10,"threadInterruptOnTimeout":true},"metrics":{"healthBucketSizeInMs":500,"percentileBucketSizeInMilliseconds":60000,"percentileBucketCount":10,"percentileEnabled":true,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"circuitBreaker":{"enabled":true,"isForcedOpen":false,"isForcedClosed":false,"requestVolumeThreshold":20,"errorPercentageThreshold":50,"sleepInMilliseconds":5000}},"GetUserAccountCommand":{"threadPoolKey":"User","groupKey":"User","execution":{"isolationStrategy":"THREAD","threadPoolKeyOverride":null,"requestCacheEnabled":true,"requestLogEnabled":true,"timeoutEnabled":true,"fallbackEnabled":true,"timeoutInMilliseconds":50,"semaphoreSize":10,"fallbackSemaphoreSize":10,"threadInterruptOnTimeout":true},"metrics":{"healthBucketSizeInMs":500,"percentileBucketSizeInMilliseconds":60000,"percentileBucketCount":10,"percentileEnabled":true,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"circuitBreaker":{"enabled":true,"isForcedOpen":false,"isForcedClosed":false,"requestVolumeThreshold":20,"errorPercentageThreshold":50,"sleepInMilliseconds":5000}},"GetOrderCommand":{"threadPoolKey":"Order","groupKey":"Order","execution":{"isolationStrategy":"THREAD","threadPoolKeyOverride":null,"requestCacheEnabled":true,"requestLogEnabled":true,"timeoutEnabled":true,"fallbackEnabled":true,"timeoutInMilliseconds":1000,"semaphoreSize":10,"fallbackSemaphoreSize":10,"threadInterruptOnTimeout":true},"metrics":{"healthBucketSizeInMs":500,"percentileBucketSizeInMilliseconds":60000,"percentileBucketCount":10,"percentileEnabled":true,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"circuitBreaker":{"enabled":true,"isForcedOpen":false,"isForcedClosed":false,"requestVolumeThreshold":20,"errorPercentageThreshold":50,"sleepInMilliseconds":5000}},"GetPaymentInformationCommand":{"threadPoolKey":"PaymentInformation","groupKey":"PaymentInformation","execution":{"isolationStrategy":"THREAD","threadPoolKeyOverride":null,"requestCacheEnabled":true,"requestLogEnabled":true,"timeoutEnabled":true,"fallbackEnabled":true,"timeoutInMilliseconds":1000,"semaphoreSize":10,"fallbackSemaphoreSize":10,"threadInterruptOnTimeout":true},"metrics":{"healthBucketSizeInMs":500,"percentileBucketSizeInMilliseconds":60000,"percentileBucketCount":10,"percentileEnabled":true,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"circuitBreaker":{"enabled":true,"isForcedOpen":false,"isForcedClosed":false,"requestVolumeThreshold":20,"errorPercentageThreshold":50,"sleepInMilliseconds":5000}}},"threadpools":{"User":{"coreSize":8,"maxQueueSize":-1,"queueRejectionThreshold":5,"keepAliveTimeInMinutes":1,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"CreditCard":{"coreSize":8,"maxQueueSize":-1,"queueRejectionThreshold":5,"keepAliveTimeInMinutes":1,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"Order":{"coreSize":8,"maxQueueSize":-1,"queueRejectionThreshold":5,"keepAliveTimeInMinutes":1,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"PaymentInformation":{"coreSize":8,"maxQueueSize":-1,"queueRejectionThreshold":5,"keepAliveTimeInMinutes":1,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10}},"collapsers":{}}

      功能流HystrixUtilizationStream,通过该数据流可以获得并发量,线程池状况等信息。com.netflix.hystrix.contrib.sample.stream.HystrixUtilizationSseServlet就是用该流来获取配置信息。

      数据格式:  

    data: {"type":"HystrixUtilization","commands":{"CreditCardCommand":{"activeCount":0},"GetUserAccountCommand":{"activeCount":0},"GetOrderCommand":{"activeCount":1},"GetPaymentInformationCommand":{"activeCount":0}},"threadpools":{"User":{"activeCount":0,"queueSize":0,"corePoolSize":8,"poolSize":2},"CreditCard":{"activeCount":0,"queueSize":0,"corePoolSize":8,"poolSize":1},"Order":{"activeCount":1,"queueSize":0,"corePoolSize":8,"poolSize":2},"PaymentInformation":{"activeCount":0,"queueSize":0,"corePoolSize":8,"poolSize":2}}}

      请求数据流HystrixRequestEventsStream,通过该数据流可以获得http请求相关的信息,com.netflix.hystrix.contrib.requests.stream.HystrixRequestEventsSseServlet就是用该流来获取配置信息。

      数据格式:

    data: {"name":"GetOrderCommand","events":["SUCCESS"],"latencies":[111]},{"name":"GetPaymentInformationCommand","events":["SUCCESS"],"latencies":[15]},{"name":"GetUserAccountCommand","events":["TIMEOUT","FALLBACK_SUCCESS"],"latencies":[59],"cached":2},{"name":"CreditCardCommand","events":["SUCCESS"],"latencies":[957]}],[{"name":"GetUserAccountCommand","events":["SUCCESS"],"latencies":[3],"cached":2},{"name":"GetOrderCommand","events":["SUCCESS"],"latencies":[77]},{"name":"GetPaymentInformationCommand","events":["SUCCESS"],"latencies":[21]},{"name":"CreditCardCommand","events":["SUCCESS"],"latencies":[1199]}

    MetricPublisher

      有时,我们需要发布Hystrix中的metrics到其他地方,Hystrix提供了相应的接口(HystrixMetricsPublisherCollapser,HystrixMetricsPublisherCommand,HystrixMetricsPublisherThreadPool),实现这些接口,并在initial方法中实现发送hystrix的metrics。并且实现HystrixMetricsPublisher,来创建这些实现类。

      Hystrix原理:

      Hystrix运行时,HystrixMetricsPublisherFactory通过HystrixPlugins获取HystrixMetricsPublisher的实现类。并且通过该实现类来创建(HystrixMetricsPublisherCollapser,HystrixMetricsPublisherCommand,HystrixMetricsPublisherThreadPool)的实现类,并在初次创建时调用其initial方法。

      coda hale 实现了将hystrix 的metrics信息输出到指定metrics监控系统中。

      引入jar包:

      <dependency>

                <groupId>com.netflix.hystrix</groupId>

                <artifactId>hystrix-codahale-metrics-publisher</artifactId>

                <version>1.5.9</version>

            </dependency>

      创建HystrixMetricsPublisher对象并注册到HystrixPlugins:

      @Bean

        HystrixMetricsPublisher hystrixMetricsPublisher() {

            HystrixCodaHaleMetricsPublisher publisher = new HystrixCodaHaleMetricsPublisher(metricRegistry);

            HystrixPlugins.getInstance().registerMetricsPublisher(publisher);

            return publisher;

        }

      coda hale实现源码如下:

      public class HystrixCodaHaleMetricsPublisher extends HystrixMetricsPublisher {

          private final String metricsRootNode;

          private final MetricRegistry metricRegistry;

          public HystrixCodaHaleMetricsPublisher(MetricRegistry metricRegistry) {

              this(null, metricRegistry);

          }

          public HystrixCodaHaleMetricsPublisher(String metricsRootNode, MetricRegistry metricRegistry) {

              this.metricsRootNode = metricsRootNode;

              this.metricRegistry = metricRegistry;

          }

          @Override

          public HystrixMetricsPublisherCommand getMetricsPublisherForCommand(HystrixCommandKey commandKey, HystrixCommandGroupKey commandGroupKey, HystrixCommandMetrics metrics, HystrixCircuitBreaker circuitBreaker, HystrixCommandProperties properties) {

              return new HystrixCodaHaleMetricsPublisherCommand(metricsRootNode, commandKey, commandGroupKey, metrics, circuitBreaker, properties, metricRegistry);

          }

          @Override

          public HystrixMetricsPublisherThreadPool getMetricsPublisherForThreadPool(HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolMetrics metrics, HystrixThreadPoolProperties properties) {

              return new HystrixCodaHaleMetricsPublisherThreadPool(metricsRootNode, threadPoolKey, metrics, properties, metricRegistry);

          }

          @Override

          public HystrixMetricsPublisherCollapser getMetricsPublisherForCollapser(HystrixCollapserKey collapserKey, HystrixCollapserMetrics metrics, HystrixCollapserProperties properties) {

              return new HystrixCodaHaleMetricsPublisherCollapser(collapserKey, metrics, properties, metricRegistry);

          }

      }

      HystrixCodaHaleMetricsPublisher负责创建HystrixCodaHaleMetricsPublisherCommand,HystrixCodaHaleMetricsPublisherThreadPool,HystrixCodaHaleMetricsPublisherCollapser。这三个对象实现基本逻辑是在initialize方法中向metricRegistry中设置相应信息。

       public void initialize() {

          metricRegistry.register(createMetricName("isCircuitBreakerOpen"), new Gauge<Boolean>() {

          @Override

          public Boolean getValue() {

               return circuitBreaker.isOpen();

          }

        .....

     }

      HystrixCodaHaleMetricsPublisherCommand向metricRegistry设置了一下metrics信息:

      isCircuitBreakerOpen  是否熔断,通过熔断器获得。

      currentTime  当前系统时间

      countBadRequests  通过HystrixCommandMetrics获得。

    统计流类实现类

    Hystrix的Metrics功能模块中存储了与Hystrix运行相关的度量信息,主要有三类类型:

      1)HystrixCommandMetrics:

        保存hystrix命令执行的度量信息。

        markCommandStart 当命令开始执行,调用该方法。

        markCommandDone 命令执行完成,调用该方法。

        getRollingCount 获取某一事件类型窗口期内的统计数值

        getCumulativeCount 获取某一事件类型持续的统计数值

            getExecutionTimePercentile  获取某一百分比的请求执行时间

            getExecutionTimeMean  获取平均请求执行时间

        getTotalTimePercentile,获取某一百分比的请求执行总时间

        getTotalTimeMean,获取平均请求执行总时间

        getRollingMaxConcurrentExecutions 获取上一个窗口期内最大的并发数

            getHealthCountsStream 获取窗口期内的失败次数,总次数,失败比率

        记录以下事件类型:

        HystrixRollingNumberEvent.SUCCESS 命令执行成功的

        FAILURE

        TIMEOUT(1), SHORT_CIRCUITED(1), THREAD_POOL_REJECTED(1), SEMAPHORE_REJECTED(1), BAD_REQUEST(1),     FALLBACK_SUCCESS(1), FALLBACK_FAILURE(1), FALLBACK_REJECT

      2)HystrixThreadPoolMetrics 保存hystrix线程池执行的度量信息。

      markThreadCompletion 当线程吃执行一个任务时调用。

      markThreadExecution  当线程池完成一个任务时调用。

      getRollingCount 获取某一事件类型窗口期内的统计数值。

      getCumulativeCount  获取某一事件类型持续的统计数值。

    HystrixCommandMetrics实现:

      当调用markCommandStart方法时,实际向消息流对象HystrixCommandStartStream 写入HystrixCommandExecutionStarted消息。消息流是消息传输的中间件,其内部是一个RX java Subject。消息监听者通过订阅这些消息流来监听这些消息。如果是线程模式执行,还需要向消息流对象HystrixThreadPoolStartStream 写入HystrixCommandExecutionStarted消息。

      当调用markCommandDone方法时, 实际向消息流对象HystrixCommandCompletionStream 写入HystrixCommandCompletion消息。如果是线程模式执行,还需要向消息流对象HystrixThreadPoolCompletionStream 写入HystrixCommandCompletion消息。

      当调用collapserResponseFromCache方法时,实际向消息流对象HystrixCollapserEventStream写入HystrixCollapserEvent消息。消息流是消息传输的中间件,其内部是一个RX java Subject。消息监听者通过订阅这些消息流来监听这些消息。

      当调用collapserBatchExecuted方法时,实际向消息流对象HystrixCollapserEventStream写入HystrixCollapserEvent消息。消息流是消息传输的中间件,其内部是一个RX java Subject。消息监听者通过订阅这些消息流来监听这些消息。

      当调用getRollingCount方法时,实际从消息流对象RollingCommandEventCounterStream获取相应的信息。RollingCommandEventCounterStream消息流监听了HystrixCommandCompletionStream消息流,并且通过RX java 对各个消息类型进行一段时间内数据的统计。      

      当调用getCumulativeCount方法时,实际从消息流对象CumulativeCommandEventCounterStream获取相应的信息。CumulativeCommandEventCounterStream消息流监听了HystrixCommandCompletionStream消息流,并且通过RX java 对各个消息类型进行持续的数据的统计。   

           当调用getExecutionTimePercentile,getExecutionTimeMean方法时,实际从消息流对象RollingCommandLatencyDistributionStream获取相应的信息。RollingCommandLatencyDistributionStream消息流监听了HystrixCommandCompletionStream消息流。并且通过RX java对窗口期内的请求的executionLatency的分布进行计算。

           当调用getTotalTimePercentile,getTotalTimeMean方法时,实际从消息流对象RollingCommandUserLatencyDistributionStream获取相应的信息。RollingCommandUserLatencyDistributionStream消息流监听了HystrixCommandCompletionStream消息流。并且通过RX java对窗口期内的请求的totalLatency的分布进行计算。

           当调用getHealthCountsStream,实际从消息流对象HealthCountsStream获取想要信息,HealthCountsStream消息流监听了HystrixCommandCompletionStream消息流,并且通过RX java对窗口期内的请求成功,失败,超时,进行统计得出失败次数,总次数,失败比率。

       当调用getRollingMaxConcurrentExecutions,实际从消息流对象RollingCommandMaxConcurrencyStream获取相应的信息。RollingCommandMaxConcurrencyStream消息流监听了HystrixCommandStartStream消息流,并且通过RX java 获取窗口期内最大的并发数。

    HystrixThreadPoolMetrics实现:

      当调用getRollingCount方法时,实际从消息流对象RollingThreadPoolEventCounterStream获取相应的信息。RollingThreadPoolEventCounterStream消息流监听了HystrixCommandExecutionStarted消息流,并且进行一段时间内数据的统计。

    当前服务的健康状况, 包括服务调用总次数和服务调用失败次数等. 根据Metrics的计数, 熔断器从而能计算出当前服务的调用失败率, 用来和设定的阈值比较从而决定熔断器的状态切换逻辑. 因此Metrics的实现非常重要。

    HystrixRollingNumber统计一定时间内的统计数值,基本思想就是分段统计,比如说要统计qps,即1秒内的请求总数。如下图所示,我们可以将1s的时间分成10段,每段100ms。在第一个100ms内,写入第一个段中进行计数,在第二个100ms内,写入第二个段中进行计数,这样如果要统计当前时间的qps,我们总是可以通过统计当前时间前1s(共10段)的计数总和值。

  • 相关阅读:
    Top 10 Product Manager Skills To Boost Your Resume In 2021
    大数据知识梳理
    B端产品如何设计权限系统?
    华三盒式交换机MAC、ARP、Route性能表项参数查询
    中了传说中的挖矿病毒
    SqlServer 2019 事务日志传送
    docker中生成的pdf中文是方框的解决方案
    The Live Editor is unable to run in the current system configuration
    2021 面试题大纲
    五分钟搞定Docker安装ElasticSearch
  • 原文地址:https://www.cnblogs.com/zhangwanhua/p/7467470.html
Copyright © 2011-2022 走看看