zoukankan      html  css  js  c++  java
  • hystrix熔断器之metrics

    Metric概述

      HystrixCommands和HystrixObservableCommands执行过程中,会产生执行的数据,这些数据对于观察调用的性能表现非常有用。

      命令产生数据后,Metrics会根据不同纬度进行统计,主要有一下三个纬度:一段时间内(窗口期)的累计统计数据、持续的累计统计数据、一段时间内(窗口期)的数据分布。

    Metric实现

      Metrics实现主要的流程如下:

      1.命令在开始执行前会向开始消息流(HystrixCommandStartStream)发送开始消息(HystrixCommandExecutionStarted)。  

      2.如果是线程池执行,执行前会向线程池开始消息流(HystrixThreadPoolStartStream)发送开始消息(HystrixCommandExecutionStarted)。

      3.如果是线程池执行,执行后会向线程池结束消息流(HystrixThreadPoolCompletionStream)发送完成消息(HystrixCommandCompletion)。

      4.命令在结束执行前会向完成消息流(HystrixCommandCompletionStream)发送完成消息(HystrixCommandCompletion)。

      5.不同类型的统计流,会监听开始消息流或完成消息流,根据接受到的消息内容,进行统计。

    Hystrix消息类型

      HystrixCommandCompletion有一下消息类型

     一段时间内统计

       统计流首先监听一个消息流(开始消息流或者完成消息流),统计一段时间内各个类型消息的累计数据(时间为:metrics.rollingStats.timeInMilliseconds/metrics.rollingStats.numBuckets)。然后再对累计的数据进行累加(个数为:metrics.rollingStats.numBuckets),即为最终累计数据。

      RollingCommandEventCounterStream消息流监听了HystrixCommandCompletionStream消息流,并统计各种消息类型次数。      

      RollingCollapserEventCounterStream消息流监听了HystrixCollapserEventStream消息流,并统计各种消息类型次数。 

      RollingThreadPoolEventCounterStream消息流监听了HystrixThreadPoolCompletionStream消息流,并统计各种消息类型次数。 

      HealthCountsStream消息流监听了HystrixThreadPoolCompletionStream消息流,并统计成功次数,失败次数,失败率。 

    持续统计

       统计流首先监听一个消息流(开始消息流或者完成消息流),统计一段时间内各个类型消息的累计数据(时间为:metrics.rollingStats.timeInMilliseconds/metrics.rollingStats.numBuckets)。然后不断的累加累计数据。

      CumulativeCommandEventCounterStream监听了HystrixCommandCompletionStream消息流,并统计各种消息类型次数。    

      CumulativeCollapserEventCounterStream监听了HystrixCollapserEventStream消息流,并统计各种消息类型次数。    

      CumulativeThreadPoolEventCounterStream监听了HystrixThreadPoolCompletionStream消息流,并统计各种消息类型次数。    

    一段时间内分布统计

      RollingDistributionStream监听一个消息流,例如HystrixCommandStartStream,然后通过RX java对一段时间内的数值进行运算操作,生成统计值放在Histogram对象中,然后重新发射,对窗口期内的Histogram对象进行运算操作,并生成统计值重新发射。

        子类RollingCommandLatencyDistributionStream监听了HystrixCommandCompletionStream消息流,并且通过RX java监听窗口期内的executelatency,通过Histogram计算窗口期内延时的分布。

        子类RollingCommandUserLatencyDistributionStream监听了HystrixCommandCompletionStream消息流,并且通过RX java监听窗口期内的totalLatency,通过Histogram计算窗口期内延时的分布。

        子类RollingCollapserBatchSizeDistributionStream监听了HystrixCollapserEventStream消息流,并且通过RX java监听窗口期内的ADDED_TO_BATCH消息类型次数,通过Histogram计算窗口期内延时的分布。

      RollingConcurrencyStream监听一个消息流,例如HystrixCommandStartStream,然后通过RX java对一段时间内的执行并发量取最大值,重新发射,对窗口期内的执行并发量取最大值,重新发射。

        子类RollingCommandMaxConcurrencyStream监听了HystrixCommandStartStream,然后通过RX java对窗口期内的执行并发量取最大值。

        子类RollingThreadPoolMaxConcurrencyStream监听了HystrixThreadPoolStartStream,然后通过RX java对窗口期内的执行并发量取最大值。

    其他数据流

      还有一些独立于消息流的数据流,对于理解系统信息也非常有帮助。

      配置流HystrixConfigurationStream,通过该数据流可以定时获取hystrix最新的properties配置信息,com.netflix.hystrix.contrib.sample.stream.HystrixConfigSseServlet就是用该流来获取配置信息。

      数据格式:

    data: {"type":"HystrixConfig","commands":{"CreditCardCommand":{"threadPoolKey":"CreditCard","groupKey":"CreditCard","execution":{"isolationStrategy":"THREAD","threadPoolKeyOverride":null,"requestCacheEnabled":true,"requestLogEnabled":true,"timeoutEnabled":true,"fallbackEnabled":true,"timeoutInMilliseconds":3000,"semaphoreSize":10,"fallbackSemaphoreSize":10,"threadInterruptOnTimeout":true},"metrics":{"healthBucketSizeInMs":500,"percentileBucketSizeInMilliseconds":60000,"percentileBucketCount":10,"percentileEnabled":true,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"circuitBreaker":{"enabled":true,"isForcedOpen":false,"isForcedClosed":false,"requestVolumeThreshold":20,"errorPercentageThreshold":50,"sleepInMilliseconds":5000}},"GetUserAccountCommand":{"threadPoolKey":"User","groupKey":"User","execution":{"isolationStrategy":"THREAD","threadPoolKeyOverride":null,"requestCacheEnabled":true,"requestLogEnabled":true,"timeoutEnabled":true,"fallbackEnabled":true,"timeoutInMilliseconds":50,"semaphoreSize":10,"fallbackSemaphoreSize":10,"threadInterruptOnTimeout":true},"metrics":{"healthBucketSizeInMs":500,"percentileBucketSizeInMilliseconds":60000,"percentileBucketCount":10,"percentileEnabled":true,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"circuitBreaker":{"enabled":true,"isForcedOpen":false,"isForcedClosed":false,"requestVolumeThreshold":20,"errorPercentageThreshold":50,"sleepInMilliseconds":5000}},"GetOrderCommand":{"threadPoolKey":"Order","groupKey":"Order","execution":{"isolationStrategy":"THREAD","threadPoolKeyOverride":null,"requestCacheEnabled":true,"requestLogEnabled":true,"timeoutEnabled":true,"fallbackEnabled":true,"timeoutInMilliseconds":1000,"semaphoreSize":10,"fallbackSemaphoreSize":10,"threadInterruptOnTimeout":true},"metrics":{"healthBucketSizeInMs":500,"percentileBucketSizeInMilliseconds":60000,"percentileBucketCount":10,"percentileEnabled":true,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"circuitBreaker":{"enabled":true,"isForcedOpen":false,"isForcedClosed":false,"requestVolumeThreshold":20,"errorPercentageThreshold":50,"sleepInMilliseconds":5000}},"GetPaymentInformationCommand":{"threadPoolKey":"PaymentInformation","groupKey":"PaymentInformation","execution":{"isolationStrategy":"THREAD","threadPoolKeyOverride":null,"requestCacheEnabled":true,"requestLogEnabled":true,"timeoutEnabled":true,"fallbackEnabled":true,"timeoutInMilliseconds":1000,"semaphoreSize":10,"fallbackSemaphoreSize":10,"threadInterruptOnTimeout":true},"metrics":{"healthBucketSizeInMs":500,"percentileBucketSizeInMilliseconds":60000,"percentileBucketCount":10,"percentileEnabled":true,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"circuitBreaker":{"enabled":true,"isForcedOpen":false,"isForcedClosed":false,"requestVolumeThreshold":20,"errorPercentageThreshold":50,"sleepInMilliseconds":5000}}},"threadpools":{"User":{"coreSize":8,"maxQueueSize":-1,"queueRejectionThreshold":5,"keepAliveTimeInMinutes":1,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"CreditCard":{"coreSize":8,"maxQueueSize":-1,"queueRejectionThreshold":5,"keepAliveTimeInMinutes":1,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"Order":{"coreSize":8,"maxQueueSize":-1,"queueRejectionThreshold":5,"keepAliveTimeInMinutes":1,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10},"PaymentInformation":{"coreSize":8,"maxQueueSize":-1,"queueRejectionThreshold":5,"keepAliveTimeInMinutes":1,"counterBucketSizeInMilliseconds":10000,"counterBucketCount":10}},"collapsers":{}}

      功能流HystrixUtilizationStream,通过该数据流可以获得并发量,线程池状况等信息。com.netflix.hystrix.contrib.sample.stream.HystrixUtilizationSseServlet就是用该流来获取配置信息。

      数据格式:  

    data: {"type":"HystrixUtilization","commands":{"CreditCardCommand":{"activeCount":0},"GetUserAccountCommand":{"activeCount":0},"GetOrderCommand":{"activeCount":1},"GetPaymentInformationCommand":{"activeCount":0}},"threadpools":{"User":{"activeCount":0,"queueSize":0,"corePoolSize":8,"poolSize":2},"CreditCard":{"activeCount":0,"queueSize":0,"corePoolSize":8,"poolSize":1},"Order":{"activeCount":1,"queueSize":0,"corePoolSize":8,"poolSize":2},"PaymentInformation":{"activeCount":0,"queueSize":0,"corePoolSize":8,"poolSize":2}}}

      请求数据流HystrixRequestEventsStream,通过该数据流可以获得http请求相关的信息,com.netflix.hystrix.contrib.requests.stream.HystrixRequestEventsSseServlet就是用该流来获取配置信息。

      数据格式:

    data: {"name":"GetOrderCommand","events":["SUCCESS"],"latencies":[111]},{"name":"GetPaymentInformationCommand","events":["SUCCESS"],"latencies":[15]},{"name":"GetUserAccountCommand","events":["TIMEOUT","FALLBACK_SUCCESS"],"latencies":[59],"cached":2},{"name":"CreditCardCommand","events":["SUCCESS"],"latencies":[957]}],[{"name":"GetUserAccountCommand","events":["SUCCESS"],"latencies":[3],"cached":2},{"name":"GetOrderCommand","events":["SUCCESS"],"latencies":[77]},{"name":"GetPaymentInformationCommand","events":["SUCCESS"],"latencies":[21]},{"name":"CreditCardCommand","events":["SUCCESS"],"latencies":[1199]}

    MetricPublisher

      有时,我们需要发布Hystrix中的metrics到其他地方,Hystrix提供了相应的接口(HystrixMetricsPublisherCollapser,HystrixMetricsPublisherCommand,HystrixMetricsPublisherThreadPool),实现这些接口,并在initial方法中实现发送hystrix的metrics。并且实现HystrixMetricsPublisher,来创建这些实现类。

      Hystrix原理:

      Hystrix运行时,HystrixMetricsPublisherFactory通过HystrixPlugins获取HystrixMetricsPublisher的实现类。并且通过该实现类来创建(HystrixMetricsPublisherCollapser,HystrixMetricsPublisherCommand,HystrixMetricsPublisherThreadPool)的实现类,并在初次创建时调用其initial方法。

      coda hale 实现了将hystrix 的metrics信息输出到指定metrics监控系统中。

      引入jar包:

      <dependency>

                <groupId>com.netflix.hystrix</groupId>

                <artifactId>hystrix-codahale-metrics-publisher</artifactId>

                <version>1.5.9</version>

            </dependency>

      创建HystrixMetricsPublisher对象并注册到HystrixPlugins:

      @Bean

        HystrixMetricsPublisher hystrixMetricsPublisher() {

            HystrixCodaHaleMetricsPublisher publisher = new HystrixCodaHaleMetricsPublisher(metricRegistry);

            HystrixPlugins.getInstance().registerMetricsPublisher(publisher);

            return publisher;

        }

      coda hale实现源码如下:

      public class HystrixCodaHaleMetricsPublisher extends HystrixMetricsPublisher {

          private final String metricsRootNode;

          private final MetricRegistry metricRegistry;

          public HystrixCodaHaleMetricsPublisher(MetricRegistry metricRegistry) {

              this(null, metricRegistry);

          }

          public HystrixCodaHaleMetricsPublisher(String metricsRootNode, MetricRegistry metricRegistry) {

              this.metricsRootNode = metricsRootNode;

              this.metricRegistry = metricRegistry;

          }

          @Override

          public HystrixMetricsPublisherCommand getMetricsPublisherForCommand(HystrixCommandKey commandKey, HystrixCommandGroupKey commandGroupKey, HystrixCommandMetrics metrics, HystrixCircuitBreaker circuitBreaker, HystrixCommandProperties properties) {

              return new HystrixCodaHaleMetricsPublisherCommand(metricsRootNode, commandKey, commandGroupKey, metrics, circuitBreaker, properties, metricRegistry);

          }

          @Override

          public HystrixMetricsPublisherThreadPool getMetricsPublisherForThreadPool(HystrixThreadPoolKey threadPoolKey, HystrixThreadPoolMetrics metrics, HystrixThreadPoolProperties properties) {

              return new HystrixCodaHaleMetricsPublisherThreadPool(metricsRootNode, threadPoolKey, metrics, properties, metricRegistry);

          }

          @Override

          public HystrixMetricsPublisherCollapser getMetricsPublisherForCollapser(HystrixCollapserKey collapserKey, HystrixCollapserMetrics metrics, HystrixCollapserProperties properties) {

              return new HystrixCodaHaleMetricsPublisherCollapser(collapserKey, metrics, properties, metricRegistry);

          }

      }

      HystrixCodaHaleMetricsPublisher负责创建HystrixCodaHaleMetricsPublisherCommand,HystrixCodaHaleMetricsPublisherThreadPool,HystrixCodaHaleMetricsPublisherCollapser。这三个对象实现基本逻辑是在initialize方法中向metricRegistry中设置相应信息。

       public void initialize() {

          metricRegistry.register(createMetricName("isCircuitBreakerOpen"), new Gauge<Boolean>() {

          @Override

          public Boolean getValue() {

               return circuitBreaker.isOpen();

          }

        .....

     }

      HystrixCodaHaleMetricsPublisherCommand向metricRegistry设置了一下metrics信息:

      isCircuitBreakerOpen  是否熔断,通过熔断器获得。

      currentTime  当前系统时间

      countBadRequests  通过HystrixCommandMetrics获得。

    统计流类实现类

    Hystrix的Metrics功能模块中存储了与Hystrix运行相关的度量信息,主要有三类类型:

      1)HystrixCommandMetrics:

        保存hystrix命令执行的度量信息。

        markCommandStart 当命令开始执行,调用该方法。

        markCommandDone 命令执行完成,调用该方法。

        getRollingCount 获取某一事件类型窗口期内的统计数值

        getCumulativeCount 获取某一事件类型持续的统计数值

            getExecutionTimePercentile  获取某一百分比的请求执行时间

            getExecutionTimeMean  获取平均请求执行时间

        getTotalTimePercentile,获取某一百分比的请求执行总时间

        getTotalTimeMean,获取平均请求执行总时间

        getRollingMaxConcurrentExecutions 获取上一个窗口期内最大的并发数

            getHealthCountsStream 获取窗口期内的失败次数,总次数,失败比率

        记录以下事件类型:

        HystrixRollingNumberEvent.SUCCESS 命令执行成功的

        FAILURE

        TIMEOUT(1), SHORT_CIRCUITED(1), THREAD_POOL_REJECTED(1), SEMAPHORE_REJECTED(1), BAD_REQUEST(1),     FALLBACK_SUCCESS(1), FALLBACK_FAILURE(1), FALLBACK_REJECT

      2)HystrixThreadPoolMetrics 保存hystrix线程池执行的度量信息。

      markThreadCompletion 当线程吃执行一个任务时调用。

      markThreadExecution  当线程池完成一个任务时调用。

      getRollingCount 获取某一事件类型窗口期内的统计数值。

      getCumulativeCount  获取某一事件类型持续的统计数值。

    HystrixCommandMetrics实现:

      当调用markCommandStart方法时,实际向消息流对象HystrixCommandStartStream 写入HystrixCommandExecutionStarted消息。消息流是消息传输的中间件,其内部是一个RX java Subject。消息监听者通过订阅这些消息流来监听这些消息。如果是线程模式执行,还需要向消息流对象HystrixThreadPoolStartStream 写入HystrixCommandExecutionStarted消息。

      当调用markCommandDone方法时, 实际向消息流对象HystrixCommandCompletionStream 写入HystrixCommandCompletion消息。如果是线程模式执行,还需要向消息流对象HystrixThreadPoolCompletionStream 写入HystrixCommandCompletion消息。

      当调用collapserResponseFromCache方法时,实际向消息流对象HystrixCollapserEventStream写入HystrixCollapserEvent消息。消息流是消息传输的中间件,其内部是一个RX java Subject。消息监听者通过订阅这些消息流来监听这些消息。

      当调用collapserBatchExecuted方法时,实际向消息流对象HystrixCollapserEventStream写入HystrixCollapserEvent消息。消息流是消息传输的中间件,其内部是一个RX java Subject。消息监听者通过订阅这些消息流来监听这些消息。

      当调用getRollingCount方法时,实际从消息流对象RollingCommandEventCounterStream获取相应的信息。RollingCommandEventCounterStream消息流监听了HystrixCommandCompletionStream消息流,并且通过RX java 对各个消息类型进行一段时间内数据的统计。      

      当调用getCumulativeCount方法时,实际从消息流对象CumulativeCommandEventCounterStream获取相应的信息。CumulativeCommandEventCounterStream消息流监听了HystrixCommandCompletionStream消息流,并且通过RX java 对各个消息类型进行持续的数据的统计。   

           当调用getExecutionTimePercentile,getExecutionTimeMean方法时,实际从消息流对象RollingCommandLatencyDistributionStream获取相应的信息。RollingCommandLatencyDistributionStream消息流监听了HystrixCommandCompletionStream消息流。并且通过RX java对窗口期内的请求的executionLatency的分布进行计算。

           当调用getTotalTimePercentile,getTotalTimeMean方法时,实际从消息流对象RollingCommandUserLatencyDistributionStream获取相应的信息。RollingCommandUserLatencyDistributionStream消息流监听了HystrixCommandCompletionStream消息流。并且通过RX java对窗口期内的请求的totalLatency的分布进行计算。

           当调用getHealthCountsStream,实际从消息流对象HealthCountsStream获取想要信息,HealthCountsStream消息流监听了HystrixCommandCompletionStream消息流,并且通过RX java对窗口期内的请求成功,失败,超时,进行统计得出失败次数,总次数,失败比率。

       当调用getRollingMaxConcurrentExecutions,实际从消息流对象RollingCommandMaxConcurrencyStream获取相应的信息。RollingCommandMaxConcurrencyStream消息流监听了HystrixCommandStartStream消息流,并且通过RX java 获取窗口期内最大的并发数。

    HystrixThreadPoolMetrics实现:

      当调用getRollingCount方法时,实际从消息流对象RollingThreadPoolEventCounterStream获取相应的信息。RollingThreadPoolEventCounterStream消息流监听了HystrixCommandExecutionStarted消息流,并且进行一段时间内数据的统计。

    当前服务的健康状况, 包括服务调用总次数和服务调用失败次数等. 根据Metrics的计数, 熔断器从而能计算出当前服务的调用失败率, 用来和设定的阈值比较从而决定熔断器的状态切换逻辑. 因此Metrics的实现非常重要。

    HystrixRollingNumber统计一定时间内的统计数值,基本思想就是分段统计,比如说要统计qps,即1秒内的请求总数。如下图所示,我们可以将1s的时间分成10段,每段100ms。在第一个100ms内,写入第一个段中进行计数,在第二个100ms内,写入第二个段中进行计数,这样如果要统计当前时间的qps,我们总是可以通过统计当前时间前1s(共10段)的计数总和值。

  • 相关阅读:
    SQL优化,解决系统运行效率瓶颈
    C#中 哪些是值类型 哪些是引用类型
    C#异常类相关总结
    对象 序列化 字节流 传输
    给数组中的每个元素赋值
    对象转化为 xml字符串
    .NET BETWEEN方法
    Datatable To List<Entity>
    ajax原理
    gulp记录
  • 原文地址:https://www.cnblogs.com/zhangwanhua/p/7467470.html
Copyright © 2011-2022 走看看