  • Kafka 2.3 Performance Testing

    Introduction

    Knowing how Kafka performs, in general or on your own hardware, is an important part of capacity planning. Sizing is hard to calculate: message size, retention period, partition count, replication factor, network speed and even synchronous versus asynchronous replication all come into play. With so many decisions to make, what performance can you realistically expect?

    Kafka has been benchmarked before, but those results are a couple of years out of date. Newer versions of Kafka, newer hardware and faster networking have all improved performance since then. But by how much? This article runs the same tests as before with the latest and greatest to see what the improvements are.

    Hardware specification:

    Cloud provider:  AWS, 6 x m5.4xlarge machines
    OS:              CentOS 7
    CPU:             16 cores
    RAM:             64 GB
    Disk:            50 GB OS, 6 x 20 GB IO-optimised SSD for Kafka log dirs
    Networking:      10 Gbit/s
    Kafka version:   CDH 6.3 with Kafka 2.3
    Message size:    100 bytes per message

    Three of the six machines act as Kafka brokers in a single cluster, and the remaining three run the producers and consumers. Given 10 Gbit/s networking per node, we should be able to push almost 30 Gbit/s across the three Kafka nodes, and 60 Gbit/s across all six nodes.
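
    As a rough sanity check on those ceilings, the link speed can be converted into the MB/sec unit used in the results. This is a minimal sketch resting on two assumptions that the article does not state: the per-node figure of 10 is in gigabits per second (which is how AWS quotes m5 networking), and one MB is 1024 * 1024 bytes. The function name gbit_to_mb_per_sec is made up for illustration.

```shell
# Convert a link speed in Gbit/s to MB/sec.
# Assumption: 1 MB = 1024 * 1024 bytes; protocol overhead is ignored.
gbit_to_mb_per_sec() {
  awk -v gbit="$1" 'BEGIN { printf "%.0f\n", gbit * 1e9 / 8 / 1048576 }'
}

gbit_to_mb_per_sec 10   # one node
gbit_to_mb_per_sec 30   # the three brokers
gbit_to_mb_per_sec 60   # all six nodes
```

    Under those assumptions the three calls print 1192, 3576 and 7153 MB/sec, which is the raw line rate before any TCP or Kafka protocol overhead.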

    To make the testing simpler, I've set an environment variable to re-use in each command:

    BOOTSTRAP=10.0.0.4:9092,10.0.0.5:9092,10.0.0.6:9092

    First we need to create the topics, one for each combination of partition count and replication factor under test:

    kafka-topics --bootstrap-server ${BOOTSTRAP} --create --topic test-rep-one --partitions 6 --replication-factor 1
    kafka-topics --bootstrap-server ${BOOTSTRAP} --create --topic test-rep-three --partitions 6 --replication-factor 3
    kafka-topics --bootstrap-server ${BOOTSTRAP} --create --topic test-7k --partitions 18 --replication-factor 3
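
    Before running any load it can be worth confirming that the topics came out with the intended partition and replica layout. The sketch below is not part of the original test run; it uses the standard --describe option of kafka-topics and reuses ${BOOTSTRAP} from above. The KAFKA_TOPICS_CMD variable and the describe_test_topics function name are hypothetical helpers introduced here.

```shell
# Name of the topics CLI. CDH installs it as kafka-topics; the Apache
# tarball ships it as kafka-topics.sh, so it is left overridable.
KAFKA_TOPICS_CMD="${KAFKA_TOPICS_CMD:-kafka-topics}"

# Print partition count, replication factor and replica placement
# for each of the three test topics created above.
describe_test_topics() {
  for topic in test-rep-one test-rep-three test-7k; do
    "$KAFKA_TOPICS_CMD" --bootstrap-server "$BOOTSTRAP" --describe --topic "$topic"
  done
}
```

    Calling describe_test_topics once after the create commands should show 6, 6 and 18 partitions respectively if the topics were created as listed.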

    Kafka ships with two handy scripts you can use to test your cluster: kafka-producer-perf-test and kafka-consumer-perf-test.

    Test results

    Test  Scenario                                                  Messages/sec  MB/sec
    1     Single producer, no consumer, no replication                 1 628 081  155.27
    2     Single producer, no consumer, 3x async replication           1 463 136  140.46
    3     Single producer, no consumer, 3x synchronous replication     1 226 439  125.11
    4     3 producers, no consumer, 3x async replication               3 960 110  377.69
    5     No producer, single consumer                                 4 096 100  390.63
    6     No producer, three consumers                                11 813 321  1125
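
    The two throughput figures in each result are tied together by the record size, so one can be checked against the other. A minimal sketch, assuming that the MB reported by the perf tools means 1024 * 1024 bytes (the first result is consistent with that); msgs_to_mb_per_sec is a name chosen here for illustration.

```shell
# Convert a message rate and a record size in bytes into MB/sec,
# assuming 1 MB = 1024 * 1024 bytes.
msgs_to_mb_per_sec() {
  awk -v rate="$1" -v size="$2" 'BEGIN { printf "%.2f\n", rate * size / 1048576 }'
}

msgs_to_mb_per_sec 1628081 100   # prints 155.27, matching the first result
```

    The same function is handy for estimating what a given message rate would mean for your own record size before you run a test.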

    At this point we've repeated the original set of tests, hopefully with much improved numbers.
    However, those tests used 100-byte records, so they were re-run with 7 KB records and optimised Kafka settings (a larger heap size of 8 GB, larger batch sizes and snappy compression).

    Optimised Kafka results, 7k records

    Test  Scenario                                                                                  Messages/sec  MB/sec
    7     6 producers, no consumers, 3x async replication (larger batch sizes, snappy compression)     1 070 970  7321
    8     0 producers, 6 consumers                                                                       963 071  6896

    Over 1 million messages a second, reading and writing 7 KB per message. We've reached the networking limit!

    The commands used for each test, if you would like to reproduce them yourself:

    Test 1:
    kafka-producer-perf-test --topic test-rep-one --num-records 50000000 --record-size 100 --throughput -1 --producer-props acks=0 bootstrap.servers=${BOOTSTRAP} 
    
    Test 2:
    kafka-producer-perf-test --topic test-rep-three --num-records 50000000 --record-size 100 --throughput -1 --producer-props acks=0 bootstrap.servers=${BOOTSTRAP}
    
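    Test 3 (note that acks=1 waits for the leader's acknowledgement only; waiting for all in-sync replicas would need acks=all):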
    kafka-producer-perf-test --topic test-rep-three --num-records 50000000 --record-size 100 --throughput -1 --producer-props acks=1 bootstrap.servers=${BOOTSTRAP}
    
    Test 4 (run three instances in parallel, one on each node):
    kafka-producer-perf-test --topic test-rep-three --num-records 50000000 --record-size 100 --throughput -1 --producer-props acks=0 bootstrap.servers=${BOOTSTRAP}
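
    When several producers run in parallel, each one prints its own summary, and the combined figure is the sum of those. The sketch below is one way to add them up, assuming each producer's output was saved to a log file. It relies on the summary line format as I recall it from kafka-producer-perf-test ("N records sent, X records/sec (Y MB/sec), ..." ending with the percentile latencies); check it against the output of your version. The function name sum_producer_throughput is hypothetical.

```shell
# Sum the final summary line of each kafka-producer-perf-test log.
# Assumption: only the final line carries the percentile figures
# ("99.9th"), so periodic progress lines are skipped by the pattern.
sum_producer_throughput() {
  awk '
    /99\.9th/ {
      msgs += $4               # the number before "records/sec"
      mb   += substr($6, 2)    # the MB/sec figure with its leading "(" stripped
    }
    END { printf "%.1f records/sec (%.2f MB/sec)\n", msgs, mb }
  ' "$@"
}
```

    It reads either the files given as arguments or standard input, so something like sum_producer_throughput producer-*.log (log file names are an assumption) gives the cluster-wide total.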
    
    Test 5:
    kafka-consumer-perf-test --broker-list ${BOOTSTRAP} --messages 50000000 --topic test-rep-three --threads 1 --timeout 60000 --print-metrics --num-fetch-threads 6
    
    Test 6 (run three instances in parallel, one on each node):
    kafka-consumer-perf-test --broker-list ${BOOTSTRAP} --messages 50000000 --topic test-rep-three --threads 1 --timeout 60000 --print-metrics --num-fetch-threads 6
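
    The consumer tool reports its result as a comma-separated row under a header line, so the parallel consumers can be totalled in a similar way. This is a sketch under the assumption that the header contains columns named MB.sec and nMsg.sec, which is how I recall the output of kafka-consumer-perf-test; the columns are looked up by name so their position does not matter. The function name sum_consumer_throughput is hypothetical.

```shell
# Sum MB.sec and nMsg.sec across one or more kafka-consumer-perf-test logs.
# Assumption: each log has a header row starting with "start.time" and a
# data row starting with a timestamp; other lines (such as the output of
# --print-metrics) are ignored.
sum_consumer_throughput() {
  awk -F', *' '
    /^start\.time/ {
      for (i = 1; i <= NF; i++) {
        if ($i == "MB.sec")   mb = i
        if ($i == "nMsg.sec") n  = i
      }
      next
    }
    mb && n && /^[0-9]/ { total_mb += $mb; total_msgs += $n }
    END { printf "%.1f msg/sec (%.2f MB/sec)\n", total_msgs, total_mb }
  ' "$@"
}
```

    As with the producer logs, it accepts file names or standard input.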
    
    Test 7 (run a producer on each node, including the Kafka brokers):
    kafka-producer-perf-test --topic test-7k --num-records 50000000 --record-size 7168 --throughput -1 --producer-props acks=0 bootstrap.servers=${BOOTSTRAP} linger.ms=100 compression.type=snappy
    
    Test 8 (run a consumer on each node, including the Kafka brokers):
    kafka-consumer-perf-test --broker-list ${BOOTSTRAP} --messages 50000000 --topic test-7k --threads 1 --timeout 60000 --print-metrics --num-fetch-threads 18
    
    
  • Original article: https://www.cnblogs.com/felixzh/p/12857190.html