
                        Hadoop Ecosystem: Kafka Configuration File Explained

                                          Author: Yin Zhengjie

    Copyright notice: This is an original work; reproduction is not permitted, and violations will be pursued under the law.

    I. Default Kafka configuration file contents ([yinzhengjie@s101 ~]$ more /soft/kafka/config/server.properties)

    [yinzhengjie@s101 ~]$ more /soft/kafka/config/server.properties 
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    
    # see kafka.server.KafkaConfig for additional details and defaults
    
    ############################# Server Basics #############################
    
    # The id of the broker. This must be set to a unique integer for each broker.
    broker.id=0
    
    ############################# Socket Server Settings #############################
    
    # The address the socket server listens on. It will get the value returned from 
    # java.net.InetAddress.getCanonicalHostName() if not configured.
    #   FORMAT:
    #     listeners = listener_name://host_name:port
    #   EXAMPLE:
    #     listeners = PLAINTEXT://your.host.name:9092
    listeners=PLAINTEXT://s101:9092
    
    # Hostname and port the broker will advertise to producers and consumers. If not set, 
    # it uses the value for "listeners" if configured.  Otherwise, it will use the value
    # returned from java.net.InetAddress.getCanonicalHostName().
    #advertised.listeners=PLAINTEXT://your.host.name:9092
    
    # Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
    #listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
    
    # The number of threads that the server uses for receiving requests from the network and sending responses to the network
    num.network.threads=3
    
    # The number of threads that the server uses for processing requests, which may include disk I/O
    num.io.threads=8
    
    # The send buffer (SO_SNDBUF) used by the socket server
    socket.send.buffer.bytes=102400
    
    # The receive buffer (SO_RCVBUF) used by the socket server
    socket.receive.buffer.bytes=102400
    
    # The maximum size of a request that the socket server will accept (protection against OOM)
    socket.request.max.bytes=104857600
    
    
    ############################# Log Basics #############################
    
    # A comma separated list of directories under which to store log files
    log.dirs=/home/yinzhengjie/kafka/logs
    
    # The default number of log partitions per topic. More partitions allow greater
    # parallelism for consumption, but this will also result in more files across
    # the brokers.
    num.partitions=1
    
    # The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
    # This value is recommended to be increased for installations with data dirs located in RAID array.
    num.recovery.threads.per.data.dir=1
    
    ############################# Internal Topic Settings  #############################
    # The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
    # For anything other than development testing, a value greater than 1, such as 3, is recommended to ensure availability.
    offsets.topic.replication.factor=1
    transaction.state.log.replication.factor=1
    transaction.state.log.min.isr=1
    
    ############################# Log Flush Policy #############################
    
    # Messages are immediately written to the filesystem but by default we only fsync() to sync
    # the OS cache lazily. The following configurations control the flush of data to disk.
    # There are a few important trade-offs here:
    #    1. Durability: Unflushed data may be lost if you are not using replication.
    #    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
    #    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
    # The settings below allow one to configure the flush policy to flush data after a period of time or
    # every N messages (or both). This can be done globally and overridden on a per-topic basis.
    
    # The number of messages to accept before forcing a flush of data to disk
    #log.flush.interval.messages=10000
    
    # The maximum amount of time a message can sit in a log before we force a flush
    #log.flush.interval.ms=1000
    
    ############################# Log Retention Policy #############################
    
    # The following configurations control the disposal of log segments. The policy can
    # be set to delete segments after a period of time, or after a given size has accumulated.
    # A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
    # from the end of the log.
    
    # The minimum age of a log file to be eligible for deletion due to age
    log.retention.hours=168
    
    # A size-based retention policy for logs. Segments are pruned from the log unless the remaining
    # segments drop below log.retention.bytes. Functions independently of log.retention.hours.
    #log.retention.bytes=1073741824
    
    # The maximum size of a log segment file. When this size is reached a new log segment will be created.
    log.segment.bytes=1073741824
    
    # The interval at which log segments are checked to see if they can be deleted according
    # to the retention policies
    log.retention.check.interval.ms=300000
    
    ############################# Zookeeper #############################
    
    # Zookeeper connection string (see zookeeper docs for details).
    # This is a comma separated host:port pairs, each corresponding to a zk
    # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
    # You can also append an optional chroot string to the urls to specify the
    # root directory for all kafka znodes.
    zookeeper.connect=s102:2181,s103:2181,s104:2181
    
    # Timeout in ms for connecting to zookeeper
    zookeeper.connection.timeout.ms=6000
    
    
    ############################# Group Coordinator Settings #############################
    
    # The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
    # The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
    # The default value for this is 3 seconds.
    # We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
    # However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
    group.initial.rebalance.delay.ms=0
    [yinzhengjie@s101 ~]$ 
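
    To sanity-check the file, you can launch a broker with it directly. A minimal sketch (kafka-server-start.sh is the standard launcher in Kafka's bin directory; the /soft/kafka path follows this article's install layout):

    [yinzhengjie@s101 ~]$ /soft/kafka/bin/kafka-server-start.sh -daemon /soft/kafka/config/server.properties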

    II. Commonly used Kafka parameters

    1>.broker.id

      A: This is the unique identity of a Kafka broker within the cluster (similar to ZooKeeper's myid). If several Kafka nodes share the same broker.id, only one of them can serve normally.
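
      For example, extending this single-broker setup to a cluster means giving each server.properties its own id. A sketch (the second and third brokers are hypothetical; this article only runs one on s101):

      # server.properties on s101
      broker.id=0
      # server.properties on a hypothetical second broker
      broker.id=1
      # server.properties on a hypothetical third broker
      broker.id=2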

    2>.listeners=PLAINTEXT://s101:9092

      A: listeners defines the endpoint the broker listens on. PLAINTEXT is the security protocol, meaning traffic is neither encrypted nor authenticated; whatever you send, including images or video, is carried as raw bytes (opening it as text will just look garbled). s101:9092 is the hostname and port.
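
      A common variant is to bind on all interfaces while advertising a resolvable name to clients; a sketch using the advertised.listeners key from the same file:

      listeners=PLAINTEXT://0.0.0.0:9092
      advertised.listeners=PLAINTEXT://s101:9092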

    3>.num.network.threads=3

      A: The default number of network threads (for receiving requests and sending responses) is 3.

    4>.num.io.threads=8

      A: The default number of I/O threads (for processing requests, which may include disk I/O) is 8.

    5>.socket.send.buffer.bytes=102400

      A: The default socket send buffer (SO_SNDBUF) is 100 KB.

    6>.socket.receive.buffer.bytes=102400

      A: The default socket receive buffer (SO_RCVBUF) is 100 KB.

    7>.socket.request.max.bytes=104857600

      A: The maximum request size the socket server will accept is 100 MB (a protection against OOM).
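
      The byte-valued defaults in this file are round binary sizes; the arithmetic behind them (log.segment.bytes appears further down):

      102400     = 100 * 1024          = 100 KB  (socket buffers)
      104857600  = 100 * 1024 * 1024   = 100 MB  (socket.request.max.bytes)
      1073741824 = 1024 * 1024 * 1024  = 1 GB    (log.segment.bytes)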

    8>.log.dirs=/home/yinzhengjie/kafka/logs

      A: Specifies where Kafka stores its actual message data (despite the name, these are data directories, not application logs). Here the data lives under "/home/yinzhengjie/kafka/logs".
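
      Multiple directories may be listed, and Kafka spreads new partitions across them; a sketch with hypothetical mount points:

      log.dirs=/data1/kafka/logs,/data2/kafka/logs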

    9>.num.partitions=1

      A: The default number of partitions per topic is 1.
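
      This default only applies to topics created without an explicit count; a topic can override it at creation time. A sketch using the standard kafka-topics.sh tool of this Kafka generation (the topic name "test" is hypothetical):

      kafka-topics.sh --zookeeper s102:2181 --create --topic test --partitions 3 --replication-factor 1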

    10>.num.recovery.threads.per.data.dir=1

      A: The default number of recovery threads per data directory is 1; these handle log recovery at startup and flushing at shutdown.
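
      As the file's own comment suggests, installations with data directories on a RAID array commonly raise this value; a sketch with an illustrative number:

      num.recovery.threads.per.data.dir=4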

    11>.log.retention.hours=168

      A: By default data is retained for 168 hours, i.e. one week (7 days).
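
      Retention can also be overridden per topic without touching the broker default. A sketch using the standard kafka-configs.sh tool (the topic name "test" is hypothetical; 86400000 ms = 24 hours):

      kafka-configs.sh --zookeeper s102:2181 --alter --entity-type topics --entity-name test --add-config retention.ms=86400000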

    12>.log.segment.bytes=1073741824

       A: Each log segment file holds at most 1 GB of data; when a segment reaches this size, Kafka rolls it over and starts a new segment.

    13>.log.retention.check.interval.ms=300000

       A: Every 300,000 ms (i.e. 5 minutes) Kafka checks whether log segments have exceeded the retention period (7 days by default) and deletes the expired data from the log.

    14>.zookeeper.connect=s102:2181,s103:2181,s104:2181

       A: Specifies the ZooKeeper servers; multiple entries are separated with commas.
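
      An optional chroot path can be appended to the connection string so that all Kafka znodes live under one directory in ZooKeeper. A sketch (the /kafka suffix is hypothetical; it applies to the whole string, so it goes only after the last host):

      zookeeper.connect=s102:2181,s103:2181,s104:2181/kafka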

    15>.zookeeper.connection.timeout.ms=6000

      A: Sets the ZooKeeper connection timeout (6 seconds by default); if no connection is established within this time, the node is treated as dead.
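
      Once connected, the broker registers an ephemeral znode under /brokers/ids, so a quick way to confirm the connection succeeded is to list that path with the ZooKeeper CLI (zkCli.sh ships with ZooKeeper itself):

      zkCli.sh -server s102:2181 ls /brokers/ids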

