zoukankan html css js c++ java

Kafka Consumer

python小例-生产、消费

生产

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from kafka import KafkaProducer
import json
producer = KafkaProducer(bootstrap_servers='localhost:9092')
for i in range(10):
    msg_dict = {
        'id': '349834',
        'task_id': '9kjifewo'
    }
    msg = json.dumps(msg_dict)
    producer.send('topic_a', msg)
producer.close()

消费

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from kafka import KafkaConsumer

consumer = KafkaConsumer('topic_a', bootstrap_servers='localhost:9092')
print consumer
print "<<" * 10
for msg in consumer:
    print msg.value, type(msg.value)
print ">>" * 10

基本用法

topic_name = 'my_topic_name'
consumer = KafkaConsumer(topic_name, bootstrap_servers=['localhost:9092'])
# consumer是一个消息队列，当后台有消息时，这个消息队列就会自动增加．所以遍历也总是会有数据，当消息队列中没有数据时，就会堵塞等待消息到来
for msg in consumer:
    recv = "%s:%d:%d: key=%s value=%s" % (msg.topic, msg.partition, msg.offset, msg.key, msg.value)
    print recv

指定分区、offset、消费组

#encoding:utf8
from kafka import KafkaConsumer, TopicPartition

my_topic = "my_topic_name" # 指定需要消费的主题

consumer = KafkaConsumer(
    bootstrap_servers = "192.168.70.221:19092,192.168.70.222:19092,192.168.70.223:19092", # kafka集群地址
    group_id = "my_group_a", # 消费组id
    enable_auto_commit = True, # 每过一段时间自动提交所有已消费的消息（在迭代时提交）
    auto_commit_interval_ms = 5000, # 自动提交的周期（毫秒）
)

consumer.assign([
    TopicPartition(topic=my_topic, partition=0),   
    TopicPartition(topic=my_topic, partition=1),
    TopicPartition(topic=my_topic, partition=2)
])

consumer.seek(TopicPartition(topic=my_topic, partition=0), 12)   # 指定起始offset为12
consumer.seek(TopicPartition(topic=my_topic, partition=1), 0)    # 可以注册多个分区，此分区从第一条消息开始接收
# consumer.seek(TopicPartition(topic=my_topic, partition=2), 32) # 没有注册的分区上的消息不会被消费

for msg in consumer: # 迭代器，等待下一条消息
    print msg # 打印消息

注：因指定了分区、偏移量，不会消费分区为2的信息；如果开启2个相同服务，会把同样的消息消费2次

手动提交

 enable_auto_commit = True

手动提交

消费了会自动提交offset, 如果想保证业务处理完再手动提交，需要设置 enable_auto_commit = False

from kafka import KafkaConsumer, OffsetAndMetadata
tp = TopicPartition(my_topic, 0)
consumer.commit(offsets={tp: (OffsetAndMetadata(msg.offset + 1, None))}

注意提交的偏移量是下次消费开始的位置。如果设置为当前offset，下次会重复消费

附

KafkaConsumer构造函数参数列表

*topics ，要订阅的主题
bootstrap_servers ：kafka节点或节点的列表，不一定需要罗列所有的kafka节点。格式为： ‘host[:port]’ 。默认值是：localhost:9092
client_id (str) : 客户端id，默认值: ‘kafka-python-{version}’
group_id (str or None)：分组id
key_deserializer (callable) ：key反序列化函数
value_deserializer (callable)：value反序列化函数
fetch_min_bytes：服务器应每次返回的最小数据量
fetch_max_wait_ms (int)： 服务器应每次返回的最大等待时间
fetch_max_bytes (int) ：服务器应每次返回的最大数据量
max_partition_fetch_bytes (int) ：
request_timeout_ms (int) retry_backoff_ms (int)
reconnect_backoff_ms (int)
reconnect_backoff_max_ms (int)
max_in_flight_requests_per_connection (int)
auto_offset_reset (str) enable_auto_commit (bool)
auto_commit_interval_ms (int)
default_offset_commit_callback (callable)
check_crcs (bool)
metadata_max_age_ms (int)
partition_assignment_strategy (list)
max_poll_records (int)
max_poll_interval_ms (int)
session_timeout_ms (int)
heartbeat_interval_ms (int)
receive_buffer_bytes (int)
send_buffer_bytes (int)
socket_options (list)
consumer_timeout_ms (int)
skip_double_compressed_messages (bool)
security_protocol (str)
ssl_context (ssl.SSLContext)
ssl_check_hostname (bool)
ssl_cafile (str) –
ssl_certfile (str)
ssl_keyfile (str)
ssl_password (str)
ssl_crlfile (str)
api_version (tuple)

KafkaConsumer 函数

assign(partitions)：手动为该消费者分配一个topic分区列表。
assignment()：获取当前分配给该消费者的topic分区。
beginning_offsets(partitions)：获取给定分区的第一个偏移量。
close(autocommit=True)：关闭消费者
commit(offsets=None)：提交偏移量，直到成功或错误为止。
commit_async(offsets=None, callback=None)：异步提交偏移量。
committed(partition)：获取给定分区的最后一个提交的偏移量。
end_offsets(partitions)：获取分区的最大偏移量
highwater(partition)：分区最大的偏移量
metrics(raw=False)：返回消费者性能指标
next（）：返回下一条数据
offsets_for_times(timestamps)：根据时间戳获取分区偏移量
partitions_for_topic(topic)：返回topic的partition列表，返回一个set集合
pause(*partitions)：停止获取数据paused()：返回停止获取的分区poll(timeout_ms=0, max_records=None)：获取数据
position(partition)：获取分区的偏移量
resume(*partitions)：恢复抓取指定的分区
seek(partition, offset)：seek偏移量
seek_to_beginning(*partitions)：搜索最旧的偏移量
seek_to_end(*partitions)：搜索最近可用的偏移量
subscribe(topics=(), pattern=None, listener=None)：订阅topics
subscription()：返回当前消费者消费的所有topic
topics()：返回当前消费者消费的所有topic，返回的是unicode
unsubscribe()：取消订阅所有的topic

查看全文

相关阅读:
2018/12/21 HDU-2077 汉诺塔IV（递归）
2018-12-08 acm日常 HDU
2018/12/12 acm日常第二周第六题
 git 添加远程分支,并可以code review.
zookeeper数据迁移方法
 gem install nokogiri -v '1.6.6.2' 出错
 gem install json -v '1.8.2' error
gem install bundle 安装失败
 全能型开源远程终端：MobaXterm
如何写好 Git Commit 信息

原文地址：https://www.cnblogs.com/kaituorensheng/p/12305774.html