Purpose: upload Kafka messages to HDFS through Flume.
The logs reported on a given day may include entries from earlier days, but everything reported that day lands under that day's directory (year/month/day).
1. Flume configuration on s102
kafka_hdfs.txt
# agent components
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Kafka source: consume topic raw-logs from the broker on s102
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = s102:9092
a1.sources.r1.kafka.topics = raw-logs
a1.sources.r1.kafka.consumer.group.id = g10

# memory channel; raise the capacities so a 5000-event Kafka batch
# fits in a single transaction (both default to 100)
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 5000

# HDFS sink: time-bucketed path, rounded down to the minute;
# roll files every 30 s, 10 KB, or 500 events, whichever comes first
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /user/centos/umeng/raw-logs/%Y%m/%d/%H%M
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 1
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.rollInterval = 30
a1.sinks.k1.hdfs.rollSize = 10240
a1.sinks.k1.hdfs.rollCount = 500
# use the agent's clock for the %Y%m/%d/%H%M escapes
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# write plain text rather than SequenceFile
a1.sinks.k1.hdfs.fileType = DataStream

# wire the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
2. Prepare the HDFS directory
hdfs dfs -mkdir -p /user/centos/umeng/raw-logs
3. Start Flume
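A minimal launch sketch, assuming the configuration above was saved as kafka_hdfs.txt in Flume's conf directory and FLUME_HOME points at the installation (adjust both to your environment):
flume-ng agent -n a1 -c $FLUME_HOME/conf -f $FLUME_HOME/conf/kafka_hdfs.txt -Dflume.root.logger=INFO,console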
4. Check HDFS
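Once events flow, the minute-bucketed directories and rolled files should appear under the raw-logs path; the exact subdirectories depend on when the agent ran, and files still being written carry the sink's default .tmp suffix. For example:
hdfs dfs -ls -R /user/centos/umeng/raw-logs
hdfs dfs -cat /user/centos/umeng/raw-logs/*/*/*/events-*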