  • Kafka Basics (15): Connecting Flume to Kafka

    1 Simple implementation

    1) Configure Flume

    # define
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1
    
    # source
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F  /opt/module/data/flume.log
    
    # sink
    a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
    a1.sinks.k1.kafka.bootstrap.servers = hadoop102:9092,hadoop103:9092,hadoop104:9092
    a1.sinks.k1.kafka.topic = first
    a1.sinks.k1.kafka.flumeBatchSize = 20
    a1.sinks.k1.kafka.producer.acks = 1
    a1.sinks.k1.kafka.producer.linger.ms = 1
    
    # channel
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100
    
    # bind
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

    2) Start a Kafka consumer on the topic first

    3) From the Flume root directory, start Flume

    $ bin/flume-ng agent -c conf/ -n a1 -f jobs/flume-kafka.conf

    4) Append data to /opt/module/data/flume.log and check what the Kafka consumer receives

    $ echo hello >> /opt/module/data/flume.log

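    2 Data separation

    0) Requirement: route the data collected by Flume to different topics according to its content:

              log lines containing "atguigu" go to the Kafka topic first,

              log lines containing "sgg" go to the Kafka topic second,

              all other lines go to the Kafka topic third.

    1) Write the Flume interceptor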

    package com.atguigu.kafka.flumeInterceptor;
    
    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.interceptor.Interceptor;
    
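    import java.nio.charset.StandardCharsets;
    import java.util.List;
    import java.util.Map;
    
    public class FlumeKafkaInterceptor implements Interceptor {
        @Override
        public void initialize() {
            // No resources to set up.
        }
    
        /**
         * Events whose body contains "atguigu" are sent to the topic first.
         * Events whose body contains "sgg" are sent to the topic second.
         * All other events get no "topic" header and go to the sink's configured topic (third).
         * @param event the event to inspect
         * @return the same event, with the "topic" header set when a keyword matches
         */
        @Override
        public Event intercept(Event event) {
            // 1. Get the event headers
            Map<String, String> headers = event.getHeaders();
            // 2. Get the event body, decoded explicitly as UTF-8
            String body = new String(event.getBody(), StandardCharsets.UTF_8);
            if (body.contains("atguigu")) {
                headers.put("topic", "first");
            } else if (body.contains("sgg")) {
                headers.put("topic", "second");
            }
            return event;
    
        }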
    
        @Override
        public List<Event> intercept(List<Event> events) {
            for (Event event : events) {
                intercept(event);
            }
            return events;
        }
    
        @Override
        public void close() {
    
        }
    
        public static class MyBuilder implements Interceptor.Builder {
    
            @Override
            public Interceptor build() {
                return new FlumeKafkaInterceptor();
            }
    
            @Override
            public void configure(Context context) {
    
            }
        }
    }
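The matching rule in the interceptor can be tried out without a Flume installation. The following is a minimal sketch that mirrors the same if / else-if logic on a plain string and a plain map; the class name `RoutingRuleSketch` and the method `headersFor` are hypothetical and are not part of Flume or of the interceptor above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone mirror of the interceptor's matching rule.
class RoutingRuleSketch {

    // Returns the headers the rule would set for one decoded event body.
    static Map<String, String> headersFor(String body) {
        Map<String, String> headers = new HashMap<>();
        if (body.contains("atguigu")) {
            headers.put("topic", "first");
        } else if (body.contains("sgg")) {
            headers.put("topic", "second");
        }
        // No match: no "topic" header is set at all.
        return headers;
    }

    public static void main(String[] args) {
        String[] samples = {"hello atguigu", "hello sgg", "atguigu and sgg", "hello"};
        for (String line : samples) {
            System.out.println(line + " -> " + headersFor(line));
        }
    }
}
```

Because the checks are chained with else-if, a line that contains both keywords is tagged for first only, and a line that matches neither keyword carries no topic header.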

    2) Package the interceptor as a jar and upload it to the lib directory under the Flume installation directory

    3) Configure Flume

    # Name the components on this agent
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1
    
    # Describe/configure the source
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = 0.0.0.0
    a1.sources.r1.port = 6666
    
    
    # Describe the sink
    a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
    a1.sinks.k1.kafka.topic = third
    a1.sinks.k1.kafka.bootstrap.servers = hadoop102:9092,hadoop103:9092,hadoop104:9092
    a1.sinks.k1.kafka.flumeBatchSize = 20
    a1.sinks.k1.kafka.producer.acks = 1
    a1.sinks.k1.kafka.producer.linger.ms = 1
    
    # Interceptor
    a1.sources.r1.interceptors = i1
    a1.sources.r1.interceptors.i1.type = com.atguigu.kafka.flumeInterceptor.FlumeKafkaInterceptor$MyBuilder
    
    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100
    
    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1
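The configuration sets kafka.topic = third, yet events still reach first and second. This works because, per the Flume documentation, the Kafka sink publishes an event to the topic named in its "topic" header when that header is present, and falls back to the configured kafka.topic otherwise. The sketch below paraphrases that documented behavior in plain Java; it is not Flume's actual source code, and the names `TopicResolutionSketch` and `resolveTopic` are hypothetical.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical illustration of how the destination topic is chosen.
class TopicResolutionSketch {

    // A "topic" header wins; otherwise the sink's configured topic is used.
    static String resolveTopic(Map<String, String> headers, String configuredTopic) {
        String fromHeader = headers.get("topic");
        return (fromHeader != null) ? fromHeader : configuredTopic;
    }

    public static void main(String[] args) {
        // Header set by the interceptor for a line containing "atguigu".
        System.out.println(resolveTopic(Collections.singletonMap("topic", "first"), "third"));
        // No header: the event falls through to the configured default.
        System.out.println(resolveTopic(Collections.<String, String>emptyMap(), "third"));
    }
}
```

This is why the interceptor never needs to write "third" into the headers: leaving the header unset is enough.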

    4) Start Kafka consumers on the topics first, second and third

    5) From the Flume root directory, start Flume

    $ bin/flume-ng agent -c conf/ -n a1 -f jobs/flume-kafka.conf

    6) Write data to port 6666 and check what the Kafka consumers receive
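One way to write test data to port 6666 is a small TCP client that sends newline-terminated lines, since the netcat source turns each line of text into one event. The sketch below is only an illustration: the class `NetcatLineSender`, the default host `localhost`, and the sample lines are assumptions, and it expects the server to close the connection once the client has finished sending (a 5-second read timeout guards against the case where it does not).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical test client for a line-oriented TCP source.
class NetcatLineSender {

    // Encodes each line as UTF-8 followed by a newline.
    static byte[] encode(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    static void send(String host, int port, List<String> lines) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(5000);
            OutputStream out = socket.getOutputStream();
            out.write(encode(lines));
            out.flush();
            // Signal end of input, then drain any replies before closing.
            socket.shutdownOutput();
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[1024];
            try {
                while (in.read(buf) != -1) {
                    // Replies are discarded.
                }
            } catch (SocketTimeoutException e) {
                // Server kept the connection open; give up waiting.
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed host; pass the Flume agent's host name as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        send(host, 6666, Arrays.asList("hello atguigu", "hello sgg", "hello"));
    }
}
```

With the interceptor and sink configuration above, the three sample lines would be expected to show up in the consumers for first, second and third respectively.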

    This article is from cnblogs (博客园), author: 秋华. When reposting, please cite the original link: https://www.cnblogs.com/qiu-hua/p/15224730.html
