  • Flume configuration

    Configure Flume        (in the Hadoop cluster, configure Flume on whichever node needs to collect data)
        vi flume-env.sh ( cp flume-env.sh.template flume-env.sh )
            Uncomment the export JAVA_OPTS line
            Set export JAVA_HOME=/root/app/jdk1.8.0_161
        ***Place dir-hdfs.conf in Flume's bin directory
            Note: in ag1.sinks.sink1.hdfs.path = hdfs://hdp-01:9000/access_log/%y-%m-%d/%H-%M
                hdp-01:9000 is the hostname and RPC port of the NameNode
                ag1.sources.source1.spoolDir = /root/data/log
                is the directory the data is collected from
        Create the source directory log, grant it 777 permissions, and place the data to be collected in it
        Run the command
            ./flume-ng agent -c conf/ -f dir-hdfs.conf -n ag1 -Dflume.root.logger=INFO,console
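
    A minimal shell sketch of the setup steps above, assuming Flume is unpacked at /root/app/apache-flume-1.6.0-bin (the install path and version are assumptions; the JDK path, spool directory, and agent name come from the notes above):

        # FLUME_HOME below is an assumed install path; adjust to your environment
        FLUME_HOME=/root/app/apache-flume-1.6.0-bin
        cd $FLUME_HOME/conf
        cp flume-env.sh.template flume-env.sh
        # in flume-env.sh, uncomment/adjust these two lines (the JAVA_OPTS value is the template default):
        #   export JAVA_HOME=/root/app/jdk1.8.0_161
        #   export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"

        # create the spool (source) directory and open its permissions
        mkdir -p /root/data/log
        chmod 777 /root/data/log

        # dir-hdfs.conf (listed below) is assumed to already be placed in the bin directory
        cd $FLUME_HOME/bin
        ./flume-ng agent -c conf/ -f dir-hdfs.conf -n ag1 -Dflume.root.logger=INFO,console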

    Contents of the dir-hdfs.conf file:

    # Define the names of the three main components
    ag1.sources = source1
    ag1.sinks = sink1
    ag1.channels = channel1

    # Configure the source component
    ag1.sources.source1.type = spooldir
    ag1.sources.source1.spoolDir = /root/data/log
    ag1.sources.source1.fileSuffix=.FINISHED
    ag1.sources.source1.inputCharset=utf-8
    ag1.sources.source1.deserializer.maxLineLength=5120

    # Configure the sink component
    ag1.sinks.sink1.type = hdfs
    ag1.sinks.sink1.hdfs.path = hdfs://192.168.56.2/access_log/%y-%m-%d/%H-%M
    ag1.sinks.sink1.hdfs.filePrefix = app_log
    ag1.sinks.sink1.hdfs.fileSuffix = .log
    ag1.sinks.sink1.hdfs.batchSize= 100
    ag1.sinks.sink1.hdfs.fileType = DataStream
    ag1.sinks.sink1.hdfs.writeFormat = Text

    ## roll: rules that control when the current file is rolled over to a new one
    ## roll by file size (bytes)
    ag1.sinks.sink1.hdfs.rollSize = 512000
    ## roll by number of events
    ag1.sinks.sink1.hdfs.rollCount = 1000000
    ## roll by time interval (seconds)
    ag1.sinks.sink1.hdfs.rollInterval = 60

    ## Rules that control how the target directory is generated (round = round the timestamp down)
    ag1.sinks.sink1.hdfs.round = true
    ag1.sinks.sink1.hdfs.roundValue = 10
    ag1.sinks.sink1.hdfs.roundUnit = minute
    ag1.sinks.sink1.hdfs.useLocalTimeStamp = true

    # Configure the channel component
    ag1.channels.channel1.type = memory
    ## channel capacity, in events
    ag1.channels.channel1.capacity = 500000
    ## buffer capacity needed for Flume transaction control: 600 events
    ag1.channels.channel1.transactionCapacity = 600

    # Bind the source and the sink to the channel
    ag1.sources.source1.channels = channel1
    ag1.sinks.sink1.channel = channel1
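
    With the agent running, dropping a file into the spool directory should land it in HDFS under a 10-minute-rounded directory. A rough way to exercise and verify this (the test file name and the example output path are illustrative assumptions; the actual names depend on the roll settings and the current time):

        # write a test file into the spool directory; Flume renames it with the
        # .FINISHED suffix once it has been fully consumed
        echo "hello flume $(date)" > /root/data/log/test.log
        ls /root/data/log                  # expect: test.log.FINISHED

        # the HDFS sink writes under /access_log/<yy-mm-dd>/<HH-MM>, with the minute
        # rounded down to a multiple of 10 (roundValue = 10, roundUnit = minute)
        hdfs dfs -ls -R /access_log
        # e.g. /access_log/18-07-06/10-20/app_log.1530843600000.log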
