  • Flume practices and Sqoop Hive-to-Oracle export

    # receive the file (the --name argument must match the agent name used as the property prefix in each config file)

    flume-ng agent --conf conf --conf-file conf1.conf --name a1

    flume-ng agent --conf conf --conf-file conf2.conf --name hdfs-agent

    flume-ng agent --conf conf --conf-file conf3.conf --name file-agent

       

    conf1.conf

    a1.sources = tail

    a1.channels = c1

    a1.sinks = avro-forward-sink

       

    a1.channels.c1.type = file

    #a1.channels.c1.capacity = 1000

    #a1.channels.c1.transactionCapacity = 100

       

    a1.sources.tail.type = spooldir

    a1.sources.tail.spoolDir = /path/to/folder/

       

    a1.sinks.avro-forward-sink.type = avro

    a1.sinks.avro-forward-sink.hostname = <hostname or IP of the avro source host>

    a1.sinks.avro-forward-sink.port = 12345

       

    # Bind the source and sink to the channel

    a1.sources.tail.channels = c1

    a1.sinks.avro-forward-sink.channel = c1
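
    Since this agent uses a file channel, it can help to pin down where the channel keeps its state. A minimal sketch, assuming illustrative directories (checkpointDir and dataDirs are standard file channel properties, but the paths here are placeholders):

    # where the file channel stores its checkpoint and queued events
    a1.channels.c1.checkpointDir = /var/flume/c1/checkpoint
    a1.channels.c1.dataDirs = /var/flume/c1/data

    Left unset, the file channel defaults to directories under ~/.flume/file-channel/, which is easy to overlook on a shared host.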

       

    conf2.conf

    hdfs-agent.sources= avro-collect

    hdfs-agent.sinks = hdfs-write

    hdfs-agent.channels=ch1

    hdfs-agent.channels.ch1.type = file

    #hdfs-agent.channels.ch1.capacity = 1000

    #hdfs-agent.channels.ch1.transactionCapacity = 100

       

    hdfs-agent.sources.avro-collect.type = avro

    hdfs-agent.sources.avro-collect.bind = 10.59.123.69

    hdfs-agent.sources.avro-collect.port = 12345

       

    hdfs-agent.sinks.hdfs-write.type = hdfs

    hdfs-agent.sinks.hdfs-write.hdfs.path = hdfs://namenode/user/usera/test/

    hdfs-agent.sinks.hdfs-write.hdfs.writeFormat=Text

       

    # Bind the source and sink to the channel

    hdfs-agent.sources.avro-collect.channels = ch1

    hdfs-agent.sinks.hdfs-write.channel = ch1
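
    With only hdfs.path and hdfs.writeFormat configured, the sink keeps its default roll behavior, which tends to produce many small files. A sketch of commonly tuned roll settings, assuming the values are illustrative rather than recommendations:

    # write plain text files instead of the default SequenceFile container
    hdfs-agent.sinks.hdfs-write.hdfs.fileType = DataStream
    # roll a new file every 10 minutes or at ~128 MB, whichever comes first
    hdfs-agent.sinks.hdfs-write.hdfs.rollInterval = 600
    hdfs-agent.sinks.hdfs-write.hdfs.rollSize = 134217728
    # disable count-based rolling
    hdfs-agent.sinks.hdfs-write.hdfs.rollCount = 0

    Note that writeFormat=Text alone still wraps events in a SequenceFile container unless fileType is set to DataStream.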

       

    Start the conf2.conf agent first, then the conf1.conf agent: the Avro source must be up and listening before the Avro sink can connect to it.

    # when using a memory channel, this exception came up:

    org.apache.flume.ChannelException: Unable to put batch on required channel:

    org.apache.flume.channel.MemoryChannel{name: ch1}

    # switching the channel type to file resolved it
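
    The file channel absorbs the backlog on disk. If a memory channel is preferred for throughput, an alternative sketch is to raise its capacity so it can buffer what the HDFS sink has not drained yet (the numbers are illustrative; events in a memory channel are lost if the agent dies):

    hdfs-agent.channels.ch1.type = memory
    # total events the channel may hold
    hdfs-agent.channels.ch1.capacity = 100000
    # max events per transaction between source/sink and channel
    hdfs-agent.channels.ch1.transactionCapacity = 1000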

       

    # batch-rename the spooled files: strip the .COMPLETED suffix Flume appends

    for f in *.COMPLETED; do

    mv "$f" "${f%.COMPLETED}"

    done
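
    If the goal is simply to stop .COMPLETED markers from piling up, the spooling directory source can also be told to dispose of finished files itself. A sketch using standard spooldir properties (pick one of the two):

    # delete a file as soon as it has been fully ingested
    a1.sources.tail.deletePolicy = immediate
    # or keep the rename but choose a different suffix
    #a1.sources.tail.fileSuffix = .done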

       

    Sqoop: export data from Hive to Oracle:

    # '\001' is Hive's default field delimiter
    sqoop export -D oraoop.disabled=true \
    --connect "jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=hostname)(port=port))(connect_data=(service_name=sname)))" \
    --username user_USER \
    --password pwd \
    --table EVAN_TEST \
    --fields-terminated-by '\001' \
    -m 1 \
    --export-dir /path/to/folder/

       

    # The Oracle table name must be in upper case; otherwise Sqoop throws an exception saying it cannot find the column information.
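
    A quick way to confirm the export landed is sqoop eval, which runs an ad-hoc query over the same JDBC connection; a sketch reusing the connection details from the export above:

    sqoop eval \
    --connect "jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=hostname)(port=port))(connect_data=(service_name=sname)))" \
    --username user_USER --password pwd \
    --query "SELECT COUNT(*) FROM EVAN_TEST"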
