zoukankan      html  css  js  c++  java
  • Sqoop导入mysql数据到Hbase

    sqoop import --driver com.mysql.jdbc.Driver --connect "jdbc:mysql://11.143.18.29:3306/db_1"  --username root --password root --query "select rowkey,dtTime from tb1  where 1=1 and $CONDITIONS" --hbase-table tb1--column-family t --hbase-row-key id  -split-by dtTime -m 2

    导入大表时一定要注意查看数据是否分布均匀,否则很容易出现java heap space错误

    sqoop导入mysql数据到hbase时,如果带有--hbase-create-table选项出现如下错误,据说是兼容问题:

    16/01/24 13:46:33 INFO zookeeper.ClientCnxn: Session establishment complete on server slave1/192.168.2.80:2181, sessionid = 0x252720e0fac0008, negotiated timeout = 40000
    
    16/01/24 13:46:33 INFO mapreduce.HBaseImportJob: Creating missing HBase table exchangeshopflow
    
    Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HTableDescriptor.addFamily(Lorg/apache/hadoop/hbase/HColumnDescriptor;)V
        at org.apache.sqoop.mapreduce.HBaseImportJob.jobSetup(HBaseImportJob.java:222)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:264)
        at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:729)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:499)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

    解决方法:
    手动建hbase表,然后使用sqoop导入

    包含自定义field:

    sqoop import --driver com.mysql.jdbc.Driver --connect "jdbc:mysql://192.168.2.80:3306/db_1"  --username root --password root --query "select concat(concat(md5(concat(PID,vid)),UNIX_TIMESTAMP(dtTime)), 100000+right(\`__#alibaba_rds_row_id#__\`,5)) as id,dttime,vid,pid,zoneid,type,count,reason,\`__#alibaba_rds_row_id#__\` from achi where  $CONDITIONS" --hbase-table achi --column-family a --hbase-row-key id  -split-by `__#alibaba_rds_row_id#__\`

    直接导入:

    sqoop import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.80:3306/db_1"  --username root --password root --table achi --hbase-table achi --column-family a --hbase-row-key dtTime  -split-by dtTime --hbase-create-table

    定时增量导入

    sqoop job --create t11 -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.80:3306/db_1"  --username root --password root --table t2 --hbase-table t007117 --column-family a --hbase-row-key id  -split-by id --hbase-create-table --incremental append --check-column id  --last-value 0;

    定时任务

    sqoop job --create t117 -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.80:3306/db_1"  --username root --password root --query "select concat(concat(md5(concat(PID,vid)),UNIX_TIMESTAMP(dtTime)), 100000+right(id,5)) as key_id,t.*  from t2 as t where $CONDITIONS" --hbase-table t007117 --column-family a --hbase-row-key id  -split-by id --hbase-create-table --incremental append --check-column id  --last-value 0;

    使用–query时,–incremental append –check-column id –last-value 0;不起作用

  • 相关阅读:
    局部变量、全局变量和修改全局变量
    python中函数的参数
    python之匿名函数和递归函数
    设计模式之职责链模式
    设计模式之代理模式
    设计模式之flyweight享元模式
    设计模式之外观模式
    设计模式之装饰模式
    组合模式更清晰的例子
    设计模式之组合模式
  • 原文地址:https://www.cnblogs.com/ggzone/p/10121172.html
Copyright © 2011-2022 走看看