  • 【Sqoop1】Sqoop1 in Action: Sqoop Jobs

    The previous two posts demonstrated data import and export with Sqoop1; this post shows how to save those import and export tasks as Sqoop jobs.

    1 View Sqoop Job Syntax Help
    [hadoop@strong ~]$ sqoop help job
    18/06/27 16:55:24 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    usage: sqoop job [GENERIC-ARGS] [JOB-ARGS] [-- [<tool-name>] [TOOL-ARGS]]
    
    Job management arguments:
       --create <job-id>            Create a new saved job
       --delete <job-id>            Delete a saved job
       --exec <job-id>              Run a saved job
       --help                       Print usage instructions
       --list                       List saved jobs
       --meta-connect <jdbc-uri>    Specify JDBC connect string for the
                                    metastore
       --show <job-id>              Show the parameters for a saved job
       --verbose                    Print more information while working
    
    Generic Hadoop command-line arguments:
    (must preceed any tool-specific arguments)
    Generic options supported are
    -conf <configuration file>     specify an application configuration file
    -D <property=value>            use value for given property
    -fs <local|namenode:port>      specify a namenode
    -jt <local|resourcemanager:port>    specify a ResourceManager
    -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
    -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
    -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
    
    The general command line syntax is
    bin/hadoop command [genericOptions] [commandOptions]
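    By default, saved jobs live in a private HSQLDB metastore under the invoking user's home directory, so they are only visible on that machine. The --meta-connect option shown in the help targets a shared metastore instead. A sketch of that setup, assuming the metastore runs on strong.hadoop.com with the default port 16000 (lang_job_shared is an illustrative name):
    [hadoop@strong ~]$ sqoop metastore &
    [hadoop@strong ~]$ sqoop job --meta-connect jdbc:hsqldb:hsql://strong.hadoop.com:16000/sqoop --create lang_job_shared -- import --connect jdbc:mysql://strong.hadoop.com:3306/sakila --username root --password root --table lang --warehouse-dir /user/sqoop1 -m 1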
    2 Create a Sqoop Job (Non-Incremental)

    1) Non-incremental import command
    [hadoop@strong ~]$ sqoop import --connect jdbc:mysql://strong.hadoop.com:3306/sakila --username root --password root --table lang --warehouse-dir /user/sqoop1 -m 1
    2) Convert the above command into a saved job (note the bare -- separating the job arguments from the tool name and its arguments)
    [hadoop@strong ~]$ sqoop job --create lang_job -- import --connect jdbc:mysql://strong.hadoop.com:3306/sakila --username root --password root --table lang --warehouse-dir /user/sqoop1 -m 1
    3) List the saved jobs
    [hadoop@strong ~]$ sqoop job --list
    18/06/27 17:29:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Available jobs:
      lang_job
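    (For reference, a saved job can be removed with sqoop job --delete lang_job; lang_job is kept here because the steps below use it.)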
    4) Show the saved job
    [hadoop@strong ~]$ sqoop job --show lang_job
    18/06/27 17:30:14 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Enter password: 
    Job: lang_job
    Tool: import
    Options:
    ----------------------------
    verbose = false
    hcatalog.drop.and.create.table = false
    db.connect.string = jdbc:mysql://strong.hadoop.com:3306/sakila
    codegen.output.delimiters.escape = 0
    codegen.output.delimiters.enclose.required = false
    codegen.input.delimiters.field = 0
    split.limit = null
    hbase.create.table = false
    mainframe.input.dataset.type = p
    db.require.password = true
    skip.dist.cache = false
    hdfs.append.dir = false
    db.table = lang
    codegen.input.delimiters.escape = 0
    accumulo.create.table = false
    import.fetch.size = null
    codegen.input.delimiters.enclose.required = false
    db.username = root
    reset.onemapper = false
    codegen.output.delimiters.record = 10
    import.max.inline.lob.size = 16777216
    sqoop.throwOnError = false
    hbase.bulk.load.enabled = false
    hcatalog.create.table = false
    db.clear.staging.table = false
    codegen.input.delimiters.record = 0
    hdfs.warehouse.dir = /user/sqoop1
    enable.compression = false
    hive.overwrite.table = false
    hive.import = false
    codegen.input.delimiters.enclose = 0
    accumulo.batch.size = 10240000
    hive.drop.delims = false
    customtool.options.jsonmap = {}
    codegen.output.delimiters.enclose = 0
    hdfs.delete-target.dir = false
    codegen.output.dir = .
    codegen.auto.compile.dir = true
    relaxed.isolation = false
    mapreduce.num.mappers = 1
    accumulo.max.latency = 5000
    import.direct.split.size = 0
    sqlconnection.metadata.transaction.isolation.level = 2
    codegen.output.delimiters.field = 44
    export.new.update = UpdateOnly
    incremental.mode = None
    hdfs.file.format = TextFile
    sqoop.oracle.escaping.disabled = true
    codegen.compile.dir = /tmp/sqoop-hadoop/compile/aeb58232c26c9414ca84080c52525881
    direct.import = false
    temporary.dirRoot = _sqoop
    hive.fail.table.exists = false
    db.batch = false
    [hadoop@strong ~]$ 
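    Note that --show and --exec prompt for the database password (db.require.password = true above): the metastore does not record the --password value by default. Two workarounds, both sketches: arguments placed after a bare -- at execution time are passed to the underlying tool and override the saved ones, and setting sqoop.metastore.client.record.password to true in $SQOOP_HOME/conf/sqoop-site.xml lets the metastore record the password so no prompt appears.
    [hadoop@strong ~]$ sqoop job --exec lang_job -- --username root -P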
    5) Execute the job
    [hadoop@strong ~]$ sqoop job --exec lang_job
    18/06/27 17:36:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Enter password: 
    18/06/27 17:36:06 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    18/06/27 17:36:06 INFO tool.CodeGenTool: Beginning code generation
    Wed Jun 27 17:36:07 CST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
    18/06/27 17:36:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `lang` AS t LIMIT 1
    18/06/27 17:36:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `lang` AS t LIMIT 1
    18/06/27 17:36:07 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop
    --------- remaining output omitted ---------
    6) View the data after the job runs
    [hadoop@strong ~]$ hdfs dfs -cat /user/sqoop1/lang/part-m-00000
    1,2006-02-15 05:02:19.0,English
    2,2006-02-15 05:02:19.0,Italian
    3,2006-02-15 05:02:19.0,Japanese
    4,2006-02-15 05:02:19.0,Mandarin
    5,2006-02-15 05:02:19.0,French
    6,2006-02-15 05:02:19.0,German
    7,2018-06-26 16:51:34.0,Chinese
    8,2018-06-26 17:54:54.0,GD
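    A caveat before moving on: re-running lang_job as-is will fail because /user/sqoop1/lang already exists in HDFS and hdfs.delete-target.dir is false. A sketch of two ways around this (lang_job2 is an illustrative name): either clear the previous output before re-executing, or create the job with --delete-target-dir so each run overwrites the last.
    [hadoop@strong ~]$ hdfs dfs -rm -r /user/sqoop1/lang
    [hadoop@strong ~]$ sqoop job --exec lang_job
    [hadoop@strong ~]$ sqoop job --create lang_job2 -- import --connect jdbc:mysql://strong.hadoop.com:3306/sakila --username root --password root --table lang --delete-target-dir --target-dir /user/sqoop1/lang -m 1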
    3 Create a Sqoop Job (Incremental)

    1) Create test data
    mysql> update lang set name='Guangdonghua' ,last_update=current_timestamp where language_id=8 ;
    Query OK, 1 row affected (0.70 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
    
    mysql> insert into lang(language_id,name) values(9,'Português'),(10,'Russian');
    Query OK, 2 rows affected (0.26 sec)
    Records: 2  Duplicates: 0  Warnings: 0
    2) Incremental import command
    [hadoop@strong ~]$ sqoop import --connect jdbc:mysql://strong.hadoop.com:3306/sakila --username root --password root --table lang --warehouse-dir /user/sqoop1 --check-column last_update --incremental lastmodified --last-value '2018-06-26 17:54:54.0' --merge-key language_id -m 1
    3) Create the job (-P is used here instead of --password, so Sqoop prompts for the password rather than taking it on the command line)
    [hadoop@strong ~]$ sqoop job --create incr_job -- import --connect jdbc:mysql://strong.hadoop.com:3306/sakila --username root -P --table lang --warehouse-dir /user/sqoop1 --check-column last_update --incremental lastmodified --last-value '2018-06-26 17:54:54.0' --merge-key language_id -m 1
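    lastmodified mode re-reads every row whose check column is newer than last-value, and therefore needs --merge-key (or --append) to reconcile the re-imported rows with the files already in HDFS. For an insert-only table, append mode on a monotonically increasing key is the simpler choice; a sketch under that assumption (append_job and the starting value are illustrative):
    [hadoop@strong ~]$ sqoop job --create append_job -- import --connect jdbc:mysql://strong.hadoop.com:3306/sakila --username root -P --table lang --warehouse-dir /user/sqoop1 --check-column language_id --incremental append --last-value 8 -m 1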
    4) List the jobs
    [hadoop@strong ~]$ sqoop job --list
    18/06/27 17:56:25 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Available jobs:
      incr_job
      lang_job
    5) Run the job
    [hadoop@strong ~]$ sqoop job --exec incr_job
    6) View the job's result (the merge step runs a reduce phase, so the output file is part-r-00000 rather than the map-only part-m-00000)
    [hadoop@strong ~]$ hdfs dfs -cat /user/sqoop1/lang/part-r-00000
    1,2006-02-15 05:02:19.0,English
    10,2018-06-27 17:49:56.0,Russian
    2,2006-02-15 05:02:19.0,Italian
    3,2006-02-15 05:02:19.0,Japanese
    4,2006-02-15 05:02:19.0,Mandarin
    5,2006-02-15 05:02:19.0,French
    6,2006-02-15 05:02:19.0,German
    7,2018-06-26 16:51:34.0,Chinese
    8,2018-06-27 17:47:13.0,Guangdonghua
    9,2018-06-27 17:49:56.0,Português
    7) Check the last_value recorded by the incremental import
    [hadoop@strong ~]$ sqoop job --show incr_job
    18/06/27 18:01:08 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Enter password: 
    Job: incr_job
    Tool: import
    Options:
    ----------------------------
    verbose = false
    hcatalog.drop.and.create.table = false
    incremental.last.value = 2018-06-27 17:57:00.0
    db.connect.string = jdbc:mysql://strong.hadoop.com:3306/sakila
    codegen.output.delimiters.escape = 0
    codegen.output.delimiters.enclose.required = false
    codegen.input.delimiters.field = 0
    mainframe.input.dataset.type = p
    split.limit = null
    hbase.create.table = false
    db.require.password = true
    skip.dist.cache = false
    hdfs.append.dir = false
    db.table = lang
    codegen.input.delimiters.escape = 0
    accumulo.create.table = false
    import.fetch.size = null
    codegen.input.delimiters.enclose.required = false
    db.username = root
    reset.onemapper = false
    codegen.output.delimiters.record = 10
    import.max.inline.lob.size = 16777216
    sqoop.throwOnError = false
    hbase.bulk.load.enabled = false
    hcatalog.create.table = false
    db.clear.staging.table = false
    incremental.col = last_update
    codegen.input.delimiters.record = 0
    hdfs.warehouse.dir = /user/sqoop1
    enable.compression = false
    hive.overwrite.table = false
    hive.import = false
    codegen.input.delimiters.enclose = 0
    accumulo.batch.size = 10240000
    hive.drop.delims = false
    customtool.options.jsonmap = {}
    codegen.output.delimiters.enclose = 0
    hdfs.delete-target.dir = false
    codegen.output.dir = .
    codegen.auto.compile.dir = true
    relaxed.isolation = false
    mapreduce.num.mappers = 1
    accumulo.max.latency = 5000
    import.direct.split.size = 0
    sqlconnection.metadata.transaction.isolation.level = 2
    codegen.output.delimiters.field = 44
    export.new.update = UpdateOnly
    incremental.mode = DateLastModified
    hdfs.file.format = TextFile
    sqoop.oracle.escaping.disabled = true
    codegen.compile.dir = /tmp/sqoop-hadoop/compile/13396563126bce459d370e97847a3507
    direct.import = false
    temporary.dirRoot = _sqoop
    hive.fail.table.exists = false
    merge.key.col = language_id
    db.batch = false
    [hadoop@strong ~]$ 
    Note: incremental.last.value has automatically advanced to the new value, 2018-06-27 17:57:00.0.
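    Since the metastore rolls last-value forward after every run, re-executing the job on a schedule imports only the rows changed since the previous run. A minimal cron sketch, assuming sqoop lives at /usr/local/sqoop/bin and that sqoop.metastore.client.record.password is enabled so no password prompt blocks the unattended run:
    # m h dom mon dow command: run the incremental job nightly at 02:00
    0 2 * * * /usr/local/sqoop/bin/sqoop job --exec incr_job >> /tmp/incr_job.log 2>&1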

