  • Installing sqoop1 and testing data import and export

    Download

    wget http://mirror.bit.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

    Extract

    tar -zxf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

    Copy mysql-connector-java-5.1.39.jar into sqoop1's lib directory:

    [root@host ~]# cp mysql-connector-java-5.1.39.jar /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/lib/

    Configure the environment variables and make them take effect:

    export SQOOP_HOME=/root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0
    export PATH=$PATH:$SQOOP_HOME/bin
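
    Assuming the variables were appended to /etc/profile (any shell profile works), a quick way to make them take effect in the current session and verify:

    source /etc/profile
    echo $SQOOP_HOME    # should print /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0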

    Test

    [root@host bin]# sqoop help

    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 14:03:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    usage: sqoop COMMAND [ARGS]

    Available commands:
      codegen            Generate code to interact with database records
      create-hive-table  Import a table definition into Hive
      eval               Evaluate a SQL statement and display the results
      export             Export an HDFS directory to a database table
      help               List available commands
      import             Import a table from a database to HDFS
      import-all-tables  Import tables from a database to HDFS
      import-mainframe   Import datasets from a mainframe server to HDFS
      job                Work with saved jobs
      list-databases     List available databases on a server
      list-tables        List available tables in a database
      merge              Merge results of incremental imports
      metastore          Run a standalone Sqoop metastore
      version            Display version information

    Execute SQL and display the results

    [root@host ~]#  sqoop eval --connect jdbc:mysql://localhost:3306/test --username root --password root --query 'select roleid,level from role where sex=0'
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 17:30:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/04 17:30:54 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    18/06/04 17:30:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    ----------------------------
    | roleid     | level       |
    ----------------------------
    | 200        | 8           |
    | 400        | 4           |
    | 600        | 91          |
    ----------------------------

    List all tables in a database

    [root@host bin]# sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username root -P
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 14:18:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Enter password:
    18/06/04 14:18:28 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    gamelog
    log1
    log2
    log_merge
    loginlog
    name_test
    name_test1
    role
    role1

    List all databases

    [root@host bin]# sqoop list-databases --connect jdbc:mysql://localhost:3306/test --username root -P     
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 14:18:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Enter password:
    18/06/04 14:18:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    information_schema
    gamelog
    metastore
    mysql
    test

    [root@host bin]# sqoop list-databases --connect jdbc:mysql://localhost:3306 --username root --password root
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 14:22:22 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/04 14:22:22 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    18/06/04 14:22:22 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    information_schema
    gamelog
    metastore
    mysql
    test

    Importing data from MySQL to HDFS

    A successful import automatically creates the directory 20180604; if the directory already exists, the job fails.
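
    If the directory is left over from a previous run, one option (a sketch; the --delete-target-dir flag covered later is the alternative) is to remove it first:

    hdfs dfs -rm -r hdfs://localhost:9000/sqoop/sqoop1/20180604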

    Import the table test.role into hdfs://localhost:9000/sqoop/sqoop1/20180604:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --target-dir hdfs://localhost:9000/sqoop/sqoop1/20180604

    Check HDFS:

    [root@host ~]# hdfs dfs -ls -R /sqoop/sqoop1
    drwxr-xr-x   - root supergroup          0 2018-06-04 14:41 /sqoop/sqoop1/20180604
    -rw-r--r--   1 root supergroup          0 2018-06-04 14:41 /sqoop/sqoop1/20180604/_SUCCESS
    -rw-r--r--   1 root supergroup        152 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00000
    -rw-r--r--   1 root supergroup         52 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00001
    -rw-r--r--   1 root supergroup         47 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00002
    -rw-r--r--   1 root supergroup         92 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00003

    -m, --num-mappers starts N map tasks to import data in parallel. The default is 4; it is best not to set this higher than the number of nodes in the cluster.

    --target-dir <dir> specifies the target HDFS path.

    The command above did not specify -m, so the default of 4 mappers was used.

    Let's try specifying -m 1:

     sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060401 -m 1

    Check HDFS:

    [root@host ~]# hdfs dfs -ls -R /sqoop/sqoop1/2018060401
    -rw-r--r--   1 root supergroup          0 2018-06-04 14:49 /sqoop/sqoop1/2018060401/_SUCCESS
    -rw-r--r--   1 root supergroup        343 2018-06-04 14:49 /sqoop/sqoop1/2018060401/part-m-00000

    [root@host ~]# hdfs dfs -cat /sqoop/sqoop1/2018060401/p*
    1,null,2017-11-16 14:49:11.0,henan luohe linying,1,10
    40,null,2017-11-13 14:50:25.0,guangdong shenzhen,1,20
    110,null,2017-11-14 14:50:47.0,beijing,1,20
    200,null,2017-11-14 14:49:47.0,shandong qingdao,0,8
    400,null,2017-11-15 14:49:56.0,anhui hefei,0,4
    600,null,2017-11-15 14:50:05.0,hunan changsha,0,91
    650,null,2017-11-01 17:24:34.0,null,1,29
    [root@host ~]# hdfs dfs -cat /sqoop/sqoop1/20180604/p*
    1,null,2017-11-16 14:49:11.0,henan luohe linying,1,10
    40,null,2017-11-13 14:50:25.0,guangdong shenzhen,1,20
    110,null,2017-11-14 14:50:47.0,beijing,1,20
    200,null,2017-11-14 14:49:47.0,shandong qingdao,0,8
    400,null,2017-11-15 14:49:56.0,anhui hefei,0,4
    600,null,2017-11-15 14:50:05.0,hunan changsha,0,91
    650,null,2017-11-01 17:24:34.0,null,1,29

    Exporting from HDFS to MySQL

    The MySQL table before the export:

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    +--------+------+---------------------+---------------------+------+-------+
    7 rows in set (0.01 sec)

    Export:

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --export-dir hdfs://localhost:9000/sqoop/sqoop1/2018060401

    The MySQL table after the export; the exported rows were appended, so every row now appears twice:

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    +--------+------+---------------------+---------------------+------+-------+
    14 rows in set (0.00 sec)

    Import selected columns that match a condition:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --columns "roleid,level" --where "sex=0"  --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060403

     [root@host ~]# hdfs dfs -cat /sqoop/sqoop1/2018060403/p*
    200,8
    400,4
    600,91

    Direct import mode

    --direct enables direct mode, which uses the relational database's own bulk import/export tool (mysqldump in MySQL's case). This is reputedly faster.

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --delete-target-dir  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060405

    --delete-target-dir deletes the target directory if it already exists.

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060405

    --fields-terminated-by specifies the field (column) delimiter.

    --lines-terminated-by specifies the line (record) delimiter.

    Transferring between MySQL and Hive

    Import a MySQL table definition (schema only) into Hive:

    sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-table sqoopdb.role

    18/06/04 16:10:39 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
    18/06/04 16:10:39 ERROR tool.CreateHiveTableTool: Encountered IOException running create table job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf

    Edit the configuration file /etc/profile and add:

    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*

    Restart the terminal, or open a new window, and run again:

    sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-table sqoopdb.role

    Or, in one step (this also imports the data):

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role  --hive-import  --hive-database sqoopdb --hive-table role

    Check the sqoopdb database in Hive:

    hive> use sqoopdb;
    OK
    Time taken: 0.116 seconds
    hive> show tables;
    OK
    tab_name
    role
    Time taken: 0.251 seconds, Fetched: 1 row(s)
    hive> select * from role;
    OK
    role.roleid     role.name       role.dateid     role.addr       role.sex        role.level
    Time taken: 4.798 seconds

    Import the MySQL data into the Hive table sqoopdb.role:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role  --hive-import --hive-table sqoopdb.role

    Check HDFS:

    drwx-wx-wx   - root supergroup          0 2018-06-04 16:24 /user/hive/warehouse/sqoopdb.db
    drwx-wx-wx   - root supergroup          0 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role
    -rwx-wx-wx   1 root supergroup        152 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00000
    -rwx-wx-wx   1 root supergroup         52 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00001
    -rwx-wx-wx   1 root supergroup         47 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00002
    -rwx-wx-wx   1 root supergroup         92 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00003

    ...............................................

    drwxr-xr-x   - root supergroup          0 2018-06-04 16:36 /user/root
    drwxr-xr-x   - root supergroup          0 2018-06-04 16:36 /user/root/role
    -rw-r--r--   1 root supergroup          0 2018-06-04 16:36 /user/root/role/_SUCCESS

    Check Hive:

    hive> select * from role;
    OK
    role.roleid     role.name       role.dateid     role.addr       role.sex        role.level
    1       null    2017-11-16 14:49:11.0   henan luohe linying     1       10
    40      null    2017-11-13 14:50:25.0   guangdong shenzhen      1       20
    110     null    2017-11-14 14:50:47.0   beijing 1       20
    200     null    2017-11-14 14:49:47.0   shandong qingdao        0       8
    400     null    2017-11-15 14:49:56.0   anhui hefei     0       4
    600     null    2017-11-15 14:50:05.0   hunan changsha  0       91
    650     null    2017-11-01 17:24:34.0   null    1       29
    Time taken: 0.674 seconds, Fetched: 7 row(s)

    Run the import again:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role  --hive-import --hive-table sqoopdb.role

    hive> select * from role;
    OK
    role.roleid     role.name       role.dateid     role.addr       role.sex        role.level
    1       null    2017-11-16 14:49:11.0   henan luohe linying     1       10
    40      null    2017-11-13 14:50:25.0   guangdong shenzhen      1       20
    110     null    2017-11-14 14:50:47.0   beijing 1       20
    1       null    2017-11-16 14:49:11.0   henan luohe linying     1       10
    40      null    2017-11-13 14:50:25.0   guangdong shenzhen      1       20
    110     null    2017-11-14 14:50:47.0   beijing 1       20
    200     null    2017-11-14 14:49:47.0   shandong qingdao        0       8
    200     null    2017-11-14 14:49:47.0   shandong qingdao        0       8
    400     null    2017-11-15 14:49:56.0   anhui hefei     0       4
    400     null    2017-11-15 14:49:56.0   anhui hefei     0       4
    600     null    2017-11-15 14:50:05.0   hunan changsha  0       91
    650     null    2017-11-01 17:24:34.0   null    1       29
    600     null    2017-11-15 14:50:05.0   hunan changsha  0       91
    650     null    2017-11-01 17:24:34.0   null    1       29
    Time taken: 0.464 seconds, Fetched: 14 row(s)

    Check HDFS again; the re-import appended new part files with a _copy_1 suffix:

    drwx-wx-wx   - root supergroup          0 2018-06-04 16:24 /user/hive/warehouse/sqoopdb.db
    drwx-wx-wx   - root supergroup          0 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role
    -rwx-wx-wx   1 root supergroup        152 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00000
    -rwx-wx-wx   1 root supergroup        152 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00000_copy_1
    -rwx-wx-wx   1 root supergroup         52 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00001
    -rwx-wx-wx   1 root supergroup         52 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00001_copy_1
    -rwx-wx-wx   1 root supergroup         47 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00002
    -rwx-wx-wx   1 root supergroup         47 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00002_copy_1
    -rwx-wx-wx   1 root supergroup         92 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00003
    -rwx-wx-wx   1 root supergroup         92 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00003_copy_1

    Exporting from Hive to MySQL

    --hive-home <dir>          Override $HIVE_HOME
      --hive-import              Import tables into Hive (uses Hive's default delimiters)
      --hive-overwrite           Overwrite existing data in the Hive table
      --create-hive-table        Create the Hive table; the job fails if the table already exists
      --hive-table <table-name>  The table name to use in Hive
      --hive-drop-import-delims  Drop \n, \r, and \01 from string fields when importing to Hive
      --hive-delims-replacement  Replace \n, \r, and \01 in string fields with a user-defined string when importing to Hive
      --hive-partition-key       The Hive partition key
      --hive-partition-value <v> The Hive partition value
      --map-column-hive <map>    Override the default SQL-to-Hive type mapping for named columns
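
    Given the duplicate rows seen above when re-importing into sqoopdb.role, here is a sketch of using --hive-overwrite (same connection parameters as before) to replace the table contents instead of appending:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-import --hive-overwrite --hive-table sqoopdb.role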

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir /user/hive/warehouse/sqoopdb.db/role

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role

    In testing, omitting the field delimiter caused an error; once it was specified, the job ran successfully.

    Check the MySQL table (before and after the export):

    mysql> select * from role1;

    Empty set (0.00 sec)

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    +--------+------+---------------------+---------------------+------+-------+
    14 rows in set (0.00 sec)

    --query, -e <sql> imports the result set of a SQL query. When this option is used you must also specify --target-dir (or --hive-table), and the query must contain a WHERE clause that includes the token $CONDITIONS.

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select * from role1 where \$CONDITIONS" --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir --split-by roleid  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406

    Testing shows that --split-by cannot be omitted either.

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select roleid,level from role1 where sex=0 and \$CONDITIONS" --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir --split-by roleid  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406

    $CONDITIONS is not an ordinary shell variable; it is a placeholder token that Sqoop itself substitutes. Inspecting the execution log shows that $CONDITIONS was replaced with (1 = 0) during code generation (the real split boundary conditions are substituted at run time). Because the query above is wrapped in double quotes, the dollar sign is escaped as \$CONDITIONS so the shell does not expand it first.

    If you want to import the results of a query in parallel, then each map task will need to execute a copy of the query, with results partitioned by bounding conditions inferred by Sqoop. Your query must include the token $CONDITIONS which each Sqoop process will replace with a unique condition expression. You must also select a splitting column with --split-by.
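
    Inside single quotes the shell passes $CONDITIONS through literally, so no escaping is needed. A sketch of the single-quoted form of the same query:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query 'select roleid,level from role1 where sex=0 and $CONDITIONS' --split-by roleid --delete-target-dir --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406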

    [root@host ~]# sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select roleid,level from role1 where sex=0 and \$CONDITIONS" --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir --split-by roleid  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/05 10:24:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/05 10:24:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    18/06/05 10:24:26 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    18/06/05 10:24:26 INFO tool.CodeGenTool: Beginning code generation
    18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and  (1 = 0)
    18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and  (1 = 0)
    18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and  (1 = 0)

    ................

    export

    --update-key  the column (typically the table's primary key) used to match rows for update

    --update-mode  two modes: allowinsert (update the row if it exists, insert it if it does not) and updateonly (only update existing rows)

    If the table has no primary key, then specifying --update-key together with allowinsert will still insert duplicates even when the data already exists.

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role --update-key roleid --update-mode allowinsert
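
    For comparison, a sketch of updateonly mode, which updates rows whose roleid already exists in MySQL and never inserts new ones:

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role --update-key roleid --update-mode updateonly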

    sqoop job

    Create a job:

    [root@host ~]# sqoop job --create mysqltohive -- export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/05 13:52:55 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/05 13:52:56 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

    List the available jobs:

    [root@host ~]# sqoop job --list
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/05 13:55:45 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Available jobs:
      mysqltohive

    Run the job:

    MySQL before the run:

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    +--------+------+---------------------+---------------------+------+-------+
    7 rows in set (0.00 sec)

    [root@host ~]# sqoop job --exec mysqltohive

    Note: you are prompted for the password during execution.

    MySQL after the run:

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    +--------+------+---------------------+---------------------+------+-------+

    14 rows in set (0.00 sec)

    Executed successfully!

    Delete the job:

    sqoop job --delete mysqltohive

    The sqoop job runs above prompt for a password. To execute without a password prompt:

    Edit sqoop-site.xml and enable:

      <property>
        <name>sqoop.metastore.client.record.password</name>
        <value>true</value>
        <description>If true, allow saved passwords in the metastore.
        </description>
      </property>

    Create the job (preferably in a new terminal window):

    [root@host conf]# sqoop job --create hivetomysql -- export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/05 14:17:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/05 14:17:51 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

    Run the job:

    [root@host ~]# sqoop job --exec hivetomysql

    Sure enough, no password prompt this time. Bingo!

    Checking the MySQL table, the data is there. Test successful!
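
    As an alternative to recording the password in the metastore, Sqoop can read it from a protected local file via --password-file (the file path and job name below are assumptions):

    echo -n "root" > /root/.sqoop.pwd
    chmod 400 /root/.sqoop.pwd
    sqoop job --create hivetomysql2 -- export --connect jdbc:mysql://localhost:3306/test --username root --password-file file:///root/.sqoop.pwd --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role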

    Sqoop can likewise transfer data between SQL Server and HDFS, Hive, and HBase. You need to download the corresponding driver,

    add the MSSQL_CONNECTOR_HOME environment variable,

    and copy the driver into Sqoop's lib directory.
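
    A hypothetical SQL Server import might look like the following (a sketch: the host, database, credentials, and table name are assumptions, and the Microsoft JDBC driver jar must already be in $SQOOP_HOME/lib):

    sqoop import --connect "jdbc:sqlserver://localhost:1433;databaseName=test" --username sa --password yourpassword --table role --target-dir hdfs://localhost:9000/sqoop/sqlserver/role -m 1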
