  • Getting Started with Sqoop2: Importing Relational Database Data into HDFS (sqoop2-1.99.4)

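    sqoop2-1.99.4 works slightly differently from sqoop2-1.99.3: the new version uses "link" in place of the old version's "connection". Everything else is used in much the same way.

    For setting up the sqoop2-1.99.4 environment, see: Sqoop2 Environment Setup

    For the sqoop2-1.99.3 version of this walkthrough, see: Getting Started with Sqoop2: Importing Relational Database Data into HDFS

    Start the sqoop2-1.99.4 client and point it at the server: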

    $SQOOP2_HOME/bin/sqoop.sh client 
    set server --host hadoop000 --port 12000 --webapp sqoop
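
    The three `set server` options together describe the base URL of the Sqoop2 server. The following is a minimal Python sketch of that mapping, assuming the server is reached over plain HTTP and exposes a `version` resource that returns JSON under the webapp path. The function names `server_url` and `fetch_version` are illustrative and not part of Sqoop.

```python
import json
from urllib.request import urlopen


def server_url(host, port, webapp):
    """Build the base URL that `set server --host .. --port .. --webapp ..` describes."""
    return "http://{}:{}/{}/".format(host, port, webapp.strip("/"))


def fetch_version(base_url, timeout=5):
    """Query the (assumed) `version` resource; raises URLError if the server is unreachable."""
    with urlopen(base_url + "version", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Values taken from the `set server` command above.
    print(server_url("hadoop000", 12000, "sqoop"))
```

    Calling `fetch_version(server_url("hadoop000", 12000, "sqoop"))` before opening the shell is one way to confirm the server is up. The exact response fields depend on the Sqoop build.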

    List all connectors:

    show connector --all
    2 connector(s) to show: 
            Connector with id 1:
                Name: hdfs-connector 
                Class: org.apache.sqoop.connector.hdfs.HdfsConnector
                Version: 1.99.4-cdh5.3.0
    
            Connector with id 2:
                Name: generic-jdbc-connector 
                Class: org.apache.sqoop.connector.jdbc.GenericJdbcConnector
                Version: 1.99.4-cdh5.3.0

    List all links:

    show link

    Delete a specific link (replace x with the link id):

    delete link --lid x

    List all jobs:

    show job

    Delete a specific job:

    delete job --jid 1

      

    Create a link that uses the generic-jdbc-connector (connector id 2):

    create link --cid 2
        Name: First Link
        JDBC Driver Class: com.mysql.jdbc.Driver
        JDBC Connection String: jdbc:mysql://hadoop000:3306/hive
        Username: root
        Password: ****
        JDBC Connection Properties: 
        There are currently 0 values in the map:
        entry# protocol=tcp
        There are currently 1 values in the map:
        protocol = tcp
        entry# 
        New link was successfully created with validation status OK and persistent id 3
    show link
    +----+-------------+-----------+---------+
    | Id |    Name     | Connector | Enabled |
    +----+-------------+-----------+---------+
    | 3  | First Link  | 2         | true    |
    +----+-------------+-----------+---------+

    Create a link that uses the hdfs-connector (connector id 1):

    create link --cid 1
        Name: Second Link
        HDFS URI: hdfs://hadoop000:8020
        New link was successfully created with validation status OK and persistent id 4
    show link
    +----+-------------+-----------+---------+
    | Id |    Name     | Connector | Enabled |
    +----+-------------+-----------+---------+
    | 3  | First Link  | 2         | true    |
    | 4  | Second Link | 1         | true    |
    +----+-------------+-----------+---------+
    show link --all
        2 link(s) to show: 
        link with id 3 and name First Link (Enabled: true, Created by null at 15-2-2 ??11:28, Updated by null at 15-2-2 ??11:28)
        Using Connector id 2
          Link configuration
            JDBC Driver Class: com.mysql.jdbc.Driver
            JDBC Connection String: jdbc:mysql://hadoop000:3306/hive
            Username: root
            Password: 
            JDBC Connection Properties: 
              protocol = tcp
        link with id 4 and name Second Link (Enabled: true, Created by null at 15-2-2 ??11:32, Updated by null at 15-2-2 ??11:32)
        Using Connector id 1
          Link configuration
            HDFS URI: hdfs://hadoop000:8020
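
    The link ids printed by `show link` are what `create job` needs later, so it can help to pull them out of the table when scripting. The following is a minimal Python sketch, assuming the bordered table layout shown above. `parse_table` is a hypothetical helper, not a Sqoop API.

```python
def parse_table(text):
    """Parse a '+----+' bordered table, as printed by `show link`, into dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and blank lines
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


# Table copied from the `show link` output above.
SAMPLE = """
+----+-------------+-----------+---------+
| Id |    Name     | Connector | Enabled |
+----+-------------+-----------+---------+
| 3  | First Link  | 2         | true    |
| 4  | Second Link | 1         | true    |
+----+-------------+-----------+---------+
"""

links = parse_table(SAMPLE)
link_ids = {row["Name"]: int(row["Id"]) for row in links}
```

    Here `link_ids` maps each link name to its id, which keeps the from/to ids readable when several links exist.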

    Create a job from the link ids (from link 3, the JDBC link, to link 4, the HDFS link):

    create job -f 3 -t 4
        Creating job for links with from id 3 and to id 4
        Please fill following values to create new job object
        Name: Sqoopy
    
        From database configuration
    
        Schema name: hive
        Table name: TBLS
        Table SQL statement: 
        Table column names: 
        Partition column name: 
        Null value allowed for the partition column: 
        Boundary query: 
    
        ToJob configuration
    
        Output format: 
          0 : TEXT_FILE
          1 : SEQUENCE_FILE
        Choose: 0
        Compression format: 
          0 : NONE
          1 : DEFAULT
          2 : DEFLATE
          3 : GZIP
          4 : BZIP2
          5 : LZO
          6 : LZ4
          7 : SNAPPY
          8 : CUSTOM
        Choose: 0
        Custom compression format: 
        Output directory: hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4
    
        Throttling resources
    
        Extractors: 
        Loaders: 
        New job was successfully created with validation status OK  and persistent id 2

    List all jobs:

    show job
    +----+--------+----------------+--------------+---------+
    | Id |  Name  | From Connector | To Connector | Enabled |
    +----+--------+----------------+--------------+---------+
    | 2  | Sqoopy | 2              | 1            | true    |
    +----+--------+----------------+--------------+---------+

    Start the specified job. Once the job has finished, check the files on HDFS (hdfs dfs -ls hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4/):

    start job --jid 2
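
    Checking the output directory after the job finishes can also be done from a script. The following is a minimal Python sketch, assuming the `hdfs` command is on the PATH of the machine running it. `ls_command` and `list_output` are illustrative names.

```python
import subprocess

# Output directory entered when the job was created.
OUTPUT_DIR = "hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4"


def ls_command(path):
    """Return the argument list for listing an HDFS path."""
    return ["hdfs", "dfs", "-ls", path]


def list_output(path=OUTPUT_DIR):
    """Run `hdfs dfs -ls` and return its output lines; raises if the command fails."""
    result = subprocess.run(
        ls_command(path),
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.splitlines()
```

    An empty or missing directory after a job that reported success is a hint to look at the job details in verbose mode.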

    Check the execution status of the specified job:

    status job --jid 2
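
    Since `status job` only reports the state at the moment it is called, waiting for a job to finish means polling. The following is a minimal Python sketch of such a loop. The `get_status` callable is supplied by the caller (how it obtains the status is left open), and the state names are an assumption based on Sqoop2's submission states, so check them against your server's output.

```python
import time

# Assumed names of the states in which a submission no longer changes.
FINAL_STATES = {"SUCCEEDED", "FAILED", "FAILURE_ON_SUBMIT"}


def wait_for_job(get_status, interval=5.0, max_polls=120, sleep=time.sleep):
    """Call get_status() until it returns a final state or max_polls is exhausted."""
    status = None
    for _ in range(max_polls):
        status = get_status()
        if status in FINAL_STATES:
            return status
        sleep(interval)  # wait before asking again
    raise TimeoutError(
        "job still in state {!r} after {} polls".format(status, max_polls)
    )
```

    Passing `sleep` as a parameter keeps the helper easy to test without real delays.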

    Stop the specified job:

    stop job --jid 2

    A common error when running start job (e.g. start job --jid 2):

    Exception has occurred during processing command 
    Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

    To investigate, enable verbose output in the sqoop client and view the job details:

    set option --name verbose --value true
    show job --jid 2
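
    The non-interactive commands from this walkthrough can be collected into a script file instead of being typed each time. The following is a minimal Python sketch that writes such a file. It assumes the client accepts a script path as an argument (batch mode) and that lines starting with `#` are treated as comments. The file name `import_demo.sqoop` and the helper `write_script` are illustrative.

```python
import os
import tempfile

# Commands taken verbatim from the session above; none of them prompt for input.
COMMANDS = [
    "# Sqoop2 import demo: inspect and run job 2",
    "set server --host hadoop000 --port 12000 --webapp sqoop",
    "set option --name verbose --value true",
    "show link",
    "show job",
    "start job --jid 2",
    "status job --jid 2",
]


def write_script(path, commands=COMMANDS):
    """Write one command per line and return the path of the script."""
    with open(path, "w") as fh:
        fh.write("\n".join(commands) + "\n")
    return path


script_path = write_script(os.path.join(tempfile.mkdtemp(), "import_demo.sqoop"))
```

    If batch mode is available in your build, the file could then be run with `$SQOOP2_HOME/bin/sqoop.sh client` followed by the script path. The interactive `create link` and `create job` steps are left out because they prompt for values.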
  • Original article: https://www.cnblogs.com/luogankun/p/4267442.html