  • How to use Sqoop to import and export data

    Sqoop is a tool for efficiently transferring data between relational databases (RDBMS) and HDFS. With it we can import data from MySQL, Oracle and other databases into HDFS, and export data from HDFS back into a database. For details, see the Sqoop documentation.

    Before using Sqoop, make sure it is installed and configured correctly.

    Sqoop - Import

    The import command has the following general form:

    sqoop import (generic-args) (import-args)

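    Suppose the database has a table named stock_info. (The schema screenshot from the original post is not available; the examples below rely only on its last_chg_date column.)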

    Case 1: import the stock_info table into HDFS with the command below:

    sqoop import --connect jdbc:mysql://host:port/dbname --username loginuser --password loginuser --table stock_info -m 1

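    When the job finishes, Sqoop has written the rows to files named part-m-*. Because no target directory was given, the files are placed in a directory named after the table under the current user's HDFS home directory, so we can verify the result with:

    hadoop fs -cat stock_info/part-m-*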
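The commands in this article pass the password with --password, which leaves it in the shell history and in the process list. Sqoop also accepts -P (prompt on the console) and --password-file. The following is a minimal sketch of the password-file variant, not the author's setup: SQOOP_BIN and both function names are illustrative, and the file location is whatever absolute local path you choose.

```shell
#!/usr/bin/env bash
# Sketch: keep the database password out of the command line.
# SQOOP_BIN, write_password_file and import_with_password_file are
# illustrative names, not part of Sqoop.
SQOOP_BIN="${SQOOP_BIN:-sqoop}"

# Write the password without a trailing newline (Sqoop would treat the
# newline as part of the password) and make the file owner-read-only.
write_password_file() {
  local file="$1" password="$2"
  rm -f "$file"
  ( umask 077 && printf '%s' "$password" > "$file" )
  chmod 400 "$file"
}

# Same import as Case 1, but reading the password from a local file.
# $1 must be an absolute path so that file://$1 is a valid file URI.
import_with_password_file() {
  local file="$1"
  "$SQOOP_BIN" import \
    --connect jdbc:mysql://host:port/dbname \
    --username loginuser \
    --password-file "file://$file" \
    --table stock_info \
    -m 1
}
```

A possible call sequence would be write_password_file "$HOME/.sqoop-password" 'loginuser' followed by import_with_password_file "$HOME/.sqoop-password", where the file name is only an example.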

    Case 2: specify the target directory in HDFS with the --target-dir argument:

    sqoop import --connect jdbc:mysql://host:port/dbname --username loginuser --password loginuser --table stock_info -m 1 --target-dir /temp

    Then verify the result as before, this time reading from the target directory, for example with hadoop fs -cat /temp/part-m-*.
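Running the Case 2 command a second time fails if /temp already exists, because a plain import does not write into an existing directory. For a full re-import, Sqoop 1.4.4 and later offer --delete-target-dir, which removes the directory before importing. The sketch below wraps that in a function; SQOOP_BIN and reimport_stock_info are illustrative names. As far as I know the option is meant for full imports only and is rejected together with --incremental.

```shell
#!/usr/bin/env bash
# Sketch: re-runnable full import. SQOOP_BIN and reimport_stock_info
# are illustrative names, not part of Sqoop.
SQOOP_BIN="${SQOOP_BIN:-sqoop}"

# Full re-import of stock_info into the given HDFS directory.
# --delete-target-dir removes the directory first if it already exists.
reimport_stock_info() {
  local target_dir="${1:-/temp}"
  "$SQOOP_BIN" import \
    --connect jdbc:mysql://host:port/dbname \
    --username loginuser \
    --password loginuser \
    --table stock_info \
    -m 1 \
    --target-dir "$target_dir" \
    --delete-target-dir
}
```

Calling reimport_stock_info /temp replaces the previous contents of /temp, so it should only point at a directory that holds nothing else.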

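    Case 3: incremental import, enabled with the --incremental, --check-column and --append arguments. Here last_chg_date is the timestamp column of stock_info that Sqoop checks to find modified rows; replace it with the corresponding column when importing other tables.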

    sqoop import --connect jdbc:mysql://host:port/dbname --username loginuser --password loginuser --table stock_info -m 1 --target-dir /temp --incremental lastmodified --check-column last_chg_date --append
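When an incremental import is run directly from the command line, Sqoop prints the value to pass as --last-value on the next run, and it is up to us to carry it forward. Defining the import as a saved job lets Sqoop store that value itself. The sketch below shows one way to do this with sqoop job. The job name stock_info_incr, SQOOP_BIN and the function names are assumptions for illustration, and the password file URI is passed in by the caller, since saved jobs do not store a plain --password by default.

```shell
#!/usr/bin/env bash
# Sketch: the incremental import from Case 3 as a saved Sqoop job.
# SQOOP_BIN, JOB_NAME and the function names are illustrative.
SQOOP_BIN="${SQOOP_BIN:-sqoop}"
JOB_NAME="${JOB_NAME:-stock_info_incr}"   # hypothetical job name

# Define the job once. $1 is the password file URI to use, for example
# a file:// URI pointing at a file readable only by you.
create_incremental_job() {
  "$SQOOP_BIN" job --create "$JOB_NAME" -- import \
    --connect jdbc:mysql://host:port/dbname \
    --username loginuser \
    --password-file "$1" \
    --table stock_info \
    -m 1 \
    --target-dir /temp \
    --incremental lastmodified \
    --check-column last_chg_date \
    --append
}

# Each execution imports the rows changed since the previous run;
# Sqoop updates the stored last value in the job afterwards.
run_incremental_job() {
  "$SQOOP_BIN" job --exec "$JOB_NAME"
}
```

After create_incremental_job has been called once, repeated calls to run_incremental_job take the place of the long import command; sqoop job --list and sqoop job --show can be used to inspect what was saved.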

    Case 4: write the imported data as Parquet files by adding the --as-parquetfile argument:

    sqoop import --connect jdbc:mysql://host:port/dbname --username loginuser --password loginuser --table stock_info -m 1 --target-dir /temp --incremental lastmodified --check-column last_chg_date --append --as-parquetfile

    Case 5: import all tables of the database:

    sqoop import-all-tables --connect jdbc:mysql://host:port/dbname --username loginuser --password loginuser 

    Sqoop - Export

    Export copies data from HDFS back into MySQL, Oracle or another database. The command has the following general form:

    sqoop export (generic-args) (export-args)

    Suppose the stock_info directory in HDFS contains many Parquet files produced by the incremental sqoop import above, and we want to load that data back into MySQL. The target table must already exist in the database. Use the following command:

    sqoop export --connect jdbc:mysql://host:port/dbname --username loginuser --password loginuser --table stock_info --export-dir /user/hlli/stock_info

    Finally, verify the data from the mysql command line:

    select * from stock_info;
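The same kind of check can be run without opening the mysql client by using sqoop eval, which executes a single SQL statement over the JDBC connection and prints the result. The sketch below counts rows instead of selecting all of them. SQOOP_BIN and check_row_count are illustrative names, and the table name is interpolated into the query unescaped, so it should only be given trusted values.

```shell
#!/usr/bin/env bash
# Sketch: quick row count after an export, via sqoop eval.
# SQOOP_BIN and check_row_count are illustrative names.
SQOOP_BIN="${SQOOP_BIN:-sqoop}"

# Run one SQL statement against the database and print the result.
# $1 is the table to count; it defaults to stock_info.
check_row_count() {
  local table="${1:-stock_info}"
  "$SQOOP_BIN" eval \
    --connect jdbc:mysql://host:port/dbname \
    --username loginuser \
    --password loginuser \
    --query "SELECT COUNT(*) FROM $table"
}
```

Comparing the count before and after sqoop export gives a rough confirmation that rows arrived; it does not prove that the contents match.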

    Scheduling incremental imports

    We can use cron, the Linux job scheduler, to run the incremental import periodically. The crontab entry below runs it every five minutes.

    cd /var/spool/cron

    touch hlli    # replace hlli with your own user name

    vi hlli

    */5 * * * * /usr/lib/sqoop/bin/sqoop import --connect jdbc:mysql://host:port/dbname --username loginuser --password loginuser --table stock_info -m 1 --target-dir /temp --incremental lastmodified --check-column last_chg_date --append --as-parquetfile

    If the job runs, cron mails its output to '/var/spool/mail/hlli'. We can also check the imported files in HDFS with the command:

    hadoop fs -ls /temp
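With a five-minute schedule, a slow import can still be running when cron starts the next one. One common way to avoid overlapping runs is a small wrapper script that takes a lock first. The following is a sketch under assumptions: the lock location, LOCK_DIR and run_exclusively are illustrative, and a lock directory is used because mkdir is atomic and needs no extra tools.

```shell
#!/usr/bin/env bash
# Sketch of a cron wrapper that skips a run while the previous one is
# still active. LOCK_DIR and run_exclusively are illustrative names.
LOCK_DIR="${LOCK_DIR:-/tmp/sqoop_stock_info.lock}"

run_exclusively() {
  # mkdir is atomic, so only one process can create the lock directory.
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "previous import still running, skipping this run" >&2
    return 1
  fi
  "$@"
  local status=$?
  rmdir "$LOCK_DIR"
  return "$status"
}

# When the script is called with arguments, run them under the lock.
if [ "$#" -gt 0 ]; then
  run_exclusively "$@"
fi
```

Saved under a path of your choice, the script is placed in front of the command in the crontab entry, so that cron calls the wrapper with /usr/lib/sqoop/bin/sqoop import and its arguments. If a run is killed before it can remove the lock directory, later runs keep skipping until the directory is deleted by hand.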

    Commonly used Sqoop commands

    sqoop help import

    sqoop help export

    sqoop help job

    sqoop help codegen

    sqoop help eval

    sqoop help list-tables

    sqoop help list-databases

    sqoop help import-all-tables

    References:

    1. http://sqoop.apache.org/
    2. http://man.linuxde.net/crontab
  • Original article: https://www.cnblogs.com/allanli/p/how_to_use_sqoop.html