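1. Download Sqoop
Sqoop has two release lines, Sqoop 1 and Sqoop 2; this guide uses Sqoop 1. Sqoop 1 download link: http://www.apache.org/dyn/closer.lua/sqoop/1.4.7
2. Extract the archive (assuming it was downloaded to /data/)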
cd /data/
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
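After extraction it can be convenient to point the shell at the new directory. The lines below are a minimal sketch, assuming the archive was unpacked under /data as in the step above; SQOOP_HOME is a common convention rather than something this guide otherwise relies on.

```shell
# Assumption: the tarball was extracted in /data, which creates this directory.
export SQOOP_HOME=/data/sqoop-1.4.7.bin__hadoop-2.6.0

# Put the Sqoop launcher scripts on the PATH so `sqoop` works from any directory.
export PATH="$PATH:$SQOOP_HOME/bin"
```

With these two exports in place (for example in ~/.bashrc), `sqoop version` should be callable from any directory once Hadoop itself is configured.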
3. Edit the configuration file, mainly to set the Hadoop and Hive directories
cd sqoop-1.4.7.bin__hadoop-2.6.0/conf
cp sqoop-env-template.sh sqoop-env.sh
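Configure Hadoop and Hive by adding the following lines:
vim sqoop-env.sh
export HADOOP_COMMON_HOME=/path/to/hadoop   # Hadoop installation directory
export HADOOP_MAPRED_HOME=/path/to/hadoop   # Hadoop installation directory
export HIVE_HOME=/path/to/hive              # Hive installation directory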
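The same configuration can be scripted instead of edited by hand. The following is a rough sketch only: configure_sqoop_env is a hypothetical helper name, and the example paths /opt/hadoop and /opt/hive are assumptions that must be replaced with the real installation directories.

```shell
# configure_sqoop_env is a hypothetical helper, not part of Sqoop.
# Arguments: <sqoop conf dir> <hadoop install dir> <hive install dir>
configure_sqoop_env() {
  local conf_dir="$1" hadoop_home="$2" hive_home="$3"

  # Start from the template shipped with Sqoop if sqoop-env.sh does not exist yet.
  if [ ! -f "$conf_dir/sqoop-env.sh" ] && [ -f "$conf_dir/sqoop-env-template.sh" ]; then
    cp "$conf_dir/sqoop-env-template.sh" "$conf_dir/sqoop-env.sh"
  fi

  # Append the three exports described in the step above.
  cat >> "$conf_dir/sqoop-env.sh" <<EOF
export HADOOP_COMMON_HOME=$hadoop_home
export HADOOP_MAPRED_HOME=$hadoop_home
export HIVE_HOME=$hive_home
EOF
}

# Example call (paths are assumptions):
#   configure_sqoop_env /data/sqoop-1.4.7.bin__hadoop-2.6.0/conf /opt/hadoop /opt/hive
```

Because the helper appends, running it twice leaves duplicate export lines; that is harmless for a sourced shell file but worth cleaning up.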
4. Download the MySQL JDBC driver JAR into Sqoop's lib directory
cd ../lib
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.19/mysql-connector-java-8.0.19.jar
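A missing driver JAR usually surfaces only when the import runs, so a quick check beforehand can save a failed job. This is a small sketch; has_mysql_driver is a hypothetical helper, and the lib path in the usage comment assumes the /data layout used in this guide.

```shell
# has_mysql_driver is a hypothetical helper, not part of Sqoop.
# Succeeds if at least one MySQL Connector/J JAR is present in the given lib directory.
has_mysql_driver() {
  local lib_dir="$1"
  ls "$lib_dir"/mysql-connector-java-*.jar >/dev/null 2>&1
}

# Example (path is an assumption):
#   has_mysql_driver /data/sqoop-1.4.7.bin__hadoop-2.6.0/lib || echo "driver JAR missing"
```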
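5. Connect to MySQL and import a table
Replace ip, db_user, db_password and table_name with your own values. The JDBC URL is quoted so that the shell does not interpret the ? character.
cd ../bin
./sqoop import --connect 'jdbc:mysql://ip:3306/test?zeroDateTimeBehavior=CONVERT_TO_NULL' --username 'db_user' --password 'db_password' --table table_name --fields-terminated-by ',' --target-dir '/data/hive/test'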
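Passing the password with --password leaves it in the shell history and the process list. Sqoop also accepts -P (prompt on the console) and --password-file. The sketch below prepares such a file; write_password_file is a hypothetical helper, the file location is an assumption, and the file:// prefix in the usage comment assumes the file lives on the local filesystem rather than HDFS.

```shell
# write_password_file is a hypothetical helper, not part of Sqoop.
# It reads one line from stdin and stores it WITHOUT a trailing newline,
# readable only by the current user.
write_password_file() {
  local file="$1" password
  IFS= read -r password
  # Remove any earlier read-only copy so the redirect below can succeed.
  rm -f "$file"
  # Tight umask in a subshell so the file is never created world-readable.
  ( umask 077; printf '%s' "$password" > "$file" )
  chmod 400 "$file"
}

# Usage (type the password, then press Enter; note that it is echoed to the terminal):
#   write_password_file "$HOME/.sqoop-mysql.pwd"
# Then replace --password 'db_password' in the import command with:
#   --password-file "file://$HOME/.sqoop-mysql.pwd"
```

The trailing newline matters because the whole file content is used as the password.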
6. Check the data on HDFS
hdfs dfs -ls /data/hive/test
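Listing the directory only shows that files exist. Since the import writes comma-separated text, a rough sanity check is to confirm that every row has the same number of fields. The sketch below does that on whatever is piped into it; count_fields is a hypothetical helper, and the part-m-* file pattern in the usage comment assumes the usual naming of map-only job output.

```shell
# count_fields is a hypothetical helper, not part of Sqoop or Hadoop.
# Reads comma-separated lines on stdin and prints "<field count> <number of lines>"
# for each distinct field count, sorted by field count.
count_fields() {
  awk -F',' '{ counts[NF]++ } END { for (n in counts) print n, counts[n] }' | sort -n
}

# Usage (requires a running HDFS):
#   hdfs dfs -cat /data/hive/test/part-m-* | count_fields
```

A single output line means all rows agree. More than one line often points to column values that themselves contain commas, in which case a different delimiter may be a better choice.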
Error reported by the import:
ERROR tool.ImportTool: Import failed: java.io.IOException: Generating splits for a textual index column allowed only in case of "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" property passed as a parameter
    at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:204)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:200)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:173)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:270)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
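Solution: pass the -Dorg.apache.sqoop.splitter.allow_text_splitter=true property, which allows Sqoop to split the import on a string-typed primary key. Because it is a generic Hadoop argument, it must come directly after the tool name (import) and before the tool-specific arguments.
That is: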
./sqoop import "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" --connect 'jdbc:mysql://ip:3306/test?zeroDateTimeBehavior=CONVERT_TO_NULL' --username 'db_user' --password 'db_password' --table table_name --fields-terminated-by ',' --target-dir '/data/hive/test'
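Enabling the text splitter is not the only way around this error. Two alternatives avoid splitting on a string column altogether: run the import with a single mapper (--num-mappers 1), or split on a numeric column with --split-by. The sketch below prints both commands instead of running them. The function names, the RUN switch and the column name id in the usage comment are assumptions for illustration, and -P is used so that the password is prompted for rather than typed on the command line.

```shell
# Dry-run switch: by default the commands are only printed.
# Set RUN to an empty string (RUN= ) to actually execute them.
RUN=${RUN-echo}

JDBC_URL='jdbc:mysql://ip:3306/test?zeroDateTimeBehavior=CONVERT_TO_NULL'

# Hypothetical helper: import with one mapper, so no split column is needed.
# Arguments: <db user> <table> <target dir>
import_single_mapper() {
  $RUN sqoop import \
    --connect "$JDBC_URL" \
    --username "$1" -P \
    --table "$2" \
    --num-mappers 1 \
    --fields-terminated-by ',' \
    --target-dir "$3"
}

# Hypothetical helper: import in parallel, splitting on a numeric column.
# Arguments: <db user> <table> <numeric split column> <target dir>
import_split_by() {
  $RUN sqoop import \
    --connect "$JDBC_URL" \
    --username "$1" -P \
    --table "$2" \
    --split-by "$3" \
    --fields-terminated-by ',' \
    --target-dir "$4"
}

# Examples (table and column names are assumptions):
#   import_single_mapper db_user table_name /data/hive/test
#   import_split_by db_user table_name id /data/hive/test
```

A single mapper is the simplest option for small tables. For large tables, splitting on a numeric column keeps the import parallel without relying on how the database orders strings.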