Environment: Hadoop 2.3.0
Sqoop 1.4.5
1. Download and extract sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz (the extracted directory name is long, so rename it if you like; the later steps assume sqoop-1.4.5)
tar -zxvf sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz
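The extract-and-rename step can be scripted. The following is a minimal sketch, assuming the top-level directory inside the archive has the same name as the tarball without its .tar.gz suffix; the function name unpack_sqoop is hypothetical and not part of Sqoop.

```shell
# Hypothetical helper: unpack a Sqoop tarball and give the directory a short name.
# $1 = path to the tarball, $2 = short directory name (created next to the tarball).
unpack_sqoop() {
  local tarball="$1" target="$2"
  local dir extracted
  dir="$(dirname "$tarball")"
  # Assumption: the archive's top-level directory is the tarball name
  # without the .tar.gz suffix.
  extracted="$(basename "$tarball" .tar.gz)"
  tar -zxf "$tarball" -C "$dir" || return 1
  mv "$dir/$extracted" "$dir/$target"
}
```

For example, unpack_sqoop ./sqoop-1.4.5.bin__hadoop-2.0.4-alpha.tar.gz sqoop-1.4.5 would leave a ./sqoop-1.4.5 directory.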
2. Configure the environment variables
export SQOOP_HOME=/home/grid2/sqoop-1.4.5
export PATH=$PATH:$SQOOP_HOME/bin
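To keep these variables across logins they are usually appended to a shell profile. A minimal sketch, assuming a bash profile such as ~/.bashrc; the function name add_sqoop_env is hypothetical, and it only appends the two lines if SQOOP_HOME is not already exported in that file.

```shell
# Hypothetical helper: append the Sqoop variables to a profile file once.
# $1 = Sqoop install directory, $2 = profile file (for example ~/.bashrc).
add_sqoop_env() {
  local home="$1" profile="$2"
  # Do nothing if the profile already exports SQOOP_HOME.
  if grep -q '^export SQOOP_HOME=' "$profile" 2>/dev/null; then
    return 0
  fi
  {
    echo "export SQOOP_HOME=$home"
    # Single quotes keep $PATH and $SQOOP_HOME unexpanded in the profile.
    echo 'export PATH=$PATH:$SQOOP_HOME/bin'
  } >> "$profile"
}
```

After calling add_sqoop_env /home/grid2/sqoop-1.4.5 ~/.bashrc, run source ~/.bashrc (or open a new shell) for the change to take effect.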
3. Edit the configuration files
cd /home/grid2/sqoop-1.4.5/conf
cp sqoop-env-template.sh sqoop-env.sh
vi sqoop-env.sh
export HADOOP_COMMON_HOME=/home/grid2/hadoop-2.3.0
export HIVE_HOME=/home/grid2/apache-hive-0.13.1-bin
cd /home/grid2/sqoop-1.4.5/bin
vi configure-sqoop
Comment out the HBASE, ZOOKEEPER, and ACCUMULO sections in their entirety
4. Copy the MySQL connector (note: Sqoop 1.4.5 needs version 5.1.31 of the MySQL connector)
cp mysql-connector-java-5.1.31-bin.jar ./sqoop-1.4.5/lib/
5. Check that Sqoop is configured correctly
sqoop help
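If sqoop help reports "command not found", the cause is usually the environment from step 2. A minimal sketch of a pre-check, assuming SQOOP_HOME and PATH were set as above; the function name check_sqoop is hypothetical.

```shell
# Hypothetical helper: report whether the sqoop launcher is reachable.
check_sqoop() {
  if [ -z "${SQOOP_HOME:-}" ]; then
    echo "SQOOP_HOME is not set"
    return 1
  fi
  if [ ! -x "$SQOOP_HOME/bin/sqoop" ]; then
    echo "no executable at $SQOOP_HOME/bin/sqoop"
    return 1
  fi
  # command -v prints the path the shell would run for "sqoop".
  if ! command -v sqoop >/dev/null 2>&1; then
    echo "sqoop is not on PATH"
    return 1
  fi
  echo "sqoop found at $(command -v sqoop)"
}
```

A typical use is check_sqoop && sqoop help, so the real command only runs once the launcher has been located.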
6. To use Sqoop saved jobs, add the following properties to conf/sqoop-site.xml
<property>
<name>sqoop.metastore.client.autoconnect.url</name>
<value>jdbc:hsqldb:file:/usr/local/sqoop-1.4.6/metastore/meta.db;shutdown=true</value>
<description>The connect string to use when connecting to a
job-management metastore. If unspecified, uses ~/.sqoop/.
You can specify a different path here.
</description>
</property>
<property>
<name>sqoop.metastore.client.autoconnect.username</name>
<value>SA</value>
<description>The username to bind to the metastore.
</description>
</property>
<property>
<name>sqoop.metastore.client.autoconnect.password</name>
<value></value>
<description>The password to bind to the metastore.
</description>
</property>
<property>
<name>sqoop.metastore.client.record.password</name>
<value>true</value>
<description>If true, allow saved passwords in the metastore.
</description>
</property>
<property>
<name>sqoop.metastore.server.location</name>
<value>/usr/local/sqoop-1.4.6/metastore/shared.db</value>
<description>Path to the shared metastore database files.
If this is not set, it will be placed in ~/.sqoop/.
</description>
</property>
<property>
<name>sqoop.metastore.server.port</name>
<value>16000</value>
<description>Port that this metastore should listen on.
</description>
</property>
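With these properties in place, saved jobs are managed through the sqoop job tool. The sketch below shows the usual create, list, show, and exec sequence. The job name users_import and the address 192.168.1.10 are hypothetical placeholders; the user, database, and table names are borrowed from the export example in the error log. The run wrapper is also an illustration: it only prints each command unless DRY_RUN=0 is set.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical job name and server address; replace with your own values.
saved_job_demo() {
  local job="users_import"
  # Everything after the bare "--" is an ordinary sqoop import command line.
  run sqoop job --create "$job" -- import \
    --connect jdbc:mysql://192.168.1.10:3306/test \
    --username dyh -P --table users
  run sqoop job --list          # list the saved jobs
  run sqoop job --show "$job"   # show the parameters stored for the job
  run sqoop job --exec "$job"   # run the saved job
}
```

Calling saved_job_demo prints the four commands for review; DRY_RUN=0 saved_job_demo would execute them against a real installation.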
Error log
Problem 1:
ERROR manager.SqlManager: Error reading from database:
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@6c4fc156 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@6c4fc156 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
Solution:
Investigation showed that the mysql-connector-java jar was the wrong version.
Switching to mysql-connector-java-5.1.31 fixed it.
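A leftover jar of another version in the same lib directory can bring the problem back, so it helps to confirm that exactly one connector jar of the expected version is present. A minimal sketch; the function name check_connector is hypothetical, and the version string is passed in rather than hard-coded.

```shell
# Hypothetical helper: check that a lib directory holds exactly one
# MySQL connector jar and that it is the expected version.
# $1 = lib directory, $2 = expected version (for example 5.1.31).
check_connector() {
  local lib="$1" want="$2" jars count
  # "|| true" keeps the helper usable under "set -e" when nothing matches.
  jars="$(ls "$lib" 2>/dev/null | grep '^mysql-connector-java-.*\.jar$' || true)"
  count="$(printf '%s\n' "$jars" | grep -c . || true)"
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one connector jar, found $count"
    return 1
  fi
  case "$jars" in
    "mysql-connector-java-$want"*) echo "ok: $jars" ;;
    *) echo "unexpected version: $jars"; return 1 ;;
  esac
}
```

For the layout used here, the call would be check_connector "$SQOOP_HOME/lib" 5.1.31.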
Problem 2:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
Solution:
Replace the connector jar in hive/lib with mysql-connector-java-5.1.31-bin.jar.
Exit Hive with quit, then start Hive again.
Problem 3:
[hadoop@Master bin]$ sqoop export --connect jdbc:mysql://localhost:3306/test --username dyh --password 000000 --table users --export-dir /user/hive/warehouse/users/part-m-00000 --input-fields-terminated-by '\001'
Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: $HADOOP_HOME is deprecated.
13/12/12 19:50:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
13/12/12 19:50:38 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
13/12/12 19:50:38 INFO tool.CodeGenTool: Beginning code generation
13/12/12 19:50:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `users` AS t LIMIT 1
13/12/12 19:50:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `users` AS t LIMIT 1
13/12/12 19:50:38 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hadoop
Note: /tmp/sqoop-hadoop/compile/9731783979d46a3414a9f86d700bec33/users.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/12/12 19:50:39 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/9731783979d46a3414a9f86d700bec33/users.jar
13/12/12 19:50:39 INFO mapreduce.ExportJobBase: Beginning export of users
13/12/12 19:50:41 INFO input.FileInputFormat: Total input paths to process : 1
13/12/12 19:50:41 INFO input.FileInputFormat: Total input paths to process : 1
13/12/12 19:50:41 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/12/12 19:50:41 WARN snappy.LoadSnappy: Snappy native library not loaded
13/12/12 19:50:42 INFO mapred.JobClient: Running job: job_201312051716_0034
13/12/12 19:50:43 INFO mapred.JobClient: map 0% reduce 0%
13/12/12 19:50:51 INFO mapred.JobClient: map 25% reduce 0%
13/12/12 19:50:53 INFO mapred.JobClient: map 50% reduce 0%
13/12/12 19:50:58 INFO mapred.JobClient: Task Id : attempt_201312051716_0034_m_000002_0, Status : FAILED
java.io.IOException: java.sql.SQLException: Access denied for user 'dyh'@'localhost' (using password: YES)
Caused by: java.sql.SQLException: Access denied for user 'dyh'@'localhost' (using password: YES)
13/12/12 19:50:58 INFO mapred.JobClient: Task Id : attempt_201312051716_0034_m_000003_0, Status : FAILED
java.io.IOException: java.sql.SQLException: Access denied for user 'dyh'@'localhost' (using password: YES)
Solution:
Change localhost in jdbc:mysql://localhost:3306/test to the MySQL server's IP address.
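A likely reason is that the export runs as map tasks on the cluster nodes, where localhost refers to the node running the task rather than the machine the command was typed on. The sketch below rewrites a JDBC URL accordingly. The function name jdbc_with_host is hypothetical, and 192.168.1.10 in the usage line is a placeholder for the real server address.

```shell
# Hypothetical helper: replace "localhost" in a MySQL JDBC URL with an
# address that every node in the cluster can reach.
# $1 = JDBC URL, $2 = IP address or hostname of the MySQL server.
jdbc_with_host() {
  local url="$1" host="$2"
  # Only the host part directly after "jdbc:mysql://" is rewritten;
  # URLs that already name another host pass through unchanged.
  printf '%s\n' "$url" | sed "s#^\(jdbc:mysql://\)localhost\([:/]\)#\1$host\2#"
}
```

For example, sqoop export --connect "$(jdbc_with_host jdbc:mysql://localhost:3306/test 192.168.1.10)" followed by the remaining options. The MySQL account must also be allowed to connect from the other nodes, and, as the warning in the log suggests, -P is safer than putting the password on the command line.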