Today I studied the following topics:
Importing MySQL data into HDFS with Sqoop:
./sqoop import --connect jdbc:mysql://[IP]:3306/[database] --username root -P --table [table] -m 1 --target-dir [target dir];
-m 1 writes the data to a single output file; by default Sqoop runs four map tasks and produces four files.
Importing MySQL data into Hive with Sqoop:
./sqoop import --hive-import --connect jdbc:mysql://[IP]:3306/[database] --username root -P --table [table] --hive-table [hive table];
Exporting data to MySQL with Sqoop:
./sqoop export --connect jdbc:mysql://[IP]:3306/[database] --username root -P --table myemp --export-dir='******';
Connecting to Hive via JDBC:
1. Create a new Maven project.
2. Add the required dependencies:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>1.2.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.7.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-metastore</artifactId>
    <version>1.2.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>1.2.2</version>
</dependency>
3. Write the HiveJDBCUtils utility class:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJDBCUtils {
    // Load the Hive JDBC driver
    private static String driver = "org.apache.hive.jdbc.HiveDriver";
    private static String url = "jdbc:hive2://10.25.134.142:10000/default";

    static {
        try {
            Class.forName(driver);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }

    // Get a connection
    public static Connection getConnection() throws SQLException {
        return DriverManager.getConnection(url, "root", "144214");
    }

    // Close resources (close the statement before the connection)
    public static void close(Connection connection, Statement statement) throws SQLException {
        if (statement != null) {
            statement.close();
        }
        if (connection != null) {
            connection.close();
        }
    }

    // Close resources (close the result set first, then the statement, then the connection)
    public static void close(Connection connection, Statement statement, ResultSet resultSet) throws SQLException {
        if (resultSet != null) {
            resultSet.close();
        }
        if (statement != null) {
            statement.close();
        }
        if (connection != null) {
            connection.close();
        }
    }
}
4. Write a test class:
@Test
public void testConnection() {
    try {
        Connection connection = HiveJDBCUtils.getConnection();
        System.out.println(connection);
    } catch (SQLException e) {
        e.printStackTrace();
    }
}
Note: start the Hive service before running the test.
On Linux, go to Hive's bin directory and run: hive --service hiveserver2
Run the test class; if it prints the connection object instead of throwing an exception, the connection succeeded.
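As a follow-up, here is a minimal sketch of actually querying Hive through the same HiveJDBCUtils class, using a plain Statement and iterating the ResultSet. The "show tables" statement is only an illustrative query and not part of the original notes; any HiveQL, such as a SELECT on a table imported with Sqoop above, works the same way.

@Test
public void testQuery() {
    Connection connection = null;
    Statement statement = null;
    ResultSet resultSet = null;
    try {
        connection = HiveJDBCUtils.getConnection();
        statement = connection.createStatement();
        // "show tables" is just an example query against the default database
        resultSet = statement.executeQuery("show tables");
        while (resultSet.next()) {
            // "show tables" returns a single string column with the table name
            System.out.println(resultSet.getString(1));
        }
    } catch (SQLException e) {
        e.printStackTrace();
    } finally {
        try {
            // Release resources in reverse order via the utility class
            HiveJDBCUtils.close(connection, statement, resultSet);
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}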