1. ThriftServer Introduction
- Out of the box, Hive only lets you run queries, updates and other operations through HiveQL, which is a rather clumsy, one-dimensional way to work. Fortunately, Hive also provides a thin-client implementation: through HiveServer or HiveServer2, a client can operate on the data in Hive without starting the CLI. Both allow remote clients written in a variety of programming languages, such as Java and Python, to submit requests to Hive and fetch the results back, connecting to Hive's ThriftServer over the JDBC protocol.
- Enables remote access to Hive
- Lets you connect to multiple Hive instances from the command line
2. Starting the ThriftServer
- Start Hive's ThriftServer:
#cd /soft/hive/bin/
#./hiveserver2
By default this runs in the foreground rather than as a daemon, so open another terminal for the next steps; the server listens on port 10000 by default.
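If you would rather not dedicate a terminal to the server, it can also be started in the background; a minimal sketch, with the log path chosen arbitrarily:

#nohup ./hiveserver2 > /tmp/hiveserver2.log 2>&1 &
#netstat -anp | grep 10000    # confirm the Thrift port is listening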
- Connect to the HiveServer2 server with beeline, the client-side command-line program:
#beeline
Beeline version 2.1.1 by Apache Hive
Enter the service to connect to:
beeline> !connect jdbc:hive2://localhost:10000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/soft/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000:
Enter password for jdbc:hive2://localhost:10000:
Connected to: Apache Hive (version 2.1.1)
Driver: Hive JDBC (version 2.1.1)
17/07/13 10:31:00 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000>
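The connection can also be opened in a single step by passing the JDBC URL (and, optionally, a username) on the beeline command line; a minimal sketch, with root used only as a placeholder username:

#beeline -u jdbc:hive2://localhost:10000 -n root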
3. Basic beeline Usage
- List the databases:
0: jdbc:hive2://localhost:10000> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| liuyao         |
+----------------+--+
2 rows selected (1.485 seconds)
- List the tables:
0: jdbc:hive2://localhost:10000> use liuyao;
No rows affected (0.123 seconds)
0: jdbc:hive2://localhost:10000> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| test      |
+-----------+--+
1 row selected (0.283 seconds)

Or, equivalently:

0: jdbc:hive2://localhost:10000> !tables
+------------+--------------+-------------+-------------+----------+-----------+-------------+------------+----------------------------+-----------------+--+
| TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME  | TABLE_TYPE  | REMARKS  | TYPE_CAT  | TYPE_SCHEM  | TYPE_NAME  | SELF_REFERENCING_COL_NAME  | REF_GENERATION  |
+------------+--------------+-------------+-------------+----------+-----------+-------------+------------+----------------------------+-----------------+--+
|            | liuyao       | test        | TABLE       | NULL     | NULL      | NULL        | NULL       | NULL                       | NULL            |
+------------+--------------+-------------+-------------+----------+-----------+-------------+------------+----------------------------+-----------------+--+
0: jdbc:hive2://localhost:10000>
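Besides SQL statements, beeline accepts a set of shell-style commands prefixed with !, such as the !connect and !tables used here; !help lists all of them and !quit closes the session.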
- Create a table:
CREATE TABLE emp0 (
    name  string,
    arr   ARRAY<string>,
    stru1 STRUCT<sex:string,age:int>,
    map1  MAP<string,int>,
    map2  MAP<string,ARRAY<string>>
);
- View the table structure:
0: jdbc:hive2://localhost:10000> desc emp0;
+-----------+-----------------------------+----------+--+
| col_name  | data_type                   | comment  |
+-----------+-----------------------------+----------+--+
| name      | string                      |          |
| arr       | array<string>               |          |
| stru1     | struct<sex:string,age:int>  |          |
| map1      | map<string,int>             |          |
| map2      | map<string,array<string>>   |          |
+-----------+-----------------------------+----------+--+
5 rows selected (0.296 seconds)
- Drop the table:
0: jdbc:hive2://localhost:10000> use default;
No rows affected (0.092 seconds)
0: jdbc:hive2://localhost:10000> drop table emp0;
No rows affected (1.823 seconds)
4. Demonstrating Data Import and Queries
- Generate the data:
#vim /root/hive.data
Put the following data into the file:

Michael|Montreal,Toronto|Male,30|DB:80|Product:Developer^DLead
Will|Montreal|Male,35|Perl:85|Product:Lead,Test:Lead
Shelley|New York|Female,27|Python:80|Test:Lead,COE:Architect
Lucy|Vancouver|Female,57|Sales:89,HR:94|Sales:Lead
- Create the table:
CREATE TABLE emp0 (
    name  string,
    arr   ARRAY<string>,
    stru1 STRUCT<sex:string,age:int>,
    map1  MAP<string,int>,
    map2  MAP<string,ARRAY<string>>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'            -- field delimiter
COLLECTION ITEMS TERMINATED BY ','  -- delimiter between collection elements
MAP KEYS TERMINATED BY ':'          -- delimiter between a map key and its value
LINES TERMINATED BY '\n';           -- record (line) delimiter
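With these delimiters, the first sample line Michael|Montreal,Toronto|Male,30|DB:80|Product:Developer^DLead parses into name = Michael, arr = ["Montreal","Toronto"], stru1 = {"sex":"Male","age":30}, map1 = {"DB":80} and map2 = {"Product":["Developer^DLead"]}, which the query output further below confirms.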
- Import the data:
If the data is on the local filesystem, use:

0: jdbc:hive2://localhost:10000> load data local inpath '/root/hive.data' into table emp0;
No rows affected (0.956 seconds)
0: jdbc:hive2://localhost:10000>
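If the file already sits in HDFS, drop the LOCAL keyword; Hive then moves the file into the table's warehouse directory instead of copying it. For example (the HDFS path here is only a placeholder):

0: jdbc:hive2://localhost:10000> load data inpath '/user/root/hive.data' into table emp0;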
- Query the data:
0: jdbc:hive2://localhost:10000> select * from emp0;
+------------+-------------------------+----------------------------+-----------------------+----------------------------------------+--+
| emp0.name  | emp0.arr                | emp0.stru1                 | emp0.map1             | emp0.map2                              |
+------------+-------------------------+----------------------------+-----------------------+----------------------------------------+--+
| Michael    | ["Montreal","Toronto"]  | {"sex":"Male","age":30}    | {"DB":80}             | {"Product":["Developer^DLead"]}        |
| Will       | ["Montreal"]            | {"sex":"Male","age":35}    | {"Perl":85}           | {"Product":["Lead"],"Test":["Lead"]}   |
| Shelley    | ["New York"]            | {"sex":"Female","age":27}  | {"Python":80}         | {"Test":["Lead"],"COE":["Architect"]}  |
| Lucy       | ["Vancouver"]           | {"sex":"Female","age":57}  | {"Sales":89,"HR":94}  | {"Sales":["Lead"]}                     |
|            | NULL                    | NULL                       | NULL                  | NULL                                   |
+------------+-------------------------+----------------------------+-----------------------+----------------------------------------+--+
5 rows selected (1.049 seconds)

0: jdbc:hive2://localhost:10000> select arr[0] from emp0;
+------------+--+
| c0         |
+------------+--+
| Montreal   |
| Montreal   |
| New York   |
| Vancouver  |
| NULL       |
+------------+--+
5 rows selected (0.656 seconds)

0: jdbc:hive2://localhost:10000> select stru1 from emp0;
+----------------------------+--+
| stru1                      |
+----------------------------+--+
| {"sex":"Male","age":30}    |
| {"sex":"Male","age":35}    |
| {"sex":"Female","age":27}  |
| {"sex":"Female","age":57}  |
| NULL                       |
+----------------------------+--+
5 rows selected (0.193 seconds)

0: jdbc:hive2://localhost:10000> select map1 from emp0;
+-----------------------+--+
| map1                  |
+-----------------------+--+
| {"DB":80}             |
| {"Perl":85}           |
| {"Python":80}         |
| {"Sales":89,"HR":94}  |
| NULL                  |
+-----------------------+--+
5 rows selected (0.216 seconds)

0: jdbc:hive2://localhost:10000> select map1["DB"] from emp0;
+-------+--+
| c0    |
+-------+--+
| 80    |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
+-------+--+
5 rows selected (0.249 seconds)
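Struct fields can also be addressed by name, and the array and map columns can be flattened with LATERAL VIEW explode(). A few extra queries one might try against the same table, as standard HiveQL (the output was not captured in the original session):

-- access struct fields by name
select name, stru1.sex, stru1.age from emp0;

-- one output row per array element
select name, city from emp0 lateral view explode(arr) t as city;

-- one output row per (key, value) pair of the map
select name, skill, score from emp0 lateral view explode(map1) t as skill, score;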
5. Connecting to the ThriftServer Programmatically via the JDBC API
import org.junit.Test;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TestCURD {

    @Test
    public void select() throws Exception {
        // Hive JDBC driver class and the HiveServer2 URL (host:port/database)
        String driverClass = "org.apache.hive.jdbc.HiveDriver";
        String url = "jdbc:hive2://192.168.10.145:10000/liuyao";

        // register the driver and open a connection to HiveServer2
        Class.forName(driverClass);
        Connection connection = DriverManager.getConnection(url);
        System.out.println(connection);

        // run a query and walk the result set
        Statement statement = connection.createStatement();
        ResultSet rs = statement.executeQuery("SELECT * FROM emp0");
        while (rs.next()) {
            // emp0's first column is the string column name;
            // complex columns (array/struct/map) come back as strings
            String name = rs.getString(1);
            String arr = rs.getString(2);
            System.out.println(name + "," + arr);
        }

        rs.close();
        statement.close();
        connection.close();
    }
}
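To compile and run this test, the Hive JDBC driver must be on the classpath (hive-jdbc 2.1.1 here, matching the server, plus its transitive Hadoop/Thrift dependencies), along with JUnit for the @Test annotation. No username or password is passed to DriverManager.getConnection because, as in the beeline session above, this HiveServer2 instance accepts empty credentials; for a secured cluster you would use the getConnection(url, user, password) overload instead.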