  • Hive-1.2.1_02_Basic operations and access methods

    1. Showing the current database in the Hive prompt

     1. Set it on the fly in the CLI when needed. This takes effect immediately, but only for the current session; the setting is lost once the CLI is reopened:
     hive> set hive.cli.print.current.db=true;
     
     2. Configure once, effective permanently for the user: create a .hiverc file in the current user's HOME directory and put the set command in it; the file is executed every time the CLI starts.
     [yun@mini01 ~]$ pwd
     /app 
     [yun@mini01 ~]$ cat .hiverc 
     set hive.cli.print.current.db=true;
     
     3. Add the property to the Hive configuration file [recommended]. This entry is already present in the configuration from the previous article:
       <!-- Show the current database in the prompt -->
       <property>
         <name>hive.cli.print.current.db</name>
         <value>true</value>
         <description>Whether to include the current database in the Hive prompt.</description>
       </property>

    2. Creating a database

     # The current database is not shown in the prompt yet
     [yun@mini01 ~]$ hive
     
     Logging initialized using configuration in jar:file:/app/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties
     hive> show databases;  # the default database is "default"
     OK
     default
     Time taken: 0.774 seconds, Fetched: 1 row(s)
     # create a database
     hive> create database zhang;
     OK
     Time taken: 0.168 seconds
     hive> show databases;  
     OK
     default
     zhang
     Time taken: 0.02 seconds, Fetched: 2 row(s)

    Browser view (screenshot not reproduced here)

    3. Creating a table

     # The current database is now shown in the prompt
     hive (default)> show databases;
     OK
     default
     zhang
     Time taken: 0.729 seconds, Fetched: 2 row(s)
     hive (default)> use zhang;
     OK
     Time taken: 0.036 seconds
     hive (zhang)> create table t_sz01(id int, name string)
                 > row format delimited
                 > fields terminated by ',';
     OK
     Time taken: 0.187 seconds
     hive (zhang)> show tables;
     OK
     t_sz01
     Time taken: 0.031 seconds, Fetched: 1 row(s)

    Browser view (screenshot not reproduced here)

    4. Creating data and uploading it to HDFS

     [yun@mini01 hive]$ cat sz.dat 
     1,zhangsan
     5,李四
     3,wangwu
     2,赵六
     4,zhouqi
     6,孙八
     [yun@mini01 hive]$ hadoop fs -put sz.dat /user/hive/warehouse/zhang.db/t_sz01   # upload 
     [yun@mini01 hive]$ hadoop fs -ls /user/hive/warehouse/zhang.db/t_sz01/
     Found 1 items
     -rw-r--r--   2 yun supergroup         56 2018-07-03 21:27 /user/hive/warehouse/zhang.db/t_sz01/sz.dat
     [yun@mini01 hive]$ hadoop fs -cat /user/hive/warehouse/zhang.db/t_sz01/sz.dat
     1,zhangsan
     5,李四
     3,wangwu
     2,赵六
     4,zhouqi
     6,孙八
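
    Note that `hadoop fs -put` copies the file as-is; Hive does no parsing at load time, and the table's `fields terminated by ','` clause only takes effect when the data is read. As a rough illustration of that read-time split (plain Python, not Hive internals; the `lines` list simply mirrors sz.dat above):

    ```python
    # Sketch of how "fields terminated by ','" maps each line of sz.dat
    # onto the (id int, name string) columns. Illustration only.
    lines = [
        "1,zhangsan",
        "5,李四",
        "3,wangwu",
        "2,赵六",
        "4,zhouqi",
        "6,孙八",
    ]

    rows = []
    for line in lines:
        id_str, name = line.split(",", 1)  # split on the first delimiter
        rows.append((int(id_str), name))   # id is cast to int, name stays a string

    print(rows[:2])  # [(1, 'zhangsan'), (5, '李四')]
    ```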

    5. Querying data in Hive

     hive (zhang)> show tables;
     OK
     t_sz01
     Time taken: 0.028 seconds, Fetched: 1 row(s)
     hive (zhang)> select * from t_sz01;  # full-table query 
     OK
     1    zhangsan
     5    李四
     3    wangwu
     2    赵六
     4    zhouqi
     6    孙八
     Time taken: 0.264 seconds, Fetched: 6 row(s)
     hive (zhang)> select count(1) from t_sz01;  # row count 
     Query ID = yun_20180703213443_ebca743c-2025-405a-8832-59359e1566c2
     Total jobs = 1
     Launching Job 1 out of 1
     Number of reduce tasks determined at compile time: 1
     In order to change the average load for a reducer (in bytes):
       set hive.exec.reducers.bytes.per.reducer=<number>
     In order to limit the maximum number of reducers:
       set hive.exec.reducers.max=<number>
     In order to set a constant number of reducers:
       set mapreduce.job.reduces=<number>
     Starting Job = job_1530619476293_0001, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0001/
     Kill Command = /app/hadoop/bin/hadoop job  -kill job_1530619476293_0001
     Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
     2018-07-03 21:34:53,810 Stage-1 map = 0%,  reduce = 0%
     2018-07-03 21:35:00,224 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.5 sec
     2018-07-03 21:35:07,882 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.37 sec
     MapReduce Total cumulative CPU time: 6 seconds 370 msec
     Ended Job = job_1530619476293_0001
     MapReduce Jobs Launched: 
     Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 6.37 sec   HDFS Read: 6513 HDFS Write: 2 SUCCESS
     Total MapReduce CPU Time Spent: 6 seconds 370 msec
     OK
     6
     Time taken: 25.312 seconds, Fetched: 1 row(s)
     hive (zhang)> select id,name from t_sz01 where id >3;  # rows where id > 3 
     OK
     5    李四
     4    zhouqi
     6    孙八
     Time taken: 0.126 seconds, Fetched: 3 row(s)
     hive (zhang)> select id,name from t_sz01 where id >3 limit 2;  # LIMIT m,n is not supported; only LIMIT n 
     OK
     5    李四
     4    zhouqi
     Time taken: 0.072 seconds, Fetched: 2 row(s)
     hive (zhang)> select id, name from t_sz01 order by name;  # sort with ORDER BY
     Query ID = yun_20180703214314_db222afe-3287-4c8e-8077-73aa4fec62ef
     Total jobs = 1
     Launching Job 1 out of 1
     Number of reduce tasks determined at compile time: 1
     In order to change the average load for a reducer (in bytes):
       set hive.exec.reducers.bytes.per.reducer=<number>
     In order to limit the maximum number of reducers:
       set hive.exec.reducers.max=<number>
     In order to set a constant number of reducers:
       set mapreduce.job.reduces=<number>
     Starting Job = job_1530619476293_0002, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0002/
     Kill Command = /app/hadoop/bin/hadoop job  -kill job_1530619476293_0002
     Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
     2018-07-03 21:43:25,676 Stage-1 map = 0%,  reduce = 0%
     2018-07-03 21:43:34,166 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.64 sec
     2018-07-03 21:43:40,606 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.85 sec
     MapReduce Total cumulative CPU time: 4 seconds 850 msec
     Ended Job = job_1530619476293_0002
     MapReduce Jobs Launched: 
     Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 4.85 sec   HDFS Read: 5789 HDFS Write: 74 SUCCESS
     Total MapReduce CPU Time Spent: 4 seconds 850 msec
     OK
     3    wangwu
     1    zhangsan
     4    zhouqi
     6    孙八
     5    李四
     2    赵六
     Time taken: 26.768 seconds, Fetched: 6 row(s)
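
    Note the order of the ORDER BY result: the ASCII names come first, then the Chinese names. Hive compares STRING values character by character, which for this data matches a plain code-point sort; a quick Python check (an illustration, not Hive code) reproduces the same sequence:

    ```python
    # ORDER BY on a STRING column compares characters, so ASCII names
    # sort before CJK names. Python's default code-point sort gives the
    # same order as the Hive output above.
    names = ["zhangsan", "李四", "wangwu", "赵六", "zhouqi", "孙八"]
    ordered = sorted(names)
    print(ordered)
    # ['wangwu', 'zhangsan', 'zhouqi', '孙八', '李四', '赵六']
    ```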

    MapReduce job information can be viewed in the ResourceManager web UI:

    http://mini02:8088    

    6. Ways to access Hive

    6.1. Hive interactive shell

    # Environment variables were configured earlier
    [yun@mini01 ~]$ hive

    6.2. Hive Thrift service (HiveServer2)

      How to start it (for example, on mini01):

    # Environment variables were configured earlier
    Start in the foreground: hiveserver2
    Start in the background: nohup hiveserver2 1>/app/hive/logs/hiveserver.log 2>/app/hive/logs/hiveserver.err &
    # Create the /app/hive/logs directory first if it does not exist

      Once it has started, you can connect from another node with beeline.

    Method 1

     # beeline is not installed on any other machine, so connect from the local machine
     [yun@mini01 bin]$ beeline 
     Beeline version 1.2.1 by Apache Hive
     beeline> !connect jdbc:hive2://mini01:10000   # JDBC connection; the host can be mini01, 127.0.0.1, 10.0.0.11, or 172.16.1.11 
     Connecting to jdbc:hive2://mini01:10000
     Enter username for jdbc:hive2://mini01:10000: yun
     Enter password for jdbc:hive2://mini01:10000: 
     Connected to: Apache Hive (version 1.2.1)
     Driver: Hive JDBC (version 1.2.1)
     Transaction isolation: TRANSACTION_REPEATABLE_READ
     0: jdbc:hive2://mini01:10000>

    Method 2

    # Or connect in one step at launch:
    [yun@mini01 ~]$ beeline -u jdbc:hive2://mini01:10000 -n yun 

    From here you can run normal SQL queries.

      For example:

     0: jdbc:hive2://mini01:10000> show databases;
     +----------------+--+
     | database_name  |
     +----------------+--+
     | default        |
     | zhang          |
     +----------------+--+
     2 rows selected (0.437 seconds)
     0: jdbc:hive2://mini01:10000> use zhang;
     No rows affected (0.058 seconds)
     0: jdbc:hive2://mini01:10000> show tables;
     +-----------+--+
     | tab_name  |
     +-----------+--+
     | t_sz01    |
     +-----------+--+
     1 row selected (0.054 seconds)
     0: jdbc:hive2://mini01:10000> select * from t_sz01;
     +------------+--------------+--+
     | t_sz01.id  | t_sz01.name  |
     +------------+--------------+--+
     | 1          | zhangsan     |
     | 5          | 李四         |
     | 3          | wangwu       |
     | 2          | 赵六         |
     | 4          | zhouqi       |
     | 6          | 孙八         |
     +------------+--------------+--+
     6 rows selected (0.641 seconds)
     0: jdbc:hive2://10.0.0.11:10000> select count(1) from t_sz01;  # row count 
     INFO  : Number of reduce tasks determined at compile time: 1
     INFO  : In order to change the average load for a reducer (in bytes):
     INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
     INFO  : In order to limit the maximum number of reducers:
     INFO  :   set hive.exec.reducers.max=<number>
     INFO  : In order to set a constant number of reducers:
     INFO  :   set mapreduce.job.reduces=<number>
     INFO  : number of splits:1
     INFO  : Submitting tokens for job: job_1530619476293_0003
     INFO  : The url to track the job: http://mini02:8088/proxy/application_1530619476293_0003/
     INFO  : Starting Job = job_1530619476293_0003, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0003/
     INFO  : Kill Command = /app/hadoop/bin/hadoop job  -kill job_1530619476293_0003
     INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
     INFO  : 2018-07-03 22:58:43,405 Stage-1 map = 0%,  reduce = 0%
     INFO  : 2018-07-03 22:58:49,882 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.56 sec
     INFO  : 2018-07-03 22:58:57,815 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 5.28 sec
     INFO  : MapReduce Total cumulative CPU time: 5 seconds 280 msec
     INFO  : Ended Job = job_1530619476293_0003
     +------+--+
     | _c0  |
     +------+--+
     | 6    |
     +------+--+
     1 row selected (25.433 seconds)

    6.3. hive -e "HiveQL"

          Suitable for use in scripts.

     [yun@mini01 ~]$ hive -e "use exercise; select * from student;"
     
     Logging initialized using configuration in jar:file:/app/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties
     OK
     Time taken: 1.109 seconds
     OK
     95001    李勇    男    20    CS
     95002    刘晨    女    19    IS
     95003    王敏    女    22    MA
     95004    张立    男    19    IS
     95005    刘刚    男    18    MA
     95006    孙庆    男    23    CS
     95007    易思玲    女    19    MA
     95008    李娜    女    18    CS
     95009    梦圆圆    女    18    MA
     95010    孔小涛    男    19    CS
     Time taken: 0.786 seconds, Fetched: 10 row(s)

    6.4. hive -f 'test.sql'

      Suitable for having Hive execute a script file that consists entirely of HiveQL statements.

    # aztest.sql  (script name)
    CREATE DATABASE IF NOT EXISTS azkaban;
    use azkaban;
    DROP TABLE IF EXISTS aztest;
    DROP TABLE IF EXISTS azres;
    create table aztest(id int,name string) row format delimited fields terminated by ',';
    load data inpath '/aztest/hiveinput/azkb.dat' into table aztest;
    create table azres row format delimited fields terminated by '#' as select * from aztest;
    insert overwrite directory '/aztest/hiveoutput' select count(1) from aztest; 
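
    Such a script is then run in one shot with `hive -f aztest.sql`. As a small sketch (plain Python; the `hive` command itself needs a configured cluster, so it is only shown as a string), generating the script file programmatically might look like:

    ```python
    from pathlib import Path

    # Write a HiveQL batch script to disk; it would then be executed with
    #   hive -f aztest.sql
    # on a node where Hive is installed and configured. The statements are
    # a subset of the aztest.sql script above.
    statements = [
        "CREATE DATABASE IF NOT EXISTS azkaban;",
        "use azkaban;",
        "DROP TABLE IF EXISTS aztest;",
        "create table aztest(id int, name string) "
        "row format delimited fields terminated by ',';",
    ]
    script = Path("aztest.sql")
    script.write_text("\n".join(statements) + "\n", encoding="utf-8")

    print("hive -f " + script.name)  # the command to run against the cluster
    ```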

    7. References

    1. Hive table creation fails with "Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask"

    2. Hands-on: showing the current database in the Hive CLI by default

  • Original article: https://www.cnblogs.com/zhanglianghhh/p/9348958.html