1. Extract apache-hive-1.2.1-bin.tar.gz into the /opt/module/ directory
tar -zxvf apache-hive-1.2.1-bin.tar.gz -C /opt/module/
2. Rename the extracted apache-hive-1.2.1-bin/ directory to hive
[root@hadoop01 module]# mv apache-hive-1.2.1-bin/ hive
3. In the /opt/module/hive/conf directory, rename hive-env.sh.template to hive-env.sh
mv hive-env.sh.template hive-env.sh
4. Configure the hive-env.sh file:
export HADOOP_HOME=/software/hadoop-2.7.1
export HIVE_CONF_DIR=/opt/module/hive/conf
5. HDFS and YARN must be running:
[atguigu@hadoop102 hadoop-2.7.2]$ sbin/start-dfs.sh
[atguigu@hadoop103 hadoop-2.7.2]$ sbin/start-yarn.sh
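Before continuing, a quick sanity check with jps; which daemons appear on which host depends on your cluster layout, so the comment below is only indicative:
[atguigu@hadoop102 hadoop-2.7.2]$ jps
# expect NameNode/DataNode processes after start-dfs.sh,
# and ResourceManager/NodeManager after start-yarn.sh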
6. Create the /tmp and /user/hive/warehouse directories on HDFS and make them group-writable
bin/hadoop fs -mkdir /tmp
bin/hadoop fs -mkdir -p /user/hive/warehouse
7. Change the directory permissions:
bin/hadoop fs -chmod g+w /tmp
bin/hadoop fs -chmod g+w /user/hive/warehouse
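To confirm the permission change, list just the two directory entries (the -d flag of the Hadoop 2.x ls shows the directory itself rather than its contents); both should now show a group-writable mode such as drwxrwxr-x:
bin/hadoop fs -ls -d /tmp /user/hive/warehouse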
8. Basic Hive operations
Start hive:
[atguigu@hadoop102 hive]$ bin/hive
List the databases:
hive> show databases;
Switch to the default database:
hive> use default;
Show the tables in the default database:
hive> show tables;
Create a table:
hive> create table student(id int, name string);
Show the tables in the database again:
hive> show tables;
Describe the table's structure:
hive> desc student;
Insert data into the table. Each insert runs as a full MapReduce job, which is why it is slow:
hive> insert into student values(1000,"ss");
hive> insert into student values(100,"ss");
Query ID = root_20200426160909_831fc215-2543-45f7-93a2-4867f3320f56
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1587886069426_0001, Tracking URL = http://hadoop02:8088/proxy/application_1587886069426_0001/
Kill Command = /software/hadoop-2.7.1/bin/hadoop job -kill job_1587886069426_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-04-26 16:09:16,861 Stage-1 map = 0%, reduce = 0%
2020-04-26 16:09:23,129 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.43 sec
MapReduce Total cumulative CPU time: 1 seconds 430 msec
Ended Job = job_1587886069426_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://hadoop01:9000/user/hive/warehouse/student/.hive-staging_hive_2020-04-26_16-09-09_979_3268714695468457370-1/-ext-10000
Loading data to table default.student
Table default.student stats: [numFiles=1, numRows=1, totalSize=7, rawDataSize=6]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 1.43 sec   HDFS Read: 3639 HDFS Write: 78 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 430 msec
OK
Time taken: 14.413 seconds
Query the table's data:
hive> select * from student;
Exit hive:
hive> quit;
Example: importing a local file into Hive
Load the data in the local file /opt/module/data/student.txt into Hive's student(id int, name string) table.
1. Data preparation
Prepare the data under /opt/module/data. First create the data directory under /opt/module/:
[atguigu@hadoop102 module]$ mkdir data
Create a student.txt file in the /opt/module/data/ directory and add the data:
[atguigu@hadoop102 data]$ touch student.txt
[atguigu@hadoop102 data]$ vi student.txt
1001 zhangshan
1002 lishi
1003 zhaoliu
Note: the fields are separated by a tab character.
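Since invisible tabs are easy to get wrong in vi, here is an equivalent way to create the file with guaranteed literal tab characters (same path and data as above):
printf '1001\tzhangshan\n1002\tlishi\n1003\tzhaoliu\n' > /opt/module/data/student.txt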
2. Hands-on Hive operations
(1) Start hive:
[atguigu@hadoop102 hive]$ bin/hive
(2) List the databases:
hive> show databases;
(3) Use the default database:
hive> use default;
(4) Show the tables in the default database:
hive> show tables;
(5) Drop the previously created student table:
hive> drop table student;
(6) Create the student table, declaring '\t' as the field delimiter:
hive> create table student(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
(7) Load the /opt/module/data/student.txt file into the student table:
hive> load data local inpath '/opt/module/data/student.txt' into table student;
(8) Query the result in Hive:
hive> select * from student;
OK
1001 zhangshan
1002 lishi
1003 zhaoliu
Time taken: 0.266 seconds, Fetched: 3 row(s)
Create another txt file:
[root@hadoop01 data]# cp student.txt student1.txt
[root@hadoop01 data]# vim student1.txt
After adding three more rows (1004 to 1006, as seen in the query below) in vim, add the file with the hadoop fs command. This assumes the stu table already exists with the same schema; its warehouse directory is /user/hive/warehouse/stu:
[root@hadoop01 data]# hadoop fs -put student1.txt /user/hive/warehouse/stu
Check with select:
hive> select * from stu;
OK
1001	zhangshan
1002	lishi
1003	zhaoliu
1004	zhangshan33
1005	lishi4444
1006	zhaoliu5555
Time taken: 0.043 seconds, Fetched: 6 row(s)
In other words,
load data local inpath '/opt/module/data/student.txt' into table student;
and
hadoop fs -put student1.txt /user/hive/warehouse/stu
do the same thing: both add data files to a table. The insert statement only adds a few rows at a time, so it is not recommended.
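Both approaches can be confirmed from inside hive: the dfs command there mirrors hadoop fs, and the table directory now holds the files added by either method:
hive> dfs -ls /user/hive/warehouse/stu;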
Create one more file and upload it to the HDFS root directory:
cp student1.txt student2.txt
hadoop fs -put student2.txt /
Then load it. This load data inpath (without local) is effectively a move: student2.txt disappears from the HDFS root directory and ends up in the table directory.
hive> load data inpath '/student2.txt' into table stu;
Loading data to table default.stu
Table default.stu stats: [numFiles=3, totalSize=137]
Check:
hive> select * from stu;
OK
1001	zhangshan
1002	lishi
1003	zhaoliu
1004	zhangshan33
1005	lishi4444
1006	zhaoliu5555
1004	zhangshan33
1005	lishi4444
1006	zhaoliu5555
Time taken: 0.31 seconds, Fetched: 9 row(s)
Because the Metastore is stored by default in the bundled Derby database, only one hive connection can be open at a time; storing the Metastore in MySQL is recommended instead.
2.4 MySQL installation
[root@hadoop01 data]# rpm -qa |grep mysql
Uninstall the old packages:
[root@hadoop102 Desktop]# rpm -e --nodeps mysql-libs-5.1.73-7.el6.x86_64
(--nodeps removes the package while ignoring dependency checks)
Install the server. The packages on hand:
[root@hadoop01 mysql]# ll
total 198612
-rw-r--r--. 1 root root  25381952 Apr 26 17:20 mysql-community-client-5.7.26-1.el7.x86_64.rpm
-rw-r--r--. 1 root root 173541272 Apr 26 17:20 mysql-community-server-5.7.26-1.el7.x86_64.rpm
-rw-r--r--. 1 root root   4452049 Apr 24 15:46 mysql-connector-java-5.1.47.tar.gz
Install:
rpm -ivh mysql-community-server-5.7.26-1.el7.x86_64.rpm --nodeps --force
Check the temporary password:
cat /root/.mysql_secret
Alternatively, skip the password check:
vim /etc/my.cnf
Append skip-grant-tables at the end of /etc/my.cnf (this disables privilege checks), save the file, and restart MySQL.
Then log in without a password:
mysql -uroot -p
Setting the password (the important part):
You must select the mysql database with use mysql first, otherwise the update fails, as shown here:
MySQL [(none)]> update user set authentication_string=passworD("root") where user='root';
ERROR 1046 (3D000): No database selected
After use mysql, it succeeds:
MySQL [mysql]> update user set password=passworD("root") where user='root';
Query OK, 4 rows affected, 1 warning (0.01 sec)
flush privileges;
Then edit /etc/my.cnf again and remove the skip-grant-tables line (restart MySQL for the change to take effect).
Install the client:
[root@hadoop01 mysql]# rpm -ivh mysql-community-client-5.7.26-1.el7.x86_64.rpm --nodeps --force
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hue                |
| metastore          |
| mysql              |
| nav_as             |
| nav_ms             |
| oozie              |
| performance_schema |
| sentry             |
+--------------------+
9 rows in set (0.00 sec)
mysql> select user,host,password from user;
+------+-----------+-------------------------------------------+
| user | host      | password                                  |
+------+-----------+-------------------------------------------+
| root | localhost | *81F5E21E35407D884A6CD4A731AEBFB6AF209E1B |
| root | hadoop01  | *81F5E21E35407D884A6CD4A731AEBFB6AF209E1B |
| root | 127.0.0.1 | *81F5E21E35407D884A6CD4A731AEBFB6AF209E1B |
| root | ::1       | *81F5E21E35407D884A6CD4A731AEBFB6AF209E1B |
| hive | %         | *2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19 |
+------+-----------+-------------------------------------------+
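The listing above also shows a hive user. For reference, a minimal sketch of how such a user could be created; the password and grant scope here are assumptions, not taken from this walkthrough:
mysql> CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';   -- hypothetical password
mysql> GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
mysql> FLUSH PRIVILEGES;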
Modify the user table, changing root's Host value to %:
mysql>update user set host='%' where host='localhost';
Delete the root user's other Host entries:
delete from user where Host='hadoop01';
delete from user where Host='127.0.0.1';
delete from user where Host='::1';
mysql> flush privileges;
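After the cleanup, only the '%' rows should remain. A quick check:
mysql> select user,host from user;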
Install the MySQL Connector/J driver:
tar -zxvf mysql-connector-java-5.1.47.tar.gz
Copy mysql-connector-java-5.1.27-bin.jar to /opt/module/hive/lib/:
[root@hadoop102 mysql-connector-java-5.1.27]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/module/hive/lib
(Note: the tarball listed above is version 5.1.47; whichever version you extracted, copy its jar into hive's lib directory.)
Configuring the Metastore to use MySQL
1. Create a hive-site.xml in the /opt/module/hive/conf directory:
[atguigu@hadoop102 conf]$ touch hive-site.xml
[atguigu@hadoop102 conf]$ vi hive-site.xml
2. Configure the parameters according to the official documentation and copy them into hive-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop01:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
3. After configuring, if hive fails to start, try restarting the virtual machine. (After rebooting, don't forget to start the hadoop cluster.)
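Optionally, the metastore schema can be initialized up front with the schematool shipped in hive's bin directory (it appears in the ls output later in these notes); without this step, Hive 1.2.1 auto-creates the schema on first start:
bin/schematool -dbType mysql -initSchema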
2.5.3 Testing Hive from multiple windows
1. First log in to MySQL:
[atguigu@hadoop102 mysql-libs]$ mysql -uroot -proot
List the databases:
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hue |
| metastore |
| mysql |
| nav_as |
| nav_ms |
| oozie |
| performance_schema |
| sentry |
+--------------------+
9 rows in set (0.00 sec)
2. Next, open several windows and start hive in each:
[atguigu@hadoop102 hive]$ bin/hive
3. After hive starts, return to the MySQL window and list the databases again: a metastore database has been added.
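The metastore database now holds Hive's metadata tables; for example, registered table names can be read straight from MySQL (DBS and TBLS are standard metastore tables):
mysql> use metastore;
mysql> select TBL_NAME from TBLS;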
2.6 Hive JDBC access
2.6.1 Start the hiveserver2 service
[atguigu@hadoop102 hive]$ bin/hiveserver2
Start beeline:
[root@hadoop01 hive]# cd bin/
[root@hadoop01 bin]# ls
beeline  derby.log  ext  hive  hive-config.sh  hiveserver2  metastore_db  metatool  schematool
[root@hadoop01 bin]# beeline
Beeline version 1.2.1 by Apache Hive
beeline>
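From the beeline prompt, connect to hiveserver2 over JDBC. Port 10000 is hiveserver2's default; with hiveserver2's default NONE authentication the username is not actually verified:
beeline> !connect jdbc:hive2://hadoop01:10000
beeline then prompts for a username and password, after which regular HiveQL (show databases; and so on) works from the JDBC session.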
2.9.1 Configuring the Hive data warehouse location
1) The original default location of the data warehouse is /user/hive/warehouse on HDFS.
2) No folder is created under the warehouse directory for the default database itself; a table in the default database gets its folder directly under the warehouse directory.
3) To change the default warehouse location, copy the following configuration from hive-default.xml.template into hive-site.xml:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
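If the value is changed, for example to a hypothetical /user/hive/warehouse2, remember to create the directory and make it group-writable, as in steps 6 and 7 at the top of these notes:
bin/hadoop fs -mkdir -p /user/hive/warehouse2
bin/hadoop fs -chmod g+w /user/hive/warehouse2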
2.9.3 Configuring Hive's run logs
1. By default, hive's log is written to /tmp/atguigu/hive.log (the directory under /tmp is named after the current user)
2. Change hive's log location to /opt/module/hive/logs
(1) Rename /opt/module/hive/conf/hive-log4j.properties.template to hive-log4j.properties
[atguigu@hadoop102 conf]$ pwd
/opt/module/hive/conf
[atguigu@hadoop102 conf]$ mv hive-log4j.properties.template hive-log4j.properties
(2) Change the log location in the hive-log4j.properties file:
hive.log.dir=/opt/module/hive/logs
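After restarting hive, the new location can be verified by tailing the log:
tail -f /opt/module/hive/logs/hive.log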
Viewing warehouse directories and tables from inside hive:
hive> dfs -ls /user/hive/warehouse/
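Besides dfs, the hive CLI can also run local shell commands with a leading !; for example, to list the local data directory used earlier:
hive> !ls /opt/module/data;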