Environment:
hadoop 3.1.1
hive 3.1.0
mysql 8.0.11
Before installing:
Have the mysql-connector-java-8.0.12.jar driver package ready
Upload the Hive tar package and unpack it
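The unpack step can be sketched as a small shell function. This is only an illustration: unpack_hive is a hypothetical helper name, and the tarball name apache-hive-3.1.0-bin.tar.gz and the /opt/module target in the example call are assumptions based on the paths used later in this post.

```shell
# Sketch: unpack a Hive tarball and rename the result to "hive".
#   $1 = path to the tarball (assumed to end in .tar.gz)
#   $2 = existing directory to unpack into
unpack_hive() {
    tarball=$1
    target_dir=$2

    tar -xzf "$tarball" -C "$target_dir" || return 1

    # The archive is assumed to unpack into a directory named after the
    # tarball; rename it so the paths match the rest of this post.
    mv "$target_dir/$(basename "$tarball" .tar.gz)" "$target_dir/hive"
}

# Example call (assumed file name and target):
# unpack_hive apache-hive-3.1.0-bin.tar.gz /opt/module
```

After the example call, the installation would live in /opt/module/hive.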
Step 1:
Go into hive/conf, copy hive-env.sh.template to hive-env.sh, and edit lines 48, 51, and 54 (HADOOP_HOME, HIVE_CONF_DIR, and HIVE_AUX_JARS_PATH)
1 # Licensed to the Apache Software Foundation (ASF) under one
2 # or more contributor license agreements. See the NOTICE file
3 # distributed with this work for additional information
4 # regarding copyright ownership. The ASF licenses this file
5 # to you under the Apache License, Version 2.0 (the
6 # "License"); you may not use this file except in compliance
7 # with the License. You may obtain a copy of the License at
8 #
9 # http://www.apache.org/licenses/LICENSE-2.0
10 #
11 # Unless required by applicable law or agreed to in writing, software
12 # distributed under the License is distributed on an "AS IS" BASIS,
13 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14 # See the License for the specific language governing permissions and
15 # limitations under the License.
16
17 # Set Hive and Hadoop environment variables here. These variables can be used
18 # to control the execution of Hive. It should be used by admins to configure
19 # the Hive installation (so that users do not have to set environment variables
20 # or set command line parameters to get correct behavior).
21 #
22 # The hive service being invoked (CLI etc.) is available via the environment
23 # variable SERVICE
24
25
26 # Hive Client memory usage can be an issue if a large number of clients
27 # are running at the same time. The flags below have been useful in
28 # reducing memory usage:
29 #
30 # if [ "$SERVICE" = "cli" ]; then
31 # if [ -z "$DEBUG" ]; then
32 # export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit"
33 # else
34 # export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
35 # fi
36 # fi
37
38 # The heap size of the jvm stared by hive shell script can be controlled via:
39 #
40 # export HADOOP_HEAPSIZE=1024
41 #
42 # Larger heap size may be required when running queries over large number of files or partitions.
43 # By default hive shell scripts use a heap size of 256 (MB). Larger heap size would also be
44 # appropriate for hive server.
45
46
47 # Set HADOOP_HOME to point to a specific hadoop install directory
48 export HADOOP_HOME=/opt/module/hadoop-3.1.1
49
50 # Hive Configuration Directory can be controlled by:
51 export HIVE_CONF_DIR=/opt/module/hive/conf
52
53 # Folder containing extra libraries required for hive compilation/execution can be controlled by:
54 export HIVE_AUX_JARS_PATH=/opt/module/hive/lib
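Instead of editing fixed line numbers by hand, the same three settings can be appended to a copy of the template. The following is a minimal sketch; make_hive_env is a hypothetical helper name, and the paths in the example call are the ones used in this post.

```shell
# Sketch: create hive-env.sh from the template and append the three settings.
#   $1 = Hive install directory
#   $2 = Hadoop install directory
make_hive_env() {
    hive_home=$1
    hadoop_home=$2
    conf_dir="$hive_home/conf"

    # Start from the shipped template so its comments are preserved.
    cp "$conf_dir/hive-env.sh.template" "$conf_dir/hive-env.sh" || return 1

    # Append the settings at the end of the copied file.
    {
        printf 'export HADOOP_HOME=%s\n' "$hadoop_home"
        printf 'export HIVE_CONF_DIR=%s\n' "$conf_dir"
        printf 'export HIVE_AUX_JARS_PATH=%s\n' "$hive_home/lib"
    } >> "$conf_dir/hive-env.sh"
}

# Example call with the paths from this post:
# make_hive_env /opt/module/hive /opt/module/hadoop-3.1.1
```

Appending keeps the result independent of how many lines the template has in a given Hive release.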
Step 2: Copy hive-default.xml.template to hive-site.xml. It mainly holds the database connection details, including the username and password. Note that hive.metastore.schema.verification is set to false
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive?useUnicode=true&amp;characterEncoding=utf8&amp;useSSL=false&amp;serverTimezone=GMT</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
  </property>
</configuration>
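Because hive-site.xml is XML, each & in the JDBC URL has to be written as &amp; inside the value element. A small sketch of that escaping follows; xml_escape_amp is a hypothetical helper name, not something shipped with Hive or MySQL.

```shell
# Sketch: escape '&' so a JDBC URL can be pasted into an XML <value> element.
xml_escape_amp() {
    # In a sed replacement a bare '&' means "the matched text", so it is
    # written as '\&' to emit a literal ampersand.
    printf '%s\n' "$1" | sed 's/&/\&amp;/g'
}

url='jdbc:mysql://127.0.0.1:3306/hive?useUnicode=true&characterEncoding=utf8&useSSL=false&serverTimezone=GMT'
xml_escape_amp "$url"
```

The printed line is the form to paste into javax.jdo.option.ConnectionURL.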
Step 3: Upload the MySQL driver package to the hive/lib directory
Step 4: Create the hive database in MySQL
create database hive;
Step 5: Go into the bin directory and run the following (it selects the metastore database type and initializes the schema)
./schematool -dbType mysql -initSchema
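Steps 3 to 5 can be sketched as a single function. This is an illustration under assumptions: init_metastore is a hypothetical name, HIVE_HOME defaults to the /opt/module/hive path used in this post, the location of DRIVER_JAR is a guess, and the closing -info call is an extra check that is not part of the original steps.

```shell
# Sketch of steps 3 to 5. Override HIVE_HOME / DRIVER_JAR to match your layout.
HIVE_HOME=${HIVE_HOME:-/opt/module/hive}
DRIVER_JAR=${DRIVER_JAR:-./mysql-connector-java-8.0.12.jar}
# Make schematool callable by name.
export PATH="$HIVE_HOME/bin:$PATH"

init_metastore() {
    # Step 3: put the JDBC driver where Hive loads its libraries from.
    cp "$DRIVER_JAR" "$HIVE_HOME/lib/" || return 1

    # Step 4: create the metastore database (-p prompts for the password).
    mysql -u root -p -e 'CREATE DATABASE IF NOT EXISTS hive;' || return 1

    # Step 5: create the metastore tables, then print the schema version.
    schematool -dbType mysql -initSchema || return 1
    schematool -dbType mysql -info
}

# Run with: init_metastore
```

Stopping at the first failing command makes it easier to see which step went wrong.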
Step 6: Start Hive (Hadoop must be started first)
./hive
Once it has started, run show databases;
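The same check can be run without opening the interactive shell. This is a rough sketch: check_hive is a hypothetical name, it assumes hive is on PATH, and it assumes the CLI prints one database name per line on standard output, with the built-in default database among them.

```shell
# Sketch: non-interactive smoke test for a fresh Hive install.
check_hive() {
    # hive -e runs the quoted statement and exits; succeed only if
    # a line consisting of exactly "default" appears in the output.
    hive -e 'show databases;' | grep -qx 'default'
}

# Run with: check_hive && echo "hive is answering queries"
```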
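Notes:
1. In hive-site.xml the MySQL driver class name is com.mysql.cj.jdbc.Driver
2. In the XML file, every & in javax.jdo.option.ConnectionURL must be written as &amp;, and the character set and time zone must be specified
3. I had already granted privileges to the database root user beforehand
4. To test inserting data into Hive, first create the /user/hive/warehouse path on HDFS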
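Creating the warehouse path from note 4 can be sketched as follows. make_warehouse is a hypothetical name, hdfs is assumed to be on PATH with HDFS already running, and the chmod g+w step is an addition following the Hive getting-started guide rather than something this post's steps include.

```shell
# Sketch: create the Hive warehouse directory on HDFS.
make_warehouse() {
    # -p also creates the missing parent directories /user and /user/hive.
    hdfs dfs -mkdir -p /user/hive/warehouse || return 1

    # Let members of the owning group write table data.
    hdfs dfs -chmod g+w /user/hive/warehouse
}

# Run with: make_warehouse
```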
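Common problems:
1. If ./schematool -dbType mysql -initSchema reports something like server code 255, the connection character set was not specified. If it reports failed, drop the hive database, create it again, and rerun the command
2. The server time zone value 'PDT' is unrecognized or represents more than one
This is a time zone problem. In hive-site.xml, write the URL as jdbc:mysql://127.0.0.1:3306/hive?useUnicode=true&amp;characterEncoding=utf8&amp;useSSL=false&amp;serverTimezone=GMT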
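The recovery for a failed initSchema (drop the database, recreate it, rerun the command) can be sketched as below. reset_metastore is a hypothetical name, and mysql and schematool are assumed to be on PATH. Note that this discards any existing metastore contents.

```shell
# Sketch: reset the metastore database after a failed schema initialization.
reset_metastore() {
    # Drop and recreate the database (-p prompts for the password).
    mysql -u root -p -e 'DROP DATABASE IF EXISTS hive; CREATE DATABASE hive;' || return 1

    # Run the initialization again against the empty database.
    schematool -dbType mysql -initSchema
}

# Run with: reset_metastore
```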
Reference: https://blog.csdn.net/m0_37520980/article/details/80364884