  • Hive environment setup

    Note: this walkthrough assumes an existing Hadoop 3 cluster, so Hadoop 3 installation is not covered; local mode is used as the example.

    Important: Hadoop 3.x is not compatible with Hive 2.x, and installing Hive 2 against it causes all sorts of problems, so Hive 3 is used for this setup.

     1. Install MySQL

    1.1 Install the MySQL server

    yum install -y mysql-server

    1.2 Configure the character set: edit /etc/my.cnf and add default-character-set=utf8.
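    The snippet this step refers to might look like the following (a minimal sketch; note that on MySQL 5.5+ the server-side setting is character-set-server, while default-character-set is only valid in the [client] section):

    ```ini
    # /etc/my.cnf
    [mysqld]
    character-set-server=utf8

    [client]
    default-character-set=utf8
    ```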

     1.3 Start the MySQL service:

    service mysqld start           # start the MySQL service
    service mysqld status          # check whether MySQL is running
    systemctl enable mysqld        # enable MySQL at boot (step 1)
    systemctl daemon-reload        # enable MySQL at boot (step 2)

    1.4 Set the MySQL root password

    mysql -uroot -p     # on first login the password is empty, so just press Enter at the prompt
    set password for 'root'@'localhost' = password('123456');   # set the password to 123456

    1.5 Create a root1 user and grant remote login privileges to both users:

    create user 'root1'@'%' identified by '123456';   # can be skipped when connecting as root; this install connects as root
    grant all on *.* to 'root1'@'%';                  # can be skipped when connecting as root; this install connects as root

    grant all on *.* to 'root'@'%';

    2. Configure the Hive environment (/home/hive/conf). This setup uses Hive 3.1.2 (download: http://archive.apache.org/dist/).

    2.1 Configure hive-env.sh: in the conf directory run cp hive-env.sh.template hive-env.sh, then open hive-env.sh and set:

    export HADOOP_HOME=/app/hadoop-3.2.1
    export HIVE_CONF_DIR=/app/hive-3.1.2/conf

    2.2 Configure hive-site.xml: in the conf directory run cp hive-default.xml.template hive-site.xml, then open hive-site.xml and modify the following properties (add any that are missing):

    <property>
      <name>system:java.io.tmpdir</name>
      <value>/user/hive/warehouse</value>
    </property>
    <property>
      <name>system:user.name</name>
      <value>${user.name}</value>
    </property>
    <property>
      <name>hive.metastore.db.type</name>
      <value>mysql</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
      <description>user name for connecting to mysql server</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>root</value>
      <description>password for connecting to mysql server</description>
    </property>

    2.3 Create the warehouse directories in HDFS:

    hadoop fs -mkdir -p /tmp
    hadoop fs -mkdir -p /user/hive/warehouse
    hadoop fs -chmod g+w   /tmp
    hadoop fs -chmod g+w   /user/hive/warehouse

    2.4 Replace the lower-version guava.jar, otherwise schema initialization fails:

    Error 1: Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
    Cause: Hadoop and Hive ship different versions of guava.jar, located in /usr/local/hive/lib/ and /usr/local/hadoop/share/hadoop/common/lib/
    Fix: delete the lower-version jar and copy the higher-version one into its place
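    The replacement can be sketched as a small script. The function below picks whichever guava jar sorts higher by version and copies it over the other; the two lib directories are passed as arguments (the /usr/local paths above would be the real inputs on this setup).

    ```shell
    # Sketch: keep the higher-version guava jar in both lib directories.
    # Usage: sync_guava <hive_lib_dir> <hadoop_common_lib_dir>
    sync_guava() {
        hive_jar=$(ls "$1"/guava-*.jar | head -n 1)
        hadoop_jar=$(ls "$2"/guava-*.jar | head -n 1)
        # sort -V orders the jar names by version; the last line is the newer one
        newer=$(printf '%s\n%s\n' "$(basename "$hive_jar")" "$(basename "$hadoop_jar")" | sort -V | tail -n 1)
        if [ "$newer" = "$(basename "$hadoop_jar")" ]; then
            rm -f "$hive_jar" && cp "$hadoop_jar" "$1"/
        else
            rm -f "$hadoop_jar" && cp "$hive_jar" "$2"/
        fi
    }
    ```

    For example: sync_guava /usr/local/hive/lib /usr/local/hadoop/share/hadoop/common/lib
    
    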

    2.5 Remove the &#8; character entity from hive-site.xml, otherwise initialization fails with:

    Error 2: Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8)
    at [row,col,system-id]: [3224,96,"file:/app/hive-3.1.2/conf/hive-site.xml"]
    Cause: line 3224 of hive-site.xml contains an illegal character entity
    Fix: open hive-site.xml, jump to that line, and delete the &#8; entity
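    Instead of editing by hand, grep and sed can do the same job. The demo below runs on a scratch file with a made-up description line; in practice point the grep/sed at /app/hive-3.1.2/conf/hive-site.xml (the path in the error message).

    ```shell
    # Demo on a scratch file containing the illegal entity
    f=$(mktemp)
    printf '<description>for&#8; transactional tables</description>\n' > "$f"
    grep -n '&#8;' "$f"            # locate the offending line(s)
    sed -i 's/&#8;//g' "$f"        # strip the illegal entity in place
    grep -c '&#8;' "$f" || true    # count is now 0
    ```
    
    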

    2.6 Upload the JDBC driver into hive/lib, otherwise:

    Error 3:
    org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver
    Underlying cause: java.lang.ClassNotFoundException : com.mysql.jdbc.Driver
    Cause: the JDBC driver is missing
    Fix: upload the JDBC driver (mysql-connector-java-5.1.36-bin.jar) into Hive's lib directory
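    A quick check that the jar actually landed before rerunning schematool can be sketched as a tiny shell function (the directory path is an argument; on this setup it would be /app/hive-3.1.2/lib):

    ```shell
    # Sketch: succeed only if a mysql-connector jar is present in the given lib dir
    has_jdbc_driver() {
        ls "$1"/mysql-connector-java-*.jar >/dev/null 2>&1
    }
    ```

    For example: has_jdbc_driver /app/hive-3.1.2/lib && echo "driver present"
    
    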

    2.7 Initialize the Hive schema

    schematool -dbType mysql -initSchema

    2.8 Start the metastore service (without it Hive fails with: HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient)

    ./hive --service metastore &

    2.9 Launch Hive

    hive

    Note: if an insert hangs and never completes, check the logs (Hive writes them under /tmp/${user.name}); if they show endless connection attempts to port 8032, YARN (MapReduce) is not running.

    Run on the command line: hadoop classpath

     and add its output to yarn-site.xml:

            <property>
                    <name>yarn.application.classpath</name>
                    <value>/app/hadoop-3.2.1/etc/hadoop:/app/hadoop-3.2.1/share/hadoop/common/lib/*:/app/hadoop-3.2.1/share/hadoop/common/*:/app/hadoop-3.2.1/share/hadoop/hdfs:/app/hadoop-3.2.1/share/hadoop/hdfs/lib/*:/app/hadoop-3.2.1/share/hadoop/hdfs/*:/app/hadoop-3.2.1/share/hadoop/mapreduce/lib/*:/app/hadoop-3.2.1/share/hadoop/mapreduce/*:/app/hadoop-3.2.1/share/hadoop/yarn:/app/hadoop-3.2.1/share/hadoop/yarn/lib/*:/app/hadoop-3.2.1/share/hadoop/yarn/*</value>
            </property>

    Start YARN with start-yarn.sh.
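    The property value is nothing more than the hadoop classpath output wrapped in XML, so it can be generated rather than pasted. In this sketch the classpath is simulated with a short literal; on the cluster you would use CP=$(hadoop classpath) instead.

    ```shell
    # Simulated here; on the cluster: CP=$(hadoop classpath)
    CP='/app/hadoop-3.2.1/etc/hadoop:/app/hadoop-3.2.1/share/hadoop/common/lib/*'
    # Emit the yarn-site.xml property with the classpath as its value
    printf '<property>\n  <name>yarn.application.classpath</name>\n  <value>%s</value>\n</property>\n' "$CP"
    ```
    
    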

    3. Create a Hive table:

    # create a table

    hive> CREATE TABLE IF NOT EXISTS test_table
    > (col1 int COMMENT 'Integer Column',
    > col2 string COMMENT 'String Column'
    > )
    > COMMENT 'This is test table'
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY ','
    > STORED AS TEXTFILE;
    OK
    Time taken: 0.042 seconds

    hive> show tables;
    OK
    dual
    tb_person
    test_table
    Time taken: 0.046 seconds, Fetched: 3 row(s)

      # insert a row

    hive> insert into test_table values(2,'bbbb');
    Query ID = root_20210103210843_c6f2a369-9d1e-4e1f-b933-ab7acb5d533d
    Total jobs = 3
    Launching Job 1 out of 3
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
    set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
    set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
    set mapreduce.job.reduces=<number>
    Starting Job = job_1609672242377_0003, Tracking URL = http://node1:8088/proxy/application_1609672242377_0003/
    Kill Command = /app/hadoop-3.2.1/bin/mapred job -kill job_1609672242377_0003
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
    2021-01-03 21:08:56,536 Stage-1 map = 0%, reduce = 0%
    2021-01-03 21:09:10,923 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.04 sec
    2021-01-03 21:09:18,243 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.64 sec
    MapReduce Total cumulative CPU time: 4 seconds 640 msec
    Ended Job = job_1609672242377_0003
    Stage-4 is selected by condition resolver.
    Stage-3 is filtered out by condition resolver.
    Stage-5 is filtered out by condition resolver.
    Moving data to directory hdfs://mycluster/user/hive/warehouse/test_table/.hive-staging_hive_2021-01-03_21-08-43_766_6132862660378482907-1/-ext-10000
    Loading data to table default.test_table
    MapReduce Jobs Launched:
    Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.64 sec HDFS Read: 15409 HDFS Write: 244 SUCCESS
    Total MapReduce CPU Time Spent: 4 seconds 640 msec
    OK
    Time taken: 37.116 seconds

     # query

    hive> select * from test_table;
    OK
    1 aaa
    2 bbbb
    Time taken: 0.195 seconds, Fetched: 2 row(s)
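    Because the table is declared ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE, rows can also be bulk-loaded from a plain comma-separated file instead of row-by-row inserts. A sketch (the file path and values are illustrative):

    ```shell
    # Build a comma-delimited data file matching test_table's layout (int,string)
    cat > /tmp/test_table.csv <<'EOF'
    3,cccc
    4,dddd
    EOF
    # Then, inside the hive shell:
    #   LOAD DATA LOCAL INPATH '/tmp/test_table.csv' INTO TABLE test_table;
    cut -d',' -f2 /tmp/test_table.csv    # preview the string column: cccc, dddd
    ```
    
    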

    author@nohert
  • Original post: https://www.cnblogs.com/gzgBlog/p/14224720.html