  • Hive + Hadoop environment setup

    Machine plan:

    Host     IP              Process
    hadoop1  10.183.225.158  hive server
    hadoop2  10.183.225.166  hive client

    Prerequisites:

    Kerberos deployment: http://www.cnblogs.com/kisf/p/7473193.html

    Hadoop HA + Kerberos deployment: http://www.cnblogs.com/kisf/p/7477440.html

    MySQL installation: omitted.

    Add a hive user and database in MySQL, then confirm that the login works:

    mysql -uhive -h10.112.28.179 -phive123456
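
    The statements that create the metastore database and account are not shown in the original. A minimal sketch, assuming MySQL 5.x syntax, an account reachable from any host ('%'), and the same names and password that hive-site.xml uses later. metastore_sql is a hypothetical helper that only prints the SQL; it does not connect to MySQL.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL for a metastore database and account.
# Arguments: database name, user name, password.
metastore_sql() {
    db=$1; user=$2; pass=$3
    printf "CREATE DATABASE IF NOT EXISTS %s;\n" "$db"
    printf "CREATE USER '%s'@'%%' IDENTIFIED BY '%s';\n" "$user" "$pass"
    printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%%';\n" "$db" "$user"
    printf "FLUSH PRIVILEGES;\n"
}

# Print the statements for the values used in this post.
metastore_sql hive hive hive123456
```

    The output can be reviewed and then piped into a MySQL session opened with an administrative account.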

    Hive 2.3.0 is used:

    wget http://mirror.bit.edu.cn/apache/hive/hive-2.3.0/apache-hive-2.3.0-bin.tar.gz

    Add the environment variables:

    export HIVE_HOME=/letv/soft/apache-hive-2.3.0-bin
    export HIVE_CONF_DIR=$HIVE_HOME/conf
    export PATH=$PATH:$HIVE_HOME/bin
    

    Sync the file to master2, then run source /etc/profile.
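
    Re-running the setup should not append the same export lines twice. A small sketch of an idempotent append, demonstrated on a scratch file rather than /etc/profile; add_line_once is a hypothetical helper name.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
add_line_once() {
    file=$1; line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Scratch file for the demonstration; use /etc/profile on a real host.
PROFILE=$(mktemp)
add_line_once "$PROFILE" 'export HIVE_HOME=/letv/soft/apache-hive-2.3.0-bin'
add_line_once "$PROFILE" 'export HIVE_CONF_DIR=$HIVE_HOME/conf'
add_line_once "$PROFILE" 'export PATH=$PATH:$HIVE_HOME/bin'
# Repeated on purpose: the second call writes nothing.
add_line_once "$PROFILE" 'export PATH=$PATH:$HIVE_HOME/bin'
cat "$PROFILE"
```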

    Extract the archive:

    tar zxvf apache-hive-2.3.0-bin.tar.gz

      

    Generate the keytab with Kerberos (commands entered at the kadmin prompt):

    addprinc -randkey hive/hadoop1@JENKIN.COM
    addprinc -randkey hive/hadoop2@JENKIN.COM
    
    xst -k /var/kerberos/krb5kdc/keytab/hive.keytab hive/hadoop1@JENKIN.COM
    xst -k /var/kerberos/krb5kdc/keytab/hive.keytab hive/hadoop2@JENKIN.COM
    

      

    Copy the keytab to the Hive hosts:

    scp /var/kerberos/krb5kdc/keytab/hive.keytab hadoop1:/var/kerberos/krb5kdc/keytab/
    scp /var/kerberos/krb5kdc/keytab/hive.keytab hadoop2:/var/kerberos/krb5kdc/keytab/
    

    (Run kinit with the keytab before use.)

    Hive server configuration:

    Add the following to hive-env.sh on the Hive server:

    HADOOP_HOME=/xxx/soft/hadoop-2.7.3
    export HIVE_CONF_DIR=/xxx/soft/apache-hive-2.3.0-bin/conf
    export HIVE_AUX_JARS_PATH=/xxx/soft/apache-hive-2.3.0-bin/lib
    

      

    Add hive-site.xml on the Hive server:

    <configuration>
        <property>
               <name>hive.metastore.schema.verification</name>
               <value>false</value>
               <description>
                  Enforce metastore schema version consistency.
                      True: Verify that version information stored in metastore matches with one from Hive jars.  Also disable automatic
                            schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
                            proper metastore schema migration. (Default)
                      False: Warn if the version information stored in metastore doesn't match with one from Hive jars.
                </description>
        </property>
        <property>
                <name>hive.metastore.warehouse.dir</name>
                <value>/user/hive/warehouse</value>
                <description>location of default database for the warehouse</description>
        </property>
        <property>
                <name>hive.querylog.location</name>
                <value>/xxx/soft/apache-hive-2.3.0-bin/log</value>
                <description>Location of Hive run time structured log file</description>
        </property>
        <property>
                <name>hive.downloaded.resources.dir</name>
                <value>/xxx/soft/apache-hive-2.3.0-bin/tmp</value>
                <description>Temporary local directory for added resources in the remote file system.</description>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionURL</name>
                <value>jdbc:mysql://10.112.28.179:3306/hive?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=utf-8&amp;useSSL=false</value>
                <description>JDBC connect string for a JDBC metastore</description>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionDriverName</name>
                <value>com.mysql.jdbc.Driver</value>
                <description>Driver class name for a JDBC metastore</description>
        </property>
    
        <property>
                <name>javax.jdo.option.ConnectionUserName</name>
                <value>hive</value>
                <description>username to use against metastore database</description>
        </property>
        <property>
                <name>javax.jdo.option.ConnectionPassword</name>
                <value>hive123456</value>
                <description>password to use against metastore database</description>
        </property>
    <!-- kerberos config -->
        <property>
            <name>hive.server2.authentication</name>
            <value>KERBEROS</value>
        </property>
        <property>
            <name>hive.server2.authentication.kerberos.principal</name>
            <value>hive/_HOST@JENKIN.COM</value>
        </property>
        <property>
            <name>hive.server2.authentication.kerberos.keytab</name>
            <value>/var/kerberos/krb5kdc/keytab/hive.keytab</value>
            <!-- value>/xxx/soft/apache-hive-2.3.0-bin/conf/keytab/hive.keytab</value -->
        </property>
    
        <property>
            <name>hive.metastore.sasl.enabled</name>
            <value>true</value>
        </property>
        <property>
            <name>hive.metastore.kerberos.keytab.file</name>
            <value>/var/kerberos/krb5kdc/keytab/hive.keytab</value>
        </property>
        <property>
            <name>hive.metastore.kerberos.principal</name>
            <value>hive/_HOST@JENKIN.COM</value>
        </property>
    
    </configuration>

      
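
    A raw & is not legal inside an XML value, so the parameter separators in the JDBC URL have to be written as &amp; in hive-site.xml. A small sketch of that escaping; xml_escape_amp is a hypothetical helper and handles only the ampersand, which is the one special character this URL contains.

```shell
#!/bin/sh
# Replace every & with &amp; so the string can be pasted into an XML value.
xml_escape_amp() {
    printf '%s\n' "$1" | sed 's/&/\&amp;/g'
}

# The metastore connection URL from this post, in its plain (unescaped) form.
url='jdbc:mysql://10.112.28.179:3306/hive?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=utf-8&useSSL=false'
xml_escape_amp "$url"
```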

    Add the following to core-site.xml on the Hadoop NameNode:

    <!-- hive config  -->
            <property>
                    <name>hadoop.proxyuser.hive.hosts</name>
                    <value>*</value>
            </property>
            <property>
                    <name>hadoop.proxyuser.hive.groups</name>
                    <value>*</value>
            </property>
            <property>
                    <name>hadoop.proxyuser.hdfs.hosts</name>
                    <value>*</value>
            </property>
            <property>
                    <name>hadoop.proxyuser.hdfs.groups</name>
                    <value>*</value>
            </property>
            <property>
                    <name>hadoop.proxyuser.HTTP.hosts</name>
                    <value>*</value>
            </property>
            <property>
                    <name>hadoop.proxyuser.HTTP.groups</name>
                    <value>*</value>
             </property>
    

    Sync the file to the other machines:

    scp etc/hadoop/core-site.xml master2:/xxx/soft/hadoop-2.7.3/etc/hadoop/
    scp etc/hadoop/core-site.xml slave2:/xxx/soft/hadoop-2.7.3/etc/hadoop/
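
    With more hosts the copy is easier to drive from a loop. A sketch that prints the scp commands by default (dry run) and runs them only when DRY_RUN=0; sync_core_site is a hypothetical helper, and the destination path is the one used above. Proxy-user changes take effect after a restart, or after the two refresh commands shown in the trailing comment.

```shell
#!/bin/sh
# Copy core-site.xml to every host given as an argument.
# DRY_RUN defaults to 1, in which case the commands are only printed.
sync_core_site() {
    for host in "$@"; do
        cmd="scp etc/hadoop/core-site.xml ${host}:/xxx/soft/hadoop-2.7.3/etc/hadoop/"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd || return 1
        fi
    done
}

sync_core_site master2 slave2

# To reload the proxy-user settings without a restart:
#   hdfs dfsadmin -refreshSuperUserGroupsConfiguration
#   yarn rmadmin -refreshSuperUserGroupsConfiguration
```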
    

      

    Download the JDBC driver:

    wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.44.tar.gz
    tar zxvf mysql-connector-java-5.1.44.tar.gz 
    

    Copy it into the Hive lib directory:

    cp mysql-connector-java-5.1.44/mysql-connector-java-5.1.44-bin.jar apache-hive-2.3.0-bin/lib/
    

    Client configuration:

    Copy Hive to hadoop2:

    scp -r apache-hive-2.3.0-bin/ hadoop2:/xxx/soft/
    

      

    On hadoop2 (client):

    hive-site.xml

    <configuration>
        <property>
            <name>hive.metastore.uris</name>
            <value>thrift://hadoop1:9083</value>
        </property>
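        <!-- hive.metastore.local has no effect in Hive 0.10 and later; hive.metastore.uris alone selects the remote metastore. -->
        <property>
             <name>hive.metastore.local</name>
             <value>false</value>
        </property>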
        <!-- kerberos config -->
        <property>
            <name>hive.server2.authentication</name>
            <value>KERBEROS</value>
        </property>
        <property>
            <name>hive.server2.authentication.kerberos.principal</name>
            <value>hive/_HOST@JENKIN.COM</value>
        </property>
        <property>
            <name>hive.server2.authentication.kerberos.keytab</name>
            <value>/var/kerberos/krb5kdc/keytab/hive.keytab</value>
            <!-- value>/xxx/soft/apache-hive-2.3.0-bin/conf/keytab/hive.keytab</value -->
        </property>
    
        <property>
            <name>hive.metastore.sasl.enabled</name>
            <value>true</value>
        </property>
        <property>
            <name>hive.metastore.kerberos.keytab.file</name>
            <value>/var/kerberos/krb5kdc/keytab/hive.keytab</value>
        </property>
        <property>
            <name>hive.metastore.kerberos.principal</name>
            <value>hive/_HOST@JENKIN.COM</value>
        </property>
    
    </configuration>
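
    In the principals above, _HOST is a placeholder that is replaced at startup with the local host's fully qualified name, which is why the keytab holds one entry per host (hive/hadoop1 and hive/hadoop2). A rough sketch of that substitution; expand_principal is a hypothetical helper, not part of Hive, and the lowercasing is included on the assumption that it mirrors Hadoop's behaviour.

```shell
#!/bin/sh
# Substitute _HOST in a principal pattern with a (lowercased) host name.
# Arguments: principal pattern, host name.
expand_principal() {
    fqdn=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    printf '%s\n' "$1" | sed "s|_HOST|${fqdn}|"
}

expand_principal 'hive/_HOST@JENKIN.COM' hadoop1   # hive/hadoop1@JENKIN.COM
expand_principal 'hive/_HOST@JENKIN.COM' hadoop2   # hive/hadoop2@JENKIN.COM
```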
    

      

    Start Hive:

    Initialize the metastore schema:

    ./bin/schematool -dbType mysql -initSchema
    

    Obtain a Kerberos ticket:

    kinit -k -t /var/kerberos/krb5kdc/keytab/hive.keytab hive/hadoop1@JENKIN.COM

    Start the metastore server:

    hive --service metastore &  

    Verify:

    [root@hadoop1 conf]# netstat -nl | grep 9083
    tcp        0      0 0.0.0.0:9083                0.0.0.0:*                   LISTEN  
    

      

    ps -ef | grep metastore
    
    hive
    
    hive>
    

    Start the Thrift service (HiveServer2):

    hive --service hiveserver2 &
    

     

    Verify that the Thrift service (HiveServer2) has started:

    [root@hadoop1 conf]# netstat -nl | grep 10000
    tcp        0      0 0.0.0.0:10000               0.0.0.0:*                   LISTEN  
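
    A plain grep for the port number also matches any line that merely contains those digits (a grep for 9083 would match port 19083, for instance). A stricter check, sketched under the assumption that the input has the netstat -nl column layout shown above; is_listening is a hypothetical helper that reads that output on stdin.

```shell
#!/bin/sh
# Succeed if stdin (netstat -nl output) has a TCP socket in LISTEN state
# whose local address ends in exactly the given port.
is_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, parts, ":")      # local address is column 4
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Demonstration on the line captured above; on a live host use:
#   netstat -nl | is_listening 10000
sample='tcp        0      0 0.0.0.0:10000               0.0.0.0:*                   LISTEN  '
printf '%s\n' "$sample" | is_listening 10000 && echo "port 10000 is listening"
```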
    

      

    HQL operations from the Hive client:

    DDL reference: https://cwiki.apache.org//confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/Alter/UseDatabase

    DML reference: https://cwiki.apache.org//confluence/display/Hive/LanguageManual+DML

    Databases and tables created through Hive are visible on HDFS, under the hive.metastore.warehouse.dir location set in hive-site.xml.

    hadoop fs -ls /user/hive/warehouse
    

      

    Connect to Hive with the beeline client:

    beeline -u "jdbc:hive2://hadoop1:10000/;principal=hive/_HOST@JENKIN.COM"
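
    The principal in the URL names the server's principal, not the client's; the client still needs its own ticket from kinit. A small sketch that assembles the same URL from its parts; beeline_url is a hypothetical helper.

```shell
#!/bin/sh
# Build a HiveServer2 JDBC URL for a Kerberos-secured server.
# Arguments: host, port, server principal.
beeline_url() {
    printf 'jdbc:hive2://%s:%s/;principal=%s\n' "$1" "$2" "$3"
}

beeline_url hadoop1 10000 'hive/_HOST@JENKIN.COM'

# Non-interactive use on a host with beeline installed:
#   beeline -u "$(beeline_url hadoop1 10000 'hive/_HOST@JENKIN.COM')" -e 'show databases;'
```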
    

    Run SQL:

    0: jdbc:hive2://hadoop1:10000/> show databases;
    +----------------+
    | database_name  |
    +----------------+
    | default        |
    | hivetest       |
    +----------------+
    2 rows selected (0.318 seconds)
    

      

    hive> create database jenkintest;
    OK
    Time taken: 0.968 seconds
    hive> show databases;
    OK
    default
    hivetest
    jenkintest
    Time taken: 0.033 seconds, Fetched: 3 row(s)
    hive> use jenkintest
        > ;
    OK
    Time taken: 0.108 seconds
    hive> create table test1(columna int, columnb string);
    OK
    Time taken: 0.646 seconds
    hive> show tables;
    OK
    test1
    Time taken: 0.084 seconds, Fetched: 1 row(s)
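
    For managed tables created without an explicit LOCATION, the HDFS directory follows from the warehouse root: tables in the default database sit directly under it, and other databases get a <name>.db subdirectory. A sketch of that rule, assuming the /user/hive/warehouse root configured in hive-site.xml; warehouse_path is a hypothetical helper.

```shell
#!/bin/sh
# Print the expected HDFS directory of a managed table.
# Arguments: database name, table name.
warehouse_path() {
    root=/user/hive/warehouse    # hive.metastore.warehouse.dir
    db=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    tbl=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    if [ "$db" = "default" ]; then
        printf '%s/%s\n' "$root" "$tbl"
    else
        printf '%s/%s.db/%s\n' "$root" "$db" "$tbl"
    fi
}

warehouse_path jenkintest test1   # /user/hive/warehouse/jenkintest.db/test1
```

    The printed path can be passed to hadoop fs -ls to confirm that the table directory exists.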
    

      

    Importing data into Hive (load from a file: create a local file whose columns are separated by Tab characters):

    [root@hadoop2 ~]# vim jenkindb.txt
    1       jenkin
    2       jenkin.k
    3       anne
    
    
    [root@hadoop2 ~]# hive
    
    hive> create table jenkintb (id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
    
    hive> load data local inpath 'jenkindb.txt' into table jenkintb;
    
    hive> select * from jenkintb;
    OK
    1       jenkin
    2       jenkin.k
    3       anne
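
    An editor configured to expand tabs will silently write spaces, and the load then yields NULL columns. A sketch that writes the same three rows with printf so the separator is a real tab, then counts the rows that have exactly two tab-separated fields; the scratch file stands in for jenkindb.txt.

```shell
#!/bin/sh
# Write the sample rows with a literal tab between id and name.
data_file=$(mktemp)
printf '%s\t%s\n' 1 jenkin 2 jenkin.k 3 anne > "$data_file"

# Count the rows that split into exactly two fields on tab.
awk -F '\t' 'NF == 2' "$data_file" | wc -l
```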
    

      

    show create table jenkintb;

      

      

  • Original article: https://www.cnblogs.com/kisf/p/7497261.html