zoukankan      html  css  js  c++  java
  • hive 安装

    1、下载地址:http://mirrors.hust.edu.cn/apache/

    环境变量:
    vi /etc/profile
    追加
    
    #set hive environment
    export HIVE_HOME=/opt/apache-hive-3.1.1
    export PATH=$PATH:$HIVE_HOME/bin

    2、conf配置

    解压后进入hive的conf目录,新建配置文件,设定hive元数据存储的数据库等

    vi hive-site.xml 
    
    <configuration>
            <property>
                    <name>javax.jdo.option.ConnectionURL</name>
                    <value>jdbc:mysql://192.168.1.101:3306/hivedb?createDatabaseIfNotExist=true&amp;useSSL=false</value> #用&amp;转义等于&
                    <description>JDBC connect string for a JDBC metastore</description>
            </property>
            <property>
                    <name>javax.jdo.option.ConnectionDriverName</name>
                    <value>com.mysql.jdbc.Driver</value>
                    <description>Driver class name for a JDBC metastore</description>
            </property>
            <property>
                    <name>javax.jdo.option.ConnectionUserName</name>
                    <value>hive</value>
                    <description>username to use against metastore database</description>
            </property>
            <property>
                    <name>javax.jdo.option.ConnectionPassword</name>
                    <value>123@abc.com</value>
            <description>password to use against metastore database</description>
            </property>
            <property>
                    <name>hive.metastore.warehouse.dir</name>
                    <value>/hive/warehouse</value>
                    <description>hive default warehouse, if nessecory, change it</description>
            </property>
    </configuration>

    因为用的是mysql作为元数据存储,将mysql连接jar包放入hive的lib目录下:mysql-connector-java-5.1.47.jar

    3、初始化元数据

    schematool -dbType mysql -initSchema
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/apache-hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Metastore connection URL:     jdbc:mysql://192.168.1.101:3306/hivedb?createDatabaseIfNotExist=true
    Metastore Connection Driver :     com.mysql.jdbc.Driver
    Metastore connection User:     hive
    
    Starting metastore schema initialization to 3.1.0
    Initialization script hive-schema-3.1.0.mysql.sql
    Initialization script completed
    
    schemaTool completed

     4、hive基本使用

    启动:hive --service cli或者hive

    4.1、创建数据库

    hive> create database busdata;
    OK
    Time taken: 0.735 seconds
    hive> show databases;
    OK
    busdata
    default
    Time taken: 0.161 seconds, Fetched: 2 row(s)
    hive> select current_database();
    OK
    default
    Time taken: 2.362 seconds, Fetched: 1 row(s)
    hive> use busdata;
    OK

    4.2、建表

    hive> create table userinfo(userid int,createTime timestamp,email string,expiredate date,name string,password string,salt string,state tinyint,tel string,username string) row format delimited fields terminated by ",";
    OK
    Time taken: 1.109 seconds
    hive> show tables;
    OK
    userinfo
    Time taken: 0.04 seconds, Fetched: 1 row(s)
    在Hive中提供了字段类型TIMESTAMP,但其对应的时间格式为yyyy-MM-dd HH:MM:SS,即通过LOAD命令将文本加载到已创建的表中时,只有满足上述格式的内容才可以被正确解析,否则只能识别为NULL。

    4.3、加载本地数据,文件是cp

    hive> load data local inpath "/home/hadoop/userinfo.txt" into table userinfo;
    Loading data to table busdata.userinfo
    OK
    Time taken: 0.562 seconds

    加载hdfs上的数据,hive直接把源文件mv走了

    hive> load data inpath "/mysql/syslog2/part-m-00000" into table syslog;
    Loading data to table busdata.syslog
    OK
    Time taken: 0.431 seconds

     加载批量数据

    hive> load data local inpath "/home/hadoop/sampler/*.csv" into table weather_data;
    Loading data to table busdata.weather_data
    OK
    Time taken: 0.523 seconds

    4.4、查看数据

    hive> select userid,username,name,createtime from userinfo;
    OK
    1    admin    管理员    2018-10-04 22:22:41
    2    xu.dm    管理员    2018-10-04 22:22:00
    3    myuser    java编程    2018-10-05 17:16:19
    4    user123    abc测    2018-10-12 18:22:42
    5    user1    mytest    2018-10-18 00:36:39
    6    user2    abc测    2018-10-18 22:37:24
    ... ...
    Time taken: 0.337 seconds, Fetched: 22 row(s)

    4.5、查看表结构

    hive> desc userinfo;
    OK
    userid                  int                                         
    createtime              timestamp                                   
    email                   string                                      
    expiredate              date                                        
    name                    string                                      
    password                string                                      
    salt                    string                                      
    state                   tinyint                                     
    tel                     string                                      
    username                string                                      
    Time taken: 0.112 seconds, Fetched: 10 row(s)
    hive> desc extended userinfo;
    OK
    userid                  int                                         
    createtime              timestamp                                   
    email                   string                                      
    expiredate              date                                        
    name                    string                                      
    password                string                                      
    salt                    string                                      
    state                   tinyint                                     
    tel                     string                                      
    username                string                                      
              
    Detailed Table Information    Table(tableName:userinfo, dbName:busdata, owner:hadoop, createTime:1551432622, lastAccessTime:0, retention:0, 
    sd:StorageDescriptor(cols:[FieldSchema(name:userid, type:int, comment:null),
    FieldSchema(name:createtime, type:timestamp, comment:null), FieldSchema(name:email, type:string, comment:null), FieldSchema(name:expiredate, type:date, comment:null),
    FieldSchema(name:name, type:string, comment:null), FieldSchema(name:password, type:string, comment:null), FieldSchema(name:salt, type:string, comment:null),
    FieldSchema(name:state, type:tinyint, comment:null), FieldSchema(name:tel, type:string, comment:null), FieldSchema(name:username, type:string, comment:null)],
    location:hdfs://bigdata-senior01.home.com:9000/hive/warehouse/busdata.db/userinfo,
    inputFormat:org.apache.hadoop.mapred.TextInputFormat,
    outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1,
    serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
    parameters:{serialization.format=,, field.delim=,}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
    skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{totalSize=2642, numRows=0, rawDataSize=0, numFiles=1,
    transient_lastDdlTime=1551432786, bucketing_version=2}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, rewriteEnabled:false, catName:hive, ownerType:USER)
    Time taken: 0.138 seconds, Fetched: 12 row(s)
    hive> desc formatted userinfo;
    OK
    # col_name                data_type               comment             
    userid                  int                                         
    createtime              timestamp                                   
    email                   string                                      
    expiredate              date                                        
    name                    string                                      
    password                string                                      
    salt                    string                                      
    state                   tinyint                                     
    tel                     string                                      
    username                string                                      
              
    # Detailed Table Information          
    Database:               busdata                  
    OwnerType:              USER                     
    Owner:                  hadoop                   
    CreateTime:             Fri Mar 01 17:30:22 CST 2019     
    LastAccessTime:         UNKNOWN                  
    Retention:              0                        
    Location:               hdfs://bigdata-senior01.home.com:9000/hive/warehouse/busdata.db/userinfo     
    Table Type:             MANAGED_TABLE            
    Table Parameters:          
        bucketing_version       2                   
        numFiles                1                   
        numRows                 0                   
        rawDataSize             0                   
        totalSize               2642                
        transient_lastDdlTime    1551432786          
              
    # Storage Information          
    SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe     
    InputFormat:            org.apache.hadoop.mapred.TextInputFormat     
    OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat     
    Compressed:             No                       
    Num Buckets:            -1                       
    Bucket Columns:         []                       
    Sort Columns:           []                       
    Storage Desc Params:          
        field.delim             ,                   
        serialization.format    ,                   
    Time taken: 0.125 seconds, Fetched: 40 row(s)
  • 相关阅读:
    文件操作实例加强
    文件操作的一般基础操作
    列表与if语句的结合
    难题记录
    字典,集合,列表混合使用需注意:
    列表的一些难度操作
    字符串知识巩固
    and与or的用法
    AngularJS中的过滤器
    NodeJS中的静态资源管理服务
  • 原文地址:https://www.cnblogs.com/asker009/p/10457360.html
Copyright © 2011-2022 走看看