  • Loading an HBase Table into Hive

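    There are two ways to load an HBase table into Hive: (1) create a Hive external table mapped onto the HBase table's data, or (2) create a regular Hive table and copy the HBase data into Hive's own storage.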

    1. Create an external table

    HBase already contains a table named test, with the following content:

    hbase(main):012:0* scan 'test'
    ROW                                 COLUMN+CELL                                                                                         
     001                                column=info:age, timestamp=1526563694645, value=18                                                  
     001                                column=info:name, timestamp=1526563629119, value=tom                                                
     002                                column=info:age, timestamp=1526563723288, value=19                                                  
     002                                column=info:name, timestamp=1526563706773, value=jerry                                              
    2 row(s) in 0.4320 seconds
    

    Next, we create a Hive external table on top of it:

    CREATE EXTERNAL TABLE test_external (key int, name string,age int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:age") 
    TBLPROPERTIES ("hbase.table.name" = "test");
    

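    Note: the EXTERNAL keyword tells Hive to create an external table. The "hbase.columns.mapping" property pairs the Hive columns, in order, with the HBase row key (:key) and the info:name and info:age columns.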

    Open the Hive shell and run the CREATE statement above:

    hive> CREATE EXTERNAL TABLE test_external (key int, name string,age int)
        > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
        > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:age") 
        > TBLPROPERTIES ("hbase.table.name" = "test");
    OK
    Time taken: 0.331 seconds
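
    To confirm that Hive registered the table as external, one option is to inspect its metadata. This is a minimal sketch that assumes the table was created in the default database as shown above:

    ```sql
    -- Print the full table metadata; the "Table Type" field should read EXTERNAL_TABLE.
    DESCRIBE FORMATTED test_external;
    ```

    The same output also shows the table properties, which is a quick way to check which HBase table the mapping points at.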
    

    Hive can now query the HBase data with SQL-like statements:

    hive> select * from test_external;
    OK
    1       tom     18
    2       jerry   19
    Time taken: 0.313 seconds, Fetched: 2 row(s)
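
    Note that the HBase row keys 001 and 002 are displayed as 1 and 2, because the key column was declared as int. If the leading zeros matter, a second mapping with a string key is one way to keep them. The sketch below is an illustration only: test_external_str is a hypothetical table name that does not appear in the original post.

    ```sql
    -- Hypothetical second mapping over the same HBase table, with the row key kept as a string.
    CREATE EXTERNAL TABLE test_external_str (key string, name string, age int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:age")
    TBLPROPERTIES ("hbase.table.name" = "test");

    -- Projections and filters work as on any other Hive table.
    SELECT key, name FROM test_external_str WHERE age >= 19;
    ```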
    

    2. Create a regular table

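    A Hive external table only records where the data lives and does not move it. When a table is dropped, a managed (internal) table loses both its metadata and its data, whereas an external table loses only its metadata and the data stays in place. Next, we create a regular (managed) table and load the HBase table's data into it.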

    hive> create table test (key string, name string,age string) 
        > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' 
        > STORED AS TEXTFILE;
    OK
    Time taken: 0.154 seconds
    

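    Run INSERT OVERWRITE TABLE test SELECT * FROM test_external to copy the HBase table's data into the Hive managed table by way of the external table. Because this is INSERT OVERWRITE, any rows already in test are replaced.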

    hive> INSERT OVERWRITE TABLE test SELECT * FROM test_external;
    Query ID = hadoop_20180519102929_b52c1820-7d93-4baf-a410-8604bfd0e43b
    Total jobs = 3
    Launching Job 1 out of 3
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_1525962191655_0030, Tracking URL = http://SHQZ-PS-IOT-TEST-WEB01:8088/proxy/application_1525962191655_0030/
    Kill Command = /opt/cloudera/parcels/CDH-5.8.4-1.cdh5.8.4.p0.5/lib/hadoop/bin/hadoop job  -kill job_1525962191655_0030
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
    2018-05-19 10:30:18,571 Stage-1 map = 0%,  reduce = 0%
    2018-05-19 10:30:31,505 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.68 sec
    MapReduce Total cumulative CPU time: 4 seconds 680 msec
    Ended Job = job_1525962191655_0030
    Stage-4 is selected by condition resolver.
    Stage-3 is filtered out by condition resolver.
    Stage-5 is filtered out by condition resolver.
    Moving data to: hdfs://nameservice1/user/hive/warehouse/test/.hive-staging_hive_2018-05-19_10-29-59_285_1972684120193290878-1/-ext-10000
    Loading data to table default.test
    Table default.test stats: [numFiles=1, numRows=2, totalSize=20, rawDataSize=18]
    MapReduce Jobs Launched: 
    Stage-Stage-1: Map: 1   Cumulative CPU: 4.68 sec   HDFS Read: 3811 HDFS Write: 88 SUCCESS
    Total MapReduce CPU Time Spent: 4 seconds 680 msec
    OK
    Time taken: 34.13 seconds
    

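    As the log shows, loading the HBase data into the Hive managed table runs as a MapReduce job. More generally, with MapReduce as the execution engine, Hive compiles most SQL statements into MapReduce jobs; simple queries, such as the SELECT * statements in this post, can be answered by a direct fetch without launching a job.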
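
    To see how Hive plans a statement before running it, the session settings and the query plan can be inspected. This is a sketch, and the values printed depend on the cluster configuration:

    ```sql
    -- Show which execution engine the session uses.
    SET hive.execution.engine;

    -- Show the setting that controls whether simple SELECTs skip launching a job.
    SET hive.fetch.task.conversion;

    -- Print the stage plan for the load without executing it.
    EXPLAIN INSERT OVERWRITE TABLE test SELECT * FROM test_external;
    ```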

    hive> select * from test;
    OK
    1       tom     18
    2       jerry   19
    Time taken: 0.194 seconds, Fetched: 2 row(s)
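
    The difference between the two table types described earlier shows up at cleanup time. The following sketch is not part of the original walkthrough and should only be run once the tables are no longer needed. Note that the Hive table test and the HBase table test are separate objects that happen to share a name.

    ```sql
    -- Removes only the Hive metadata; the HBase table 'test' and its rows stay in place.
    DROP TABLE test_external;

    -- Removes the metadata and the data files of the managed table
    -- (they move to the HDFS trash first if trash is enabled).
    DROP TABLE test;
    ```

    After the first statement, running scan 'test' in the HBase shell should still return both rows.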
    
  • Original article: https://www.cnblogs.com/leekeggs/p/9347365.html