  • Integrating HBase with Hive

    1. To create a new HBase table that is managed by Hive, use CREATE TABLE with the STORED BY clause:

    CREATE TABLE hbase_table_1(key int, value string) 
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
    TBLPROPERTIES ("hbase.table.name" = "xyz", "hbase.mapred.output.outputtable" = "xyz");
    • The hbase.columns.mapping property is required and will be explained in the next section.
    • The hbase.table.name property is optional;
      • it controls the name of the table as known by HBase, and allows the Hive table to have a different name.
      • In this example, the table is known as hbase_table_1 within Hive, and as xyz within HBase.
      • If not specified, then the Hive and HBase table names will be identical.
    • The hbase.mapred.output.outputtable property is optional;
      • it is needed if you plan to insert data into the table (the property is used by hbase.mapreduce.TableOutputFormat)
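
    The naming rule described in the list above can be sketched in Python. This is only a toy model of the rule, not Hive's own code; the function name resolve_hbase_table_name and the plain dict standing in for TBLPROPERTIES are assumptions made for illustration.

```python
def resolve_hbase_table_name(hive_table, tblproperties):
    """Return the HBase table name for a Hive table (illustrative model only)."""
    # hbase.table.name is optional; without it both systems use the same name.
    return tblproperties.get("hbase.table.name", hive_table)


# The example above: hbase_table_1 in Hive, xyz in HBase.
print(resolve_hbase_table_name("hbase_table_1", {"hbase.table.name": "xyz"}))
# No property set: the Hive and HBase names are identical.
print(resolve_hbase_table_name("hbase_table_1", {}))
```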

    2. Column mapping

    There are two SERDEPROPERTIES that control the mapping of HBase columns to Hive:

    • hbase.columns.mapping
    • hbase.table.default.storage.type: can be either string (the default) or binary. This option is only available as of Hive 0.9; in earlier versions the string behavior is the only one available.
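
    To make the difference between the two storage types concrete, here is a minimal Python sketch of how an int value might be laid out in an HBase cell under each setting. The helper encode_int is hypothetical, and the binary layout shown (4-byte big-endian, as produced by HBase's Bytes.toBytes(int)) is an assumption about the encoding rather than something stated in this article.

```python
import struct


def encode_int(value, storage_type="string"):
    """Encode an int the way each storage type is assumed to lay it out."""
    if storage_type == "string":
        # Default: the decimal text of the number, readable in an HBase scan.
        return str(value).encode("utf-8")
    if storage_type == "binary":
        # Assumed layout: 4-byte big-endian, like HBase's Bytes.toBytes(int).
        return struct.pack(">i", value)
    raise ValueError("storage type must be 'string' or 'binary'")


print(encode_int(101))            # b'101'
print(encode_int(101, "binary"))  # b'\x00\x00\x00e'
```

    With the default string type, a scan in the HBase shell shows the digits directly, which matches the value=101 cells in the scan output further down.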

    Multiple columns and column families

    • Create the table in Hive
    CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int) 
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES (
    "hbase.columns.mapping" = ":key,a:b,a:c,d:e"
    );
    • Insert data
    hive> insert into table hbase_table_1 values(100,'val_100',101,102);
    hive> insert into table hbase_table_1 values(98,'val_98',99,100);
    • View the table structure in HBase

    • View the data in HBase
    hbase(main):004:0> scan "hbase_table_1"
    ROW                       COLUMN+CELL                                                            
     100                      column=a:b, timestamp=1595817016732, value=val_100                     
     100                      column=a:c, timestamp=1595817016732, value=101                         
     100                      column=d:e, timestamp=1595817016732, value=102                         
     98                       column=a:b, timestamp=1595817050488, value=val_98                      
     98                       column=a:c, timestamp=1595817050488, value=99                          
     98                       column=d:e, timestamp=1595817050488, value=100                         
    2 row(s) in 0.0410 seconds
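    • Summary

    (1) The key column in Hive is the rowkey in HBase.

    (2) In "hbase.columns.mapping" = ":key,a:b,a:c,d:e", the :key entry stands for the rowkey; each remaining entry is a family:qualifier pair, matched to the Hive columns in order.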
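
    The positional pairing summarized above can be sketched in Python. This is a toy model, not Hive's implementation; the helper names map_columns and row_to_cells are hypothetical, and values are rendered as strings on the assumption that the default string storage type is in use.

```python
def map_columns(hive_columns, mapping):
    """Pair each Hive column with its entry in hbase.columns.mapping."""
    entries = [e.strip() for e in mapping.split(",")]
    if len(entries) != len(hive_columns):
        raise ValueError("mapping must have one entry per Hive column")
    return dict(zip(hive_columns, entries))


def row_to_cells(hive_columns, mapping, row):
    """Turn one Hive row into (rowkey, {"family:qualifier": value})."""
    spec = map_columns(hive_columns, mapping)
    values = dict(zip(hive_columns, row))
    rowkey = None
    cells = {}
    for column, target in spec.items():
        if target == ":key":
            # The :key entry marks the column that becomes the HBase rowkey.
            rowkey = str(values[column])
        else:
            cells[target] = str(values[column])
    return rowkey, cells


columns = ["key", "value1", "value2", "value3"]
mapping = ":key,a:b,a:c,d:e"
print(map_columns(columns, mapping))
print(row_to_cells(columns, mapping, (100, "val_100", 101, 102)))
```

    For the row (100, 'val_100', 101, 102), the model yields rowkey 100 with cells a:b, a:c and d:e, which lines up with the three cells shown for row 100 in the scan output.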

    3. Column mapping: data written from HBase

    • Insert data in HBase
    hbase(main):006:0> put "hbase_table_1",102,"a:b","val_102"
    hbase(main):008:0> put "hbase_table_1",102,"a:c","101"
    hbase(main):009:0> put "hbase_table_1",102,"d:e","102"
    • Scan the data

    • View the data in Hive
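
    The read direction can be modelled the same way: cells written through the HBase shell are put back into Hive column order by the mapping. The Python sketch below is a toy model and not actual Hive output; the helper cells_to_row is hypothetical, the casts stand in for the Hive column types (int, string, int, int), and treating a missing cell as NULL (None here) is an assumption about Hive's behavior.

```python
def cells_to_row(hive_types, mapping, rowkey, cells):
    """Rebuild one Hive row from an HBase rowkey and its cells."""
    entries = [e.strip() for e in mapping.split(",")]
    row = []
    for cast, target in zip(hive_types, entries):
        raw = rowkey if target == ":key" else cells.get(target)
        # Assumption: a cell that does not exist is read as NULL (None).
        row.append(None if raw is None else cast(raw))
    return tuple(row)


mapping = ":key,a:b,a:c,d:e"
hive_types = [int, str, int, int]  # key, value1, value2, value3

# The three put commands above, collected as one row's cells.
cells_102 = {"a:b": "val_102", "a:c": "101", "d:e": "102"}
print(cells_to_row(hive_types, mapping, "102", cells_102))
```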

  • Original article: https://www.cnblogs.com/hyunbar/p/13384490.html