zoukankan      html  css  js  c++  java
  • hive创建桶表

    • 创建桶表
    hive> create table house_1(id int,name string,age int) clustered by (id) into 3 buckets row format delimited fields terminated by ',';
    • 查看表详细信息
    hive> desc formatted house_1;
    OK
    # col_name              data_type               comment
    
    id                      int
    name                    string
    age                     int
    
    # Detailed Table Information
    Database:               cr
    Owner:                  xiaoqiu
    CreateTime:             Mon Aug 13 10:59:08 EDT 2018
    LastAccessTime:         UNKNOWN
    Retention:              0
    Location:               hdfs://mycluster/user/hive/warehouse/cr.db/house_1
    Table Type:             MANAGED_TABLE
    Table Parameters:
            COLUMN_STATS_ACCURATE   {"BASIC_STATS":"true"}
            numFiles                0
            numRows                 0
            rawDataSize             0
            totalSize               0
            transient_lastDdlTime   1534172348
    
    # Storage Information
    SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
    InputFormat:            org.apache.hadoop.mapred.TextInputFormat
    OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
    Compressed:             No
    Num Buckets:            3  #桶的个数
    Bucket Columns:         [id]  #根据id划分桶
    Sort Columns:           []
    Storage Desc Params:
            field.delim             ,
            serialization.format    ,
    Time taken: 2.636 seconds, Fetched: 33 row(s)
    • 加载数据到桶表,采用insert into ....select ...
    hive> load data local inpath '/home/xiaoqiu/teacher.txt' into table house_1;
    FAILED: SemanticException Please load into an intermediate table and use 'insert... select' to allow Hive to enforce bucketing. Load into bucketed tables are disabled for safety reasons. If you know what you are doing, please sethive.strict.checks.bucketing to false and that hive.mapred.mode is not set to 'strict' to proceed. Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.
    hive> insert into house_1 select * from house;
    hive> select * from house;
    OK
    2       'bom1'  24
    3       'rom1'  25
    4       'som1'  26
    5       'tom10' 23
    6       'tom9'  20
    7       'tom8'  11
    8       'tom7'  12
    9       'tom6'  13
    10      'tom5'  33
    12      'tom4'  45
    22      'tom3'  20
    32      'tom2'  23
    NULL    NULL    NULL
    hive> select * from house_1; #根据id进行分桶
    OK
    3       'rom1'  25
    12      'tom4'  45
    9       'tom6'  13
    NULL    NULL    NULL
    6       'tom9'  20
    22      'tom3'  20
    4       'som1'  26
    10      'tom5'  33
    7       'tom8'  11
    8       'tom7'  12
    32      'tom2'  23
    5       'tom10' 23
    2       'bom1'  24
    
    

    查看HDFS目录 

    [xiaoqiu@s150 /home/xiaoqiu]$ hadoop fs -lsr /
    drwxr-xr-x   - xiaoqiu   supergroup          0 2018-08-13 11:45 /user/hive/warehouse/cr.db              /house_1
    -rwxr-xr-x   3 xiaoqiu   supergroup         58 2018-08-13 11:43 /user/hive/warehouse/cr.db              /house_1/000000_0 #桶表1
    -rwxr-xr-x   3 xiaoqiu   supergroup         50 2018-08-13 11:44 /user/hive/warehouse/cr.db              /house_1/000001_0#桶表2
    -rwxr-xr-x   3 xiaoqiu   supergroup         50 2018-08-13 11:43 /user/hive/warehouse/cr.db              /house_1/000002_0#桶表3
    [xiaoqiu@s150 /home/xiaoqiu]$ hadoop fs -cat /user/hive/warehouse/cr.db/house_1/000000_0#桶表1
    3,'rom1',25
    12,'tom4',45
    9,'tom6',13
    N,N,N
    6,'tom9',20
    [xiaoqiu@s150 /home/xiaoqiu]$ hadoop fs -cat /user/hive/warehouse/cr.db/house_1/000001_0#桶表2
    22,'tom3',20
    4,'som1',26
    10,'tom5',33
    7,'tom8',11
    [xiaoqiu@s150 /home/xiaoqiu]$ hadoop fs -cat /user/hive/warehouse/cr.db/house_1/000002_0#桶表3
    8,'tom7',12
    32,'tom2',23
    5,'tom10',23
    2,'bom1',24
    [xiaoqiu@s150 /home/xiaoqiu]$
    
    
    欢迎关注我的公众号:小秋的博客 CSDN博客:https://blog.csdn.net/xiaoqiu_cr github:https://github.com/crr121 联系邮箱:rongchen633@gmail.com 有什么问题可以给我留言噢~
  • 相关阅读:
    redis连接客户端
    map中使用的细节问题(不同key循坏中被后面的值替换)
    使用异步开启新的线程Spring异步方法注解@Async
    npm init 出现一堆提问(npm init -y)
    小程序的时间日期选择器
    小程序--分类项目跳转页面(同样也适用轮播图跳转)
    小程序样式不管用,解决方法button:not([size='mini']) { width: 184px; margin-left: auto; margin-right: auto; }
    vue-elementui的时间日期选择器( value-format="yyyy-MM-dd HH:mm:ss"),以及时间解析{y}-{m}-{d} {h}:{i}:{s}
    vue.config.js配置详细说明(逐条解释)
    element在el-table-column中如何使用过滤器
  • 原文地址:https://www.cnblogs.com/flyingcr/p/10326863.html
Copyright © 2011-2022 走看看