zoukankan      html  css  js  c++  java
  • hive行转列和列转行

    1.hive行转列

    good01表内容
    hive> select * from good01;
    OK
    1    2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
    2    11,12,13,14,15,16
    Time taken: 0.163 seconds, Fetched: 2 row(s)
    hive> 
    
    
    good01 列结构
    hive> desc good01;
    OK
    id_one                  string                                      
    new_id                  string                                      
    Time taken: 0.046 seconds, Fetched: 2 row(s)
    hive> 
    
    
    行转列语句
    select id_one,result from good01 lateral view explode(split(new_id,',')) new_id as result;
    执行 结果为
    OK
    1    2
    1    3
    1    4
    1    5
    1    6
    1    7
    1    8
    1    9
    1    10
    1    11
    1    12
    1    13
    1    14
    1    15
    1    16
    2    11
    2    12
    2    13
    2    14
    2    15
    2    16
    View Code

    2.hive列转行

    第一种,k  -   单 v形式 

    表内容
    hive> select * from good;
    OK
    1    2
    1    3
    1    4
    1    5
    1    6
    1    7
    1    8
    1    9
    1    10
    1    11
    1    12
    1    13
    1    14
    1    15
    1    16
    2    11
    2    12
    2    13
    2    14
    2    15
    2    16
    
    表结构
    ive> desc good;
    OK
    id_one                  string                                      
    id_two                  string   
    
    列转行语句
    hive> select id_one,concat_ws(',',collect_list(id_two)) as value01 
        > from good group by id_one;



    MapReduce Total cumulative CPU time: 4 seconds 420 msec
    Ended Job = job_1572198119580_0017
    MapReduce Jobs Launched:
    Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.42 sec HDFS Read: 9610 HDFS Write: 170 SUCCESS
    Total MapReduce CPU Time Spent: 4 seconds 420 msec
    OK
    1 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
    2 11,12,13,14,15,16
    Time taken: 23.857 seconds, Fetched: 2 row(s)
    hive>

    第二种,列转行,k-n v

    列转行
    输出格式  k- nv
    
    表内容:
    hive> select * from tmp_dianpingt;
    OK
    344920    one    10
    344920    two    30
    344920    three    30
    344920    four    30
    Time taken: 0.135 seconds, Fetched: 4 row(s)
    
    表结构形式:
    ive> desc tmp_dianpingt;
    OK
    productid               string                                      
    tagtype                 string                                      
    highlight               string 
    
    分隔符是   ,
    建表语句是
    hive> show create table tmp_dianpingt;
    OK
    CREATE TABLE `tmp_dianpingt`(
      `productid` string, 
      `tagtype` string, 
      `highlight` string)
    ROW FORMAT SERDE 
      'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
    WITH SERDEPROPERTIES ( 
      'field.delim'=',', 
      'serialization.format'=',') 
    STORED AS INPUTFORMAT 
      'org.apache.hadoop.mapred.TextInputFormat' 
    OUTPUTFORMAT 
      'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION
      'hdfs://min01:9000/user/hive/warehouse/dw_htl.db/tmp_dianpingt'
    TBLPROPERTIES (
      'bucketing_version'='2', 
      'transient_lastDdlTime'='1572200512')
    Time taken: 0.058 seconds, Fetched: 18 row(s)
    
    
    
    
    执行 列转行  k  -nv 语句为
    select productid
          ,concat_ws('',collect_set(fuwu))     `one`
          ,concat_ws('',collect_set(weizhi))   `two`
          ,concat_ws('',collect_set(sheshi))   `three`
          ,concat_ws('',collect_set(weisheng)) `four`
    from
      (  select productid
               ,case when tagtype='one' then highlight else '' end as  weizhi
               ,case when tagtype='two' then highlight else '' end as  fuwu 
               ,case when tagtype='three' then highlight else '' end as  weisheng
               ,case when tagtype='four' then highlight else '' end as  sheshi
          from dw_htl.tmp_dianpingt
       ) a 
     group by productid





    结果为:

    apReduce Total cumulative CPU time: 4 seconds 270 msec
    Ended Job = job_1572198119580_0016
    MapReduce Jobs Launched:
    Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.27 sec HDFS Read: 12689 HDFS Write: 118 SUCCESS
    Total MapReduce CPU Time Spent: 4 seconds 270 msec
    OK
    344920 30 10 30 30

    RUSH B
  • 相关阅读:
    平均值的最值化
    GCJ 2008 R3 C 二分图匹配
    BZOJ 1013 高斯消元
    NOIP2015 d2T3 二分+树上前缀和
    9.7集训 总结
    BZOJ 1026
    BZOJ 3223
    NOIP前夕:codevs,关路灯
    NOIP前夕:codevs,解药还是毒药
    NOIP前夕:codevs,修剪花卉
  • 原文地址:https://www.cnblogs.com/tangsonghuai/p/11748549.html
Copyright © 2011-2022 走看看