1.hive行转列
good01表内容 hive> select * from good01; OK 1 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16 2 11,12,13,14,15,16 Time taken: 0.163 seconds, Fetched: 2 row(s) hive> good01 列结构 hive> desc good01; OK id_one string new_id string Time taken: 0.046 seconds, Fetched: 2 row(s) hive> 行转列语句 select id_one,result from good01 lateral view explode(split(new_id,',')) new_id as result; 执行 结果为 OK 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 1 10 1 11 1 12 1 13 1 14 1 15 1 16 2 11 2 12 2 13 2 14 2 15 2 16
2.hive列转行
第一种,k - 单 v形式
表内容 hive> select * from good; OK 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 1 10 1 11 1 12 1 13 1 14 1 15 1 16 2 11 2 12 2 13 2 14 2 15 2 16 表结构 ive> desc good; OK id_one string id_two string 列转行语句 hive> select id_one,concat_ws(',',collect_list(id_two)) as value01 > from good group by id_one;
MapReduce Total cumulative CPU time: 4 seconds 420 msec
Ended Job = job_1572198119580_0017
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.42 sec HDFS Read: 9610 HDFS Write: 170 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 420 msec
OK
1 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16
2 11,12,13,14,15,16
Time taken: 23.857 seconds, Fetched: 2 row(s)
hive>
第二种,列转行,k-n v
列转行 输出格式 k- nv 表内容: hive> select * from tmp_dianpingt; OK 344920 one 10 344920 two 30 344920 three 30 344920 four 30 Time taken: 0.135 seconds, Fetched: 4 row(s) 表结构形式: ive> desc tmp_dianpingt; OK productid string tagtype string highlight string 分隔符是 , 建表语句是 hive> show create table tmp_dianpingt; OK CREATE TABLE `tmp_dianpingt`( `productid` string, `tagtype` string, `highlight` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ( 'field.delim'=',', 'serialization.format'=',') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'hdfs://min01:9000/user/hive/warehouse/dw_htl.db/tmp_dianpingt' TBLPROPERTIES ( 'bucketing_version'='2', 'transient_lastDdlTime'='1572200512') Time taken: 0.058 seconds, Fetched: 18 row(s) 执行 列转行 k -nv 语句为 select productid ,concat_ws('',collect_set(fuwu)) `one` ,concat_ws('',collect_set(weizhi)) `two` ,concat_ws('',collect_set(sheshi)) `three` ,concat_ws('',collect_set(weisheng)) `four` from ( select productid ,case when tagtype='one' then highlight else '' end as weizhi ,case when tagtype='two' then highlight else '' end as fuwu ,case when tagtype='three' then highlight else '' end as weisheng ,case when tagtype='four' then highlight else '' end as sheshi from dw_htl.tmp_dianpingt ) a group by productid
结果为:
apReduce Total cumulative CPU time: 4 seconds 270 msec
Ended Job = job_1572198119580_0016
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.27 sec HDFS Read: 12689 HDFS Write: 118 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 270 msec
OK
344920 30 10 30 30