1 数据来源
本次实战的数据来自于"YouTube视频统计与社交网络"的数据集,是西蒙弗雷泽大学计算机学院在2008年所爬取的数据
数据集地址
1. 1 Youtube视频表格式如下:
列名 | 注释 |
---|---|
视频ID | 一个11位字符串,是唯一的 |
上传 | 一个字符串的视频上传者的用户名 |
年龄 | 视频上传日期和2007年2月15日之间的整数天(YouTube的设立) |
类别 | 由上传者选择的视频类别的字符串 |
长度 | 视频长度的整数v |
观看数 | 一整数的视图 |
率 | 一个浮点数的视频速率 |
评分 | 整数的评分 |
评论数 | 一整数的评论 |
相关视频ID | 最多20个字符串的相关视频ID |
数据之间采用" "作为分隔符
具体数据如下:
video ID | uploader | age | category | length | views | rate | ratings | comments | related IDs |
---|---|---|---|---|---|---|---|---|---|
ifnlnji-Y4s | Hooran | 1162 | Travel & Events | 239 | 189 | 4.8 | 10 | 3 | tpAL3I0urI4 ... ifnlnji-Y4s |
数据量大小为1G,条数为500万+
1.2 用户表
列名 | uploader | videos | friends |
---|---|---|---|
类型 | string | int | int |
解释 | 上传者 | 上传视频数 | 朋友数 |
2 实战演练准备
2.1 环境搭建
使用环境为
hive-1.1.0-cdh5.4.5
hadoop-2.6.0-cdh5.4.5
演示形式为使用hive shell
2.2 数据清洗
我们一起来看看数据
video ID | uploader | age | category | length | views | rate | ratings | comments | related IDs |
---|---|---|---|---|---|---|---|---|---|
ifnlnji-Y4s | Hooran | 1162 | Travel & Events | 239 | 189 | 4.8 | 10 | 3 | tpAL3I0urI4 ... ifnlnji-Y4s |
主要的问题在于category和relatedIDs处理,由于Hive是支持array格式的,所以我们想到的是使用array来存储category和relatedIDs,但是我们发现category的分割符是"&"而realatedIDs的分隔符是" ",我们在创建表格的时候能够指定array的分隔符,但是只能指定一个,所以再将数据导入到Hive表格之前我们需要对数据进行一定转换和清洗
并且数据中肯定会存在一些不完整数据和一些奇怪的格式,所以数据的清洗是必要的,我在这里所使用的数据清洗方式是使用Spark进行清洗,也可以使用自定义UDF函数来进行清洗
数据清洗注意点
1)我们可以看到每行数据以" "作为分隔符,每行有十列数据,最后一列关联ID可以为空,那么我们对数据进行split之后数组的大小要大于8
2)数据中存在"uNiKXDA8eyQ KRQE 1035 News & Politics 107"
这样格式的数据,所以在处理category时需要注意 News & Politics中间的&
处理后的数据如下:
video ID | uploader | age | category | length | views | rate | ratings | comments | related IDs |
---|---|---|---|---|---|---|---|---|---|
PkGUU_ggO3k | theresident | 704 | Entertainment | 262 | 11235 | 3.85 | 247 | 280 | PkGUU_ggO3k&EYC5bWF0ss8&...shU2hfHKmU0&p0lq5-8IDqY |
RX24KLBhwMI | lemonette | 697 | People&Blogs | 512 | 24149 | 4.22 | 315 | 474 | t60tW0WevkE&WZgoejVDZlo&...s8xf4QX1UvA&2cKd9ERh5-8 |
下面的实战都是基于数据清洗后的数据进行的
2.3 创建表格和数据导入
2.3.1 youtube表格的创建和导入
1)youtube1的创建,文件格式为textfile
create table youtube1(videoId string, uploader string, age int, category array
row format delimited
fields terminated by " "
collection items terminated by "&"
stored as textfile;
2)youtube2的创建,文件格式为orc
create table youtube2(videoId string, uploader string, age int, category array
row format delimited
fields terminated by " "
collection items terminated by "&"
stored as orc;
3)youtube3的创建,文件格式为orc,进行桶分区
create table youtube3(videoId string, uploader string, age int, category array
clustered by (uploader) into 8 buckets
row format delimited
fields terminated by " "
collection items terminated by "&"
stored as orc;
数据导入:
1)load data inpath "path" into table youtube1;
2)由于无法将textfile格式的数据导入到orc格式的表格,所以数据需要从youtube1导入到youtube2和youtube3:
insert into table youtube2 select * from youtube1;
insert into table youtube3 select * from youtube1;
2.3.2 user表格的创建和导入
1)user_tmp的创建,文件格式textfile,24buckets
create table user_tmp(uploader string,videos int,friends int)
clustered by (uploader) into 24 buckets
row format delimited
fields terminated by " "
stored as textfile;
2)user的创建,文件格式orc,24buckets
create table user(uploader string,videos int,friends int)
clustered by (uploader) into 24 buckets
row format delimited
fields terminated by " "
stored as orc;
user表的数据导入也是同理
数据导入:
1)load data inpath "path" into table user_tmp;
2)由于无法将textfile格式的数据导入到orc格式的表格,所以数据需要从user_tmp导入到user:
insert into table user select * from user_tmp;
3 实战需求
1)统计出观看数最多的10个视频
2)统计出视频类别热度的前10个类型
3)统计出视频观看数最高的50个视频的所属类别
4)统计出观看数最多的前N个视频所关联的视频的所属类别排行
5)筛选出每个类别中热度最高的前10个视频
6)筛选出每个类别中评分最高的前10个视频
7)找出用户中上传视频最多的10个用户的所有视频
8)筛选出每个类别中观看数Top10
4 实战演练
4.1 统计出观看数最多的10个视频
select * from youtube3 order by views desc limit 10;
结果如下:
hive> select * from youtube3 order by views desc limit 10;
Query ID = hadoop_20170710155353_4bc057f0-bbd5-4bfe-a66c-ec0e17cb3ca9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
Starting Job = job_1499153664137_0101, Tracking URL = http://master:8088/proxy/application_1499153664137_0101/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0101
Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 1
2017-07-10 15:53:55,297 Stage-1 map = 0%, reduce = 0%
...
2017-07-10 15:56:06,210 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 106.93 sec
MapReduce Total cumulative CPU time: 1 minutes 46 seconds 930 msec
Ended Job = job_1499153664137_0101
MapReduce Jobs Launched:
Stage-Stage-1: Map: 4 Reduce: 1 Cumulative CPU: 106.93 sec HDFS Read: 574632526 HDFS Write: 2916 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 46 seconds 930 msec
OK
dMH0bHeiRNg judsonlaipply 415 ["Comedy"] 360 79897120 4.65 287260 131356 ["HSoVKUVOnfQ","pv5zWaTEVkI","v1U8f7TmYkA","ZCT3MIUTc3o","mDiBOF8XI44","P9LmHXXWiJs","fo_QVq2lGMs","EtGQgSY9Nn4","5P6UU6m3cqk","61heClTFc5w","h3gdSHGcUU4","OPmYbP0F4Zw","gsOaQGF7kiQ","H2gw9VE16mo","rGkIUlYEQT8","innfyQZHPpo","cu8tUy14zQo","cQ25-glGRzI","WqDbn2iLwwY","jvz0bvYmnto"]
cQ25-glGRzI RCARecords 742 ["Music"] 227 77674728 4.45 188154 186858 ["otMB3WVQNVg","CAoo71VEYhs","6gqSk1UBFyo","pULa7F1gWZM","C2eUCfREAbw","neZw2CH1h9s","2fjW33W7424","aVnl9QCfNyE","DKhnmUdmz74","tPW70sgrHiY","W4kR8OQCrlQ","-WtcXpnMIT8","YhRZx2QIJKg","5iaYFN5eERA","KChJyERIclc","xsRWpK4pf90","lOq-Q60zWQs","ywNZPURFJNU","dMH0bHeiRNg","tvkh29RKFRY"]
12Z3J1uzd0Q kaejane 404 ["Film","Animation"] 615 65341925 3.03 9189 5508 ["innfyQZHPpo","-_CSo1gOd48","1Al4crLW9EM","nhSZs-aAZbo","Af9aPlG2WFw","5P6UU6m3cqk","uBHsOTYO6AI","lW19DnWz6vg","vr3x_RRJdd4","5GE82tqcYYQ","D2kJZOfq7zk","lsO6D1rwrKc","dMH0bHeiRNg","kNLXjXxj3J8","vQbYvjmfbr4","G-HonZGBWus","MuOvqeABHvQ","-2caf6KlSyw","cQ25-glGRzI","pv5zWaTEVkI"]
LpAI8TzQDes IMVUinc 848 ["Entertainment"] 68 65078772 1.38 5678 2499 ["c2eP-UADAi8","5P6UU6m3cqk","cQ25-glGRzI","FYbqcgd97NQ","12Z3J1uzd0Q","VS4wFOWFUt8","RB-wUgnyGv0","sBQLq2VmZcA","w2xUzv6iZWo","oOF3T9Zvizc","244qR7SvvX0","pv5zWaTEVkI","dMH0bHeiRNg","b3u65f4CRLk"]
7AVHXe-ol-s internmarket 603 ["Music"] 264 60349673 3.16 1033 594 ["trrRo3kGRv0","CCsGkN1PEec","l5mQKJDO-nY","DP_4rQcU0G8","7xz5aOvP2YA","kCjKIus3iBA","nD0MILEXZP0","IQcyLMa716k","d05KUL8aW4E","szeeHnu2DM8","7b_wObF6vgs","UazqVaOg9uc","LRBD9l3Nv6Y","108YigkYFgM","_qCau44yfX0","NkkGj4_1m9A","kuyRSrklGU8","uLr7IJYyNnM","IEuyk_277xY","XmAaYo-CQNw"]
244qR7SvvX0 donotasyoudo 960 ["Entertainment"] 6 57790943 1.45 66412 14913 ["v3ARyAb_1Bs","nhSZs-aAZbo","2pNTrYd-4FQ","dMH0bHeiRNg","kHmvkRoEowc","yh0xYO3dYx8","5GE82tqcYYQ","ktUSIJEiOug","6PrDw6T3-rM","-xEzGIuY7kw","innfyQZHPpo","pv5zWaTEVkI","E7QOUJfRs3Y","9oRQJK61MWs"]
ePyRrb2-fzs 1988basti 826 ["Music"] 204 45984219 4.87 51039 27065 ["fm0T7_SGee4","h8oBykb_Pqs","iWg3IMN_rhU","MdOAr_4FJvc","xNp7OxgFJJM","nzaw_Gk9BAU","u9rl4apDFfY","fMzMm4sJYGo","b5eEa5a2Rio","aNR0tW_afdk","Kj_MceyjS6g","GojTUmjxVHU","CGPUuPHdHQg","SIM4DCn7AlE","ktUSIJEiOug","Dn6kGzRSjhU","gxxU68z8quM","HRSrFm_8YK8","2__Qdd11rfA","XewBty95JVg"]
xsRWpK4pf90 universalmusicgroup 902 ["Music"] 240 44614530 4.73 99373 53441 ["a4X7eFbP3u4","4WAapKx2TvM","PgZgubuT2DM","FIUv3dOBbCk","Dk8NQB_T1ws","7NAbTHGIPP8","mekQxrnhQ3I","OouU4PFUw3Y","rwgXvL9o0QQ","4eeyhtlJp5A","ktUSIJEiOug","UZStqFeYk3Y","sF84pIhP5UM","FKXm3Qg7sBo","a1ti0KccvOg","ERV-wh4VwZI","iWg3IMN_rhU","i1PKmEaUqes","gsAN_Zfc1nc","CmBCLWkrysY"]
ktUSIJEiOug aliciakeys 953 ["Music"] 251 43583367 4.85 103074 59672 ["_VD-PDzmW4M","vmjl5ARtQWM","X_SmJpAmirw","glBHQSasVMQ","IGj3T7C-RdE","6U_4ANczMPY","oh3Pkw6cokk","j97WOHviXbE","FqsGSvrcfDE","2PWfB4lurT4","Yci8zVNMEdg","sF84pIhP5UM","dDfEjBSWj54","Zdm7FJktDOM","a4X7eFbP3u4","jhPAK8HjcPI","ePyRrb2-fzs","8_8KJfSNgC4","kN9vm95SocU","4DC4Rb9quKk"]
b3u65f4CRLk srcrecords 729 ["Music"] 256 43511791 4.76 55148 27818 ["4fZF9UsS8LY","5f7kfhHHH_A","H1vaszd6NnA","6Di-dAIR9RA","HiBcQeIax4g","JVciSFc5Wrw","Md6rURKhZmA","IG5ReXP0SSg","D9g2szHsoz0","EflOarvoeVs","7PxBGHjABnU","Gjq1g3j7WFc","mDvCcrU-Ob8","V9YE8dHMxvw","iWg3IMN_rhU","Un2dbprtZFE","WpgMuHwAdC4","RtQ5JENdx5I","ZEYgAgKVuO4","D_2jb8D8AOI"]
Time taken: 172.566 seconds, Fetched: 10 row(s)
4.2 统计出视频类别热度的前10个类型
select tagId, count(a.videoid) as sum from (select videoid,tagId from youtube3 lateral view explode(category) catetory as tagId) a group by a.tagId order by sum desc limit 10;
结果:
hive> select tagId, count(a.videoid) as sum from (select videoid,tagId from youtube3 lateral view explode(category) catetory as tagId) a group by a.tagId order by sum desc limit 10;
Query ID = hadoop_20170710155757_614ec9dd-ffa0-465d-8d62-c47b5ad585f0
Total jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Defaulting to jobconf value of: 2
Starting Job = job_1499153664137_0102, Tracking URL = http://master:8088/proxy/application_1499153664137_0102/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0102
Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 2
2017-07-10 15:58:23,797 Stage-1 map = 0%, reduce = 0%
...
2017-07-10 15:59:16,341 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 19.51 sec
MapReduce Total cumulative CPU time: 19 seconds 510 msec
Ended Job = job_1499153664137_0102
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1499153664137_0103, Tracking URL = http://master:8088/proxy/application_1499153664137_0103/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0103
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2017-07-10 15:59:51,645 Stage-2 map = 0%, reduce = 0%
2017-07-10 16:00:18,373 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.97 sec
2017-07-10 16:00:48,410 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.57 sec
MapReduce Total cumulative CPU time: 2 seconds 570 msec
Ended Job = job_1499153664137_0103
MapReduce Jobs Launched:
Stage-Stage-1: Map: 4 Reduce: 2 Cumulative CPU: 19.51 sec HDFS Read: 47327614 HDFS Write: 900 SUCCESS
Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 2.57 sec HDFS Read: 5596 HDFS Write: 148 SUCCESS
Total MapReduce CPU Time Spent: 22 seconds 80 msec
OK
Entertainment 1304724
Music 1274825
Comedy 449652
Blogs 447581
People 447581
Film 442109
Animation 442109
Sports 390619
Politics 186753
News 186753
Time taken: 185.399 seconds, Fetched: 10 row(s)
4.3 统计出视频观看数最高的50个视频的所属类别
select tagId, count(a.videoid) as sum from (select videoid,tagId from (select * from youtube3 order by views desc limit 20) e lateral view explode(category) catetory as tagId) a group by a.tagId order by sum desc;
结果:
hive> select tagId, count(a.videoid) as sum from (select videoid,tagId from (select * from youtube3 order by views desc limit 20) e lateral view explode(category) catetory as tagId) a group by a.tagId order by sum desc;
Query ID = hadoop_20170710160909_c6cbbe29-4df3-4c0b-ad70-bd34857acc80
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks determined at compile time: 1
Starting Job = job_1499153664137_0104, Tracking URL = http://master:8088/proxy/application_1499153664137_0104/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0104
Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 1
2017-07-10 16:09:48,197 Stage-1 map = 0%, reduce = 0%
2017-07-10 16:10:17,734 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 3.09 sec
...
2017-07-10 16:10:59,048 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 47.74 sec
MapReduce Total cumulative CPU time: 47 seconds 740 msec
Ended Job = job_1499153664137_0104
Launching Job 2 out of 3
Number of reduce tasks not specified. Defaulting to jobconf value of: 2
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1499153664137_0105, Tracking URL = http://master:8088/proxy/application_1499153664137_0105/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0105
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 2
2017-07-10 16:11:35,860 Stage-2 map = 0%, reduce = 0%
2017-07-10 16:12:04,633 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 1.07 sec
2017-07-10 16:12:23,187 Stage-2 map = 100%, reduce = 50%, Cumulative CPU 2.61 sec
2017-07-10 16:12:32,480 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 3.93 sec
MapReduce Total cumulative CPU time: 3 seconds 930 msec
Ended Job = job_1499153664137_0105
Launching Job 3 out of 3
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1499153664137_0106, Tracking URL = http://master:8088/proxy/application_1499153664137_0106/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0106
Hadoop job information for Stage-3: number of mappers: 2; number of reducers: 1
2017-07-10 16:13:09,244 Stage-3 map = 0%, reduce = 0%
2017-07-10 16:13:37,263 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 2.18 sec
2017-07-10 16:14:05,048 Stage-3 map = 100%, reduce = 100%, Cumulative CPU 3.5 sec
MapReduce Total cumulative CPU time: 3 seconds 500 msec
Ended Job = job_1499153664137_0106
MapReduce Jobs Launched:
Stage-Stage-1: Map: 4 Reduce: 1 Cumulative CPU: 47.74 sec HDFS Read: 57839263 HDFS Write: 228 SUCCESS
Stage-Stage-2: Map: 1 Reduce: 2 Cumulative CPU: 3.93 sec HDFS Read: 6065 HDFS Write: 324 SUCCESS
Stage-Stage-3: Map: 2 Reduce: 1 Cumulative CPU: 3.5 sec HDFS Read: 6600 HDFS Write: 53 SUCCESS
Total MapReduce CPU Time Spent: 55 seconds 170 msec
OK
Music 12
Comedy 3
Entertainment 3
Film 2
Animation 2
Time taken: 297.466 seconds, Fetched: 5 row(s)
4.4 统计出观看数最多的前50个视频所关联的视频的所属类别排行
思路:
- 首先筛选出前50个视频所关联的视频,
select * from youtube3 order by views desc limit 50 - 再将结果和youtube3进行join
select explode(relatedId) as videoId from (select * from youtube3 order by views desc limit 50) a) b join youtube3 c on b.videoId = c.videoId - 然后去重
select distinct(b.videoId),c.category from (select explode(relatedId) as videoId from (select * from youtube3 order by views desc limit 50) a) b join youtube3 c on b.videoId = c.videoId - 最后得出所有视频的类别排行
select tagId, count(e.videoid) as sum from (select videoid,tagId from (select distinct(b.videoId),c.category from (select explode(relatedId) as videoId from (select * from youtube3 order by views desc limit 50) a) b join youtube3 c on b.videoId = c.videoId) d lateral view explode(category) catetory as tagId) e group by tagId order by sum desc;
结果:
hive> select tagId, count(e.videoid) as sum from (select videoid,tagId from (select distinct(b.videoId),c.category from (select explode(relatedId) as videoId from (select * from youtube3 order by views desc limit 50) a) b join youtube3 c on b.videoId = c.videoId) d lateral view explode(category) catetory as tagId) e group by tagId order by sum desc;
Query ID = hadoop_20170710170808_cfbfde68-c016-4c5d-860b-fe840a2d50cb
Total jobs = 7
Launching Job 1 out of 7
Number of reduce tasks determined at compile time: 1
Starting Job = job_1499153664137_0111, Tracking URL = http://master:8088/proxy/application_1499153664137_0111/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0111
Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 1
2017-07-10 17:08:48,662 Stage-1 map = 0%, reduce = 0%
2017-07-10 17:09:25,175 Stage-1 map = 17%, reduce = 0%, Cumulative CPU 20.83 sec
...
2017-07-10 17:10:15,276 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 81.01 sec
MapReduce Total cumulative CPU time: 1 minutes 21 seconds 10 msec
Ended Job = job_1499153664137_0111
Stage-10 is filtered out by condition resolver.
Stage-11 is selected by condition resolver.
Stage-2 is filtered out by condition resolver.
17/07/10 17:10:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Execution log at: /tmp/hadoop/hadoop_20170710170808_cfbfde68-c016-4c5d-860b-fe840a2d50cb.log
2017-07-10 05:10:23 Starting to launch local task to process map join; maximum memory = 518979584
2017-07-10 05:10:34 Dump the side-table for tag: 0 with group count: 743 into file: file:/tmp/hadoop/146c18a2-9a94-440d-9302-72e50ede944e/hive_2017-07-10_17-08-08_628_1950388697756865753-1/-local-10009/HashTable-Stage-8/MapJoin-mapfile30--.hashtable
2017-07-10 05:10:34 Uploaded 1 File to: file:/tmp/hadoop/146c18a2-9a94-440d-9302-72e50ede944e/hive_2017-07-10_17-08-08_628_1950388697756865753-1/-local-10009/HashTable-Stage-8/MapJoin-mapfile30--.hashtable (22611 bytes)
2017-07-10 05:10:34 End of local task; Time Taken: 11.392 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 3 out of 7
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1499153664137_0112, Tracking URL = http://master:8088/proxy/application_1499153664137_0112/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0112
Hadoop job information for Stage-8: number of mappers: 4; number of reducers: 0
2017-07-10 17:11:13,493 Stage-8 map = 0%, reduce = 0%
2017-07-10 17:11:57,229 Stage-8 map = 100%, reduce = 0%, Cumulative CPU 15.79 sec
MapReduce Total cumulative CPU time: 15 seconds 790 msec
Ended Job = job_1499153664137_0112
Launching Job 4 out of 7
Number of reduce tasks not specified. Defaulting to jobconf value of: 2
Starting Job = job_1499153664137_0113, Tracking URL = http://master:8088/proxy/application_1499153664137_0113/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0113
Hadoop job information for Stage-3: number of mappers: 3; number of reducers: 2
2017-07-10 17:12:31,982 Stage-3 map = 0%, reduce = 0%
2017-07-10 17:13:17,467 Stage-3 map = 100%, reduce = 100%, Cumulative CPU 6.53 sec
MapReduce Total cumulative CPU time: 6 seconds 530 msec
Ended Job = job_1499153664137_0113
Launching Job 5 out of 7
Number of reduce tasks not specified. Defaulting to jobconf value of: 2
Starting Job = job_1499153664137_0114, Tracking URL = http://master:8088/proxy/application_1499153664137_0114/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0114
Hadoop job information for Stage-4: number of mappers: 2; number of reducers: 2
2017-07-10 17:13:55,123 Stage-4 map = 0%, reduce = 0%
2017-07-10 17:14:22,876 Stage-4 map = 50%, reduce = 0%, Cumulative CPU 0.91 sec
2017-07-10 17:14:23,905 Stage-4 map = 100%, reduce = 0%, Cumulative CPU 1.99 sec
2017-07-10 17:14:40,601 Stage-4 map = 100%, reduce = 50%, Cumulative CPU 3.35 sec
2017-07-10 17:14:52,033 Stage-4 map = 100%, reduce = 100%, Cumulative CPU 4.96 sec
MapReduce Total cumulative CPU time: 4 seconds 960 msec
Ended Job = job_1499153664137_0114
Launching Job 6 out of 7
Number of reduce tasks determined at compile time: 1
Starting Job = job_1499153664137_0115, Tracking URL = http://master:8088/proxy/application_1499153664137_0115/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0115
Hadoop job information for Stage-5: number of mappers: 2; number of reducers: 1
2017-07-10 17:15:27,647 Stage-5 map = 0%, reduce = 0%
2017-07-10 17:15:54,560 Stage-5 map = 50%, reduce = 0%, Cumulative CPU 0.91 sec
2017-07-10 17:15:55,587 Stage-5 map = 100%, reduce = 0%, Cumulative CPU 2.03 sec
2017-07-10 17:16:23,357 Stage-5 map = 100%, reduce = 100%, Cumulative CPU 3.59 sec
MapReduce Total cumulative CPU time: 3 seconds 590 msec
Ended Job = job_1499153664137_0115
MapReduce Jobs Launched:
Stage-Stage-1: Map: 4 Reduce: 1 Cumulative CPU: 81.01 sec HDFS Read: 463074822 HDFS Write: 26949 SUCCESS
Stage-Stage-8: Map: 4 Cumulative CPU: 15.79 sec HDFS Read: 47322670 HDFS Write: 30829 SUCCESS
Stage-Stage-3: Map: 3 Reduce: 2 Cumulative CPU: 6.53 sec HDFS Read: 43694 HDFS Write: 1126 SUCCESS
Stage-Stage-4: Map: 2 Reduce: 2 Cumulative CPU: 4.96 sec HDFS Read: 8888 HDFS Write: 725 SUCCESS
Stage-Stage-5: Map: 2 Reduce: 1 Cumulative CPU: 3.59 sec HDFS Read: 7001 HDFS Write: 207 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 51 seconds 880 msec
OK
Music 399
Entertainment 103
Comedy 94
People 36
Blogs 36
Film 19
Animation 19
Pets 12
Animals 12
UNA 11
Style 6
Politics 6
News 6
Howto 6
Sports 4
Vehicles 3
Autos 3
Travel 2
Events 2
Technology 1
Science 1
Time taken: 496.822 seconds, Fetched: 21 row(s)
4.5 筛选出某个类别(如music)中热度最高的前10个视频
思路:
- 创建一个表格,存储每个视频对应一个标签的信息
create table youtube_category(videoId string, uploader string, age int, categoryId string, length int, views int, rate float, ratings int, comments int,relatedId array)
row format delimited
fields terminated by " "
collection items terminated by "&"
stored as orc; - 将转换后的数据进行插入
insert into table youtube_category select videoid,uploader,age,categoryId,length,views,rate,ratings,comments,relatedId from youtube3 lateral view explode(category) catetory as categoryId; - 根据观看数和类别进行查询
select * from youtube_category where categoryId="Music" order by views desc limit 10;
结果如下:
hive> create table youtube_category(videoId string, uploader string, age int, categoryId string, length int, views int, rate float, ratings int, comments int,relatedId array<string>)
> row format delimited
> fields terminated by " "
> collection items terminated by "&"
> stored as orc;
OK
Time taken: 0.256 seconds
hive> insert into youtube_category select videoid,uploader,age,categoryId,length,views,rate,ratings,comments,relatedId from youtube3 lateral view explode(category) catetory as categoryId;
Query ID = hadoop_20170711091010_7c76d7bd-5af0-43f9-812f-e30c612ee60b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1499153664137_0116, Tracking URL = http://master:8088/proxy/application_1499153664137_0116/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0116
Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 0
2017-07-11 09:11:18,741 Stage-1 map = 0%, reduce = 0%
...
2017-07-11 09:19:05,239 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 402.71 sec
MapReduce Total cumulative CPU time: 6 minutes 42 seconds 710 msec
Ended Job = job_1499153664137_0116
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://bydcluster1/user/hive/warehouse/youtube_category/.hive-staging_hive_2017-07-11_09-10-38_970_8191962088714912782-1/-ext-10000
Loading data to table default.youtube_category
Table default.youtube_category stats: [numFiles=4, numRows=6742209, totalSize=608748677, rawDataSize=10187373874]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 4 Cumulative CPU: 404.25 sec HDFS Read: 574634998 HDFS Write: 608749047 SUCCESS
Total MapReduce CPU Time Spent: 6 minutes 44 seconds 250 msec
OK
Time taken: 511.914 seconds
hive> select * from youtube_category where categoryId="music" order by views desc limit 10;
Query ID = hadoop_20170711091919_a48aa98b-9cfa-4f58-a106-9da85484f0dd
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1499153664137_0117, Tracking URL = http://master:8088/proxy/application_1499153664137_0117/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0117
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
2017-07-11 09:20:37,607 Stage-1 map = 0%, reduce = 0%
...
2017-07-11 09:21:32,382 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 36.15 sec
MapReduce Total cumulative CPU time: 36 seconds 150 msec
Ended Job = job_1499153664137_0117
MapReduce Jobs Launched:
Stage-Stage-1: Map: 3 Reduce: 1 Cumulative CPU: 36.15 sec HDFS Read: 607706126 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 36 seconds 150 msec
OK
Time taken: 96.836 seconds
hive> select * from youtube_category where catagoryId="Music" order by views desc limit 10;
FAILED: SemanticException [Error 10004]: Line 1:37 Invalid table alias or column reference 'catagoryId': (possible column names are: videoid, uploader, age, categoryid, length, views, rate, ratings, comments, relatedid)
hive> select * from youtube_category where catagoryid="Music" order by views desc limit 10;
FAILED: SemanticException [Error 10004]: Line 1:37 Invalid table alias or column reference 'catagoryid': (possible column names are: videoid, uploader, age, categoryid, length, views, rate, ratings, comments, relatedid)
hive> select * from youtube_category where categoryid="Music" order by views desc limit 10;
Query ID = hadoop_20170711092525_53f32ccf-7d04-4615-b446-9009cf16dc7f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1499153664137_0118, Tracking URL = http://master:8088/proxy/application_1499153664137_0118/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0118
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
2017-07-11 09:26:08,727 Stage-1 map = 0%, reduce = 0%
...
2017-07-11 09:27:17,297 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 57.89 sec
MapReduce Total cumulative CPU time: 57 seconds 890 msec
Ended Job = job_1499153664137_0118
MapReduce Jobs Launched:
Stage-Stage-1: Map: 3 Reduce: 1 Cumulative CPU: 58.14 sec HDFS Read: 607706243 HDFS Write: 3043 SUCCESS
Total MapReduce CPU Time Spent: 58 seconds 140 msec
OK
cQ25-glGRzI RCARecords 742 Music 227 77674728 4.45 188154 186858 ["otMB3WVQNVg","CAoo71VEYhs","6gqSk1UBFyo","pULa7F1gWZM","C2eUCfREAbw","neZw2CH1h9s","2fjW33W7424","aVnl9QCfNyE","DKhnmUdmz74","tPW70sgrHiY","W4kR8OQCrlQ","-WtcXpnMIT8","YhRZx2QIJKg","5iaYFN5eERA","KChJyERIclc","xsRWpK4pf90","lOq-Q60zWQs","ywNZPURFJNU","dMH0bHeiRNg","tvkh29RKFRY"]
7AVHXe-ol-s internmarket 603 Music 264 60349673 3.16 1033 594 ["trrRo3kGRv0","CCsGkN1PEec","l5mQKJDO-nY","DP_4rQcU0G8","7xz5aOvP2YA","kCjKIus3iBA","nD0MILEXZP0","IQcyLMa716k","d05KUL8aW4E","szeeHnu2DM8","7b_wObF6vgs","UazqVaOg9uc","LRBD9l3Nv6Y","108YigkYFgM","_qCau44yfX0","NkkGj4_1m9A","kuyRSrklGU8","uLr7IJYyNnM","IEuyk_277xY","XmAaYo-CQNw"]
ePyRrb2-fzs 1988basti 826 Music 204 45984219 4.87 51039 27065 ["fm0T7_SGee4","h8oBykb_Pqs","iWg3IMN_rhU","MdOAr_4FJvc","xNp7OxgFJJM","nzaw_Gk9BAU","u9rl4apDFfY","fMzMm4sJYGo","b5eEa5a2Rio","aNR0tW_afdk","Kj_MceyjS6g","GojTUmjxVHU","CGPUuPHdHQg","SIM4DCn7AlE","ktUSIJEiOug","Dn6kGzRSjhU","gxxU68z8quM","HRSrFm_8YK8","2__Qdd11rfA","XewBty95JVg"]
xsRWpK4pf90 universalmusicgroup 902 Music 240 44614530 4.73 99373 53441 ["a4X7eFbP3u4","4WAapKx2TvM","PgZgubuT2DM","FIUv3dOBbCk","Dk8NQB_T1ws","7NAbTHGIPP8","mekQxrnhQ3I","OouU4PFUw3Y","rwgXvL9o0QQ","4eeyhtlJp5A","ktUSIJEiOug","UZStqFeYk3Y","sF84pIhP5UM","FKXm3Qg7sBo","a1ti0KccvOg","ERV-wh4VwZI","iWg3IMN_rhU","i1PKmEaUqes","gsAN_Zfc1nc","CmBCLWkrysY"]
ktUSIJEiOug aliciakeys 953 Music 251 43583367 4.85 103074 59672 ["_VD-PDzmW4M","vmjl5ARtQWM","X_SmJpAmirw","glBHQSasVMQ","IGj3T7C-RdE","6U_4ANczMPY","oh3Pkw6cokk","j97WOHviXbE","FqsGSvrcfDE","2PWfB4lurT4","Yci8zVNMEdg","sF84pIhP5UM","dDfEjBSWj54","Zdm7FJktDOM","a4X7eFbP3u4","jhPAK8HjcPI","ePyRrb2-fzs","8_8KJfSNgC4","kN9vm95SocU","4DC4Rb9quKk"]
b3u65f4CRLk srcrecords 729 Music 256 43511791 4.76 55148 27818 ["4fZF9UsS8LY","5f7kfhHHH_A","H1vaszd6NnA","6Di-dAIR9RA","HiBcQeIax4g","JVciSFc5Wrw","Md6rURKhZmA","IG5ReXP0SSg","D9g2szHsoz0","EflOarvoeVs","7PxBGHjABnU","Gjq1g3j7WFc","mDvCcrU-Ob8","V9YE8dHMxvw","iWg3IMN_rhU","Un2dbprtZFE","WpgMuHwAdC4","RtQ5JENdx5I","ZEYgAgKVuO4","D_2jb8D8AOI"]
iWg3IMN_rhU TimbalandMusic 853 Music 216 43323757 4.8 58761 36142 ["SIM4DCn7AlE","GojTUmjxVHU","ePyRrb2-fzs","2__Qdd11rfA","P_TKpULdDvo","Dn6kGzRSjhU","1iLDIj0pDHk","xsRWpK4pf90","XewBty95JVg","y4jiyjDQGLY","h8oBykb_Pqs","xNp7OxgFJJM","b3u65f4CRLk","S-783rzHIS4","cZd1Js0QaOI","kZGEgVxyPHU","43o0vwAmFM8","BMMUWvavORI","v_-1peCW6Ok","9gkjyM7ZOUc"]
innfyQZHPpo chai0322 468 Music 210 41564032 2.61 13564 6126 ["nhSZs-aAZbo","hh0nVc0NYTE","12Z3J1uzd0Q","V1s9queYhF8","1Al4crLW9EM","61e1h4vALS0","8h98jb9Lk74","Z-HjmP7BCVc","o_pIQIV_NuU","82BDV1gYjdM","Af9aPlG2WFw","cC27PTFP4V8","-_CSo1gOd48","00c08ijIlHo","cIKxlWuTviE","dMH0bHeiRNg","iWg3IMN_rhU","v3ARyAb_1Bs","5P6UU6m3cqk","QjA5faZF1A8"]
Lt6o8NlrbHg seankingston 867 Music 257 41171303 4.74 101352 67579 ["qwflMOAaOf0","aVB5ViG_0yE","_QmajsQI9SM","Skz6h2gb-t8","t2dkCGDgVzk","HJgsbsMSe5w","BXhN2DbI0hc","sFJQAfW_jgw","RZJ32D-lrTE","U1bf7XLGGRE","ktUSIJEiOug","TWEez0U-9ag","BhkIjh-nuRo","a4X7eFbP3u4","b3u65f4CRLk","sm2fTDpuyyM","9pzro0T8rgY","TBzfqR4mrgE","c2cyose42jU","iWg3IMN_rhU"]
QjA5faZF1A8 guitar90 308 Music 320 40294882 4.83 329108 169563 ["r2BOApUvFpw","ATub40Npxik","m7Jh1BV1EOc","owAj5LiXG5w","Ddn4MGaS3N4","GxplDa3M5Io","by8oyJztzwo","aZpD0btOZx8","pZ9jrBg4Lwc","heISA256CRo","6wpPk8qk3uQ","i4BYMvVvMg0","2xjJXT0C0X4","p-VR-cXghko","-9ao_vOsZkg","wfqu4YEufYc","WetVXbYRfWk","JdxkVQy7QLM","fdjamy75C4A","dVUgd8ot6BE"]
Time taken: 111.631 seconds, Fetched: 10 row(s)
4.6 筛选出每个类别中评分最高的前10个视频
select * from youtube_category where categoryId="Music" order by ratings desc limit 10;
结果如下:
hive> select * from youtube_category where categoryId="Music" order by ratings desc limit 10;
Query ID = hadoop_20170711093838_6ab5573a-964f-486e-b0dc-77e5853fcfb5
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1499153664137_0119, Tracking URL = http://master:8088/proxy/application_1499153664137_0119/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0119
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
2017-07-11 09:39:27,903 Stage-1 map = 0%, reduce = 0%
...
2017-07-11 09:40:36,298 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 57.41 sec
MapReduce Total cumulative CPU time: 57 seconds 410 msec
Ended Job = job_1499153664137_0119
MapReduce Jobs Launched:
Stage-Stage-1: Map: 3 Reduce: 1 Cumulative CPU: 57.41 sec HDFS Read: 607706243 HDFS Write: 2957 SUCCESS
Total MapReduce CPU Time Spent: 57 seconds 410 msec
OK
QjA5faZF1A8 guitar90 308 Music 320 40294882 4.83 329108 169563 ["r2BOApUvFpw","ATub40Npxik","m7Jh1BV1EOc","owAj5LiXG5w","Ddn4MGaS3N4","GxplDa3M5Io","by8oyJztzwo","aZpD0btOZx8","pZ9jrBg4Lwc","heISA256CRo","6wpPk8qk3uQ","i4BYMvVvMg0","2xjJXT0C0X4","p-VR-cXghko","-9ao_vOsZkg","wfqu4YEufYc","WetVXbYRfWk","JdxkVQy7QLM","fdjamy75C4A","dVUgd8ot6BE"]
cQ25-glGRzI RCARecords 742 Music 227 77674728 4.45 188154 186858 ["otMB3WVQNVg","CAoo71VEYhs","6gqSk1UBFyo","pULa7F1gWZM","C2eUCfREAbw","neZw2CH1h9s","2fjW33W7424","aVnl9QCfNyE","DKhnmUdmz74","tPW70sgrHiY","W4kR8OQCrlQ","-WtcXpnMIT8","YhRZx2QIJKg","5iaYFN5eERA","KChJyERIclc","xsRWpK4pf90","lOq-Q60zWQs","ywNZPURFJNU","dMH0bHeiRNg","tvkh29RKFRY"]
pv5zWaTEVkI OkGo 531 Music 184 32022043 4.83 121516 39974 ["dMH0bHeiRNg","x49WZRyXGe0","RbdbVhBGETQ","yRmqZRPgK1w","DjCL0_0Il7w","gq7r3F1SoX0","k66epna2Sss","1LNbzqoOPu4","vr3x_RRJdd4","ERV-wh4VwZI","xsRWpK4pf90","iAQZ_uui1SY","5P6UU6m3cqk","bav63MWNUKg"]
ktUSIJEiOug aliciakeys 953 Music 251 43583367 4.85 103074 59672 ["_VD-PDzmW4M","vmjl5ARtQWM","X_SmJpAmirw","glBHQSasVMQ","IGj3T7C-RdE","6U_4ANczMPY","oh3Pkw6cokk","j97WOHviXbE","FqsGSvrcfDE","2PWfB4lurT4","Yci8zVNMEdg","sF84pIhP5UM","dDfEjBSWj54","Zdm7FJktDOM","a4X7eFbP3u4","jhPAK8HjcPI","ePyRrb2-fzs","8_8KJfSNgC4","kN9vm95SocU","4DC4Rb9quKk"]
Lt6o8NlrbHg seankingston 867 Music 257 41171303 4.74 101352 67579 ["qwflMOAaOf0","aVB5ViG_0yE","_QmajsQI9SM","Skz6h2gb-t8","t2dkCGDgVzk","HJgsbsMSe5w","BXhN2DbI0hc","sFJQAfW_jgw","RZJ32D-lrTE","U1bf7XLGGRE","ktUSIJEiOug","TWEez0U-9ag","BhkIjh-nuRo","a4X7eFbP3u4","b3u65f4CRLk","sm2fTDpuyyM","9pzro0T8rgY","TBzfqR4mrgE","c2cyose42jU","iWg3IMN_rhU"]
xsRWpK4pf90 universalmusicgroup 902 Music 240 44614530 4.73 99373 53441 ["a4X7eFbP3u4","4WAapKx2TvM","PgZgubuT2DM","FIUv3dOBbCk","Dk8NQB_T1ws","7NAbTHGIPP8","mekQxrnhQ3I","OouU4PFUw3Y","rwgXvL9o0QQ","4eeyhtlJp5A","ktUSIJEiOug","UZStqFeYk3Y","sF84pIhP5UM","FKXm3Qg7sBo","a1ti0KccvOg","ERV-wh4VwZI","iWg3IMN_rhU","i1PKmEaUqes","gsAN_Zfc1nc","CmBCLWkrysY"]
K2cYWfq--Nw FrEckleStudios 841 Music 224 17005186 4.82 90991 50788 ["SyIC3Munnyw","cZd1Js0QaOI","lLYD_-A_X5E","alqM0IYeH54","wQVEPFzkhaM","oGECJP3phyY","nPLOiBM8hLk","VpBDqtUEWcM","MJPdVVOmbz4","bPZJYQXQsm8","nPBmXEO3yUU","3jzSh_MLNcY","_EXeBLvmllg","Cva_sGN_0VA","Sr2JneittqQ","SYpYmkcadRA","71eBjNbdXDg","DgBgnoEY4iM","bl6RJyZdBSU","FAK_jtOf70g"]
-xEzGIuY7kw alyankovic 580 Music 171 24095019 4.84 90091 45517 ["HYokLWfqbaU","N26KWq7MmSc","JCAt9WcCFbM","Rt1_6uz_sVU","Nh9mVsBKwYs","p9Zt8mn14hY","8n7ncJEFuSw","FT060JGp9sQ","E6Zc9NyYH-k","ixyTNd-Ln38","zIllRdSzSug","GsfVw9xxoNY","XkDeJgGrtdU","fqz1ojIQTBk","5GE82tqcYYQ","ODdGhOOUOpI","v3ARyAb_1Bs","XbVtbc_XzrI","U1ULxKM75rY","Jw00EUh0GT4"]
EwTZ2xpQwpA TayZonday 796 Music 292 16841569 4.23 83514 129200 ["2x2W12A8Qow","P6dUCOS1bM0","NattlyH0IeM","nTQOpibv_OA","9mSKBgvHdoE","hjD6iigdB-g","caIBKOztlAo","xUz2YMmiq0k","aWY3eYOX3U0","mD5_GUovjiM","1oFS-q8BIps","deXAEN70CDY","0pElTyjfxe0","eyDuGwlrFRs","m6SjPfc_xNA","qYGvGWY1FDs","N0amCfgnwY8","JPu4uErBFks","pgSA-ErKd8c","ZZgGGlOGyUg"]
xWHf_vYZzQ8 universalmusicgroup 771 Music 262 25607299 4.79 83419 70529 ["jvz0bvYmnto","zElEs8yw7fw","9FcBnaLjxY4","95wgKdSJGDo","ueOgZPBXY4A","k3O5uy-MBBk","O3sGTaQ9s9c","aT434G38OBg","4PdDPrwIwhI","Y-VifE8EK8w","9wpxno6qUd0","B17SYROT3GI","wJtUuxmm-B0","LBDnkJ5h1ho","CfzpBjKpSPE","QF2AmC2xyXM","eoa6Gx4HxTc","mvPvcV44rCc"]
Time taken: 109.053 seconds, Fetched: 10 row(s)
4.7 找出用户中上传视频最多的10个用户的所有视频
思路:
- 筛选出用户表中上传视频数最多的前十位
select * from user order by videos desc limit 10; - 将筛选结果跟youtube3进行join得出结果
select b.*,a.videos,a.friends from (select * from user order by videos desc limit 10) a join youtube3 b on a.uploader = b.uploader order by views desc limit 20;
结果如下:
hive> select b.*,a.videos,a.friends from (select * from user order by videos desc limit 10) a join youtube3 b on a.uploader = b.uploader order by views desc limit 20;
Query ID = hadoop_20170711101616_d9ebb69e-7fbe-4f09-ad55-54f4dfd11ebe
Total jobs = 5
Launching Job 1 out of 5
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1499153664137_0125, Tracking URL = http://master:8088/proxy/application_1499153664137_0125/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0125
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-07-11 10:17:30,594 Stage-1 map = 0%, reduce = 0%
2017-07-11 10:18:03,439 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.03 sec
2017-07-11 10:18:25,025 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 10.28 sec
MapReduce Total cumulative CPU time: 10 seconds 280 msec
Ended Job = job_1499153664137_0125
Stage-8 is filtered out by condition resolver.
Stage-9 is selected by condition resolver.
Stage-2 is filtered out by condition resolver.
17/07/11 10:18:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Execution log at: /tmp/hadoop/hadoop_20170711101616_d9ebb69e-7fbe-4f09-ad55-54f4dfd11ebe.log
2017-07-11 10:18:33 Starting to launch local task to process map join; maximum memory = 518979584
2017-07-11 10:18:44 Dump the side-table for tag: 0 with group count: 10 into file: file:/tmp/hadoop/146c18a2-9a94-440d-9302-72e50ede944e/hive_2017-07-11_10-16-50_653_2968482866464413222-1/-local-10007/HashTable-Stage-6/MapJoin-mapfile90--.hashtable
2017-07-11 10:18:44 Uploaded 1 File to: file:/tmp/hadoop/146c18a2-9a94-440d-9302-72e50ede944e/hive_2017-07-11_10-16-50_653_2968482866464413222-1/-local-10007/HashTable-Stage-6/MapJoin-mapfile90--.hashtable (611 bytes)
2017-07-11 10:18:44 End of local task; Time Taken: 11.244 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 3 out of 5
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1499153664137_0126, Tracking URL = http://master:8088/proxy/application_1499153664137_0126/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0126
Hadoop job information for Stage-6: number of mappers: 4; number of reducers: 0
2017-07-11 10:19:25,285 Stage-6 map = 0%, reduce = 0%
...
2017-07-11 10:20:26,162 Stage-6 map = 100%, reduce = 0%, Cumulative CPU 44.09 sec
MapReduce Total cumulative CPU time: 44 seconds 90 msec
Ended Job = job_1499153664137_0126
Launching Job 4 out of 5
Number of reduce tasks determined at compile time: 1
Starting Job = job_1499153664137_0127, Tracking URL = http://master:8088/proxy/application_1499153664137_0127/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0127
Hadoop job information for Stage-3: number of mappers: 2; number of reducers: 1
2017-07-11 10:21:01,746 Stage-3 map = 0%, reduce = 0%
2017-07-11 10:21:30,462 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 4.27 sec
2017-07-11 10:21:48,968 Stage-3 map = 100%, reduce = 100%, Cumulative CPU 6.36 sec
MapReduce Total cumulative CPU time: 6 seconds 360 msec
Ended Job = job_1499153664137_0127
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 10.28 sec HDFS Read: 9796823 HDFS Write: 434 SUCCESS
Stage-Stage-6: Map: 4 Cumulative CPU: 44.09 sec HDFS Read: 574641054 HDFS Write: 2133846 SUCCESS
Stage-Stage-3: Map: 2 Reduce: 1 Cumulative CPU: 6.36 sec HDFS Read: 2144607 HDFS Write: 6335 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 0 seconds 730 msec
OK
lUOe76YPY7M expertvillage 801 ["Howto","Style"] 119 1014983 3.36 596 689 ["5cEsa-rswrg","HOoQPbcF8Rk","ht3DMDx1spo","VFnf17cp3Eg","IP35rm_31RQ","hXuiuwZPX0M","s8pH5jgSuP8","KIEkwm_6DD4","ovFSahIPaVQ","rQtwcSDh3Hk","COeT3WR7SFc","ZNLBKRpWt7w","qncPBZ9DRRk","tOBqg3yDlyE","YgfWUXsbr7w","qBB3AsgjN1M","XX8ouWx6xm8","BtF9dYRuiGU","NRanI5qR81I","evfl0NGLjVE"] 86228 5659
2Aj2wx9PYxI expertvillage 815 ["Howto","Style"] 170 918553 3.82 1596 1533 ["hb5QaCfm7bg","YVDPSZe21KI","C1elpMjZ4wE","8gD1RG-tnNk","d-IDrRSVkCQ","IAOr6SYGAyw","3mbx03mP5eg","o-pN8qAiZhQ","K_qw03-3gFg","JgsO3A_vbtk","ffKr98Ium-M","WKYcBKa8_7Q","dgeX8CZmqvo","jgUCs72RDHs","PaNhROW-wEo","WhRCVm-1r2k","5I5O8P-r5Rk","reJSIZ3ugsE","-51iHvqP0Rs","9wItsn3r_kc"] 86228 5659
VBGHer3yFyc libertaddigitaltv 998 ["News","Politics"] 29 828616 4.68 1337 10763 ["HL9p8I3lOFA","Fb3HZ7K1gTk","AARX6XOA5h0","t3DPDKbRxio","VAmz8MNZdlk","WrC6_uBe4eA","d9X8DYOA5DE","pZuZv8UK7YA","NwswzwoA4pA","g1J2gvjp4fs","3SMhrQZdABc","h6M42Il4kN8","izM_JwEkWPs","_FM_IoPIFq4","uMZCCPtl_Do","5cZFinVFubQ","om0iHwO4d6o","lUZxlXkbaxM","iSJ3qKpC-3o","tRdq9_Si0Ws"] 6874 1
07jnqD8wvyE expertvillage 776 ["Howto","Style"] 217 574365 4.55 1073 1526 ["3wNpOp50uGI","-taU9d26wT4","_kly-fVUi1I","kZtRmnly_sU","LKe50AJvhz8","0GAUnuuBkW4","eM6Bpz6-dio","i9xf62PKC5M","mwFfk7igYC0","ilLTA4p4MMw","qyoLuTjguJA","aC-KOYQsIvU","0V6LWQZjYRk","_uZkvzYEXp0","auQbi_fkdGE","qgiUSEpg8Xc","4I79yc3uKfE","7Lz-XGjynN8","taHOwsdQXWI","2Aj2wx9PYxI"] 86228 5659
NDwZcLGekE8 expertvillage 801 ["Howto","Style"] 94 491582 4.35 553 588 ["CEmYX76UJy0","0sF-PoXR9ZU","qDfoNYpozxk","T4P3yp6mFfY","rzcj_Wq6BIE","A2LTVhqHAdo","4gyH0mJPqY8","m3PsQCMz3sY","wckYj_PAhgM","YkvwdM9Xstc","Ph2UAjGfeB4","vT_uXPfSAVE","89mAouUu8YM","0t5pPZ4AXX0","sLNFN75CYbw","QQGHXEPfMcQ","p6gcz4hkOmw","qk6R-X-YmUw","NR_9VlHFOu8","0gA_3BAxtVM"] 86228 5659
CEmYX76UJy0 expertvillage 801 ["Howto","Style"] 90 422284 4.48 448 459 ["NDwZcLGekE8","jRAKgd50Jh0","0sF-PoXR9ZU","IjjJWXgqWL4","m3PsQCMz3sY","8Di58l1dfPA","DpEVqy954OY","dmYMeImNy1s","qk6R-X-YmUw","wckYj_PAhgM","qZSD3JLPURk","YkvwdM9Xstc","fdSFh_z0HJY","n5LDyWyHJTo","Dmm5O1DPQYc","A2LTVhqHAdo","I2mM0r76b7o","iE16Z9Gh3TQ","rzcj_Wq6BIE","8szPfWiK-ag"] 86228 5659
E6VWi7IaroA expertvillage 946 ["Howto","Style"] 228 338889 4.46 300 251 ["fLy3M2jfe4w","cC1FBmbYgMk","xEJhInSxsQg","RxMLoqzKZ8s","Z9a--4RQ9O8","7neOXKLrgzk","JS-lS9ngMtY","aKIFfgPLY8I","EsqPNhtkWIo","NJUPNtZN2rc","vV7pajY0zOk","GTtZapaRDbg","-0UVDD6ZwYE","B9jNaUP59Vo","2QDGPe50E4Y","lJDZO_pX4aw","wuD2CemlvP0","ZTx2ZFI1spg","CzZghYdupiQ","5ueEKSJ8J-U"] 86228 5659
0sF-PoXR9ZU expertvillage 801 ["Sports"] 77 326365 4.21 512 663 ["24FJ58Q22D4","m3PsQCMz3sY","qk6R-X-YmUw","CEmYX76UJy0","NDwZcLGekE8","QiIunH47qew","I2mM0r76b7o","c_qSPdLPReM","5eYBaQ5Cg-c","fdSFh_z0HJY","8szPfWiK-ag","Dmm5O1DPQYc","A2LTVhqHAdo","n-Cvzk84X8o","T4P3yp6mFfY","1vo3CCBIcaY","fnGFR1AFCJc","n5LDyWyHJTo","YkvwdM9Xstc","emNfLmLTQb0"] 86228 5659
ohdiRHLSw5Y expertvillage 447 ["Howto","Style"] 78 321521 3.79 139 102 ["12_nJamoyTk","dXLhjYgMZ68","ukzFwRoFj4k","sOUgrJo2kIg","HhO39nCDfMg","3iVP0tzwhVc","MFNA0PqLynY","ZlcMzLhiBjg","7F8ajh_DDYs","6vaPIz6S6sk","-L_uhGXOtF0","-libzR5AV58","CSdSH7XKTtQ","KZX5jXPAWIk","Vkj5fbrQpNs","YCgnFIk5Acg","hjPDmVf6KJw","BZnhMl85dq4","iuqVpMdb1NM","GO2_3q6euug"] 86228 5659
0sF-PoXR9ZU expertvillage 801 ["Howto","Style"] 77 294737 4.21 438 599 ["24FJ58Q22D4","m3PsQCMz3sY","CEmYX76UJy0","NDwZcLGekE8","A2LTVhqHAdo","qk6R-X-YmUw","5eYBaQ5Cg-c","fdSFh_z0HJY","jGdXcitOUzY","n-Cvzk84X8o","wckYj_PAhgM","I2mM0r76b7o","QMZaxtjhZ5k","Dmm5O1DPQYc","vT_uXPfSAVE","8szPfWiK-ag","QiIunH47qew","YkvwdM9Xstc","DpEVqy954OY","c_qSPdLPReM"] 86228 5659
3Xc-kxSznZ0 expertvillage 430 ["Howto","Style"] 107 292329 3.33 565 964 ["9YAKfkVvvEc","wEkcPjBaHjs","zMl3ixv1kHw","Q4ZUPEgbzu0","-jIEGZwLPvo","YxdxGLBCCRA","Jp0LaJ_ftT0","PgCRWvwdFHM","yZL2MOFk6I0","qDJk3-ofbk0","e4nHAAuDiPE","cuRk9bTbU6A","z53cPMciVio","MREw0dIWaQ0","237AQAvXtw8","a3dnOupmt7Y","3-va5g_QVss","mFdaiY8YhEY","D6li4tYmKf0","bEVHBHKqBfQ"] 86228 5659
VxYkdycN3WE expertvillage 801 ["Howto","Style"] 77 257702 4.19 153 137 ["ZpCE7mnYJK0","NYDdCGtbr8E","904S9W9AZg8","Rl5KEODnCfI","7-i2gl9288I","TC1XlhRwhsU","rD_4nCKf5Ns","pFKZZ120fyU","oN4Esih54V8","N1Ov6cCUXkc","rDM97S4jPiw","IK2uLNND3i0","OZXbM6kRuuA","5pvyXkFTa18","bExUJAnCLZw","VoREkZvkyI8","BWhfpzAItx4","EEybT3IeyzA","QteH0KOJx0g","dbXHvd_SGkk"] 86228 5659
-libzR5AV58 expertvillage 447 ["Howto","Style"] 391 249682 3.34 73 62 ["-L_uhGXOtF0","4LvTYBbgMjE","-wO07skEauo","CSdSH7XKTtQ","kzQDoXKDgTM","e1kzLj-FZ6k","Guhgb3pWRAQ","KIbyySDSGEc","DGUBzs-GNxE","KZX5jXPAWIk","ohdiRHLSw5Y","6vaPIz6S6sk","5MrAuq_F2dU","gOv-yPqZ3LE","BZnhMl85dq4","oCDdDCqygOg","12_nJamoyTk","7F8ajh_DDYs","HhO39nCDfMg","X3-6IT-7S80"] 86228 5659
HL9p8I3lOFA libertaddigitaltv 998 ["News","Politics"] 17 235313 4.61 114 581 ["VBGHer3yFyc","Fb3HZ7K1gTk","Z3Pe8ff37gk","t3DPDKbRxio","pZuZv8UK7YA","_FM_IoPIFq4","1P_EQjd7lq4","yEzNx2g7ors","_neKPvLdG8Y","7i_WDprxACU","zCM0nsym0cE","om0iHwO4d6o","h6M42Il4kN8","Bjk1McjrWtg","VAmz8MNZdlk","-t2hwljZmA8","gH6PHFrA6g8","ysAwzxqbWD8","U5lX_EIFOOA","4zu_X_3qHLA"] 6874 1
oE5ZhBHy6Rs expertvillage 776 ["Howto","Style"] 143 234551 4.64 358 280 ["rCWRUtqJgzI","476vNb6thyM","eg3eo8x_9rw","grK43Poye1U","HGncDLMNPRI","iylaKSDfLSc","JFqiDcvRW2Y","XrbXGRLIDTc","NmxoR9lsu-c","JW8FJefw82k","n0Mi550FDng","QXsW44Wh6tY","6oJDYT5MlEA","IJy05ssELhg","TvT-M7tTQHE","GYNVEWCO_X0","5pJ_o19GVRA","q_Nn_CnvD9s","uqpQNkFxVAU","KKP8vWeEc30"] 86228 5659
wOnP_oAXUMA expertvillage 429 ["Howto","Style"] 120 228842 3.56 236 601 ["BlDWdfTAx8o","WF1kRLJ_4B8","6bu9csQC45c","xTAYAl4g7HE","871VTEF7qIY","UheCchftswc","TKg_cdwq9l4","ADQ4Bjq42wE","cSJCDcAKShA","SYklbxHP2tI","fktuPPNkQpQ","L6267-JZu3E","_IwJ3Aj2XrQ","hf-4ppNmH-U","aO8evzfTR8E","E-kNUEv0YgA","Hsr1-xcXFL8","VCiLZMjo_PQ","rMEgKgZOJi8","YeYU_OE5oIU"] 86228 5659
MVPCc4eODys expertvillage 448 ["Howto","Style"] 69 225879 4.29 171 315 ["iPPgmvcsJNw","xc8sysT31mQ","svt_fnKE_80","kBjUDCyDCuI","ysa50-plo48","yutDlQL9Ct0","gCra4qOrjFw","CN1QJDGinwE","dFJjaj7pXsA","LvJzFdcYSag","4I79yc3uKfE","AcryZ1o4RQE","WyyB0M6wAV8","A0sOVKbYR20","8XCyd0XEejc","5BmVKKpu_CU","8_QNzZ9WPRo","pNiX_l-HEGM","CQJSZs-euZU","ZwHHYzK8UCg"] 86228 5659
7neOXKLrgzk expertvillage 946 ["Howto","Style"] 111 215943 4.07 215 314 ["xEJhInSxsQg","PSR5kinMX3Y","E6VWi7IaroA","5ovGW0CzQp4","DeMcNhz0Vz8","CzZghYdupiQ","72EYsQEw35k","tScm-eZInBE","Hz4lnt7d88s","dgLUbXSqwSs","G3P4VFrB_zM","pU0O2-o67C0","pDGP8Q3Zzb8","hQhdK8l6CuY","VK9VflDsunI","LWgzG8I9bWY","csL7WNTWK9U","LE93Q5k5-fI","Uirspi_t3xM","wGLNJK4zcdE"] 86228 5659
AaO1aLwk53Y expertvillage 443 ["Howto","Style"] 156 204450 4.14 210 290 ["Q8YYZGvKM-8","Bn59zha-uAQ","BjnkKuI-YAY","aQeXHXkL6ow","Qtp2ibwd2Ms","nstbrkjk3OM","Odj4ATUPh9w","v2mm2-9JQ-I","328YVJVtMKU","B-jzkyKxDLo","yB-yGNxNEZg","b0msCCnzmNA","sfh5AJhRto8","SEILiSkGzrY","jm4uxgVDU6o","svOBjuvvy-k","y_1nj8fBQOE","zb8ArkvxOxo","VgWayeJdV0w","Yor1dHH6orE"] 86228 5659
m3PsQCMz3sY expertvillage 801 ["Howto","Style"] 99 203912 4.38 177 207 ["CEmYX76UJy0","0sF-PoXR9ZU","NDwZcLGekE8","qk6R-X-YmUw","DpEVqy954OY","Ph2UAjGfeB4","fdSFh_z0HJY","YkvwdM9Xstc","8szPfWiK-ag","Dmm5O1DPQYc","T4P3yp6mFfY","I2mM0r76b7o","7GtVwKZBf9U","mugebyUqK6o","qDfoNYpozxk","XA7dRqkp8YA","a28QcMXjHyI","n5LDyWyHJTo","ovoNU_CbmfA","Q2j6MlACIoc"] 86228 5659
Time taken: 300.414 seconds, Fetched: 20 row(s)
4.8 筛选出每个类别中观看数Top10
** select a.* from (select videoId,categoryId,views,row_number() over(partition by categoryId order by views desc) rank from youtube_category) a where rank<=10;**
hive> select a.* from (select videoId,categoryId,views,row_number() over(partition by categoryId order by views desc) rank from youtube_category) a where rank<=10;
Query ID = hadoop_20170713170101_76ffdd80-e72c-4c7f-b9ad-9b8c3f7e17ab
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 3
Starting Job = job_1499153664137_0143, Tracking URL = http://master:8088/proxy/application_1499153664137_0143/
Kill Command = /home/hadoop/app/hadoop/bin/hadoop job -kill job_1499153664137_0143
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 3
2017-07-13 17:01:52,690 Stage-1 map = 0%, reduce = 0%
2017-07-13 17:02:24,374 Stage-1 map = 44%, reduce = 0%, Cumulative CPU 10.49 sec
...
2017-07-13 17:03:08,803 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 89.07 sec
MapReduce Total cumulative CPU time: 1 minutes 29 seconds 70 msec
Ended Job = job_1499153664137_0143
MapReduce Jobs Launched:
Stage-Stage-1: Map: 3 Reduce: 3 Cumulative CPU: 89.07 sec HDFS Read: 73070026 HDFS Write: 7487 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 29 seconds 70 msec
OK
qruSOZq-wJg Activism 2673823 1
tFxk7glmMbo Activism 1063928 2
dYN278GB2Kg Activism 831836 3
n-Akqik3hME Activism 778987 4
LtqRAYWjz2Q Activism 768422 5
2v-whrgAbf4 Activism 742172 6
ja4d02s0WKQ Activism 733014 7
IpAOYskH1s8 Activism 714580 8
ieuVTOJyLBs Activism 640994 9
6CXBgbP_ZKs Activism 562163 10
12Z3J1uzd0Q Animation 65341925 1
bFytHZHFXhA Animation 38937813 2
sdUUx5FdySs Animation 16151661 3
vQbYvjmfbr4 Animation 12114231 4
rNKefRRvV7g Animation 11884311 5
6HIavxnUHls Animation 11735507 6
gm5DHKI8o5o Animation 9381674 7
316BF17k5d8 Animation 9372064 8
ZuQMn6Z0k_w Animation 9195588 9
6B26asyGKDo Animation 8915176 10
dMH0bHeiRNg Comedy 79897120 1
5P6UU6m3cqk Comedy 42525795 2
Tx1XIm6q4r4 Comedy 38378910 3
Q5im0Ssyyus Comedy 21170471 4
k66epna2Sss Comedy 17944367 5
AYxu_MQSTTY Comedy 15665340 6
pYak2F1hUYA Comedy 15634357 7
_OBlgSz8sSM Comedy 14567743 8
wCF3ywukQYA Comedy 14227802 9
sHzdsFiBbFc Comedy 13606292 10
Mz9BlgiV45I Education 2807232 1
VnGb6UQvwJs Education 2441432 2
j6lADdhmAec Education 2306689 3
pYfTQpsErsM Education 2066429 4
7GNE6AhL0qI Education 2037257 5
NiGZf15W9xc Education 1867126 6
-FeAK-q5Cok Education 1741504 7
N5P6o2YaqVI Education 1656512 8
Dl-agXA63nw Education 1529566 9
JX3VmDgiFnY Education 1373008 10
cQ25-glGRzI Music 77674728 1
7AVHXe-ol-s Music 60349673 2
ePyRrb2-fzs Music 45984219 3
xsRWpK4pf90 Music 44614530 4
ktUSIJEiOug Music 43583367 5
b3u65f4CRLk Music 43511791 6
iWg3IMN_rhU Music 43323757 7
innfyQZHPpo Music 41564032 8
Lt6o8NlrbHg Music 41171303 9
QjA5faZF1A8 Music 40294882 10
qruSOZq-wJg Nonprofits 2673823 1
tFxk7glmMbo Nonprofits 1063928 2
dYN278GB2Kg Nonprofits 831836 3
n-Akqik3hME Nonprofits 778987 4
LtqRAYWjz2Q Nonprofits 768422 5
2v-whrgAbf4 Nonprofits 742172 6
ja4d02s0WKQ Nonprofits 733014 7
IpAOYskH1s8 Nonprofits 714580 8
ieuVTOJyLBs Nonprofits 640994 9
6CXBgbP_ZKs Nonprofits 562163 10
SkELRp4wKPs Politics 12730681 1
AFFQrUyi8-s Politics 11953598 2
YgW7or1TuFk Politics 8516433 3
JgiGrXpOhYg Politics 8211043 4
up5jmbSjWkw Politics 7869023 5
hr23tpWX8lM Politics 7209885 6
Kje7NUNebL8 Politics 6707868 7
I4u3449L5VI Politics 6675798 8
jjXyqcx-mYY Politics 6462051 9
a9Vde3FHMmc Politics 6028642 10
8h98jb9Lk74 UNA 33880568 1
AR5yq0aMxI8 UNA 18409445 2
DH7FVB15EPU UNA 15487752 3
Qvfx6UiAAG0 UNA 13977592 4
n-FdoZdfpmE UNA 12241555 5
MddPeH1DAvY UNA 12211192 6
sI8Sus_KRpY UNA 10460047 7
14oqm9ywLIw UNA 10344366 8
LytRWuhn4BM UNA 9899270 9
iSByZARPJSU UNA 9316897 10
Oro28yg7W74 Gaming 727015 1
0to1HDxtkM4 Gaming 529346 2
M-w-gLVfi30 Gaming 396471 3
i6aH2F1WjsY Gaming 382350 4
EgbUSsblCSQ Gaming 266718 5
7SMnWr9MqwQ Gaming 210147 6
NQMBIRipp5A Gaming 207783 7
vi1lVqJSbsM Gaming 186709 8
aId2hK4pI2M Gaming 186054 9
tWPWcyLdrko Gaming 181979 10
sLGLum5SyKQ Howto 31121122 1
eMhGpzyFdhE Howto 16320501 2
KPOOWvP_dd8 Howto 11131527 3
91wuBqlny50 Howto 8579613 4
mr5ghuaTK14 Howto 8361812 5
GfPJeDssBOM Howto 6101232 6
6gmP4nk0EOE Howto 4970382 7
mM-30cmM33s Howto 4936417 8
STQ3nhXuuEM Howto 4655542 9
XZGgeGHU1Bs Howto 4424698 10
LU8DDYz68kM Pets 27721690 1
epUk3T2Kfno Pets 10352882 2
z3U0udLH974 Pets 9461084 3
kkT7A3jegBc Pets 9269896 4
TZ860P4iTaM Pets 9009434 5
7tRWRSfcDuQ Pets 8538635 6
Qit3ALTelOo Pets 7939352 7
Zi9GOvR3Ynw Pets 7351184 8
Kxa0mnDj0bs Pets 7289545 9
PadauuWF94w Pets 6271287 10
W1czBcnX1Ww Science 3234852 1
D99NHb6B03s Science 3176792 2
tk_F2Y-F2kE Science 3121903 3
nhyH7lQ6D2k Science 2879861 4
JCbKv9yiLiQ Science 2672391 5
U5vs2ly_grk Science 2611389 6
M0ODskdEPnQ Science 2555284 7
p4ebtj1jR7c Science 2536109 8
8wTlureUMP8 Science 2477804 9
NZNTgglPbUA Science 2230729 10
vt4X7zFfv4k Sports 12598542 1
P-bWsOK-h98 Sports 12101588 2
OS5tQvQOB-Y Sports 9047732 3
P9LmHXXWiJs Sports 7415813 4
NIKdK-T-jZM Sports 7175403 5
euMu1SKi-ak Sports 7040023 6
zKQgTiqhPbw Sports 7017699 7
yeXoxNP8_xY Sports 6422039 8
q8t7iSGAKik Sports 6312667 9
COcczatkNP4 Sports 6258611 10
sLGLum5SyKQ Style 31121122 1
eMhGpzyFdhE Style 16320501 2
KPOOWvP_dd8 Style 11131527 3
91wuBqlny50 Style 8579613 4
mr5ghuaTK14 Style 8361812 5
GfPJeDssBOM Style 6101232 6
6gmP4nk0EOE Style 4970382 7
mM-30cmM33s Style 4936417 8
STQ3nhXuuEM Style 4655542 9
XZGgeGHU1Bs Style 4424698 10
ZeBd_F2Bz5Y Vehicles 8623041 1
ju6t-yyoU8s Vehicles 7325889 2
uUurALr_Ckk Vehicles 7258142 3
WShY1ObPvhQ Vehicles 7047156 4
q3idQKi5EqM Vehicles 6470816 5
9JWywFpZkg4 Vehicles 6465830 6
tth9krDtxII Vehicles 6394563 7
npTRXr4Sgxg Vehicles 5450317 8
aCamHfJwSGU Vehicles 5354163 9
S-ppGiwc0wQ Vehicles 5039415 10
LU8DDYz68kM Animals 27721690 1
epUk3T2Kfno Animals 10352882 2
z3U0udLH974 Animals 9461084 3
kkT7A3jegBc Animals 9269896 4
TZ860P4iTaM Animals 9009434 5
7tRWRSfcDuQ Animals 8538635 6
Qit3ALTelOo Animals 7939352 7
Zi9GOvR3Ynw Animals 7351184 8
Kxa0mnDj0bs Animals 7289545 9
PadauuWF94w Animals 6271287 10
ZeBd_F2Bz5Y Autos 8623041 1
ju6t-yyoU8s Autos 7325889 2
uUurALr_Ckk Autos 7258142 3
WShY1ObPvhQ Autos 7047156 4
q3idQKi5EqM Autos 6470816 5
9JWywFpZkg4 Autos 6465830 6
tth9krDtxII Autos 6394563 7
npTRXr4Sgxg Autos 5450317 8
aCamHfJwSGU Autos 5354163 9
S-ppGiwc0wQ Autos 5039415 10
v3ARyAb_1Bs Blogs 31812447 1
5GE82tqcYYQ Blogs 30209692 2
ervaMPt4Ha0 Blogs 23859297 3
uWow42TCwzg Blogs 22389389 4
-_CSo1gOd48 Blogs 21176701 5
D2kJZOfq7zk Blogs 20458159 6
nhSZs-aAZbo Blogs 20423158 7
GuMMfgWhm3g Blogs 18634621 8
4jbkRGPxvaM Blogs 15980887 9
hMnk7lh9M3o Blogs 13553069 10
LpAI8TzQDes Entertainment 65078772 1
244qR7SvvX0 Entertainment 57790943 2
1uwOL4rB-go Entertainment 39883413 3
w2xUzv6iZWo Entertainment 25222946 4
vr3x_RRJdd4 Entertainment 25093671 5
lj3iNxZ8Dww Entertainment 23737579 6
RB-wUgnyGv0 Entertainment 23067889 7
lsO6D1rwrKc Entertainment 21758300 8
7iYWxfNSjYk Entertainment 19029677 9
5pGJCkCDK5A Entertainment 18858695 10
p0aQvKDA1K0 Events 12239023 1
bNF_P281Uu4 Events 9125026 2
AlPqL7IUT6M Events 6441558 3
J833f9fqWBA Events 4776832 4
xIvIWJbzimo Events 4284770 5
QuTj9a04o-s Events 4053317 6
eejQPUyeNiY Events 3573812 7
3QL97xldoXc Events 3010296 8
r43yCiKlbCo Events 2806995 9
z42fchrzhHY Events 2565257 10
12Z3J1uzd0Q Film 65341925 1
bFytHZHFXhA Film 38937813 2
sdUUx5FdySs Film 16151661 3
vQbYvjmfbr4 Film 12114231 4
rNKefRRvV7g Film 11884311 5
6HIavxnUHls Film 11735507 6
gm5DHKI8o5o Film 9381674 7
316BF17k5d8 Film 9372064 8
ZuQMn6Z0k_w Film 9195588 9
6B26asyGKDo Film 8915176 10
SkELRp4wKPs News 12730681 1
AFFQrUyi8-s News 11953598 2
YgW7or1TuFk News 8516433 3
JgiGrXpOhYg News 8211043 4
up5jmbSjWkw News 7869023 5
hr23tpWX8lM News 7209885 6
Kje7NUNebL8 News 6707868 7
I4u3449L5VI News 6675798 8
jjXyqcx-mYY News 6462051 9
a9Vde3FHMmc News 6028642 10
v3ARyAb_1Bs People 31812447 1
5GE82tqcYYQ People 30209692 2
ervaMPt4Ha0 People 23859297 3
uWow42TCwzg People 22389389 4
-_CSo1gOd48 People 21176701 5
D2kJZOfq7zk People 20458159 6
nhSZs-aAZbo People 20423158 7
GuMMfgWhm3g People 18634621 8
4jbkRGPxvaM People 15980887 9
hMnk7lh9M3o People 13553069 10
W1czBcnX1Ww Technology 3234852 1
D99NHb6B03s Technology 3176792 2
tk_F2Y-F2kE Technology 3121903 3
nhyH7lQ6D2k Technology 2879861 4
JCbKv9yiLiQ Technology 2672391 5
U5vs2ly_grk Technology 2611389 6
M0ODskdEPnQ Technology 2555284 7
p4ebtj1jR7c Technology 2536109 8
8wTlureUMP8 Technology 2477804 9
NZNTgglPbUA Technology 2230729 10
p0aQvKDA1K0 Travel 12239023 1
bNF_P281Uu4 Travel 9125026 2
AlPqL7IUT6M Travel 6441558 3
J833f9fqWBA Travel 4776832 4
xIvIWJbzimo Travel 4284770 5
QuTj9a04o-s Travel 4053317 6
eejQPUyeNiY Travel 3573812 7
3QL97xldoXc Travel 3010296 8
r43yCiKlbCo Travel 2806995 9
z42fchrzhHY Travel 2565257 10
Time taken: 121.381 seconds, Fetched: 250 row(s)
5 总结
上面的8个例子向大家展示Hive中简单和稍微复杂的操作,有的启动一个Job就可以完成,有的则需要启动多个Job才能完成,我们也可以看到,启动的Job越多运行时间就会越长,但是实际工作中的操作只会远比我们所演示要更加复杂,越是复杂的操作就更加需要去优化,来达到减少运行时间的目的,所以下一篇我们来看看Hive的优化实践