hive之创建桶表

zoukankan html css js c++ java

hive之创建桶表

1.创建桶表,用id进行分桶，分3个桶，行结束符用","
$hive>create table t6(id int,name string,age int) clustered by (id) into 3 buckets row format delimited fields terminated by ','
2.加载数据到桶表，按照桶id进行hash存储到不同的文件中。
$hive>load data local inpath '/home/centos/customers.txt' into table t6;//加载数据不会进行分桶操作
3：向桶表中插入数据。
$hive>insert into t6 select id,name,age from tx;//该过程是个mr的过程，查询tx表数据插入到t6表的过程
4:分区表是在目录的层面进行过滤，桶表是在存储文件的情况下进行过滤，桶表在进行hash之后，划分在同一个桶不同的段里面。

5.hive中的内连接查询
先加载数据到两个表中去：load data local inpath '/home/centos/customers.txt' into table customers;
$hive>select a.,b. from customers a,orders b where a.id=b.cid;
6.hive中的左外连接查询：$hive>select a.,b. from customers a left outer join orders b on a.id = b.cid ;
右外连接：$hive>select a.,b. from customers a right outer join orders b on a.id =b.cid;

7.explode,炸裂，表生成函数，这个函数是将字符串按照空格来进行切割，切成多条记录，就是切割成一个一个的单词
使用hive实现单词统计：
//第一步：建表
$hive>create table doc1(line string) row format delimited fields terminated by ' ';
//第二步，加载文档进来
load data local inpath '/home/centos/customers.txt' into customers ;

查看全文

相关阅读:
CSS里面position：relative与position：absolute 区别
 JSP页面规格化
 iframe自适应高度
 getElementByTagName的使用
 div左右居中
 @注解与普通web.xml的关系
 显示几秒内跳转
 js如何获取某id的子标签
 common upload乱码
 预览上传图片

原文地址：https://www.cnblogs.com/stone-learning/p/9279244.html