How to quickly copy a partitioned table in Hive (including its data)

Reposted from: http://lxw1234.com/archives/2015/09/484.htm

Keywords: Hive, copy table

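Sometimes you need to copy a table in Hive, meaning both its schema and its data.

For a non-partitioned table this is simple: use CREATE TABLE new_table AS SELECT * FROM old_table;

But what about a partitioned table?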

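The first approach that comes to mind is probably this:

First create new_table with the same schema as old_table, including its partition columns, using CREATE TABLE new_table LIKE old_table;

Then use dynamic partitioning to INSERT the data from old_table into new_table.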
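The dynamic-partition approach can be sketched roughly as follows. This is only an illustration: build_dynamic_copy_sql is a hypothetical helper name, and the sketch assumes a single partition column and relies on SELECT * returning the partition column last, which matches the column order of a table created with CREATE TABLE ... LIKE.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the HiveQL for a dynamic-partition copy.
# Arguments: source table, target table, partition column name.
build_dynamic_copy_sql() {
  local src="$1" dst="$2" part="$3"
  cat <<EOF
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
CREATE TABLE ${dst} LIKE ${src};
INSERT OVERWRITE TABLE ${dst} PARTITION (${part})
SELECT * FROM ${src};
EOF
}

# Usage (assumes the hive CLI is on PATH):
#   hive -e "$(build_dynamic_copy_sql old_table new_table pt)"
```

The nonstrict mode is set because every partition value here comes from the data rather than from a static value in the PARTITION clause.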

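This approach certainly works, but it may not be the fastest, since the INSERT runs as a query job that reads and rewrites every row.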

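In fact, you can do it this way:

1. CREATE TABLE new_table LIKE old_table;

2. Use the hadoop fs -cp command to copy all the partition directories under old_table's HDFS directory into new_table's HDFS directory;

3. Run MSCK REPAIR TABLE new_table; to repair the new table's partition metadata.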
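The three steps above could be wrapped in a small script along these lines. This is a sketch under assumptions, not a definitive implementation: the run and copy_partitioned_table helpers and the DRY_RUN variable are hypothetical names, and the caller has to supply the actual HDFS directories of both tables (they can be looked up with DESCRIBE FORMATTED).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper implementing the three steps.
# Arguments: source table, target table, source HDFS dir, target HDFS dir.
copy_partitioned_table() {
  local src_table="$1" dst_table="$2" src_dir="$3" dst_dir="$4"
  # Step 1: create an empty table with the same schema and partition columns.
  run hive -e "CREATE TABLE ${dst_table} LIKE ${src_table};" &&
  # Step 2: copy the partition directories; the glob is quoted so that
  # hadoop expands it against HDFS instead of the local shell.
  run hadoop fs -cp "${src_dir}/*" "${dst_dir}/" &&
  # Step 3: register the copied directories as partitions in the metastore.
  run hive -e "MSCK REPAIR TABLE ${dst_table};"
}

# Usage (assumes the hive and hadoop CLIs are on PATH):
#   copy_partitioned_table t1 t2 /hivedata/warehouse/liuxiaowen.db/t1 /hivedata/warehouse/liuxiaowen.db/t2
```

Setting DRY_RUN=1 before the call prints the three commands without touching the cluster, which is a cheap way to check the paths first.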

Here is an example.

Take a partitioned table t1 that has only two partitions, each containing a single row:

    hive> show partitions t1;
    OK
    pt=2015-09-11
    pt=2015-09-12
    Time taken: 0.11 seconds, Fetched: 2 row(s)
    hive> desc t1;
    OK
    id string
    pt string

    # Partition Information
    # col_name data_type comment

    pt string
    Time taken: 0.123 seconds, Fetched: 7 row(s)
    hive> select * from t1;
    OK
    X 2015-09-11
    Y 2015-09-12
    Time taken: 0.095 seconds, Fetched: 2 row(s)
    hive>

Create a new table t2 with the same schema:

    hive> create table t2 like t1;
    OK
    Time taken: 0.162 seconds
    hive> desc t2;
    OK
    id string
    pt string

    # Partition Information
    # col_name data_type comment

    pt string
    Time taken: 0.139 seconds, Fetched: 7 row(s)
    hive> show partitions t2;
    OK
    Time taken: 0.082 seconds

Use the hadoop fs -cp command to copy all the directories under t1's HDFS directory into t2's HDFS directory:

    [liuxiaowen@dev ~]$ hadoop fs -cp /hivedata/warehouse/liuxiaowen.db/t1/* /hivedata/warehouse/liuxiaowen.db/t2/
    [liuxiaowen@dev ~]$ hadoop fs -ls /hivedata/warehouse/liuxiaowen.db/t2/
    Found 2 items
    drwxr-xr-x - liuxiaowen liuxiaowen 0 2015-09-11 17:17 /hivedata/warehouse/liuxiaowen.db/t2/pt=2015-09-11
    drwxr-xr-x - liuxiaowen liuxiaowen 0 2015-09-11 17:17 /hivedata/warehouse/liuxiaowen.db/t2/pt=2015-09-12

In Hive, run MSCK REPAIR TABLE t2; to repair the partition metadata of the new table t2:

    hive> show partitions t2;
    OK
    Time taken: 0.082 seconds
    hive> MSCK REPAIR TABLE t2;
    OK
    Partitions not in metastore: t2:pt=2015-09-11 t2:pt=2015-09-12
    Repair: Added partition to metastore t2:pt=2015-09-11
    Repair: Added partition to metastore t2:pt=2015-09-12
    Time taken: 0.249 seconds, Fetched: 3 row(s)
    hive> show partitions t2;
    OK
    pt=2015-09-11
    pt=2015-09-12
    Time taken: 0.068 seconds, Fetched: 2 row(s)
    hive> select * from t2;
    OK
    X 2015-09-11
    Y 2015-09-12
    Time taken: 0.123 seconds, Fetched: 2 row(s)
    hive>

That's it: the new table t2 has been copied. It has the same schema, partition columns, partitions, and data as t1.
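For tables with many partitions, comparing the two partition listings by eye is tedious. A minimal sketch of an automated check might look like this; same_partitions is a hypothetical helper, and it only compares partition names, not the data inside them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when two partition listings contain the
# same lines, regardless of order.
same_partitions() {
  [ "$(printf '%s\n' "$1" | sort)" = "$(printf '%s\n' "$2" | sort)" ]
}

# Usage (assumes the hive CLI is on PATH; -S runs it in silent mode):
#   same_partitions "$(hive -S -e 'SHOW PARTITIONS t1;')" \
#                   "$(hive -S -e 'SHOW PARTITIONS t2;')" && echo "partitions match"
```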

Original URL: https://www.cnblogs.com/cxzdy/p/4934849.html