Hive official documentation: Home - UserDocumentation
Hive DDL official documentation: LanguageManual DDL
Reference article: Hive User Guide
Note: each statement carries a version annotation; some were introduced after hive-1.2.1 and cannot be used in hive-1.2.1.
Note: this article covers only the commonly used features; see the official documentation for more detail.
Common Commands

```sql
select current_database();      -- which database is currently in use
show databases;                 -- list all databases
show tables;                    -- list the tables in the current database
describe extended table_name;   -- extended table info, similar to describe formatted table_name;
describe formatted table_name;  -- recommended ★★★  e.g. describe formatted t_sz01;
                                -- describes the table; also shows whether it is a MANAGED_TABLE or an EXTERNAL_TABLE
describe database test_db;      -- database info
show partitions t_sz03_part;    -- partitions of a partitioned table
```
Note

The Hive operations in this article were run either in the Hive CLI or in beeline, so don't be surprised to see two different command-line styles.
1. DDL - Database Operations
1.1. Create Database
```sql
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  [COMMENT database_comment]
  [LOCATION hdfs_path]
  [WITH DBPROPERTIES (property_name=property_value, ...)];
```
Creating a database

```sql
-- create a database
create database if not exists test_db
comment 'my first db';

0: jdbc:hive2://mini01:10000> describe database test_db;  -- database info
+----------+--------------+----------------------------------------------------+-------------+-------------+-------------+--+
| db_name  |   comment    |                      location                      | owner_name  | owner_type  | parameters  |
+----------+--------------+----------------------------------------------------+-------------+-------------+-------------+--+
| test_db  | my first db  | hdfs://mini01:9000/user/hive/warehouse/test_db.db  | yun         | USER        |             |
+----------+--------------+----------------------------------------------------+-------------+-------------+-------------+--+
```
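When no LOCATION clause is given, Hive derives the database directory from the warehouse root, as the `describe database` output above shows. A minimal Python sketch of this naming rule (the warehouse path matches this article's environment; the helper function is ours, not a Hive API):

```python
# Sketch: how Hive derives a database's default HDFS location.
# The warehouse root below is the one from this article's cluster
# (hive.metastore.warehouse.dir on hdfs://mini01:9000).
WAREHOUSE_DIR = "hdfs://mini01:9000/user/hive/warehouse"

def default_db_location(db_name: str, warehouse_dir: str = WAREHOUSE_DIR) -> str:
    """Every database except 'default' gets its own '<name>.db' directory."""
    if db_name == "default":
        return warehouse_dir        # the default database uses the warehouse root itself
    return f"{warehouse_dir}/{db_name}.db"

print(default_db_location("test_db"))
# hdfs://mini01:9000/user/hive/warehouse/test_db.db
```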
1.2. Drop Database
```sql
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
-- RESTRICT is the default: the database cannot be dropped while it still contains tables.
-- With CASCADE, the tables are dropped together with the database.
```
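The RESTRICT/CASCADE behavior can be sketched with a toy in-memory metastore (an illustration of the semantics only, not Hive code):

```python
# Sketch of DROP DATABASE semantics: RESTRICT (the default) refuses to
# drop a database that still contains tables; CASCADE drops the tables
# along with it.  The 'metastore' dict maps database name -> table list.
def drop_database(metastore: dict, db: str, mode: str = "RESTRICT") -> None:
    tables = metastore.get(db, [])
    if tables and mode == "RESTRICT":
        raise RuntimeError(f"Database {db} is not empty. One or more tables exist.")
    metastore.pop(db, None)

store = {"test_db": ["t_sz01"], "empty_db": []}
drop_database(store, "empty_db")            # fine: the database has no tables
try:
    drop_database(store, "test_db")         # fails under the default RESTRICT
except RuntimeError as e:
    print(e)
drop_database(store, "test_db", "CASCADE")  # drops the tables and the database
print(store)  # {}
```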
1.3. Use Database
```sql
USE database_name;  -- switch to a database
USE DEFAULT;        -- switch to the default database
```

For example:

```sql
hive (default)> show databases;              -- list databases
OK
default
test001
test_db
zhang
Time taken: 0.016 seconds, Fetched: 4 row(s)
hive (default)> use test_db;                 -- switch database
OK
Time taken: 0.027 seconds
hive (test_db)> select current_database();   -- which database is current
OK
test_db
Time taken: 1.232 seconds, Fetched: 1 row(s)
```
2. DDL - Table
2.1. Create Table
```sql
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  [(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [SKEWED BY (col_name, col_name, ...)
     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
     [STORED AS DIRECTORIES]]
  [
   [ROW FORMAT row_format]
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]
  [AS select_statement];   -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  LIKE existing_table_or_view_name
  [LOCATION hdfs_path];

data_type
  : primitive_type
  | array_type
  | map_type
  | struct_type
  | union_type   -- (Note: Available in Hive 0.7.0 and later)

primitive_type
  : TINYINT
  | SMALLINT
  | INT
  | BIGINT
  | BOOLEAN
  | FLOAT
  | DOUBLE
  | DOUBLE PRECISION          -- (Note: Available in Hive 2.2.0 and later)
  | STRING
  | BINARY                    -- (Note: Available in Hive 0.8.0 and later)
  | TIMESTAMP                 -- (Note: Available in Hive 0.8.0 and later)
  | DECIMAL                   -- (Note: Available in Hive 0.11.0 and later)
  | DECIMAL(precision, scale) -- (Note: Available in Hive 0.13.0 and later)
  | DATE                      -- (Note: Available in Hive 0.12.0 and later)
  | VARCHAR                   -- (Note: Available in Hive 0.12.0 and later)
  | CHAR                      -- (Note: Available in Hive 0.13.0 and later)

array_type
  : ARRAY < data_type >

map_type
  : MAP < primitive_type, data_type >

struct_type
  : STRUCT < col_name : data_type [COMMENT col_comment], ...>

union_type
   : UNIONTYPE < data_type, data_type, ... >  -- (Note: Available in Hive 0.7.0 and later)

row_format
  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
        [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
        [NULL DEFINED AS char]   -- (Note: Available in Hive 0.13 and later)
  | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]

file_format:
  : SEQUENCEFILE
  | TEXTFILE    -- (Default, depending on hive.default.fileformat configuration)
  | RCFILE      -- (Note: Available in Hive 0.6.0 and later)
  | ORC         -- (Note: Available in Hive 0.11.0 and later)
  | PARQUET     -- (Note: Available in Hive 0.13.0 and later)
  | AVRO        -- (Note: Available in Hive 0.14.0 and later)
  | JSONFILE    -- (Note: Available in Hive 4.0.0 and later)
  | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname

constraint_specification:
  : [, PRIMARY KEY (col_name, ...) DISABLE NOVALIDATE ]
    [, CONSTRAINT constraint_name FOREIGN KEY (col_name, ...) REFERENCES table_name(col_name, ...) DISABLE NOVALIDATE
```
Note: CREATE TABLE raises an error if a table or view with the same name already exists. Use IF NOT EXISTS to skip the error.
- Table names and column names are case insensitive, but SerDe (short for Serializer/Deserializer; Hive uses a SerDe to serialize and deserialize row objects) and property names are case sensitive.
- Table and column comments are string literals (single-quoted).
- A table created without the EXTERNAL clause is called a managed table, because Hive manages its data. To see whether a table is managed or external, check the tableType field in the output of DESCRIBE EXTENDED table_name (or run describe formatted table_name;). [See Example 1]
- The TBLPROPERTIES clause lets you tag the table definition with your own metadata key/value pairs. Some predefined table properties also exist, such as last_modified_user and last_modified_time, which are added and managed by Hive automatically. [See Example 1]
```sql
-- Example 1
0: jdbc:hive2://mini01:10000> create table t_sz05 (id int, name string) tblproperties ('key001'='value1', 'key200'='value200');  -- create the table
No rows affected (0.224 seconds)
0: jdbc:hive2://mini01:10000> describe formatted t_sz05;  -- table info
+-------------------------------+-------------------------------------------------------------+-----------------------+--+
|           col_name            |                          data_type                          |        comment        |
+-------------------------------+-------------------------------------------------------------+-----------------------+--+
| # col_name                    | data_type                                                   | comment               |
|                               | NULL                                                        | NULL                  |
| id                            | int                                                         |                       |
| name                          | string                                                      |                       |
|                               | NULL                                                        | NULL                  |
| # Detailed Table Information  | NULL                                                        | NULL                  |
| Database:                     | zhang                                                       | NULL                  |
| Owner:                        | yun                                                         | NULL                  |
| CreateTime:                   | Sat Jul 07 20:13:53 CST 2018                                | NULL                  |
| LastAccessTime:               | UNKNOWN                                                     | NULL                  |
| Protect Mode:                 | None                                                        | NULL                  |
| Retention:                    | 0                                                           | NULL                  |
| Location:                     | hdfs://mini01:9000/user/hive/warehouse/zhang.db/t_sz05      | NULL                  |
| Table Type:                   | MANAGED_TABLE  (EXTERNAL_TABLE for an external table)       | NULL                  |
| Table Parameters:             | NULL                                                        | NULL                  |
|                               | key001  (user-defined)                                      | value1                |
|                               | key200  (user-defined)                                      | value200              |
|                               | transient_lastDdlTime                                       | 1530965633            |
|                               | NULL                                                        | NULL                  |
| # Storage Information         | NULL                                                        | NULL                  |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe          | NULL                  |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat                    | NULL                  |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat  | NULL                  |
| Compressed:                   | No                                                          | NULL                  |
| Num Buckets:                  | -1                                                          | NULL                  |
| Bucket Columns:               | []                                                          | NULL                  |
| Sort Columns:                 | []                                                          | NULL                  |
| Storage Desc Params:          | NULL                                                        | NULL                  |
|                               | serialization.format                                        | 1                     |
+-------------------------------+-------------------------------------------------------------+-----------------------+--+
29 rows selected (0.153 seconds)
```
2.1.1. Managed Table
Creating the table

```sql
-- describe formatted table_name; shows the table's info
hive (test_db)> create table t_sz01 (id int, name string comment 'person name')
comment 'a table of name'
row format delimited fields terminated by ',';
OK
Time taken: 0.311 seconds
hive (test_db)> show tables;
OK
t_sz01
Time taken: 0.031 seconds, Fetched: 1 row(s)
```
From the output of desc formatted t_sz01; we can see the Location: hdfs://mini01:9000/user/hive/warehouse/test_db.db/t_sz01
Loading data

```shell
[yun@mini01 hive]$ pwd
/app/software/hive
[yun@mini01 hive]$ cat t_sz01.dat
1,zhnagsan
3,李四
5,wangwu
7,赵六
2,sunqi
4,周八
6,kkkkk
8,zzzzz
[yun@mini01 hive]$ cp -a t_sz01.dat t_sz01.dat2   # make a copy
[yun@mini01 hive]$ hadoop fs -put t_sz01.dat /user/hive/warehouse/test_db.db/t_sz01    # upload the data
[yun@mini01 hive]$ hadoop fs -put t_sz01.dat2 /user/hive/warehouse/test_db.db/t_sz01   # upload the data
[yun@mini01 hive]$ hadoop fs -ls /user/hive/warehouse/test_db.db/t_sz01   # a table directory may hold multiple data files
Found 2 items
-rw-r--r--   2 yun supergroup         71 2018-07-12 21:58 /user/hive/warehouse/test_db.db/t_sz01/t_sz01.dat
-rw-r--r--   2 yun supergroup         71 2018-07-12 22:30 /user/hive/warehouse/test_db.db/t_sz01/t_sz01.dat2
```
Viewing the data through Hive

```sql
0: jdbc:hive2://mini01:10000> select * from t_sz01;
+------------+--------------+--+
| t_sz01.id  | t_sz01.name  |
+------------+--------------+--+
| 1          | zhnagsan     |
| 3          | 李四          |
| 5          | wangwu       |
| 7          | 赵六          |
| 2          | sunqi        |
| 4          | 周八          |
| 6          | kkkkk        |
| 8          | zzzzz        |
| 1          | zhnagsan     |
| 3          | 李四          |
| 5          | wangwu       |
| 7          | 赵六          |
| 2          | sunqi        |
| 4          | 周八          |
| 6          | kkkkk        |
| 8          | zzzzz        |
+------------+--------------+--+
16 rows selected (0.159 seconds)
```
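The doubled result set above follows from how Hive reads a text table: every file under the table directory contributes rows, and each line is split on the `fields terminated by` delimiter. A rough Python analogue (a deliberate simplification, not Hive's LazySimpleSerDe):

```python
# Sketch: a text table's result set is the union of all files in its
# directory, each line split on the declared field delimiter.
# 'files' stands in for the file contents found under the table path.
def read_table(files: list, delimiter: str = ",") -> list:
    rows = []
    for content in files:            # every file under the directory contributes rows
        for line in content.splitlines():
            rows.append(line.split(delimiter))
    return rows

data = "1,zhnagsan\n3,lisi\n5,wangwu"
rows = read_table([data, data])      # the same file was uploaded twice
print(len(rows))  # 6, each copy of the file contributes its 3 rows
```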
2.1.2. External Tables
An external table is one for which you supply your own LOCATION rather than using the default table location. When such a table is dropped, only its metadata is removed; the data itself is not deleted.
Creating the table

```sql
-- the hdfs://mini01:9000 prefix in the location can be omitted, i.e. /user02/hive/database/ext_table
-- if the location directory does not exist, hive creates it
hive (test_db)> create external table t_sz02_ext (id int, name string)
comment 'a ext table'
row format delimited fields terminated by ' '
location 'hdfs://mini01:9000/user02/hive/database/ext_table';
OK
Time taken: 0.065 seconds
hive (test_db)> show tables;
OK
t_sz01
t_sz02_ext
Time taken: 0.03 seconds, Fetched: 2 row(s)
0: jdbc:hive2://mini01:10000> select * from t_sz02_ext;  -- no data yet
+----------------+------------------+--+
| t_sz02_ext.id  | t_sz02_ext.name  |
+----------------+------------------+--+
+----------------+------------------+--+
No rows selected (0.094 seconds)

-- desc formatted t_sz02_ext; shows the table's Location:
-- hdfs://mini01:9000/user02/hive/database/ext_table
```
Loading data

```shell
[yun@mini01 hive]$ pwd
/app/software/hive
[yun@mini01 hive]$ ll
total 12
-rw-rw-r-- 1 yun yun 56 Jul  3 21:26 sz.dat
-rw-rw-r-- 1 yun yun 71 Jul 12 21:53 t_sz01.dat
-rw-rw-r-- 1 yun yun 79 Jul 12 22:15 t_sz02_ext.dat
[yun@mini01 hive]$ cat t_sz02_ext.dat   # note the trailing blank line
1 刘晨
2 王敏
3 张立
4 刘刚
5 孙庆
6 易思玲
7 李娜
8 梦圆圆

[yun@mini01 hive]$ hadoop fs -put t_sz02_ext.dat /user02/hive/database/ext_table   # upload the data
[yun@mini01 hive]$ hadoop fs -ls /user02/hive/database/ext_table
Found 1 items
-rw-r--r--   2 yun supergroup         79 2018-07-12 22:16 /user02/hive/database/ext_table/t_sz02_ext.dat
```
Viewing the data through Hive

```sql
0: jdbc:hive2://mini01:10000> select * from t_sz02_ext;
+----------------+------------------+--+
| t_sz02_ext.id  | t_sz02_ext.name  |
+----------------+------------------+--+
| 1              | 刘晨              |
| 2              | 王敏              |
| 3              | 张立              |
| 4              | 刘刚              |
| 5              | 孙庆              |
| 6              | 易思玲            |
| 7              | 李娜              |
| 8              | 梦圆圆            |
| NULL           | NULL             |   -- caused by the trailing blank line in the data file
+----------------+------------------+--+
9 rows selected (0.14 seconds)
```
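The NULL row comes from Hive's lenient text parsing: the blank line still counts as a row, and any field that is missing or cannot be cast to the column's type becomes NULL instead of raising an error. A sketch of that behavior for the two-column (int, string) table above (toy code, not Hive's SerDe):

```python
# Sketch: why a blank line shows up as NULL, NULL.  Hive's text SerDe
# is lenient: a field that cannot be cast to the column type, or a
# missing field, becomes NULL (None here) rather than an error.
def parse_row(line: str, delimiter: str = " "):
    fields = line.split(delimiter)
    def as_int(s):
        try:
            return int(s)
        except ValueError:
            return None                 # NULL on a failed int cast
    id_ = as_int(fields[0]) if len(fields) > 0 else None
    name = fields[1] if len(fields) > 1 and fields[1] != "" else None
    return (id_, name)

print(parse_row("1 刘晨"))   # (1, '刘晨')
print(parse_row(""))         # (None, None), i.e. the trailing blank line
```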
2.1.3. Partitioned Tables
Partitioned tables are created with the PARTITIONED BY clause. A table can have one or more partition columns, and a separate data directory is created for each distinct value combination of the partition columns.
If, when creating a partitioned table, you get the error "FAILED: Error in semantic analysis: Column repeated in partitioning columns", it means you tried to include a partition column in the table's own data. You probably really do have that column defined. However, the partition you create makes a pseudo-column on which you can query, so you must rename the table column to something else (something users should not query on).
For example, suppose the original, unpartitioned table has three columns: id, date, and name.

```sql
id   int,
date date,
name varchar
```
Now you want to partition by date. Your Hive definition could use "dtDontQuery" as the column name so that "date" can be used for partitioning (and querying).

```sql
create table table_name (
  id          int,
  dtDontQuery string,
  name        string
)
partitioned by (date string);
```

Your users will still query on "where date = '...'", while the second column, dtDontQuery, holds the original values.
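A partition column is a pseudo-column: its value is stored in the directory name (in `key=value` path segments), not inside the data files, and Hive reattaches it to every row read from that directory. A small sketch of recovering partition values from a path:

```python
# Sketch: a partition column's value lives in the directory name, not
# in the data file.  Hive stores each partition under '<col>=<value>'
# path segments and attaches those values to every row it reads there.
def partition_values(path: str) -> dict:
    parts = {}
    for segment in path.strip("/").split("/"):
        if "=" in segment:
            key, value = segment.split("=", 1)
            parts[key] = value
    return parts

print(partition_values("t_sz03_part/dt=20180711/country=CN"))
# {'dt': '20180711', 'country': 'CN'}
```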
Creating the table

```sql
-- date cannot be used as a table column name here, because date is a keyword
hive (test_db)> create table t_sz03_part (id int, name string)
comment 'This is a partitioned table'
partitioned by (dt string, country string)
row format delimited fields terminated by ',';
```
Loading data

```shell
[yun@mini01 hive]$ pwd
/app/software/hive
[yun@mini01 hive]$ cat t_sz03_20180711.dat1
1,张三_20180711
2,lisi_20180711
3,Wangwu_20180711
[yun@mini01 hive]$ cat t_sz03_20180711.dat2
11,Tom_20180711
12,Dvid_20180711
13,cherry_20180711
[yun@mini01 hive]$ cat t_sz03_20180712.dat1
1,张三_20180712
2,lisi_20180712
3,Wangwu_20180712
[yun@mini01 hive]$ cat t_sz03_20180712.dat2
11,Tom_20180712
12,Dvid_20180712
13,cherry_20180712

#### load the data in Hive
hive (test_db)> load data local inpath '/app/software/hive/t_sz03_20180711.dat1' into table t_sz03_part partition (dt='20180711', country='CN');
Loading data to table test_db.t_sz03_part partition (dt=20180711, country=CN)
Partition test_db.t_sz03_part{dt=20180711, country=CN} stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 0.406 seconds
hive (test_db)> load data local inpath '/app/software/hive/t_sz03_20180711.dat2' into table t_sz03_part partition (dt='20180711', country='US');
Loading data to table test_db.t_sz03_part partition (dt=20180711, country=US)
Partition test_db.t_sz03_part{dt=20180711, country=US} stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 0.453 seconds
hive (test_db)> load data local inpath '/app/software/hive/t_sz03_20180712.dat1' into table t_sz03_part partition (dt='20180712', country='CN');
Loading data to table test_db.t_sz03_part partition (dt=20180712, country=CN)
Partition test_db.t_sz03_part{dt=20180712, country=CN} stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 0.381 seconds
hive (test_db)> load data local inpath '/app/software/hive/t_sz03_20180712.dat2' into table t_sz03_part partition (dt='20180712', country='US');
Loading data to table test_db.t_sz03_part partition (dt=20180712, country=US)
Partition test_db.t_sz03_part{dt=20180712, country=US} stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 0.506 seconds
```
Viewing in the browser (HDFS web UI; screenshot omitted): the table directory contains one subdirectory per partition.
Viewing the data through Hive

```sql
0: jdbc:hive2://mini01:10000> select * from t_sz03_part;
+-----------------+-------------------+-----------------+----------------------+--+
| t_sz03_part.id  | t_sz03_part.name  | t_sz03_part.dt  | t_sz03_part.country  |
+-----------------+-------------------+-----------------+----------------------+--+
| 1               | 张三_20180711      | 20180711        | CN                   |
| 2               | lisi_20180711     | 20180711        | CN                   |
| 3               | Wangwu_20180711   | 20180711        | CN                   |
| 11              | Tom_20180711      | 20180711        | US                   |
| 12              | Dvid_20180711     | 20180711        | US                   |
| 13              | cherry_20180711   | 20180711        | US                   |
| 1               | 张三_20180712      | 20180712        | CN                   |
| 2               | lisi_20180712     | 20180712        | CN                   |
| 3               | Wangwu_20180712   | 20180712        | CN                   |
| 11              | Tom_20180712      | 20180712        | US                   |
| 12              | Dvid_20180712     | 20180712        | US                   |
| 13              | cherry_20180712   | 20180712        | US                   |
+-----------------+-------------------+-----------------+----------------------+--+
12 rows selected (0.191 seconds)
0: jdbc:hive2://mini01:10000> show partitions t_sz03_part;  -- partitions of the partitioned table
+-------------------------+--+
|        partition        |
+-------------------------+--+
| dt=20180711/country=CN  |
| dt=20180711/country=US  |
| dt=20180712/country=CN  |
| dt=20180712/country=US  |
+-------------------------+--+
4 rows selected (0.164 seconds)
```
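Because each partition is its own directory, a WHERE predicate on the partition columns lets Hive skip whole directories without opening any data file (partition pruning). A sketch using the four partitions listed above:

```python
# Sketch of partition pruning: a WHERE clause on partition columns
# selects which directories to read, without opening any data file.
# The partition list matches 'show partitions t_sz03_part' above.
partitions = [
    {"dt": "20180711", "country": "CN"},
    {"dt": "20180711", "country": "US"},
    {"dt": "20180712", "country": "CN"},
    {"dt": "20180712", "country": "US"},
]

def prune(partitions, **predicate):
    """Keep only the partitions matching every key=value in the predicate."""
    return [p for p in partitions
            if all(p.get(k) == v for k, v in predicate.items())]

print(prune(partitions, dt="20180711"))                # 2 directories scanned
print(prune(partitions, dt="20180712", country="CN"))  # just 1
```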
2.1.4. Create Table As Select (CTAS)
A table can also be created and populated by the results of a query in one create-table-as-select (CTAS) statement. The table created by CTAS is atomic, meaning it is not visible to other users until all the query results are populated; other users either see the table with its complete results or do not see it at all.
CTAS restrictions:
- The target table cannot be a partitioned table.
- The target table cannot be an external table.
- The target table cannot be a bucketed table.
Here the target table is the table being created.
```sql
-- example:
CREATE TABLE new_key_value_store
   ROW FORMAT SERDE "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe"
   STORED AS RCFile
   AS
SELECT (key % 1024) new_key, concat(key, value) key_value_pair
FROM key_value_store
SORT BY new_key, key_value_pair;
```
Example

```sql
hive (test_db)> create table t_sz02_ext_new
row format delimited fields terminated by '#'
AS
SELECT id, concat(name, '_', id) name2
FROM t_sz02_ext
SORT BY name2;

0: jdbc:hive2://mini01:10000> show tables;
+-----------------+--+
|    tab_name     |
+-----------------+--+
| t_sz01          |
| t_sz02_ext      |
| t_sz02_ext_new  |
| t_sz03_part     |
| t_sz100_ext     |
| t_sz101_ext     |
+-----------------+--+
6 rows selected (0.069 seconds)
0: jdbc:hive2://mini01:10000> select * from t_sz02_ext_new;
+--------------------+-----------------------+--+
| t_sz02_ext_new.id  | t_sz02_ext_new.name2  |
+--------------------+-----------------------+--+
| NULL               | NULL                  |
| 4                  | 刘刚_4                 |
| 1                  | 刘晨_1                 |
| 5                  | 孙庆_5                 |
| 3                  | 张立_3                 |
| 6                  | 易思玲_6               |
| 7                  | 李娜_7                 |
| 8                  | 梦圆圆_8               |
| 2                  | 王敏_2                 |
+--------------------+-----------------------+--+
9 rows selected (0.094 seconds)

-- the location path can be obtained from desc formatted t_sz02_ext_new;
hive (test_db)> dfs -ls /user/hive/warehouse/test_db.db/t_sz02_ext_new/;
Found 1 items
-rwxr-xr-x   2 yun supergroup        100 2018-07-12 23:50 /user/hive/warehouse/test_db.db/t_sz02_ext_new/000000_0
hive (test_db)> dfs -cat /user/hive/warehouse/test_db.db/t_sz02_ext_new/000000_0;
\N#\N
4#刘刚_4
1#刘晨_1
5#孙庆_5
3#张立_3
6#易思玲_6
7#李娜_7
8#梦圆圆_8
2#王敏_2
```
Viewing in HDFS

```shell
# the location path can be obtained from desc formatted t_sz02_ext_new;
[yun@mini01 hive]$ hadoop fs -ls /user/hive/warehouse/test_db.db/t_sz02_ext_new
Found 1 items
-rwxr-xr-x   2 yun supergroup        100 2018-07-12 23:44 /user/hive/warehouse/test_db.db/t_sz02_ext_new/000000_0
[yun@mini01 hive]$ hadoop fs -cat /user/hive/warehouse/test_db.db/t_sz02_ext_new/000000_0
\N#\N
4#刘刚_4
1#刘晨_1
5#孙庆_5
3#张立_3
6#易思玲_6
7#李娜_7
8#梦圆圆_8
2#王敏_2
```
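The 000000_0 file shows how the CTAS result is serialized: fields joined by the table's '#' delimiter, with NULL written as \N (Hive's default serialization.null.format). A toy sketch of that row-writing logic; in Hive, concat() returns NULL when any argument is NULL, which is where the \N#\N line comes from:

```python
# Sketch: how a row of the CTAS result is written to 000000_0.
# Fields are joined with the table's '#' delimiter, and NULL is
# serialized as '\N' (Hive's default serialization.null.format).
def serialize_row(fields, delimiter="#", null_fmt=r"\N"):
    return delimiter.join(null_fmt if f is None else str(f) for f in fields)

def ctas_row(id_, name):
    """The CTAS select: id, concat(name, '_', id).  concat() is NULL-propagating."""
    if id_ is None or name is None:
        return (id_, None)          # any NULL argument makes concat() NULL
    return (id_, f"{name}_{id_}")

print(serialize_row(ctas_row(4, "刘刚")))    # 4#刘刚_4
print(serialize_row(ctas_row(None, None)))  # \N#\N
```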
2.1.5. Create Table Like
Copy an existing table's definition exactly (without copying its data) to create a new table.

```sql
CREATE TABLE empty_key_value_store
LIKE key_value_store [TBLPROPERTIES (property_name=property_value, ...)];
```
Example

```sql
hive (test_db)> create table t_sz03_part_new
like t_sz03_part tblproperties ('proper1'='value1', 'proper2'='value2');
OK
Time taken: 0.083 seconds
0: jdbc:hive2://mini01:10000> select * from t_sz03_part_new;  -- only the table structure is copied, not the data
+---------------------+-----------------------+---------------------+--------------------------+--+
| t_sz03_part_new.id  | t_sz03_part_new.name  | t_sz03_part_new.dt  | t_sz03_part_new.country  |
+---------------------+-----------------------+---------------------+--------------------------+--+
+---------------------+-----------------------+---------------------+--------------------------+--+
No rows selected (0.087 seconds)
hive (test_db)> create table t_sz04_like
like t_sz02_ext tblproperties ('proper1'='value1', 'proper2'='value2');
No rows affected (0.153 seconds)

0: jdbc:hive2://mini01:10000> select * from t_sz04_like;
+-----------------+-------------------+--+
| t_sz04_like.id  | t_sz04_like.name  |
+-----------------+-------------------+--+
+-----------------+-------------------+--+
```
As desc formatted tab_name; shows, t_sz03_part and t_sz03_part_new have exactly the same table structure; only the tblproperties differ.
Even if the table being copied with LIKE is an external table, the new table is created as a MANAGED_TABLE.
2.1.6. Bucketed Sorted Tables
```sql
CREATE TABLE page_view(viewTime INT, userid BIGINT,
     page_url STRING, referrer_url STRING,
     ip STRING COMMENT 'IP Address of the User')
 COMMENT 'This is the page view table'
 PARTITIONED BY(dt STRING, country STRING)
 CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\001'
   COLLECTION ITEMS TERMINATED BY '\002'
   MAP KEYS TERMINATED BY '\003'
 STORED AS SEQUENCEFILE;
```
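CLUSTERED BY(userid) INTO 32 BUCKETS distributes rows across 32 files per partition by hashing the bucketing column, so all rows with the same userid land in the same bucket file. A simplified sketch of the assignment; for integral types Hive's hash is essentially the value itself, while other types use different hash functions, so this is an approximation:

```python
# Sketch of bucket assignment for CLUSTERED BY(userid) INTO 32 BUCKETS:
# a row lands in bucket hash(userid) mod 32.  For integral columns the
# hash is roughly the value itself; the mask keeps the result
# non-negative, mirroring Hive's (hash & Integer.MAX_VALUE) % numBuckets.
NUM_BUCKETS = 32

def bucket_for(userid: int, num_buckets: int = NUM_BUCKETS) -> int:
    return (userid & 0x7FFFFFFF) % num_buckets

print(bucket_for(12345))       # 25
print(bucket_for(12345 + 32))  # 25, same bucket: ids differing by 32 collide
```

Rows with equal userid always hash to the same bucket, which is what makes bucketed joins and efficient sampling possible.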