The biggest differences between HiveQL (Hive SQL) and ordinary SQL

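　　I have always used Pig, but now I need to work with Hive as well. I found some material online that seemed quite useful, so I am translating it here. The translation is probably not very accurate; I will write a proper summary once I know Hive better.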

　　* No true date/time data types, no interval types, and many missing UDFs for manipulating dates (e.g. ADD_MONTH)

　　* Strict type matching without support for automatic coercion or typed literals (e.g. CASE <bigint expr> WHEN 1 THEN ... END)

　　* All queries must reference a table (no 'dual' or table-less queries)

　　* No session-scoped temp tables

　　* No 'IN' predicate

　　* No 'FIND' string search function for producing the offset to a match

　　* No find/replace string functions for plain strings (i.e. not regex)

　　* XPATH UDFs cannot return a string representing an entire subtree in the DOM, which prevents composition.

　　* Few mechanisms for collapsing arrays to scalar types (e.g. 'join' complement of string 'split'; aggregations other than 'size' for numeric arrays; etc.)

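　　A rough translation:

　　1. HiveQL has no true date/time types, no interval types, and lacks a number of functions for manipulating dates and times (e.g. ADD_MONTH).

　　2. HiveQL matches types very strictly and does not coerce types automatically (for example, CASE big_int_number WHEN 1 THEN ... END is not supported). My understanding is that Hive will not convert between bigint and int for you here.

　　3. Every HiveQL query must reference a table; there is no 'dual' table and no table-less query.

　　4. HiveQL has no session-scoped temporary tables.

　　5. HiveQL has no IN predicate.

　　6. HiveQL has no FIND function that returns the offset of a match, and no find/replace functions for plain (non-regex) strings.

　　7. The XPATH UDFs in HiveQL cannot return a string representing an entire DOM subtree, which prevents composing them.

　　8. HiveQL has few mechanisms for collapsing arrays to scalar types (e.g. a 'join' to complement the string 'split'; aggregations other than 'size' for numeric arrays).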
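
　　To make the first point concrete, here is a minimal Python sketch of the month arithmetic that an ADD_MONTH-style function would supply, for example inside a script that post-processes Hive output. It assumes dates are stored as 'YYYY-MM-DD' strings and that a day past the end of the target month is clamped to that month's last day. The function name add_months and both conventions are assumptions for illustration, not Hive behavior.

```python
import calendar
from datetime import date, datetime


def add_months(date_str, months):
    """Shift a 'YYYY-MM-DD' date string by a (possibly negative) number of months.

    Hypothetical helper: the name, the string format, and the clamping rule
    are assumptions for this sketch, not part of Hive.
    """
    d = datetime.strptime(date_str, "%Y-%m-%d").date()
    # Count months from year 0 so that year rollover falls out of divmod.
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day, e.g. Jan 31 + 1 month lands on the last day of February.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day)).isoformat()


print(add_months("2015-01-31", 1))   # 2015-02-28
print(add_months("2015-11-15", 3))   # 2016-02-15
```

　　Working on strings keeps the sketch close to how dates usually have to be carried when there is no real date type.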

　　===========================================================================================================================================================

　　1.No windowing functions, e.g. SUM(sales) OVER (PARTITION BY date). It's difficult to do a lot of things common in warehousing, like a running sum, without writing custom mappers/reducers or a UDF.

　　2.No regular UNION, INTERSECT, or MINUS operators.

　　3.Null values are treated differently from empty strings and are exported differently, i.e. empty strings are exported as ' ' and nulls are exported as nulls. I know this isn't unique to Hive, but it is still annoying when exporting data from Hive into another system.

　　4.No hierarchical/self referencing querying. I know most distributed computing solutions can't do this, but it can be very handy.

　　5.No Update or Delete statements.

　　6.I haven't been able to find any kind of cost-based explain plan. Running an explain plan generally just shows the path used to access the data. That is useful to some degree, but it would be better if it helped the user understand which steps cause the biggest slowdowns.
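
　　The running sum mentioned in point 1 is the kind of job that had to be pushed into a custom reducer. A minimal Python sketch of such a reducer follows. It assumes each input row is a tab-separated line with two columns (a partition key and a sales amount) and that rows arrive already grouped by key in the desired order. The column layout and the name running_sum are assumptions for illustration.

```python
from decimal import Decimal


def running_sum(lines):
    """Yield 'key<TAB>amount<TAB>running_total' for each input line.

    Assumes lines are 'key<TAB>amount' and already grouped by key
    (hypothetical layout chosen for this sketch).
    """
    current_key = None
    total = Decimal(0)
    for line in lines:
        key, amount = line.rstrip("\n").split("\t")
        if key != current_key:
            # A new key starts a new partition, so the total resets.
            current_key = key
            total = Decimal(0)
        total += Decimal(amount)
        yield f"{key}\t{amount}\t{total}"


rows = ["2015-03-01\t10", "2015-03-01\t5", "2015-03-02\t7"]
for out in running_sum(rows):
    print(out)
```

　　Run as a streaming script, the same function would be fed sys.stdin and each result printed, for instance through Hive's TRANSFORM ... USING clause. How the script is wired into a query depends on the Hive setup.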

　　=======================================================================================================================================================================

　　1. In the row format, the line terminator only supports '\n'.

　　2. Hive does not support running a query that selects from tables in more than one database.

　　3. Hive does not support sub-queries such as those connected by IN/EXISTS in the WHERE clause.

　　4. Hive does not support the truncation of data from a table.

　　===========================================================================================================================================================

Original article: https://www.cnblogs.com/catWang/p/4367347.html