  • [Hive-Tutorial] What Is Hive

    What Is Hive

    Hive is a data warehousing infrastructure built on Hadoop. Hadoop provides massive scale-out and fault-tolerance capabilities for data storage and processing (using the map/reduce programming paradigm) on commodity hardware.

    Hive is designed to enable easy data summarization, ad-hoc querying and analysis of large volumes of data. It provides a simple query language called Hive QL, which is based on SQL and lets users familiar with SQL do ad-hoc querying, summarization and data analysis easily. At the same time, Hive QL lets traditional map/reduce programmers plug in their own custom mappers and reducers for more sophisticated analysis that the built-in capabilities of the language may not support.
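
    As a rough illustration of the "custom mapper" idea above, here is a minimal sketch of a script that Hive QL could stream rows through with its TRANSFORM ... USING clause, which passes rows to the script as tab-separated lines on standard input and reads tab-separated lines back. The script name (host_mapper.py), the table (page_views) and the columns (user_id, url, host) are hypothetical and chosen only for this example; it also assumes a Python 3 interpreter is available on the cluster nodes.

```python
from urllib.parse import urlsplit

# Hypothetical Hive QL invocation (file, table and column names are made up):
#   ADD FILE host_mapper.py;
#   SELECT TRANSFORM(user_id, url)
#          USING 'python host_mapper.py'
#          AS (user_id, host)
#   FROM page_views;


def map_row(line):
    """Map one tab-separated row 'user_id<TAB>url' to 'user_id<TAB>host'."""
    user_id, url = line.rstrip("\n").split("\t", 1)
    host = urlsplit(url).netloc.lower()  # keep only the host part of the URL
    return f"{user_id}\t{host}"


def map_rows(lines):
    """Apply map_row to every non-blank line, yielding one output row each."""
    for line in lines:
        if line.strip():
            yield map_row(line)


if __name__ == "__main__":
    # In-memory demo rows; a real transform script would iterate sys.stdin.
    sample = ["u1\thttp://Example.com/a?x=1\n", "u2\thttps://hive.apache.org/\n"]
    for row in map_rows(sample):
        print(row)
```

    Saved as a standalone script, the final loop would read from sys.stdin instead of the sample list, so that Hive can feed it the selected rows and collect whatever it prints.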

    What Hive Is NOT

    Hadoop is a batch processing system, and Hadoop jobs tend to have high latency and incur substantial overhead in job submission and scheduling. As a result, latency for Hive queries is generally very high (minutes), even when the data sets involved are very small (say, a few hundred megabytes). Hive therefore cannot be compared with systems such as Oracle, where analyses run over significantly smaller amounts of data but proceed much more iteratively, with response times between iterations of less than a few minutes. Hive aims to provide acceptable (but not optimal) latency for interactive data browsing, queries over small data sets or test queries.

    Hive is not designed for online transaction processing and does not offer real-time queries or row-level updates. It is best used for batch jobs over large sets of immutable data (such as web logs).

    In the following sections we provide a tutorial on the capabilities of the system. We start by describing the concepts of data types, tables and partitions (which are very similar to what you would find in a traditional relational DBMS) and then illustrate the capabilities of Hive QL with the help of some examples.

     

  • Original article: https://www.cnblogs.com/tmeily/p/4240779.html