zoukankan      html  css  js  c++  java
  • hadoop what is difference between Pig and Hive? Stack Overflow

    hadoop - what is difference between Pig and Hive? - Stack Overflow

    Apache Pig and Hive are two projects that layer on top of Hadoop, and provide a higher-level language for using Hadoop's MapReduce library. Apache Pig provides a scripting language for describing operations like reading, filtering, transforming, joining, and writing data -- exactly the operations that MapReduce was originally designed for. Rather than expressing these operations in thousands of lines of Java code that uses MapReduce directly, Pig lets users express them in a language not unlike a bash or perl script. Pig is excellent for prototyping and rapidly developing MapReduce-based jobs, as opposed to coding MapReduce jobs in Java itself.

    If Pig is "scripting for Hadoop", then Hive is "SQL queries for Hadoop". Apache Hive offers an even more specific and higher-level language, for querying data by running Hadoop jobs, rather than directly scripting step-by-step the operation of several MapReduce jobs on Hadoop. The language is, by design, extremely SQL-like. Hive is still intended as a tool for long-running batch-oriented queries over massive data; it's not "real-time" in any sense. Hive is an excellent tool for analysts and business development types who are accustomed to SQL-like queries and Business Intelligence systems; it will let them easily leverage your shiny new Hadoop cluster to perform ad-hoc queries or generate report data across data stored in storage systems mentioned above.

  • 相关阅读:
    [树形DP]Luogu P1131 [ZJOI2007]时态同步
    [状压DP]JZOJ 1303 骑士
    [DFS]JZOJ 1301 treecut
    [最小费用最大流]JZOJ 4802 探险计划
    [KMP][倍增求LCA]JZOJ 4669 弄提纲
    [DP]JZOJ 1758 过河
    列表生成式和生成器表达式
    协程函数
    生成器
    迭代器
  • 原文地址:https://www.cnblogs.com/lexus/p/2594445.html
Copyright © 2011-2022 走看看