  • Hive: Creating an External Table

    External Tables

         However, managed tables are less convenient for sharing with other tools. For example, suppose we have data that is created and used primarily by Pig or other tools, and we want to run some queries against it without giving Hive ownership of the data. We can define an external table that points to that data but doesn't take ownership of it.

        Suppose we are analyzing data from the stock markets. Periodically, we ingest the data for NASDAQ and the NYSE from a source like Infochimps (http://infochimps.com/datasets).

        Now the following table declaration creates an external table that can read all the data files for this comma-delimited data in /data/stocks:

    hive(Economy)> create external table if not exists stocks(
                 > exchange string, symbol string, ymd string, price_open float, price_high float, price_low float,
                 > price_close float, volume int, price_adj_close float)
                 > row format delimited fields terminated by ','
                 > location '/data/stocks';
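
        Once the table is declared, it can be queried like any other Hive table. The following is a minimal sketch, assuming the files under /data/stocks match the schema above; the filter value 'NASDAQ' and the LIMIT are illustrative only. The column name exchange is quoted in backticks because it is treated as a reserved word in some newer Hive releases.

```sql
-- Read a few rows through the external table; Hive only reads the files
-- in /data/stocks, it does not move or take ownership of them.
SELECT symbol, ymd, price_close
FROM stocks
WHERE `exchange` = 'NASDAQ'   -- illustrative filter value
LIMIT 10;
```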

         Because it's external, Hive doesn't assume it owns the data. Therefore, dropping the table doesn't delete the data, although the metadata for the table will be deleted. (The drop sometimes fails with a permission-denied error.)
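
        The effect of dropping an external table can be checked from the Hive CLI. This is a sketch that assumes the stocks table defined above and a session with permission to list /data/stocks:

```sql
-- Remove only the table definition from the metastore.
DROP TABLE IF EXISTS stocks;

-- Hive CLI command: list the directory to confirm the data files remain.
dfs -ls /data/stocks;
```

        If the files are still listed, re-running the create external table statement makes the same data queryable again.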

        In addition, you can tell whether a table is managed or external from the output of 'hive> describe extended tablename'.
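
        For example, using the stocks table from above (DESCRIBE FORMATTED is shown as an alternative that prints the same information in a more readable layout):

```sql
-- Near the end of the detailed output, look for
-- tableType:EXTERNAL_TABLE or tableType:MANAGED_TABLE.
DESCRIBE EXTENDED stocks;

-- Alternative: the "Table Type:" row shows the same value.
DESCRIBE FORMATTED stocks;
```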

        As with managed tables, you can also copy the schema (but not the data) of an existing table:

        hive> create external table if not exists Economy1
            > like Economy
            > location '/data/stocks/path';

        What's more, if you omit the 'external' keyword and the original table is external, the new table will also be external; if you omit 'external' and the original table is managed, the new table will also be managed. However, if you include the external keyword and the original table is managed, the new table will be external.
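
        Since the resulting table type depends on both the keyword and the original table, it is worth confirming the result on your own Hive version rather than assuming it. A sketch, in which the table names stocks_copy and stocks_like and the path /data/stocks_copy are hypothetical:

```sql
-- Explicitly external copy of the schema, pointing at its own directory.
CREATE EXTERNAL TABLE IF NOT EXISTS stocks_copy
LIKE stocks
LOCATION '/data/stocks_copy';   -- hypothetical path

-- Copy of the schema with the EXTERNAL keyword omitted.
CREATE TABLE IF NOT EXISTS stocks_like
LIKE stocks;

-- Check the "Table Type:" row of each new table.
DESCRIBE FORMATTED stocks_copy;
DESCRIBE FORMATTED stocks_like;
```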

  • Original article: https://www.cnblogs.com/likai198981/p/2982427.html