zoukankan      html  css  js  c++  java
  • Hive的表操作

    表的修改

      1.alter table 修改表的元数据:

        a.修改表名称

          hive>alter table student rename to student1;

        b.修改表分区

          hive>alter table hive.logs add partition (dt='2018-9-1',country='beijng'); 

          hive>alter table hive.logs drop partition (dt='2018-9-1',country='beijng'); 

        c.添加列

          hive>alter table hive.logs add columns(id int,name string);

        d.删除和替换列

          hive>alter table hive.logs replace columns (id int,name string);

    向管理表装载数据

      1.装载数据

        hive>load data [local] inpath '/data/hive' [overwrite] into table tbname;

        说明:含local关键字:加载本地目录数据到warehouse下;如果不含local关键字,加载数据目录默认为hdfs路径。

      2.通过查询语句向表中插入数据:在原始数据的基础上,进行业务分析后生成的表数据

        hive>insert overwrite table new_table select [column字段] from 原始数据表 where 条件;

          或
          hive> from 原始数据表 别名 insert overwrite table new_table select [column字段] where 条件;

      3.创建表时,也可以执行插入数据

        hive>create table new_table as select id,name,age from student where age>23;

      4.导出数据 

        hive>insert overwrite directory '/data/stocks/2009-6-23' select * from stocks where ymd ='2009=6-23';

    查询

      1.对array查询 

        hive>select name,subordinates[1] from hive.employees where name='John Doe';

      2.对map查询

        hive>select name,deductions['State Taxes'] from hive.employees where name = 'John Doe';

      3.对struct查询

        hive>select name,address.state from hive.employees where name ='John Doe';

      4.支持运算符

      5.内置函数,比如sum(),count(),avg()等。

      6.limit

        hive>select upper(name),salary,deductions['Federal Taxes'],round(salary*(1-deductions['Federal Taxes'])) from hive.employees limit 6;

      7.列别名

        hive>select upper(name),salary,deductions['Federal Taxes'] as Taxes,round(salary*(1-deductions['Federal Taxes'])) as sal from hive.employees limit 6;

      8.嵌套

         hive>from (select upper(name) as name,salary,deductions['Federal Taxes'] as Taxes,round(salary*(1-deductions['Federal Taxes'])) as sal from hive.employees) e select e.name,e.Taxes,e.sal where e.sal >70000;

    本地执行条件 

      1.简单查询:select * from table;

      2.将分区列作为查询条件,但查询列应为“*”;

      3.调优方案:建议将hive.exec.mode.local.auto=true;并添加至{/home/hyxy/.hiverc}中;

    group by

      hive>select year(ymd),avg(price_close) from stocks where exchange1='NASDAQ' and symbol = 'AAPL' group by year(ymd);

      说明:分组属性列在mysql中一定要出现在select子句之后;但在hive中不需要。  

  • 相关阅读:
    Chapter 1 Secondary Sorting:Introduction
    Apache Spark : RDD
    Apache Spark : Introduction
    Python 变量和数据类型
    CMP3$
    idea实现eclipse工作空间的思路
    Oracle 数据字典查询
    sql左连接、右连接、内连接、不写连接符号
    ADB
    Serverless
  • 原文地址:https://www.cnblogs.com/lyr999736/p/9473935.html
Copyright © 2011-2022 走看看