  • Hive 5: Hive Data Types and DDL (Data Definition Language)

    Official documentation: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL

    -- Complex data types
    data_type
      : primitive_type
      | array_type
      | map_type
      | struct_type
      | union_type  -- (Note: Available in Hive 0.7.0 and later)
    
    array_type : ARRAY < data_type >
    map_type : MAP < primitive_type, data_type >
    struct_type : STRUCT < col_name : data_type [COMMENT col_comment], ...>
    union_type : UNIONTYPE < data_type, data_type, ... >  -- (Note: Available in Hive 0.7.0 and later)
    -- Primitive data types
    primitive_type
      : TINYINT
      | SMALLINT
      | INT
      | BIGINT
      | BOOLEAN
      | FLOAT
      | DOUBLE
      | STRING
      | BINARY      -- (Note: Available in Hive 0.8.0 and later)
      | TIMESTAMP   -- (Note: Available in Hive 0.8.0 and later)
      | DECIMAL     -- (Note: Available in Hive 0.11.0 and later)
      | DECIMAL(precision, scale)  -- (Note: Available in Hive 0.13.0 and later)
      | DATE        -- (Note: Available in Hive 0.12.0 and later)
      | VARCHAR     -- (Note: Available in Hive 0.12.0 and later)
      | CHAR        -- (Note: Available in Hive 0.13.0 and later)
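
      As a minimal sketch of how these types combine in a column list, the statement below declares a few primitive columns next to one column of each common complex type. The table name type_demo and all column names are hypothetical, and the parameterized types assume a Hive version that supports them (see the version notes in the grammar above).

    ```sql
    -- Hypothetical table, for illustration only
    CREATE TABLE IF NOT EXISTS type_demo (
      id       BIGINT,
      price    DECIMAL(10,2),                       -- needs Hive 0.13.0 or later
      label    VARCHAR(64),                         -- needs Hive 0.12.0 or later
      created  TIMESTAMP,
      tags     ARRAY<STRING>,
      attrs    MAP<STRING, INT>,
      contact  STRUCT<email:STRING, phone:STRING>
    );
    ```

      In queries, array elements are read with a zero-based index (tags[0]), map values by key (attrs['some_key']), and struct fields with dot notation (contact.email).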
    

      

    Hive DDL

      The full Hive DDL syntax

      Hive DDL syntax is SQL-like, so most standard SQL syntax can be used in Hive.

    CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name    -- (Note: TEMPORARY available in Hive 0.14.0 and later)
      [(col_name data_type [COMMENT col_comment], ...)]
      [COMMENT table_comment]
      [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
      [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
      [SKEWED BY (col_name, col_name, ...)                  -- (Note: Available in Hive 0.10.0 and later)
         ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
         [STORED AS DIRECTORIES]]
      [
       [ROW FORMAT row_format] 
       [STORED AS file_format]
         | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  -- (Note: Available in Hive 0.6.0 and later)
      ]
      [LOCATION hdfs_path]
      [TBLPROPERTIES (property_name=property_value, ...)]   -- (Note: Available in Hive 0.6.0 and later)
      [AS select_statement];   -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
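
      To make the grammar above more concrete, here is a sketch of a statement that uses several of the optional clauses in the order the grammar lists them. The table name page_views, its columns, the HDFS path, and the table property are all assumptions chosen for illustration.

    ```sql
    -- Hypothetical external, partitioned table; names and path are assumptions
    CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
      user_id  BIGINT  COMMENT 'visitor id',
      url      STRING  COMMENT 'requested page'
    )
    COMMENT 'illustrative example table'
    PARTITIONED BY (dt STRING COMMENT 'date of the visit')
    ROW FORMAT DELIMITED
      FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE
    LOCATION '/data/page_views'
    TBLPROPERTIES ('creator' = 'example');
    ```

      Because the table is EXTERNAL, dropping it removes only the metadata; the files under LOCATION stay in place.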
    

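      Creating tables in Hive

    -- CREATE TABLE ... LIKE syntax: create an empty table with the same schema as an existing table or view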
    CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
    LIKE existing_table_or_view_name
    [LOCATION hdfs_path];
    
    create table person( 
    id int, 
    name string, 
    age int, 
    likes array<string>, 
    address map<string,string> 
    ) 
    row format delimited  
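    -- Delimiter between columns in the imported data
    fields terminated by ','
    -- Delimiter between collection items (array elements and map entries)
    collection items terminated by '-'
    -- Delimiter between a map key and its value
    map keys terminated by ':'
    -- Delimiter between rows
    lines terminated by '\n';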
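
      The LIKE form shown at the top of this block can be tried against the person table. A small sketch, where person_copy is a hypothetical name:

    ```sql
    -- person_copy is a hypothetical name; only the schema is copied, not the rows
    CREATE TABLE IF NOT EXISTS person_copy LIKE person;

    -- Inspect the resulting columns, delimiters, and storage location
    DESCRIBE FORMATTED person_copy;
    ```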
    

      

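      Loading data into Hive

    # Three records: columns are separated by ','; array elements and map entries by '-'; each map key and its value by ':'; rows by newlines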
    1,tom,28,game-music-book,stu:henan-home:henan-work:beijing
    2,jack,21,money-meinv,stu:wuhan-home:wuhan
    3,lusi,18,shopping-music,stu:shanghai-home:beijing
    

      Load the data:

    hive> load data local inpath '/opt/data.txt' overwrite into table person;
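
      The command above reads from the local file system (LOCAL) and replaces the table's existing contents (OVERWRITE). As a sketch of the other common variant, the statement below omits both keywords; the HDFS path is a hypothetical example.

    ```sql
    -- Without LOCAL, the path refers to HDFS and the file is moved into the table's directory.
    -- Without OVERWRITE, the new rows are appended to the existing data.
    -- '/tmp/hive_staging/data.txt' is a hypothetical path.
    LOAD DATA INPATH '/tmp/hive_staging/data.txt' INTO TABLE person;
    ```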

      Querying data

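    -- Select all rows
    select * from person;
    
    -- Filter on a primitive column
    select * from person where name='tom';
    
    -- Filter on an array element (indexes are zero-based, so likes[1] is the second entry)
    select * from person where likes[1]='music';
    
    -- Filter on a map value by key
    select * from person where address['stu']='shanghai';
    
    -- Aggregate functions work as well
    select avg(age) from person;
    
    -- ... most standard SQL syntax, including many functions, can be used in Hive, because Hive is SQL-like.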
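
      Beyond indexing, Hive ships built-in functions for working with array and map columns. The queries below are a sketch against the person table defined earlier; the aliases like_count, hobby, t, and address_keys are arbitrary names.

    ```sql
    -- Number of entries in each person's likes array
    SELECT name, size(likes) AS like_count FROM person;

    -- One output row per (name, hobby) pair, using explode() with LATERAL VIEW
    SELECT name, hobby
    FROM person
    LATERAL VIEW explode(likes) t AS hobby;

    -- Keys present in each person's address map
    SELECT name, map_keys(address) AS address_keys FROM person;
    ```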
    

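      However, row-level operations such as INSERT, UPDATE, and DELETE are not recommended in Hive. Hive is built for extracting and analyzing data-warehouse data in bulk, so it is a poor fit for row-by-row work.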
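
      In keeping with that batch orientation, derived data is usually written in bulk from a query rather than one row at a time. A sketch, assuming the person table from earlier; the table name person_adults and the age thresholds are hypothetical.

    ```sql
    -- Create and fill a derived table in one step (CREATE TABLE ... AS SELECT)
    CREATE TABLE person_adults AS
    SELECT id, name, age
    FROM person
    WHERE age >= 21;

    -- Later, replace the whole contents of the derived table in a single batch job
    INSERT OVERWRITE TABLE person_adults
    SELECT id, name, age
    FROM person
    WHERE age >= 18;
    ```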

      Emptying a table

    -- Empty the table with TRUNCATE
    TRUNCATE TABLE person;
    -- Empty the table by overwriting it with an empty result set
    insert overwrite table person select * from person where 1=2;
    

      

    -- Drop the table
    drop table person;
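
      In scripts that may run more than once, a guarded form of the drop is often more convenient. A minimal sketch:

    ```sql
    -- IF EXISTS avoids an error when the table has already been dropped
    DROP TABLE IF EXISTS person;
    ```

      For a managed table such as person, dropping removes both the metadata and the data; for a table created with EXTERNAL, only the metadata is removed.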
    

      

  • Original article: https://www.cnblogs.com/tesla-turing/p/11509261.html