  • 2.3 Hive Data Types and How to Use a Python Script for ETL in Real Projects

    I. Hive Data Types

    https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types

    Numeric Types
            · TINYINT (1-byte signed integer, from -128 to 127)
            · SMALLINT (2-byte signed integer, from -32,768 to 32,767)
            · INT (4-byte signed integer, from -2,147,483,648 to 2,147,483,647)
            · BIGINT (8-byte signed integer, from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)
            · FLOAT (4-byte single precision floating point number)
            · DOUBLE (8-byte double precision floating point number)
            · DECIMAL
                    · Introduced in Hive 0.11.0 with a precision of 38 digits
                    · Hive 0.13.0 introduced user-definable precision and scale
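The integer ranges in the list above follow directly from the byte widths. As a rough illustration (a sketch, not part of Hive itself), the helper names `signed_range` and `smallest_int_type` below are hypothetical and simply reproduce the arithmetic in Python:

```python
# Byte widths of the Hive integer types listed above, narrowest first.
HIVE_INT_BYTES = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of this width."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def smallest_int_type(value):
    """Return the narrowest listed type whose range contains value, else None."""
    for name, width in HIVE_INT_BYTES.items():
        low, high = signed_range(width)
        if low <= value <= high:
            return name
    return None
```

A helper like this can be handy when choosing column types for a new table: a value such as 40000 does not fit in SMALLINT, so it needs at least INT.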
    
    
    Date/Time Types
            · TIMESTAMP (Note: only available starting with Hive 0.8.0)
            · DATE (Note: only available starting with Hive 0.12.0)
    
    
    String Types
        · STRING
        · VARCHAR (Note: only available starting with Hive 0.12.0)
        · CHAR (Note: only available starting with Hive 0.13.0)
    
    
    Misc Types
        · BOOLEAN
        · BINARY (Note: only available starting with Hive 0.8.0)
    
    
    
    Complex Types
        · arrays: ARRAY<data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
        · maps: MAP<primitive_type, data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
        · structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
        · union: UNIONTYPE<data_type, data_type, ...> (Note: only available starting with Hive 0.7.0.)
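To make the complex types above more concrete, here is a minimal sketch of how a Python script might decode ARRAY and MAP columns from a raw text-format data file. It assumes the table was created with Hive's default text delimiters (Ctrl-A between columns, Ctrl-B between collection items, Ctrl-C between a map key and its value). The table layout (`name STRING, tags ARRAY<STRING>, scores MAP<STRING,INT>`) and all function names are hypothetical.

```python
# Hive's default text-format delimiters (assumed; a table can override them).
FIELD_SEP = "\x01"  # between columns (Ctrl-A)
ITEM_SEP = "\x02"   # between ARRAY elements or MAP entries (Ctrl-B)
KEY_SEP = "\x03"    # between a MAP key and its value (Ctrl-C)


def parse_array(text):
    """Split a serialized ARRAY column into a list of strings."""
    return text.split(ITEM_SEP) if text else []


def parse_map(text):
    """Split a serialized MAP column into a dict of string keys and values."""
    result = {}
    for entry in parse_array(text):
        key, _, value = entry.partition(KEY_SEP)
        result[key] = value
    return result


def parse_row(line):
    """Decode one row of the hypothetical (name, tags, scores) table."""
    name, tags, scores = line.rstrip("\n").split(FIELD_SEP)
    typed_scores = {k: int(v) for k, v in parse_map(scores).items()}
    return name, parse_array(tags), typed_scores
```

If the table declares its own `FIELDS TERMINATED BY` or `COLLECTION ITEMS TERMINATED BY` characters, the three constants would need to change accordingly.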


    II. Primitive Types

    · Types are associated with the columns in the tables. The following primitive types are supported:
    
    · Integers
        · TINYINT - 1 byte integer
        · SMALLINT - 2 byte integer
        · INT - 4 byte integer
        · BIGINT - 8 byte integer
    
    
    · Boolean type
        · BOOLEAN - TRUE/FALSE
    
    
    · Floating point numbers
        · FLOAT - single precision
        · DOUBLE - double precision
    
    
    · String type
        · STRING - sequence of characters in a specified character set
    
    
    https://cwiki.apache.org/confluence/display/Hive/Tutorial


    III. ETL Workflow Using a Python Script

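    1) Create the table and load the raw data (E: Extract)

    2) SELECT the rows and pass them through a Python script (T: Transform)

    3) Write the result into a sub-table (L: Load)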
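The three steps above can be sketched on the Python side as follows. This is only an illustration modeled on the MovieLens example in the Hive documentation, not the author's actual script. The column names (`userid`, `movieid`, `rating`, `unixtime`), the script name `weekday_mapper.py`, the table names in the comment, and the choice of UTC are all assumptions. Hive streams the selected rows to the script as tab-separated lines on standard input and reads the transformed lines back from standard output.

```python
from datetime import datetime, timezone

# Illustrative HiveQL for steps 2 and 3 (table and column names are assumed):
#   ADD FILE weekday_mapper.py;
#   INSERT OVERWRITE TABLE u_data_new
#   SELECT TRANSFORM (userid, movieid, rating, unixtime)
#          USING 'python weekday_mapper.py'
#          AS (userid, movieid, rating, weekday)
#   FROM u_data;


def transform_line(line):
    """Replace the trailing Unix timestamp of one tab-separated row with its ISO weekday."""
    userid, movieid, rating, unixtime = line.rstrip("\n").split("\t")
    # UTC is an assumption here; it keeps the result independent of the server's time zone.
    weekday = datetime.fromtimestamp(int(unixtime), tz=timezone.utc).isoweekday()
    return "\t".join([userid, movieid, rating, str(weekday)])


def transform_stream(lines):
    """Yield one transformed, newline-terminated row per non-empty input line."""
    for line in lines:
        if line.strip():
            yield transform_line(line) + "\n"
```

To use this as a streaming script, the file would end with `import sys` and `sys.stdout.writelines(transform_stream(sys.stdin))`. Keeping the per-row logic in `transform_line` makes it easy to test the transformation locally before running it inside Hive.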

  • Original article: https://www.cnblogs.com/weiyiming007/p/10750144.html