  • 2.3 Hive data types, and how to use a Python script for ETL in a real project

    1. Hive Data Types

    https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types

    Numeric Types
            · TINYINT (1-byte signed integer, from -128 to 127)
            · SMALLINT (2-byte signed integer, from -32,768 to 32,767)
            · INT (4-byte signed integer, from -2,147,483,648 to 2,147,483,647)
            · BIGINT (8-byte signed integer, from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)
            · FLOAT (4-byte single precision floating point number)
            · DOUBLE (8-byte double precision floating point number)
            · DECIMAL
                    · Introduced in Hive 0.11.0 with a precision of 38 digits
                    · Hive 0.13.0 introduced user definable precision and scale
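
The integer ranges listed above follow directly from the byte widths: an n-byte signed integer spans -2^(8n-1) to 2^(8n-1)-1. The following is a minimal Python sketch of that arithmetic. The helper names `signed_range` and `smallest_int_type` are hypothetical and are not part of Hive; they only illustrate how one might check, before loading, which integer column type a value fits into.

```python
# Byte widths of Hive's signed integer types, as listed above.
HIVE_INT_BYTES = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of num_bytes bytes."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def smallest_int_type(value):
    """Return the narrowest Hive integer type that can hold value, or None.

    Hypothetical helper for illustration; relies on dict insertion order
    (Python 3.7+) to try the types from narrowest to widest.
    """
    for name, width in HIVE_INT_BYTES.items():
        low, high = signed_range(width)
        if low <= value <= high:
            return name
    return None
```

For example, `smallest_int_type(40000)` skips TINYINT and SMALLINT because 40000 exceeds 32,767, and a value of 2^63 fits none of the four types.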
    
    
    Date/Time Types
            · TIMESTAMP (Note: Only available starting with Hive 0.8.0)
            · DATE (Note: Only available starting with Hive 0.12.0)
    
    
    String Types
        · STRING
        · VARCHAR (Note: Only available starting with Hive 0.12.0)
        · CHAR (Note: Only available starting with Hive 0.13.0)
    
    
    Misc Types
        · BOOLEAN
        · BINARY (Note: Only available starting with Hive 0.8.0)
    
    
    
    Complex Types
        · arrays: ARRAY<data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
        · maps: MAP<primitive_type, data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
        · structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
        · union: UNIONTYPE<data_type, data_type, ...> (Note: Only available starting with Hive 0.7.0.)
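
To make the ARRAY and MAP types more concrete, the sketch below decodes one line of a Hive text-format table into Python values. It assumes the table uses Hive's default text delimiters (\x01 between fields, \x02 between collection items, \x03 between a map key and its value). The column layout (name STRING, scores ARRAY<INT>, attrs MAP<STRING,STRING>) and the function name `parse_row` are hypothetical, chosen only for illustration.

```python
FIELD_SEP = "\x01"  # default field delimiter (Ctrl-A)
ITEM_SEP = "\x02"   # default collection-item delimiter (Ctrl-B)
KEY_SEP = "\x03"    # default map-key delimiter (Ctrl-C)


def parse_row(line):
    """Decode one text-format row of the assumed layout:
    name STRING, scores ARRAY<INT>, attrs MAP<STRING,STRING>.
    """
    name, raw_scores, raw_attrs = line.rstrip("\n").split(FIELD_SEP)

    # ARRAY<INT>: items separated by ITEM_SEP.
    scores = [int(s) for s in raw_scores.split(ITEM_SEP)] if raw_scores else []

    # MAP<STRING,STRING>: entries separated by ITEM_SEP, key and value by KEY_SEP.
    attrs = {}
    if raw_attrs:
        for pair in raw_attrs.split(ITEM_SEP):
            key, value = pair.split(KEY_SEP, 1)
            attrs[key] = value

    return {"name": name, "scores": scores, "attrs": attrs}
```

An ARRAY maps naturally onto a Python list and a MAP onto a dict. A table declared with custom delimiters would need different separator constants.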


    2. Primitive Types

    · Types are associated with the columns in the tables. The following primitive types are supported:
    
    · Integers
        · TINYINT - 1 byte integer
        · SMALLINT - 2 byte integer
        · INT - 4 byte integer
        · BIGINT - 8 byte integer
    
    
    · Boolean type
        · BOOLEAN - TRUE/FALSE
    
    
    · Floating point numbers
        · FLOAT - single precision
        · DOUBLE - double precision
    
    
    · String type
        · STRING - sequence of characters in a specified character set
    
    
    https://cwiki.apache.org/confluence/display/Hive/Tutorial


    3. ETL workflow using a Python script

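    1) Create the source table and load the raw data into it (table, load): E (Extract)

    2) Query the source table and pass the rows through a Python script (select, python): T (Transform)

    3) Write the transformed result into a sub-table (sub table): L (Load)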
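
The Transform step can be sketched as a streaming script that Hive invokes through `SELECT TRANSFORM (...) USING 'python ...'`. Hive passes the selected columns to the script's standard input as tab-separated lines and reads tab-separated lines back from its standard output. The input layout (userid, movieid, rating, unixtime), the choice to convert the timestamp into a weekday, and the names `transform_line` and `main` are all assumptions made for illustration; the article does not specify the actual script.

```python
import sys
from datetime import datetime, timezone


def transform_line(line):
    """Replace the unixtime column with an ISO weekday (1 = Monday ... 7 = Sunday).

    Assumed input layout (hypothetical): userid, movieid, rating, unixtime,
    separated by tabs, which is how Hive feeds rows to a TRANSFORM script.
    """
    userid, movieid, rating, unixtime = line.rstrip("\n").split("\t")
    # Interpret the timestamp in UTC so the result does not depend on the
    # time zone of the machine that runs the script.
    weekday = datetime.fromtimestamp(int(unixtime), tz=timezone.utc).isoweekday()
    return "\t".join([userid, movieid, rating, str(weekday)])


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Stream rows from stdin to stdout, one transformed row per input row."""
    for line in stdin:
        if line.strip():  # skip blank lines
            stdout.write(transform_line(line) + "\n")
```

To use this as a real script, call `main()` under the usual `if __name__ == "__main__":` guard and register the file with `ADD FILE` before the query. An `INSERT OVERWRITE TABLE ... SELECT TRANSFORM (...) USING ... AS (...)` statement would then cover steps 2 and 3 together, writing the script's output into the sub-table.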

  • Original article: https://www.cnblogs.com/weiyiming007/p/10750144.html