  • Hive (7): HQL DML

    Contents:

    • Loading files into tables
    • Inserting data into Hive Tables from queries
    • Writing data into the filesystem from queries
    • Inserting values into tables from SQL
    • Delete
    • Application demo

     Loading files into tables:


    • Syntax: LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
    • Example:
      -- Create the table
      CREATE TABLE web_log(viewTime INT, userid BIGINT, url STRING, referrer STRING, ip STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

      -- Load data from a local file
      LOAD DATA LOCAL INPATH '/usr/zhu/table.txt' OVERWRITE INTO TABLE web_log;
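    • OVERWRITE:
    1. Any existing contents of the target table (or partition) are deleted, and then the contents of the file or directory that filepath points to are added to the table or partition.
    2. If the target table (or partition) already contains a file whose name collides with a file name in filepath, the existing file is replaced by the new one.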
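
    The example above uses both LOCAL and OVERWRITE. As a contrast, here is a minimal sketch of the other combination: loading from HDFS into one partition without overwriting. The table web_log_part, its dt partition column, and the HDFS path are hypothetical names introduced only for illustration.

```sql
-- Hypothetical partitioned variant of web_log; the dt column and the HDFS path are illustrative only
CREATE TABLE web_log_part(viewTime INT, userid BIGINT, url STRING, referrer STRING, ip STRING)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- No LOCAL: the path is on HDFS, and the files are moved (not copied) into the partition directory
-- No OVERWRITE: the files are added alongside whatever the partition already holds
LOAD DATA INPATH '/user/zhu/web_log/2016-08-15' INTO TABLE web_log_part PARTITION (dt='2016-08-15');
```

    With LOCAL, Hive copies the file from the local filesystem and leaves the original in place. Without it, the source files no longer exist at the original HDFS path after the load.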

    Inserting data into Hive Tables from queries:


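    • Syntax: INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
              INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
    • Example:
      -- Create a table with the same structure (LIKE copies the schema only, not the data)
      create table empDemo like employee;

      -- Append data
      insert into table empDemo select * from employee;

      -- Overwrite the existing data
      insert overwrite table empDemo select * from employee;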
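
    The syntax allows an optional PARTITION clause that the example does not exercise. The following sketch shows a static and a dynamic partition insert. It assumes that employee has the columns id, dept and name (the original does not name them), and emp_by_dept is a hypothetical table.

```sql
-- Hypothetical target table; employee is assumed to have columns (id, dept, name)
CREATE TABLE emp_by_dept(id STRING, name STRING)
PARTITIONED BY (dept STRING);

-- Static partition: the partition value is fixed in the statement
INSERT OVERWRITE TABLE emp_by_dept PARTITION (dept='001')
SELECT id, name FROM employee WHERE dept = '001';

-- Dynamic partition: the partition value is taken from the last column of the SELECT
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
INSERT OVERWRITE TABLE emp_by_dept PARTITION (dept)
SELECT id, name, dept FROM employee;
```

    With a PARTITION clause, INSERT OVERWRITE replaces only the partitions that are written to, not the whole table.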

    Writing data into the filesystem from queries:


    • Syntax: INSERT OVERWRITE [LOCAL] DIRECTORY directory1 SELECT ... FROM ...
    • Example:
      INSERT OVERWRITE LOCAL DIRECTORY './tmp/zhu' SELECT * FROM employee;
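
    By default the exported files separate columns with the Ctrl-A (\001) character, which is awkward to read in other tools. Since Hive 0.11 a ROW FORMAT clause can be added to choose the delimiter. A minimal sketch follows, in which the output path is a hypothetical example.

```sql
-- Export employee as tab-separated text to a local directory (path is illustrative only)
-- OVERWRITE replaces whatever the target directory already contains
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/zhu_tsv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT * FROM employee;
```

    Dropping LOCAL writes the result to a directory on HDFS instead of the local filesystem.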

    Inserting values into tables from SQL:


    • Syntax: INSERT INTO TABLE tablename VALUES values_row [, values_row ...]
    • Example:
      -- Insert a single row
      insert into table employee values('001','001','tgzhu');

      -- Insert multiple rows
      insert into table employee values('004','004','WangWu'),('005','005','ZhaoZhao');

    Delete:


    • Syntax: DELETE FROM tablename [WHERE expression]
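
    DELETE is available from Hive 0.14 onward, and only on tables that support ACID transactions, so it fails on a plain text table such as the ones created earlier. The following sketch shows one possible setup. The table employee_acid and its column names are hypothetical, and the exact settings required depend on the Hive version and on how the cluster is configured.

```sql
-- Session settings commonly needed for ACID operations (the server side must also enable transactions)
SET hive.support.concurrency = true;
SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Hypothetical transactional table: ORC storage plus the 'transactional' property
-- Bucketing is required before Hive 3 and optional afterwards
CREATE TABLE employee_acid(id STRING, dept STRING, name STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- Delete only the matching rows; omitting WHERE deletes every row in the table
DELETE FROM employee_acid WHERE id = '001';
```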

    Application demo:


    • The following real-world demo illustrates Hive DDL and DML. The steps are:
    • Create an external table mapped to an HBase table, using the following HQL:
      CREATE EXTERNAL TABLE if not exists Hive_CM_EvcRealTimeData(
          Rowkey  string,
          RealTimeData_CarNo  string,
          RealTimeData_Time  string,
          RealTimeData_Speed decimal(20,8),
          RealTimeData_Mileage decimal(20,8),
          RealTimeData_HighestVoltageBatteryOrd int,
          RealTimeData_Latitude decimal(20,8),
          RealTimeData_Longitude decimal(20,8)
      )
      STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      WITH SERDEPROPERTIES('hbase.columns.mapping' = ':key,d:RealTimeData_CarNo,d:RealTimeData_Time,d:RealTimeData_Speed,d:RealTimeData_Mileage,d:RealTimeData_HighestVoltageBatteryOrd,d:RealTimeData_Latitude,d:RealTimeData_Longitude')
      TBLPROPERTIES('hbase.table.name' = 'CM_EvcRealTimeData')
    • Create a managed Hive table to hold the computed results:
      CREATE TABLE if not exists Hive_CM_CarDailyRpt(
          CarNo        string,
          DTime        string,
          OnLineCount  int,
          RunCount     int,
          Mileage      decimal(20,8),
          MaxSpeed     decimal(20,8),
          totalPower   decimal(20,8),
          AverageSpeed decimal(20,8),
          CDI_BatteryFlag string,
          CDI_CoordinatorFlag string
      )
      STORED AS TEXTFILE
    • Run the computation and insert the results into the managed table:
      set hive.execution.engine = tez;

      -- Hive has no CONVERT(type, expr) function, so type conversions use CAST(expr AS type)
      Insert overwrite table Hive_CM_CarDailyRpt
      select
            CarNo,DTime,
            CAST(SUM(CT) AS int) as OnLineCount,
            CAST(SUM(CTSPEED) AS int) as RunCount,
            CAST(MAX(MILE)-MIN(MILE) AS decimal(18,2)) as Mileage,
            CAST(MAX(SPEED) AS decimal(18,2)) as MaxSpeed,
            ((MAX(MILE)-MIN(MILE))*0.2) as totalPower,
            case when SUM(CTSPEED)>0 then CAST((MAX(MILE)-MIN(MILE))/SUM(CTSPEED) AS decimal(18,2))
            else 0 end as AverageSpeed,
            case when SUM(RealTimeData_HighestVoltageBatteryOrd)>0 then '0' else '1' end as BatteryFlag,
            case when (SUM(RealTimeData_Latitude) + SUM(RealTimeData_Longitude))>0 then '0' else '1' end as LatitudeFlag
      from
          (
           -- Read from the Hive external table created above, which maps the HBase table CM_EvcRealTimeData
           SELECT REALTIMEDATA_CARNO AS CARNO,
                substring(RealTimeData_Time,1,8) as DTime,
                1 AS CT,
                CASE WHEN REALTIMEDATA_SPEED>0 THEN 1 ELSE 0 END AS CTSPEED,
                CASE WHEN REALTIMEDATA_MILEAGE=0 THEN NULL ELSE REALTIMEDATA_MILEAGE END AS MILE,
                CASE WHEN REALTIMEDATA_SPEED>200 then 0 else REALTIMEDATA_SPEED end AS SPEED,
                RealTimeData_HighestVoltageBatteryOrd,
                RealTimeData_Latitude,RealTimeData_Longitude
           FROM Hive_CM_EvcRealTimeData
          ) t
      group by CarNo,DTime
    • Finally, export the computed results to a relational database or to HBase.
  • Original article: https://www.cnblogs.com/tgzhu/p/5773433.html