zoukankan      html  css  js  c++  java
  • Hive使用,UDF的创建

    create database testdb;


    自定义UDF
    1.把程序打包放到目标机器上:hiveserver(hive使用端)
    2.进入hive客户端,添加jar包
    hive>add jar /usr/jar/hivetest.jar
    3.创建临时函数
    hive>create temporary function add_example as 'hive.udf.Add';
    4.查询HQL语句:
    select add_example(8,9) from score;
    5.销毁临时函数
    hive>drop temporary function add_example;


    create table psn1(
    id int,
    name string,
    likes ARRAY <string>,
    address MAP <string,string>
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED by ','
    COLLECTION ITEMS TERMINATED by '-'
    MAP KEYS TERMINATED by ':';

    vi /var/tmp/data/psn1
    1,xiaoming,book-tv-code,beijing:chaoyang-shanghai:pudong
    2,lilei,book-code,huoxing:xxx-shanghai:pudong
    3,hanmeimie,book-lilei,beijing:xxx

    hive>load data local inpath '/var/tmp/data/psn1' into table psn1;


    创建外部表
    create external table psn2(
    id int,
    name string,
    likes ARRAY <string>,
    address MAP <string,string>
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED by ','
    COLLECTION ITEMS TERMINATED by '-'
    MAP KEYS TERMINATED by ':'
    LOCATION '/user/data';

    外部表删除后,在数据库里没有表的数据,但在目录中该文件仍存在


    CREATE TABLE IF NOT EXISTS employee ( eid int, name String,
    salary String, destination String)
    COMMENT 'Employee details'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ' '
    LINES TERMINATED BY ' '
    STORED AS TEXTFILE;

    sample.txt
    1201 Gopal 45000 Technical manager
    1202 Manisha 45000 Proof reader
    1203 Masthanvali 40000 Technical writer
    1204 Kiran 40000 Hr Admin
    1205 Kranthi 30000 Op Admin
    hive>load data local inpath '/user/sample.txt' into table employee;

    创建分区
    create table psn3(
    id int,
    name string,
    likes ARRAY <string>,
    address MAP <string,string>
    )
    PARTITIONED BY(year int)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED by ','
    COLLECTION ITEMS TERMINATED by '-'
    MAP KEYS TERMINATED by ':'
    LOCATION '/user/psn3';


    alter table psn3 add partiton (year int);
    alter table psn3 drop partition (year int);

    load data local inpath '/user/sample.txt' into table employee partition (dept='IT');

    load data [local] 'filepath' into table tb_name;
    如果filepath是hdfs的目录,相当于将hdfs的文件移动到hive表目录
    如果是本地目录,是将文件上传到hdfs,再将这个hdfs文件拷贝到对应表文件目录下

    create table result_table (cnt int);

    常用语保存运算或统计结果
    from psn1
    insert into table result_table
    select count(*);

  • 相关阅读:
    Tarjan-割点&桥&双连通
    树状数组
    loli的搜索测试-5
    思维题
    生成树
    贪心
    loli的搜索测试-4
    树链剖分
    基于AHK的上课自动签到
    用列表实现栈
  • 原文地址:https://www.cnblogs.com/pashanhu/p/10950262.html
Copyright © 2011-2022 走看看