  • Hive 11: Embedding Python in Hive

    Embedding Python in Hive

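    Both the input to the Python script and its output use the tab character (\t) as the field delimiter; any other delimiter causes errors. The script must print its results to standard output in that format.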
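The tab-delimited contract can be sketched as two small helpers. This is a minimal illustration rather than part of the original script: `parse_row`, `format_row`, and `transform_stream` are hypothetical names, and the `\N` handling assumes Hive's default TRANSFORM row format, in which a NULL column is passed as the literal string `\N`.

```python
NULL_MARKER = "\\N"  # assumption: Hive's default text encoding of NULL in TRANSFORM


def parse_row(line):
    """Split one tab-delimited input line into a list of column values."""
    fields = line.rstrip("\n").split("\t")
    return [None if f == NULL_MARKER else f for f in fields]


def format_row(values):
    """Join column values into one tab-delimited output line (no trailing newline)."""
    return "\t".join(NULL_MARKER if v is None else str(v) for v in values)


def transform_stream(lines):
    """Echo every row unchanged; a real script would modify the columns here."""
    for line in lines:
        yield format_row(parse_row(line))
```

In an actual script this would be driven by something like `for out in transform_stream(sys.stdin): print(out)`, so that every row Hive sends yields exactly one tab-delimited line in return.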

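    Usage: first register the script with add file, then call it with TRANSFORM (name, items) USING 'python test.py' AS (name string, item1 string, item2 string, item3 string). The columns listed after AS correspond, in order, to the tab-separated fields that the Python script prints.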

    Below is a small example that splits one column into multiple columns:

    CREATE TABLE test (name string, items string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t';

    LOAD DATA LOCAL INPATH '/opt/data/tt.txt' OVERWRITE INTO TABLE test;

    Contents of tt.txt (name and items are separated by a tab):

    tom	shu fa,wei qi,chang ge
    jack	game,kan shu,shang wang
    lusi	lv you,guang jie,gou wu

    Table 2:

    CREATE TABLE test2 (name string, item1 string, item2 string, item3 string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t';
    -- Register the Python script with Hive
    hive> add file /root/test.py;
    -- Write the transformed rows into test2
    INSERT OVERWRITE TABLE test2
    SELECT TRANSFORM (name, items)
    USING 'python test.py'
    AS (name string, item1 string, item2 string, item3 string)
    FROM test;
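
    test.py:

    #!/usr/bin/python
    import sys

    # Hive streams each row to stdin as one tab-delimited line.
    for line in sys.stdin:
        line = line.strip()
        name, it = line.split('\t')
        # Pad with the string 'NULL' when a row has fewer than 3 items.
        count = it.count(',') + 1
        for i in range(0, 3 - count):
            it = it + ',NULL'
        # Keep at most the first 3 items.
        result = it.split(',')[0:3]
        # One tab-delimited line per row; print() with a single argument runs on Python 2 and 3.
        print('%s\t%s' % (name, '\t'.join(result)))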
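
Before registering the script with `add file`, the splitting logic can be checked locally without Hive. The sketch below repackages the same logic as a function; `split_items` and its `width` and `filler` parameters are hypothetical names introduced for illustration, and the `amy` row is made up to exercise the padding. Note that the script pads with the string `NULL`, which Hive stores as ordinary four-character text. Assuming Hive's default TRANSFORM row format, passing `filler="\\N"` should yield real NULL values instead.

```python
def split_items(line, width=3, filler="NULL"):
    """Turn 'name<TAB>a,b,c' into 'name<TAB>a<TAB>b<TAB>c', padded or truncated to width."""
    name, items = line.rstrip("\n").split("\t", 1)
    parts = items.split(",")[:width]          # keep at most `width` items
    parts += [filler] * (width - len(parts))  # pad short rows
    return "\t".join([name] + parts)


# Sample rows from tt.txt, plus one short row to exercise the padding.
sample = [
    "tom\tshu fa,wei qi,chang ge\n",
    "jack\tgame,kan shu,shang wang\n",
    "lusi\tlv you,guang jie,gou wu\n",
    "amy\tdu shu\n",  # hypothetical row, not in the original data
]

for row in sample:
    print(split_items(row))
```

The same check can be done against the real script by piping the data file into it, for example `python test.py < /opt/data/tt.txt`.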

    Results:

    -- Table 1
    hive> select * from test;
    OK
    tom    shu fa,wei qi,chang ge
    jack    game,kan shu,shang wang
    lusi    lv you,guang jie,gou wu
    Time taken: 0.07 seconds, Fetched: 3 row(s)

     hive> desc test2;
     OK
     name string
     item1 string
     item2 string
     item3 string
     Time taken: 0.141 seconds, Fetched: 4 row(s)

    -- Table 2
    hive> select * from test2;
    OK
    tom    shu fa    wei qi    chang ge
    jack    game    kan shu    shang wang
    lusi    lv you    guang jie    gou wu
    Time taken: 1.368 seconds, Fetched: 3 row(s)
  • Original article: https://www.cnblogs.com/raphael5200/p/5221927.html