  • Hive 11: Embedding Python in Hive

    Embedding Python in Hive

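    Both the input and the output of the Python script must use the tab character (\t) as the field separator; otherwise errors occur. The script reads rows from standard input and prints rows in the required format.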

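    To use it, first register the script with add file, then call it with the syntax TRANSFORM (name, items) USING 'python test.py' AS (name string, item1 string, item2 string, item3 string). The fields listed in the AS clause correspond to the columns that the Python script prints.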
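
    The tab-separated contract described above can be sketched as a small generic filter. This is only an illustration and not part of the original example: `run_transform` and `upper_name` are hypothetical names, and in-memory streams stand in for the standard input and output that Hive would connect to the script.

```python
import io


def run_transform(in_stream, out_stream, row_fn):
    """Read tab-separated rows, apply row_fn, write tab-separated rows."""
    for line in in_stream:
        # Remove only the newline; tabs are the field separators.
        fields = line.rstrip("\n").split("\t")
        out_stream.write("\t".join(row_fn(fields)) + "\n")


def upper_name(fields):
    # Hypothetical row function: upper-case the first column, keep the rest.
    return [fields[0].upper()] + fields[1:]


# Simulate two rows as they might arrive from TRANSFORM (name, items).
fake_stdin = io.StringIO(
    "tom\tshu fa,wei qi,chang ge\n"
    "jack\tgame,kan shu,shang wang\n"
)
fake_stdout = io.StringIO()
run_transform(fake_stdin, fake_stdout, upper_name)
demo_output = fake_stdout.getvalue()
print(demo_output, end="")
```

    Inside a real script, the same function would presumably be called with `sys.stdin` and `sys.stdout` instead of the in-memory streams.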

    Below is a small example that splits one column into multiple columns:

    create table test (name string, items string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t';
    

      

    LOAD DATA local INPATH '/opt/data/tt.txt' OVERWRITE INTO TABLE test ;

    Contents of tt.txt (fields are separated by a tab):

    tom	shu fa,wei qi,chang ge
    jack	game,kan shu,shang wang
    lusi	lv you,guang jie,gou wu
    

    Table 2:

    create table test2 (name string, item1 string, item2 string, item3 string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t';
    

      

    -- Add the Python script to the Hive session
    hive> add file /root/test.py;
    

      

    -- Write the results into test2
    INSERT OVERWRITE TABLE test2  
    
    SELECT  TRANSFORM (name, items)  
    USING 'python test.py'  
    AS (name string, item1 string,item2 string,item3 string)  
    FROM test;
    

      

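    #!/usr/bin/python
    # test.py: split the comma-separated items column into three columns.

    import sys

    for line in sys.stdin:
        # Drop only the trailing newline so the tab separator stays intact.
        line = line.rstrip('\n')
        name, it = line.split('\t', 1)
        # Pad with the literal string 'NULL' so every row has at least 3 items.
        count = it.count(',') + 1
        for i in range(0, 3 - count):
            it = it + ',NULL'
        result = it.split(',')[0:3]
        # Single-argument print() works under both Python 2 and Python 3.
        print('%s\t%s' % (name, '\t'.join(result)))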
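
    Note that the script pads short rows with the literal string NULL, which Hive stores as ordinary text. According to the Hive documentation for TRANSFORM, a cell of script output containing only \N is read back as a real NULL by default. A variant of the row logic that relies on this could look like the sketch below. The names `HIVE_NULL` and `split_items` are hypothetical, the row for "amy" is made-up test data that is not in tt.txt, and each input line is assumed to hold exactly a name and an items field.

```python
HIVE_NULL = "\\N"  # two characters: a backslash followed by N


def split_items(line, width=3):
    """Turn 'name<TAB>a,b,c' into 'name<TAB>a<TAB>b<TAB>c', padding with \\N."""
    name, items = line.rstrip("\n").split("\t", 1)
    parts = items.split(",") if items else []
    # Truncate to `width` items, then pad short rows up to `width`.
    parts = parts[:width] + [HIVE_NULL] * (width - len(parts))
    return "\t".join([name] + parts)


full_row = split_items("tom\tshu fa,wei qi,chang ge\n")
short_row = split_items("amy\tchess\n")  # hypothetical row with one item
print(full_row)
print(short_row)
```

    Whether real NULLs or a placeholder string is preferable depends on how test2 is queried later; the original script's choice of a placeholder keeps the output easy to read.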
    

      

    Results:
    -- Table 1
    hive> select * from test;
    OK
    tom    shu fa,wei qi,chang ge
    jack    game,kan shu,shang wang
    lusi    lv you,guang jie,gou wu
    Time taken: 0.07 seconds, Fetched: 3 row(s)
    
    
     hive> desc test2;
     OK
     name                	string              	                    
     item1               	string              	                    
     item2               	string              	                    
     item3               	string              	                    
     Time taken: 0.141 seconds, Fetched: 4 row(s)
    -- Table 2
    hive> select * from test2;
    OK
    tom    shu fa    wei qi    chang ge
    jack    game    kan shu    shang wang
    lusi    lv you    guang jie    gou wu
    Time taken: 1.368 seconds, Fetched: 3 row(s)
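
    The rows in test2 can also be checked without a Hive session by feeding the contents of tt.txt through the same row logic locally. The sketch below is an assumption-laden stand-in for running test.py under Hive: `transform` and `SAMPLE_INPUT` are hypothetical names, and in-memory streams replace the pipes Hive would use.

```python
import io

# The three lines of tt.txt, with a tab between name and items.
SAMPLE_INPUT = (
    "tom\tshu fa,wei qi,chang ge\n"
    "jack\tgame,kan shu,shang wang\n"
    "lusi\tlv you,guang jie,gou wu\n"
)


def transform(in_stream, out_stream):
    """Same row logic as test.py, with the streams passed in explicitly."""
    for line in in_stream:
        name, items = line.rstrip("\n").split("\t", 1)
        # Pad with the literal string 'NULL', then keep the first 3 items.
        parts = (items.split(",") + ["NULL"] * 3)[:3]
        out_stream.write("\t".join([name] + parts) + "\n")


out = io.StringIO()
transform(io.StringIO(SAMPLE_INPUT), out)
rows = out.getvalue().splitlines()
for row in rows:
    # Show tabs as ' | ' so the column boundaries are visible.
    print(row.replace("\t", " | "))
```

    Each element of `rows` should then hold four tab-separated fields, matching the four columns of test2.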
    

      

  • Original article: https://www.cnblogs.com/tesla-turing/p/11509344.html