  • Pig notes

    1. Rename the Pig job:

    Add this line at the very top of the Pig script:

    set job.name 'This is my job';

    2. Pass parameters to Pig:

    When running Pig, pass the parameter on the command line:

    pig -p JOBNAME="MyJob" test.pig
    ************test.pig**********
    set job.name '$JOBNAME';
    ......
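
    Several parameters can be passed the same way by repeating -p. The sketch below is only an illustration: the file name params.pig and the INPUT parameter are hypothetical and do not appear in the original script.

    ```pig
    -- params.pig (hypothetical file name)
    -- Run with:
    --   pig -p JOBNAME="MyJob" -p INPUT="NYSE_daily" params.pig
    -- To produce the script with parameters substituted, without executing it:
    --   pig -dryrun -p JOBNAME="MyJob" -p INPUT="NYSE_daily" params.pig

    -- $JOBNAME is replaced with MyJob before the script is parsed
    set job.name '$JOBNAME';

    -- $INPUT (hypothetical parameter) is replaced with NYSE_daily
    daily   = load '$INPUT' as (exchange, symbol, date, open, high, low, close,
                volume, adj_close);

    -- look at a handful of rows to confirm the load worked
    sample  = limit daily 10;
    dump sample;
    ```

    Keeping values such as the input path in parameters lets the same script run against different data without editing it.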

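    3. Define the field delimiter:

    Pig's default field delimiter is the tab character (\t). To use a different one, pass it to PigStorage, for example using PigStorage(','):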

    prices = load 'NYSE_daily' using PigStorage(',') as (exchange, symbol, date, open,high, low, close, volume, adj_close);
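
    PigStorage takes the delimiter in the same way when writing data. A minimal sketch, assuming the comma-separated input above; the script name delimiter.pig and the output path 'NYSE_slim' are made up for illustration:

    ```pig
    -- delimiter.pig (hypothetical file name)
    -- read comma-separated input
    prices = load 'NYSE_daily' using PigStorage(',') as (exchange, symbol, date, open,
                high, low, close, volume, adj_close);

    -- keep only a few of the columns
    slim   = foreach prices generate symbol, date, close;

    -- write the result with '|' between fields instead of the default tab
    -- ('NYSE_slim' is a hypothetical output directory)
    store slim into 'NYSE_slim' using PigStorage('|');
    ```

    A load or store without a using clause falls back to PigStorage with the tab delimiter.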

    4. Set the number of reducers:

    Parallel

    Sets the number of reduce tasks that Pig uses.

    --parallel.pig
    daily   = load 'NYSE_daily' as (exchange, symbol, date, open, high, low, close,
                volume, adj_close);
    bysymbl = group daily by symbol parallel 10;

    parallel applies only to the statement it is attached to. To give every statement in the script that needs a reduce phase 10 reducers, use set default_parallel 10:

    --defaultparallel.pig
    set default_parallel 10;
    daily   = load 'NYSE_daily' as (exchange, symbol, date, open, high, low, close,
                volume, adj_close);
    bysymbl = group daily by symbol;
    average = foreach bysymbl generate group, AVG(daily.close) as avg;
    sorted  = order average by avg desc;
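
    The two settings can be combined: a parallel clause on a single statement takes precedence over default_parallel for that statement. A sketch of this, reusing the script above; the file name mixedparallel.pig and the value 2 are chosen only for illustration:

    ```pig
    --mixedparallel.pig (hypothetical file name)
    set default_parallel 10;
    daily   = load 'NYSE_daily' as (exchange, symbol, date, open, high, low, close,
                volume, adj_close);

    -- no parallel clause: uses the script-wide default of 10 reducers
    bysymbl = group daily by symbol;
    average = foreach bysymbl generate group, AVG(daily.close) as avg;

    -- explicit parallel clause: this statement uses 2 reducers instead of 10
    sorted  = order average by avg desc parallel 2;
    ```

    Only operators that trigger a reduce phase (such as group, join, distinct and order) are affected by either setting.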

    For more, see:

    http://www.cnblogs.com/siwei1988/archive/2012/08/06/2624912.html

  • Original article: https://www.cnblogs.com/dorothychai/p/4606406.html