A = load '/user/cloudera/lab/mydata' using PigStorage() as (a,b,c);
Note: writing it as A=load (no spaces around the equals sign) fails with: [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " "A=load "" at line 1, column 1.
dump A;
(1,2,3)
(4,2,1)
(8,3,4)
(4,3,3)
(7,2,5)
(8,4,3)
B = group A by a;
dump B;
(1,{(1,2,3)})
(4,{(4,3,3),(4,2,1)})
(7,{(7,2,5)})
(8,{(8,4,3),(8,3,4)})
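Each tuple of B has two fields: `group`, which holds the key, and a bag named after the grouped relation (here `A`), which holds the original tuples. That is why the bag can be referenced as `A.b` inside a foreach over B. A minimal sketch for inspecting this in grunt, assuming A and B are defined as above; the alias `B_keys` is only an illustrative name:

```pig
-- Print the schema of B: a field called group plus a bag called A.
describe B;

-- Keep the key and, for each group, only the b values from its bag.
B_keys = foreach B generate group, A.b;
dump B_keys;
```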
C = foreach B { D = distinct A.b; generate flatten(group), COUNT(D); };
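Note: when I split the previous statement across several lines, as below, I got Unexpected character '. It turned out that the parenthesis before group had been typed as a full-width Chinese （ instead of an ASCII (.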
C = foreach B {
>> D = distinct A.b;
>> generate flatten(group), COUNT(D);
>> };
dump C;
(1,1)
(4,2)
(7,1)
(8,2)
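The nested foreach counts the distinct values of b for each value of a. As a cross-check, the same counts can probably be obtained without a nested block by deduplicating the (a,b) pairs first and grouping afterwards. This is only a sketch under the same assumptions as above (same input path, default tab delimiter); the aliases `P`, `U`, `G` and `C2` are illustrative names, not part of the original session:

```pig
A = load '/user/cloudera/lab/mydata' using PigStorage() as (a,b,c);

-- Keep only the two columns that matter for the count.
P = foreach A generate a, b;

-- Remove duplicate (a,b) pairs across the whole relation.
U = distinct P;

-- Group the deduplicated pairs by a and count the pairs in each group.
G = group U by a;
C2 = foreach G generate group, COUNT(U);
dump C2;
```

For the sample data above this should give the same four rows as C. The nested form keeps everything in a single group operation, while this form adds a separate distinct step before the grouping.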