What is the difference between GROUP and COGROUP in PIG?

Yes, GROUP is supposed to work like that.

According to the documentation (http://pig.apache.org/docs/r0.12.0/basic.html#group):

Note: The GROUP and COGROUP operators are identical. Both operators work with one or more relations. For readability GROUP is used in statements involving one relation and COGROUP is used in statements involving two or more relations. You can COGROUP up to but no more than 127 relations at a time.

So the choice between the two is purely a readability convention; the operators themselves behave identically.
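
As a minimal sketch of that convention, the script below uses GROUP for one relation and COGROUP for two. The file names `b.csv` and `c.csv`, the comma delimiter, and the aliases `c_by_drug` and `cb_by_drug` are assumptions made for illustration; the schema is copied from the DESCRIBE output in the question, where `$2` corresponds to the `drug` field.

```pig
-- Hypothetical input files and delimiter; the schema mirrors the question's DESCRIBE output.
B = LOAD 'b.csv' USING PigStorage(',')
    AS (pid:int, pname:chararray, drug:chararray, gender:chararray, tot_amt:int);
C = LOAD 'c.csv' USING PigStorage(',')
    AS (pid:int, pname:chararray, drug:chararray, gender:chararray, tot_amt:int);

-- One relation: GROUP is the conventional spelling.
c_by_drug = GROUP C BY drug;

-- Two relations: COGROUP is the conventional spelling.
-- Writing GROUP here instead is also accepted, as the question's session shows.
cb_by_drug = COGROUP C BY drug, B BY drug;

DESCRIBE c_by_drug;
DESCRIBE cb_by_drug;
```

The first alias should describe as a `group` key plus one bag of `C` tuples, and the second as a `group` key plus one bag per input relation, matching the schema printed in the question.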

Author: proutray

Updated on June 25, 2022

Comments

  • proutray, almost 2 years ago

    My understanding was that GROUP does not work with multiple relations, which is why Pig has COGROUP. However, when I checked today, GROUP worked for me with two relations. I am using Pig 0.12.0. My commands and their output are as follows.

    grunt> grpvar = GROUP C by $2, B by $2;
    grunt> cogrpvar = COGROUP C by $2, B by $2;
    grunt> describe grpvar;
    
    grpvar: {group: chararray,C: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)},B: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)}}
    
    grunt> describe cogrpvar;
    
    cogrpvar: {group: chararray,C: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)},B: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)}}
    

    Is GROUP expected to work like this? What is the difference between GROUP and COGROUP?