How to convert fields to rows in Pig?

Submitted by 只愿长相守 on 2019-12-04 10:45:10

I think alexeipab's answer is the right direction. Here is a simple example:

> A = load 'input.txt';
> dump A;
(0,1,2,3,4,5,6,7,8,9)
> B = foreach A generate FLATTEN(TOBAG(*));
> dump B;
(0)
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
mfunaro

I ran into a very similar issue using Pig. What I ended up doing was writing a UDF that simply iterates through the tuple. For each field in the tuple, it creates a new tuple holding that field's value and adds it to a DataBag. Here is a sample...

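// Needs java.io.IOException and, from org.apache.pig.data:
// BagFactory, DataBag, Tuple, TupleFactory.
public DataBag exec(Tuple tuple) throws IOException {
    DataBag db = BagFactory.getInstance().newDefaultBag();
    TupleFactory tf = TupleFactory.getInstance();
    // Wrap each field of the input tuple in its own single-field tuple.
    for (int i = 0; i < tuple.size(); ++i) {
        db.add(tf.newTuple(tuple.get(i)));
    }
    return db;
}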

Obviously, this sample does not include any error checking, but it should give you an idea of how to do this.

In your script you could 'FLATTEN' the results and put the single values back into individual tuples if need be.
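As a rough sketch of what that script might look like, assuming the exec method above lives in an EvalFunc<DataBag> subclass: the jar name myudfs.jar and the class name com.example.TupleToBag are hypothetical placeholders, not names from the original answer.

```pig
-- Hypothetical jar and class names; substitute your own.
REGISTER myudfs.jar;
DEFINE TupleToBag com.example.TupleToBag();

A = LOAD 'input.txt';
-- Pass every field of the row to the UDF, then flatten the returned bag
-- so that each single-field tuple becomes its own row.
B = FOREACH A GENERATE FLATTEN(TupleToBag(*));
DUMP B;
```

This mirrors the built-in FLATTEN(TOBAG(*)) approach; a custom UDF is mainly worthwhile if you need extra per-field handling.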

alexeipab

It looks like you want to pivot the row. There are a couple of solutions; see "Pivot table with Apache Pig" or "Splitting a tuple into multiple tuples in Pig".

Use the DataFu UDF TransposeTupleToBag (http://datafu.incubator.apache.org/docs/datafu/1.1.0/datafu/pig/util/TransposeTupleToBag.html) to get a bag containing the tuple's fields transposed. Flatten the bag to get rows of (key:chararray, value:chararray) tuples, then select the 'value' part from the flattened output.
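A minimal sketch of those three steps, under some assumptions: the jar path datafu.jar, the field names f0 to f2, and the three-column chararray schema are illustrative only. The UDF takes its keys from the field names, so the input is loaded with a declared schema here.

```pig
-- Assumed jar path; point this at your DataFu jar.
REGISTER datafu.jar;
DEFINE Transpose datafu.pig.util.TransposeTupleToBag();

-- Illustrative schema; the field names become the keys.
A = LOAD 'input.txt' AS (f0:chararray, f1:chararray, f2:chararray);
-- Transpose the fields into a bag of (key, value) and flatten it into rows.
B = FOREACH A GENERATE FLATTEN(Transpose(f0 .. f2)) AS (key, value);
-- Keep only the value part.
C = FOREACH B GENERATE value;
DUMP C;
```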
