Hadoop reduce side join using Datajoin

伪装坚强ぢ 2020-12-21 06:21

I am using the following code to do the reduce-side join:

/*
 * HadoopMapper.java
 *
 * Created on Apr 8, 2012, 5:39:51 PM
 */


import java.io.DataInput;
i         


        
2 Answers
  • 2020-12-21 06:32

    Since your code only works with Text, adding a default constructor to TaggedWritable that chains to the existing one-argument constructor should be enough:

    public TaggedWritable() {
        this(new Text(""));
    }
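
Why the no-argument constructor matters can be illustrated without Hadoop. The sketch below uses only the JDK; `WithoutDefault`, `WithDefault`, and `canInstantiate` are hypothetical names invented for this example and are not part of the asker's code or of Hadoop. It shows that creating an object reflectively with no arguments fails unless the class declares a no-arg constructor, and that a chained one is sufficient.

```java
import java.lang.reflect.Constructor;

public class NoArgConstructorDemo {

    // Hypothetical stand-in for a Writable that only has a one-argument constructor.
    static class WithoutDefault {
        final String data;
        WithoutDefault(String data) { this.data = data; }
    }

    // Same shape, plus a no-arg constructor that chains to the one-argument one.
    static class WithDefault {
        final String data;
        WithDefault() { this(""); }
        WithDefault(String data) { this.data = data; }
    }

    // Returns true if the class can be created reflectively with no arguments.
    static boolean canInstantiate(Class<?> clazz) {
        try {
            Constructor<?> ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException lands here when no no-arg constructor exists.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(WithoutDefault.class));
        System.out.println(canInstantiate(WithDefault.class));
    }
}
```

In a real job the framework performs the reflective step itself, so the failure would surface at runtime in the reduce phase rather than at compile time.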
    
  • 2020-12-21 06:36

    You need a default (no-argument) constructor for TaggedWritable: Hadoop creates this object via reflection, which requires one.

    You also have a problem in your readFields method: you call data.readFields(in) through the Writable interface, but at that point the object has no knowledge of the actual runtime class of data, so data may still be null or of the wrong type.

    I suggest you either write out the data class name before outputting the data object itself, or look into the GenericWritable class (you'll need to extend it to define the set of allowable writable classes that can be used).

    So you could amend as follows:

    public static class TaggedWritable extends TaggedMapOutput {
        private Writable data;
    
        public TaggedWritable() {
            this.tag = new Text();
        }
    
        public TaggedWritable(Writable data) {
            this.tag = new Text("");
            this.data = data;
        }
    
        public Writable getData() {
            return data;
        }
    
        public void setData(Writable data) {
            this.data = data;
        }
    
        public void write(DataOutput out) throws IOException {
            this.tag.write(out);
            out.writeUTF(this.data.getClass().getName());
            this.data.write(out);
        }
    
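        public void readFields(DataInput in) throws IOException {
            this.tag.readFields(in);
            String dataClz = in.readUTF();
            // Re-create the payload only when its runtime class differs from the serialized one.
            if (this.data == null
                    || !this.data.getClass().getName().equals(dataClz)) {
                try {
                    this.data = (Writable) ReflectionUtils.newInstance(
                            Class.forName(dataClz), null);
                } catch (ClassNotFoundException e) {
                    // Class.forName throws a checked exception that readFields cannot declare.
                    throw new IOException("Unknown data class: " + dataClz, e);
                }
            }
            this.data.readFields(in);
        }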
    }
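
The class-name-prefix idea used in write and readFields above can be tried out in isolation. The following is a minimal sketch using only the JDK: `MiniWritable` and `StringPayload` are hypothetical stand-ins for Hadoop's Writable and Text, and `serialize`/`deserialize` are helper names invented for this example. It writes the concrete class name ahead of the payload, then uses that name on the reading side to create an empty instance before filling it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class ClassNamePrefixDemo {

    // Hypothetical stand-in for org.apache.hadoop.io.Writable.
    interface MiniWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical payload type, playing the role of Text.
    public static class StringPayload implements MiniWritable {
        String value = "";
        public StringPayload() { }
        public StringPayload(String value) { this.value = value; }
        public void write(DataOutput out) throws IOException { out.writeUTF(value); }
        public void readFields(DataInput in) throws IOException { value = in.readUTF(); }
    }

    // Writes the concrete class name first, then the payload itself.
    static byte[] serialize(MiniWritable data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(data.getClass().getName());
        data.write(out);
        out.flush();
        return bytes.toByteArray();
    }

    // Reads the class name, creates an empty instance through its no-arg
    // constructor, then lets that instance read its own fields.
    static MiniWritable deserialize(byte[] raw) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        String className = in.readUTF();
        try {
            MiniWritable data = (MiniWritable) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
            data.readFields(in);
            return data;
        } catch (ReflectiveOperationException e) {
            throw new IOException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) throws IOException {
        MiniWritable copy = deserialize(serialize(new StringPayload("row-1")));
        System.out.println(copy.getClass().getSimpleName() + ": " + ((StringPayload) copy).value);
    }
}
```

The trade-off of this approach is that every record carries the full class name; where the set of payload types is known in advance, a GenericWritable subclass, as suggested above, is the alternative.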
    