Advantages of using NullWritable in Hadoop

时光说笑 2021-01-30 11:18

What are the advantages of using NullWritable for null keys/values over using null texts (i.e. new Text(null))? I see the fol

3 Answers
  •  悲&欢浪女
    2021-01-30 11:59

    The key/value types must be specified at runtime, so anything that writes or reads NullWritables knows ahead of time that it will be dealing with that type; there is no marker or anything else in the file. Technically, the NullWritables are "read"; it's just that "reading" a NullWritable is a no-op. You can see for yourself that nothing at all is written or read:

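    // Requires: java.io.*, java.util.Arrays, org.apache.hadoop.io.NullWritable
    NullWritable nw = NullWritable.get(); // NullWritable is a singleton

    // Serialize: nothing is written to the stream
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    nw.write(new DataOutputStream(out));
    System.out.println(Arrays.toString(out.toByteArray())); // prints "[]"

    // Deserialize from an empty stream: nothing is consumed
    ByteArrayInputStream in = new ByteArrayInputStream(new byte[0]);
    nw.readFields(new DataInputStream(in)); // works just fine
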

    And as for your question about new Text(null), again, you can try it out:

    Text text = new Text((String)null);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    text.write(new DataOutputStream(out)); // throws NullPointerException
    System.out.println(Arrays.toString(out.toByteArray()));
    

    Text will not work at all with a null String.
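To make the "no marker in the file" point concrete without needing Hadoop on the classpath, here is a minimal sketch in plain Java. The names `NullValueRecordDemo`, `EmptyValue`, `writeRecords`, and `readKeys` are hypothetical stand-ins invented for illustration, and the length-prefixed key layout is an assumption, not Hadoop's actual on-disk format. It only shows that when the reader already knows the value type serializes to zero bytes, records can be read back correctly with nothing stored for the value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class NullValueRecordDemo {

    /** Hypothetical stand-in for a value type that serializes to zero bytes. */
    static final class EmptyValue {
        static final EmptyValue INSTANCE = new EmptyValue();

        private EmptyValue() {}

        void write(DataOutputStream out) { /* intentionally writes nothing */ }

        void readFields(DataInputStream in) { /* intentionally reads nothing */ }
    }

    /** Writes each key as length-prefixed UTF-8, followed by an (empty) value. */
    public static byte[] writeRecords(String... keys) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String key : keys) {
            byte[] utf8 = key.getBytes(StandardCharsets.UTF_8);
            out.writeInt(utf8.length);
            out.write(utf8);
            EmptyValue.INSTANCE.write(out); // adds no bytes to the stream
        }
        out.flush();
        return bytes.toByteArray();
    }

    /** Reads the records back; the reader knows the value type, so no marker is needed. */
    public static List<String> readKeys(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        List<String> keys = new ArrayList<>();
        while (in.available() > 0) {
            byte[] utf8 = new byte[in.readInt()];
            in.readFully(utf8);
            keys.add(new String(utf8, StandardCharsets.UTF_8));
            EmptyValue.INSTANCE.readFields(in); // consumes no bytes
        }
        return keys;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeRecords("a", "bc");
        System.out.println(data.length);    // (4 + 1) + (4 + 2) = 11 bytes, all for keys
        System.out.println(readKeys(data)); // [a, bc]
    }
}
```

Every byte in the output belongs to a key, so the empty values cost nothing in space, which is the practical advantage the question asks about.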
