What is the reason for having Writable wrapper classes in Hadoop MapReduce for Java types?

后端未结

关注

 1  765

挽巷 2021-01-15 12:57

It seems to me that a org.apache.hadoop.io.serializer.Serialization could be written to serialize the java types directly in the same format that the wrapper cl

1条回答

有刺的猬 (楼主)

2021-01-15 13:18
There is nothing stopping you changing the serialization to use a different mechanism such as java Serializable interface or something like thrift, protocol buffers etc.

In fact, Hadoop comes with an (experimental) Serialization implementation for Java Serializable objects - just configure the serialization factory to use it. The default serialization mechanism is WritableSerialization, but this can be changed by setting the following configuration property:
```
io.serializations=org.apache.hadoop.io.serializer.JavaSerialization
```
Bear in mind however that anything that expects a Writable (Input/Output formats, partitioners, comparators) etc will need to be replaced by versions that can be passed a Serializable instance rather than a Writable instance.

Some more links for the curious reader:
- http://www.tom-e-white.com/2008/07/rpc-and-serialization-with-hadoop.html
- What are the connections and differences between Hadoop Writable and java.io.serialization? - Which seems to be a similar question to what you're asking, and Tariq has a good link to a thread in which Doug Cutting explains the rationale behind using Writables over Serializables
0 讨论(0)
发布评论:

提交评论
- 加载中...