What is the reason for having Writable wrapper classes in Hadoop MapReduce for Java types?

挽巷 2021-01-15 12:57

It seems to me that an org.apache.hadoop.io.serializer.Serialization could be written to serialize the Java types directly, in the same format that the wrapper classes use. So why does Hadoop wrap the Java types at all?

1 Answer

  有刺的猬 • 2021-01-15 13:18

    There is nothing stopping you from changing the serialization to a different mechanism, such as Java's Serializable interface or a framework like Thrift or Protocol Buffers.

    In fact, Hadoop comes with an (experimental) Serialization implementation for Java Serializable objects - just configure the serialization factory to use it. The default serialization mechanism is WritableSerialization, but this can be changed through the following configuration property, which holds a comma-separated list of Serialization implementations:

    io.serializations=org.apache.hadoop.io.serializer.JavaSerialization
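
    If you prefer to set this from a driver rather than in core-site.xml, here is a minimal sketch. The property name and class names are real; keeping WritableSerialization in the list (an assumption about what you want, not required) preserves support for the built-in Writable types alongside Java serialization:

        import org.apache.hadoop.conf.Configuration;

        // Sketch: register Java serialization alongside the default
        // Writable serialization rather than replacing it outright.
        Configuration conf = new Configuration();
        conf.setStrings("io.serializations",
                "org.apache.hadoop.io.serializer.WritableSerialization",
                "org.apache.hadoop.io.serializer.JavaSerialization");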
    

    Bear in mind, however, that anything which expects a Writable (input/output formats, partitioners, comparators, etc.) will need to be replaced by a version that can be passed a Serializable instance rather than a Writable instance.
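
    For example, the shuffle sorts map output keys with a RawComparator, and the default one assumes WritableComparable keys. Hadoop ships org.apache.hadoop.io.serializer.JavaSerializationComparator, which compares Serializable keys by deserializing them first. A sketch of a driver fragment wired this way, continuing the conf above (the job name and key/value type choices are made up for illustration):

        import org.apache.hadoop.io.serializer.JavaSerializationComparator;
        import org.apache.hadoop.mapreduce.Job;

        // Sketch: intermediate key/value types are plain Serializable
        // Java classes rather than Writable wrappers.
        Job job = Job.getInstance(conf, "java-serialization-demo");
        job.setMapOutputKeyClass(String.class);  // java.lang.String, not Text
        job.setMapOutputValueClass(Long.class);  // java.lang.Long, not LongWritable
        // The default sort comparator expects WritableComparable keys, so
        // swap in one that understands Serializable keys:
        job.setSortComparatorClass(JavaSerializationComparator.class);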

    Some more links for the curious reader:

    • http://www.tom-e-white.com/2008/07/rpc-and-serialization-with-hadoop.html
    • What are the connections and differences between Hadoop Writable and java.io.serialization? - a similar question to this one, where Tariq links to a thread in which Doug Cutting explains the rationale behind using Writables over Serializables (a sketch of the Writable contract follows after this list)
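
    For reference, the contract the wrapper classes implement is deliberately small: two methods that write and read a compact binary encoding and repopulate an existing instance, which is what lets Hadoop reuse a single object across many records instead of allocating a new one per record the way standard Java deserialization does. A minimal sketch of an int wrapper in the style of the built-in IntWritable (illustrative, not the real class):

        import java.io.DataInput;
        import java.io.DataOutput;
        import java.io.IOException;
        import org.apache.hadoop.io.Writable;

        // Illustrative wrapper: the Writable contract is just
        // write() and readFields().
        public class BoxedInt implements Writable {
            private int value;

            public void set(int value) { this.value = value; }
            public int get() { return value; }

            @Override
            public void write(DataOutput out) throws IOException {
                out.writeInt(value);   // compact, fixed 4-byte encoding
            }

            @Override
            public void readFields(DataInput in) throws IOException {
                value = in.readInt();  // repopulates this instance in place,
                                       // allowing Hadoop to reuse the object
            }
        }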
