Why does DataOutputStream.writeUTF() add 2 extra bytes at the beginning?

说谎 2020-12-05 07:52

While trying to parse XML with SAX over sockets, I came across a strange occurrence. On closer analysis, I noticed that DataOutputStream adds 2 bytes in front of my data.

2 Answers
  • 2020-12-05 08:35

    The output of DataOutputStream.writeUTF() is a custom format, intended to be read by DataInputStream.readUTF().

    The javadocs of the writeUTF method you are calling say:

    Writes a string to the underlying output stream using modified UTF-8 encoding in a machine-independent manner.

    First, two bytes are written to the output stream as if by the writeShort method giving the number of bytes to follow. This value is the number of bytes actually written out, not the length of the string. Following the length, each character of the string is output, in sequence, using the modified UTF-8 encoding for the character. If no exception is thrown, the counter written is incremented by the total number of bytes written to the output stream. This will be at least two plus the length of str, and at most two plus thrice the length of str.
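The length prefix described in the javadoc can be observed directly. The following is a minimal sketch, assuming an in-memory ByteArrayOutputStream in place of the socket; the class and method names (WriteUtfPrefixDemo, encodeWithWriteUtf, lengthPrefix, decodeWithReadUtf) are made up for illustration and are not part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WriteUtfPrefixDemo {

    /** Returns the exact bytes that writeUTF() emits for the given string. */
    public static byte[] encodeWithWriteUtf(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Interprets the first two bytes as an unsigned big-endian 16-bit length. */
    public static int lengthPrefix(byte[] encoded) {
        return ((encoded[0] & 0xFF) << 8) | (encoded[1] & 0xFF);
    }

    /** Reads the string back with the matching readUTF() call. */
    public static String decodeWithReadUtf(byte[] encoded) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] encoded = encodeWithWriteUtf("<a/>");
        System.out.println("total bytes:   " + encoded.length);            // 6
        System.out.println("length prefix: " + lengthPrefix(encoded));     // 4
        System.out.println("round trip:    " + decodeWithReadUtf(encoded)); // <a/>
    }
}
```

The four ASCII characters take four bytes, and the two bytes in front of them hold that count. A reader that does not call readUTF() sees those two bytes as part of the payload.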

  • 2020-12-05 08:41

    Always use matching stream types for reading and writing data. If you are feeding the stream directly into a SAX parser, you should not use a DataOutputStream.

    Just use

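    // "os" is the String that holds the XML document
    BufferedOutputStream bos = new BufferedOutputStream(socket.getOutputStream());
    bos.write(os.getBytes("UTF-8"));
    bos.flush(); // push buffered bytes out to the socket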
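To see that plain UTF-8 bytes are something a SAX parser can consume directly, here is a rough sketch. It assumes an in-memory buffer standing in for the socket and uses the JDK's default SAXParserFactory; the class and method names (RawBytesSaxDemo, writeRaw, rootElementName) are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class RawBytesSaxDemo {

    /** Writes the XML text as plain UTF-8 bytes, with no length prefix. */
    public static byte[] writeRaw(String xml) {
        // The ByteArrayOutputStream stands in for socket.getOutputStream().
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedOutputStream bos = new BufferedOutputStream(sink)) {
            bos.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Parses the bytes with SAX and returns the name of the root element. */
    public static String rootElementName(byte[] xmlBytes) {
        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (root[0] == null) {
                    root[0] = qName; // remember only the first element seen
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xmlBytes), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return root[0];
    }

    public static void main(String[] args) {
        byte[] wire = writeRaw("<greeting>hi</greeting>");
        System.out.println(rootElementName(wire));
    }
}
```

Because the first byte on the wire is the `<` of the document itself, the parser starts reading at the right place. With writeUTF() the parser would instead meet the two length bytes first, which is the likely cause of the problem in the question.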
    