Java: detect control characters which are not correct for JSON

花落未央 2021-02-07 11:11

I am reinventing the wheel and creating my own JSON parse methods in Java.

I am going by the (very nice!) documentation on json.org. The only part I am unsure about is w

4 answers
  •  野性不改
    2021-02-07 11:45

    I know the question was asked a couple of years ago, but I am replying anyway because the accepted answer is not correct.

    Character.isISOControl(int codePoint) 
    

    does the following check:

    (codePoint >= 0x00 && codePoint <= 0x1F) || (codePoint >= 0x7F && codePoint <= 0x9F);
    

    The JSON specification, RFC 7159 (https://tools.ietf.org/html/rfc7159), states in Section 7:

    7. Strings

      The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

    Character.isISOControl(int codePoint) 
    

    will flag every character that must be escaped as a control character (U+0000 through U+001F), but it will also flag U+007F through U+009F, which JSON allows unescaped inside a string. A parser that rejects everything isISOControl flags would therefore reject valid JSON.
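
To make the difference concrete, here is a minimal sketch of a check that follows the RFC wording quoted above rather than Character.isISOControl. The class name JsonEscapeCheck and the method mustBeEscaped are hypothetical names chosen for illustration; they are not part of the JDK or of any JSON library.

```java
// Hypothetical helper class, for illustration only.
final class JsonEscapeCheck {

    // Returns true if the code point may not appear unescaped inside a
    // JSON string according to RFC 7159, Section 7: the control characters
    // U+0000 through U+001F, the quotation mark, and the reverse solidus.
    static boolean mustBeEscaped(int codePoint) {
        return (codePoint >= 0x00 && codePoint <= 0x1F)
                || codePoint == '"'
                || codePoint == '\\';
    }

    public static void main(String[] args) {
        // U+000A (line feed): both checks agree that it is a control character.
        System.out.println(mustBeEscaped(0x0A));            // true
        System.out.println(Character.isISOControl(0x0A));   // true

        // U+007F (delete): flagged by isISOControl, but legal unescaped in JSON.
        System.out.println(mustBeEscaped(0x7F));            // false
        System.out.println(Character.isISOControl(0x7F));   // true
    }
}
```

A parser that reads the raw string body would still have to handle the backslash separately as the start of an escape sequence; this sketch only answers which code points cannot appear literally.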
