Question
I am upgrading an application from .NET Core 2.2 to .NET Core 3.0, and the new System.Text.Json serializer does not behave the way Newtonsoft.Json did in 2.2. For characters such as the non-breaking space (\u00A0) or emoji, Newtonsoft.Json (and even Utf8Json) serializes them as their actual characters, not as Unicode escape sequences.
I've created a simple .NET Fiddle to show this.
// The serialized type (defined elsewhere in the fiddle):
public class Foo { public string Bar { get; set; } }

var input = new Foo { Bar = "\u00A0 Test !@#$%^&*() 💯\uD83D\uDCAF 你好" };
var newtonsoft = Newtonsoft.Json.JsonConvert.SerializeObject(input);
var system = System.Text.Json.JsonSerializer.Serialize(input, new System.Text.Json.JsonSerializerOptions
{
Encoder = System.Text.Encodings.Web.JavaScriptEncoder.UnsafeRelaxedJsonEscaping,
});
var utf8Json = Utf8Json.JsonSerializer.ToJsonString(input);
Console.WriteLine($"Original: {input.Bar} - {input.Bar.Contains('\u00A0')}"); // Original
Console.WriteLine($"Newtonsoft: {newtonsoft} - {newtonsoft.Contains('\u00A0')}"); // Works
Console.WriteLine($"System.Text.Json: {system} - {system.Contains('\u00A0')}"); // Does not work
Console.WriteLine($"Utf8Json: {utf8Json} - {utf8Json.Contains('\u00A0')}"); // Works
https://dotnetfiddle.net/erCaZl
Is there an Encoder or a JsonSerializerOptions property to serialize like Newtonsoft did?
Answer 1:
This is by design. Our goal is to ship secure defaults, which is why we escape anything we don't know for a fact to be safe. For practical reasons, we can't detect every safe character, because that would mean shipping large tables and performing potentially non-trivial lookups.
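For concreteness, here is what those defaults do when no Encoder is configured at all (a minimal sketch reusing the Foo type from the question; the commented output is what the default encoder should produce):

var defaultJson = System.Text.Json.JsonSerializer.Serialize(new Foo { Bar = "<Test> & \u00A0" });
Console.WriteLine(defaultJson); // {"Bar":"\u003CTest\u003E \u0026 \u00A0"}

The HTML-sensitive angle brackets and ampersand are escaped along with the non-ASCII non-breaking space; that is the defense-in-depth the defaults aim for.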
If you really insist, you can extend the JavaScriptEncoder class and choose the encoded characters yourself. I would advise against this, because if you're not careful, people can sneak in payloads that might change the semantics of the JSON.
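If you do go down that road, a documented alternative to writing a full JavaScriptEncoder subclass is JavaScriptEncoder.Create with an explicit set of UnicodeRanges. A minimal sketch, again assuming the Foo type from the question (note that the encoder can still escape characters it considers unsafe regardless of the ranges you allow, as the question's \u00A0 result under UnsafeRelaxedJsonEscaping suggests, and characters outside the Basic Multilingual Plane, such as emoji, may remain escaped no matter what):

using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

var options = new JsonSerializerOptions
{
    // Characters inside these ranges are candidates to pass through unescaped;
    // everything outside them is always escaped.
    Encoder = JavaScriptEncoder.Create(
        UnicodeRanges.BasicLatin,
        UnicodeRanges.Latin1Supplement,      // intended to cover \u00A0
        UnicodeRanges.CjkUnifiedIdeographs), // covers 你好
};

Console.WriteLine(JsonSerializer.Serialize(new Foo { Bar = "\u00A0 你好" }, options));

Widening the allowed set trades away the defense-in-depth described above, so it is only advisable when the JSON will never be emitted into HTML or similar contexts.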
Source: https://stackoverflow.com/questions/58738258/issues-with-system-text-json-serializing-unicode-characters-like-emojis