问题
I have a file containing some properties which value of some of them contains escape characters, for example some Urls and Regex patterns.
When reading the content and converting back to the json, with or without unescaping, the content is not correct. If I convert back to json with unescaping, some regular expression break, if I convert with unescaping, urls and some regular expressions will break.
How can I solve the problem?
Minimal Complete Verifiable Example
Here are some simple code blocks to allow you simply reproduce the problem:
Content
$fileContent =
@"
{
"something": "http://domain/?x=1&y=2",
"pattern": "^(?!(\\`|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
}
"@
With Unescape
If I read the content and then convert the content back to json using following command:
$fileContent | ConvertFrom-Json | ConvertTo-Json | %{[regex]::Unescape($_)}
The output (which is wrong) would be:
{
"something": "http://domain/?x=1&y=2",
"pattern": "^(?!(\|\~|\!|\@|\#|\$|\||\\|\'|\")).*"
}
Without Unescape
If I read the content and then convert the content back to json using following command:
$fileContent | ConvertFrom-Json | ConvertTo-Json
The output (which is wrong) would be:
{
"something": "http://domain/?x=1\u0026y=2",
"pattern": "^(?!(\\|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\\u0027|\\\")).*"
}
Expected Result
The expected result should be same as the input file content.
回答1:
I decided to not use Unscape, instead replace the unicode \uxxxx
characters with their string values and now it works properly:
$fileContent =
@"
{
"something": "http://domain/?x=1&y=2",
"pattern": "^(?!(\\`|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
}
"@
$fileContent | ConvertFrom-Json | ConvertTo-Json | %{
[Regex]::Replace($_,
"\\u(?<Value>[a-zA-Z0-9]{4})", {
param($m) ([char]([int]::Parse($m.Groups['Value'].Value,
[System.Globalization.NumberStyles]::HexNumber))).ToString() } )}
Which generates the expected output:
{
"something": "http://domain/?x=1&y=\\2",
"pattern": "^(?!(\\|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
}
回答2:
If you don't want to rely on Regex (from @Reza Aghaei's answer), you could import the Newtonsoft JSON library. The benefit is the default StringEscapeHandling property which escapes control characters only. Another benefit is avoiding the potentially dangerous string replacements you would be doing with Regex.
This StringEscapeHandling
is also the default handling of PowerShell Core (version 6 and up) because they started to use Newtonsoft internally since then. So another alternative would be to use ConvertFrom-Json and ConvertTo-Json from PowerShell Core.
Your code would look something like this if you import the Newtonsoft JSON library:
[Reflection.Assembly]::LoadFile("Newtonsoft.Json.dll")
$json = Get-Content -Raw -Path file.json -Encoding UTF8 # read file
$unescaped = [Newtonsoft.Json.Linq.JObject]::Parse($json) # similar to ConvertFrom-Json
$escapedElementValue = [Newtonsoft.Json.JsonConvert]::ToString($unescaped.apiName.Value) # similar to ConvertTo-Json
$escapedCompleteJson = [Newtonsoft.Json.JsonConvert]::SerializeObject($unescaped) # similar to ConvertTo-Json
Write-Output "Variable passed = $escapedElementValue"
Write-Output "Same JSON as Input = $escapedCompleteJson"
来源:https://stackoverflow.com/questions/47779157/convertto-json-and-convertfrom-jason-with-special-characters