Java or Scala. How to convert characters like \x22 into String

纵然是瞬间 submitted on 2019-12-24 00:35:27

Question


I have a string that looks like this:

{\x22documentReferer\x22:\x22http:\x5C/\x5C/pikabu.ru\x5C/freshitems.php\x22}

How could I convert this into a readable JSON?

I've found several solutions, such as a regex-based one here, but they are slow.

I have already tried:

URL.decode
StringEscapeUtils
JSON.parse // from different libraries 

For example, Python has a simple solution: decoding with the 'string_escape' codec.

The linked possible duplicate applies to Python; my question is about Java or Scala.

The working, but also very slow, solution I'm using now is from here:

def unescape(oldstr: String): String = {
  val newstr = new StringBuilder(oldstr.length)
  var saw_backslash = false
  var i = 0
  while (i < oldstr.length) {
    val c = oldstr.charAt(i)
    if (!saw_backslash) {
      if (c == '\\') saw_backslash = true
      else newstr.append(c)
    } else {
      if (c == 'x') {
        // \xHH needs two hex digits after the 'x'
        if (i + 3 > oldstr.length) die("string too short for \\x escape")
        val value =
          try Integer.parseInt(oldstr.substring(i + 1, i + 3), 16)
          catch {
            case _: NumberFormatException =>
              die("invalid hex value for \\x escape")
          }
        newstr.append(value.toChar)
        i += 2 // skip the two hex digits
      } else {
        // any other escape (including \\) is passed through unchanged
        newstr.append('\\')
        newstr.append(c)
      }
      saw_backslash = false
    }
    i += 1
  }
  // a trailing lone backslash is kept as-is
  if (saw_backslash) newstr.append('\\')
  newstr.toString
}

private def die(msg: String): Nothing =
  throw new IllegalArgumentException(msg)

Answer 1:


In Python and some other languages, \x followed by two hex digits escapes a single character. Scala and Java instead use \u followed by four hex digits to escape Unicode characters. Since ASCII is a subset of Unicode (as explained here), we can rewrite each \x as \u00 (the \u escape plus two leading zeros) and then call the unescapeJava method in StringEscapeUtils:

import org.apache.commons.lang3.StringEscapeUtils
StringEscapeUtils.unescapeJava(x.replaceAll("""\\x""", """\\u00"""))

You can also use regex to find the escape sequences and replace them with the appropriate ASCII character:

val pattern = """\\x([0-9A-F]{2})""".r

pattern.replaceAllIn(x, m => m.group(1) match {
  case "5C" => """\\""" //special case for backslash
  case hex => Integer.parseInt(hex, 16).toChar.toString
})

This appears to be faster and does not require an external library, although it may still be too slow for your needs. It probably also misses some edge cases, but it might cover simple needs.
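Two edge cases are worth spelling out for the regex approach: lowercase hex digits, and decoded characters such as $ or \ that are special inside a regex replacement string. The following is a minimal Java sketch of the same idea, not a tested drop-in; the class name HexEscapeRegex and the method name unescape are made up for illustration. It relies on Matcher.quoteReplacement so that no per-character special case is needed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for this example.
final class HexEscapeRegex {
    // Matches a literal \x followed by exactly two hex digits, upper or lower case.
    private static final Pattern HEX_ESCAPE =
            Pattern.compile("\\\\x([0-9A-Fa-f]{2})");

    static String unescape(String input) {
        Matcher m = HEX_ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer(input.length());
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps a decoded '\' or '$' from being
            // interpreted as replacement syntax by appendReplacement.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling HexEscapeRegex.unescape on the sample string from the question should give {"documentReferer":"http:\/\/pikabu.ru\/freshitems.php"}, which is valid JSON because \/ is a legal JSON escape for /.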

I am definitely not an expert on this, so there might be a better way to handle it.
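If speed is the main concern, one option to try is a single pass over the string with no regex at all. The sketch below is in Java and makes some assumptions that differ from the code in the question: every \xHH sequence is decoded wherever it appears (an escaped backslash before it is not treated specially), and malformed sequences are copied through unchanged instead of raising an error. The class name HexEscapeScanner is hypothetical, and no benchmark has been run on it.

```java
// Hypothetical helper class; the name is an assumption for this example.
final class HexEscapeScanner {
    static String unescape(String input) {
        int n = input.length();
        StringBuilder out = new StringBuilder(n);
        int i = 0;
        while (i < n) {
            char c = input.charAt(i);
            // Need four characters in total: '\', 'x', and two hex digits.
            if (c == '\\' && i + 3 < n && input.charAt(i + 1) == 'x') {
                // Character.digit returns -1 when the char is not a digit in radix 16.
                int hi = Character.digit(input.charAt(i + 2), 16);
                int lo = Character.digit(input.charAt(i + 3), 16);
                if (hi >= 0 && lo >= 0) {
                    out.append((char) (hi * 16 + lo));
                    i += 4; // skip the whole \xHH sequence
                    continue;
                }
            }
            // Not a well-formed \xHH escape: copy the character as-is.
            out.append(c);
            i++;
        }
        return out.toString();
    }
}
```

Usage would be HexEscapeScanner.unescape(x), with the result passed to whichever JSON parser is already in use.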



Source: https://stackoverflow.com/questions/47032049/java-or-scala-how-to-convert-characters-like-x22-into-string
