Evaluate escaped string

-上瘾入骨i 2021-01-13 06:50

I have some strings in a file that are already escaped. So the content of the file looks like this:

Hello\nWorld. This is\tGreat.

When I

5 answers
  • 2021-01-13 07:08

    Like you, I was unable to find a decent solution to this problem. While you can certainly use String.Replace, chaining one call per escape sequence performs poorly, and it is hard to support octal and Unicode escape sequences that way. A much better alternative is a simple regex-based parser. Here is a method that un-escapes standard escape sequences, octal escape sequences, and Unicode (\uXXXX) escape sequences.

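    // Requires: using System.Text; using System.Text.RegularExpressions;
    string UnEscape(string s) {
        StringBuilder sb = new StringBuilder();
        // Alternatives: simple escapes | octal | \uXXXX | any other single character.
        // Singleline makes '.' match '\n' too; without it, literal newlines in s would be silently dropped.
        Regex r = new Regex("\\\\[abfnrtv?\"'\\\\]|\\\\[0-3]?[0-7]{1,2}|\\\\u[0-9a-fA-F]{4}|.", RegexOptions.Singleline);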
        MatchCollection mc = r.Matches(s, 0);
    
        foreach (Match m in mc) {
            if (m.Length == 1) {
                sb.Append(m.Value);
            } else {
                if (m.Value[1] >= '0' && m.Value[1] <= '7') {
                    int i = 0;
    
                    for (int j = 1; j < m.Length; j++) {
                        i *= 8;
                        i += m.Value[j] - '0';
                    }
    
                    sb.Append((char)i);
                } else if (m.Value[1] == 'u') {
                    int i = 0;
    
                    for (int j = 2; j < m.Length; j++) {
                        i *= 16;
    
                        if (m.Value[j] >= '0' && m.Value[j] <= '9') {
                            i += m.Value[j] - '0';
                        } else if (m.Value[j] >= 'A' && m.Value[j] <= 'F') {
                            i += m.Value[j] - 'A' + 10;
                        } else if (m.Value[j] >= 'a' && m.Value[j] <= 'f') {
                            i += m.Value[j] - 'a' + 10;
                        }
                    }
    
                    sb.Append((char)i);
                } else {
                    switch (m.Value[1]) {
                        case 'a':
                            sb.Append('\a');
                            break;
                        case 'b':
                            sb.Append('\b');
                            break;
                        case 'f':
                            sb.Append('\f');
                            break;
                        case 'n':
                            sb.Append('\n');
                            break;
                        case 'r':
                            sb.Append('\r');
                            break;
                        case 't':
                            sb.Append('\t');
                            break;
                        case 'v':
                            sb.Append('\v');
                            break;
                        default:
                            sb.Append(m.Value[1]);
                            break;
                    }
                }
            }
        }
    
        return sb.ToString();
    }
    
  • 2021-01-13 07:12

    You can try using System.Text.RegularExpressions.Regex.Unescape.

    There's also an entry on the MSDN forums.

    See also "How can I Unescape and Reescape strings in .net?".
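
For reference, a minimal sketch of what calling Regex.Unescape on the question's sample text looks like. The class and method names (RegexUnescapeDemo, Decode) are made up for illustration. Note that Regex.Unescape is documented to throw an ArgumentException when it meets an escape sequence it does not recognise, so it may not suit arbitrary input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexUnescapeDemo
{
    // Hypothetical thin wrapper around Regex.Unescape, so the call is easy to reuse.
    public static string Decode(string escaped)
    {
        return Regex.Unescape(escaped);
    }

    public static void Main()
    {
        // Verbatim string: the backslashes are literal characters,
        // just as they would be after reading the line from a file.
        string raw = @"Hello\nWorld. This is\tGreat.";

        // After decoding, \n is a real line break and \t a real tab.
        Console.WriteLine(Decode(raw));
    }
}
```

If the input can contain sequences outside what Regex.Unescape understands, one of the hand-written parsers in the other answers gives more control.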

  • 2021-01-13 07:12

    You could do something like:

    str = str.Replace(@"\n", "\n");
    

    Update:

    Obviously this is a workaround, as the scenario is unnatural in itself. Regex.Unescape is intended for reversing regex escaping: it does translate sequences such as \n and \t, but it also interprets regex-specific escapes, so it is not an exact fit here.

    To support other relevant characters, one can write a replacing function like this one:

    public string ReEscapeControlCharacters(string str) {
       return str.Replace(@"\n","\n").Replace(@"\r","\r").Replace(@"\t","\t");
    }
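
One caveat with chained Replace calls is that each call rescans the whole string, so an escaped backslash followed by the letter n is misread as a newline escape. The sketch below illustrates this and shows one possible single-pass alternative. The names ReplaceChainDemo, ChainedReplace, and SinglePass are hypothetical, and the single-pass version only covers \\, \n, \r, and \t.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceChainDemo
{
    // The chained-Replace approach: three independent passes over the string.
    public static string ChainedReplace(string s)
    {
        return s.Replace(@"\n", "\n").Replace(@"\r", "\r").Replace(@"\t", "\t");
    }

    // Hypothetical single-pass alternative: the regex consumes each
    // backslash pair exactly once, scanning left to right.
    public static string SinglePass(string s)
    {
        return Regex.Replace(s, @"\\([\\nrt])", m =>
        {
            switch (m.Groups[1].Value)
            {
                case "n": return "\n";
                case "r": return "\r";
                case "t": return "\t";
                default: return "\\"; // an escaped backslash becomes one backslash
            }
        });
    }

    public static void Main()
    {
        // File content: C:\\new (an escaped backslash followed by the word "new").
        string raw = @"C:\\new";

        // The chained version turns the second backslash plus 'n' into a line break.
        Console.WriteLine(ChainedReplace(raw).Contains("\n")); // True

        // The single-pass version yields: C:\new
        Console.WriteLine(SinglePass(raw));
    }
}
```

If the input never contains escaped backslashes, the simpler chained version is adequate.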
    
  • 2021-01-13 07:19

    Try this:

    String replaced = startstring.Replace(System.Environment.NewLine, desirevalue);
    

    This is valid only for "\n".

  • 2021-01-13 07:26

    Based on @deAtog's code, I made some minor additions:

    • Support \U00000000-format characters
    • Simplify the hex conversions somewhat

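      // Requires: using System; using System.Text; using System.Text.RegularExpressions;
      string UnEscape(string s)
      {
          StringBuilder sb = new StringBuilder();
          // Alternatives: simple escapes | octal | \uXXXX | \UXXXXXXXX | any other single character.
          // Singleline makes '.' match '\n' too; without it, literal newlines in s would be silently dropped.
          Regex r = new Regex("\\\\[abfnrtv?\"'\\\\]|\\\\[0-3]?[0-7]{1,2}|\\\\u[0-9a-fA-F]{4}|\\\\U[0-9a-fA-F]{8}|.", RegexOptions.Singleline);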
          MatchCollection mc = r.Matches(s, 0);
      
          foreach (Match m in mc)
          {
              if (m.Length == 1)
              {
                  sb.Append(m.Value);
              }
              else
              {
                  if (m.Value[1] >= '0' && m.Value[1] <= '7')
                  {
                      int i = Convert.ToInt32(m.Value.Substring(1), 8);
                      sb.Append((char)i);
                  }
                  else if (m.Value[1] == 'u')
                  {
                      int i = Convert.ToInt32(m.Value.Substring(2), 16);
                      sb.Append((char)i);
                  }
                  else if (m.Value[1] == 'U')
                  {
                      int i = Convert.ToInt32(m.Value.Substring(2), 16);
                      sb.Append(char.ConvertFromUtf32(i));
                  }
                  else
                  {
                      switch (m.Value[1])
                      {
                          case 'a':
                              sb.Append('\a');
                              break;
                          case 'b':
                              sb.Append('\b');
                              break;
                          case 'f':
                              sb.Append('\f');
                              break;
                          case 'n':
                              sb.Append('\n');
                              break;
                          case 'r':
                              sb.Append('\r');
                              break;
                          case 't':
                              sb.Append('\t');
                              break;
                          case 'v':
                              sb.Append('\v');
                              break;
                          default:
                              sb.Append(m.Value[1]);
                              break;
                      }
                  }
              }
          }
      
          return sb.ToString();
      }
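
The \U branch uses char.ConvertFromUtf32 rather than a cast to char because a C# char is a single UTF-16 code unit, so code points above U+FFFF need two of them (a surrogate pair). A small sketch of the difference, with the hypothetical names CodePointDemo and FromCodePoint:

```csharp
using System;

public static class CodePointDemo
{
    // Hypothetical helper: converts a Unicode code point to its UTF-16 string form.
    public static string FromCodePoint(int codePoint)
    {
        return char.ConvertFromUtf32(codePoint);
    }

    public static void Main()
    {
        // U+0041 fits in one UTF-16 code unit.
        Console.WriteLine(FromCodePoint(0x41).Length);    // 1

        // U+1F600 lies outside the Basic Multilingual Plane,
        // so it is stored as a surrogate pair.
        Console.WriteLine(FromCodePoint(0x1F600).Length); // 2
    }
}
```

char.ConvertFromUtf32 throws ArgumentOutOfRangeException for values that are not valid code points (for example, anything above 0x10FFFF), so input such as \UFFFFFFFF would need separate handling.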
      