Can I convert a C# string value to an escaped string literal?

时光取名叫无心 2020-11-22 07:51

In C#, can I convert a string value to a string literal, the way I would see it in code? I would like to replace tabs, newlines, etc. with their escape sequences.

If

16 answers
  • 2020-11-22 08:05

    Hallgrim's answer is excellent. Here is a small tweak in case you need to strip the extra whitespace and line breaks from the generated literal with a C# regular expression. I needed this for a serialized JSON value that was inserted into Google Sheets, and ran into trouble because the generated code contained extra tabs, + signs, spaces, etc.

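      provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, null);
      var literal = writer.ToString();
      // Strip the `" +<newline><indent>"` joins that the generator emits when it wraps a long literal.
      // \r?\n accepts both Windows and Unix line endings.
      var r2 = new Regex(@"\"" \+\r?\n[\s]+\""", RegexOptions.ECMAScript);
      literal = r2.Replace(literal, "");
      return literal;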
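
    To make the effect of that regular expression concrete, here is a minimal sketch that applies it to a hand-written string standing in for wrapped generator output. The class name, the `RemoveJoins` helper, and the sample input are assumptions made for illustration; the pattern here accepts either `\r\n` or `\n` before the indentation.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class LiteralJoinDemo
    {
        // Hypothetical helper: deletes every `" +<newline><indent>"` join,
        // so the two halves of a wrapped literal become one literal again.
        public static string RemoveJoins(string literal)
        {
            var joins = new Regex(@"\"" \+\r?\n[\s]+\""", RegexOptions.ECMAScript);
            return joins.Replace(literal, "");
        }

        public static void Main()
        {
            // Assumed sample: two literal pieces joined with + across a line break.
            var wrapped = "\"aaa\" +\r\n    \"bbb\"";
            Console.WriteLine(RemoveJoins(wrapped)); // prints "aaabbb" (including the quotes)
        }
    }
    ```

    Only the join between the pieces is removed; escape sequences inside the pieces are left untouched.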
    
  • 2020-11-22 08:05

    I submit my own implementation, which handles null values and should be more performant on account of using array lookup tables, manual hex conversion, and avoiding switch statements.

    using System;
    using System.Text;
    using System.Linq;
    
    public static class StringLiteralEncoding {
      private static readonly char[] HEX_DIGIT_LOWER = "0123456789abcdef".ToCharArray();
      private static readonly char[] LITERALENCODE_ESCAPE_CHARS;
    
      static StringLiteralEncoding() {
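        // Per http://msdn.microsoft.com/en-us/library/h21280bw.aspx
        // Each entry is the character to escape followed by the letter used in its escape sequence.
        // '?' is deliberately not listed: "\?" is a C/C++ escape and does not compile in C#.
        var escapes = new string[] { "\aa", "\bb", "\ff", "\nn", "\rr", "\tt", "\vv", "\"\"", "\\\\", "\00" };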
        LITERALENCODE_ESCAPE_CHARS = new char[escapes.Max(e => e[0]) + 1];
        foreach(var escape in escapes)
          LITERALENCODE_ESCAPE_CHARS[escape[0]] = escape[1];
      }
    
      /// <summary>
      /// Convert the string to the equivalent C# string literal, enclosing the string in double quotes and inserting
      /// escape sequences as necessary.
      /// </summary>
      /// <param name="s">The string to be converted to a C# string literal.</param>
      /// <returns><paramref name="s"/> represented as a C# string literal.</returns>
      public static string Encode(string s) {
        if(null == s) return "null";
    
        var sb = new StringBuilder(s.Length + 2).Append('"');
        for(var rp = 0; rp < s.Length; rp++) {
          var c = s[rp];
          if(c < LITERALENCODE_ESCAPE_CHARS.Length && '\0' != LITERALENCODE_ESCAPE_CHARS[c])
            sb.Append('\\').Append(LITERALENCODE_ESCAPE_CHARS[c]);
          else if('~' >= c && c >= ' ')
            sb.Append(c);
          else
            sb.Append(@"\x")
              .Append(HEX_DIGIT_LOWER[c >> 12 & 0x0F])
              .Append(HEX_DIGIT_LOWER[c >>  8 & 0x0F])
              .Append(HEX_DIGIT_LOWER[c >>  4 & 0x0F])
              .Append(HEX_DIGIT_LOWER[c       & 0x0F]);
        }
    
        return sb.Append('"').ToString();
      }
    }
    
  • 2020-11-22 08:06

    Here is a small improvement on Smilediver's answer: instead of escaping every non-ASCII character, it escapes only those that really need it.

    using System;
    using System.Globalization;
    using System.Text;
    
    public static class CodeHelper
    {
        public static string ToLiteral(this string input)
        {
            var literal = new StringBuilder(input.Length + 2);
            literal.Append("\"");
            foreach (var c in input)
            {
                switch (c)
                {
                    case '\'': literal.Append(@"\'"); break;
                    case '\"': literal.Append("\\\""); break;
                    case '\\': literal.Append(@"\\"); break;
                    case '\0': literal.Append(@"\0"); break;
                    case '\a': literal.Append(@"\a"); break;
                    case '\b': literal.Append(@"\b"); break;
                    case '\f': literal.Append(@"\f"); break;
                    case '\n': literal.Append(@"\n"); break;
                    case '\r': literal.Append(@"\r"); break;
                    case '\t': literal.Append(@"\t"); break;
                    case '\v': literal.Append(@"\v"); break;
                    default:
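                        // U+2028 and U+2029 are not control characters, but C# treats them as
                        // line terminators, so they cannot appear raw in a regular string literal.
                        var category = Char.GetUnicodeCategory(c);
                        if (category != UnicodeCategory.Control
                            && category != UnicodeCategory.LineSeparator
                            && category != UnicodeCategory.ParagraphSeparator)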
                        {
                            literal.Append(c);
                        }
                        else
                        {
                            literal.Append(@"\u");
                            literal.Append(((ushort)c).ToString("x4"));
                        }
                        break;
                }
            }
            literal.Append("\"");
            return literal.ToString();
        }
    }
    
  • 2020-11-22 08:07

    What about Regex.Escape(String)?

    Regex.Escape escapes a minimal set of characters (\, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space) by replacing them with their escape codes.
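
    A quick way to judge whether this fits the question is to run it on the sample input used elsewhere in this thread. The sketch below is only an illustration (the class name is made up); it shows that tabs and line breaks do come back as escape sequences, but also that regex metacharacters get escaped and that double quotes are neither escaped nor added around the result.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class RegexEscapeDemo
    {
        public static void Main()
        {
            var input = "\tHello\r\n\tWorld!";
            // Tab, CR and LF come back as the two-character sequences \t, \r and \n.
            Console.WriteLine(Regex.Escape(input));
            // Regex metacharacters are escaped as well, which a C# literal does not need.
            Console.WriteLine(Regex.Escape("a.b")); // prints a\.b
            // Double quotes pass through unchanged, and no surrounding quotes are added.
            Console.WriteLine(Regex.Escape("say \"hi\""));
        }
    }
    ```

    So the result is escaped for use in a regular expression pattern, which overlaps with C# literal syntax only in part.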

  • 2020-11-22 08:09

    I found this:

    using System.CodeDom;
    using System.CodeDom.Compiler;
    using System.IO;

    private static string ToLiteral(string input)
    {
        using (var writer = new StringWriter())
        {
            using (var provider = CodeDomProvider.CreateProvider("CSharp"))
            {
                provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, null);
                return writer.ToString();
            }
        }
    }
    

    This code:

    var input = "\tHello\r\n\tWorld!";
    Console.WriteLine(input);
    Console.WriteLine(ToLiteral(input));
    

    Produces:

        Hello
        World!
    "\tHello\r\n\tWorld!"
    
  • 2020-11-22 08:13

    My attempt at adding ToVerbatim to Hallgrim's accepted answer above:

    private static string ToLiteral(string input)
    {
        using (var writer = new StringWriter())
        {
            using (var provider = CodeDomProvider.CreateProvider("CSharp"))
            {
                provider.GenerateCodeFromExpression(new CodePrimitiveExpression(input), writer, new CodeGeneratorOptions { IndentString = "\t" });
                var literal = writer.ToString();
                // Rejoin long strings that the generator split across lines as `" +<newline><tab>"`.
                literal = literal.Replace(string.Format("\" +{0}\t\"", Environment.NewLine), "");
                return literal;
            }
        }
    }
    
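    private static string ToVerbatim( string input )
    {
        string literal = ToLiteral( input );
        // Only the \r\n escape is expanded here. Other escapes that ToLiteral emits
        // (\t, \", \\ and so on) are left as-is, and a verbatim string does not interpret them as escapes.
        string verbatim = "@" + literal.Replace( @"\r\n", Environment.NewLine );
        return verbatim;
    }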
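
    For inputs that contain quotes or backslashes, one possible alternative is to build the verbatim literal straight from the raw value instead of from the escaped literal. In a verbatim string the only character with special meaning is the double quote, which is written as two double quotes. The following is a sketch under that assumption; `VerbatimDemo` and `ToVerbatimDirect` are hypothetical names and not part of the answer above.

    ```csharp
    using System;

    public static class VerbatimDemo
    {
        // Hypothetical helper: wraps the raw value in @"..." and doubles every
        // double quote. Tabs, line breaks and backslashes are emitted unchanged.
        public static string ToVerbatimDirect(string input)
        {
            return "@\"" + input.Replace("\"", "\"\"") + "\"";
        }

        public static void Main()
        {
            // The raw value is C:\temp\"a".txt
            Console.WriteLine(ToVerbatimDirect("C:\\temp\\\"a\".txt")); // prints @"C:\temp\""a"".txt"
        }
    }
    ```

    The trade-off is readability: control characters appear raw in the output rather than as visible escape sequences.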
    