Is the following behaviour a feature or a bug in C# / .NET?
Test application:
using System;
using System.Linq;
namespace ConsoleApplication1
{
After much experimentation, this worked for me. I'm building a command to send to the Windows command line. A folder name follows the -graphical option in the command, and since it may contain spaces, it has to be wrapped in double quotes. When I used backslashes to create the quotes, they came out as literal backslashes in the command. So I did this:
string q = @"" + (char) 34;
string strCmdText = string.Format(@"/C cleartool update -graphical {1}{0}{1}", this.txtViewFolder.Text, q);
System.Diagnostics.Process.Start("CMD.exe", strCmdText);
q is a string holding just a double-quote character: an empty verbatim string literal (@"") concatenated with (char) 34, the character code for a double quote.
The command template is also a verbatim string literal, and the string.Format method assembles everything into strCmdText.
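For comparison, here is a minimal sketch of two other ways to get the same quotes into the command. A backslash escape such as \" only works in a regular (non-verbatim) string literal; inside a verbatim literal the backslash is kept as-is, and a double quote is written as "". The class and method names are hypothetical; the cleartool command line is the one from the snippet above.

```csharp
using System;

static class CommandBuilder
{
    // Regular string literal: \" is an escape sequence for one double quote.
    public static string BuildWithEscape(string viewFolder)
    {
        return string.Format("/C cleartool update -graphical \"{0}\"", viewFolder);
    }

    // Verbatim string literal: backslashes are literal, and "" stands for one double quote.
    public static string BuildWithVerbatim(string viewFolder)
    {
        return string.Format(@"/C cleartool update -graphical ""{0}""", viewFolder);
    }
}
```

Both methods return the same string, which can be passed to System.Diagnostics.Process.Start("CMD.exe", ...) exactly as strCmdText is above.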
I came across this same issue the other day and had a tough time getting through it. While googling, I found this article for VB.NET (the language of my application) that solved the problem without my having to change any of my other code that depends on the arguments.
In that article, he refers to the original article, which was written for C#. Here's the actual code; you pass it Environment.CommandLine:
C#
using System;
using System.Text;

class CommandLineTools
{
/// <summary>
/// C-like argument parser
/// </summary>
/// <param name="commandLine">Command line string with arguments. Use Environment.CommandLine</param>
/// <returns>The args[] array (argv)</returns>
public static string[] CreateArgs(string commandLine)
{
StringBuilder argsBuilder = new StringBuilder(commandLine);
bool inQuote = false;
// Convert the spaces to a newline sign so we can split at newline later on
// Only convert spaces which are outside the boundaries of quoted text
for (int i = 0; i < argsBuilder.Length; i++)
{
if (argsBuilder[i].Equals('"'))
{
inQuote = !inQuote;
}
if (argsBuilder[i].Equals(' ') && !inQuote)
{
argsBuilder[i] = '\n';
}
}
// Split to args array
string[] args = argsBuilder.ToString().Split(new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
// Clean the '"' signs from the args as needed.
for (int i = 0; i < args.Length; i++)
{
args[i] = ClearQuotes(args[i]);
}
return args;
}
/// <summary>
/// Cleans quotes from the arguments.<br/>
/// All single (unpaired) quotes (") will be removed.<br/>
/// Every pair of quotes ("") will transform to a single quote.<br/>
/// </summary>
/// <param name="stringWithQuotes">A string with quotes.</param>
/// <returns>The same string if it has no quotes, or a cleaned string if it does.</returns>
private static string ClearQuotes(string stringWithQuotes)
{
int quoteIndex;
if ((quoteIndex = stringWithQuotes.IndexOf('"')) == -1)
{
// String is without quotes..
return stringWithQuotes;
}
// Linear sb scan is faster than string assignment if quote count is 2 or more (=always)
StringBuilder sb = new StringBuilder(stringWithQuotes);
for (int i = quoteIndex; i < sb.Length; i++)
{
if (sb[i].Equals('"'))
{
// If we are not at the last index and the next one is '"', we need to jump one to preserve one
if (i != sb.Length - 1 && sb[i + 1].Equals('"'))
{
i++;
}
// We remove and then set index one backwards.
// This is because the remove itself is going to shift everything left by 1.
sb.Remove(i--, 1);
}
}
return sb.ToString();
}
}
VB.NET:
Imports System.Text
' Original version by Jonathan Levison (C#)'
' http://sleepingbits.com/2010/01/command-line-arguments-with-double-quotes-in-net/
' converted using http://www.developerfusion.com/tools/convert/csharp-to-vb/
' and then some manual effort to fix language discrepancies
Friend Class CommandLineHelper
''' <summary>
''' C-like argument parser
''' </summary>
''' <param name="commandLine">Command line string with arguments. Use Environment.CommandLine</param>
''' <returns>The args[] array (argv)</returns>
Public Shared Function CreateArgs(commandLine As String) As String()
Dim argsBuilder As New StringBuilder(commandLine)
Dim inQuote As Boolean = False
' Convert the spaces to a newline sign so we can split at newline later on
' Only convert spaces which are outside the boundaries of quoted text
For i As Integer = 0 To argsBuilder.Length - 1
If argsBuilder(i).Equals(""""c) Then
inQuote = Not inQuote
End If
If argsBuilder(i).Equals(" "c) AndAlso Not inQuote Then
argsBuilder(i) = ControlChars.Lf
End If
Next
' Split to args array
Dim args As String() = argsBuilder.ToString().Split(New Char() {ControlChars.Lf}, StringSplitOptions.RemoveEmptyEntries)
' Clean the '"' signs from the args as needed.
For i As Integer = 0 To args.Length - 1
args(i) = ClearQuotes(args(i))
Next
Return args
End Function
''' <summary>
''' Cleans quotes from the arguments.<br/>
''' All single (unpaired) quotes (") will be removed.<br/>
''' Every pair of quotes ("") will transform to a single quote.<br/>
''' </summary>
''' <param name="stringWithQuotes">A string with quotes.</param>
''' <returns>The same string if it has no quotes, or a cleaned string if it does.</returns>
Private Shared Function ClearQuotes(stringWithQuotes As String) As String
Dim quoteIndex As Integer = stringWithQuotes.IndexOf(""""c)
If quoteIndex = -1 Then Return stringWithQuotes
' Linear sb scan is faster than string assignment if quote count is 2 or more (=always)
Dim sb As New StringBuilder(stringWithQuotes)
Dim i As Integer = quoteIndex
Do While i < sb.Length
If sb(i).Equals(""""c) Then
' If we are not at the last index and the next one is '"', we need to jump one to preserve one
If i <> sb.Length - 1 AndAlso sb(i + 1).Equals(""""c) Then
i += 1
End If
' We remove and then set index one backwards.
' This is because the remove itself is going to shift everything left by 1.
sb.Remove(i, 1)
i -= 1
End If
i += 1
Loop
Return sb.ToString()
End Function
End Class
This works for me, and it works correctly with the example in the question.
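// Requires: using System; using System.ComponentModel; using System.Linq;
// using System.Runtime.InteropServices; using System.Text;
/// <summary>
/// Splits a command line with the Win32 CommandLineToArgvW function.
/// https://www.pinvoke.net/default.aspx/shell32/CommandLineToArgvW.html
/// </summary>
/// <param name="unsplitArgumentLine">The raw, unsplit command line.</param>
/// <returns>The individual arguments.</returns>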
static string[] SplitArgs(string unsplitArgumentLine)
{
int numberOfArgs;
IntPtr ptrToSplitArgs;
string[] splitArgs;
ptrToSplitArgs = CommandLineToArgvW(unsplitArgumentLine, out numberOfArgs);
// CommandLineToArgvW returns NULL upon failure.
if (ptrToSplitArgs == IntPtr.Zero)
throw new ArgumentException("Unable to split argument.", new Win32Exception());
// Make sure the memory ptrToSplitArgs points to is freed, even upon failure.
try
{
splitArgs = new string[numberOfArgs];
// ptrToSplitArgs is an array of pointers to null terminated Unicode strings.
// Copy each of these strings into our split argument array.
for (int i = 0; i < numberOfArgs; i++)
splitArgs[i] = Marshal.PtrToStringUni(
Marshal.ReadIntPtr(ptrToSplitArgs, i * IntPtr.Size));
return splitArgs;
}
finally
{
// Free memory obtained by CommandLineToArgvW.
LocalFree(ptrToSplitArgs);
}
}
[DllImport("shell32.dll", SetLastError = true)]
static extern IntPtr CommandLineToArgvW(
[MarshalAs(UnmanagedType.LPWStr)] string lpCmdLine,
out int pNumArgs);
[DllImport("kernel32.dll")]
static extern IntPtr LocalFree(IntPtr hMem);
static string Reverse(string s)
{
char[] charArray = s.ToCharArray();
Array.Reverse(charArray);
return new string(charArray);
}
static string GetEscapedCommandLine()
{
StringBuilder sb = new StringBuilder();
bool gotQuote = false;
foreach (var c in Environment.CommandLine.Reverse())
{
if (c == '"')
gotQuote = true;
else if (gotQuote && c == '\\')
{
// double it
sb.Append('\\');
}
else
gotQuote = false;
sb.Append(c);
}
return Reverse(sb.ToString());
}
static void Main(string[] args)
{
// Crazy hack
args = SplitArgs(GetEscapedCommandLine()).Skip(1).ToArray();
}
According to this article by Jon Galloway, backslashes in command line arguments can behave strangely.
Most notably it mentions that "Most applications (including .NET applications) use CommandLineToArgvW to decode their command lines. It uses crazy escaping rules which explain the behaviour you're seeing."
As I understand the rules, backslashes are taken literally unless a run of them comes immediately before a double quote; in that case each backslash must be doubled, and a double quote that should appear literally must itself be escaped with a backslash.
Based on these rules, I believe that to get the arguments you want you would have to pass them as:
a "b" "\\x\\\\" "\x\\"
"Whacky" indeed.
The full story of the crazy escaping rules was told in 2011 in a Microsoft blog entry: "Everyone quotes command line arguments the wrong way"
Raymond Chen also had something to say on the matter (already back in 2010): "What's up with the strange treatment of quotation marks and backslashes by CommandLineToArgvW"
The escaping rules described in "Everyone quotes command line arguments the wrong way" are still correct as of 2020 and Windows 10.
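As a rough illustration of those rules, here is a minimal sketch of quoting a single argument so that CommandLineToArgvW decodes it back to the original text. It follows the rules as I understand them: a run of backslashes is doubled only when it precedes a double quote (including the closing one), and a literal quote gets one extra backslash. The names ArgQuoting and QuoteArgument are hypothetical, and the sketch does not deal with cmd.exe metacharacters, which need separate handling.

```csharp
using System;
using System.Text;

static class ArgQuoting
{
    // Quotes one argument for a command line parsed by CommandLineToArgvW.
    public static string QuoteArgument(string arg)
    {
        // Arguments without whitespace or quotes can be passed through unchanged.
        if (arg.Length > 0 && arg.IndexOfAny(new[] { ' ', '\t', '\n', '\v', '"' }) < 0)
            return arg;

        var sb = new StringBuilder();
        sb.Append('"');
        int backslashes = 0; // length of the pending run of backslashes

        foreach (char c in arg)
        {
            if (c == '\\')
            {
                backslashes++;
                continue;
            }
            if (c == '"')
            {
                // Double the pending backslashes, then add one more to escape the quote.
                sb.Append('\\', backslashes * 2 + 1);
            }
            else
            {
                // Backslashes not followed by a quote are literal.
                sb.Append('\\', backslashes);
            }
            sb.Append(c);
            backslashes = 0;
        }

        // Trailing backslashes precede the closing quote, so they must be doubled.
        sb.Append('\\', backslashes * 2);
        sb.Append('"');
        return sb.ToString();
    }
}
```

For instance, an argument such as C:\my dir\ has to be quoted because of the space, and its trailing backslash is doubled so that it is not read as escaping the closing quote.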
I sidestepped the problem in a different way.
Instead of taking the already-parsed arguments, I read the raw argument string as it is and run it through my own parser:
static void Main(string[] args)
{
var param = ParseString(Environment.CommandLine);
...
}
// The following pattern implements this notation:
// -key1 = some value -key2 = "some value even with '-' character " ...
private const string ParameterQuery = "\\-(?<key>\\w+)\\s*=\\s*(\"(?<value>[^\"]*)\"|(?<value>[^\\-]*))\\s*";
private static Dictionary<string, string> ParseString(string value)
{
var regex = new Regex(ParameterQuery);
return regex.Matches(value).Cast<Match>().ToDictionary(m => m.Groups["key"].Value, m => m.Groups["value"].Value);
}
This approach lets you type quotes without an escape prefix.
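Here is a self-contained sketch of the same idea, run against a fixed string rather than Environment.CommandLine so the result is easy to check. The pattern is the one shown above; the class name KeyValueArgs and the Trim() call are my own additions. The trim is an assumption about what is wanted: an unquoted value otherwise keeps the whitespace before the next "-", but trimming also strips leading and trailing spaces from quoted values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class KeyValueArgs
{
    // Same pattern as above: -key = value, or -key = "quoted value".
    private const string ParameterQuery =
        "\\-(?<key>\\w+)\\s*=\\s*(\"(?<value>[^\"]*)\"|(?<value>[^\\-]*))\\s*";

    public static Dictionary<string, string> Parse(string commandLine)
    {
        // ToDictionary throws ArgumentException if the same key appears twice.
        return Regex.Matches(commandLine, ParameterQuery)
            .Cast<Match>()
            .ToDictionary(
                m => m.Groups["key"].Value,
                m => m.Groups["value"].Value.Trim()); // Trim is an added assumption
    }
}
```

Parsing the string app.exe -key1 = some value -key2 = "with - dash" this way yields two entries: key1 mapped to some value, and key2 mapped to with - dash. Note that an unquoted value cannot contain a "-" character, since the pattern treats it as the start of the next key.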