In a .NET Regex pattern, what special characters need to be escaped in order to be used literally?
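I think you can get the list of chars with:
// Range's second argument is a count, so char.MaxValue + 1 covers every char value.
List<char> chars = Enumerable.Range(0, char.MaxValue + 1)
    .Where(i => ((char)i).ToString() != Regex.Escape(((char)i).ToString()))
    .Select(i => (char)i)
    .ToList();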
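This yields the following (note the space character between \r and #, which Regex.Escape also escapes):
\t\n\f\r #$()*+.?[\^{|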
Here is the list of characters that need to be escaped to use them as normal literals:
[
\
^
$
.
|
?
*
+
(
)
{
#
These special characters are often called "metacharacters".
However, I agree with Jon: use Regex.Escape instead of hardcoding these characters in code.
See the MSDN documentation here: http://msdn.microsoft.com/en-us/library/az24scfc.aspx#character_escapes
The problem with a complete list is that it depends on context. For example:
. must be escaped, unless it is enclosed in brackets, as in [.]
] technically does not need to be escaped, unless it is preceded by [
- has no special meaning, unless it's inside brackets, as in [A-Z]
= has no special meaning, unless it is preceded by ?, as in (?=)
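The context rules above can be checked with a few calls to Regex.IsMatch. This is only a rough sketch: the class name ContextDemo and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names and sample inputs are illustrative only.
public static class ContextDemo
{
    public static void Main()
    {
        // An unescaped dot matches any character except newline.
        Console.WriteLine(Regex.IsMatch("aXb", "^a.b$"));    // True
        // Inside a character class the dot is a literal dot.
        Console.WriteLine(Regex.IsMatch("aXb", "^a[.]b$"));  // False
        Console.WriteLine(Regex.IsMatch("a.b", "^a[.]b$"));  // True

        // A closing bracket with no opening bracket before it is literal.
        Console.WriteLine(Regex.IsMatch("a]b", "^a]b$"));    // True

        // A hyphen outside brackets is literal; inside brackets it forms a range.
        Console.WriteLine(Regex.IsMatch("A-Z", "^A-Z$"));    // True
        Console.WriteLine(Regex.IsMatch("M", "^[A-Z]$"));    // True

        // '=' is literal on its own, but part of the lookahead syntax after "(?".
        Console.WriteLine(Regex.IsMatch("a=b", "^a=b$"));    // True
        Console.WriteLine(Regex.IsMatch("ab", "^a(?=b)"));   // True
    }
}
```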
I don't know the complete set of characters - but I wouldn't rely on the knowledge anyway, and I wouldn't put it into code. Instead, I would use Regex.Escape whenever I wanted some literal text that I wasn't sure about:
// Don't actually do this to check containment... it's just a little example.
public bool RegexContains(string haystack, string needle)
{
    Regex regex = new Regex("^.*" + Regex.Escape(needle) + ".*$");
    return regex.IsMatch(haystack);
}
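To see why the escaping matters, here is a small sketch that compares an escaped and an unescaped needle. The class name EscapeDemo and the sample strings are assumptions made for illustration; RegexContains is repeated (as a static method) only so the snippet compiles on its own.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names and inputs are illustrative only.
public static class EscapeDemo
{
    // Same idea as RegexContains above, repeated so this sketch is self-contained.
    public static bool RegexContains(string haystack, string needle)
    {
        Regex regex = new Regex("^.*" + Regex.Escape(needle) + ".*$");
        return regex.IsMatch(haystack);
    }

    public static void Main()
    {
        string needle = "1+1=2";

        // Regex.Escape puts a backslash in front of '+', but leaves '=' alone.
        Console.WriteLine(Regex.Escape(needle));                     // 1\+1=2

        // Escaped: the needle is matched as literal text.
        Console.WriteLine(RegexContains("so 1+1=2 holds", needle));  // True

        // Unescaped: '+' acts as a quantifier, so the pattern means
        // "one or more '1', then '1=2'" and no longer matches the literal text.
        Console.WriteLine(Regex.IsMatch("so 1+1=2 holds", needle));  // False
        Console.WriteLine(Regex.IsMatch("11=2", needle));            // True
    }
}
```

The unescaped pattern silently matches a different string, which is the kind of bug a hardcoded character list tends to miss.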