I need to parse some data, and I want to convert
AutomaticTrackingSystem
to
Automatic Tracking System
Try this:
using System;
using System.Linq;
using System.Text.RegularExpressions;
class MainClass
{
public static void Main (string[] args)
{
var rx = new Regex
(@"([a-z]+[A-Z]|[A-Z][A-Z]+|[A-Z]|[^A-Za-z][^A-Za-z]+)");
string[] tests = {
"AutomaticTrackingSystem",
"XMLEditor",
"AnXMLAndXSLT2.0Tool",
"NumberOfABCDThings",
"AGoodMan",
"CodeOfAGoodMan"
};
foreach(string t in tests)
{
string y = Reverse(t);
string x = Reverse( rx.Replace(y, @" $1") );
Console.WriteLine("\n\n{0} -- {1}",y,x);
}
}
static string Reverse(string s)
{
var ca = s.ToCharArray();
Array.Reverse(ca);
string t = new string(ca);
return t;
}
}
Output:
metsySgnikcarTcitamotuA -- Automatic Tracking System
rotidELMX -- XML Editor
looT0.2TLSXdnALMXnA -- An XML And XSLT 2.0 Tool
sgnihTDCBAfOrebmuN -- Number Of ABCD Things
naMdooGA -- A Good Man
naMdooGAfOedoC -- Code Of A Good Man
It works by scanning the string backward and treating each capital letter as the terminator of a word. I wish Regex had a parameter for scanning the string backward, so the separate string reversal above wouldn't be needed :-)
I've just written a function to do exactly this. :)
Replace ([a-z])([A-Z]) with $1 $2 (or \1 \2 in other languages).
I've also got a replace for ([A-Z]+)([A-Z][a-z]) too - this converts things like "NumberOfABCDThings" into "Number Of ABCD Things"
So in C# this would look something like:
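Regex r1 = new Regex(@"([a-z])([A-Z])");
Regex r2 = new Regex(@"([A-Z]+)([A-Z][a-z])");
// inputString is the text to convert
string newString = r1.Replace(inputString, "$1 $2"); // lowercase followed by uppercase
newString = r2.Replace(newString, "$1 $2"); // acronym followed by a capitalized word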
(although possibly there's a more concise way of writing that)
If you might have punctuation or numbers, I guess you could try ([^A-Z])([A-Z]) for the first match.
Hmmm, another way of writing those regexes, using lookbehind and lookahead, is to just match the position and insert a space - i.e. (?<=[a-z])(?=[A-Z]) and (?<=[A-Z]+)(?=[A-Z][a-z]) and in both cases replace with just " " - not sure whether there may be advantages to that method, but it's an interesting way. :)
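The lookaround idea above can be sketched as a single call. This is only a sketch under a few assumptions: the class and method names (CamelCaseSpacer, AddSpaces) are made up for illustration, the two patterns are joined with | into one regex, and the lookbehind is written as (?<=[A-Z]) because a single preceding capital is enough for the assertion to hold.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CamelCaseSpacer
{
    // Hypothetical helper: the two lookaround patterns from the text,
    // combined into one alternation.
    private static readonly Regex Boundary = new Regex(
        @"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

    public static string AddSpaces(string input)
    {
        // Both alternatives match an empty position, so the replacement
        // only inserts a space and never consumes a character.
        return Boundary.Replace(input, " ");
    }
}

public static class CamelCaseSpacerDemo
{
    public static void Main()
    {
        Console.WriteLine(CamelCaseSpacer.AddSpaces("AutomaticTrackingSystem")); // Automatic Tracking System
        Console.WriteLine(CamelCaseSpacer.AddSpaces("NumberOfABCDThings"));      // Number Of ABCD Things
    }
}
```

One possible advantage of matching positions is that no capture groups are needed, so the replacement string is just " ".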
If you seek to keep acronyms intact, replace "([^A-Z])([A-Z])" with "\1 \2", else replace "(.)([A-Z])" with "\1 \2". (In .NET the replacement string is written "$1 $2".)
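To see what those two replacements do in .NET, here is a small sketch. The class and method names are hypothetical, and the replacement uses the .NET "$1 $2" syntax. Note that the first pattern never splits inside a run of capitals, so "XMLEditor" comes back unchanged, and that each match of the second pattern consumes two characters, so adjacent capitals are not all separated.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AcronymDemo
{
    // A space is inserted only after a non-uppercase character,
    // so runs of capitals stay together.
    public static string KeepAcronyms(string input)
    {
        return Regex.Replace(input, "([^A-Z])([A-Z])", "$1 $2");
    }

    // A space is inserted before a capital regardless of what precedes it.
    public static string SplitBeforeCapitals(string input)
    {
        return Regex.Replace(input, "(.)([A-Z])", "$1 $2");
    }

    public static void Main()
    {
        Console.WriteLine(KeepAcronyms("AutomaticTrackingSystem")); // Automatic Tracking System
        Console.WriteLine(KeepAcronyms("XMLEditor"));               // XMLEditor
        Console.WriteLine(SplitBeforeCapitals("AGoodMan"));         // A Good Man
        Console.WriteLine(SplitBeforeCapitals("XMLEditor"));        // X ML Editor
    }
}
```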
You can use lookarounds, e.g.:
string[] tests = {
"AutomaticTrackingSystem",
"XMLEditor",
};
Regex r = new Regex(@"(?!^)(?=[A-Z])");
foreach (string test in tests) {
Console.WriteLine(r.Replace(test, " "));
}
This prints (as seen on ideone.com):
Automatic Tracking System
X M L Editor
The regex (?!^)(?=[A-Z]) consists of two assertions:
(?!^) - i.e. we're not at the beginning of the string
(?=[A-Z]) - i.e. we're just before an uppercase letter
Here's where using assertions really makes a difference: when you have several different rules, and/or you want to Split instead of Replace. This example combines both:
string[] tests = {
"AutomaticTrackingSystem",
"XMLEditor",
"AnXMLAndXSLT2.0Tool",
};
Regex r = new Regex(
@" (?<=[A-Z])(?=[A-Z][a-z]) # UC before me, UC lc after me
| (?<=[^A-Z])(?=[A-Z]) # Not UC before me, UC after me
| (?<=[A-Za-z])(?=[^A-Za-z]) # Letter before me, non letter after me
",
RegexOptions.IgnorePatternWhitespace
);
foreach (string test in tests) {
foreach (string part in r.Split(test)) {
Console.Write("[" + part + "]");
}
Console.WriteLine();
}
This prints (as seen on ideone.com):
[Automatic][Tracking][System]
[XML][Editor]
[An][XML][And][XSLT][2.0][Tool]
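If the goal is the spaced string from the question rather than the separate parts, the Split result can simply be joined back together. A minimal sketch, reusing the same three-rule pattern; the class and method names (SplitJoinDemo, ToWords) are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitJoinDemo
{
    // Same three rules as the Split example above.
    private static readonly Regex Boundary = new Regex(
        @"  (?<=[A-Z])(?=[A-Z][a-z])     # UC before me, UC lc after me
          | (?<=[^A-Z])(?=[A-Z])         # Not UC before me, UC after me
          | (?<=[A-Za-z])(?=[^A-Za-z])   # Letter before me, non letter after me
        ",
        RegexOptions.IgnorePatternWhitespace);

    public static string ToWords(string input)
    {
        // Split at every boundary position, then rejoin with single spaces.
        return string.Join(" ", Boundary.Split(input));
    }

    public static void Main()
    {
        Console.WriteLine(ToWords("AutomaticTrackingSystem")); // Automatic Tracking System
        Console.WriteLine(ToWords("AnXMLAndXSLT2.0Tool"));     // An XML And XSLT 2.0 Tool
    }
}
```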
Just use this LINQ one-liner (it works perfectly for me):
public static string SpaceCamelCase(string input)
{
return input.Aggregate(string.Empty, (old, x) => $"{old}{(char.IsUpper(x) ? " " : "")}{x}").TrimStart(' ');
}
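Two things are worth knowing about that one-liner: it puts a space before every uppercase character, so acronyms are split letter by letter, and Aggregate builds a new string for each character. A rough equivalent with StringBuilder might look like the sketch below; the names are hypothetical, and it is only meant to match the one-liner for inputs that do not start with a space.

```csharp
using System;
using System.Text;

public static class SpaceBeforeCapitalsDemo
{
    public static string SpaceBeforeCapitals(string input)
    {
        var sb = new StringBuilder(input.Length * 2);
        foreach (char c in input)
        {
            // Add a space before each uppercase character except the first one.
            if (char.IsUpper(c) && sb.Length > 0)
            {
                sb.Append(' ');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(SpaceBeforeCapitals("AutomaticTrackingSystem")); // Automatic Tracking System
        Console.WriteLine(SpaceBeforeCapitals("XMLEditor"));               // X M L Editor
    }
}
```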
Apparently, there is an option for matching a regex in reverse: RegexOptions.RightToLeft :-) That eliminates the string reversal, so here's another way to do it:
using System;
using System.Linq;
using System.Text.RegularExpressions;
class MainClass
{
public static void Main (string[] args)
{
Regex ry = new Regex
(@"([A-Z][a-z]+|[A-Z]+[A-Z]|[A-Z]|[^A-Za-z]+[^A-Za-z])",
RegexOptions.RightToLeft);
string[] tests = {
"AutomaticTrackingSystem",
"XMLEditor",
"AnXMLAndXSLT2.0Tool",
"NumberOfABCDThings",
"AGoodMan",
"CodeOfAGoodMan"
};
foreach(string t in tests)
{
// Every match gets a space in front, so trim the one before the first word.
Console.WriteLine("\n\n{0} -- {1}", t, ry.Replace(t, " $1").TrimStart());
}
}
}
Output:
AutomaticTrackingSystem -- Automatic Tracking System
XMLEditor -- XML Editor
AnXMLAndXSLT2.0Tool -- An XML And XSLT 2.0 Tool
NumberOfABCDThings -- Number Of ABCD Things
AGoodMan -- A Good Man
CodeOfAGoodMan -- Code Of A Good Man