I need for text like \"joe ($3,004.50)\" to be filtered down to 3004.50 but am terrible at regex and can\'t find a suitable solution. So only numbers and periods should stay -
This should do it:
string s = "joe ($3,004.50)";
s = Regex.Replace(s, "[^0-9.]", "");
For the accepted answer, MatthewGunn raises a valid point in that all digits, commas, and periods in the entire string will be condensed together. This will avoid that:
string s = "joe.smith ($3,004.50)";
Regex r = new Regex(@"(?:^|[^w.,])(\d[\d,.]+)(?=\W|$)/)");
Match m = r.match(s);
string v = null;
if (m.Success) {
v = m.Groups[1].Value;
v = Regex.Replace(v, ",", "");
}
The regex is:
[^0-9.]
You can cache the regex:
Regex not_num_period = new Regex("[^0-9.]")
then use:
string result = not_num_period.Replace("joe ($3,004.50)", "");
However, you should keep in mind that some cultures have different conventions for writing monetary amounts, such as: 3.004,50.
The approach of removing offending characters is potentially problematic. What if there's another .
in the string somewhere? It won't be removed, though it should!
Removing non-digits or periods, the string joe.smith ($3,004.50)
would transform into the unparseable .3004.50
.
Imho, it is better to match a specific pattern, and extract it using a group. Something simple would be to find all contiguous commas, digits, and periods with regexp:
[\d,\.]+
Sample test run:
Pattern understood as:
[\d,\.]+
Enter string to check if matches pattern
> a2.3 fjdfadfj34 34j3424 2,300 adsfa
Group 0 match: "2.3"
Group 0 match: "34"
Group 0 match: "34"
Group 0 match: "3424"
Group 0 match: "2,300"
Then for each match, remove all commas and send that to the parser. To handle case of something like 12.323.344
, you could do another check to see that a matching substring has at most one .
.
You are dealing with a string - string is an IEumerable<char>
, so you can use LINQ:
var input = "joe ($3,004.50)";
var result = String.Join("", input.Where(c => Char.IsDigit(c) || c == '.'));
Console.WriteLine(result); // 3004.50