I need a regular expression that I can use in VBScript and .NET that will return only the numbers that are found in a string.
For example, any of the following "str
As an alternative to the main .NET solution, adapted from an answer to a similar question:
string justNumbers = string.Concat(text.Where(char.IsDigit));
Have you gone through the phone number category on RegExLib? Quite a few of the patterns there seem to do what you need.
Note: you've only solved half the problem here.
For US phone numbers entered "in the wild", you may have:
You'll need to add some smarts to your code to conform the resulting list of digits to a single standard that you actually search against in your database.
Some simple things you could do to fix this:
Before the RegEx removal of non-digits, see if there's an "x" in the string. If there is, chop off the "x" and everything after it (this handles most ways of writing an extension number).
For any number with more than 10 digits beginning with a "1", chop off the 1. It's not part of the area code, because US area codes start in the 2xx range.
For any number still exceeding 10 digits, assume the remainder is an extension of some sort, and chop it off.
Do your database search using an "ends-with" pattern search (SELECT * FROM mytable WHERE phonenumber LIKE '%blah'). This handles situations (although with some possibility of error) where the area code is not provided but your database stores the number with the area code.
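The clean-up steps above could be sketched in C# roughly as follows. This is only an illustration, not code from any of the answers: the class name PhoneNormalizer and the method Normalize are made up, "x" is assumed to be the only extension marker, and the country-code rule is read as applying only when more than 10 digits are present.

```csharp
using System;
using System.Linq;

public static class PhoneNormalizer
{
    // Hypothetical helper: reduces a US phone number to at most its
    // 10-digit core, following the steps listed above.
    public static string Normalize(string input)
    {
        if (string.IsNullOrEmpty(input)) return "";

        // Step 1: cut at the first "x" (covers "x", "ext" and "extension").
        int xPos = input.IndexOf("x", StringComparison.OrdinalIgnoreCase);
        string main = xPos >= 0 ? input.Substring(0, xPos) : input;

        // Keep ASCII digits only.
        string digits = new string(main.Where(c => c >= '0' && c <= '9').ToArray());

        // Step 2: drop a leading US country code "1" when more than
        // 10 digits remain (assumption: no area code starts with 1).
        if (digits.Length > 10 && digits[0] == '1')
            digits = digits.Substring(1);

        // Step 3: treat anything beyond 10 digits as an extension and drop it.
        if (digits.Length > 10)
            digits = digits.Substring(0, 10);

        return digits;
    }
}
```

With this sketch, PhoneNormalizer.Normalize("1 (226) 123-4567 ext. 405") yields "2261234567", which can then be used as the search value in the "ends-with" query. Numbers shorter than 10 digits are returned as entered, so the query still has to cope with a missing area code.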
Regarding the points made by richardtallent, this code handles most of the issues with extension numbers and with a prepended US country code (+1).
Not the most elegant solution, but I had to quickly solve the problem so I could move on with what I'm doing.
I hope it helps someone.
Public Shared Function JustNumbers(inputString As String) As String
    Dim outString As String = ""
    Dim nEnds As Integer = -1

    ' Walk the string and test the ASCII code of each character. Drop everything
    ' non-numeric except "x", which marks an extension:
    ' 331-123-3451 extension 405 becomes 3311233451x405
    ' 226-123-4567 ext 405 becomes 2261234567x405
    ' 226-123-4567 x 405 becomes 2261234567x405
    For l = 1 To inputString.Length
        Dim tmp As String = Mid(inputString, l, 1)
        If (Asc(tmp) >= 48 And Asc(tmp) <= 57) Then
            outString &= tmp
        ElseIf Asc(tmp.ToLower) = 120 Then
            ' Record how many digits precede the first extension marker
            If nEnds = -1 Then nEnds = outString.Length
            outString &= tmp
        End If
    Next

    ' Remove the leading US country code 1 after doing some validation
    If outString.Length > 0 Then
        If Strings.Left(outString, 1) = "1" Then
            ' If nEnds is still -1, no extension was found, so the number spans the whole string;
            ' otherwise nEnds is the count of digits before the extension.
            If nEnds = -1 Then nEnds = outString.Length
            ' More than 10 digits means something precedes the area code.
            ' Remove the leading 1 in case someone entered the US country code.
            ' This is safe because no US area code starts with 1; the first digit is 2-9.
            If nEnds > 10 Then
                outString = outString.Substring(1)
            End If
        End If
    End If

    Debug.Print(inputString & " : became : " & outString)
    Return outString
End Function
The simplest solution, without a regular expression:
    public string DigitsOnly(string s)
    {
        string res = "";
        for (int i = 0; i < s.Length; i++)
        {
            if (Char.IsDigit(s[i]))
                res += s[i];
        }
        return res;
    }
In .NET, you could extract just the digits from the string like this:
string justNumbers = new String(text.Where(Char.IsDigit).ToArray());
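Since the question asked for a regular expression, here is a rough side-by-side of a regex version and an ASCII-only LINQ version. The class and method names (DigitFilters, WithRegex, WithLinq) are invented for the example. One thing worth knowing is that Char.IsDigit also returns true for non-ASCII decimal digits (Arabic-Indic digits, for instance), so if the result has to contain only 0-9, an explicit range check may be the safer choice.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DigitFilters
{
    // Regex variant: remove every character that is not an ASCII digit.
    public static string WithRegex(string text) =>
        Regex.Replace(text, "[^0-9]", "");

    // LINQ variant restricted to ASCII digits, unlike Char.IsDigit,
    // which also accepts other Unicode decimal digits.
    public static string WithLinq(string text) =>
        new string(text.Where(c => c >= '0' && c <= '9').ToArray());
}
```

Both calls turn "(226) 123-4567" into "2261234567". The pattern [^0-9] is plain enough that it should also work with the VBScript RegExp object, provided Global is set to True before calling Replace.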