I don't know which language you are using, so the exact syntax may need adjusting.
This should match all of your groups with very few false positives:
/\(?([0-9]{3})\)?([ .-]?)([0-9]{3})\2([0-9]{4})/
The groups you will be interested in after the match are groups 1, 3, and 4. Group 2 exists only to make sure the first and second separator characters (space, ".", or "-") are the same.
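As a rough illustration of how the groups come out, here is a minimal sketch in Python. Python is only an assumption, since the question does not name a language, and the helper name split_phone is made up for this example. The pattern is the one above, written without the /.../ delimiters.

```python
import re

# Same pattern as above, as a Python raw string.
PHONE = re.compile(r"\(?([0-9]{3})\)?([ .-]?)([0-9]{3})\2([0-9]{4})")

def split_phone(text):
    """Return the three digit groups of the first match in text, or None."""
    m = PHONE.search(text)
    if m is None:
        return None
    # Group 2 is only the separator; groups 1, 3 and 4 hold the digits.
    return m.group(1), m.group(3), m.group(4)

print(split_phone("(123) 456 7890"))  # -> ('123', '456', '7890')
print(split_phone("123.456.7890"))    # -> ('123', '456', '7890')
print(split_phone("123-456.7890"))    # -> None (the two separators differ)
```

The last call shows what the \2 backreference buys you: a number with mixed separators is rejected instead of being accepted.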
For example, a sed command that strips the parentheses and separators and leaves phone numbers in the form 1234567890:
sed "s/(\{0,1\}\([0-9]\{3\}\))\{0,1\}\([ .-]\{0,1\}\)\([0-9]\{3\}\)\2\([0-9]\{4\}\)/\1\3\4/"
Here are the false positives of my expression:
- (123)4567890
- (1234567890
- (123 456 7890
- (123.456.7890
- (123-456-7890
- 123)4567890
- 123) 456 7890
- 123).456.7890
- 123)-456-7890
Breaking up the expression into two parts, one that requires both parentheses and one that allows none, will eliminate all of these false positives except for the first one:
/\(([0-9]{3})\)([ .-]?)([0-9]{3})\2([0-9]{4})|([0-9]{3})([ .-]?)([0-9]{3})\5([0-9]{4})/
In this case the digits are in groups 1, 3, and 4 when the parenthesized alternative matches, or in groups 5, 7, and 8 otherwise.