In a PHP script, what regex should I use to check for mismatched parentheses in a string? Things that I want to allow include:
Regex is not the right tool for the job. Scan a string manually.
Pseudo-code:
depth = 0
for character in some_string:
depth += character == '('
depth -= character == ')'
if depth < 0:
break
if depth != 0:
print "unmatched parentheses"
You can do this with a regular expression -- PCRE, as used by PHP, allows recursive patterns. The PHP Manual gives an example that is almost exactly what you want:
\(((?>[^()]+)|(?R))*\)
This matches any correctly parenthesised substring as long as it begins and ends with parentheses. If you want to ensure the entire string is balanced, allowing strings like "wiggedy(wiggedy)(wiggedy(wack))", here's what I came up with:
^((?:[^()]|\((?1)\))*+)$
Here's an explanation of the pattern, which may be more illuminating than obfuscatory:
^ Beginning of the string ( Start the "balanced substring" group (to be called recursively) (?: Start the "minimal balanced substring" group [^()] Minimal balanced substring is either a non-paren character | or \((?1)\) a set of parens containing a balanced substring ) Finish the "minimal balanced substring" group * Our balanced substring is a maximal sequence of minimal balanced substrings + Don't backtrack once we've matched a maximal sequence ) Finish the "balanced substring" pattern $ End of the string
There are lots of considerations of efficiency and correctness that come up with these sorts of regexes. Be careful.