Is there a Regex-like that is capable of parsing matching symbols?

天涯浪人 2021-01-23 10:08

This regular expression

/\(.*\)/

won't stop at the matching closing parenthesis; it runs to the last closing parenthesis in the string. Is there a regular expression

3 answers
  •  再見小時候
    2021-01-23 10:48

    If you only have one level of parentheses, then there are two possibilities.

    Option 1: use ungreedy repetition:

    /\(.*?\)/
    

    This will stop when it encounters the first ).
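To make the difference concrete, here is a minimal sketch using Python's `re` module (the sample string and variable names are made up for illustration; the patterns themselves are the ones from the question and from Option 1):

```python
import re

text = "a (b) c (d) e"  # illustrative input, not from the question

# Greedy: .* first runs to the end of the string, then backtracks
# to the LAST ")" it can find.
greedy = re.findall(r"\(.*\)", text)

# Ungreedy: .*? grows one character at a time and stops at the FIRST ")".
lazy = re.findall(r"\(.*?\)", text)

print(greedy)  # ['(b) c (d)']
print(lazy)    # ['(b)', '(d)']
```

The `?` after `*` should behave the same way in most Perl-style flavors, though the exact flavor in use is worth checking.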

    Option 2: use a negated character class:

    /\([^)]*\)/
    

    This can only repeat characters that are not ), so it can never go past the first closing parenthesis. This option is usually preferred for performance reasons, because the engine does not have to backtrack. In addition, it is more easily extended to allow for escaped parentheses, so that you could match the complete string (some\)thing) instead of stopping at the escaped parenthesis and throwing away thing). But this is probably rarely necessary.
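As a rough illustration of the escaping extension mentioned above, here is a sketch in Python. The second pattern is an assumption about how such an extension could look (a backslash plus any one character is consumed as a unit); it is not taken from the answer itself.

```python
import re

# Negated class: [^)]* can never consume a ")", so the match always
# ends at the first closing parenthesis.
plain = re.compile(r"\([^)]*\)")

# Hypothetical extension: either an ordinary character (not ")" and not
# a backslash) or a backslash followed by any single character.
escaped = re.compile(r"\((?:[^)\\]|\\.)*\)")

sample = r"x (some\)thing) y"  # illustrative input

plain_match = plain.search(sample).group(0)
escaped_match = escaped.search(sample).group(0)

print(plain_match)    # (some\)
print(escaped_match)  # (some\)thing)
```

The plain pattern stops at the escaped parenthesis, while the extended one skips over it and ends at the real closing parenthesis.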

    However, if you want to match nested structures, this is generally too complicated for regex (although some flavors, such as PCRE, support recursive patterns). In that case, you should just walk through the string yourself and count parentheses to keep track of the current nesting level.
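The counting approach could look roughly like the following Python sketch. The function name `top_level_groups` is hypothetical. As a simplifying assumption, a stray ")" is ignored, and an unclosed "(" hides everything after it.

```python
def top_level_groups(text):
    """Return every outermost balanced "(...)" substring of text."""
    groups = []
    depth = 0      # current nesting level
    start = None   # index of the "(" that opened the current outer group
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                # Back at level 0: an outermost group just closed.
                groups.append(text[start:i + 1])
    return groups


s = "there are (many (things (on) the)) box (except (carrots (and apples)))"
result = top_level_groups(s)
print(result)
# ['(many (things (on) the))', '(except (carrots (and apples)))']
```

This is a single pass over the string with one integer of state, so it works in any language and does not depend on a regex flavor that supports recursion.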

    A side note about those recursive patterns: in PCRE, (?R) stands for the whole pattern, so inserting it somewhere makes the whole pattern recursive. But then the contents of every pair of parentheses must have the same structure as the whole match. Also, it is not really possible to do meaningful one-step replacements with this, or to use capturing groups on multiple nesting levels. All in all, you are best off not using regular expressions for nested structures.

    Update: Since you seem eager to find a regex solution, here is how you would match your example using PCRE (example implementation in PHP):

    $str = 'there are (many (things (on) the)) box (except (carrots (and apples)))';
    preg_match_all('/\([^()]*(?:(?R)[^()]*)*\)/', $str, $matches);
    print_r($matches);
    

    results in

    Array
    (
        [0] => Array
            (
                [0] => (many (things (on) the))
                [1] => (except (carrots (and apples)))
            )   
    )
    

    What the pattern does:

    \(      # opening bracket
    [^()]*  # arbitrarily many non-bracket characters
    (?:     # start a non-capturing group for later repetition
    (?R)    # recursion! (match any nested brackets)
    [^()]*  # arbitrarily many non-bracket characters
    )*      # close the group and repeat it arbitrarily many times
    \)      # closing bracket
    

    This allows for arbitrarily deep nesting and for arbitrarily many parallel groups (in practice, bounded by the engine's recursion limit).

    Note that it is not possible to get all nested levels as separate captured groups. You will always get just the innermost or the outermost group. Also, a recursive replacement is not possible this way.
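If every nesting level is needed, a manual scan with a stack is one way to get them. The sketch below is in Python; `all_groups` is a hypothetical helper, and unmatched parentheses are simply skipped as a simplifying assumption.

```python
def all_groups(text):
    """Return (depth, substring) for every balanced "(...)" in text."""
    found = []
    stack = []  # indices of "(" characters that are still open
    for i, ch in enumerate(text):
        if ch == "(":
            stack.append(i)
        elif ch == ")" and stack:
            start = stack.pop()
            # After the pop, len(stack) counts the enclosing groups,
            # so this group's own depth is one more than that.
            found.append((len(stack) + 1, text[start:i + 1]))
    return found


pairs = all_groups("a (b (c) d) e")
print(pairs)  # [(2, '(c)'), (1, '(b (c) d)')]
```

Groups are reported in the order in which they close, so inner groups come before the groups that contain them. Recording `start` and `i` instead of the substring would also give the positions needed for a replacement.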
