Match the body of a function using Regex

Posted by 拜拜、爱过 on 2019-12-01 08:39:08

Update #2

Following others' comments:

^\s*[\w\s]+\(.*\)\s*\K({((?>"(?:[^"\\]*+|\\.)*"|'(?:[^'\\]*+|\\.)*'|//.*$|/\*[\s\S]*?\*/|#.*$|<<<\s*["']?(\w+)["']?[^;]+\3;$|[^{}<'"/#]++|[^{}]++|(?1))*)})

Note: A short RegEx, i.e. {((?>[^{}]++|(?R))*)}, is enough if you know your input never contains { or } outside of PHP syntax (for example inside strings or comments).
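To make the limitation of the short pattern concrete, here is a minimal Python sketch of the same idea written as a depth counter instead of a recursive regex (Python's built-in `re` module does not support `(?R)`). The function name `balanced_block` and both sample strings are made up for illustration. It counts every brace it sees, so, like the short pattern, it is fooled by a brace inside a string.

```python
def balanced_block(text, start=0):
    """Return the text from the first '{' at or after `start` to its matching '}'.

    Every brace is counted, including braces inside strings and comments,
    which is the limitation the short pattern shares.
    Returns None when no balanced block is found.
    """
    open_pos = text.find("{", start)
    if open_pos == -1:
        return None
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[open_pos:i + 1]
    return None  # braces never balanced


clean = "function f() { if ($a) { return 1; } return 2; }"
tricky = 'function g() { echo "}"; return 1; }'

print(balanced_block(clean))   # the whole body, nested block included
print(balanced_block(tricky))  # cut short at the '}' inside the string
```

On `tricky` the result stops at the brace inside the quoted string, which is the kind of input the long pattern is written to survive.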

So which evil cases does the long RegEx handle?

  1. You have [{}] in a string between quotation marks ["']
  2. You have those quotation marks escaped inside one another
  3. You have [{}] in a comment block. //... or /*...*/ or #...
  4. You have [{}] in a heredoc or nowdoc <<<STR or <<<['"]STR['"]

Beyond those cases, the body only needs a balanced pair of opening/closing braces; the depth of nested braces does not matter.
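The cases listed above can be sketched outside of regex as a small hand-written scanner. This is not the pattern itself, only an illustration in Python of what each alternative is there to skip; `function_body` and the sample `source` are hypothetical names, and heredocs/nowdocs (case 4) are deliberately left out to keep the sketch short.

```python
def function_body(text, start=0):
    """Return the brace block opening at the first '{' at or after `start`.

    Braces inside quoted strings and //, # or /* */ comments are ignored.
    Assumption: `start` is at or before the body's opening brace.
    Heredocs and nowdocs are NOT handled in this sketch.
    """
    i = text.find("{", start)
    if i == -1:
        return None
    open_pos, depth, n = i, 0, len(text)
    while i < n:
        ch = text[i]
        if ch in "\"'":
            # Quoted string: advance to the closing quote, honouring escapes.
            i += 1
            while i < n and text[i] != ch:
                i += 2 if text[i] == "\\" else 1
        elif ch == "#" or text.startswith("//", i):
            # Line comment: advance to the end of the line.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            # Block comment: jump to the last character of the closing */.
            end = text.find("*/", i + 2)
            if end == -1:
                return None
            i = end + 1
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[open_pos:i + 1]
        i += 1
    return None  # braces never balanced


source = '''function demo() {
    $a = "} not a brace \\" still in string";
    // } neither is this
    /* { nor this */
    if ($a) { return '{'; }
}
function other() {}'''

print(function_body(source))
```

The printed block runs from the opening brace of `demo` to its real closing brace; the braces in the string, the two comments and the nested `if` do not end it early.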

Is there a case where it fails?

Not unless a Martian lives inside your code.

 ^ \s* [\w\s]+ \( .* \) \s* \K               # how it matches a function definition
 (                             # (1 start)
      {                                      # opening brace
      (                             # (2 start)
           (?>                               # atomic grouping (for its non-capturing purpose only)
                "(?: [^"\\]*+ | \\ . )*"     # double quoted strings
             |  '(?: [^'\\]*+ | \\ . )*'     # single quoted strings
             |  // .* $                      # a comment block starting with //
             |  /\* [\s\S]*? \*/             # a multi line comment block /*...*/
             |  \# .* $                      # a single line comment block starting with #...
             |  <<< \s* ["']?                # heredocs and nowdocs
                ( \w+ )                      # (3) ^
                ["']? [^;]+ \3 ; $           # ^
             |  [^{}<'"/#]++                 # force engine to backtrack if it encounters special characters [<'"/#] (possessive)
             |  [^{}]++                      # default matching behaviour (possessive)
             |  (?1)                         # recurse 1st capturing group
           )*                                # zero to many times of atomic group
      )                             # (2 end)
      }                                      # closing brace
 )                             # (1 end)

Formatting is done by @sln's RegexFormatter software.

What did I provide in the live demo?

Laravel's Eloquent Model.php file (~3500 lines), picked at random, is given as input. Check it out: Live demo

This works to generate header file (.h) prototypes from the function definitions in a C source file (.c).

Find Regular expression:

(void\s[^{};]*)\n^\{($[^}$]*)\}$

Replace with:

$1;

For input:

void bar(int var)
{
    foo(var);
    foo2();
}

will output:

void bar(int var);
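As a quick check of the find/replace pair above, here is a minimal sketch using Python's `re` module. It assumes the pattern is applied in multi-line mode (so `^` and `$` match at line boundaries) and that the opening brace sits alone on its line with nothing after it; the names `PROTOTYPE` and `c_source` are made up for the example.

```python
import re

# The pattern from the answer, unchanged. MULTILINE makes ^ and $ match
# at the start and end of each line instead of only at the ends of the input.
PROTOTYPE = re.compile(r"(void\s[^{};]*)\n^\{($[^}$]*)\}$", re.MULTILINE)

c_source = "void bar(int var)\n{\n    foo(var);\n    foo2();\n}\n"

# Replace each matched definition with its signature plus a semicolon.
header = PROTOTYPE.sub(r"\1;", c_source)
print(header)  # void bar(int var);
```

Python writes the replacement as `\1;` where the editor-style syntax above uses `$1;`.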

To get the body of the function block, use the second capture group:

$2

will output:

    foo(var);
    foo2();
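The same pattern can be tried from Python to pull out the second group. The sketch below again assumes multi-line mode, and the names `PROTOTYPE`, `flat` and `nested` are only for illustration. It also shows a limitation worth keeping in mind: `[^}$]*` stops at the first `}`, so a body containing a nested block is captured only up to that inner brace.

```python
import re

PROTOTYPE = re.compile(r"(void\s[^{};]*)\n^\{($[^}$]*)\}$", re.MULTILINE)

flat = "void bar(int var)\n{\n    foo(var);\n    foo2();\n}\n"
nested = "void baz(int v)\n{\n    if (v) {\n        foo();\n    }\n}\n"

# Group 2 holds everything between the braces, surrounding newlines included.
body = PROTOTYPE.search(flat).group(2).strip("\n")
print(body)

# With a nested block, the capture ends at the inner '}' rather than the outer one.
truncated = PROTOTYPE.search(nested).group(2)
print(repr(truncated))
```

For bodies with nested braces, a balanced-brace approach is the safer choice.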