With regular expressions I'm trying to remove all the methods/functions from the following code, leaving the "global scope" alone. However, I can't manage to make it mat
You can't do this properly with regex. You need to write a parser that can properly parse comments, string literals and nested brackets.
Regex cannot cope with these cases:
class Hello
{
    function foo()
    {
        echo '} <- that is not the closing bracket!';
        // and this: } bracket isn't the closing bracket either!
        /*
        } and that one isn't as well...
        */
    }
}
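To make the problem concrete, here is a minimal sketch that compares a plain character count of `}` with the count of `}` that PHP's own tokenizer reports as real code. The sample source in `$code` and the variable names are assumptions made for illustration; only `token_get_all()` and `substr_count()` are built-in functions.

```php
<?php
// Hypothetical sample mirroring the class above, kept in a nowdoc so
// nothing inside it is interpolated.
$code = <<<'PHP'
<?php
class Hello
{
    function foo()
    {
        echo '} <- that is not the closing bracket!';
        // and this: } bracket isn't the closing bracket either!
        /*
        } and that one isn't as well...
        */
    }
}
PHP;

// What a purely text-based approach sees: every '}' character.
$rawClosing = substr_count($code, '}');

// What the tokenizer sees: '}' only where it is an actual code token.
// Brackets inside strings and comments are part of larger array tokens.
$realClosing = 0;
foreach (token_get_all($code) as $token) {
    if ($token === '}') {
        $realClosing++;
    }
}

echo "raw: $rawClosing, real: $realClosing\n"; // raw: 5, real: 2
```

The three extra brackets live inside a string literal, a single-line comment and a multi-line comment, which is exactly the context a regex has no reliable way to track.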
EDIT
Here's a little demo of how to use the tokenizer function mentioned by XUE Can:
$source = <<<BLOCK
<?php
global();
function asodaosdo() {
}
?>
BLOCK;
if (!defined('T_ML_COMMENT')) {
    define('T_ML_COMMENT', T_COMMENT);
}
else {
    define('T_DOC_COMMENT', T_ML_COMMENT);
}
// Tokenize the source
$tokens = token_get_all($source);
// Some flags and counters
$tFunction = false;
$functionBracketBalance = 0;
$buffer = '';
// Iterate over all tokens
foreach ($tokens as $token) {
    // Single-character tokens.
    if(is_string($token)) {
        if(!$tFunction) {
            echo $token;
        }
        if($tFunction && $token == '{') {
            // Increase the bracket-counter (not the class-brackets: `$tFunction` must be true!)
            $functionBracketBalance++;
        }
        if($tFunction && $token == '}') {
            // Decrease the bracket-counter (not the class-brackets: `$tFunction` must be true!)
            $functionBracketBalance--;
            if($functionBracketBalance == 0) {
                // If it's the closing bracket of the function, reset `$tFunction`
                $tFunction = false;
            }
        }
    }
    // Tokens consisting of (possibly) more than one character.
    else {
        list($id, $text) = $token;
        switch ($id) {
            case T_PUBLIC:
            case T_PROTECTED:
            case T_PRIVATE:
                // Don't immediately echo 'public', 'protected' or 'private'
                // before we know if it's part of a variable or method.
                $buffer = "$text ";
                break;
            case T_WHITESPACE:
                // Only display spaces if we're outside a function.
                if(!$tFunction) echo $text;
                break;
            case T_FUNCTION:
                // If we encounter the keyword 'function', flip the `tFunction` flag to
                // true and reset the `buffer`
                $tFunction = true;
                $buffer = '';
                break;
            case T_CURLY_OPEN:
            case T_DOLLAR_OPEN_CURLY_BRACES:
                // '{$' and '${' inside a string arrive as array tokens, but their
                // closing '}' is a single-character token: count the opening one
                // as well, so the bracket-counter stays balanced.
                if($tFunction) {
                    $functionBracketBalance++;
                }
                else {
                    echo $text;
                }
                break;
            default:
                // Echo all other tokens if we're not in a function and prepend a possible
                // 'public', 'protected' or 'private' previously put in the `buffer`.
                if(!$tFunction) {
                    echo "$buffer$text";
                    $buffer = '';
                }
        }
    }
}
which will print:
<?php
global();
?>
which is the original source, only without functions.
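The loop above branches on `is_string($token)` because `token_get_all()` returns two kinds of entries. A small sketch of that, using a made-up one-line input and a hypothetical `$names` array purely for illustration:

```php
<?php
// Hypothetical one-line input, only meant to show the two token shapes.
$tokens = token_get_all('<?php function f() { return 1; }');

$names = [];
foreach ($tokens as $token) {
    if (is_string($token)) {
        // Single-character tokens such as '(', '{' or ';' are plain strings.
        $names[] = $token;
    } else {
        // Everything else is an array: [token id, source text, line number].
        $names[] = token_name($token[0]);
    }
}

echo implode(' ', $names), "\n";
```

Brackets that belong to code show up as bare `{` and `}` strings, while keywords, identifiers, whitespace, string literals and comments arrive as arrays whose id can be turned into a readable name with `token_name()`.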