I need to get the values inside some tags in a comment in a PHP file like this:
php code
/* this is a comment
!-
<titulo>titulo3</titulo>
<funcion>
<descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
<funcion>
<descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
<otros>
<descripcion>comentario de otros 2a hoja</descripcion>
</otros>
-!
*/
some php code
So, as you can see, the file has newlines and repeated tags like <funcion></funcion>,
and I need to get every single one of them, so I was trying something like this:
preg_match_all("/(<funcion>)(.*)(<\/funcion>)/s",$file,$matches);
This example works across the newlines, but it's greedy, so I searched and found these two solutions:
preg_match_all("/(<funcion>)(.*?)(<\/funcion>)/s",$file,$matches);
preg_match_all("/(<funcion>)(.*)(<\/funcion>)/sU",$file,$matches);
But neither of them works for me, and I don't know why.
Try using [\s\S], which means all space and non-space characters, instead of . (the dot). Also, there's no need to put <funcion> and </funcion> in match groups.
/<funcion>([\s\S]*?)<\/funcion>/s
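A minimal sketch of the pattern above, assuming a made-up two-block string in place of the real file contents. It leaves out the /s flag, since [\s\S] already matches newlines on its own:

```php
<?php
// Hypothetical sample; the real input would be the contents of the PHP file.
$text = "<funcion>\nA\n</funcion>\n<funcion>\nB\n</funcion>";

// Only the inner text is captured; the tags themselves are not grouped.
preg_match_all('/<funcion>([\s\S]*?)<\/funcion>/', $text, $matches);

// Strip the newlines that surround each captured value.
$contents = array_map('trim', $matches[1]);
print_r($contents);
```

Here $contents holds one entry per <funcion> block, in document order.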
Also, keep in mind that the best way to do this is to parse the XML with an XML parser. Even if the file is not an XML document, as you mentioned in your comment, extract the part that should be parsed and run the XML parser on that.
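One way the parser approach could look, as a sketch only. It assumes the SimpleXML extension is available, that the "!-" and "-!" markers always delimit the block, and that the fragment is well-formed once wrapped in a made-up <root> element; $source is a hypothetical stand-in for the file contents:

```php
<?php
// Hypothetical sample standing in for the contents of the PHP file.
$source = <<<'EOT'
/* this is a comment
!-
<titulo>titulo3</titulo>
<funcion>
<descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
<funcion>
<descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
<otros>
<descripcion>comentario de otros 2a hoja</descripcion>
</otros>
-!
*/
EOT;

$descriptions = [];

// Grab only the text between the "!-" and "-!" markers.
if (preg_match('/!-(.*?)-!/s', $source, $block)) {
    // Wrap the fragment in an invented <root> element so it has a single root.
    $xml = simplexml_load_string('<root>' . $block[1] . '</root>');
    if ($xml !== false) {
        // Iterate over every <funcion> child and read its <descripcion>.
        foreach ($xml->funcion as $funcion) {
            $descriptions[] = (string) $funcion->descripcion;
        }
    }
}

print_r($descriptions);
```

The regex is only used to cut the block out of the comment; reading the tags is left to the parser, so nested or repeated elements need no extra pattern work.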
This expression from your question:
preg_match_all("/(<funcion>)(.*?)(<\/funcion>)/s", $file, $matches);
print_r($matches);
This will work, but ONLY IF $file is a string containing the XML; if it's a file name, you have to get the contents first:
preg_match_all("/(<funcion>)(.*?)(<\/funcion>)/s", file_get_contents($file), $matches);
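To make the difference concrete, here is a small sketch that runs the same lazy pattern against a file name and against that file's contents. The temporary file and its two-block body are invented for the demonstration:

```php
<?php
// Hypothetical temp file standing in for the real PHP source file.
$file = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents(
    $file,
    "/*\n!-\n<funcion>\nA\n</funcion>\n<funcion>\nB\n</funcion>\n-!\n*/"
);

$pattern = '/<funcion>(.*?)<\/funcion>/s';

// Wrong: this searches the path string itself, so nothing matches.
$onPath = preg_match_all($pattern, $file, $m1);

// Right: read the file first, then search its contents.
$onContents = preg_match_all($pattern, file_get_contents($file), $m2);

unlink($file); // clean up the temp file

echo $onPath, ' ', $onContents, "\n";
```

With this sample the first call finds no matches and the second finds both blocks, which is one plausible reason the lazy pattern appeared not to work.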
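Also, keep in mind that PCRE limits how much it will backtrack (the pcre.backtrack_limit setting). A lazy pattern run over a large input can hit that limit, in which case preg_match_all() returns false instead of a match count.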
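A sketch of how that failure could be detected rather than silently ignored. The helper name extractFunciones is made up for this example; preg_last_error() and PREG_BACKTRACK_LIMIT_ERROR are the standard PHP facilities for checking what went wrong:

```php
<?php
// Hypothetical helper: returns the contents of every <funcion> block,
// or throws if PCRE gave up instead of finishing the search.
function extractFunciones(string $subject): array
{
    $count = preg_match_all('/<funcion>(.*?)<\/funcion>/s', $subject, $matches);

    if ($count === false) {
        // false means an engine error, not "zero matches".
        $reason = preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR
            ? 'backtrack limit reached (see pcre.backtrack_limit)'
            : 'PCRE error code ' . preg_last_error();
        throw new RuntimeException($reason);
    }

    // Strip the newlines that surround each captured value.
    return array_map('trim', $matches[1]);
}

$result = extractFunciones("<funcion>\nA\n</funcion>\n<funcion>\nB\n</funcion>");
print_r($result);
```

The small sample here does not trigger the limit; the point is only that a false return value is distinguished from an empty result.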
Try this:
/<funcion>((.|\n)*?)<\/funcion>/i
For example:
$string = "<titulo>titulo3</titulo>
<funcion>
<descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
<funcion>
<descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
<otros>
<descripcion>comentario de otros 2a hoja</descripcion>
</otros>";
$result = preg_match_all('/<funcion>((.|\n)*?)<\/funcion>/i', $string, $m);
print_r($m[0]);
This outputs the full matches, tags included (a browser hides the tags unless you view the page source); the inner contents alone are in $m[1]:
Array
(
    [0] => <funcion>
<descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
    [1] => <funcion>
<descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
)
If the structure is exactly like that (the content always indented inside the tags), you can easily match it with /\n[\s]+([^\n]+(\n[\s]+)*)\n/.
I tend to avoid "lazy" ("non-greedy") modifiers. They look like a hack to me, and they are not available everywhere, nor implemented the same way where they are. Since you don't seem to need one in this case, I would suggest not using it.
Try this:
$regexp = '/<funcion>\n[\s]+([^\n]+(\n[\s]+)*)\n<\/funcion>/'; // the slash in </funcion> must be escaped because / is the delimiter
$works = preg_match_all($regexp, $file, $matches);
echo '<pre>';
print_r($matches);
The $matches[1] array will hold the contents of the "funcion" tags.
Of course, it would be better to pre-filter the file and apply the regex to the comment contents only, to avoid any false matches.
Have fun.
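The pre-filtering step could be sketched as follows. This assumes the block is always delimited by "!-" and "-!", that the inner lines are indented while the closing tag is not, and that line endings are plain "\n"; the sample file body is invented:

```php
<?php
// Hypothetical file contents with indented tag bodies.
$file = <<<'EOT'
/* this is a comment
!-
<funcion>
    <descripcion>uno</descripcion>
</funcion>
<funcion>
    <descripcion>dos</descripcion>
</funcion>
-!
*/
echo "some php code";
EOT;

// Pre-filter: keep only the text between the "!-" and "-!" markers.
$block = '';
$start = strpos($file, '!-');
if ($start !== false) {
    $end = strpos($file, '-!', $start + 2);
    if ($end !== false) {
        $block = substr($file, $start + 2, $end - $start - 2);
    }
}

// Greedy-only pattern, with the closing tag's slash escaped.
$regexp = '/<funcion>\n[\s]+([^\n]+(\n[\s]+)*)\n<\/funcion>/';
preg_match_all($regexp, $block, $matches);

print_r($matches[1]);
```

Slicing with strpos() and substr() keeps the pre-filter free of lazy quantifiers as well, in keeping with the preference stated above.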
Source: https://stackoverflow.com/questions/15150175/non-greedy-regex