PHP regex: is there anything wrong with this code?

长发绾君心 2020-12-20 07:16

preg_replace_callback('#<(code|pre)([^>]*)>(((?!#si', 'self::replaceit', $text);

I'm trying to r

2 answers

囚心锁ツ (original poster)

2020-12-20 07:50
is there anything wrong with this code?

Yes. You're trying to parse HTML with a regex. Tsk, tsk, tsk. Let's not summon Zalgo quite yet.

You should be using the DOM.
```
$doc = new DOMDocument();
$doc->loadHTML($text);
$code_tags = $doc->getElementsByTagName('code');
$pre_tags = $doc->getElementsByTagName('pre');
```
This will leave you with two DOMNodeList objects, whose contents you can process however you like. If you're encountering &lt; and friends in the textContent (or when re-serializing the contents using saveXML), and you need the actual tags, consider htmlspecialchars_decode.
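As a rough sketch of what processing those lists might look like: the `$text` value below is made up for illustration (the question's real input isn't shown in full), and the `$results` array is just a convenient place to collect output. Warning suppression through `libxml_use_internal_errors` is optional, but real-world markup tends to be noisy.

```php
<?php
// Hypothetical input; the question's actual $text is not shown in full.
$text = '<p>Intro</p><code>if (a &lt; b) { run(); }</code><pre>x &amp; y</pre>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($text);
libxml_clear_errors();

$results = [];
foreach ($doc->getElementsByTagName('code') as $node) {
    $results[] = [
        // textContent returns the text with entities already decoded.
        'text' => $node->textContent,
        // saveXML re-serializes the node, so entities are escaped again.
        'xml'  => $doc->saveXML($node),
    ];
}

// If the escaped form is not what you want, decode it afterwards.
$decoded = htmlspecialchars_decode($results[0]['xml']);
```

With this input, `text` holds `if (a < b) { run(); }`, `xml` keeps the `&lt;` inside the `<code>` wrapper, and `$decoded` turns it back into a literal `<`.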

Getting the first and last element in $code_tags, which is a DOM Node List:
```
$first_code_tag = $code_tags->item(0);
$last_code_tag = $code_tags->item( $code_tags->length - 1 );
```
While you can iterate over a node list with foreach as if it were an array, it isn't directly indexable, hence the length property and the item method. Be aware that when there's only one item in the list, the first and last node will be identical. Thankfully, you can just check that $code_tags->length is greater than one before reading the last node in addition to the first.
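One way to wrap that length check up, as a sketch only: `first_and_last_nodes` is a hypothetical helper name, not something the DOM extension provides, and the markup passed to `loadHTML` is invented for the demonstration.

```php
<?php
// Hypothetical helper: returns [] for an empty list, [first] for a
// single-item list, and [first, last] otherwise.
function first_and_last_nodes(DOMNodeList $list): array
{
    if ($list->length === 0) {
        return [];
    }
    $nodes = [$list->item(0)];
    // Only add the last node when it is a different node from the first.
    if ($list->length > 1) {
        $nodes[] = $list->item($list->length - 1);
    }
    return $nodes;
}

// Made-up markup: three <code> tags and a single <pre> tag.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<code>a</code><code>b</code><code>c</code><pre>only</pre>');
libxml_clear_errors();

$code_ends = first_and_last_nodes($doc->getElementsByTagName('code'));
$pre_ends  = first_and_last_nodes($doc->getElementsByTagName('pre'));
```

Here `$code_ends` contains two distinct nodes, while `$pre_ends` contains just one, so the same node never gets processed twice.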

I'm not sure this is going to help you. Based on your other questions, it sounds like you're using this methodology to work on BBCode, and that you've turned the square brackets into less-than and greater-than signs. This isn't a problem, mind you, but it might make life interesting.

Try inspecting the output of:
```
echo $doc->saveXML($first_code_tag);
```
to see if it's giving you the content that you expect.