Question
I am trying to develop a PHP script that replaces all divs in an HTML string with paragraphs except those which have attributes (e.g. <div id="1">). The first thing my script currently does is use a simple str_replace() to replace all occurrences of <div> with <p>, and this leaves behind any div tags with attributes and end div tags (</div>). However, replacing the </div> tags with </p> tags is a bit more problematic.
So far, I have developed a preg_replace_callback function that is designed to convert some </div> tags into </p> tags to match the opening <p> tags, but ignore other </div> tags when they are ending a <div> with attributes. Below is the script that I am using:
<?php
$input = "<div>Hello world!</div><div><div id=\"1\">How <div>are you</div> today?</div></div><div>I am fine.</div>";
$input2 = str_replace("<div>", "<p>", $input);
$output = preg_replace_callback("/(<div )|(<\/div>)/", 'replacer', $input2);
function replacer($matches){
    static $count = 0;
    $counter=count($matches);
    for($i=0;$i<$counter;$i++){
        if($matches[$i]=="<div "){
            return "<div ";
            $count++;
        } elseif ($matches[$i]=="</div>"){
            $count--;
            if ($count>=0){
                return "</div>";
            } elseif ($count<0){
                return "</p>";
                $count++;
            }
        }
    }
}
echo $output;
?>
The script basically puts all the remaining <div> and </div> tags into an array and then loops through it. A counter variable is then incremented when it encounters a <div> tag or decremented when it encounters a </div> within the array. When the counter is less than 0, a </p> tag is returned; otherwise a </div> is returned.
The output of the script should be:
<p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p>
Instead, the output I am getting is:
<p>Hello world!</p><p><div id="1">How <p>are you</p> today?</p></p><p>I am fine.</p>
I have spent hours making as many edits to the script as I can think of, and I keep getting the same output. Can anyone explain to me where I am going wrong or offer an alternative solution?
Any help would be appreciated.
Answer 1:
In addition to what mario commented, and comparable to phpQuery or QueryPath, you can use PHP's DOMDocument class to search for the <div> elements in question and replace them with <p> elements.
The cornerstones are the DOM (Document Object Model) and XPath:
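$input = "<div>Hello world!</div><div><div id=\"1\">How <div>are you</div> today?</div></div><div>I am fine.</div>";
$doc = new DOMDocument();
// Wrap the fragment in a container so it can be found again after parsing.
$doc->loadHTML("<div id='body'>{$input}</div>");
$root = $doc->getElementById('body');
$xp = new DOMXPath($doc);
// Matches every descendant <div> that has no id attribute.
$expression = './/div[not(@id)]';
// Query again until nothing matches: nested divs are cloned into the new <p>,
// so each pass can expose further matches.
while (($r = $xp->query($expression, $root)) && $r->length) {
    foreach ($r as $div) {
        $new = $doc->createElement('p');
        foreach ($div->childNodes as $child) {
            $new->appendChild($child->cloneNode(true));
        }
        $div->parentNode->replaceChild($new, $div);
    }
}
// Serialize only the children of the wrapper, not the wrapper itself.
$html = '';
foreach ($root->childNodes as $child) {
    $html .= rtrim($doc->saveHTML($child));
}
echo $html;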
This will give you:
<p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p>
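Note that the expression .//div[not(@id)] only skips divs that carry an id. A minimal sketch, assuming the goal is to keep a div with any attribute at all, swaps in .//div[not(@*)]. The helper name divsToParagraphs and the wrapper id "wrap" are assumptions made for illustration, and the children are moved rather than cloned, so this is a variation on the code above and not the same loop.

```php
<?php
// Hypothetical helper (not part of the original answer): turns attribute-less
// <div> elements into <p> and leaves every <div> with any attribute untouched.
function divsToParagraphs(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // "wrap" is an arbitrary id used only to find the fragment again.
    $doc->loadHTML("<div id=\"wrap\">{$html}</div>");
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($doc);
    $root = $xp->query('//div[@id="wrap"]')->item(0);

    // not(@*) is true only for elements that have no attributes at all.
    while (($div = $xp->query('.//div[not(@*)]', $root)->item(0)) !== null) {
        $p = $doc->createElement('p');
        // Move the children into the new <p>; appendChild detaches them first.
        while ($div->firstChild !== null) {
            $p->appendChild($div->firstChild);
        }
        $div->parentNode->replaceChild($p, $div);
    }

    // Serialize the wrapper's children, trimming the serializer's trailing newline.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= rtrim($doc->saveHTML($child));
    }
    return $out;
}

echo divsToParagraphs('<div>Hi</div><div class="keep"><div>inner</div></div>');
```

Because each pass replaces only the first match and then queries again, the loop never touches a node that has already been detached from the document.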
Answer 2:
I took a different approach with multiple regular expressions:
$text = "<div>Hello world!</div><div><div id=\"1\">How <div>are you</div> today?</div></div><div>I am fine.</div><div>an other <div id=\"2\">small</div>test</div><div>nested<div>divs</div>...</div>";
echo "before: " . $text . "\n";
do
{
    $count1 = 0;
    $text = preg_replace("/<div>((?![^<]*?<div).*?)<\/div>/", "<p>$1</p>", $text, -1, $count1);
    $count2 = 0;
    $text = preg_replace("/<div ([^>]+)>((?![^<]*?<div).*?)<\/div>/", "<temporarytag $1>$2</temporarytag>", $text, -1, $count2);
} while ($count1 + $count2 > 0);
$text = preg_replace("/(<[\/]?)temporarytag/", "$1div", $text);
echo "after: " . $text;
This will get you:
before: <div>Hello world!</div><div><div id="1">How <div>are you</div> today?</div></div><div>I am fine.</div><div>an other <div id="2">small</div>test</div><div>nested<div>divs</div>...</div>
after: <p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p><p>an other <div id="2">small</div>test</p><p>nested<p>divs</p>...</p>
If you don't need the snippet, I have learned something about regexp's myself at least :P
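As a side note on the callback from the question: each return there runs before the $count++ that follows it, so the counter only ever goes down and every </div> comes back as </p>. A minimal sketch of a single-pass alternative, assuming well-formed and properly nested markup, keeps a stack that records for each open <div> whether it was turned into a <p>. The function name convertPlainDivs is hypothetical, and this is one possible repair rather than the method from either answer.

```php
<?php
// Sketch only: convertPlainDivs is a hypothetical name, not from the thread.
function convertPlainDivs(string $html): string
{
    $stack = []; // one entry per open <div>: true if it was rewritten as <p>

    $result = preg_replace_callback(
        '/<div(\s[^>]*)?>|<\/div>/i',
        function (array $m) use (&$stack): string {
            if ($m[0][1] !== '/') {
                // Opening tag: "plain" means nothing but whitespace after "div".
                $plain = trim($m[1] ?? '') === '';
                $stack[] = $plain;
                return $plain ? '<p>' : $m[0];
            }
            // Closing tag: mirror whatever the matching opening tag became.
            // An unmatched </div> (empty stack) is left as it is.
            return array_pop($stack) === true ? '</p>' : '</div>';
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$input = '<div>Hello world!</div><div><div id="1">How <div>are you</div> today?</div></div><div>I am fine.</div>';
echo convertPlainDivs($input);
// prints: <p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p>
```

Handling the opening and closing tags in the same pass removes the need for the separate str_replace() step, since the stack always knows which kind of tag is being closed.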
Source: https://stackoverflow.com/questions/8772348/replacing-end-div-tags-using-preg-replace-callback-function