PHP preg_replace: Replace all anchor tags in text with their href value with Regex

Posted by 拥有回忆 on 2021-02-17 02:31:42


I want to replace all anchor tags within a text with their href value, but my pattern does not work right.

$str = 'This is a text with multiple anchor tags. This is the first one: <a href="" title="Link 1">Link 1</a> and this one the second: <a href="" title="Link 2">Link 2</a> after that a lot of other text. And here the 3rd one: <a href="" title="Link 3">Link 3</a> Some other text.';
$test = preg_replace("/<a\s.+href=['|\"]([^\"\']*)['|\"].*>[^<]*<\/a>/i",'\1', $str);
echo $test;

At the end the text should look like this:

This is a text with multiple anchor tags. This is the first one: and this one the second: after that a lot of other text. And here the 3rd one: Some other text.

Thank you very much!


Just don't.

Use a parser instead.

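$dom = new DOMDocument();
// since you have a fragment, wrap it in a <body>
$dom->loadHTML("<body>" . $str . "</body>");
$links = $dom->getElementsByTagName("a");
// the node list is live: removing a link moves the next one to index 0
while ($link = $links[0]) {
    // put the href value in front of the link, then drop the link itself
    $link->parentNode->insertBefore(new DOMText($link->getAttribute("href")), $link);
    $link->parentNode->removeChild($link);
}
$result = $dom->saveHTML($dom->getElementsByTagName("body")[0]);
// remove <body>..</body> wrapper
$output = substr($result, strlen("<body>"), -strlen("</body>"));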

Demo on 3v4l


In case you're still set on regex, this should work:

preg_replace("/<a\s+href=['\"]([^'\"]+)['\"][^\>]*>[^<]+<\/a>/i",'$1', $str);

But you're probably better off with a solution like what Andreas posted.
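
Note that the pattern above expects href to be the first attribute of the tag. As a rough sketch only, the variation below tolerates other attributes before href and either quote style. The example.com URLs and the variable names are placeholders that are not part of the original post, and the pattern still does not cover unquoted attribute values or other HTML edge cases, which is where a parser remains the more robust choice.

```php
<?php
// Hypothetical sample input; the URLs are placeholders.
$str = 'A <a class="funky-style" href="https://example.com/1" title="Link 1">Link <b>1</b></a>'
     . " B <a href='https://example.com/2'>Link 2</a> C";

// <a\b          the tag name "a" only (not <abbr> and the like)
// [^>]*?\shref= any attributes before href, without leaving the tag
// (["'])...\1   the value must be closed by the quote that opened it
// [^>]*>        the rest of the opening tag
// .*?</a>       the link content, up to the nearest closing tag
$pattern = '~<a\b[^>]*?\shref=(["\'])([^"\']*)\1[^>]*>.*?</a>~is';

$out = preg_replace($pattern, '$2', $str);
echo $out, PHP_EOL;
```

The second capture group holds the URL, hence '$2' as the replacement.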

FYI: the reason your previous regex didn't work was this little number:

.*>

Because . matches any character and * is greedy, you ended up matching everything past the URL to be replaced, all the way to the last anchor in the string. This is why it appeared to select and replace only the first anchor tag it found and cut off the rest.

Changing that to

[^\>]*>

ensures that this part of the match is constrained to the portion of the string between the URL and the closing bracket of the a tag.
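
To see the difference in practice, here is a minimal sketch that runs the pattern from the question and the constrained pattern on the same input. The example.com URLs are placeholders that are not part of the original post.

```php
<?php
// Hypothetical sample input; the URLs are placeholders.
$str = 'One: <a href="https://example.com/1" title="Link 1">Link 1</a> '
     . 'two: <a href="https://example.com/2" title="Link 2">Link 2</a> end.';

// Pattern from the question: ".+" and ".*" are greedy and may run across
// tag boundaries, so a single match spans from the first <a to the last </a>.
$greedy = preg_replace("/<a\s.+href=['|\"]([^\"\']*)['|\"].*>[^<]*<\/a>/i", '$1', $str);

// Constrained pattern: "[^>]*" cannot leave the opening tag,
// so each anchor is matched and replaced on its own.
$fixed = preg_replace("/<a\s+href=['\"]([^'\"]+)['\"][^>]*>[^<]+<\/a>/i", '$1', $str);

echo $greedy, PHP_EOL; // One: https://example.com/2 end.
echo $fixed, PHP_EOL;  // One: https://example.com/1 two: https://example.com/2 end.
```

With the greedy pattern the captured group ends up being the last URL, because the leading .+ consumes as much as it can before the final href.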


It may not be simpler, but a safer approach is to loop over the string with strpos, cutting out the HTML around each link as you go.

$str = 'This is a text with multiple anchor tags. This is the first one: <a class="funky-style" href="" title="Link 1">Link 1</a> and this one the second: <a href="" title="Link 2">Link 2</a> after that a lot of other text. And here the 3rd one: <a href="" title="Link 3">Link 3</a> Some other text.';

$pos = strpos($str, '<a');

while($pos !== false){
    // Find start of html and remove up to link (<a ... href=")
    $str = substr($str, 0, $pos) . substr($str, strpos($str, 'href="', $pos)+6);
    // Find end of link and remove that.(" title="Link 1">Link 1</a>)
    $str = substr($str, 0, strpos($str,'"', $pos)) . substr($str, strpos($str, '</a>', $pos)+4);
    // Find next link if possible
    $pos = strpos($str, '<a');
}
echo $str;

Edited to handle a different attribute order within the a tag.

