$string = '<a href="http://google.com">http://google.com</a>';
$var = str_replace('>http://','>',$string);
Just tried this in IDEone.com and it has the desired effect.
In this simple case, the preg_replace function will probably work. For more stability, try using DOMDocument:
$string = '<a href="http://google.com">http://google.com</a>';
$dom = new DOMDocument;
$dom->loadXML($string);
$link = $dom->firstChild;
$link->nodeValue = str_replace('http://', '', $link->nodeValue);
$string = $dom->saveXML($link);
Assuming that "http://" always appears twice in $string, search the string backwards for "http://" using strripos. If the search succeeds, you know the start index of the "http://" you want to remove (and you already know its length). You can then use substr to extract everything before and after the chunk you want to remove and join the two pieces.
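A minimal sketch of that approach, assuming the last "http://" in the string is the one inside the link text. The variable names $needle and $start are only illustrative:

```php
<?php
$string = '<a href="http://google.com">http://google.com</a>';
$needle = 'http://';

// Find the start index of the last occurrence of the needle.
$start = strripos($string, $needle);

if ($start !== false) {
    // Keep everything before the match and everything after it.
    $string = substr($string, 0, $start)
            . substr($string, $start + strlen($needle));
}

echo $string; // <a href="http://google.com">google.com</a>
```

The strict comparison with false matters because a match at index 0 is a valid result and would otherwise be treated as a failed search.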
Any simple regular expression or string replacement code is probably going to fail in the general case. The only "correct" way to do it is to actually parse the chunk as an SGML/XML snippet and remove the http:// from the value.
For any other (reasonably short) string manipulation code, finding a counterexample that breaks it will be pretty easy.
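As an illustration, here are a few counterexamples for a plain str_replace on '>http://'. The inputs are hypothetical and were chosen only to show how little it takes to slip past a fixed search string:

```php
<?php
// Hypothetical inputs that a '>http://' replacement does not handle.
$cases = [
    // Leading space in the link text.
    '<a href="http://google.com"> http://google.com</a>',
    // URL does not start the link text.
    '<a href="http://google.com">Visit http://google.com</a>',
    // https instead of http.
    '<a href="https://google.com">https://google.com</a>',
];

foreach ($cases as $case) {
    $result = str_replace('>http://', '>', $case);
    // Each input comes back unchanged because '>http://' never occurs in it.
    var_dump($result === $case); // bool(true)
}
```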
Without using a full blown parser, this may do the trick for most situations...
$str = '<a href="http://google.com">http://google.com</a>';
$regex = '/(?<!href=["\'])http:\/\//';
$str = preg_replace($regex, '', $str);
var_dump($str); // string(42) "<a href="http://google.com">google.com</a>"
It uses a negative lookbehind to make sure there is no href=" or href=' preceding it.
It also takes into account people who delimit their attribute values with '.
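A quick check of the single-quote case, reusing the same pattern on an assumed input whose href is delimited with ':

```php
<?php
$regex = '/(?<!href=["\'])http:\/\//';

// Same link as before, but with a single-quoted attribute value.
$str = "<a href='http://google.com'>http://google.com</a>";
$str = preg_replace($regex, '', $str);

echo $str; // <a href='http://google.com'>google.com</a>
```

The lookbehind only guards against href=" and href=' written without spaces, so an attribute such as href = "..." or src="..." would still have its http:// removed.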
$str = 'http://www.google.com';
$str = preg_replace('#^https?://#', '', $str);
echo $str; // www.google.com
This works for both http:// and https://, but only when the scheme is at the very start of the string, because the pattern is anchored with ^.