What I'm trying to do is make a 'jargon buster'. Basically I have some HTML and some glossary terms in a database. When the person clicks on jargon buster it replaces the wor
Assuming all your glossary "words" consist of standard "word" characters (i.e. [A-Za-z0-9_]), a simple word boundary assertion can be placed before and after the word in the regex pattern. Try replacing the pertinent statement with this:
$element->innertext = preg_replace(
'/\b'. $glossary_word .'\b/i',
''. $glossary['word'] .'',
$element->innertext);
This assumes that $glossary_word has been run through preg_quote (which your code does).
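To see the word-boundary behaviour in isolation, here is a minimal self-contained sketch. The function name wrap_whole_word and the span wrapper are hypothetical, chosen only for illustration; substitute whatever markup your jargon buster actually emits.

```php
<?php
// Hypothetical helper: wraps whole-word, case-insensitive matches of $word.
// The <span class="jargon"> wrapper is an assumption made for illustration.
function wrap_whole_word(string $text, string $word): string
{
    // Escape regex metacharacters, including the '/' delimiter.
    $quoted = preg_quote($word, '/');

    // \b on both sides rejects matches inside longer words.
    // $0 re-inserts the matched text, so its original capitalization is kept.
    return preg_replace(
        '/\b' . $quoted . '\b/i',
        '<span class="jargon">$0</span>',
        $text
    );
}

// "capital" and "rapid" both contain "api" but are left alone.
echo wrap_whole_word('The API is capital; a rapid api call.', 'api'), "\n";
```

Using $0 in the replacement rather than the glossary term itself is a design choice: it keeps "API" as the author typed it instead of forcing the database spelling.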
However, if the glossary words may contain non-word characters (such as a '-' dash), a more complex regex can be formulated which incorporates lookahead and lookbehind to ensure that only whole words are matched. For example:
$re_pattern = "/ # Match a glossary whole word.
(?<=[\s'\"]|^) # Word preceded by whitespace, quote or BOS.
{$glossary_word} # Word to be matched.
(?=[\s'\".?!,;:]|$) # Word followed by ws, quote, punct or EOS.
/ix";
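A rough sketch of the lookaround variant in use, under a few assumptions: the function name wrap_delimited_word and the span wrapper are hypothetical, and the x flag is left out here so that any spaces inside a multi-word glossary term stay significant (in extended mode, unescaped whitespace in the pattern is ignored).

```php
<?php
// Hypothetical helper illustrating the lookbehind/lookahead approach.
// The <span class="jargon"> wrapper is an assumption made for illustration.
function wrap_delimited_word(string $text, string $word): string
{
    // Escape regex metacharacters, including the '/' delimiter.
    $quoted = preg_quote($word, '/');

    // Preceded by whitespace, a quote, or start of string;
    // followed by whitespace, a quote, punctuation, or end of string.
    $pattern = '/(?<=[\s\'"]|^)' . $quoted . '(?=[\s\'".?!,;:]|$)/i';

    // $0 re-inserts the matched text unchanged.
    return preg_replace($pattern, '<span class="jargon">$0</span>', $text);
}

// The first "opt-in" is wrapped; the one inside "opt-in-only" is not,
// because it is followed by '-' rather than whitespace or punctuation.
echo wrap_delimited_word('Use opt-in, not opt-in-only.', 'opt-in'), "\n";
```

With plain \b assertions the second occurrence would also match, since '-' is not a word character and therefore counts as a boundary.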