I am trying to add a hash sign (#) after each pressed space, using jQuery, as the user types. I have created this DEMO on codepen.io.
In this d
JavaScript regular expressions are not Unicode-aware, at least as far as ES5 is concerned. To match non-ASCII letters you either have to spell them out with \u (Unicode escape sequences) or use a library like XRegExp that adds support for Unicode properties.
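As a rough illustration of the \u approach, the sketch below matches Cyrillic words by writing out the Cyrillic block (U+0400 to U+04FF) as an escape range. The sample string and the choice of block are only assumptions for the example; other scripts need their own ranges.

```javascript
// Without Unicode support, \w only covers [A-Za-z0-9_],
// so it finds nothing in a Cyrillic sentence.
const asciiOnly = "Привет мир".match(/\w+/g); // null

// Writing the Cyrillic block out as a \u escape range does work,
// at the cost of listing every range you care about by hand.
const cyrillic = "Привет мир".match(/[\u0400-\u04FF]+/g); // ["Привет", "мир"]
```

This gets tedious quickly once several scripts are involved, which is where Unicode properties become attractive.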
By importing XRegExp into your project, you can turn old regular expressions into Unicode-aware patterns. Unicode properties are then available in the standard \p{L} or single-letter \pL notation.
I modified your code a little to add this feature:
text = XRegExp.replaceEach(text, [
[/#\s*/g, ""],
[/\s{2,}/g, " "],
[XRegExp(`(?:\\s|^)([\\p{L}\\p{N}]+)(?=\\s|$)(?=.*\\s\\1(?=\\s|$))`, "gi"), ""],
[XRegExp(`([\\p{N}\\p{L}]+)`, "g"), "#$1"]
]);
The first two regexes are easy to understand (you had them before). The third one looks messy, but on closer inspection you'll find that (?:\\s|^) and (?=\\s|$) both stand in for \b. Since \b uses an ASCII-only interpretation of a word boundary, I had to spell it out like that.
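The ASCII-only behaviour of \b is easy to check in plain JavaScript. The sample strings below are made up for the illustration:

```javascript
// \b is defined in terms of \w, i.e. [A-Za-z0-9_].
const asciiWord = /\bcat\b/.test("a cat here"); // true

// Cyrillic letters are not \w, so there is no boundary between
// a space and "к": both sides count as non-word characters.
const cyrillicWord = /\bкот\b/.test("это кот тут"); // false

// Checking for whitespace or a string edge works in any script.
const boundary = /(?:\s|^)кот(?=\s|$)/.test("это кот тут"); // true
```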
Live demo
Breakdown:
\p{L}: any kind of letter from any language.
\p{N}: any kind of numeric character in any script.
More about Unicode categories.
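To try the two categories and the replacement chain without loading XRegExp, here is a minimal sketch that swaps in native Unicode property escapes. It assumes an engine that supports the ES2018 u flag, which is not the ES5 setting discussed above. The names isLetter, isNumber and hashtagify are hypothetical helpers, and the patterns are the ones from the answer's code rewritten as regex literals.

```javascript
// Assumption: ES2018 Unicode property escapes (u flag), not the XRegExp API.
const isLetter = (ch) => /^\p{L}$/u.test(ch);
const isNumber = (ch) => /^\p{N}$/u.test(ch);

isLetter("ğ"); // true
isLetter("я"); // true
isLetter("_"); // false, underscore is punctuation
isNumber("7"); // true
isNumber("٣"); // true, Arabic-Indic digit three

// Hypothetical helper: the same four replacements, applied in order.
function hashtagify(text) {
  return text
    .replace(/#\s*/g, "")    // drop hash signs that are already there
    .replace(/\s{2,}/g, " ") // collapse runs of whitespace
    // drop a word if the same word appears again later in the text
    .replace(/(?:\s|^)([\p{L}\p{N}]+)(?=\s|$)(?=.*\s\1(?=\s|$))/giu, "")
    .replace(/([\p{N}\p{L}]+)/gu, "#$1"); // prefix each remaining word
}

const tagged = hashtagify("#merhaba  dünya"); // "#merhaba #dünya"
```

Whether to use XRegExp or the native u flag mostly depends on which browsers have to be supported.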