Very simple: I need to match the # symbol using a regex. I'm working on a hashtag detector.
I've tried searching on Google and on Stack Overflow. One
For what it's worth, I only managed to match a hash (#) character as a string. In awk, the parser strips comments first, so the only syntax I found that can 'hold' a # is
"#"
So in my case I skipped lines whose first field is a lone # with:
$1 == "#" { next; }
I also attempted to make the hash a regex:
BEGIN { HASH_PATTERN = "^#" }
$1 ~ HASH_PATTERN { next; }
... This also works. So I'm thinking you can put the whole expression in a string like HASH_PATTERN.
The string equality test works quite well. It isn't a perfect solution, just a starter.
Given the comment on the earlier answer, you want to avoid matching x#x.
In that case, you don't need \b but \B:
\B#(\w\w+)
(if you really need two-or-more word characters after the #).
The \B means NON-word-boundary, and since # is not a word character, this matches only if the previous character is not a word character (or the # is at the very start of the string).
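To see the effect of \B concretely, here is a minimal sketch in Python (chosen only for illustration; the thread itself uses PHP and awk, and Python's re module treats \B and \w the same way for plain ASCII input like this). The sample string and the name HASHTAG are made up for the example.

```python
import re

# Hypothetical name; the pattern itself is the one from the answer above.
HASHTAG = re.compile(r"\B#(\w\w+)")

sample = "x#x #abc foo#bar #a #hello"

# \B rejects a # glued to a preceding word character (x#x, foo#bar);
# \w\w+ rejects tags shorter than two characters (#a).
tags = HASHTAG.findall(sample)
print(tags)  # ['abc', 'hello']
```

If single-character tags should count too, \B#(\w+) would be the looser variant.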
You don't need to escape it (it's probably the \b that's throwing it off):
if (preg_match('/^\w+#(\w+)/', 'abc#def', $matches)) {
    print_r($matches);
}
/* output of $matches:
Array
(
    [0] => abc#def
    [1] => def
)
*/
# does not have any special meaning in a regex, unless you use it as the delimiter or enable extended (x) mode, where it starts a comment. So just put it straight in and it should work.
Note that \b detects a word boundary, and in #abc, the word boundary is after the # and before the abc. Therefore, the \b is superfluous and you just need #\w\w+.
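A quick way to check the word-boundary reasoning is to try the patterns side by side. This is only a sketch in Python, assuming its re module behaves like PCRE for \b on plain ASCII text; the test strings are invented.

```python
import re

# \b in front of # needs a word character on its left, so it fails
# when the # is at the start of the string (or after a space).
with_boundary = re.search(r"\b#\w+", "#abc")
glued = re.search(r"\b#\w+", "x#abc")

# Without \b the hash is matched wherever it appears.
plain = re.search(r"#\w\w+", "#abc")

print(with_boundary)   # None
print(glued.group(0))  # #abc
print(plain.group(0))  # #abc
```

So a leading \b does the opposite of what a hashtag detector wants: it only accepts a # that is attached to the end of a word.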
You could use the following regex: /\#(\w+)/ to match a hashtag and capture just the hashtag word, or /\#\w+/, which will match the entire hashtag including the hash.
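The two patterns differ only in what you read back from the match. A small sketch in Python (the sentence is made up; Python's re accepts \# just as PHP does, though the backslash is optional):

```python
import re

text = "Loving #regex and #php today"

# With a capture group: findall returns just the word after the #.
words = re.findall(r"\#(\w+)", text)

# Without a group: findall returns the whole hashtag, hash included.
full = re.findall(r"\#\w+", text)

print(words)  # ['regex', 'php']
print(full)   # ['#regex', '#php']
```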