I'm working with Ruby's regex engine. I need to write a regex that does this:
WIKI_WORD = /\b([a-z][\w_]+\.)?[A-Z][a-z]+[A-Z]\w*\b/
but that also matches uppercase and lowercase letters outside ASCII (for example, accented letters).
WIKI_WORD = /\b(\p{Ll}\w+\.)?\p{Lu}\p{Ll}+\p{Lu}\w*\b/u
should work in Ruby 1.9. \p{Lu} and \p{Ll} are shorthands for uppercase and lowercase Unicode letters. (\w already includes the underscore.)
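To see the difference between the two patterns, here is a minimal sketch. It assumes a UTF-8 source file; ASCII_WIKI_WORD and first_wiki_word are names made up for this illustration, and the sample strings are invented.

```ruby
# encoding: utf-8

# Original ASCII-only pattern from the question.
ASCII_WIKI_WORD = /\b([a-z][\w_]+\.)?[A-Z][a-z]+[A-Z]\w*\b/

# Unicode-aware pattern from the answer.
WIKI_WORD = /\b(\p{Ll}\w+\.)?\p{Lu}\p{Ll}+\p{Lu}\w*\b/u

# Hypothetical helper: return the first wiki word in text, or nil if none.
def first_wiki_word(text)
  text[WIKI_WORD]
end

first_wiki_word("see WikiWord here")   # plain ASCII wiki word
first_wiki_word("wiki.WikiWord")       # optional lowercase prefix and dot
first_wiki_word("MüllerThurgau")       # non-ASCII lowercase letter after the first capital
first_wiki_word("plain words only")    # no wiki word, so nil

# The ASCII-only pattern rejects the same input because [a-z] cannot match "ü".
"MüllerThurgau"[ASCII_WIKI_WORD]
```

One caveat, as far as I know: in Ruby 1.9 and later, \w on its own only matches ASCII word characters, even with the /u flag. If the tail of the word can contain non-ASCII letters, \p{Word} might be a closer fit than \w.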
See also this answer: you might need to run Ruby in UTF-8 mode for this to work, and your script may have to be encoded in UTF-8, too.
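A rough sketch of what the encoding requirements can look like in Ruby 1.9 follows. It assumes the magic comment is how the script is marked as UTF-8; match_wiki_word is a hypothetical helper, and the byte strings are invented test input.

```ruby
# encoding: utf-8
# The magic comment above tells Ruby 1.9 that this source file is UTF-8.

WIKI_WORD = /\b(\p{Ll}\w+\.)?\p{Lu}\p{Ll}+\p{Lu}\w*\b/u

# Hypothetical helper: input read from a socket or a binary file is often
# tagged as binary, so retag a copy as UTF-8 before matching.
def match_wiki_word(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  return nil unless text.valid_encoding?  # skip byte sequences that are not UTF-8
  text[WIKI_WORD]
end

# Simulate bytes that arrived without encoding information.
bytes = "MüllerThurgau".dup.force_encoding(Encoding::BINARY)
match_wiki_word(bytes)

# Invalid UTF-8 is rejected instead of raising inside the regex engine.
match_wiki_word("\xFFWikiWord")
```

Checking valid_encoding? first is a design choice of this sketch: matching a UTF-8 regex against a string with broken byte sequences raises an error, so the helper returns nil instead.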