UTF-8 correct regex for CamelCase (WikiWord) in Perl

There was a question here about the CamelCase regex. Combined with tchrist's post, I'm wondering what the correct UTF-8 CamelCase regex is.

Starting wit

1 answer

抹茶落季

2021-01-19 18:10
I really can’t tell what you’re trying to do, but this should be closer to what your original intent seems to have been. I still can’t tell what you mean to do with it, though.
```
m{
    \b
    \p{Upper}      #  start with uppercase code point (NOT LETTER)

    \w*            #  optional ident chars 

    # note that upper and lower are not related to letters
    (?:  \p{Lower} \w* \p{Upper}
      |  \p{Upper} \w* \p{Lower}
    )

    \w*

    \b
}x
```
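To see how the pattern behaves, here is a minimal sketch that runs it against a few sample words. The helper name `is_camel`, the sample words, and the `\A ... \z` anchoring are assumptions made for illustration; they are not part of the original pattern.

```perl
use strict;
use warnings;

binmode(STDOUT, ':encoding(UTF-8)');

# The pattern from above, compiled once.
my $camel = qr{
    \b
    \p{Upper}      # start with an uppercase code point
    \w*            # optional ident chars
    (?:  \p{Lower} \w* \p{Upper}
      |  \p{Upper} \w* \p{Lower}
    )
    \w*
    \b
}x;

# Hypothetical helper: true when the whole string matches the pattern.
sub is_camel {
    my ($word) = @_;
    return $word =~ /\A$camel\z/ ? 1 : 0;
}

# "\x{C9}t\x{E9}Chaud" is "ÉtéChaud", written with escapes so the
# script does not depend on the encoding of the source file.
for my $word ('WikiWord', "\x{C9}t\x{E9}Chaud", 'Wiki', 'wikiWord', 'ABC') {
    printf "%-10s %s\n", $word, is_camel($word) ? 'match' : 'no match';
}
```

The first two words match; `Wiki` and `ABC` fail because no lowercase/uppercase pair follows the leading capital, and `wikiWord` fails because it does not start with an uppercase code point.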
Never use [a-z]. In fact, don’t use \p{Lowercase_Letter} or \p{Ll} either: those are not the same as the more desirable and more correct \p{Lowercase} and \p{Lower}, which also cover lowercase code points that are not letters.
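The difference between the two properties can be checked directly. The sketch below is only an illustration: the helper name `lower_but_not_ll` and the choice of sample code points are assumptions, picked because their general category is not Ll although they carry the Lowercase property.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a single code point has the Lowercase
# property but is not in general category Ll (Lowercase_Letter).
sub lower_but_not_ll {
    my ($ch) = @_;
    return ($ch =~ /\A\p{Lower}\z/ && $ch !~ /\A\p{Ll}\z/) ? 1 : 0;
}

my @samples = (
    [ "\x{2170}", 'SMALL ROMAN NUMERAL ONE'      ],  # general category Nl
    [ "\x{24D0}", 'CIRCLED LATIN SMALL LETTER A' ],  # general category So
    [ "\x{02B0}", 'MODIFIER LETTER SMALL H'      ],  # general category Lm
    [ 'a',        'LATIN SMALL LETTER A'         ],  # general category Ll
);

for my $sample (@samples) {
    my ($ch, $name) = @$sample;
    printf "U+%04X %-30s %s\n", ord($ch), $name,
        lower_but_not_ll($ch) ? 'Lower, but not Ll' : 'Lower and Ll';
}
```

Only the plain `a` is reported as both; a pattern built on \p{Ll} would miss the other three.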

And remember that \w is really just an alias for
```
[\p{Alphabetic}\p{Mark}\p{Decimal_Number}\p{Letter_Number}\p{Connector_Punctuation}]
```
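As a quick check of that expansion, the following sketch tests one code point from each listed property against \w. The helper name `is_word_char` and the sample code points are assumptions chosen for illustration; the check shows that each class is covered by \w, not that the two definitions are identical in every Perl version.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a single code point matches \w.
sub is_word_char {
    my ($ch) = @_;
    return $ch =~ /\A\w\z/ ? 1 : 0;
}

my @samples = (
    [ "\x{03B1}", 'Alphabetic:            GREEK SMALL LETTER ALPHA' ],
    [ "\x{0301}", 'Mark:                  COMBINING ACUTE ACCENT'   ],
    [ "\x{0663}", 'Decimal_Number:        ARABIC-INDIC DIGIT THREE' ],
    [ "\x{2167}", 'Letter_Number:         ROMAN NUMERAL EIGHT'      ],
    [ "\x{203F}", 'Connector_Punctuation: UNDERTIE'                 ],
    [ '-',        'none of the above:     HYPHEN-MINUS'             ],
);

for my $sample (@samples) {
    my ($ch, $label) = @$sample;
    printf "U+%04X %-48s \\w=%d\n", ord($ch), $label, is_word_char($ch);
}
```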