Is the regular expression [a-Z] valid and if yes then is it the same as [a-zA-Z]?


Is the regular expression [a-Z] valid, and if so, is it the same as [a-zA-Z]? Please note that in [a-Z] the a is lowercase and the Z is uppercase.

7 Answers

轻奢々

2020-11-27 08:18

I've just fallen over this in a script (not my own).

It seems that grep, awk, and sed accept or reject [a-Z] depending on your locale (i.e. the LANG or LC_CTYPE environment variable). In the POSIX locale these tools don't allow [a-Z], but in some other locales (e.g. en_gb.utf8) it works and is the same as [a-zA-Z].

Yes, I've checked, it doesn't match any of _^[]`.

Given that this has taken quite some time to debug, I strongly discourage anyone from ever using [a-Z] in a regex.

予麋鹿

2020-11-27 08:22
I'm not sure about other languages' implementations, but in PHP you can do
```
"/[a-z]/i"
```
and it will be case-insensitive. There is probably something similar in other languages.
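For comparison, here is a minimal Python sketch of the same idea, assuming the standard `re` module, where `re.IGNORECASE` plays the role of PHP's `i` modifier. The sample string is made up for illustration.

```python
import re

# re.IGNORECASE makes [a-z] match uppercase letters as well.
pattern = re.compile(r"[a-z]+", re.IGNORECASE)

sample = "Hello World 42"  # hypothetical input, not from the question
words = pattern.findall(sample)
print(words)  # ['Hello', 'World']
```

This keeps the character class as a plain, portable `[a-z]` and leaves the case handling to the flag.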
生来不讨喜

2020-11-27 08:23

If it's valid, it won't do what you expect.

The character code of Z is lower than the character code of a, so if the codes are swapped to mean the range [Z-a], it will be the same as [Z\[\\\]^_`a], i.e. it will include the characters Z and a, and the characters between.

If you use [A-z] to get all upper and lower case characters, that is still not the same as [A-Za-z], it's the same as [A-Z\[\\\]^_`a-z].
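The difference between the two classes can be checked directly. The following is a small sketch in Python (chosen only for illustration; the variable names are hypothetical) that lists the ASCII characters accepted by `[A-z]` but not by `[A-Za-z]`.

```python
import re

# In ASCII, uppercase letters come before lowercase ones.
print(ord("Z"), ord("a"))  # 90 97

ascii_chars = [chr(code) for code in range(128)]
wide = re.compile(r"[A-z]")
strict = re.compile(r"[A-Za-z]")

# Characters accepted by [A-z] but rejected by [A-Za-z].
extras = [ch for ch in ascii_chars
          if wide.fullmatch(ch) and not strict.fullmatch(ch)]
print(extras)  # ['[', '\\', ']', '^', '_', '`']
```

The six extra characters are exactly the code points 91 to 96 that sit between Z and a.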

[愿得一人]

2020-11-27 08:23

No, it's not valid, probably because in ASCII a (97) comes after Z (90), so the range from a to Z runs backwards.


执笔经年

2020-11-27 08:37

You could always try it:

 print "ok" if "monkey" =~ /[a-Z]/;

Perl says

Invalid [] range "a-Z" in regex; marked by <-- HERE in m/[a-Z <-- HERE ]/ at a-z.pl line 4.
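The same experiment can be run in other engines. As a sketch, Python's standard `re` module also rejects the reversed range at compile time; the helper name `range_is_valid` below is made up for this example.

```python
import re

def range_is_valid(pattern: str) -> bool:
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(range_is_valid("[a-Z]"))     # False: the range runs backwards
print(range_is_valid("[A-z]"))     # True, but it also matches [ \ ] ^ _ `
print(range_is_valid("[a-zA-Z]"))  # True
```

Other engines and locale-aware tools may behave differently, so this only shows what one implementation does.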


刺人心

2020-11-27 08:39

You don't specify what language, but in general [a-Z] won't be a valid range, as in ASCII the lower-case alpha characters come after the upper-case ones. [A-z] might be a valid range (indicating all upper- and lower-cased alphas as well as the punctuation that appears between Z and a), but it might not be, depending on your particular implementation. The i flag can be added to the regex to make it case-insensitive; check your particular implementation for instructions on how to specify that flag.

