Can I shorten this regular expression?

Anonymous (unverified), submitted 2019-12-03 00:48:01

Question:

I have the need to check whether strings adhere to a particular ID format.

The format of the ID is as follows:

aBcDe-fghIj-KLmno-pQRsT-uVWxy

The ID consists of five blocks of five letters each (upper or lower case), with a single dash between blocks.

I have the following regular expression that works:

string idFormat = "[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}"; 

Note that there is no trailing dash, but all of the blocks within the ID follow the same format. Therefore, I would like to represent the first four blocks, each followed by its dash, as one repeated unit inside the regular expression and avoid the duplication.

I tried the following, but it doesn't work:

string idFormat = "[[a-zA-Z]{5}[-]{1}]{4}[a-zA-Z]{5}"; 

How do I shorten this regular expression and get rid of the duplicated parts?

What is the best way to ensure that no block contains any digits?


Edit:

Thanks for the replies, I now understand the grouping in regular expressions.

I'm running a few tests against the regular expression, the following are relevant:

Test 1: aBcDe-fghIj-KLmno-pQRsT-uVWxy
Test 2: abcde-fghij-klmno-pqrst-uvwxy

With the following regular expression, both tests pass:

^([a-zA-Z]{5}-){4}[a-zA-Z]{5}$ 

With the next regular expression, test 1 fails:

^([a-z]{5}-){4}[a-z]{5}$ 

Several answers have said that it is OK to omit the A-Z when using a-z, but in this case it doesn't seem to be working.

Answer 1:

If you can set regex options to be case insensitive, you could replace all [a-zA-Z] with just plain [a-z]. Furthermore, [-]{1} can be written as -.

Your grouping should be done with ( and ), not with [ and ] (although you are correctly using the latter to specify character sets).

Depending on context, you probably want to throw in ^...$ which matches start and end of string, respectively, to verify that the entire string is a match (i.e. that there are no extra characters).

In JavaScript, something like this:

/^([a-z]{5}-){4}[a-z]{5}$/i 
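
As a minimal sketch of why the flag matters (the variable names here are only illustrative): without the i flag, [a-z] matches lowercase letters only, so the mixed-case ID from the question is rejected. The same pattern with the flag accepts both test strings.

```javascript
// Same pattern, with and without the case-insensitive flag.
const strict = /^([a-z]{5}-){4}[a-z]{5}$/;   // lowercase letters only
const relaxed = /^([a-z]{5}-){4}[a-z]{5}$/i; // i flag: upper and lower case

const mixedCase = "aBcDe-fghIj-KLmno-pQRsT-uVWxy";
const lowerCase = "abcde-fghij-klmno-pqrst-uvwxy";

console.log(strict.test(mixedCase));  // false
console.log(strict.test(lowerCase));  // true
console.log(relaxed.test(mixedCase)); // true
console.log(relaxed.test(lowerCase)); // true
```

In .NET, the corresponding option is RegexOptions.IgnoreCase; if that option is not set, [a-zA-Z] has to stay in the pattern.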


Answer 2:

You can try:

([a-z]{5}-){4}[a-z]{5} 

and make it case insensitive.
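
One caveat, shown here as a small JavaScript sketch with an illustrative input string: written without ^ and $, this pattern also succeeds when a valid ID merely appears somewhere inside a longer string. Anchoring it makes the whole string have to match.

```javascript
// Unanchored vs. anchored version of the same case-insensitive pattern.
const unanchored = /([a-z]{5}-){4}[a-z]{5}/i;
const anchored = /^([a-z]{5}-){4}[a-z]{5}$/i;

// A valid ID surrounded by extra characters (made-up test input).
const padded = "123-abcde-fghij-klmno-pqrst-uvwxy-456";

console.log(unanchored.test(padded)); // true: the ID is found inside the string
console.log(anchored.test(padded));   // false: the whole string is not an ID
```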



Answer 3:

This works for me, though you might want to check it:

[a-zA-Z]{5}(-[a-zA-Z]{5}){4} 

(One group of five letters, followed by [dash+group of five letters] four times)



Answer 4:

([a-zA-Z]{5}[-]{1}){4}[a-zA-Z]{5} 


Answer 5:

Try

string idFormat = "([a-zA-Z]{5}[-]{1}){4}[a-zA-Z]{5}"; 

I.e. you basically replace your brackets by parentheses. Brackets are not meant for grouping but for defining a class of accepted characters.
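
To make the difference concrete, here is a rough sketch in JavaScript (default, non-unicode mode; other engines may parse the bracket version differently). There, the outer [ just becomes one more character in the first class, and the stray ]{4} is read as four literal ] characters, so the bracket version rejects real IDs and accepts a rather odd string instead. The anchors and the test strings are additions for the demonstration.

```javascript
// The bracket attempt from the question vs. the parenthesized fix,
// both anchored so that the whole string has to match.
const bracketAttempt = new RegExp("^[[a-zA-Z]{5}[-]{1}]{4}[a-zA-Z]{5}$");
const grouped = new RegExp("^([a-zA-Z]{5}[-]{1}){4}[a-zA-Z]{5}$");

const validId = "aBcDe-fghIj-KLmno-pQRsT-uVWxy";
const oddString = "abcde-]]]]fghij"; // five letters, a dash, four "]", five letters

console.log(bracketAttempt.test(validId));   // false
console.log(bracketAttempt.test(oddString)); // true
console.log(grouped.test(validId));          // true
console.log(grouped.test(oddString));        // false
```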

However, be aware that the shortened versions are fine for validating the string, but not for pulling it apart. If you want to process the five blocks individually, give each block its own capturing group:

string idFormat = "([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})";

so you can address each group and process it.
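
A short sketch of that idea in JavaScript (variable names are illustrative, and anchors are added so the whole string must match): with five separate groups each block is available by index, whereas in JavaScript a repeated group only retains its last repetition.

```javascript
// Five explicit capturing groups: one per block.
const idWithGroups =
  /^([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})$/;
const id = "aBcDe-fghIj-KLmno-pQRsT-uVWxy";

const match = id.match(idWithGroups);
// match[0] is the whole ID; match[1] to match[5] are the five blocks.
const blocks = match ? match.slice(1) : [];
console.log(blocks.join(" ")); // aBcDe fghIj KLmno pQRsT uVWxy

// Shortened pattern: group 1 is repeated four times, and only the
// last repetition is kept in the match result.
const repeated = id.match(/^([a-zA-Z]{5}-){4}[a-zA-Z]{5}$/);
console.log(repeated[1]); // pQRsT-
```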


