Question
I am trying to figure out how to find numbers that are not years (I'm defining a year as simply a number that is four digits wide).
For example, I want to pick up
1
12
123
But NOT
1234
in order to avoid dates (4 digits).
If the regex also picked up 12345, that is fine, but it is not necessary for solving this problem.
(Note: these requirements may seem odd. They are part of a larger solution that I am stuck with)
Answer 1:
If lookbehind and lookahead are available, the following should work:
(?<!\d)(\d{1,3}|\d{5,})(?!\d)
Explanation:
(?<!\d) # Previous character is not a digit
(\d{1,3}|\d{5,}) # Between 1 and 3, or 5 or more digits, place in group 1
(?!\d) # Next character is not a digit
If you cannot use lookarounds, the following should work:
\b(\d{1,3}|\d{5,})\b
Explanation:
\b # Word boundary
(\d{1,3}|\d{5,}) # Between 1 and 3, or 5 or more digits, place in group 1
\b # Word boundary
Python example:
>>> import re
>>> regex = re.compile(r'(?<!\d)(\d{1,3}|\d{5,})(?!\d)')
>>> regex.findall('1 22 333 4444 55555 1234 56789')
['1', '22', '333', '55555', '56789']
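Note that the two patterns are not interchangeable on every input. \b treats letters, digits, and the underscore as word characters, so the word-boundary version skips digit runs that touch a letter or an underscore, while the lookaround version only looks at neighbouring digits. A small sketch of the difference (the sample string and variable names are made up for illustration):

```python
import re

# Lookaround version: a digit run only needs non-digit neighbours.
lookaround = re.compile(r'(?<!\d)(\d{1,3}|\d{5,})(?!\d)')
# Word-boundary version: a digit run must also not touch letters or underscores.
boundary = re.compile(r'\b(\d{1,3}|\d{5,})\b')

sample = 'abc123 id_42 7 2024 v10'  # hypothetical input

lookaround_hits = lookaround.findall(sample)
boundary_hits = boundary.findall(sample)

print(lookaround_hits)  # ['123', '42', '7', '10']
print(boundary_hits)    # ['7']
```

Both versions reject 2024. Which one fits depends on whether numbers glued to letters, such as v10, should count.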
Answer 2:
Depending on the regex flavor you use, this might work for you:
(([0-9]{1,3})|([0-9]{5,}))
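One caveat worth checking: without anchors or boundaries, this pattern can still match part of a four-digit run when it is used to search inside text. The sketch below, in Python with made-up variable names, compares searching inside a string with matching each token as a whole via re.fullmatch:

```python
import re

pattern = re.compile(r'(([0-9]{1,3})|([0-9]{5,}))')

# Searching inside text: "1234" is split into "123" and "4".
search_hits = [m.group(0) for m in pattern.finditer('1234')]

# Matching whole tokens: a four-digit token is rejected outright.
tokens = '1 12 123 1234 12345'.split()  # hypothetical input
whole_hits = [t for t in tokens if pattern.fullmatch(t)]

print(search_hits)  # ['123', '4']
print(whole_hits)   # ['1', '12', '123', '12345']
```

So the pattern seems safe only when each candidate number is matched in full, or when it is wrapped in boundaries or lookarounds.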
Answer 3:
(?<!\\d)(\\d{1,3}|\\d{5,})(?!\\d)
in Java, where each backslash is doubled inside the string literal.
Source: https://stackoverflow.com/questions/8899552/regex-to-find-numbers-excluding-four-digit-numbers