I need to match a fixed-width field in a file layout with a regular expression. The field is numeric/integer, always has four characters, and falls in the range of 0..
Regular expressions are not the answer to every single problem. My advice would be to do something like:
boolean isValidSomethingOrOther (string):
    if string.length() != 4:
        return false
    for each character in string:
        if not character.isNumeric():
            return false
    if string.toInt() > 1331:
        return false
    return true
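For readers who want something runnable, here is a minimal Python sketch of the pseudocode above. The snake_case function name and the ASCII-only digit check are my assumptions; the length of 4 and the upper bound of 1331 come from the pseudocode.

```python
def is_valid_something_or_other(field: str) -> bool:
    """Return True if field is exactly four ASCII digits with a value of at most 1331."""
    if len(field) != 4:
        return False
    # str.isdigit() also accepts some non-ASCII digit characters,
    # so require ASCII as well (assumption: the file layout is ASCII).
    if not (field.isascii() and field.isdigit()):
        return False
    return int(field) <= 1331
```

Calling `is_valid_something_or_other("1331")` should accept the field, while `"1332"`, `"001"` and `"12a4"` should all be rejected.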
If you must use a regex, there's nothing wrong with your solution but I'd probably use the following variant (just based on my understanding of RE engines and how they work):
^(0[0-9]{3}|1[0-2][0-9]{2}|13[0-2][0-9]|133[01])$
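A pattern like this is cheap to check exhaustively, since there are only 10,000 four-digit strings. The sketch below assumes Python's `re` module and wraps the alternatives in a group so that `^` and `$` apply to all of them; without the group, `^` binds only to the first alternative and `$` only to the last, so a string such as `01234` would slip through. The names `PATTERN` and `matches_range` are mine.

```python
import re

# Grouped so that ^ and $ apply to every alternative, not just the first and last.
PATTERN = re.compile(r"^(0[0-9]{3}|1[0-2][0-9]{2}|13[0-2][0-9]|133[01])$")


def matches_range(field: str) -> bool:
    """Return True if the pattern accepts the field."""
    return PATTERN.match(field) is not None


# Try every zero-padded four-digit string and keep the accepted ones.
accepted = [f"{n:04d}" for n in range(10000) if matches_range(f"{n:04d}")]
```

If the pattern is right, `accepted` should be exactly the strings `0000` through `1331`. Note that in Python `$` also matches just before a trailing newline, so `\Z` or `fullmatch` is the stricter choice when the input may carry one.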
Update:
Just on the elegance comment, there are many forms of elegance, of which regexes are one. You can also achieve elegance just by abstracting the validation out to a separate function or macro and then calling it from your code:
if isValidSomethingOrOther(str) ...
where SomethingOrOther is a concrete business object. This allows you to change your idea of a valid object easily, even using a regex as you desire or any other checks you deem appropriate (such as my function above).
This allows you to cater for any changes down the line, such as the requirement that these objects now have to be prime numbers.
I'm sure I could write a "prime-number-less-than-1332" regex. I'm equally sure I wouldn't want to - I'd prefer to code that up as a function (or lookup table for raw speed), especially since the regex would most likely just look like:
^(2|3|5|7| ... |1327)$
anyway.
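As a rough illustration of the function-or-lookup-table route, here is a Python sketch of the hypothetical prime-number requirement. It assumes the field is still a zero-padded four-character string; the names `is_prime` and `VALID_FIELDS` are mine, and trial division is just the simplest primality test that works at this size.

```python
def is_prime(n: int) -> bool:
    """Trial division; plenty fast for numbers below 1332."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


# Lookup table built once: every prime below 1332 as a zero-padded field.
VALID_FIELDS = frozenset(f"{n:04d}" for n in range(1332) if is_prime(n))


def is_valid_something_or_other(field: str) -> bool:
    """Same call site as before; only the definition of 'valid' has changed."""
    return field in VALID_FIELDS
```

The calling code, `if isValidSomethingOrOther(str) ...`, would not need to change at all, which is the point of abstracting the check out.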
This seems too easy; am I understanding the problem correctly?
[01][0-9]{3}
I don't know what .. means, integer in range? That must be a perlism or something.
To me, this seems to work the way you want:
In [3]: r = re.compile(r'[01][0-9]{3}')
In [4]: r.match('0001')
Out[4]: <_sre.SRE_Match object at 0x2fa2d30>
In [5]: r.match('1001')
Out[5]: <_sre.SRE_Match object at 0x2fa2cc8>
In [6]: r.match('2001')
In [7]: r.match('001')
In [8]:
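Two things are worth checking before relying on the session above. `re.match` anchors only at the start of the string, so a longer field can still match on its prefix, and `[01][0-9]{3}` accepts everything from 0000 to 1999, which is too wide if the upper bound is 1331 as the other answer assumes. A small sketch using `re.fullmatch` (available since Python 3.4); the helper name `accepts` is mine:

```python
import re

r = re.compile(r"[01][0-9]{3}")


def accepts(field: str) -> bool:
    """fullmatch() requires the whole string to match, unlike match()."""
    return r.fullmatch(field) is not None


prefix_only = r.match("00011") is not None  # match() is satisfied by the first four characters
whole_string = accepts("00011")             # fullmatch() rejects the five-character string
above_range = accepts("1999")               # still accepted: the pattern does not cap the value
```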