I\'m trying to use the range pattern [01-12]
in regex to match two digit mm, but this doesn\'t work as expected.
This also works:
^([1-9]|[0-1][0-2])$
[1-9]
matches single digits between 1 and 9
[0-1][0-2]
matches double digits between 10 and 12
There are some good examples here
As polygenelubricants says yours would look for 0|1-1|2 rather than what you wish for, due to the fact that character classes (things in []) match characters rather than strings.
A character class in regular expressions, denoted by the [...]
syntax, specifies the rules to match a single character in the input. As such, everything you write between the brackets specify how to match a single character.
Your pattern, [01-12]
is thus broken down as follows:
So basically all you're matching is 0, 1 or 2.
In order to do the matching you want, matching two digits, ranging from 01-12 as numbers, you need to think about how they will look as text.
You have:
You will then have to write a regular expression for that, which can look like this:
+-- a 0 followed by 1-9
|
| +-- a 1 followed by 0-2
| |
<-+--> <-+-->
0[1-9]|1[0-2]
^
|
+-- vertical bar, this roughly means "OR" in this context
Note that trying to combine them in order to get a shorter expression will fail, by giving false positive matches for invalid input.
For instance, the pattern [0-1][0-9]
would basically match the numbers 00-19, which is a bit more than what you want.
I tried finding a definite source for more information about character classes, but for now all I can give you is this Google Query for Regex Character Classes. Hopefully you'll be able to find some more information there to help you.
Use this:
0?[1-9]|1[012]
To test a pattern as 07/2018 use this:
/^(0?[1-9]|1[012])\/([2-9][0-9]{3})$/
(Date range between 01/2000 to 12/9999 )
You seem to have misunderstood how character classes definition works in regex.
To match any of the strings 01
, 02
, 03
, 04
, 05
, 06
, 07
, 08
, 09
, 10
, 11
, or 12
, something like this works:
0[1-9]|1[0-2]
A character class, by itself, attempts to match one and exactly one character from the input string. [01-12]
actually defines [012]
, a character class that matches one character from the input against any of the 3 characters 0
, 1
, or 2
.
The -
range definition goes from 1
to 1
, which includes just 1
. On the other hand, something like [1-9]
includes 1
, 2
, 3
, 4
, 5
, 6
, 7
, 8
, 9
.
Beginners often make the mistakes of defining things like [this|that]
. This doesn't "work". This character definition defines [this|a]
, i.e. it matches one character from the input against any of 6 characters in t
, h
, i
, s
, |
or a
. More than likely (this|that)
is what is intended.
So it's obvious now that a pattern like between [24-48] hours
doesn't "work". The character class in this case is equivalent to [248]
.
That is, -
in a character class definition doesn't define numeric range in the pattern. Regex engines doesn't really "understand" numbers in the pattern, with the exception of finite repetition syntax (e.g. a{3,5}
matches between 3 and 5 a
).
Range definition instead uses ASCII/Unicode encoding of the characters to define ranges. The character 0
is encoded in ASCII as decimal 48; 9
is 57. Thus, the character definition [0-9]
includes all character whose values are between decimal 48 and 57 in the encoding. Rather sensibly, by design these are the characters 0
, 1
, ..., 9
.
Let's take a look at another common character class definition [a-zA-Z]
In ASCII:
A
= 65, Z
= 90a
= 97, z
= 122This means that:
[a-zA-Z]
and [A-Za-z]
are equivalent[a-Z]
is likely to be an illegal character range
a
(97) is "greater than" than Z
(90)[A-z]
is legal, but also includes these six characters:
[
(91), \
(92), ]
(93), ^
(94), _
(95), `
(96)The []
s in a regex denote a character class. If no ranges are specified, it implicitly ors every character within it together. Thus, [abcde]
is the same as (a|b|c|d|e)
, except that it doesn't capture anything; it will match any one of a
, b
, c
, d
, or e
. All a range indicates is a set of characters; [ac-eg]
says "match any one of: a
; any character between c
and e
; or g
". Thus, your match says "match any one of: 0
; any character between 1
and 1
(i.e., just 1
); or 2
.
Your goal is evidently to specify a number range: any number between 01
and 12
written with two digits. In this specific case, you can match it with 0[1-9]|1[0-2]
: either a 0
followed by any digit between 1
and 9
, or a 1
followed by any digit between 0
and 2
. In general, you can transform any number range into a valid regex in a similar manner. There may be a better option than regular expressions, however, or an existing function or module which can construct the regex for you. It depends on your language.