How do I validate that a value is equal to the UUID4 generated by this code?
uuid.uuid4().hex
Should it be some regular expression? The values
Easy enough:
import re
uuid4hex = re.compile(r'[0-9a-f]{32}\Z', re.I)
This matches only strings that are exactly 32 hexadecimal characters, provided you use the .match() method, which anchors at the start of the string (see .search() vs. .match()). The \Z matches only at the very end of the string, whereas $ would also match just before a trailing newline.
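A small sketch of the difference between \Z and $, and between .match() and .search(). The name dollar_variant is made up for this comparison and is not part of the answer above:

```python
import re
import uuid

uuid4hex = re.compile(r'[0-9a-f]{32}\Z', re.I)
# Hypothetical variant that uses $ instead of \Z, for comparison only.
dollar_variant = re.compile(r'[0-9a-f]{32}$', re.I)

value = uuid.uuid4().hex

print(bool(uuid4hex.match(value)))               # True
print(bool(uuid4hex.match(value + '\n')))        # False: \Z rejects the trailing newline
print(bool(dollar_variant.match(value + '\n')))  # True: $ matches just before it
print(bool(uuid4hex.match('x' + value)))         # False: .match() anchors at the start
print(bool(uuid4hex.search('x' + value)))        # True: .search() scans forward
```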
As a note on performance: I timed both approaches, and the regex validation was slightly faster:
import re
from uuid import UUID

def _validate_uuid4(uuid_string):
    # Note: version=4 overwrites the version bits instead of checking them,
    # so this only rejects strings that are not well-formed UUIDs at all.
    try:
        UUID(uuid_string, version=4)
    except ValueError:
        return False
    return True

def _validate_uuid4_re(uuid_string):
    uuid4hex = re.compile(r'^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}\Z', re.I)
    match = uuid4hex.match(uuid_string)
    return bool(match)
In IPython:
In [58]: val = str(uuid.uuid4())

In [59]: %time _validate_uuid4(val)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 30.3 µs
Out[59]: True

In [60]: %time _validate_uuid4_re(val)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 25.3 µs
Out[60]: True

In [61]: val = "invalid_uuid"

In [62]: %time _validate_uuid4(val)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 29.3 µs
Out[62]: False

In [63]: %time _validate_uuid4_re(val)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 25.5 µs
Out[63]: False
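A single %time run is noisy at the microsecond scale, so a difference of a few µs may not mean much. Here is a rough sketch of how the comparison could be repeated with timeit over many calls. The names UUID4_RE, validate_with_class, validate_with_regex and benchmark are my own, and the results will vary by machine and Python version:

```python
import re
import timeit
import uuid
from uuid import UUID

# Compiled once at import time instead of on every call.
UUID4_RE = re.compile(
    r'^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}\Z',
    re.I,
)

def validate_with_class(uuid_string):
    try:
        UUID(uuid_string, version=4)
    except ValueError:
        return False
    return True

def validate_with_regex(uuid_string):
    return bool(UUID4_RE.match(uuid_string))

def benchmark(func, value, number=100_000):
    """Return the average time per call in microseconds."""
    total = timeit.timeit(lambda: func(value), number=number)
    return total / number * 1e6

if __name__ == '__main__':
    for value in (str(uuid.uuid4()), 'invalid_uuid'):
        for func in (validate_with_class, validate_with_regex):
            avg = benchmark(func, value)
            print(f'{func.__name__}({value!r}): {avg:.2f} µs per call')
```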
As far as I know, Martijn's answer is not 100% correct. A UUID4 has five groups of hexadecimal characters: the first has 8 characters, the second 4, the third 4, the fourth 4, and the fifth 12.
However, to be a valid UUID4, the third group (the one in the middle) must start with a 4:
00000000-0000-4000-0000-000000000000
              ^
And the fourth group must start with 8, 9, a or b:
00000000-0000-4000-a000-000000000000
              ^    ^
So you have to change Martijn's regex to:
import re
uuid4hex = re.compile(r'[0-9a-f]{12}4[0-9a-f]{3}[89ab][0-9a-f]{15}\Z', re.I)
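To see the effect of the stricter pattern, here is a quick sketch that compares it with the plain 32-hex-character pattern on a version 4 and a version 5 UUID. The names loose and strict are just labels for this example:

```python
import re
import uuid

loose = re.compile(r'[0-9a-f]{32}\Z', re.I)
strict = re.compile(r'[0-9a-f]{12}4[0-9a-f]{3}[89ab][0-9a-f]{15}\Z', re.I)

v4_hex = uuid.uuid4().hex
# uuid5 is name-based, so its version digit is 5, not 4.
v5_hex = uuid.uuid5(uuid.NAMESPACE_DNS, 'example.org').hex

print(bool(loose.match(v4_hex)), bool(strict.match(v4_hex)))  # True True
print(bool(loose.match(v5_hex)), bool(strict.match(v5_hex)))  # True False
```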
Hope this helps!
To be more specific: this is a more precise regex that matches a UUID4 both with and without dashes and follows all the UUID4 rules:
[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}
You can also match capital letters by ignoring case, which I do in my example with re.I. (The uuid module outputs lowercase only, but uppercase input is accepted as well: in a UUID, "f" and "F" mean the same thing.)
I created a validator that looks like this:
def valid_uuid(uuid):
    regex = re.compile(r'^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}\Z', re.I)
    match = regex.match(uuid)
    return bool(match)
Then you can do:
if valid_uuid(my_uuid):
    # Do stuff with valid my_uuid
    pass
With ^ at the start and \Z at the end, I also make sure there is nothing else in the string. This ensures that "3fc3d0e9-1efb-4eef-ace6-d9d59b62fec5" returns True, but "3fc3d0e9-1efb-4eef-ace6-d9d59b62fec5+19187" returns False.
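A few sample inputs, as a sketch of what this pattern accepts and rejects. Note that each dash is optional on its own, so a partially dashed string also passes. UUID4_RE and samples are names I picked for the example, and the parameter is called value here only to avoid shadowing the uuid module:

```python
import re

UUID4_RE = re.compile(
    r'^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}\Z',
    re.I,
)

def valid_uuid(value):
    return bool(UUID4_RE.match(value))

samples = [
    '3fc3d0e9-1efb-4eef-ace6-d9d59b62fec5',        # canonical form
    '3fc3d0e91efb4eeface6d9d59b62fec5',            # no dashes
    '3FC3D0E9-1EFB-4EEF-ACE6-D9D59B62FEC5',        # upper case
    '3fc3d0e9-1efb4eef-ace6d9d59b62fec5',          # only some dashes: still accepted
    '3fc3d0e9-1efb-4eef-ace6-d9d59b62fec5+19187',  # trailing junk: rejected
    '3fc3d0e9-1efb-1eef-ace6-d9d59b62fec5',        # version digit is 1: rejected
]

for sample in samples:
    print(f'{sample!r}: {valid_uuid(sample)}')
```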
Update: the Python approach below is not foolproof (see comments):
There are other ways to validate a UUID. In Python, do:
from uuid import UUID

try:
    UUID(my_uuid)
    # my_uuid is valid and you can use it
except ValueError:
    # do what you need when my_uuid is not a uuid
    pass
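One reason the UUID class alone is not foolproof, as far as I can tell: the constructor accepts braces and a urn:uuid: prefix, accepts any version, and with version=4 it overwrites the version bits rather than checking them. A possible stricter check is sketched below. The helper is_uuid4 is hypothetical, not something the uuid module provides:

```python
from uuid import UUID

def is_uuid4(value):
    """Hypothetical helper: True only for a plain version 4 UUID string."""
    try:
        parsed = UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # Require version 4 and a round trip to either the dashed or the hex form,
    # which rules out braces, "urn:uuid:" prefixes and oddly placed dashes.
    return parsed.version == 4 and value.lower() in (str(parsed), parsed.hex)

v1_like = '3fc3d0e9-1efb-1eef-ace6-d9d59b62fec5'

# The constructor silently rewrites the version digit when version=4 is passed.
print(UUID(v1_like, version=4))    # 3fc3d0e9-1efb-4eef-ace6-d9d59b62fec5

print(is_uuid4('3fc3d0e9-1efb-4eef-ace6-d9d59b62fec5'))    # True
print(is_uuid4('3fc3d0e91efb4eeface6d9d59b62fec5'))        # True
print(is_uuid4('{3fc3d0e9-1efb-4eef-ace6-d9d59b62fec5}'))  # False
print(is_uuid4(v1_like))                                   # False
```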