I have a string from which I want to extract three groups:
'19 janvier 2012' -> '19', 'janvier', '2012'
The month name could contain non-ASCII characters.
You can construct a new character class, [^\W\d_], and use it instead of \w. Translated into English, it means "any character that is not a non-alphanumeric character ([^\W] is the same as \w), but that is also not a digit and not an underscore".
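Therefore, it will only allow Unicode letters (in Python 2, provided you use the re.UNICODE compile option; in Python 3, str patterns are Unicode-aware by default, so the flag is redundant but harmless).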
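As a minimal sketch of how this could be applied to the date string from the question: the pattern below assumes a day of one or two digits, a single space between the parts, and a four-digit year. The names DATE_RE and split_date are made up for illustration.

```python
import re

# [^\W\d_]+ matches a run of word characters that are neither digits
# nor underscores, which in practice means letters (including accented ones).
# Assumed layout: "<day> <month name> <4-digit year>", separated by single spaces.
DATE_RE = re.compile(r"(\d{1,2}) ([^\W\d_]+) (\d{4})", re.UNICODE)


def split_date(text):
    """Return (day, month, year) as strings, or None if nothing matches."""
    match = DATE_RE.search(text)
    return match.groups() if match else None


print(split_date("19 janvier 2012"))  # ('19', 'janvier', '2012')
print(split_date("3 février 2012"))   # ('3', 'février', '2012')
```

Under the same assumptions, a month written with a digit or an underscore in it (for example "jan_vier") is rejected, because neither character belongs to the class.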