Question
I want to handle geographic names in URLs, e.g. /new_york or /new-york.
Since new-york is what Django's slugify produces for "New york", maybe I should use the slugified names even though names with underscores look better, because I may want to automate URL creation with an algorithm such as Django's slugify. My guess is that ([A-Za-z]+)
or simply ([\w-]+)
can work, but to be safe I'm asking which regex is the best choice in this case.
I already have a regex that maps numeric paths to a handler class:
('/([0-9]*)',ById)
#fetches and displays an entity by id
Now I want another regex to match names, e.g. new_york, so that a request for /new_york gets handled by the appropriate handler. Basically it would be the negation of the regex above, or any combination of letters and underscores, and maybe a dash (-), since the names are geographical. It seems I could use the following regex, but I believe it works only because of precedence, in that it just takes everything:
('/(.*)', ByName)
#Handle for instance /new_york entities, /sao_paulo entities etc by custom mapping for my relevant places.
Since I have other request handlers and I don't want conflicting regexes, could you recommend how to formulate the regex?
How does it work when a path matches two regexes? Which has higher precedence? Can you also tell me more about how I should learn to write regexes, and about possible implementations for the geographical datastore (as entities or instance variables), including special problems such as geographic locations that have different names in different languages? For example, Germany is called Deutschland in German, so I also want to apply translations, which I can do with gettext / django.po files.
Answer 1:
The first match wins.
Usually your URLs will differ in other parts of the path. For example, you might have
/cities/(?P<city>[^/]+)
/users/(?P<user>[^/]+)
and in many cases [^/]+ is a good regex because it will match anything except /, which you would normally avoid because it is used to separate path elements.
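As a rough illustration of the prefix-based layout above, here is a minimal sketch using Python's re module. The ROUTES table, the dispatch helper, and the handler names CityHandler and UserHandler are hypothetical and not part of any framework; the sketch also assumes the whole path must match the pattern.

```python
import re

# Hypothetical route table: (compiled pattern, handler name) pairs.
ROUTES = [
    (re.compile(r"/cities/(?P<city>[^/]+)"), "CityHandler"),
    (re.compile(r"/users/(?P<user>[^/]+)"), "UserHandler"),
]

def dispatch(path):
    """Return (handler name, captured groups) for the first route that matches the whole path."""
    for pattern, handler in ROUTES:
        match = pattern.fullmatch(path)
        if match:
            return handler, match.groupdict()
    return None, {}

print(dispatch("/cities/new-york"))        # ('CityHandler', {'city': 'new-york'})
print(dispatch("/users/sao_paulo"))        # ('UserHandler', {'user': 'sao_paulo'})
print(dispatch("/cities/new-york/extra"))  # (None, {}) because [^/]+ stops at "/"
```

Because each route has its own prefix, the two patterns can never match the same path, so their order in the table does not matter here.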
I don't think it's a good idea to separate URLs based solely on characters (in your case, letters or digits), but if you want to do that, use [-A-Za-z_]+
(note that the "-" goes at the start or end of the [], or else it needs a backslash).
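A quick way to check the character class is to try it against a few sample names. This is only a sketch: the variable names name_re and escaped_re and the sample strings are made up for illustration.

```python
import re

# Hyphen placed first in the class, so it is a literal "-" and not a range.
name_re = re.compile(r"[-A-Za-z_]+")
# Same set of characters, with the hyphen escaped in the middle of the class.
escaped_re = re.compile(r"[A-Za-z\-_]+")

for candidate in ["new-york", "new_york", "sao_paulo", "route66", "123"]:
    print(candidate, bool(name_re.fullmatch(candidate)), bool(escaped_re.fullmatch(candidate)))
# new-york True True
# new_york True True
# sao_paulo True True
# route66 False False
# 123 False False
```

Both spellings accept letters, dashes and underscores and reject anything containing a digit, which is what keeps these names apart from a numeric id route.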
Avoid \w
because that can also match digits, unless you want to go really crazy and send digits only to one handler and letters+digits elsewhere, in which case use:
/(?P<id>\d+)
/(?P<city>[-\w]+)
in that order.
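To see why the order matters, here is a small sketch that tries the two patterns in both orders. The first_match helper is hypothetical, the handler names ById and ByName are borrowed from the question, and the sketch assumes the router tries patterns in list order against the whole path.

```python
import re

# Digits-only route first, then the broader letters+digits route.
ROUTES = [
    (re.compile(r"/(?P<id>\d+)"), "ById"),
    (re.compile(r"/(?P<city>[-\w]+)"), "ByName"),
]

def first_match(path, routes):
    """Return the handler name of the first route whose pattern matches the whole path."""
    for pattern, handler in routes:
        if pattern.fullmatch(path):
            return handler
    return None

print(first_match("/42", ROUTES))                  # ById
print(first_match("/new_york", ROUTES))            # ByName
print(first_match("/42", list(reversed(ROUTES))))  # ByName, since [-\w]+ also matches digits
```

With the order reversed, the broader pattern swallows numeric paths and the id route is never reached.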
Source: https://stackoverflow.com/questions/7139000/can-i-use-a-python-regex-for-letters-dashes-and-underscores