Can I use a python regex for letters, dashes and underscores?

余生颓废 submitted on 2020-01-05 09:25:09

Question


I want to handle geographic names, e.g. /new_york or /new-york. Since new-york is what django slugify produces for "New york", maybe I should use the slugified names, even though names with underscores look better, because I may want to automate URL creation with an algorithm such as django slugify. My guess is that ([A-Za-z]+) or simply ([\w-]+) can work, but to be safe I'm asking which regex is the best choice in this case. I already have a regex that handles numbers, connecting them to a class:

('/([0-9]*)', ById)  # fetches and displays an entity by id

Now I want another regex to match names, e.g. new_york, so that a request for /new_york gets handled by the appropriate handler. Basically I want the negation of the regex above, or any combination of letters and underscores, and maybe a dash, since the names are geographical. It seems I could use the regex below, but I believe it works only because of precedence, in that it just takes everything:

('/(.*)', ByName)  # handles e.g. /new_york entities, /sao_paulo entities etc. by custom mapping for my relevant places

Since I have other request handlers and I don't want conflicting regexes, could you recommend how to formulate the regex?

How does it work when a path matches two regexes? Which has higher precedence? Can you also tell me more about how I should learn to write regexes, and about possible implementations for the geographical datastore (as entities or as instance variables)? There are also special problems, such as geographic locations that have different names in different languages, e.g. Germany is called Deutschland in German, so I also want to apply translations, which I can do with gettext / django.po files.


Answer 1:


The first match wins.

Usually your URLs will differ in other parts of the path. For example, you might have

/cities/(?P<city>[^/]+)
/users/(?P<user>[^/]+)

and in many cases [^/]+ is a good regex because it will match anything except /, which you would normally avoid in a name because it is used to separate path elements.
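To make the behaviour of [^/]+ concrete, here is a minimal sketch using Python's standard re module. The helper name capture and the sample paths are made up for illustration, and the use of fullmatch assumes that the framework anchors route patterns to the whole path, which you should check for your own framework.

```python
import re

# The two example routes from above, compiled as plain regexes.
CITY = re.compile(r"/cities/(?P<city>[^/]+)")
USER = re.compile(r"/users/(?P<user>[^/]+)")

def capture(pattern, path):
    """Return the named groups if the WHOLE path matches, else None.

    Hypothetical helper: fullmatch stands in for a framework that
    anchors its route patterns at both ends.
    """
    m = pattern.fullmatch(path)
    return m.groupdict() if m else None

print(capture(CITY, "/cities/new-york"))   # dashes are fine
print(capture(CITY, "/cities/sao_paulo"))  # so are underscores
print(capture(CITY, "/cities/new/york"))   # None: [^/]+ stops at the slash
print(capture(USER, "/cities/new-york"))   # None: different prefix
```

Because the prefix differs between the two routes, the same name can appear under both /cities/ and /users/ without any conflict.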

I don't think it's a good idea to separate URLs based solely on characters (in your case, letters or digits), but if you want to do that, use [-A-Za-z_]+ (note that the "-" goes at the start or the end of the [], or it needs a backslash; anywhere else it would be read as a range).
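A small sketch of that character class, again with the standard re module. The function name is_name and the sample segments are assumptions for illustration only.

```python
import re

# Letters, dash and underscore only. The dash comes first so it is
# taken literally instead of forming a range.
NAME = re.compile(r"[-A-Za-z_]+")

# Equivalent spellings: dash last, or dash escaped with a backslash.
NAME_DASH_LAST = re.compile(r"[A-Za-z_-]+")
NAME_ESCAPED = re.compile(r"[A-Za-z\-_]+")

def is_name(segment, pattern=NAME):
    """True if the whole segment consists of letters, '-' and '_'."""
    return pattern.fullmatch(segment) is not None

for segment in ["new-york", "new_york", "route66", "new york", ""]:
    print(repr(segment), is_name(segment))
```

Only the first two segments are accepted: route66 contains digits, "new york" contains a space, and the empty string fails because + requires at least one character.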

Avoid \w because it also matches digits (as well as letters and the underscore). The exception is if you want to go really crazy and send digits only to one handler and letters+digits elsewhere, in which case use:

/(?P<id>\d+)
/(?P<city>[-\w]+)

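in that order, so that digits-only paths reach the id handler before the broader pattern gets a chance to match them.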
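The "first match wins" rule and the ordering above can be sketched with a tiny hand-rolled dispatcher. This is not any particular framework's router: the ROUTES list, the dispatch function and the handler names as plain strings are all assumptions made for the example, and fullmatch again assumes patterns are anchored to the whole path.

```python
import re

# Order matters: the narrow digits-only route is listed first.
ROUTES = [
    (re.compile(r"/(?P<id>\d+)"), "ById"),
    (re.compile(r"/(?P<city>[-\w]+)"), "ByName"),
]

def dispatch(path, routes=ROUTES):
    """Return (handler_name, groups) for the first route that matches."""
    for pattern, handler in routes:
        m = pattern.fullmatch(path)
        if m:
            return handler, m.groupdict()
    return None  # no route matched

print(dispatch("/42"))        # digits only: handled by ById
print(dispatch("/new_york"))  # handled by ByName
print(dispatch("/area51"))    # letters+digits: falls through to ByName

# With the order reversed, the broad pattern swallows the numeric path.
print(dispatch("/42", list(reversed(ROUTES))))
```

With the original order, /42 goes to ById; with the routes reversed, the same path ends up at ByName because [-\w]+ also accepts digits. Note that for Python 3 str patterns \w matches Unicode letters as well, so a segment like são_paulo would be accepted by the second route.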



Source: https://stackoverflow.com/questions/7139000/can-i-use-a-python-regex-for-letters-dashes-and-underscores
