Question
I am new to Python and not sure what I should be looking for, but I assure you I have done my research and still came up with a rather ugly, 20-line block of code for this simple issue.
I am processing a traversal URL in my app, which is based on the Pyramid framework.
Now, the URL can be these: (url = None)
- url = ""
- url = "/"
- url = "/block_1"
- url = "/block_1/"
- url = "/block_1/block_2"
- url = "/block_1/block_2/"
The url can contain nothing. In this case, I want my function to return False, None, or an empty list or tuple. (Does not matter which.) (matching options 0 or 1)
Block_1: This is a single-word string of letters, a to Z. It cannot and should not contain any special characters. In fact, whatever is fetched as block_1 should be in a dictionary (or a list), and if it is not found, an error should be raised and returned. If block_1 is absent or not found, the function, as stated above, should return False, None, or an empty list or tuple. (matching options 2 and 3)
Block_2: Block_2 can be anything. For simplicity, it can contain any characters of any language, along with special characters such as ()[]. Excuse me if I'm mistaken, but I think what I want is basically for it to match [\pL\pN].* with one exception: its last character cannot be a slash of either kind, neither "\" nor "/". Preferably, I would like it to be a to Z (including all languages' alphabets and their accented characters) along with some other characters from a list that I define myself (as mentioned above: () and []). If block_2 is not given, it should have the value None, and if it is not matched, it should be False. (matching the last 2 options listed above)
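Note on the pattern above: Python's built-in re module does not understand \pL or \pN (the third-party regex module does). A rough standard-library approximation, assuming Python 3 where \w is Unicode-aware, could look like the sketch below. The names BLOCK_2_RE and is_valid_block_2 are made up for illustration, and the pattern is only one reading of the rule described.

```python
import re

# Approximation of [\pL\pN].* using only the standard library:
# [^\W_] matches one Unicode letter or digit (\w minus the underscore);
# the optional group allows any further characters as long as the last
# one is neither "/" nor "\".
BLOCK_2_RE = re.compile(r"[^\W_](?:.*[^/\\])?")


def is_valid_block_2(text):
    """Return True if text looks like an acceptable block_2 (hypothetical helper)."""
    return BLOCK_2_RE.fullmatch(text) is not None


print(is_valid_block_2("block_2"))      # True
print(is_valid_block_2("日本語(1)[2]"))  # True
print(is_valid_block_2("block_2/"))     # False
print(is_valid_block_2("(oops"))        # False
```

To restrict block_2 to letters plus a fixed set of extra characters, the `.*` part could be replaced with an explicit character class.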
My code, which is rather primitive (for which I apologise), starts with:
    if not url:
        return False
    # then goes on evaluating the first character to see if it's a /
    if fetch[0] == '/':
        length = len(url)
        # then checks if there's a second / for the block_2
        slash_2 = fetch.find('/', 3)  # or '/', 1
        if slash_2 == -1:
            block_1, block_2 = url[1:length].lower(), None
            # checks if block_1 is in a dictionary
            if not block_1 in the_dict:
                return False
        else:  # if it's there it processes what's remaining
            block_1 = fetch[1:slash_2]
            block_2 = fetch[slash_2+1:]
            # then checks if there's another slash at the end of block_2
            if block_2[-1] == '/':  # if so it removes it
                block_2 = block_2[:-1]
    return False  # otherwise returns false, which can be () or [] or None
I'm sorry if my code is terrible and overcomplicated. I would love nothing more than a more elegant way to do this.
So how can I do it? What should I do to get rid of these jammed lines of code?
Thank you.
Answer 1:
split('/') should definitely be used, and that should help you parse the URL.
If that is not sufficient, the urlparse module (urllib.parse in Python 3) can be used to parse it: urlparse.urlparse(path)
In [31]: url = 'http://stackoverflow.com/questions/12809298/how-can-i-separate-this-into-two-strings/12809315#12809315'
In [32]: urlparse.urlparse(url)
Out[32]: ParseResult(scheme='http', netloc='stackoverflow.com', path='/questions/12809298/how-can-i-separate-this-into-two-strings/12809315', params='', query='', fragment='12809315')
In [33]: a = urlparse.urlparse(url)
In [34]: a.path
Out[34]: '/questions/12809298/how-can-i-separate-this-into-two-strings/12809315'
In [35]: a.path.split('/')
Out[35]:
['',
 'questions',
 '12809298',
 'how-can-i-separate-this-into-two-strings',
 '12809315']
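The session above uses the Python 2 urlparse module. On Python 3 the same functions live in urllib.parse, so a roughly equivalent sketch would be the following. The list comprehension that drops empty segments is my own addition, not part of the session above.

```python
from urllib.parse import urlsplit

url = "http://stackoverflow.com/questions/12809298/how-can-i-separate-this-into-two-strings/12809315#12809315"

# urlsplit separates scheme, host, path, query and fragment.
path = urlsplit(url).path

# Drop empty strings produced by leading, trailing or doubled slashes.
segments = [part for part in path.split("/") if part]
print(segments)
# ['questions', '12809298', 'how-can-i-separate-this-into-two-strings', '12809315']
```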
Answer 2:
The first thing I would try is the .split() string function:
>>> url = "/block_1/block_2"
>>> url.split("/")
['', 'block_1', 'block_2']
This will return a list of the components of the string that were separated by the / character. From there, you can use the len() function to find out the length of the list, and take the appropriate action according to your desired logic.
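As a rough sketch of that logic, something like the following might work. parse_url and the contents of the_dict are placeholders made up for illustration, and the return values are only one reading of the rules in the question: None for an empty URL or an unknown block_1, block_2 set to None when absent and False when it ends in a backslash.

```python
# Hypothetical lookup table standing in for the asker's the_dict.
the_dict = {"block_1": "some value"}


def parse_url(url, known=the_dict):
    """Split a traversal URL into (block_1, block_2).

    Returns None when the URL is empty or block_1 is unknown. block_2 is
    None when it is absent and False when it ends in a backslash.
    """
    if not url:
        return None
    # strip("/") removes the leading slash and any trailing one;
    # maxsplit=1 keeps any further slashes inside block_2.
    parts = url.strip("/").split("/", 1)
    block_1 = parts[0].lower()
    if block_1 not in known:
        return None
    if len(parts) == 1:
        return block_1, None
    block_2 = parts[1]
    if block_2.endswith("\\"):
        return block_1, False
    return block_1, block_2


print(parse_url("/"))                  # None
print(parse_url("/block_1/"))          # ('block_1', None)
print(parse_url("/block_1/block_2/"))  # ('block_1', 'block_2')
```

Unlike the code in the question, this lowercases block_1 in every branch; drop the .lower() call if the lookup should be case-sensitive.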
Source: https://stackoverflow.com/questions/12809298/how-can-i-separate-this-into-two-strings