Parsing URI parameter and keyword value pairs

情话喂你 2021-01-22 12:45

I would like to parse the parameters and keyword values from URIs/URLs in a text file. Parameters without values should also be included. Python is fine, but I am open to suggestions.

3 answers
  •  终归单人心
    2021-01-22 13:42

    I would use a regular expression like this (first code then explanation):

    import re

    s = 'http://example.com/?a=1&b=two\n'  # sample input; replace with your file's text
    pairs = re.findall(r'(\w+)=(.*?)(?:\n|&)', s, re.S)
    for k, v in pairs:
        print('{0} = {1}'.format(k, v))
    

    The re.findall line is where the action happens. The regular expression finds every occurrence of a word followed by an equal sign and then a string terminated by either an & or a newline character. The returned pairs is a list of tuples, where each tuple contains the word (the keyword) and the value. I didn't capture the = sign; instead I print it in the loop.

    Explaining the regex:

    \w+ means one or more word characters. The parentheses around it mean that it is captured and returned as part of the result.

    = - the equal sign that must follow the word

    .*? - zero or more characters matched non-greedily, that is, only up to the first newline or & sign, which is what \n|& designates. The (?:...) group is non-capturing, so the \n or & is matched but not returned.
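
    To make the non-greedy and non-capturing parts concrete, here is a small sketch that runs three variants of the pattern on a made-up string (the `sample` value and the variable names are only for illustration):

```python
import re

sample = "a=1&b=2\n"  # made-up input for illustration

# Lazy value: stops at the first & or newline.
lazy = re.findall(r'(\w+)=(.*?)(?:\n|&)', sample)

# Greedy value: runs on to the last possible terminator.
greedy = re.findall(r'(\w+)=(.*)(?:\n|&)', sample)

# Capturing the terminator adds a third element to every tuple.
with_terminator = re.findall(r'(\w+)=(.*?)(\n|&)', sample)

print(lazy)
print(greedy)
print(with_terminator)
```

    The lazy version yields one tuple per pair, the greedy version swallows `1&b=2` as a single value, and the capturing version returns the `&` or newline alongside each pair.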

    Since we capture 2 things in the regex - the keyword and everything after the = sign, a list of 2-tuples is returned.
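
    Note that this pattern only returns keywords that are followed by an = sign, so a parameter without a value is skipped, and a final pair with no trailing newline has no terminator to match. If those cases matter, one possible alternative (not the regex above, but the standard library's `urllib.parse`) is sketched below. The helper name `query_pairs` and the sample URLs are assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qsl

def query_pairs(url):
    """Return (keyword, value) pairs from the query string of one URL."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps parameters that have no value
    return parse_qsl(query, keep_blank_values=True)

# Made-up URLs standing in for the lines of the text file.
lines = [
    "http://example.com/search?q=regex&debug&page=2",
    "http://example.com/item?id=42",
]

results = [query_pairs(line.strip()) for line in lines]
for pairs in results:
    for k, v in pairs:
        print('{0} = {1}'.format(k, v))
```

    Here the valueless `debug` parameter comes back with an empty string as its value. Be aware that `parse_qsl` also decodes percent-escapes and turns `+` into a space, which may or may not be what you want.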

    The re.S flag tells the regex engine to let the match-anything character . match the newline character as well, so a match can span multiple lines (which is not the default behavior). In this particular pattern the lazy .*? already stops at the first \n, so the flag has no practical effect here.
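
    A minimal illustration of what re.S changes in general, using a made-up two-line string:

```python
import re

text = "a\nb"  # made-up input with a newline between the letters

without_flag = re.findall(r'a.b', text)     # . does not match \n by default
with_flag = re.findall(r'a.b', text, re.S)  # with re.S, . matches \n as well

print(without_flag)
print(with_flag)
```

    Without the flag nothing is found; with it, the whole three-character string matches.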
