Parsing URI parameter and keyword value pairs

情话喂你 2021-01-22 12:45

I would like to parse the parameter and keyword values from the URIs/URLs in a text file. Parameters without values should also be included. Python is fine, but I am open to suggestions.

3 answers
  •  夕颜 (original poster)
    2021-01-22 13:45

    You can use a regular expression to extract all the pairs.

    >>> import re
    >>> url = 'www2.domain.edu/folder/folder/page.php?l=user&x=0&id=1&page=http%3A//domain.com/page.html&unique=123456&refer=http%3A//domain2.net/results.aspx%3Fq%3Dbob+test+1.21+some%26file%3Dname&text='
    >>> # [?&] also matches the '?' that starts the query, so the first pair is kept;
    >>> # the optional (?:=...)? group keeps parameters that have no value at all.
    >>> p = re.compile(r'[?&]([^=&]+)(?:=([^&]*))?')
    >>> m = p.findall(url)
    >>> m
    [('l', 'user'), ('x', '0'), ('id', '1'), ('page', 'http%3A//domain.com/page.html'), ('unique', '123456'), ('refer', 'http%3A//domain2.net/results.aspx%3Fq%3Dbob+test+1.21+some%26file%3Dname'), ('text', '')]


    You can even use a dict comprehension to package all the data together.

    >>> dic = {k: v for k, v in m}
    >>> dic
    {'l': 'user', 'x': '0', 'id': '1', 'page': 'http%3A//domain.com/page.html', 'unique': '123456', 'refer': 'http%3A//domain2.net/results.aspx%3Fq%3Dbob+test+1.21+some%26file%3Dname', 'text': ''}


    And then if all you want to do is print them out (Python 3):

    >>> for k, v in dic.items():
    ...     print(k, '-->', v)
    ...
    l --> user
    x --> 0
    id --> 1
    page --> http%3A//domain.com/page.html
    unique --> 123456
    refer --> http%3A//domain2.net/results.aspx%3Fq%3Dbob+test+1.21+some%26file%3Dname
    text --> 
    
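    If a regular expression feels fragile, the standard library's urllib.parse module may be worth a try. The sketch below is only one possible approach, not part of the answer above: the helper names query_pairs and pairs_per_line are made up for illustration, and it assumes the text file holds one URL per line. Unlike the regex, parse_qsl percent-decodes keys and values and turns '+' into a space. With keep_blank_values=True it also keeps parameters that have an empty value or no '=' at all.

```python
from urllib.parse import parse_qsl


def query_pairs(url):
    """Return the (key, value) pairs of url's query string, percent-decoded.

    Hypothetical helper. The sample URLs have no scheme, so the query is
    taken as everything after the first '?' rather than via urlsplit().
    """
    _, _, query = url.partition('?')
    query = query.split('#', 1)[0]  # drop a trailing #fragment, if any
    # keep_blank_values=True keeps 'text=' and bare 'flag' as empty strings.
    return parse_qsl(query, keep_blank_values=True)


def pairs_per_line(lines):
    """Yield (url, pairs) for every non-empty line of an iterable of lines.

    Assumes one URL per line; an open text file can be passed directly.
    """
    for line in lines:
        url = line.strip()
        if url:
            yield url, query_pairs(url)


if __name__ == '__main__':
    sample = ('www2.domain.edu/folder/folder/page.php'
              '?l=user&x=0&page=http%3A//domain.com/page.html&text=&flag')
    for key, value in query_pairs(sample):
        print(key, '-->', value)
```

    To process a file, something like `with open(path) as fh: for url, pairs in pairs_per_line(fh): ...` should work, where path is whatever file holds the URLs. Repeated keys are preserved because the result is a list of pairs; converting it with dict() would keep only the last value for each key.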
