Find the value of an input field in an HTML doc using Python

青春惊慌失措 2021-01-19 18:53

I am trying to read input values from an HTML document and want to parse out the values of the hidden input fields. For example, how can I parse out only the value from the snippet bel

2 answers
  • 2021-01-19 19:27

    You could use BeautifulSoup:

    >>> htmlstr = """    <input type="hidden" autocomplete="off" id="post_form_id" name="post_form_id" value="d619a1eb3becdc05a3ebea530396782f" />
    ...     <input type="hidden" name="fb_dtsg" value="AQCYsohu" autocomplete="off" />"""
    >>> from bs4 import BeautifulSoup  # BeautifulSoup 4; the old `BeautifulSoup` module only runs on Python 2
    >>> soup = BeautifulSoup(htmlstr, 'html.parser')
    >>> [(n['name'], n['value']) for n in soup.find_all('input')]
    [('post_form_id', 'd619a1eb3becdc05a3ebea530396782f'), ('fb_dtsg', 'AQCYsohu')]
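
    If only one field's value is needed, a minimal sketch along these lines may help. It assumes BeautifulSoup 4 (the `bs4` package) is installed; `hidden_value` is a hypothetical helper name, not part of the library.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

htmlstr = """
<input type="hidden" autocomplete="off" id="post_form_id" name="post_form_id" value="d619a1eb3becdc05a3ebea530396782f" />
<input type="hidden" name="fb_dtsg" value="AQCYsohu" autocomplete="off" />
"""


def hidden_value(html, field_name):
    """Return the value of the hidden input named field_name, or None.

    hidden_value is an illustrative helper, not a BeautifulSoup API.
    """
    soup = BeautifulSoup(html, "html.parser")
    # attrs= matches on HTML attributes; passing name= as a keyword would
    # clash with find()'s own `name` parameter (the tag name).
    tag = soup.find("input", attrs={"type": "hidden", "name": field_name})
    # .get() returns None instead of raising KeyError if `value` is absent.
    return tag.get("value") if tag is not None else None


print(hidden_value(htmlstr, "fb_dtsg"))  # AQCYsohu
```

    Returning None for a missing field keeps the caller from having to catch an exception when the page layout changes.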
    
  • 2021-01-19 19:30

    Or with lxml:

    import lxml.html
    
    htmlstr = '''
        <input type="hidden" autocomplete="off" id="post_form_id" name="post_form_id" value="d619a1eb3becdc05a3ebea530396782f" />
        <input type="hidden" name="fb_dtsg" value="AQCYsohu" autocomplete="off" />
    '''
    
    # Parse the string and turn it into a tree of elements
    htmltree = lxml.html.fromstring(htmlstr)
    
    # Iterate over each input element in the tree and print the relevant attributes
    for input_el in htmltree.xpath('//input'):
        name = input_el.attrib['name']
        value = input_el.attrib['value']
    
        print("%s : %s" % (name, value))
    

    Gives:

    post_form_id : d619a1eb3becdc05a3ebea530396782f
    fb_dtsg : AQCYsohu
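
    If adding a third-party dependency is not an option, roughly the same result can be sketched with the standard library's `html.parser` module. This is only an illustration: `HiddenInputCollector` is a made-up class name, and unlike the loop above it keeps only inputs with `type="hidden"`.

```python
from html.parser import HTMLParser

htmlstr = '''
    <input type="hidden" autocomplete="off" id="post_form_id" name="post_form_id" value="d619a1eb3becdc05a3ebea530396782f" />
    <input type="hidden" name="fb_dtsg" value="AQCYsohu" autocomplete="off" />
'''


class HiddenInputCollector(HTMLParser):
    """Collect name -> value for every hidden <input> (illustrative class)."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; self-closing tags
        # such as <input ... /> are also routed here by default.
        attr = dict(attrs)
        if tag == "input" and attr.get("type") == "hidden" and "name" in attr:
            self.fields[attr["name"]] = attr.get("value")


collector = HiddenInputCollector()
collector.feed(htmlstr)
collector.close()

for name, value in collector.fields.items():
    print("%s : %s" % (name, value))
```

    The collected pairs are also available afterwards as the plain dict `collector.fields`.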
    