How to get the hidden input's value by using python?

随声附和 submitted on 2020-07-05 08:37:05

Question


How can I get an input's value from an HTML page, for example:

<input type="hidden" name="captId" value="AqXpRsh3s9QHfxUb6r4b7uOWqMT" ng-model="captId">

I have the input's name (name="captId") and need its value.

import re, urllib, urllib2
a = urllib2.urlopen('http://www.example.com/','').read()

Thanks


Update 1

I installed BeautifulSoup and used it, but I get an error.

code

 import re, urllib, urllib2
 a = urllib2.urlopen('http://www.example.com/','').read()
 soup = BeautifulSoup(a)
 value = soup.find('input', {'name': 'scnt'}).get('value')

error

"soup = BeautifulSoup(a) NameError: name 'BeautifulSoup' is not defined"


Answer 1:


Using the re module to parse XML or HTML is generally considered bad practice. Use it only if you are responsible for the page you are trying to parse. Otherwise, either your regexes become awfully complex, or your script could break if someone replaces <input type="hidden" name=.../> with <input name="..." type="hidden" .../> or almost anything else.
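To make that fragility concrete, here is a small sketch. The pattern and the reordered sample string are made up for illustration; only the first input tag comes from the question.

```python
import re

# Hypothetical pattern that assumes the attributes appear as type, name, value.
PATTERN = re.compile(r'<input type="hidden" name="captId" value="([^"]*)"')

# The tag exactly as posted in the question.
original = '<input type="hidden" name="captId" value="AqXpRsh3s9QHfxUb6r4b7uOWqMT" ng-model="captId">'
# The same tag with name and type swapped, which is equivalent HTML.
reordered = '<input name="captId" type="hidden" value="AqXpRsh3s9QHfxUb6r4b7uOWqMT" ng-model="captId">'

def extract(html):
    """Return the captured value, or None when the pattern does not match."""
    match = PATTERN.search(html)
    return match.group(1) if match else None

print(extract(original))   # the pattern matches this attribute order
print(extract(reordered))  # the pattern silently fails on the reordered tag
```

A browser treats both tags identically, but the regex only handles the one it was written against.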

BeautifulSoup is an HTML parser that:

  • automatically fixes minor errors (unclosed tags ...)
  • builds a DOM tree
  • allows you to browse the tree and search for specific tags with specific attributes
  • is usable with Python 2 and 3

Unless you have a good reason not to, you should use it rather than re for HTML parsing.

For example, assuming that txt contains the whole page, finding all hidden fields is as simple as:

from bs4 import BeautifulSoup
soup = BeautifulSoup(txt, "html.parser")
hidden_tags = soup.find_all("input", type="hidden")
for tag in hidden_tags:
    # tag.get("name") is the field's name and tag.get("value") its value
    print(tag.get("name"), tag.get("value"))
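For the specific field in the question, a minimal sketch could look like the following. The helper name hidden_value and the surrounding form markup are assumptions for illustration; the input tag itself is the one from the question. Note the `from bs4 import BeautifulSoup` line: the code in the question's update has no such import, which is what raises the NameError.

```python
from bs4 import BeautifulSoup

# Sample markup built around the tag from the question; a real page would
# come from an HTTP request instead of a string literal.
html = (
    '<form>'
    '<input type="hidden" name="captId" '
    'value="AqXpRsh3s9QHfxUb6r4b7uOWqMT" ng-model="captId">'
    '</form>'
)

def hidden_value(page, field_name):
    """Return the value of the hidden input named field_name, or None if absent."""
    soup = BeautifulSoup(page, "html.parser")
    # attrs is used because the keyword "name" is reserved for the tag name.
    tag = soup.find("input", attrs={"type": "hidden", "name": field_name})
    return tag.get("value") if tag is not None else None

print(hidden_value(html, "captId"))
```

Checking for None before calling get avoids an AttributeError when the page has no input with that name, as with the 'scnt' lookup in the question's update.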


Source: https://stackoverflow.com/questions/30489296/how-to-get-the-hidden-inputs-value-by-using-python
