Question
I am trying to select a form on a Dell KACE ticketing page but am getting a parse error. I am programming in Python and have been using mechanize. I was able to log in to the site successfully. I read that HTML cleaners such as Beautiful Soup might fix this, but none of the ones I tried worked.
import cookielib  # Python 2 standard library
import mechanize
br = mechanize.Browser()  # have tried the various HTML cleaner options in mechanize
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
br.set_handle_equiv(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1)Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
....
url_ticket = 'http://kace-server/adminui/ticket.php?ID=%d' %(box1[sel+1])
url_org1 = "http://kace-server/common/switch_to_org.php?org=1"
br.open(url_org1)
br.open(url_ticket)
br.select_form(name="ticket_form")
br.form['fields[owner_filter]']=current_user[0]
br.submit()
The program fails at the br.select_form line with the following error:
line 39, in assign
br.select_form(name="ticket_form")
....
File "C:\Python27\lib\site-packages\mechanize-0.2.5-py2.7.egg\mechanize\_form.py", line 760, in feed
raise ParseError(exc)
ParseError: expected name token at '<!\xe2\x80\x94IE7 mode --\n <'
I searched the HTML for that '!\xe2...' string but could not find it. I have also tried select_form with nr=0. Any help would be greatly appreciated.
Thanks, James
Answer 1:
"\xe2\x80\x94" is the UTF-8 encoded form of the character "—" (an em dash, not "-"!). It looks like a typo in the HTML (or someone used MS Word as an HTML editor?); it should be "<!--".
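The claim above can be checked directly. The following is a minimal sketch that uses only the standard library; the variable names are illustrative and are not taken from the question.

```python
import unicodedata

# The three bytes quoted in the ParseError message.
raw = b"\xe2\x80\x94"

# Decoding them as UTF-8 yields a single character.
decoded = raw.decode("utf-8")

code_point = "U+%04X" % ord(decoded)   # numeric code point of that character
char_name = unicodedata.name(decoded)  # its official Unicode name

print(code_point, char_name)
```

This also suggests why searching the page source for the literal text \xe2 finds nothing: the error message shows the escaped representation of the bytes, while the page itself contains the dash character.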
Answer 2:
To me it looks like the page may be encoded in UTF-8. '<!\xe2\x80\x94IE7 mode --\n <' would decode as u'<!—IE7 mode --\n <'. Perhaps that is meant to be an HTML comment <!--, but -- has been changed to —.
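If the server-side HTML cannot be corrected, one possible workaround is to patch the response before mechanize parses it for forms. The sketch below is not from either answer. It assumes the only defect is the dash-for-hyphens substitution in the comment opener, and that `br` is a mechanize.Browser that has already opened the ticket page. The helper names `repair_comment_openers` and `select_form_after_repair` are hypothetical.

```python
# "<!" followed by a UTF-8 encoded em dash, as seen in the ParseError message.
BAD_OPENER = b"<!\xe2\x80\x94"
# What a well-formed HTML comment opener looks like.
GOOD_OPENER = b"<!--"


def repair_comment_openers(html_bytes):
    """Return html_bytes with each '<!' + em dash replaced by '<!--'."""
    return html_bytes.replace(BAD_OPENER, GOOD_OPENER)


def select_form_after_repair(br, **select_kwargs):
    """Patch the browser's current response, then select the form.

    Assumption: br is a mechanize.Browser with a page already open.
    """
    response = br.response()  # copy of the current response
    response.set_data(repair_comment_openers(response.get_data()))
    br.set_response(response)  # make the browser use the patched HTML
    br.select_form(**select_kwargs)
```

With the code from the question, this would be called as select_form_after_repair(br, name="ticket_form") right after br.open(url_ticket), in place of the direct br.select_form call.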
Source: https://stackoverflow.com/questions/11036308/parse-error-while-using-mechanize-in-python