I am having trouble with the regex in the following code:
import mechanize
import re
br = mechanize.Browser()
br.set_handle_robots(False)
br.addheaders = [(
You need to escape the parenthesis since they have a special meaning:
<a href="javascript:__doPostBack\('(.*?)','(.*?)'\)">
HERE^ HERE^
Note that, ideally, you should not be parsing HTML with regex (even though your pattern is quite specific and I don't think this is that bad). Instead, parse HTML with, say, BeautifulSoup, locate the a
element, get the href
attribute value and then extract the desired substrings with regex.