import urllib
print urllib.urlopen('http://www.reefgeek.com/equipment/Controllers_&_Monitors/Neptune_Systems_AquaController/Apex_Controller_&_Accessories/').read()
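That URL does indeed return a 404, but the response still carries a lot of HTML content. urllib.urlopen hands that body back without complaint, whereas urllib2 (correctly) treats the 404 as an error condition and raises HTTPError. You can still recover the content of that site's 404 page like so: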
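import urllib2
try:
    print urllib2.urlopen('http://www.reefgeek.com/equipment/Controllers_&_Monitors/Neptune_Systems_AquaController/Apex_Controller_&_Accessories/').read()
except urllib2.HTTPError as e:
    # HTTPError doubles as a response object, so the error page is still readable.
    print e.code       # numeric status, e.g. 404
    print e.msg        # reason phrase
    print e.headers    # response headers
    print e.fp.read()  # body of the 404 page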
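If you are on Python 3, urllib2 no longer exists; its pieces live in urllib.request and urllib.error. A rough sketch of the same idea follows. The helper name fetch_even_on_error and the tiny local server that always answers 404 are my own hypothetical additions, there only so the example runs without touching the real site:

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_even_on_error(url):
    """Return (status_code, body_bytes), even if the server answers with an HTTP error."""
    try:
        response = urllib.request.urlopen(url)
        try:
            return response.getcode(), response.read()
        finally:
            response.close()
    except urllib.error.HTTPError as e:
        # HTTPError is itself a file-like response, so the error page can be read from it.
        return e.code, e.read()


class _NotFoundHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: every GET gets a 404 with an HTML body."""

    def do_GET(self):
        body = b"<html><body>Custom 404 page</body></html>"
        self.send_response(404)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo quiet; the default implementation logs to stderr.
        pass


def demo():
    """Start the stand-in server on a free local port and fetch a page from it."""
    server = HTTPServer(("127.0.0.1", 0), _NotFoundHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        url = "http://127.0.0.1:%d/missing" % server.server_port
        return fetch_even_on_error(url)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    code, body = demo()
    print(code)
    print(body.decode("utf-8"))
```

Against a real site you would only need fetch_even_on_error; pass it the URL and check the returned status code before deciding what to do with the body.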