Question
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html.read())
print(soup.find("img", {"src": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text())
The code above works fine, but the version below does not. It raises an AttributeError ('NoneType' object has no attribute 'parent'). Can anyone tell me the reason?
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.pythonscraping.com/pages/page3.html")
soup = BeautifulSoup(html.read())
price = soup.find("img", {"src=": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()
print(price)
Thanks! :)
Answer 1:
If you compare the first and second versions, you'll notice that they pass a different attribute name to find():
First: soup.find("img", {"src": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()
- Note: "src"
Second: soup.find("img", {"src=": "../img/gifts/img1.jpg"}).parent.previous_sibling.get_text()
- Note: "src="
The second version raises AttributeError: 'NoneType' object has no attribute 'parent' because no img tag in the soup has an attribute named "src=" with the value "../img/gifts/img1.jpg". find() therefore returns None, and accessing .parent on None fails.
So, if you remove the = from the attribute name in the second version, it should work.
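To see the difference without hitting the network, here is a minimal sketch. The inline markup is a made-up stand-in for the real page (an assumption, not the actual page source), and price_for is a hypothetical helper that checks for None before touching .parent:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: one table row whose cells are
# adjacent (no whitespace between them), so previous_sibling is a <td> tag.
html = (
    "<table><tr>"
    "<td>Vegetable Basket</td><td>$15.00</td>"
    '<td><img src="../img/gifts/img1.jpg"></td>'
    "</tr></table>"
)
soup = BeautifulSoup(html, "html.parser")

good = soup.find("img", {"src": "../img/gifts/img1.jpg"})   # matches the tag
bad = soup.find("img", {"src=": "../img/gifts/img1.jpg"})   # no attribute named "src="

print(good is not None)  # True
print(bad is None)       # True


def price_for(soup, src):
    """Hypothetical helper: return the text of the cell before the image, or None."""
    img = soup.find("img", {"src": src})
    if img is None:
        # find() found nothing; avoid the AttributeError on None.parent
        return None
    return img.parent.previous_sibling.get_text()


print(price_for(soup, "../img/gifts/img1.jpg"))  # $15.00
```

Checking the result of find() for None before chaining further lookups turns a confusing AttributeError into a case you can handle explicitly.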
Btw, you should explicitly specify which parser you want to use; otherwise bs4 will emit the following warning:
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
To get rid of this warning, change code that looks like this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "lxml")
So, as stated in the warning message, you just have to change soup = BeautifulSoup(html.read()) to soup = BeautifulSoup(html.read(), 'lxml'), for example.
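As a small sketch of the same idea: "lxml" requires the third-party lxml package to be installed, while "html.parser" ships with Python, so the example below uses the latter. The sample markup is made up for illustration. It records any warnings raised while building the soup so you can check that naming the parser keeps the parser warning away:

```python
import warnings

from bs4 import BeautifulSoup

markup = "<p>Hello, <b>world</b></p>"  # hypothetical sample markup

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Parser named explicitly, so bs4 does not have to guess one.
    soup = BeautifulSoup(markup, "html.parser")

# Keep only UserWarning-type entries (the "No parser was explicitly
# specified" warning is a UserWarning).
parser_warnings = [w for w in caught if issubclass(w.category, UserWarning)]

print(len(parser_warnings))
print(soup.b.get_text())
```

Whichever parser you pick, naming it makes the script behave the same way on another machine or in another virtual environment.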
Source: https://stackoverflow.com/questions/43478806/attribute-errornonetype-object-has-no-attribute-parent