beautifulsoup find_all bug?

a 夏天 posted on 2019-12-02 04:16:39

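You can try a different parser with BeautifulSoup, for example lxml. Parsers repair malformed markup in different ways, so the same find_all call can return different results depending on which one builds the tree.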

import requests
from bs4 import BeautifulSoup

url = "<your url>"
r = requests.get(url)

# Name the lxml parser explicitly (requires the lxml package to be installed)
soup = BeautifulSoup(r.content, 'lxml')
hrefDivList = soup.find_all("span", attrs={"class": "headline"})
print(len(hrefDivList))
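
To check whether the parser is really the cause, it can help to run the same search under each parser and compare the counts. The following is only a sketch: SAMPLE_HTML, count_headlines and compare_parsers are hypothetical names made up for illustration, and lxml and html5lib are assumed to be optional (a parser that is not installed is reported as None). In practice you would pass in r.content instead of the sample string.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample markup; replace with r.content from the real page.
SAMPLE_HTML = """
<div>
  <span class="headline">First</span>
  <span class="headline">Second</span>
  <span class="other">Ignored</span>
</div>
"""


def count_headlines(html, parser):
    """Return the number of span.headline elements, or None if the parser is missing."""
    try:
        soup = BeautifulSoup(html, parser)
    except FeatureNotFound:
        # bs4 raises FeatureNotFound when the requested parser is not installed.
        return None
    return len(soup.find_all("span", attrs={"class": "headline"}))


def compare_parsers(html, parsers=("html.parser", "lxml", "html5lib")):
    """Map each parser name to its headline count for the same document."""
    return {name: count_headlines(html, name) for name in parsers}


if __name__ == "__main__":
    for name, count in compare_parsers(SAMPLE_HTML).items():
        print(name, "not installed" if count is None else count)
```

If the counts disagree on your page, the markup is probably malformed and one parser is dropping or re-nesting the spans.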

You can also use CSS selectors, which are shorter to write:

hrefDivList = soup.select("span.headline")
# print(hrefDivList)
print(len(hrefDivList))

Or you can iterate directly over the text of every span:

for every_span in soup.select("span.headline"):
    print(every_span.text)
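
A small self-contained check that the two approaches select the same elements, using a made-up HTML string rather than a real page (the markup and variable names here are assumptions for illustration). It also shows that both match a span whose class attribute lists several classes, as long as one of them is headline.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first span carries two classes, the last one none.
html = (
    '<p><span class="headline breaking"> Alpha </span>'
    '<span class="headline">Beta</span>'
    '<span>Gamma</span></p>'
)
soup = BeautifulSoup(html, "html.parser")

# Keyword form of the class filter; equivalent to attrs={"class": "headline"}.
by_find_all = soup.find_all("span", class_="headline")
# CSS selector form.
by_select = soup.select("span.headline")

# get_text(strip=True) trims the surrounding whitespace from each span's text.
headline_texts = [span.get_text(strip=True) for span in by_select]
print(len(by_find_all), len(by_select), headline_texts)
```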