Beautiful Soup returns 'none'

Submitted by 怎甘沉沦 on 2021-01-29 11:24:53

Question


I am using the following code to extract data with Beautiful Soup:

import requests
import bs4
res = requests.get('https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5')
soup = bs4.BeautifulSoup(res.text, 'xml')
soup.find_all("span", class_="text")

I've tried different variations of the last line to get the program to display anything at all, but each time it returns "None" or an empty list. The only thing I can get to display is the entire HTML of the site, using print(soup.contents). The data I am trying to extract is the "Display" tag value within each of the signID tags. The data is clearly there when the entire HTML of the site is printed.

Additional Information: The number I am trying to extract is the current number of spaces in a parking deck, so the website is updated by the second.

Additional Information 2: This site is an iframe of https://www.jmu.edu/parking/. The data I am after is in the bottom right corner, under "commuter parking".

Website URL: https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5


Answer 1:


I can see that you're trying to extract the Display tag value under each Sign tag. Hope this helps.

Code:

import requests
from bs4 import BeautifulSoup
res = requests.get('https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5')
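# The 'lxml' HTML parser lowercases tag names, so the tags are looked up as 'sign', 'signid' and 'display'
soup = BeautifulSoup(res.text, 'lxml')
# Print the id and the displayed value of every sign in the response
for data in soup.find_all('sign'):
    print(data.signid.text, data.display.text)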

Output:

1  442
2  442
3  442
4 Happy Holidays
5 Happy Holidays

I have shown the output for only 5 values; the code prints 57 signId and Display values in total.
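Since some Display values are messages rather than counts (like "Happy Holidays" in the output above), you may want to keep only the numeric ones. Here is a minimal sketch of that idea. The SAMPLE markup and the numeric_displays helper are hypothetical, made up for illustration; the real feed's layout may differ. It uses the built-in 'html.parser' so it runs without lxml installed, and that parser also lowercases tag names.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the feed; the real response layout is an assumption.
SAMPLE = """
<Signs>
  <Sign><SignId>1</SignId><Display>442</Display></Sign>
  <Sign><SignId>2</SignId><Display> 17 </Display></Sign>
  <Sign><SignId>4</SignId><Display>Happy Holidays</Display></Sign>
</Signs>
"""


def numeric_displays(markup):
    """Map sign id -> space count, skipping Display values that are not digits."""
    soup = BeautifulSoup(markup, "html.parser")
    counts = {}
    for sign in soup.find_all("sign"):
        sign_id = sign.signid.get_text(strip=True)
        display = sign.display.get_text(strip=True)
        # Messages such as "Happy Holidays" fail this check and are skipped
        if display.isdigit():
            counts[sign_id] = int(display)
    return counts


print(numeric_displays(SAMPLE))
```

With the real response you would pass res.text instead of SAMPLE, assuming each sign really does contain one signid and one display child.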

You can use soup.find_all('display') directly if you only want the Display values. I used both signId and Display in the example just for reference.
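To illustrate the Display-only variant, and why the search from the question comes back empty, here is a small sketch on made-up markup. The SAMPLE string is an assumption about the feed's shape, not a copy of the real response. Again 'html.parser' is used so that lxml is not required. Note that Beautiful Soup's HTML parsers lowercase tag names, while the 'xml' parser keeps their original case.

```python
from bs4 import BeautifulSoup

# Hypothetical sample; the real feed's exact layout is an assumption.
SAMPLE = """
<Signs>
  <Sign><SignId>1</SignId><Display>442</Display></Sign>
  <Sign><SignId>4</SignId><Display>Happy Holidays</Display></Sign>
</Signs>
"""

soup = BeautifulSoup(SAMPLE, "html.parser")

# The markup has no <span class="text"> elements, so nothing matches
spans = soup.find_all("span", class_="text")

# Searching for the tag that actually holds the data returns every value
displays = [tag.get_text(strip=True) for tag in soup.find_all("display")]

print(spans)
print(displays)
```

find_all does not raise an error when nothing matches; it just returns an empty result, which is why the original code appeared to do nothing.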



Source: https://stackoverflow.com/questions/59525836/beautiful-soup-returns-none
