Question
I am using the following code to extract data with Beautiful Soup:
import requests
import bs4
res = requests.get('https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5')
soup = bs4.BeautifulSoup(res.text, 'xml')
soup.find_all("span", class_="text")
I've tried different variations of the last line to get the program to display anything at all, but each time it returns None or an empty list. The only thing I can get to display is the entire HTML of the site, using print(soup.contents). The data I am trying to extract is the Display tag value within each of the signID tags. The data is clearly there when it prints the entire HTML of the site.
Additional Information: The number I am trying to extract is the current number of spaces in a parking deck, so the website is updated by the second.
Additional Information 2: This page is embedded as an iframe in https://www.jmu.edu/parking/. The data I am after is in the bottom right corner under "commuter parking".
Website URL: https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5
Answer 1:
I can see that you're trying to extract the Display tag values under each Sign tag. Hope this helps.
Code:
import requests
from bs4 import BeautifulSoup
res = requests.get('https://www.jmu.edu/cgi-bin/parking_sign_data.cgi?hash=53616c7465645f5f5c0bbd0eccccb6fe8dd7ed9a0445247e3c7dcb4f91927f7ccc933be780c6e558afb8ebf73620c3e5e3b2c68cd3c138519068eac99d9bf30e1e67ce894deb3a054f95f882da2ea2f0|869835tg89dhkdnbnsv5sg5wg0vmcf4mfcfc2qwm5968unmeh5')
soup = BeautifulSoup(res.text, 'lxml')
for data in soup.find_all('sign'):
    print(data.signid.text, data.display.text)
Output:
1 442
2 442
3 442
4 Happy Holidays
5 Happy Holidays
I have shown the output for only the first 5 values; the full output contains 57 signId and Display values.
You can use soup.find_all('display') directly if you want only the Display values. I included signId alongside Display in the example just for reference.
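For anyone who wants to try the idea without hitting the live URL, here is a minimal sketch that uses the standard library's xml.etree.ElementTree instead of Beautiful Soup. The sample XML, including the tag names Signs, Sign, SignId and Display and their capitalization, is an assumption made for illustration, and extract_displays is a hypothetical helper name; the real feed may be structured differently. It also shows why searching for span finds nothing: the feed is XML made of sign elements, not HTML with span tags. Note that the 'lxml' HTML parser used above lowercases tag names, while a true XML parser keeps them case-sensitive.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real feed's exact tag names and nesting may differ.
SAMPLE_XML = """<Signs>
  <Sign><SignId>1</SignId><Display>442</Display></Sign>
  <Sign><SignId>4</SignId><Display>Happy Holidays</Display></Sign>
</Signs>"""


def extract_displays(xml_text):
    """Return (sign id, display text) pairs found in the XML text."""
    root = ET.fromstring(xml_text)
    pairs = []
    # XML tag names are case-sensitive, so "Sign" must match the feed exactly.
    for sign in root.iter("Sign"):
        pairs.append((sign.findtext("SignId"), sign.findtext("Display")))
    return pairs


for sign_id, display in extract_displays(SAMPLE_XML):
    print(sign_id, display)

# The sample contains no <span> elements, so searching for them finds nothing.
print(list(ET.fromstring(SAMPLE_XML).iter("span")))
```

To run this against the live page, you could pass res.text from the requests call above to extract_displays, after adjusting the tag names to match what print(soup.contents) actually shows.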
Source: https://stackoverflow.com/questions/59525836/beautiful-soup-returns-none