BeautifulSoup getting content behind multiple levels


How can I get the time data behind two "divs" with BeautifulSoup?



6:00.00

I've tried the f


3 Answers

误落风尘

2021-01-27 14:14

To your second question:

if "kW" in item.text:
    itemval = item.find_parent().find_next_sibling().text.strip()
    output.append(itemval)
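
For context, here is a minimal self-contained sketch of how that fragment might sit inside a loop. The HTML string is a hypothetical stand-in for the page (each label wrapped in its own div, with the value in the following sibling div), and the "Rated Power in kW" label and its value are invented for illustration; the real markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed for illustration only.
HTML = """
<div class="row">
  <div><label class="new_font">Rated Power in kW</label></div>
  <div>1,000</div>
</div>
<div class="row">
  <div><label class="new_font">Duration at Rated Power (HH:MM)</label></div>
  <div> 6:00.00 </div>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

output = []
for item in soup.select("label.new_font"):
    if "kW" in item.text:
        # Step up to the label's wrapper div, then across to the
        # next sibling element, which holds the value.
        itemval = item.find_parent().find_next_sibling().text.strip()
        output.append(itemval)

print(output)
```

Called without arguments, find_parent() returns the immediate parent and find_next_sibling() returns the next sibling element, so the whitespace between the two divs is skipped.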


再見小時候

2021-01-27 14:19
div.div selector is too ambiguous, to say the least.
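
The question's own code is cut off in this copy, so what follows is only a guess at what "div.div" refers to. Assuming it means attribute-style navigation such as soup.div.div, or the CSS selector "div div", a small sketch on invented markup shows why neither pins down one particular element:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two unrelated blocks that both contain nested divs.
HTML = """
<div id="header"><div>Site navigation</div></div>
<div id="details"><div><div>6:00.00</div></div></div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Attribute navigation returns the first div inside the first div,
# which here is the header block, not the value we want.
first_nested = soup.div.div
print(first_nested.text)

# The descendant selector matches every div nested inside another div.
nested_divs = soup.select("div div")
print(len(nested_divs))
```

Both forms depend on where the element happens to sit in the page rather than on what it contains.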

Since it appears you are trying to get the value of the "Duration at Rated Power (HH:MM)" field, I would first locate the corresponding label and then find the next text node that matches the field's format:
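```
import re

# Find the label, then the first later text node shaped like a time (string= replaces the older text=).
label = soup.find("label", string="Duration at Rated Power (HH:MM)")
value = label.find_next(string=re.compile(r"\d+:\d+")).strip()
print(value)  # prints 6:00.00
```
(This assumes soup is the already-parsed page; the re module must be imported for the pattern.)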
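
The label-then-find_next approach can be tried offline with a small sketch like the one below. The HTML is a hypothetical stand-in with the value nested two divs deep, and field_value is an illustrative helper name, not part of BeautifulSoup.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, assumed for illustration only.
HTML = """
<div class="row">
  <div><label>Duration at Rated Power (HH:MM)</label></div>
  <div>
    <div> 6:00.00 </div>
  </div>
</div>
"""


def field_value(soup, label_text, pattern):
    """Return the first text node after the label that matches pattern, or None."""
    # string= is the current name of the argument older code writes as text=.
    label = soup.find("label", string=label_text)
    if label is None:
        return None
    # find_next walks forward in document order, so the nesting depth
    # of the value does not matter.
    node = label.find_next(string=re.compile(pattern))
    return node.strip() if node is not None else None


soup = BeautifulSoup(HTML, "html.parser")
value = field_value(soup, "Duration at Rated Power (HH:MM)", r"\d+:\d+")
print(value)
```

Returning None when the label or the value is missing avoids the AttributeError that a chained call would raise on a page without that field.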

耶瑟儿～

2021-01-27 14:27

Try this to get the time you wish to scrape:

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.energystorageexchange.org/projects/2") 
soup = BeautifulSoup(page.content, 'lxml')
for item in soup.select("label.new_font"):
    if "HH:MM" in item.text:
        itemval = item.find_parent().find_next_sibling().text.strip()
        print(itemval)

Output:

6:00.00
