Using BeautifulSoup to extract specific dl and dd list elements

This is my first time posting. I am using BeautifulSoup 4 and Python 2.7 (PyCharm). I have a webpage containing elements and I need to extract specific elements where the tags

2 Answers

心在旅途

2021-01-20 02:28

If order is not important, just make some changes:

```
...
dl_data = soup.find_all("dd")
for dlitem in dl_data:
    print dlitem.string
```

Result:

13 September 2015
Starting at £40,130 per annum.
15 December 2015
Starting at £22,460 per annum.
10 January 2014
Starting at £18,160 per annum.
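
A minimal, self-contained sketch of the same `find_all("dd")` idea follows. The HTML is a hypothetical stand-in, since the question's original markup is not shown here, and the code uses Python 3 syntax rather than the Python 2.7 of the thread. It also illustrates one caveat: `.string` is `None` when a `<dd>` contains nested tags, whereas `get_text()` still returns the combined text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second <dd> contains a nested <b> tag.
html = """
<dl>
 <dt>Closing date</dt><dd>13 September 2015</dd>
 <dt>Salary</dt><dd>Starting at <b>£40,130</b> per annum.</dd>
</dl>
"""

soup = BeautifulSoup(html, "html.parser")
dd_tags = soup.find_all("dd")  # every <dd>, in document order

# .string only works when the tag has a single text child.
strings = [dd.string for dd in dd_tags]

# get_text() concatenates all text inside the tag, nested tags included.
texts = [dd.get_text() for dd in dd_tags]

for text in texts:
    print(text)
```

If the `<dd>` elements on the real page hold plain text only, `.string` and `get_text()` should give the same values.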

For your latest request:

```
dds = soup.find_all("dd")
for date, salary in zip(dds[0::3], dds[2::3]):
    print ', '.join([date.string, salary.string])
```

Output:

13 September 2015, 100
14 September 2015, 200
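
The slicing above pairs the first and third `<dd>` of every group of three. Here is a runnable sketch of that pattern in Python 3, using hypothetical markup (the `Location` entries and the label names are invented for illustration) in which each record has exactly three `<dd>` elements:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: three <dd> elements per record (date, location, salary).
html = """
<dl>
 <dt>Closing date</dt><dd>13 September 2015</dd>
 <dt>Location</dt><dd>London</dd>
 <dt>Salary</dt><dd>100</dd>
</dl>
<dl>
 <dt>Closing date</dt><dd>14 September 2015</dd>
 <dt>Location</dt><dd>Leeds</dd>
 <dt>Salary</dt><dd>200</dd>
</dl>
"""

soup = BeautifulSoup(html, "html.parser")
dds = soup.find_all("dd")  # query once, then slice the result

# dds[0::3] picks items 0, 3, 6, ... and dds[2::3] picks items 2, 5, 8, ...
rows = [
    ", ".join([date.get_text(strip=True), salary.get_text(strip=True)])
    for date, salary in zip(dds[0::3], dds[2::3])
]

for row in rows:
    print(row)
```

This relies on every record having the same number of `<dd>` elements in the same order. If a record is missing one, the slices drift out of alignment, so looking values up by their `<dt>` label is likely to be more robust.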

梦谈多话

2021-01-20 02:50
I guess it works if you just omit the .parent in your code. At least this worked for my problem which is very similar to yours.

Here's my HTML, where the order of the <dt> elements is not guaranteed:
```
<dl>
 <dt>Time</dt><dd>10:05:02</dd>
 <dt>Temp</dt><dd>20.5°C</dd>
</dl>
```
I'm accessing the values successfully with the following code:
```
# at_tl is assumed to be the parsed soup (or the enclosing <dl> tag)
time = at_tl.find("dt", text="Time").find_next("dd").string
temp = at_tl.find("dt", text="Temp").find_next("dd").string
```
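
For reference, a self-contained sketch of this label-based lookup in Python 3. The helper name `dd_for` is hypothetical, and the sketch assumes a Beautiful Soup release that accepts the `string=` argument (4.4.0 or later; older releases use `text=` as above). It returns `None` instead of raising when a label is absent.

```python
from bs4 import BeautifulSoup

# Same structure as the HTML above, with the <dt> order swapped on purpose.
html = """
<dl>
 <dt>Temp</dt><dd>20.5°C</dd>
 <dt>Time</dt><dd>10:05:02</dd>
</dl>
"""

soup = BeautifulSoup(html, "html.parser")


def dd_for(root, label):
    """Return the text of the <dd> following the <dt> whose text is `label`."""
    dt = root.find("dt", string=label)
    if dt is None:
        return None  # no such label on the page
    dd = dt.find_next_sibling("dd")
    return dd.get_text(strip=True) if dd is not None else None


time_value = dd_for(soup, "Time")
temp_value = dd_for(soup, "Temp")
missing = dd_for(soup, "Humidity")

print(time_value, temp_value, missing)
```

Using `find_next_sibling` rather than `find_next` keeps the search inside the same `<dl>`, so a `<dt>` with no value of its own cannot pick up a `<dd>` from a later list.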