Question
I'm scraping an HTML page, but I only want one section of it. There are no classes, IDs, or anything else obviously useful that I can plug into Cheerio (I'm new to this, so I know my ignorance plays a part).
The HTML looks like this:
<b> Here's some text I don't want</b>
<b> More text I don't want</b>
<hr style="width:90%; padding: 0">
<b> text I want </b>
<b> text I want </b>
<b> text I want </b>
<b> text I want </b>
<hr style="width:90%; padding: 0">
<b> Here's some text I don't want</b>
<b> More text I don't want</b>
Is there a way to grab the HTML between the two <hr> elements with Cheerio? Both elements are exactly the same.
Answer 1:
You can start at the first hr and iterate next() until you get to the second one:
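let text = '' // accumulates the text found between the two <hr> elements
let el = $('hr').first()
while (el = el.next()) {
  // stop when there are no more siblings or when the second <hr> is reached
  if (el.length === 0 || el.prop('tagName') === 'HR') break
  text += el.text() + "\n"
}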
Answer 2:
If you can ascertain which nth to use, you could try the nth-of-type selector, e.g.
hr:nth-of-type(1)
You might also be able to use nth-child.
Source: https://stackoverflow.com/questions/56233831/trying-to-extract-html-between-two-style-elements-with-cheerio