Trying to extract HTML between two style elements with cheerio

Posted by 情到浓时终转凉″ on 2019-12-24 18:28:09

Question


I'm scraping an HTML page, but I only want one section of it. There are no classes, IDs, or other obvious hooks that I can plug into Cheerio (I'm new to this, so my inexperience probably plays a part).

The HTML looks like this:

<b> Here's some text I don't want</b>
<b> More text I don't want</b>

<hr style="width:90%; padding: 0">
<b> text I want </b>
<b> text I want </b>
<b> text I want </b>
<b> text I want </b>
<hr style="width:90%; padding: 0">

<b> Here's some text I don't want</b>
<b> More text I don't want</b>

Is there a way to grab the HTML between the two <hr> elements with Cheerio? Both elements are exactly the same.


Answer 1:


You can start at the first hr and iterate next() until you get to the second one:

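// Assumes $ = cheerio.load(html), where html is the page source
let text = ''
let el = $('hr').first()
// Step through the following siblings until the next <hr> (or the end)
while (el = el.next()) {
  if (el.length === 0 || el.prop('tagName') === 'HR') break
  text += el.text() + "\n"
}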



Answer 2:


If you can work out which occurrence you need, you could try the nth-of-type selector, e.g.

hr:nth-of-type(1)

You might also be able to use nth-child, which counts every sibling element rather than only the hr elements.



Source: https://stackoverflow.com/questions/56233831/trying-to-extract-html-between-two-style-elements-with-cheerio
