Is it possible to find the text of one <td>..</td> when the value of another <td>..</td> is known?

Submitted by 给你一囗甜甜゛ on 2019-12-13 03:39:24

Question


I have a web page whose HTML looks similar to the following:

<form name="test">

<td> .... </td>
  .
  .
  .
<td> <A HREF="http://www.edu/st/file.html">alo</A> </td>
<td> <A HREF="http://www.dom/st/file.html">foo</A> </td>
<td> bla bla </td>

</form>

I know only the value bla bla. Based on that value, can we track down the third-from-last <td>..</td> value (here, alo)? I can find it with the help of the HREF values, but those are not fixed; they can change at any time.


Answer 1:


Extracting every <td> from an HTML document is easy, but it's not a foolproof way to navigate the DOM. However, given the limitations of the sample HTML, here's a solution. I doubt it'll work in a real-world situation though.

Mechanize uses Nokogiri internally for its heavy lifting so doing require 'nokogiri' isn't necessary if you've already required Mechanize.

require 'nokogiri'

# Parse the sample cells as an HTML fragment.
doc = Nokogiri::HTML::DocumentFragment.parse(<<EOT)
<td> <A HREF="http://www.edu/st/file.html">alo</A> </td>
<td> <A HREF="http://www.dom/st/file.html">foo</A> </td>
<td> bla bla </td>
EOT

# Take the third-from-last <td>, then read the href of the link inside it.
doc.search('td')[-3].at('a')['href']
# => "http://www.edu/st/file.html"

How to get the Nokogiri document from the Mechanize "agent" is left as an exercise for the user.




Answer 2:


See http://nokogiri.org/

It helps you parse HTML and then find elements via selectors.



Source: https://stackoverflow.com/questions/14467164/is-it-possible-to-find-the-td-td-text-when-any-of-the-td-td-value
