How to use Nokogiri methods .xpath & .at_xpath

Submitted by 非 Y 不嫁゛ on 2020-01-02 04:47:06

Question


I'm learning how to use Nokogiri, and a few questions came up based on the code below.

require 'rubygems'
require 'mechanize'

# Current Mechanize releases expose the class as Mechanize (WWW::Mechanize was the older name)
post_agent = Mechanize.new
post_page = post_agent.get('http://www.vbulletin.org/forum/showthread.php?t=230708')

puts "\nabsolute path with tbody gives nil"
puts  post_page.parser.xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]').xpath('text()').to_s.strip.inspect

puts "\n.at_xpath gives an empty string"
puts post_page.parser.at_xpath("//div[@id='posts']/div/table/tr/td/div[2]").at_xpath('text()').to_s.strip.inspect

puts "\ntwo lines solution with .at_xpath gives an empty string"
rows =   post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")
puts rows[0].at_xpath('text()').to_s.strip.inspect


puts
puts "two lines working code"
rows =   post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")
puts rows[0].xpath('text()').to_s.strip

puts "\none line working code"
puts post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")[0].xpath('text()').to_s.strip

puts "\nanother one line code"
puts post_page.parser.at_xpath("//div[@id='posts']/div/table/tr/td/div[2]").xpath('text()').to_s.strip

puts "\none line code with full path"
puts post_page.parser.xpath("/html/body/div/div/div/div/div/table/tr/td/div[2]")[0].xpath('text()').to_s.strip
  • Is it better to use // or / in XPath? @AnthonyWJones says that "the use of an unprefixed //" is not such a good idea.
  • I had to remove tbody from every working XPath; otherwise I got a 'nil' result. How is it possible to remove an element from the XPath and still have things work?
  • Do I have to call .xpath twice to extract data when I'm not using the full XPath?
  • Why can't I get .at_xpath to extract the data? It works nicely here; what is the difference?

Answer 1:


  1. // searches every node at every level below the starting point, so it's much more expensive than /, which only steps to direct children.
  2. You can use * as a placeholder for any element name.
  3. No. You can run a single XPath query, get the element, and then call Nokogiri's text method on the node.
  4. Sure you can. Have a look at this question and my benchmark file; you will see an example of at_xpath.

I noticed that you often use the text() expression. This is not required with Nokogiri: you can retrieve the node and then call the text method on it, which is much less expensive.

Also keep in mind that Nokogiri supports CSS selectors through .css. They can be easier to use when you are working with HTML pages.



Source: https://stackoverflow.com/questions/2120012/how-to-use-nokogiri-methods-xpath-at-xpath