Given:
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
"
A
And some docs you're seeking:
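# Adds Node#xpath_regex, which accepts steps written as tag[@attr~=/pattern/]
Nokogiri::XML::Node.send(:define_method, 'xpath_regex') { |*args|
  xpath = args[0]
  # Matches one step such as /div[@class~=/^carbo/]
  rgxp = /\/([a-z]+)\[@([a-z\-]+)~=\/(.*?)\/\]/
  # Rewrite each step into a call to the custom regex() function below;
  # gsub (rather than gsub!) leaves the caller's string untouched
  xpath = xpath.gsub(rgxp) { |s| m = s.match(rgxp); "/#{m[1]}[regex(.,'#{m[2]}','#{m[3]}')]" }
  self.xpath(xpath, Class.new {
    # Custom XPath function: keep the nodes whose attribute matches the pattern
    def regex node_set, attr, regex
      node_set.find_all { |node| node[attr] =~ /#{regex}/ }
    end
  }.new)
}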
Usage:
divs = Nokogiri::HTML(page.root.to_html).
xpath_regex("//div[@class~=/axtarget$/]//div[@class~=/^carbo/]")
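To see what the rewrite step does with that usage string, here is a minimal sketch that runs only the substitution, without Nokogiri. The helper name rewrite_regex_steps is hypothetical, made up for illustration; the pattern is the same one used in xpath_regex.

```ruby
# Same pattern as in xpath_regex: one step of the form /tag[@attr~=/pattern/]
RGXP = /\/([a-z]+)\[@([a-z\-]+)~=\/(.*?)\/\]/

# Hypothetical helper (not part of Nokogiri): returns the rewritten XPath,
# with every ~= step turned into a call to a custom regex() function
def rewrite_regex_steps(xpath)
  xpath.gsub(RGXP) { "/#{$1}[regex(.,'#{$2}','#{$3}')]" }
end

rewritten = rewrite_regex_steps("//div[@class~=/axtarget$/]//div[@class~=/^carbo/]")
puts rewritten
# prints //div[regex(.,'class','axtarget$')]//div[regex(.,'class','^carbo')]
```

The rewritten string is what actually gets passed to xpath, together with the handler object that supplies regex.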
Use the XPath function starts-with:
value.xpath('//p[starts-with(@id, "para-")]').each { |x| puts x['id'] }
divs = value.css('div[id^="para-"]')