How do I do a regex search in Nokogiri for text that matches a certain beginning?

无人及你 2021-02-03 22:50

Given:

require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
"

  

A

4 Answers
  • 2021-02-03 23:09

    Here are the docs you may be looking for:

    • Nokogiri: http://nokogiri.org/
    • XPath: http://www.w3.org/TR/xpath20/
    • CSS3 Selectors: http://www.w3.org/TR/selectors/
  • 2021-02-03 23:09
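    # Adds Node#xpath_regex, which accepts a non-standard [@attr~=/pattern/]
    # predicate and rewrites it into a call to the custom regex() XPath function.
    Nokogiri::XML::Node.send(:define_method, 'xpath_regex') { |*args|
      rgxp = /\/([a-z]+)\[@([a-z\-]+)~=\/(.*?)\/\]/
      # Use non-mutating gsub so the caller's string (possibly frozen) is left untouched.
      xpath = args[0].gsub(rgxp) { |s| m = s.match(rgxp); "/#{m[1]}[regex(.,'#{m[2]}','#{m[3]}')]" }
      self.xpath(xpath, Class.new {
        # Custom XPath function: keep nodes whose attribute matches the pattern.
        def regex node_set, attr, regex
          node_set.find_all { |node| node[attr] =~ /#{regex}/ }
        end
      }.new)
    }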
    

    Usage:

    divs = Nokogiri::HTML(page.root.to_html).
      xpath_regex("//div[@class~=/axtarget$/]//div[@class~=/^carbo/]")
    
  • 2021-02-03 23:16

    Use the XPath function starts-with:

    value.xpath('//p[starts-with(@id, "para-")]').each { |x| puts x['id'] }
    
  • 2021-02-03 23:30
    divs = value.css('div[id^="para-"]')
    