Select mixed mode content in Capybara

Question

I am trying to extract mixed-mode content using Capybara. I managed it with Nokogiri, but I wonder why something similar is not possible with Capybara.

require 'nokogiri'

doc = Nokogiri::HTML("<h1><em>Name</em>A Johnson </h1>")
puts doc.at_xpath("//h1/text()").content

This works, but when I try the same XPath selector in Capybara, it fails.

visit('http://stackoverflow.com')
puts find(:xpath, "//h1/text()").text

It raises an error:

[remote server] file:///tmp/webdriver-profile20120915-8089-kxrvho/extensions/fxdriver@googlecode.com/components/driver_component.js:6582:in `unknown': The given selector //h1/text() is either invalid or does not result in a WebElement. The following error occurred: (Selenium::WebDriver::Error::InvalidSelectorError)
[InvalidSelectorError] The result of the xpath expression "//h1/text()" is: [object Text]. It should be an element.

How can I extract this text?

Answer 1:

Capybara requires a driver, and the XPath will be executed by the driver. From your error message, it is clear you are using selenium-webdriver, which will use a browser's native XPath implementation where available. For IE, it uses its own implementation.

You appear to be using a combination where the XPath implementation is not fully compliant. You can try to change the driver or browser, but if you really want to use Nokogiri to extract content, you should be able to do the following:

require 'nokogiri'

# Parse the HTML that Capybara has already loaded, then query it with Nokogiri.
doc = Nokogiri::HTML(page.html)
puts doc.at_xpath("//h1/text()").content

Answer 2:

I do not believe Capybara or selenium-webdriver has any support for directly accessing text nodes. However, if you do not want to use Nokogiri, you can use selenium-webdriver to execute JavaScript.

You can do this (in Capybara using Selenium-Webdriver):

element = page.find('h1').native
puts page.driver.browser.execute_script("return arguments[0].childNodes[1].textContent", element)
#=> A Johnson
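
The snippet above relies on the text being the second child node (childNodes[1]), which breaks if the markup changes. As a rough sketch of a less position-dependent variant, the JavaScript below joins every direct text-node child of the element instead. The helper name own_text and the constant OWN_TEXT_JS are hypothetical and not part of Capybara or selenium-webdriver, and the sketch assumes page is a Capybara session running on the Selenium driver, as in the question.

```ruby
# JavaScript that joins the text of an element's direct text-node children,
# skipping child elements such as <em>. nodeType 3 is Node.TEXT_NODE.
OWN_TEXT_JS = <<~JS
  return Array.prototype.filter.call(arguments[0].childNodes, function (node) {
    return node.nodeType === 3;
  }).map(function (node) {
    return node.textContent;
  }).join('');
JS

# Hypothetical helper (not a Capybara API): returns the text that belongs
# directly to the first element matching `css`, ignoring nested elements.
# Assumes `page` is a Capybara session backed by the Selenium driver.
def own_text(page, css)
  element = page.find(css).native
  page.driver.browser.execute_script(OWN_TEXT_JS, element).to_s.strip
end
```

Inside a test you would call it as own_text(page, 'h1'). For the heading in the question, that should give the text outside the <em> element with surrounding whitespace stripped.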

Source: https://stackoverflow.com/questions/12437585/select-mixed-mode-content-in-capybara

Tags

ruby-on-rails

ruby

xpath

capybara