Is there a way to escape non-alphanumeric characters in Nokogiri css?

Submitted by 倾然丶 夕夏残阳落幕 on 2020-01-04 09:12:32

Question


I have an anchor tag:

file.html#stuff-morestuff-CHP-1-SECT-2.1

Trying to pull the referenced content in Nokogiri:

documentFragment.at_css('#stuff-morestuff-CHP-1-SECT-2.1')

fails with the error:

unexpected '.1' after '[#<Nokogiri::CSS::Node:0x007fd1a7df9b40 @type=:CONDITIONAL_SELECTOR, @value=[#<Nokogiri::CSS::Node:0x007fd1a7df9b90 @type=:ELEMENT_NAME, @value=["*"]>, #<Nokogiri::CSS::Node:0x007fd1a7df9cd0 @type=:ID, @value=["#unixnut4-CHP-1-SECT-2"]>]>]' (Nokogiri::CSS::SyntaxError)

Just trying to talk through this: I think Nokogiri is complaining about the .1 in the ID selector, because . is not valid in an HTML id.

I don't own the content, so I really don't want to go through and fix all the bad IDs if it is avoidable. Is there a way to escape non-alphanumeric characters in a selector passed to a Nokogiri .css() call?


Answer 1:


Assuming your HTML looks something like this:

<div id='stuff-morestuff-CHP-1-SECT-2.1'>foo</div>

The string in question, stuff-morestuff-CHP-1-SECT-2.1, is a valid HTML ID, but it isn't valid as a CSS ID selector: an unescaped . starts a class selector, so it can't appear inside the ID part.

You should be able to escape the . with a backslash character, i.e. this is a valid CSS selector:

#stuff-morestuff-CHP-1-SECT-2\.1

Unfortunately this doesn't seem to work in Nokogiri; there may be a bug in the CSS-to-XPath translation that it performs. (It does work in the browser.)
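
If you do experiment with the backslash-escaped selector, one thing to rule out first is Ruby's own string quoting, since a double-quoted literal silently drops the backslash before it ever reaches Nokogiri. A small sketch of the difference (plain Ruby, no Nokogiri needed; the variable names are just for illustration):

```ruby
# Single quotes keep the backslash, so the selector contains "\." as intended.
single = '#stuff-morestuff-CHP-1-SECT-2\.1'

# Double quotes treat "\." as an unknown escape and drop the backslash.
double = "#stuff-morestuff-CHP-1-SECT-2\.1"

# In double quotes the backslash itself has to be doubled.
doubled = "#stuff-morestuff-CHP-1-SECT-2\\.1"

puts single.include?('\\')   # true
puts double.include?('\\')   # false
puts single == doubled       # true
```

This only shows that the escaped selector reaches the parser intact; it does not change whether Nokogiri accepts it.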

You can get around this by just checking the id attribute directly:

documentFragment.at_css('*[id="stuff-morestuff-CHP-1-SECT-2.1"]')
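
Since the IDs come from anchor hrefs, the attribute selector can be built straight from the href. A minimal sketch, where id_selector_for is a hypothetical helper (not part of Nokogiri) and the id is assumed never to contain a double quote:

```ruby
# Hypothetical helper (not part of Nokogiri): turns an href such as
# "file.html#some-id" into a CSS attribute selector for that id.
def id_selector_for(href)
  fragment = href[/#(.+)\z/m, 1]   # text after the first '#', or nil if none
  return nil if fragment.nil?
  # Assumption: ids never contain a double quote, so no escaping is attempted.
  raise ArgumentError, "unsupported id: #{fragment}" if fragment.include?('"')
  %(*[id="#{fragment}"])
end

selector = id_selector_for('file.html#stuff-morestuff-CHP-1-SECT-2.1')
puts selector
# With Nokogiri loaded, the result would be passed along as:
#   documentFragment.at_css(selector)
```

Because the id sits inside a quoted attribute value, characters such as . need no escaping at all.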

Even if backslash escaping worked, checking the id attribute like this would probably still be the easier route if its value started with a digit. That is valid in HTML, but as far as I can tell a leading digit cannot be written with a plain backslash in a CSS ID selector; it needs a code-point escape (for example \31 for 1).

You could also use XPath, which has an id function that you can use here:

documentFragment.xpath("id('stuff-morestuff-CHP-1-SECT-2.1')")
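
If the id comes from data rather than a literal, the XPath expression has to be assembled with care, because XPath 1.0 string literals have no escape mechanism. A rough sketch, where xpath_id_expression is a hypothetical helper (not part of Nokogiri) that simply picks whichever quote character the id does not contain:

```ruby
# Hypothetical helper (not part of Nokogiri): wraps an id in an XPath id() call.
def xpath_id_expression(id)
  # XPath 1.0 literals cannot escape quotes, so an id containing both kinds
  # cannot be written as a single literal; this sketch just rejects it.
  if id.include?("'") && id.include?('"')
    raise ArgumentError, 'id contains both quote characters'
  end
  quote = id.include?("'") ? '"' : "'"
  "id(#{quote}#{id}#{quote})"
end

expression = xpath_id_expression('stuff-morestuff-CHP-1-SECT-2.1')
puts expression
# With Nokogiri loaded, the result would be passed along as:
#   documentFragment.xpath(expression)
```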


Source: https://stackoverflow.com/questions/25108319/is-there-a-way-to-escape-non-alphanumeric-characters-in-nokogiri-css
