How to properly escape single and double quotes

再見小時候 2021-01-13 00:01

I have an lxml etree HTMLParser object that I'm using to build XPath expressions, so that I can assert on the XPath itself, on the attributes of the matched tag, and on the text of that tag. I ran into a problem when the te

3 Answers
  • 2021-01-13 00:25

    This solution applies if you are using Python's lxml. It is better to leave the escaping to lxml, which we can do with XPath variables. Suppose we have an XPath like this:

    //tagname[text()='some_text']
    

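    If some_text contains both single and double quotes, lxml raises an "Invalid predicate" error. Neither backslash escaping nor triple quotes worked for me, because XPath 1.0 string literals have no escape sequences, and triple quotes are Python syntax that XPath does not recognize.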
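
    To make the problem concrete: an XPath 1.0 string literal ends at the first matching quote, so a value containing both ' and " cannot be written as one literal. A common workaround, which is not the one this answer ends up recommending, is to split the value and rejoin it with XPath's concat() function. A minimal sketch, where xpath_literal is a hypothetical helper name:

```python
def xpath_literal(text):
    """Return an XPath 1.0 expression that evaluates to `text`."""
    if "'" not in text:
        # No single quotes: a single-quoted literal is enough.
        return "'%s'" % text
    if '"' not in text:
        # No double quotes: a double-quoted literal is enough.
        return '"%s"' % text
    # Both kinds of quote: split on ' and rejoin the pieces with concat(),
    # writing each single quote as the double-quoted literal "'".
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join("'%s'" % p for p in parts) + ")"


print(xpath_literal("a'b\"c"))
```

    The result can be interpolated into a predicate such as //tagname[text() = ...], but it is easy to get wrong, which is why handing the value to the library is usually preferable.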

    The solution that worked for me is XPath variables.

    We rewrite the XPath as follows:

    //tagname[text() = $var]
    

    Then compile it:

    find = etree.XPath(xpath)
    

    Then call it, passing the variable's value as a keyword argument:

    elements = find(root, var=text)
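
    Put together, the steps above might look like the following sketch. The HTML snippet and the p tag are made up for illustration, and lxml must be installed.

```python
from lxml import etree

# Hypothetical document: one paragraph whose text contains both quote kinds.
html = """<html><body><p>He said "it's fine"</p><p>other</p></body></html>"""
root = etree.HTML(html)

# The value never appears inside the XPath string, so nothing needs escaping.
find = etree.XPath("//p[text() = $var]")
text = 'He said "it\'s fine"'

# XPath variables are supplied as keyword arguments when calling the object.
elements = find(root, var=text)
print(len(elements), elements[0].text)
```

    Because the compiled expression is reusable, the same find object can be called again with a different value for var.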
    
  • 2021-01-13 00:28

    According to Wikipedia and W3Schools, it is safer not to have literal ' and " in node content, even though only < and & are strictly illegal there. They can be replaced by the corresponding predefined entity references, &apos; and &quot;.

    By the way, the Python parsers I use will take care of this transparently: when writing, they are replaced; when reading, they are converted.
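
    As a rough illustration of that transparency, here is a sketch using the standard library's xml.etree.ElementTree, chosen only because it needs no installation. The element and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

value = """both ' and " quotes"""

# Store the same value as text content and as an attribute.
elem = ET.Element("item", {"title": value})
elem.text = value

# Serializing inserts entity references where the output needs them
# (here, &quot; inside the double-quoted attribute value).
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)

# Parsing converts the references back to the original characters.
parsed = ET.fromstring(serialized)
print(parsed.text == value, parsed.get("title") == value)
```

    The application code only ever handles the plain string; the entity references exist solely in the serialized form.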

    After a second reading of your question, I tested a few things with ' and so on in the Python interpreter, and it handles the quoting for you:

    >>> 'text {0}'.format('blabla "some" bla')
    'text blabla "some" bla'
    >>> 'ntsnts {0}'.format("ontsi'tns")
    "ntsnts ontsi'tns"
    >>> 'ntsnts {0}'.format("ontsi'tn' \"ntsis")
    'ntsnts ontsi\'tn\' "ntsis'
    

    So we can see that Python escapes things correctly. Could you then copy-paste the error message you get (if any)?
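
    One caveat when reading the session above: the backslashes in the last echoed line come from repr(), which the interactive prompt uses to display values. They are not part of the string. A small sketch of the difference:

```python
s = "ontsi'tn' \"ntsis"

# repr() is what the interactive prompt echoes: it picks a quote style and
# adds backslashes so the result is a valid Python literal.
print(repr(s))

# print() shows the actual characters stored in the string.
print(s)

# The stored string contains no backslash at all.
print("\\" in s, "\\" in repr(s))
```

    So any string formatted this way and passed to an XPath engine still contains the bare ' and " characters.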

  • 2021-01-13 00:32

    There are more options to choose from; the """ and ''' forms in particular might be what you want.

    s = "a string with a single ' quote"
    s = 'a string with a double " quote'
    s = """a string with a single ' and a double " quote"""
    s = '''another string with those " quotes '.'''
    s = r"raw strings let \ be a plain backslash"
    s = r'''and can be added \ to " any ' of """ those things'''
    s = """The three-quote-forms
           may contain
           newlines."""
    