Using YQL multi-query & XPath to parse HTML, how to escape nested quotes?

死守一世寂寞 2021-02-10 18:00

The title is more complicated than it has to be; here's the problem query.

SELECT * 
FROM query.multi 
WHERE queries="
    SELECT * 
        FROM html 
                


        
2 answers
  • 2021-02-10 18:22

    I've come up with a solution that doesn't really answer my original question but does solve the problem.

    The data.html.cssselect table takes a CSS selector and converts it to XPath for you, which avoids the nasty escaping issues because the selector itself contains no quotes.

    SELECT *
    FROM query.multi
    WHERE queries="
            SELECT * 
                FROM data.html.cssselect 
                WHERE url='http://www.stumbleupon.com/url/http://www.guildwars2.com' 
                AND css='li.listLi div.views a span';
            SELECT * 
                FROM xml 
                WHERE url='http://services.digg.com/1.0/endpoint?method=story.getAll&link=http://www.guildwars2.com';
            SELECT * 
                FROM json 
                WHERE url='http://api.tweetmeme.com/url_info.json?url=http://www.guildwars2.com';
            SELECT * 
                FROM xml 
                WHERE url='http://api.facebook.com/restserver.php?method=links.getStats&urls=http://www.guildwars2.com';
            SELECT * 
                FROM json 
                WHERE url='http://www.reddit.com/button_info.json?url=http://www.guildwars2.com'"
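To see why handing over a CSS selector sidesteps the quoting problem, here is a rough Python sketch of the kind of translation involved. It is only an illustration: `css_to_xpath` is a hypothetical helper, it handles nothing beyond `tag`, `tag.class` and the descendant (space) combinator, and it makes no claim about how data.html.cssselect is actually implemented.

```python
import re

# One simple selector: an optional tag name followed by zero or more .class parts.
_SIMPLE = re.compile(r"^(?P<tag>[A-Za-z][\w-]*|\*)?(?P<classes>(?:\.[\w-]+)*)$")


def css_to_xpath(selector: str) -> str:
    """Translate a descendant-only CSS selector (hypothetical helper) to XPath."""
    steps = []
    for part in selector.split():
        match = _SIMPLE.match(part)
        if match is None:
            raise ValueError(f"unsupported selector part: {part!r}")
        step = match.group("tag") or "*"
        for cls in match.group("classes").split(".")[1:]:
            # Match the class as a whole word within the class attribute.
            step += f"[contains(concat(' ', normalize-space(@class), ' '), ' {cls} ')]"
        steps.append(step)
    if not steps:
        raise ValueError("empty selector")
    # A space in CSS means "any descendant", which is // in XPath.
    return "//" + "//".join(steps)


print(css_to_xpath("li.listLi div.views a span"))
```

The generated XPath is full of single quotes, but since the conversion happens on the server side, none of them ever appear inside the quoted `queries` string.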
    
  • 2021-02-10 18:36

    You need to escape whatever character delimits your XPath string with a double backslash. Here the XPath is wrapped in single quotes, so each single quote inside it becomes \\':

    SELECT * FROM query.multi 
    WHERE queries="
        SELECT * 
            FROM html 
            WHERE url='http://www.stumbleupon.com/url/http://www.guildwars2.com' 
            AND xpath='//li[@class=\\'listLi\\']/div[@class=\\'views\\']/a/span';
        SELECT * 
            FROM xml 
            WHERE url='http://services.digg.com/1.0/endpoint?method=story.getAll&link=http://www.guildwars2.com';
        SELECT * 
            FROM json 
            WHERE url='http://api.tweetmeme.com/url_info.json?url=http://www.guildwars2.com';
        SELECT * 
            FROM xml 
            WHERE url='http://api.facebook.com/restserver.php?method=links.getStats&urls=http://www.guildwars2.com';
        SELECT * 
            FROM json 
            WHERE url='http://www.reddit.com/button_info.json?url=http://www.guildwars2.com'"
    

    (try this in the YQL console)
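If you assemble the query in code rather than by hand, the double-backslash rule can be applied mechanically. The following is a minimal Python sketch under that assumption; the helper names (`escape_for_nested_yql`, `html_subquery`, `multi_query`) are made up for illustration and are not part of any YQL client library. It only escapes the single-quote delimiter and does not handle an XPath that contains double quotes, which would clash with the outer `queries="..."` delimiter.

```python
from typing import List


def escape_for_nested_yql(value: str, delimiter: str = "'") -> str:
    """Prefix every occurrence of the delimiter with two backslashes."""
    return value.replace(delimiter, "\\\\" + delimiter)


def html_subquery(url: str, xpath: str) -> str:
    """Build a single-quoted html sub-query meant to sit inside query.multi."""
    return (
        "SELECT * FROM html "
        f"WHERE url='{url}' "
        f"AND xpath='{escape_for_nested_yql(xpath)}'"
    )


def multi_query(subqueries: List[str]) -> str:
    """Join sub-queries with ';' and wrap them in the double-quoted queries value."""
    return 'SELECT * FROM query.multi WHERE queries="' + "; ".join(subqueries) + '"'


xpath = "//li[@class='listLi']/div[@class='views']/a/span"
print(multi_query([
    html_subquery("http://www.stumbleupon.com/url/http://www.guildwars2.com", xpath),
    "SELECT * FROM json WHERE url='http://www.reddit.com/button_info.json?url=http://www.guildwars2.com'",
]))
```

Keeping the escaping in one function means the XPath can be written and tested in its natural form, and only the final string carries the `\\'` sequences.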
