yql

How to select the first n elements in XPath

南楼画角 submitted on 2019-12-24 17:33:13
Question: I am using YQL to scrape some images from a website. The problem is that I want only the first 5 images from that website. I have the following query: select * from html where url="http://myanimelist.net/anime/9253/Steins;Gate" and xpath='//img[position()<=5]' But it returns all image elements instead of the first 5. Is there anything wrong with my XPath query? PS: I cannot use LIMIT 5 since I may need to scrape some other tags too. Answer 1: This
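The behavior described in the question follows from standard XPath 1.0 semantics: `//img[position()<=5]` is short for `/descendant-or-self::node()/child::img[position()<=5]`, so the predicate is evaluated separately for each parent element, while `(//img)[position()<=5]` applies the predicate to the combined node set. The sketch below emulates both readings in plain Python, since the standard library's ElementTree does not evaluate `position()` comparisons. The sample document and the helper names are hypothetical, and n is set to 3 to keep the output short.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: two containers with four images each.
DOC = """
<html><body>
  <div><img src="a1"/><img src="a2"/><img src="a3"/><img src="a4"/></div>
  <div><img src="b1"/><img src="b2"/><img src="b3"/><img src="b4"/></div>
</body></html>
"""

root = ET.fromstring(DOC)


def first_n_per_parent(root, tag, n):
    """Emulate //tag[position()<=n]: positions are counted within each parent."""
    picked = []
    for parent in root.iter():
        children = [child for child in parent if child.tag == tag]
        picked.extend(children[:n])
    return picked


def first_n_overall(root, tag, n):
    """Emulate (//tag)[position()<=n]: positions are counted over the whole set."""
    return list(root.iter(tag))[:n]


per_parent = [e.get("src") for e in first_n_per_parent(root, "img", 3)]
overall = [e.get("src") for e in first_n_overall(root, "img", 3)]

print(per_parent)  # first three images of every parent
print(overall)     # first three images in the document
```

On a page where no single parent holds more than five images, the per-parent reading matches every image, which is consistent with the query returning all of them. Wrapping the path in parentheses is the usual XPath way to get the document-wide reading.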

YQL JSON script not returning?

拜拜、爱过 submitted on 2019-12-24 15:49:42
Question: I have a script here, copied pretty much directly off this. Why doesn't the code listed below return anything? ajax.html : <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html dir="ltr" lang="en-US"> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <title>Cross-Domain Ajax Demo</title> </head> <body> <div id="container"> <form> <p><label>Type a URL:</label><input type="text" name="sitename" id="sitename"/></p> <p><input

Performing image scraping using YQL with the lowest possible resource usage, i.e. the fewest queries

孤人 submitted on 2019-12-24 01:36:07
Question: I am trying to build an image scraping tool that lets the user scrape all the images contained within a given page using XPath, process the scraped images to find which have an alt attribute and which don't, and return the result as two separate JSON objects, i.e. {alted:["",""],nonAlted:["",""]}. Now comes my problem: although I am able to scrape the page, retrieve all the images, and separate them into the alted and nonAlted categories, I can't put them in the response object! I think
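The partitioning step described in the question can be sketched outside YQL as follows. This is only an illustration of the alted/nonAlted split in Python's standard library, not the asker's YQL code. The class name and sample markup are hypothetical, and two choices are assumptions: the arrays hold each image's `src`, and an empty `alt=""` counts as non-alted.

```python
import json
from html.parser import HTMLParser


class ImgAltSplitter(HTMLParser):
    """Collect <img> sources into 'alted' and 'nonAlted' lists."""

    def __init__(self):
        super().__init__()
        self.result = {"alted": [], "nonAlted": []}

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # Assumption: a missing or empty alt attribute counts as non-alted.
        key = "alted" if attr.get("alt") else "nonAlted"
        self.result[key].append(attr.get("src", ""))


# Hypothetical page fragment.
PAGE = '<p><img src="a.png" alt="Logo"><img src="b.png"><img src="c.png" alt=""></p>'

splitter = ImgAltSplitter()
splitter.feed(PAGE)
print(json.dumps(splitter.result))
```

Building one dictionary with both lists and serializing it once keeps the response in the single-object shape the question asks for.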

Why doesn't the `search.web` YQL table work anymore?

一笑奈何 submitted on 2019-12-23 16:26:56
Question: When I use the search.web YQL table in my YQL statements, I always get the error: No definition found for Table search.web. This happens even with a simple statement such as SELECT url FROM search.web(0,10) WHERE query="stackoverflow". So I am assuming Yahoo discontinued search.web or BOSS? What are the alternatives? Is there still a similar way to crawl the web? Answer 1: We can read in YQL Blog: We’ve removed all search tables that relied on the BOSS v1 API (search.web, search.image, and search.news) as the

How to get the formatted view of YQL as result?

余生长醉 submitted on 2019-12-23 01:36:21
Question: YQL gives out the result only in tree view. Is there any way to get the result in formatted view? Answer 1: Use an XSLT stylesheet to create a formatted view. Here is an example for an RSS feed: <?xml version="1.0" encoding="utf-8"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="xml" encoding="utf-8" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" indent="yes"/> <xsl

Doing a Simple Yahoo Search in Python

我的梦境 submitted on 2019-12-23 00:41:05
Question: I need to write a Python script which, at one point, does a Yahoo web search to find and download a bunch of C source files. I'm very new to this and I can't figure out how to get started with a simple web search. I've seen a lot of material about BOSS but, from my understanding, this is something you need to pay to use? I am not willing to pay for this. I've used Python YQL to get some RSS results as follows: import yql y = yql.Public() result = y.execute('select * from rss where

How to simply display the xml output from YQL or have the JSON output to html

元气小坏坏 submitted on 2019-12-22 12:54:34
Question: So I've been working on a way to scrape the data from a page and display it (in roughly the same format as the source). I found YQL and I am finding it brilliant, except I can't figure out how to just display the whole output with nothing special (except the basic formatting). The YQL input code is: select * from html where url="http://directory.vancouver.wsu.edu/anthropology" and xpath="//div[@id='facdir']" Using that, it returns the JSON: http://query.yahooapis.com/v1/public/yql?q=select%20*
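For reference, a request URL like the one in the question is the YQL statement percent-encoded into the `q` parameter of the public endpoint. The sketch below shows one way to build it in Python. The helper name `build_yql_url` is hypothetical, and `format=json` is included on the assumption that JSON output is wanted. The public YQL endpoint has since been retired, so the code only constructs the URL and sends no request.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs

# Endpoint taken from the URL quoted in the question.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(query, fmt="json"):
    """Return the endpoint URL with the YQL statement percent-encoded in `q`."""
    params = {"q": query, "format": fmt}
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return YQL_ENDPOINT + "?" + urlencode(params, quote_via=quote)


query = (
    'select * from html '
    'where url="http://directory.vancouver.wsu.edu/anthropology" '
    'and xpath="//div[@id=\'facdir\']"'
)

url = build_yql_url(query)

# Decoding the URL again recovers the original statement.
decoded = parse_qs(urlsplit(url).query)["q"][0]
print(url)
```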

Yahoo Pipes: filter items in a feed based on words in a text file

我怕爱的太早我们不能终老 submitted on 2019-12-22 12:12:47
Question: I have a pipe that filters an RSS feed and removes any item that contains "stopwords" that I've chosen. Currently I've manually created a filter for each stopword in the pipe editor, but the more logical way is to read these from a file. I've figured out how to read the stopwords out of the text file, but how do I apply the filter operator to the feed, once for every stopword? The documentation states explicitly that operators can't be applied within the loop construct, but hopefully I'm
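The filtering logic the question describes, dropping every feed item that contains any stopword, can be sketched outside Pipes as follows. This is plain Python rather than a Pipes module configuration. The function names and sample items are hypothetical, and matching is assumed to be a case-insensitive substring test on each item's title and description.

```python
def load_stopwords(lines):
    """Return lowercased, non-empty stopwords from an iterable of text lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def filter_items(items, stopwords):
    """Keep only items whose title and description contain none of the stopwords."""
    kept = []
    for item in items:
        text = (item.get("title", "") + " " + item.get("description", "")).lower()
        if not any(word in text for word in stopwords):
            kept.append(item)
    return kept


# Stand-in for the lines of the stopword text file.
stopwords = load_stopwords(["Sponsored\n", "\n", "giveaway\n"])

# Hypothetical feed items.
feed = [
    {"title": "YQL console tips", "description": "Using the html table"},
    {"title": "Sponsored: new phone", "description": "Advert"},
    {"title": "Weekly GIVEAWAY", "description": "Win stuff"},
]

kept_titles = [item["title"] for item in filter_items(feed, stopwords)]
print(kept_titles)
```

Checking all stopwords inside one pass over the feed avoids needing a separate filter per stopword.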

How do I pass a Yahoo Pipes item into a YQL query?

青春壹個敷衍的年華 submitted on 2019-12-22 11:35:43
Question: One common thing to want to do in the Yahoo Pipes YQL element is pass a Pipes value into the YQL query. For example: select * from html.tostring where url='<someurl>' and xpath='//div[@id="foo"]' and you want to pass in a dynamic value for <someurl>. Let's say that it's an RSS feed item's URL called item.link. Attempting to simply replace the quoted someurl with item.link gives you this error: Invalid identifier item.link. me is the only supported identifier in this context How can I pass