Reading RSS Feeds: What Aggregators Do That I'm Not

Submitted by 痞子三分冷 on 2019-12-12 01:29:57

Question


I drop the following feed into Google Reader, and it updates normally.

http://www.indeed.ca/rss?q=&l=Hamilton%2C+ON

However, when I use any of a number of approaches suggested thither and yon on the 'net, all of which simply read from this source and parse the XML, I receive the same 20 items every time.

What is Google Reader doing that I should be doing in my code so that I receive new items?

Thanks for your advice. Incidentally, I'm coding in Python.


Answer 1:


RSS aggregators "poll" their sources, i.e., they repeat the HTTP request to each source periodically and check whether anything new appears in the results. That's unfortunate, as polling always is: it wastes resources in an unending series of "are we there yet?" questions (kind of like taking a toddler along on a long car drive ;-), and it still implies delays (if you poll a given source every hour, say, you may wait up to an hour to see a new item).

Unfortunately, in the RSS architecture itself, there are no alternatives, no way to ask for a "callback" when new stuff appears or opt for a saner "publish-subscribe architecture".

A good effort to remedy that is PubSubHubbub, but it inevitably requires cooperation (above and beyond the RSS standards) from RSS sources and aggregators -- so it needs very wide uptake before it can be called "a solution" to the problem, though, technically, it already is one (for cooperating sites ;-).

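So, back to your question: you're doing nothing wrong. A single fetch presumably returns only the feed's most recent batch of items (20 in your case), so you just need to poll periodically, as RSS aggregators do, and keep track of what you have already seen in order to pick up new results over time.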
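To make the polling idea concrete, here is a minimal sketch using only the Python standard library. It assumes the feed is plain RSS 2.0 with `<item>` elements, and that `<guid>` (or, failing that, `<link>`) identifies an item. The function names and the one-hour default interval are hypothetical choices for illustration, not anything Google Reader is known to use.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "http://www.indeed.ca/rss?q=&l=Hamilton%2C+ON"  # feed from the question


def parse_items(rss_xml):
    """Return (key, title) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # Assumption: <guid> identifies an item; fall back to <link>, then title.
        key = item.findtext("guid") or item.findtext("link") or title
        items.append((key, title))
    return items


def new_items(rss_xml, seen):
    """Return the items whose key is not yet in `seen`, and record them."""
    fresh = []
    for key, title in parse_items(rss_xml):
        if key not in seen:
            seen.add(key)
            fresh.append((key, title))
    return fresh


def poll(url=FEED_URL, interval_seconds=3600):
    """Fetch the feed in an endless loop, printing only unseen items."""
    seen = set()
    while True:
        with urllib.request.urlopen(url) as response:
            body = response.read()
        for key, title in new_items(body, seen):
            print(title, key)
        time.sleep(interval_seconds)
```

Calling `poll()` prints every item on the first pass and only the newcomers on later passes. The `seen` set lives in memory here, so a real script would probably persist it between runs.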




Answer 2:


1) Have you tried other RSS feeds?

2) If they also keep returning the same items, it sounds like some kind of caching... Are you behind a proxy?
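One way to probe the caching suspicion is to send request headers that ask intermediaries for a fresh copy and see whether the items change. The sketch below uses only `urllib.request`; the helper names and the `User-Agent` value are made up for illustration, and a misbehaving proxy may ignore these headers, so treat an unchanged result as inconclusive.

```python
import urllib.request


def build_uncached_request(url):
    """Build a GET request that asks caches along the way to revalidate."""
    return urllib.request.Request(
        url,
        headers={
            "Cache-Control": "no-cache",     # directive for HTTP/1.1 caches
            "Pragma": "no-cache",            # fallback for HTTP/1.0 proxies
            "User-Agent": "feed-check/0.1",  # hypothetical client name
        },
    )


def fetch_uncached(url):
    """Fetch `url` with the cache-bypassing headers and return the raw body."""
    with urllib.request.urlopen(build_uncached_request(url)) as response:
        return response.read()
```

If `fetch_uncached(...)` returns newer items than a plain `urlopen` call made at the same moment, a cache between you and the server is the likely culprit.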



Source: https://stackoverflow.com/questions/3382942/reading-rss-feeds-what-aggregators-do-that-im-not
