Question
As per the official documentation, properly set up websites should indicate the URL of their RSS / Atom feed(s) when asked politely:
GET / HTTP/1.1
Host: example.com
Accept: application/rss+xml, application/xhtml+xml, text/html
When an HTTP server (or server-side script) gets this, it should redirect the HTTP client to the feed. It should do this with an HTTP 302 Found. Something like:
HTTP/1.1 302 Found
Location: http://example.com/feed
I'm trying to get this response, without luck:
request(
  {
    method: 'GET',
    url: 'https://stackoverflow.com',
    followRedirect: false,
    accept: ['application/rss+xml', 'application/xhtml+xml', 'text/html']
  },
  function (error, response, body) {
    console.log('statusCode: ', response.statusCode);
  }
);
Yields:
statusCode: 200
How do I formulate my request so that the website responds with the feed URL(s)?
Answer 1:
It is not common practice for websites to send back their RSS feed in response to a home page request that asks for the application/rss+xml MIME type in the Accept header. The Mozilla documentation you've linked describes a suggestion that I have never seen in use in many years of involvement in RSS as a developer.
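As a side note, the snippet in the question passes accept as a top-level option, and as far as I know the request library only sends headers given under a headers object, so the Accept header was probably never sent at all. Here is a minimal sketch of the same probe using Node's built-in https module, which does not follow redirects on its own. The function names buildOptions and probeFeedRedirect are made up for illustration, and the sketch assumes an https URL on the default port. Even with the header sent correctly, most sites will likely still answer 200 with HTML.

```javascript
const https = require('https');

// Build options for one GET request that lists feed MIME types first.
// Assumes an https:// URL on the default port (hypothetical helper).
function buildOptions(siteUrl) {
  const { hostname, pathname, search } = new URL(siteUrl);
  return {
    method: 'GET',
    hostname,
    path: pathname + search,
    headers: {
      Accept: 'application/rss+xml, application/xhtml+xml, text/html',
    },
  };
}

// Send the request and report the status code and Location header, if any.
// https.request never follows redirects, so a 302 would be visible here.
function probeFeedRedirect(siteUrl, callback) {
  const req = https.request(buildOptions(siteUrl), (res) => {
    res.resume(); // discard the body, only the headers matter here
    callback(null, {
      statusCode: res.statusCode,
      location: res.headers.location,
    });
  });
  req.on('error', callback);
  req.end();
}

// Usage (performs a real network request):
// probeFeedRedirect('https://example.com/', (err, result) => {
//   if (err) throw err;
//   console.log(result.statusCode, result.location);
// });
```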
A more established and widely adopted method for a site to identify its RSS feed is a technique called RSS Autodiscovery. Open the site's home page and look for this tag in the HEAD section:
<link rel="alternate" type="application/rss+xml" title="RSS"
href="http://feeds.example.com/rss-feed">
The type attribute can be any of the MIME types for RSS, Atom or JSON Feed feeds.
Answer 2:
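To illustrate autodiscovery, here is a rough sketch that pulls feed links out of a page's HTML. It uses regular expressions only to stay dependency-free, which is an assumption made for brevity and will miss edge cases that a real HTML parser handles. The names FEED_TYPES, parseAttributes and findFeedLinks are hypothetical, and the list of MIME types is my own selection rather than an exhaustive one.

```javascript
// MIME types accepted as feeds in this sketch (assumed, not exhaustive).
const FEED_TYPES = new Set([
  'application/rss+xml',
  'application/atom+xml',
  'application/feed+json',
]);

// Collect the attributes of one tag into an object with lowercased names.
function parseAttributes(tag) {
  const attrs = {};
  const attrPattern =
    /([a-zA-Z][\w:-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/g;
  let match;
  while ((match = attrPattern.exec(tag)) !== null) {
    attrs[match[1].toLowerCase()] = match[2] ?? match[3] ?? match[4];
  }
  return attrs;
}

// Return every <link rel="alternate"> whose type is a known feed type.
// Relative href values are resolved against baseUrl.
function findFeedLinks(html, baseUrl) {
  const feeds = [];
  const linkTags = html.match(/<link\b[^>]*>/gi) || [];
  for (const tag of linkTags) {
    const attrs = parseAttributes(tag);
    const rel = (attrs.rel || '').toLowerCase().split(/\s+/);
    const type = (attrs.type || '').toLowerCase().trim();
    if (rel.includes('alternate') && FEED_TYPES.has(type) && attrs.href) {
      feeds.push({
        title: attrs.title || null,
        type,
        url: new URL(attrs.href, baseUrl).href,
      });
    }
  }
  return feeds;
}

// Usage with the tag shown above:
const sample =
  '<head><link rel="alternate" type="application/rss+xml" title="RSS"\n' +
  'href="http://feeds.example.com/rss-feed"></head>';
console.log(findFeedLinks(sample, 'http://example.com/'));
```

Resolving href against the page URL matters because some sites publish the feed as a relative path such as /feed.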
The material you quote is prefixed with:
Although this advanced technique for syndication is not required, support of this is recommended, especially for web sites and applications with high performance needs.
If you get HTML back, then you should construct a DOM with an HTML parser and then search it for the appropriate <link> element as described in an earlier section of that page.
Source: https://stackoverflow.com/questions/49479712/how-to-get-the-feed-urls-from-a-website