问题
Although I know a lot of email clients will pre-fetch or otherwise cache images. I am unaware of any that pre-fetch regular links like <a href="somelinkhere">some link</a>
Is this a practice done by some emails? If it is, is there a sort of no-follow type of rel
attribute that can be added to the link to help prevent this?
回答1:
Although I know a lot of email clients will pre-fetch or otherwise cache images.
That is not even a given already.
Many email clients – be they web-based, or standalone applications – have privacy controls that prevent images from being automatically loaded, to prevent tracking of who read a (specific) email.
On the other hand, there’s clients like f.e. gmail’s web interface, that tries to establish the standard of downloading all referenced external images, presumably to mitigate/invalidate such attempts at user tracking – if a large majority of gmail users have those images downloaded automatically, whether they actually opened the email or not, the data that can be gained for analytical purposes becomes watered down.
I am unaware of any that pre-fetch regular links like some link
Let’s stay on gmail for example purposes, but others will behave similarly: Since Google is always interested in “what’s out there on the web”, it is highly likely that their crawlers will follow that link to see what it contains/leads to – for their own indexing purposes.
If it is, is there a sort of no-follow type of rel attribute that can be added to the link to help prevent this?
rel=no-follow concerns ranking rather than crawling, and a no-index (either in robots.txt or via meta element/rel attribute) also won’t keep nosy bots from at least requesting the URL.
Plus, other clients involved – such as a firewall/anti-virus/anti-madware – might also request it for analytical purposes without any user actively triggering it.
If you want to be (relatively) sure that any action is triggered only by a (specific) human user, then use URLs in emails or other kind of messages over the internet only to lead them to a website where they confirm an action to be taken via a form, using method=POST; whether some kind of authentication or CSRF protection might also be needed, might go a little beyond the context of this question.
回答2:
As of Feb 2017 Outlook (https://outlook.live.com/) scans emails arriving in your inbox and it sends all found URLs to Bing, to be indexed by Bing crawler.
This effectively makes all one-time use links like login/pass-reset/etc useless.
(Users of my service were complaining that one-time login links don't work for some of them and it appeared that BingPreview/1.0b is hitting the URL before the user even opens the inbox)
Drupal seems to be experiencing the same problem: https://www.drupal.org/node/2828034
回答3:
All Common email clients do not have crawlers to search or pre-build <a>
tag related documents if that is what you're asking, as trying to pre-build and cache a web location could be an immense task if the page is dynamic or of large enough size.
Images are stored locally to reduce load time of the email which is a convenience factor and network load reduction, but when you open an email hyperlink it will load it in your web browser rather than email client.
I just ran a test using analytics to report any server traffic, and an email containing just
<a href="linktomysite">linktomysite</a>
did not throw any resulting crawls to the site from outlook07, outlook10, thunderbird, or apple mail(yosemite). You could try using a wireshark scan to check for network traffic from the client to specific outgoing IP's if you're really interested
回答4:
You won't find any native email clients that do that, but you could come across some "web accelerators" that, when using a web-based email, could try to pre-fetch links. I've never seen anything to prevent it.
Links (GETs) aren't supposed to "do" anything, only a POST is. For example, your "unsubscribe me" link in your email should not directly unsubscribe th subscriber. It should "GET" a page the subscriber can then post from.
W3 does a good job of how you should expect a GET to work (caching, etc.)
http://www.w3schools.com/tags/ref_httpmethods.asp
来源:https://stackoverflow.com/questions/34346732/do-any-common-email-clients-pre-fetch-links-rather-than-images