Facebook crawler is hitting my server hard and ignoring directives. Accessing same resources multiple times

盖世英雄少女心 2021-02-05 05:52

The Facebook Crawler is hitting my servers multiple times every second and it seems to be ignoring both the Expires header and the og:ttl property.

In some cases, it is

8 Answers
  •  野的像风
    2021-02-05 06:26

    According to Facebook's documentation, only the Facebot crawler respects crawling directives. However, they also suggest this:

    You can target one of these user agents to serve the crawler a nonpublic version of your page that has only metadata and no actual content. This helps optimize performance and is useful for keeping paywalled content secure.
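
    For illustration, here is a minimal sketch of that approach, assuming a Flask app (the route, template and helper names are hypothetical, not anything Facebook prescribes):

        from flask import Flask, request, render_template_string

        app = Flask(__name__)

        # User agents Facebook documents for its scrapers.
        FB_CRAWLERS = ("facebookexternalhit", "Facebot")

        # Metadata-only page served to the crawler: Open Graph tags, empty body.
        OG_ONLY_PAGE = """<!DOCTYPE html>
        <html>
          <head>
            <meta property="og:title" content="{{ title }}" />
            <meta property="og:image" content="{{ image_url }}" />
            <meta property="og:ttl" content="2419200" />
          </head>
          <body></body>
        </html>"""

        def is_facebook_crawler(user_agent):
            # True if the request comes from one of Facebook's crawlers.
            return any(bot in user_agent for bot in FB_CRAWLERS)

        @app.route("/articles/<slug>")
        def article(slug):
            ua = request.headers.get("User-Agent", "")
            if is_facebook_crawler(ua):
                # The crawler gets only the metadata it needs, not the full page.
                return render_template_string(
                    OG_ONLY_PAGE,
                    title=slug.replace("-", " ").title(),
                    image_url=f"https://example.com/images/{slug}.jpg",
                )
            # Regular visitors get the full page (placeholder here).
            return f"<h1>{slug}</h1><p>Full article content...</p>"

    Whether this is worth doing depends on how expensive your full page is to render; the crawler only needs the Open Graph tags to build a preview.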

    Some people suggest rate-limiting access for facebookexternalhit; however, I doubt that is a good idea, since it may prevent the crawler from updating the content.
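
    If you do decide to throttle it anyway, a rough in-memory sketch (hypothetical limits; a real deployment would do this at the web server or in a shared store) could look like this, with the caveat above that it may delay content updates:

        import time
        from collections import defaultdict, deque

        from flask import Flask, request, abort

        app = Flask(__name__)

        # Hypothetical limit: at most 10 crawler requests per URL per minute.
        WINDOW_SECONDS = 60
        MAX_HITS_PER_WINDOW = 10

        # Sliding window of recent crawler hits per path.
        crawler_hits = defaultdict(deque)

        @app.before_request
        def throttle_facebook_crawler():
            ua = request.headers.get("User-Agent", "")
            if "facebookexternalhit" not in ua and "Facebot" not in ua:
                return  # only throttle Facebook's crawlers

            now = time.time()
            hits = crawler_hits[request.path]

            # Drop hits that have fallen out of the window.
            while hits and now - hits[0] > WINDOW_SECONDS:
                hits.popleft()

            if len(hits) >= MAX_HITS_PER_WINDOW:
                abort(429)  # Too Many Requests

            hits.append(now)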

    Seeing multiple hits from different IPs but the same bot may be acceptable, depending on Facebook's crawling architecture. You should check how often the same resource actually gets crawled. og:ttl is what the documentation recommends and should help.
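
    To check the crawl frequency, here is a small sketch that counts Facebook crawler hits per URL from a combined-format access log (the log path and format are assumptions; adjust them to your setup):

        import re
        from collections import Counter

        # Assumed log location; adjust to your server.
        LOG_PATH = "/var/log/nginx/access.log"

        # Rough pattern for a combined-format line: request path and user agent.
        LINE_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+".*"(?P<ua>[^"]*)"$')

        hits_per_path = Counter()
        with open(LOG_PATH) as log:
            for line in log:
                match = LINE_RE.search(line)
                if not match:
                    continue
                if "facebookexternalhit" in match["ua"] or "Facebot" in match["ua"]:
                    hits_per_path[match["path"]] += 1

        # Print the most frequently crawled resources.
        for path, count in hits_per_path.most_common(10):
            print(f"{count:6d}  {path}")

    If the same URL shows up far more often than its og:ttl would suggest, the cache directives really are being ignored; otherwise the hits from different IPs are probably just Facebook's distributed scraper behaving normally.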
