I already have a working 404 handler in the SPA. The problem here is that Google, for example, links to old pages that no longer exist. While the user will see a custom 404 co
I have researched how an SPA can mimic or respond to search-bot requests, so here we go: three working solutions.
Meta tag #1
Description:
HTTP code 404 means that the resource does not exist or was removed. For a removed resource, we want to tell Googlebot to drop the "dead" link from the search index. Great! Now we have another question which can be answered -
As Google docs state:
You can prevent a page from appearing in Google Search by including a noindex meta tag in the page's HTML code, or by returning a 'noindex' header in the HTTP request. When Googlebot next crawls that page and see the tag or header, Googlebot will drop that page entirely from Google Search results, regardless of whether other sites link to it.
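A minimal sketch of the meta tag approach, assuming the SPA has a hook that runs when its router falls through to the 404 view. The helper name addNoindexMeta is hypothetical; it uses only standard DOM calls and takes the document as a parameter so it can be exercised outside a browser. This only helps if the crawler actually executes the page's JavaScript.

```javascript
// Hypothetical helper: call it from the SPA's "not found" view.
// `doc` is the page's document (passed in so the function is easy to test).
function addNoindexMeta(doc) {
  // Reuse an existing robots meta tag if the page already has one.
  let meta = doc.querySelector('meta[name="robots"]');
  if (!meta) {
    meta = doc.createElement('meta');
    meta.setAttribute('name', 'robots');
    doc.head.appendChild(meta);
  }
  // Ask crawlers to drop this URL from their index.
  meta.setAttribute('content', 'noindex');
  return meta;
}

// In the browser (assumed call site inside the 404 route handler):
// addNoindexMeta(document);
```

Reusing an existing robots tag avoids ending up with two conflicting tags when the page template already ships one.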
Meta tag #2
Description:
If we cannot (or do not want to) use our server to respond with 404 or any other code, we can try to perform some sort of SEO-safe redirect (one that works even if JS is not enabled).
This redirect uses the HTML meta refresh tag, for example <meta http-equiv="refresh" content="0; url=http://example.com/"> (redirects to example.com immediately).
Quote from StackOverflow answer:
As a reminder, and although it is not the preferred way to perform a redirect, Google accepts and follows pages having a Refresh tag with its delay set to 0, because, in some tricky cases, there is simply no other way to perform a redirect. This is the recommended method for Blogger pages (owned by Google).
An HTTP 301 will eventually be treated as a 404 if you permanently redirect to a file that does not exist. From the Google docs (Prepare for 301 redirects):
While Googlebot and browsers can follow a "chain" of multiple redirects (e.g., Page 1 > Page 2 > Page 3), we advise redirecting to the final destination. If this is not possible, keep the number of redirects in the chain low, ideally no more than 3 and fewer than 5. Chaining redirects adds latency for users, and not all browsers support long redirect chains.
JavaScript Redirect
Description:
Perform an onload redirect with window.location = '/404.html' to an invalid location (a file that does not exist), and integrate the Google Not Found Widget.
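The onload redirect could look roughly like the sketch below. The route list KNOWN_PATHS and the function name redirectIfUnknown are hypothetical, and the sketch assumes the server answers requests for the missing /404.html file with a real 404 status. The Not Found widget is left out.

```javascript
// Hypothetical list of paths the SPA can actually render.
const KNOWN_PATHS = ['/', '/about', '/contact'];

// Send the browser to a file that does not exist on the server.
// Returns true when a redirect was triggered, false otherwise.
function redirectIfUnknown(win, knownPaths) {
  if (knownPaths.includes(win.location.pathname)) {
    return false; // valid route, stay on the page
  }
  win.location.href = '/404.html';
  return true;
}

// Register the check only when running in a browser.
if (typeof window !== 'undefined') {
  window.addEventListener('load', () => redirectIfUnknown(window, KNOWN_PATHS));
}
```

Passing the window object in as a parameter keeps the check testable with a plain object standing in for window.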