I am currently writing a web application using AngularJS, but I think this question applies to any client-side JavaScript framework that does routing on the client side (as
If you care about SEO, one way angular.io solved this problem (at least for Google) is by using a noindex meta tag "to indicate soft-404 status which will prevent crawlers from crawling the content of the page". Apparently the tag can be added to the document via JavaScript.
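A minimal sketch of adding that tag from JavaScript. The function name `markAsSoft404` is made up for illustration, and the document is passed in as a parameter so the function can also be run outside a browser:

```javascript
// Adds <meta name="robots" content="noindex"> to the given document's head.
// `markAsSoft404` is a hypothetical helper name, not part of any framework.
function markAsSoft404(doc) {
  const meta = doc.createElement("meta");
  meta.setAttribute("name", "robots");
  meta.setAttribute("content", "noindex");
  doc.head.appendChild(meta);
  return meta;
}

// In a browser, the router's "no route matched" branch could call:
// markAsSoft404(document);
```

Presumably the tag should also be removed again when the user navigates to a valid route, since a single-page app keeps the same document alive.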
Alternatively, using JavaScript, you can redirect to a page that will respond with an actual HTTP 404 status code. Google understands JavaScript redirects just fine. Your original /does-not-exist page, when redirected to /404-error?from=does-not-exist, will be associated with the 404 status code returned by the server. The URL structure does not matter; only the status code and the redirect are important here.
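As a rough sketch of that redirect, reusing the example URLs from this answer. The helper names are hypothetical, and it assumes the server really does answer /404-error with a 404 status:

```javascript
// Builds the URL of the server-side 404 page for a missing client-side route.
function buildNotFoundUrl(missingPath) {
  const from = missingPath.replace(/^\/+/, ""); // drop leading slashes
  return "/404-error?from=" + encodeURIComponent(from);
}

// `loc` is expected to behave like window.location.
function redirectToNotFound(loc, missingPath) {
  // replace() keeps the dead URL out of the back-button history.
  loc.replace(buildNotFoundUrl(missingPath));
}

// In a browser: redirectToNotFound(window.location, window.location.pathname);
```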
Your other options are SSR (Nuxt.js, Next.js, Angular Universal, etc.) or pre-rendering (prerender.io, Puppeteer, etc.). Google calls the latter dynamic rendering: you respond to search bot requests with a pre-rendered version, while human users get your normal client-side rendered app.
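The bot/human split could look roughly like the following on the server. The user-agent pattern is an illustrative, incomplete assumption, and both function names are made up; real setups usually rely on a maintained bot list:

```javascript
// Illustrative only: a short, incomplete list of crawler user-agent substrings.
const BOT_PATTERN = /googlebot|bingbot|duckduckbot|baiduspider|yandex/i;

function isSearchBot(userAgent) {
  return BOT_PATTERN.test(userAgent || "");
}

// Decides which version of the page a request should receive.
function chooseRenderer(userAgent) {
  return isSearchBot(userAgent) ? "prerendered" : "client";
}
```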
tl;dr: Drop hashbang support and opt for PJAX-like behavior if you care about SEO.
Are you making an app or a website? If it is a website, you need to return a 404 so that you don't confuse Google. It needs to be a real 404, not just a "page not found" message (i.e. a 200 with the message "page not found" is very bad). Also, which browsers do you care to support?
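To make the "real 404" point concrete, here is a small Node-style request handler sketch. The route list and handler name are assumptions for illustration; the point is only that the status code, not just the page body, says 404:

```javascript
// Hypothetical list of routes the application actually serves.
const KNOWN_ROUTES = new Set(["/", "/about"]);

// Works with Node's built-in http module: http.createServer(handleRequest)
function handleRequest(req, res) {
  const path = new URL(req.url, "http://localhost").pathname;
  if (KNOWN_ROUTES.has(path)) {
    res.writeHead(200, { "Content-Type": "text/html" });
    res.end("<!doctype html><title>App</title>");
  } else {
    // A real 404 status, not a 200 with a "page not found" message.
    res.writeHead(404, { "Content-Type": "text/html" });
    res.end("<!doctype html><title>Page not found</title>");
  }
}

// To try it: require("http").createServer(handleRequest).listen(8080);
```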
My opinion is that the whole hashbang server-side rendering approach should be avoided (i.e. the nasty Google SEO #! hack). Either use real pushState or, for browsers that don't support pushState, reload the whole page when the URL changes (not a hash change).
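A minimal sketch of that "pushState or full page load" choice. `navigate` and its `render` callback are hypothetical names, and `win` stands in for the browser's `window`:

```javascript
// Navigates client-side when pushState is available, otherwise asks the
// browser to load the URL as a normal page (no hash-based fallback).
function navigate(win, url, render) {
  if (win.history && typeof win.history.pushState === "function") {
    win.history.pushState({}, "", url);
    render(url); // client-side rendering of the new route
    return "pushstate";
  }
  win.location.assign(url); // full page load; the server renders the page
  return "reload";
}

// In a browser: navigate(window, "/users/42", renderRoute);
```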
Now the reason this matters is that a #! URL should never return a 404: it doesn't make sense, and it is impossible to mimic on the server side, because the server never gets what comes after the #! without running JavaScript.
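The standard `URL` API shows the split quite directly. In this sketch (the helper name `serverVisiblePart` and the example URLs are made up), everything after the `#` lives in `hash`, which the browser does not include in the HTTP request:

```javascript
// Returns the part of a URL that is actually sent to the server
// in the request line: path plus query string, never the fragment.
function serverVisiblePart(href) {
  const u = new URL(href);
  return u.pathname + u.search;
}

console.log(serverVisiblePart("https://example.com/#!/users/42"));
// prints "/"
console.log(serverVisiblePart("https://example.com/users/42?tab=posts#bio"));
// prints "/users/42?tab=posts"
```

So for the hashbang URL the server only ever sees a request for `/`, and has nothing to base a 404 on.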
Thus, if you really care about SEO, I would do something like PJAX: only use true pushState for routing, and otherwise just fall back to old web 1.0 full page loads. Consequently, the links I recommend you share, the ones that can truly be a 404, should not have #! (a traditional # is fine so long as the contents of the page don't change drastically).
Finally the 404
is mostly not a problem but rather 30X
ie redirect responses. Thats because the browser will automatically handle redirects so your Javascript AJAX calls will never see a 30X
(they will get the redirect response instead... ie 200). To handle 30X
responses you will have to send a header back for every request to indicate what the redirected URL is/was (ie what you were redirected to) so that you don't mess up the Pushstate History.
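One way the header idea might look on the client. The header name `X-Final-Url` and the function `urlForHistory` are purely hypothetical; any name works as long as the server sends it on every response. (With `fetch`, `response.url` already reports the final URL after redirects, so a custom header mainly matters for older XHR-based code.)

```javascript
// Hypothetical header the server adds to every response, carrying the URL
// that was actually served after any redirects.
const FINAL_URL_HEADER = "X-Final-Url";

// `getHeader` is a lookup function, e.g. (name) => xhr.getResponseHeader(name).
// Falls back to the requested URL when the header is absent.
function urlForHistory(requestedUrl, getHeader) {
  return getHeader(FINAL_URL_HEADER) || requestedUrl;
}

// In a browser, after the AJAX call completes:
// const url = urlForHistory("/old-page", (n) => xhr.getResponseHeader(n));
// history.pushState({}, "", url);
```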
Of course, if you need to support hashbang like Twitter used to (and they are the ones who eventually killed hashbang), you can leverage Google Sitemaps and rel=nofollow to try to mitigate bad SEO.