We have an AngularJS site using HTML5 routes. I just did some test \"Fetch as Google\" runs. The results are a bit confusing:
I will try to answer your questions based on our experiences in the last month of developing a SPA with HTML5 mode.
This is actually quite simple but easy to overlook. In fact, there are two different ways to get Googlebot to try the escaped_fragment. The first method is to run your site in non-html5 mode. This means that your URLs will be of the form:
http://my.domain.com/base/#!some/path/on/website
Googlebot recognizes the #! and makes a second call to your server with an altered URL:
http://my.domain.com/base/?_escaped_fragment_=some/path/on/website
Which you can then handle as you wish. The second way to get Googlebot to try _escaped_fragment_ mode is to include the following meta tag on the index page you supply to the bot:
<meta name="fragment" content="!">
This will make googlebot check the other version of the webpage every time it sees the tag. Interestingly you can use both these techniques together or you can do what we ended up doing, which is running in html5 mode with the meta tag. This means that your URLs will be escaped as follows:
http://my.domain.com/base/some/path/on/website?_escaped_fragment_=
Interestingly, the bot will not put anything at the end of the fragment. But depending on what webserver you are running, you can easily map this with a pattern matching the "_escaped_fragment_" text to your alternate bot page. For more information on the escaped fragment go here.
Google's Bots can actually interpret JavaScript to a limited extent since early 2014. For more information, read the official blog entry on google webmasters here. However, as is made clear in the blog entry, this comes with a lot of caveats. For instance:
As of 18/12/2014, we are still unsure if Googlebot can actually extract any information from an SPA in rendered mode for its index beyond finding links to follow in the javascript. In our experience, Googlebot will include {{}} in its index listing so that when you try to use {{}} to fill meta information (description, keywords, title, etc...) your site looks like this in Google Search results:
{{meta.siteTitle}}
http://my.domain.com/base/some/path/on/website
{{meta.description}}
rather than what you expect which might look like this:
Domain
http://my.domain.com/base/some/path/on/website
This is a random page on my domain. An excellent example page to be sure!
Google recommend to serve an HTML snapshot of AJAX website by using hashbang (#!) and _escaped_fragment_ param.
But as often for new Google feature all Google services do not support it from the begging.
For now, by experience, we are sure GoogleBot for indexing webpage use HTML snapshot and _escaped_fragment_. You can check your Server Access Logs to be sure Google did it on your application.
(For now and by experience, nothing official by Google) other services like PageSpeed Insight, Webmaster Tools parser, Richsnippet testing tools, etc.: hasbang (#!) is not supported. You have to use _escaped_fragment_.
No. Just don't. For different reasons :
Google looks for #! in our site urls and then takes everything after the #! and adds it in _escaped_fragment_ query parameter. Some developers create basic html pages with real data and serve these pages from server side at the time of crawling. So , why not we render same pages with PhantomJS on serve side which has _escaped_fragment_. For more detail please read this blog .
Maybe a bit outdated, but for the completeness:
According to the statement from May 23, 2014 Google bot is now able to "see your content more like modern Web browsers".
According to their statement from October 14, 2015 Google deprecated the AJAX crawling scheme.
So using the HTML5 History API (html5mode in angular) should be no problem to Google.