On a pushState-enabled page, you normally redirect SEO bots using the _escaped_fragment_ convention. You can read more about that in Google's AJAX crawling specification.
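(In short: when a crawler sees a hashbang URL like http://example.com/#!/about, it requests http://example.com/?_escaped_fragment_=/about instead; a pushState page opts in by serving <meta name="fragment" content="!">, which tells the crawler to re-request the plain URL with ?_escaped_fragment_= appended.)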
I'm using Symfony2, and although other devs tell me that Googlebot and Bingbot execute JavaScript well enough to generate their own HTML snippets, I don't feel confident. I also feel that serving static resources is a better alternative for people browsing with JS turned off (however unlikely that is), so I'm interested in serving HTML snippets anyway, as long as it's not a hassle. Below is a method I'm thinking of using but haven't tried:
Here are other SO questions that are similar (one is mine).
Angularjs vs SEO vs pushState
HTML snippets for AngularJS app that uses pushState?
Here's the approach I posted in that question and am considering for myself in case I want to send HTML snippets to bots, using a Symfony2 backend:
In your Symfony2 routing file, create a route that matches your SPA. I have a test SPA running at localhost.com/ng-test/, so my route would look like this:
# Adding a trailing / to this route breaks it. Not sure why.
NgTestReroute:
    path: /ng-test/{one}/{two}/{three}/{four}
    defaults:
        _controller: DriverSideSiteBundle:NgTest:ngTestReroute
        one: null
        two: null
        three: null
        four: null
    methods: [GET]
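Because all four placeholders default to null, this one route matches /ng-test itself as well as anything up to four segments deep beneath it, so every URL in the SPA lands on the same controller.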
In your Symfony2 controller, check the user agent to see whether it's Googlebot or Bingbot. You should be able to do this with the code below, and can use this list to target the bots you're interested in: http://www.searchenginedictionary.com/spider-names.shtml
if (strstr(strtolower($_SERVER['HTTP_USER_AGENT']), 'googlebot')) {
    // serve the pre-rendered HTML snippet to the bot
}
If your controller finds a match to a bot, send it the HTML snippet. Otherwise, as in the case with my AngularJS app, just send the user to the index page and Angular will correctly do the rest.
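Here's a minimal sketch of what that controller action might look like under Symfony 2.x conventions (the namespace, snippet path, and template name are guesses for illustration, not from my actual setup):

namespace DriverSide\SiteBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\Controller;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;

class NgTestController extends Controller
{
    public function ngTestRerouteAction(Request $request)
    {
        $userAgent = strtolower($request->headers->get('User-Agent', ''));

        // Extend this check with whichever spiders you want to target.
        if (strstr($userAgent, 'googlebot') || strstr($userAgent, 'bingbot')) {
            // Bots get the pre-rendered HTML snippet (path is hypothetical).
            return new Response(file_get_contents('/path/to/snapshots/ng-test.html'));
        }

        // Everyone else gets the SPA index page; Angular takes it from there.
        return $this->render('DriverSideSiteBundle:NgTest:index.html.twig');
    }
}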
Also, if your question has been answered, please select an answer so I and others can tell what worked for you.
I'm using PhantomJS to generate static snapshots of my pages. My directory structure is only one level deep (root and /projects), so I have two .htaccess files, in which I redirect to a PHP file (index-bots.php) that starts a PhantomJS process pointed at my SPA index.html and prints out the rendered static pages.
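index-bots.php itself can stay small. Here's a minimal sketch, assuming PhantomJS is on the server's PATH and a hypothetical render.js script that opens the given URL in PhantomJS, waits for the page to render, and prints the resulting HTML to stdout:

<?php
// index-bots.php -- render the requested SPA URL with PhantomJS
// and return the static HTML to the bot.
$path = isset($_GET['url']) ? $_GET['url'] : '/';

// Rebuild the full URL the bot originally requested.
$target = 'http://' . $_SERVER['HTTP_HOST'] . $path;

// escapeshellarg() guards against shell injection via the query string.
passthru('phantomjs render.js ' . escapeshellarg($target));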
The .htaccess files look like this:
/.htaccess
# redirect search engine bots to index-bots.php
# in order to serve rendered HTML via phantomjs
RewriteCond %{HTTP_USER_AGENT} (bot|crawl|slurp|spider) [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^/index-bots\.php [NC]
RewriteRule ^(.*)$ index-bots.php?url=%{REQUEST_URI} [L,QSA]
/projects/.htaccess
# redirect search engine bots to index-bots.php
# in order to serve rendered HTML via phantomjs
RewriteCond %{HTTP_USER_AGENT} (bot|crawl|slurp|spider) [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ ../index-bots.php?url=%{REQUEST_URI} [L,QSA]
A couple of notes:

- The !-f RewriteCond is critical! Since .htaccess will apply RewriteRules to all requests, the assets on your page would each be rewritten to the PHP file, spinning up multiple instances of PhantomJS and bringing your server to its knees.
- It's also important to exempt index-bots.php from the rewrites, or you'll get an endless loop.

Had a similar problem on a single-page web app.
The only solution I found was effectively to create static versions of the pages so that Google (and other) bots can navigate them.
You could do this yourself, but there are also services that do exactly this and create your static cache for you (and serve up the snapshots to the bots over their CDN).
I ended up using SEO4Ajax, although other similar services are available!
I was having the exact same problem. For now, I've modified .htaccess like so:
# Serve the root snapshot when a crawler requests the site root
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^$ /snapshots/index.html? [L,NC]

# Map any other crawler-requested path to the matching snapshot file
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^(.*)$ /snapshots/$1.html? [L,NC]
Not sure if there's a better solution, but it's working for me so far. Just be sure to have the directory structure for your snapshots match the URL structure.
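For example, with the rules above a crawler request for http://example.com/projects/foo?_escaped_fragment_= gets served /snapshots/projects/foo.html (example.com and the paths are illustrative), so your snapshots folder needs a projects/ subdirectory mirroring the live URLs.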