On a pushState-enabled page, you would normally redirect SEO bots using the escaped_fragment convention. You can read more about that here.
I'm using PhantomJS to generate static snapshots of my pages. My directory structure is only one level deep (root and /projects), so I have two .htaccess files, in which I redirect to a PHP file (index-bots.php) that starts a PhantomJS process pointed at my SPA index.html and prints out the rendered static pages.
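For context, index-bots.php is little more than a shell-out to PhantomJS that echoes whatever the headless browser renders. A stripped-down sketch of that bridge (the snapshot.js render script name and the host handling here are placeholders, not my exact code):

index-bots.php (simplified)
<?php
// Route requested by the crawler, forwarded from .htaccess as ?url=...
$path = isset($_GET['url']) ? $_GET['url'] : '/';

// Absolute URL of the SPA page to render (host taken from the request).
$target = 'http://' . $_SERVER['HTTP_HOST'] . $path;

// Shell out to PhantomJS; snapshot.js is the render script that loads the
// page, waits for it to settle, and prints the resulting HTML to stdout.
$cmd  = 'phantomjs snapshot.js ' . escapeshellarg($target);
$html = shell_exec($cmd);

if (!$html) {
    // PhantomJS failed or produced no output.
    http_response_code(500);
    exit('Snapshot rendering failed');
}

header('Content-Type: text/html; charset=utf-8');
echo $html;

Every bot hit spawns a fresh PhantomJS process, which is why the guards in the .htaccess files below matter.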
The .htaccess files look like this:
/.htaccess
# redirect search engine bots to index-bots.php
# in order to serve rendered HTML via phantomjs
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (bot|crawl|slurp|spider) [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^/index-bots\.php [NC]
RewriteRule ^(.*)$ index-bots.php?url=%{REQUEST_URI} [L,QSA]
/projects/.htaccess
# redirect search engine bots to index-bots.php
# in order to serve rendered HTML via phantomjs
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (bot|crawl|slurp|spider) [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ ../index-bots.php?url=%{REQUEST_URI} [L,QSA]
A couple of notes:
- The !-f RewriteCond is critical! Since .htaccess applies RewriteRules to all requests, every asset on your page would otherwise be rewritten to the PHP file, spinning up multiple instances of PhantomJS and bringing your server to its knees.
- You'll also want to exclude index-bots.php itself from the rewrites to avoid an endless loop (hence the extra RewriteCond in the root .htaccess).