I have staging and production apps on Heroku.
For crawlers, I set up a robots.txt file.
After that I got this message from Google:
Dear Webmaster
A great solution with Rails 3 is to use Rack. Here is a post that outlines the process: Serving Different Robots.txt Using Rack. To summarize, add this to your routes.rb:
# config/routes.rb
require 'robots_generator' # Rails 3 does not autoload files in lib
match "/robots.txt" => RobotsGenerator
Then create a new file, lib/robots_generator.rb:
# lib/robots_generator.rb
class RobotsGenerator
  # Use the config/robots.txt in production.
  # Disallow everything for all other environments.
  # http://avandamiri.com/2011/10/11/serving-different-robots-using-rack.html
  def self.call(env)
    body = if Rails.env.production?
      File.read Rails.root.join('config', 'robots.txt')
    else
      "User-agent: *\nDisallow: /"
    end

    # Heroku can cache content for free using Varnish.
    headers = { 'Cache-Control' => "public, max-age=#{1.month.seconds.to_i}" }

    [200, headers, [body]]
  rescue Errno::ENOENT
    [404, {}, ['# A robots.txt is not configured']]
  end
end
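To sanity-check the endpoint (my own quick sketch, not from the linked post), you can call the class directly from a Rails console, since the env argument is ignored:

# In a Rails console (remember to require 'robots_generator' on Rails 3).
status, headers, body = RobotsGenerator.call({})
status     # => 200, or 404 in production if config/robots.txt is missing
body.join  # => the robots.txt content served for the current environment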
Finally, make sure to move robots.txt into your config folder (or wherever you specify in your RobotsGenerator class).
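For reference, the production config/robots.txt is just an ordinary robots file; assuming you want production fully crawlable, it can be as simple as:

# config/robots.txt (example: allow all crawlers in production)
User-agent: *
Disallow: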
What about serving /robots.txt dynamically using a controller action instead of having a static file?
Depending on the environment, you allow or disallow search engines to index your application.
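A minimal sketch of that approach (the controller and action names are my own, the suggestion does not specify them) could look like this in Rails 3:

# config/routes.rb
match "/robots.txt" => "robots#show"

# app/controllers/robots_controller.rb
class RobotsController < ApplicationController
  def show
    body = if Rails.env.production?
      File.read Rails.root.join('config', 'robots.txt')
    else
      "User-agent: *\nDisallow: /"
    end
    render :text => body, :content_type => 'text/plain'
  rescue Errno::ENOENT
    render :text => '# A robots.txt is not configured', :status => 404, :content_type => 'text/plain'
  end
end

As with the Rack approach, remove any static public/robots.txt so the route is actually reached.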