I have something of a staging server on the public internet running copies of the production code for a few websites. I'd really rather the staging sites didn't get indexed.
Create a robots.txt file with the following contents:
User-agent: *
Disallow: /
Put that file somewhere on your staging server; your document root is a great place for it (e.g. /var/www/html/robots.txt).
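To sanity-check that those two lines really do block every crawler from every path, you can feed them to Python's standard-library robots.txt parser. This is just a local sketch of the rules above; the file contents are the only thing taken from the answer itself:

```python
from urllib.robotparser import RobotFileParser

# The exact rules from the robots.txt above.
rules = """User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Neither a generic bot nor a named crawler may fetch any URL.
for agent in ("*", "Googlebot"):
    for path in ("/", "/index.html", "/staging/page"):
        assert not parser.can_fetch(agent, path)

print("all paths disallowed for all user agents")
```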
Add the following to your httpd.conf file:
# Exclude all robots
<Location "/robots.txt">
    SetHandler None
</Location>
Alias /robots.txt /path/to/robots.txt
The SetHandler directive is probably not required, but you may need it if you're using a handler such as mod_python that would otherwise intercept the request.
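If it helps to see what the Alias directive is doing, here is a toy sketch of its URL-to-filesystem mapping in Python. This is an illustration only (Apache's real matching has more rules); the mapping entry mirrors the Alias line above, and the helper function name is made up:

```python
# Toy model of Apache's Alias directive: map a request URL to a
# filesystem path by longest-prefix-style matching (simplified).
def alias_lookup(url_path, mapping):
    for prefix, target in mapping.items():
        # Alias matches the prefix itself or a subpath under it.
        if url_path == prefix or url_path.startswith(prefix + "/"):
            return target + url_path[len(prefix):]
    return None  # no alias applies; Apache serves the URL normally

# Mirrors: Alias /robots.txt /path/to/robots.txt
MAPPING = {"/robots.txt": "/path/to/robots.txt"}

print(alias_lookup("/robots.txt", MAPPING))   # served from the aliased file
print(alias_lookup("/index.html", MAPPING))   # None: handled as usual
```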
That robots.txt file will now be served for all virtual hosts on your server, overriding any robots.txt file you might have for individual hosts.
(Note: My answer is essentially the same as what ceejayoz's answer suggests, but I had to spend a few extra minutes figuring out the specifics to get it working. I decided to put this answer here for the sake of others who might stumble upon this question.)