How do I tell search engines not to index content via secondary domain names?

拟墨画扇 submitted on 2019-12-07 16:02:41

You can simply create a redirect with a .htaccess file like this:

RewriteEngine on
# Match the bare domain as well as any subdomain (b.com, www.b.com, ...)
RewriteCond %{HTTP_HOST} (^|\.)b\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} (^|\.)c\.com$ [NC]
RewriteRule ^(.*)$ http://a.com/$1 [R=301,L]
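
To make it easier to see which requests such a redirect would catch, here is a rough Python sketch of the same decision. It is not how Apache evaluates the rules; the function name `redirect_target` is made up for illustration, and it assumes you want both the bare secondary domains and their subdomains sent to a.com.

```python
import re

# Assumption: b.com and c.com (bare or with any subdomain) are the secondary hosts.
SECONDARY_HOST = re.compile(r"(^|\.)(b|c)\.com$", re.IGNORECASE)


def redirect_target(host, path):
    """Return the a.com URL a 301 should point to, or None if no redirect applies.

    `host` is the value of the Host header, `path` is the requested path.
    """
    hostname = host.split(":", 1)[0]  # drop an optional port such as ":80"
    if SECONDARY_HOST.search(hostname):
        # Keep the requested path so every page maps to its counterpart on a.com.
        return "http://a.com/" + path.lstrip("/")
    return None


if __name__ == "__main__":
    for host in ("www.b.com", "c.com", "a.com"):
        print(host, "->", redirect_target(host, "/page.html"))
```

Requests that already arrive on a.com return `None`, which matters because redirecting the primary host to itself would loop.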

robots.txt is the way to tell spiders what to crawl and what not to crawl. If you put the following in the root of your site at /robots.txt:

User-agent: *
Disallow: /

A well-behaved spider will not crawl any part of your site. Most large sites have a robots.txt; Google's, for example, contains rules like these:

User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /news
#and so on ...
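
If you want to check how a well-behaved crawler would read rules like these before deploying them, Python's standard `urllib.robotparser` can evaluate them locally. This is only a sketch: the helper name `is_allowed` and the example.com URLs are placeholders, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets shown above, as they would appear in /robots.txt.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

PARTIAL = """\
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /news
"""


def is_allowed(robots_txt, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(is_allowed(BLOCK_ALL, "http://b.com/page.html"))
    print(is_allowed(PARTIAL, "http://example.com/search"))
    print(is_allowed(PARTIAL, "http://example.com/about"))
```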

It pretty much depends on what you want to achieve. A 301 says that the content has moved permanently (and it is the proper way of transferring PageRank). Is this what you want to achieve?

You want Google to behave? Then you may use robots.txt, but keep in mind there is a downside: this file is readable from outside and always located in the same place, so you basically give away the location of directories and files that you may want to protect. So use robots.txt only if there is nothing worth protecting.

If there is something worth protecting, then you should password-protect the directory; this would be the proper way. Google will not index password-protected directories.

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93708

For the last method, it depends on whether you want to use the httpd.conf file or .htaccess. The best way is to use httpd.conf, even if .htaccess seems easier.

http://httpd.apache.org/docs/2.0/howto/auth.html

Have your server-side code generate a canonical reference that points to the page to be considered the "source". Example (with a hypothetical URL): <link rel="canonical" href="http://a.com/page.html" />

Reference: http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html - Update: this link-tag is currently also supported by Ask.com, Microsoft Live Search and Yahoo!.
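
As a rough idea of what generating that canonical reference on the server side could look like, here is a small Python sketch. It assumes a.com is the primary domain and that every page lives at the same path on all domains; `PRIMARY_HOST` and `canonical_link_tag` are illustrative names, not part of any framework.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption: a.com is the domain that should be treated as the "source".
PRIMARY_HOST = "a.com"


def canonical_link_tag(request_url):
    """Build a rel="canonical" link tag pointing at the same page on PRIMARY_HOST."""
    parts = urlsplit(request_url)
    # Keep path and query, swap in the primary host, drop any fragment.
    canonical = urlunsplit(("http", PRIMARY_HOST, parts.path or "/", parts.query, ""))
    # Escape the URL so characters such as & are valid inside the attribute.
    return '<link rel="canonical" href="%s" />' % escape(canonical, quote=True)


if __name__ == "__main__":
    print(canonical_link_tag("http://www.b.com/page.html?id=3"))
```

The resulting tag goes in the `<head>` of the page served from every domain, including a.com itself.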
