I want to add nofollow and noindex to my site whilst it's being built. The client has requested I use these rules.
I am aware of
There is a non-standard Noindex field, which Google (and likely no other consumer) supported as an experimental feature.
Following the robots.txt specification, you can disallow neither indexing nor link following with robots.txt; it only controls crawling.
For a site that is still in development, has not been indexed yet, and doesn’t get backlinks from pages which may be crawled, using robots.txt should be sufficient:
# no bot may crawl
User-agent: *
Disallow: /
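To sanity-check rules like these before deploying them, Python's standard urllib.robotparser module can parse the lines directly. This is only a minimal sketch: the bot names and the example.com URL are placeholders, not anything from the question, and the helper name is_crawl_allowed is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from above, as a list of lines.
ROBOTS_TXT = """\
# no bot may crawl
User-agent: *
Disallow: /
""".splitlines()


def is_crawl_allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT)
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder bot names and URL, for illustration only.
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_crawl_allowed(agent, "https://example.com/any/page.html"))
```

Note that this only tells you what a compliant crawler is allowed to fetch; it says nothing about whether a URL ends up in an index.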
If pages from the site are already indexed, and/or if other pages which may be crawled link to it, you have to use noindex, which can be specified not only in the HTML but also as an HTTP header:
X-Robots-Tag: noindex, nofollow
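How you send that header depends on whatever serves the site. As one hedged illustration, assuming the site happens to run as a Python WSGI application (an assumption, nothing in the question says so), a small wrapper could attach it to every response. The names add_robots_header and demo_app are hypothetical.

```python
from wsgiref.simple_server import make_server

ROBOTS_HEADER = ("X-Robots-Tag", "noindex, nofollow")


def add_robots_header(app):
    """Wrap a WSGI app so every response carries the X-Robots-Tag header."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so the value is not duplicated.
            headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
            headers.append(ROBOTS_HEADER)
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def demo_app(environ, start_response):
    # Hypothetical placeholder app standing in for the real site.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Site under construction\n"]


if __name__ == "__main__":
    # Serve on localhost:8000 (an arbitrary port chosen for this sketch).
    with make_server("127.0.0.1", 8000, add_robots_header(demo_app)) as server:
        server.serve_forever()
```

Keep in mind that a crawler can only see this header if it is allowed to fetch the page, so it does not combine with a robots.txt that disallows everything.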
I just read this thread, and thought to add an idea.
If you want to keep a site that is under construction or development from being viewable by unauthorized users, I think this idea is safe, although it requires a bit of IT proficiency.
There is a "hosts" file on every operating system that works as a manual list of DNS entries, overriding the online DNS servers.
On Windows, it is at C:\Windows\System32\drivers\etc\hosts, and on the Linux distros I know (Android, too) it is at /etc/hosts. macOS uses /etc/hosts as well.
The idea is to add an entry like
xxx.xxx.xxx.xxx anyDomain.tld
to that file. It is important that the domain is set up on your server/provider but not yet published to the DNS servers.
What happens: since the domain is set up on the server, the server will respond to requests for that domain, but no one else on the internet (no browser) will know the IP address of your site, apart from the computers whose hosts file contains the above entry.
In this situation, you can give the entry to anyone who is interested in seeing your site (and has your authorization), and no one else will be able to see it. No crawler will see it until you publish the DNS records.
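If you hand this entry to several people, a small helper that adds the line only when it is missing can save some manual editing. The following is a rough sketch that works on the file's text only (it does not touch the real hosts file, which needs administrator rights); the function name add_hosts_entry and the address 203.0.113.10, taken from a documentation-only range, are placeholders.

```python
def add_hosts_entry(hosts_text: str, ip: str, hostname: str) -> str:
    """Return hosts_text with an 'ip hostname' line, unless hostname is already mapped."""
    wanted = hostname.lower()  # host names are case-insensitive
    for line in hosts_text.splitlines():
        # Strip comments; blank or comment-only lines yield no fields.
        fields = line.split("#", 1)[0].split()
        # fields[0] is the address; the rest are host names mapped to it.
        if wanted in [name.lower() for name in fields[1:]]:
            return hosts_text  # already present, leave the text untouched
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip} {hostname}\n"


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n"
    # 203.0.113.10 is a documentation-only address (RFC 5737), used as a placeholder.
    print(add_hosts_entry(sample, "203.0.113.10", "anyDomain.tld"), end="")
```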
I even use it for a private file server that my family shares.
Here you can find a thorough explanation on how to edit the hosts file: https://www.howtogeek.com/howto/27350/beginner-geek-how-to-edit-your-hosts-file/
noindex and nofollow mean you do not want search engines to index your site or follow its links.
So simply put this code in robots.txt:
User-agent: *
Disallow: /
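For a site that has not been indexed yet, this has roughly the same effect as noindex and nofollow, because compliant bots will not crawl it.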