How to add `nofollow, noindex` to all pages in robots.txt?

半阙折子戏 2020-12-19 03:44

I want to add nofollow and noindex to my site whilst it's being built. The client has requested that I use these rules.

I am aware of

4 answers
  • 2020-12-19 04:06

    There is a non-standard Noindex field, which Google (and likely no other consumer) supported as an experimental feature.

    Under the robots.txt specification, you can disallow neither indexing nor the following of links with robots.txt; it only controls crawling.

    For a site that is still in development, has not been indexed yet, and doesn’t get backlinks from pages which may be crawled, using robots.txt should be sufficient:

    # no bot may crawl 
    User-agent: *
    Disallow: /
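
    As a quick sanity check, the rules above can be fed to Python's standard `urllib.robotparser` to confirm that they block every path for every bot. This is only a sketch: the helper name `is_crawl_allowed` and the `example.com` URLs are assumptions made for illustration, and a real crawler may interpret robots.txt slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the answer above.
ROBOTS_TXT = """\
# no bot may crawl
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not the asker's site.
    print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/"))
```

    Note that this only checks crawl permission. It says nothing about whether a URL can still end up in an index through links from other sites.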
    

    If pages from the site are already indexed, and/or if other pages which may be crawled link to it, you have to use noindex, which can be specified not only in the HTML but also as an HTTP header:

    X-Robots-Tag: noindex, nofollow
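
    How the header gets set depends on the server or framework in use. As one possible sketch, assuming the site is served by a Python WSGI application, a small wrapper can attach it to every response. The names `add_robots_header` and `demo_app` are hypothetical and stand in for whatever the real application uses.

```python
def add_robots_header(app, value="noindex, nofollow"):
    """Wrap a WSGI app so that every response carries an X-Robots-Tag header."""

    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Drop any X-Robots-Tag set by the inner app so ours is the only one.
            headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", value))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_header)

    return wrapped


def demo_app(environ, start_response):
    # Hypothetical placeholder for the site that is still being built.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"under construction"]


# A WSGI server would be pointed at this wrapped callable.
application = add_robots_header(demo_app)
```

    Bots can only see this header on pages they are allowed to crawl, so it should not be combined with a robots.txt `Disallow: /` for the same URLs.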
    
  • 2020-12-19 04:08
    • Noindex tells search engines not to include pages in search results, but they can still follow the links on those pages (and those links can still pass PA and DA, i.e. page and domain authority).
    • Nofollow tells bots not to follow the links. We can also combine noindex with follow on pages that we don't want indexed but whose links we still want followed.
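
    The combinations described above are usually written as a robots meta tag in the page's `<head>`. A minimal sketch of a helper that builds that tag, where the function name `robots_meta_tag` is an assumption made for illustration:

```python
def robots_meta_tag(index: bool, follow: bool) -> str:
    """Build a robots meta tag for the given indexing and link-following choice."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


if __name__ == "__main__":
    # Keep the page out of search results but let bots follow its links.
    print(robots_meta_tag(index=False, follow=True))
    # Block both, as requested in the question.
    print(robots_meta_tag(index=False, follow=False))
```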
  • 2020-12-19 04:10

    I just read this thread, and thought to add an idea.

    If you want to keep a site that is under construction or in development hidden from unauthorized users, I think this idea is safe, although it requires a bit of IT proficiency.

    Every operating system has a "hosts" file that works as a manual repository of DNS entries and overrides the online DNS servers.

    On Windows it is at C:\Windows\System32\drivers\etc\hosts, and the Linux distros I know (Android, too) have it at /etc/hosts. macOS uses /etc/hosts as well.

    The idea is to add an entry like

    xxx.xxx.xxx.xxx anyDomain.tld

    to that file. It is important that the domain is created on your server/provider but not yet published to the DNS servers.

    What happens: because the domain exists on the server, the server will respond to requests for that domain, but nobody else on the internet (no browser) will know the IP address of your site, apart from the computers whose hosts file contains the entry above.

    In this situation, you can add the entry on the computer of anyone who is interested in seeing your site (and has your authorization), and no one else will be able to reach the site by its domain name. No crawler will find it that way until you publish the DNS record.
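
    To double-check that the entry was added correctly, the hosts file can be read back and searched for the domain. The following is a rough sketch: the function name `parse_hosts`, the address `203.0.113.10` (from a range reserved for documentation), and the name `staging.example.com` are all made-up placeholders, not values from this thread.

```python
def parse_hosts(text: str) -> dict:
    """Map each hostname in hosts-file text to the IP address on its line."""
    mapping = {}
    for line in text.splitlines():
        # Everything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first entry for a name and ignore later duplicates.
            mapping.setdefault(name.lower(), ip)
    return mapping


# Hypothetical hosts-file content, used here instead of the real file.
SAMPLE_HOSTS = """\
127.0.0.1      localhost
# site under construction
203.0.113.10   staging.example.com www.staging.example.com
"""

if __name__ == "__main__":
    entries = parse_hosts(SAMPLE_HOSTS)
    print(entries.get("staging.example.com"))
```

    On a real machine, the text would come from reading /etc/hosts or C:\Windows\System32\drivers\etc\hosts rather than from `SAMPLE_HOSTS`.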

    I even use it for a private file server that my family share.

    Here you can find a thorough explanation on how to edit the hosts file: https://www.howtogeek.com/howto/27350/beginner-geek-how-to-edit-your-hosts-file/

  • 2020-12-19 04:11

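    noindex and nofollow mean that you do not want search engines to index your site or follow its links.

    The simplest approach is to put this in robots.txt:

    User-agent: *
    Disallow: /

    This stops compliant bots from crawling the site. Strictly speaking, it is not the same as noindex and nofollow, but it has a similar effect for a site that has not been indexed yet.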
