Can I use robots.txt to block any directory tree that starts with numbers?

核能气质少年 submitted on 2019-12-11 04:10:02

Question


I'm not even sure if this is the best way to handle this, but I made a temporary mistake with my rewrites and Google (and possibly other search engines) picked up on it. Now it has the bad URLs indexed and keeps reporting errors for them.

Basically, I'm generating URLs based on a variety of factors, one being the id of an article, which is automatically generated. These then redirect to the correct spot.

I had first accidentally set up stuff like this:

/2343/news/blahblahblah

/7645/reviews/blahblahblah

Etc.

This was a problem for a lot of reasons, the main ones being that it created duplicates and that links weren't pointing to the right places. I have since fixed the URLs to look like this:

/news/2343/blahblahblah

/reviews/7645/blahblahblah

Etc.

And that's all good. But I want to block anything that falls into the pattern of the first. In other words, anything that looks like this:

** = any numerical pattern

/**/anythingelsehere

The goal is for Google (and any other search engine that may have indexed the wrong URLs) to stop requesting these broken URLs, which don't even exist anymore. Is this possible? Should I even be doing this through robots.txt?


Answer 1:


You don't need to set up a robots.txt for that. Just return 404 errors for those URLs, and Google and other search engines will eventually drop them.

Google also has Webmaster Tools, which you can use to deindex URLs. I'm pretty sure other search engines have similar tools.
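
As a rough illustration of the 404 approach, here is a minimal Python sketch. It assumes the retired URLs always have an all-digit first segment followed by at least one more segment, as in the question's examples. The names OLD_URL, status_for and app are hypothetical, and the WSGI wrapper is only one possible way to wire this into a server; the question does not say what stack is in use.

```python
import re

# Assumed shape of the retired URLs: an all-digit first segment followed by
# more path, e.g. /2343/news/blahblahblah. Adjust to the real old scheme.
OLD_URL = re.compile(r"^/[0-9]+/.+")


def status_for(path: str) -> int:
    """Return 404 for retired numeric-prefix URLs, 200 for everything else."""
    return 404 if OLD_URL.match(path) else 200


def app(environ, start_response):
    """Minimal WSGI app (hypothetical) that applies status_for() per request."""
    code = status_for(environ.get("PATH_INFO", "/"))
    reason = "Not Found" if code == 404 else "OK"
    start_response(f"{code} {reason}", [("Content-Type", "text/plain")])
    return [reason.encode("utf-8")]
```

With this check, /2343/news/blahblahblah gets a 404 while /news/2343/blahblahblah is served normally.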




Answer 2:


To answer the question: Yes, you can block any URLs that start with a number.

User-agent: *
Disallow: /0
Disallow: /1
Disallow: /2
Disallow: /3
Disallow: /4
Disallow: /5
Disallow: /6
Disallow: /7
Disallow: /8
Disallow: /9

It would block URLs like:

  • example.com/1
  • example.com/2.html
  • example.com/3/foo
  • example.com/4you
  • example.com/52347612

These URLs would still be allowed:

  • example.com/foo/1
  • example.com/foo2.html
  • example.com/bar/3/foo
  • example.com/only4you
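
One way to sanity-check these rules locally is Python's standard urllib.robotparser module, which applies simple prefix matching to Disallow lines. The sketch below is only a check against that parser, and real crawlers may differ in details; RULES and is_allowed are hypothetical names introduced for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above: one Disallow line per leading digit.
RULES = ["User-agent: *"] + [f"Disallow: /{digit}" for digit in range(10)]


def is_allowed(path: str, agent: str = "*") -> bool:
    """Report whether `agent` may fetch example.com + path under RULES."""
    parser = RobotFileParser()
    parser.parse(RULES)  # parse in-memory lines instead of fetching a file
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    for path in ["/1", "/2.html", "/3/foo", "/4you", "/52347612",
                 "/foo/1", "/foo2.html", "/bar/3/foo", "/only4you"]:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Under this parser, the first five paths come out blocked and the last four allowed, matching the two lists above.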


Source: https://stackoverflow.com/questions/13355409/can-i-use-robots-txt-to-block-any-directory-tree-that-starts-with-numbers
