PHP code to exclude Google

别来无恙 submitted on 2019-12-06 09:20:00
Ryan Matthews

You can use the following snippet, which should detect Googlebot and skip storing to the database.

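// strpos()/stripos() return false on no match and 0 on a match at position 0,
// so compare strictly against false instead of negating the result.
if (stripos($_SERVER['HTTP_USER_AGENT'] ?? '', 'Googlebot') === false) {
     // not Googlebot: log to database
}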
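If the check should cover more than Googlebot, the same idea can be extended to a list of user-agent substrings. The following is only a sketch: the function name `isKnownCrawler` and the signature list are assumptions made for illustration, the list is far from complete, and a user-agent header can be spoofed.

```php
<?php
// Hypothetical, incomplete list of user-agent substrings; extend as needed.
const CRAWLER_SIGNATURES = ['Googlebot', 'bingbot', 'Slurp', 'DuckDuckBot'];

/**
 * Returns true when the user-agent string contains one of the signatures
 * (case-insensitive).
 */
function isKnownCrawler(string $userAgent, array $signatures = CRAWLER_SIGNATURES): bool
{
    foreach ($signatures as $signature) {
        // stripos() returns false when nothing matches, so test with !== false.
        if (stripos($userAgent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// The header may be missing entirely, so fall back to an empty string.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
if (!isKnownCrawler($userAgent)) {
    // log to database
}
```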

Why in the world would you want to keep only Google out? Other search engines may index your site as well. What about Bing, Yahoo, AltaVista and others?

You can use a robots.txt file to ask all crawlers to stay off your site.

Create a robots.txt file in your web root and put the following in it:

User-agent: *
Disallow: /

If you want to allow crawlers on some pages, though, you can instead put a robots meta tag on each page you want kept out of the index:

<meta name="robots" content="noindex, nofollow" />

Not all bots are "nice" and respect these rules, though.

Did you think about all the other robots, spiders and automated scripts surfing the web? They will fill up your database too, and it is hell to track down all their user agents, IPs and other characteristics. It may be better to just limit the history to, let's say, 25 entries.

So my answer is: limit the entries in your history DB, or store the history in a cookie on the visitor's client.
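Capping the history could look roughly like the sketch below. The function name `addToHistory` and the newest-first ordering are assumptions chosen for illustration; the default limit of 25 is the number suggested above.

```php
<?php
/**
 * Puts $page at the front of the history and keeps at most $limit entries,
 * dropping the oldest ones.
 */
function addToHistory(array $history, string $page, int $limit = 25): array
{
    array_unshift($history, $page);          // newest entry first
    return array_slice($history, 0, $limit); // cut off anything past the limit
}

// Example with a small limit so the trimming is visible.
$history = [];
foreach (['/a', '/b', '/c', '/d'] as $page) {
    $history = addToHistory($history, $page, 3);
}
// $history is now ['/d', '/c', '/b']
```

The resulting array could be kept in `$_SESSION`, or encoded with `json_encode()` and sent with `setcookie()` if the history should live on the client.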

<?php echo $_SERVER['REMOTE_ADDR'];?> 

will give you the IP address of the client. You can then set a session variable that stores or discards the visited pages, depending on your own logic for checking the IP.
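One way to read that suggestion is sketched below. The ignore list, the function name `recordVisit` and the `history` key are all hypothetical; the addresses come from a documentation-only range and stand in for whatever IPs you decide to filter.

```php
<?php
// Hypothetical list of client addresses whose visits should be discarded.
const IGNORED_IPS = ['203.0.113.7', '203.0.113.8'];

/**
 * Appends $page to $store['history'] unless $ip is on the ignore list.
 * Returns true when the page was stored, false when it was discarded.
 */
function recordVisit(array &$store, string $ip, string $page, array $ignored = IGNORED_IPS): bool
{
    if (in_array($ip, $ignored, true)) {
        return false; // discard visits from ignored addresses
    }
    $store['history'][] = $page;
    return true;
}

// Demonstration with a plain array standing in for the session.
$store = [];
recordVisit($store, '198.51.100.1', '/products'); // stored
recordVisit($store, '203.0.113.7', '/products');  // discarded
```

In a real request you would call `session_start()` first and pass `$_SESSION` as the store, with `$_SERVER['REMOTE_ADDR']` as the IP.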

@Jan's answer is the better way, although it will cut off all robots.
