问题
I want to, using PHP, differentiate between an actual person and a bot. I currently track page views and they are massively inflated due to bots crawling my pages so I want to only record real people. It doesn't matter if its not 100% accurate I just want a nice simple way to do it via PHP.
To be clear, this is not for analytics's per se; it is so that I can track what images are being served daily so I can produce a "top images of the day" sort of script.
回答1:
You should be checking the user agent string, most well behaved search bots will report themselves as such.
Google's spider for example.
回答2:
First, the obvious: check the user agent.
I use another trick that works pretty good. I map robots.txt to a PHP file and log the IP into the database. Then when logging user activity, I make sure they aren't from one of those logged IPs. If the user authenticates via the login system then I track them regardless.
Of course neither solution guarantees any accuracy, but for general logging, it has been sufficient for my purposes.
回答3:
I'm not sure that PHP is the best solution for this kind of problem.
You can read How to block bad bots and How to block spambots, ban spybots, and tell unwanted robots to go to hell to see more solutions about blocking bots but this time with apache.
Apache will act faster a require less CPU to do this sort of task than a php program.
来源:https://stackoverflow.com/questions/3560613/php-differentiate-between-a-human-user-and-a-bot-other