How to protect an HTML form from spammers?

Posted by 谁说我不能喝 on 2019-11-27 13:18:15
Геннадий-Ванин

Update: This answer was accepted because I recommended KeyCAPTCHA. From my hard-earned, painful experience, KeyCAPTCHA is a scam run by professional spammers. I have removed my recommendations of KeyCAPTCHA.


Note that most professional spambots are integrated with the APIs of sweatshop human CAPTCHA solvers (about 1 USD per 1000 solutions). When a spambot cannot pass a CAPTCHA itself, the bot, which keeps hundreds of connections open, sends a screenshot (or the page code) containing the CAPTCHA to a sweatshop worker to solve. This is legal, and it is big business. To stay legal while integrating with bots through APIs, the human solvers must not interact directly with the targeted web boards (blog comments, registration pages, chats, wikis, forums, etc.).

Another problem is that anti-spam programs cannot detect context-based spamming by a professionally made bot. There are many approaches. The simplest is to scrape multi-author human dialogs from other web boards and post them context-sensitively (bots can detect topics) from IP addresses in different countries at different times, so that even a human weblog owner cannot tell that the dialogs were posted by bots (they really are human dialogs, stored in a database).

Whether most (if not all) CAPTCHAs get circumvented automatically is only a matter of how much interest professional spammers take in your website, or of the time and skilled persistence of amateurs.

To be honest, I find those things quite useless. If someone can bypass your CAPTCHA then they will for sure be able to bypass simple mathematical equations, as it requires much less effort to do so.

If it is for a signup form, I guess the best thing to do is to combine a CAPTCHA with a confirmation link sent by email (and exclude bogus email addresses, such as mailinator). You can periodically purge unconfirmed registrations from the DB.
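As a rough illustration of the email side of that advice, here is a minimal sketch. The blocklist contents and the function names is_acceptable_email and new_confirmation_token are my own assumptions, not part of any library, and a real blocklist would be far longer.

```php
<?php
// Hypothetical blocklist; a real one would be much longer and kept up to date.
const DISPOSABLE_DOMAINS = ['mailinator.com'];

// Returns true when the address is well-formed and its domain is not blocklisted.
function is_acceptable_email(string $email): bool
{
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }
    // Everything after the last "@" is the domain; compare case-insensitively.
    $domain = strtolower(substr(strrchr($email, '@'), 1));
    return !in_array($domain, DISPOSABLE_DOMAINS, true);
}

// Random, unguessable token to embed in the confirmation link.
function new_confirmation_token(): string
{
    return bin2hex(random_bytes(16)); // 32 hex characters
}
```

The token would be stored next to the pending registration and compared when the link is clicked; rows that stay unconfirmed past some window of your choosing can then be deleted by a scheduled job.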

Of course there is no 100% safe method. Any form of CAPTCHA can be bypassed (given enough time and resources), so I guess we have to live with that.

This question has come up many times on this site [reference needed :) ]

It is quite a complex issue but I guess the short answer is that we are stuck with the usual methods!

I think this site addresses the issue quite well but, as always, I guess that, short of horribly compromising usability, you will have to use a CAPTCHA. The more you use it, the less spam you'll get, but at a price. Remember that there is always the option of limiting by IP when a certain IP is involved in suspicious activity.
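Limiting by IP could look roughly like the sketch below. The function name allow_attempt, the limit of 5 attempts per hour, and the plain array used as storage are all assumptions made for illustration; a real site would keep the log in a database or cache.

```php
<?php
// Sliding-window limiter: allow at most $max attempts per $window seconds per IP.
// $log maps IP => list of attempt timestamps; persisting it is left out of this sketch.
function allow_attempt(array &$log, string $ip, int $now, int $max = 5, int $window = 3600): bool
{
    // Keep only the attempts that are still inside the window.
    $recent = array_filter(
        $log[$ip] ?? [],
        function ($t) use ($now, $window) { return $t > $now - $window; }
    );
    if (count($recent) >= $max) {
        $log[$ip] = array_values($recent);
        return false;
    }
    $recent[] = $now;
    $log[$ip] = array_values($recent);
    return true;
}
```

In a request handler you would call it with $_SERVER['REMOTE_ADDR'] and time(). Keep in mind that many users can share one IP, so the limits should be generous.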

As for the math question validation, I have experimented a bit with it myself in PHP. It goes something like this:

<?php

$x = mt_rand(1,5);
$y = mt_rand(1,5);

function add($x, $y) { return $x + $y; }
function subtract($x, $y) { return $x - $y; }
function multiply($x, $y) { return $x * $y; }

$operators = array(
    'add',
    'subtract', 
    'multiply'
    );

$rdno = $operators[array_rand($operators)];

$result = call_user_func_array($rdno, array($x, $y));
session_start();
$_SESSION['res'] = $result;

if ($rdno == "add") {
    $whato = "+";
}elseif ($rdno == "subtract") {
    $whato = "-";
} else {
    $whato = "*";
}
$output = $x . $whato . $y . " = ";
$_SESSION['out'] = $output;
?>
<img src="image.php" />
<form name="input" action="check.php" method="post">
<input type="text" name="result" />
<input type="submit" value="Check" />
</form>

check.php:

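<?php

session_start();

// Compare as strings and require a non-empty answer, so that an empty
// submission can never match a stored result of 0.
$expected = isset($_SESSION['res']) ? (string) $_SESSION['res'] : null;
$given = isset($_POST['result']) ? trim($_POST['result']) : '';

if ($expected !== null && $given !== '' && $given === $expected) {
    echo "correct!";
    $_MCAPTCHA = TRUE;
} else {
    echo "incorrect";
    $_MCAPTCHA = FALSE;
}

// Clear the stored challenge so it cannot be replayed.
session_unset();

?>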

and image.php:

<?php
session_start();
//image creation

// Create a 100*30 image
$im = imagecreate(100, 30);

// White background and blue text
$bg = imagecolorallocate($im, 255, 255, 255);
$textcolor = imagecolorallocate($im, 0, 0, 255);

// Write the string at the top left
imagestring($im, 5, 0, 0, $_SESSION['out'], $textcolor);

// Output the image
header('Content-type: image/png');

imagepng($im);
imagedestroy($im);
?>

You could add some Gaussian blur to it, and so on.

Of course this is only an example (DO NOT EVER USE THIS :) )

But it is just an idea of what could be done.

The bad thing about this is that, unless you want users to do very complex math (which may be fine for only some audiences), your options are more limited. Besides, if anyone wants to target your site specifically, having limited options might be a bad idea, since it leaves you very vulnerable.

To sum up, IMHO you are stuck with the usual methods and will have to live with SOME spam. It's just a compromise that you might have to accept.

You might find Jeff's article on Coding Horror very interesting.

Good luck!!

I'm having problems with spam entries in my database coming through the signup form. I have tried many open source CAPTCHA solutions, but I am still facing the same problem.

What kind of spam protection are you using? I find it strange that the spam protection is failing (completely). As a lot of other people are saying, reCAPTCHA is pretty good, and a lot of big players are using it (think Twitter).

You could, for example, make registration use reCAPTCHA. Next, verify that the user is not posting spam by running their first few posts through WordPress's Akismet. This should help you detect even more spam.

Then again, it is almost impossible to defeat spam completely. I read somewhere that some spammers even hire cheap labor, for example in India, to break your spam protection.

Would it be better to have a series of simple randomized questions, or something like "6 + ? = 9", as a question? The only thing that concerns me is that if it's this easy to protect a signup, then why aren't big giants like Facebook doing it?

This approach has a couple of drawbacks:

  • This logic can easily be embedded inside a spambot. I could write code that defeats 6 + ? = 9 with hardly any effort.
  • Some users could be (really) bad at math, or might not know the answer to your question.
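To show how little effort the first drawback takes, here is a minimal sketch of a solver for puzzles of that shape. The function name solve_blank and the exact puzzle syntax it accepts are assumptions for illustration only.

```php
<?php
// Solve puzzles of the form "a + ? = c", "a - ? = c" or "a * ? = c".
// Returns null when the text does not match or has no whole-number answer.
function solve_blank(string $puzzle): ?int
{
    if (!preg_match('/^\s*(\d+)\s*([+\-*])\s*\?\s*=\s*(\d+)\s*$/', $puzzle, $m)) {
        return null;
    }
    $a = (int) $m[1];
    $c = (int) $m[3];
    switch ($m[2]) {
        case '+':
            return $c - $a;
        case '-':
            return $a - $c;
        default: // '*'
            return ($a !== 0 && $c % $a === 0) ? intdiv($c, $a) : null;
    }
}
```

A bot author only has to write something like this once per question format, which is why a fixed family of arithmetic questions offers little protection against a targeted attack.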

Since that wasn't mentioned here, I'll briefly go over the method I have been using rather successfully on a moderately visited forum. Note, that I will only explain the basic idea. There are several variations that can be implemented to make automated spam even harder.

What I do is this:

  1. Introduce some constant as salt. This constant is unique to your site and it's supposed to be a secret.
  2. Use the remote IP, user agent, hour of the day (note that this can make it fail if the hour changes between requesting and sending the form) and similar data to calculate a salted hash (MD5, SHA1). Another input to the hash is the original field name of the form element (e.g. email, name, ...), so that each field name is now computed per client. I prepend a letter or similar to make sure the name doesn't start with a digit, which can cause problems.
  3. User sends the form.
  4. Receiving script has the same input data (i.e. it does not have to be sent via the form or so).
  5. After the receiving script uses the same method as in 2., it can evaluate the form data and take respective action.
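The steps above can be sketched as follows. This is my reading of the idea, not the poster's actual code: the names FIELD_SALT, field_name and read_fields are hypothetical, and using hash_hmac with SHA-256 is an assumption (the answer only mentions a salted MD5 or SHA1 hash).

```php
<?php
// Hypothetical site-wide secret; in practice load it from configuration.
const FIELD_SALT = 'replace-with-your-own-secret';

// Derive the per-client name for a form field (step 2).
function field_name(string $field, string $ip, string $userAgent, string $hour): string
{
    $data = implode('|', [$field, $ip, $userAgent, $hour]);
    // The "f" prefix keeps the name from starting with a digit.
    return 'f' . substr(hash_hmac('sha256', $data, FIELD_SALT), 0, 16);
}

// Map the obfuscated POST keys back to the original field names (steps 4 and 5).
// Fields that were not submitted under the expected name come back as null.
function read_fields(array $post, array $fields, string $ip, string $userAgent, string $hour): array
{
    $out = [];
    foreach ($fields as $field) {
        $out[$field] = $post[field_name($field, $ip, $userAgent, $hour)] ?? null;
    }
    return $out;
}
```

When rendering and when receiving, the inputs would typically be $_SERVER['REMOTE_ADDR'], $_SERVER['HTTP_USER_AGENT'] and date('Y-m-d H'). One way to soften the hour-change problem is to also try the previous hour when the first lookup returns only nulls.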

Again, this can be combined with other means. But the unique salt will allow this to be widely used - different salt values make it impossible to predict the field name easily, even if the method to compute the hash is known. Other means will have to be used to disguise the respective form input elements if the spammers get smart, though (i.e. if they don't just look for the name of the field).

It's simple, 100% screen-reader-compatible (i.e. usable even for blind people) and worked wonders for me. It cut down tremendously on spam in a forum I manage. Hope it'll help you, too.

Even captchas are decoded as can be seen in this article by John Resig:

OCR and Neural Nets in JavaScript

And there exist online tools too.

Having said that, Google's popular reCAPTCHA solution, which this site uses as well, seems to be a good choice.

On the other hand, one always has the option of moderation.

Have you already tried reCAPTCHA?

There are already many spambots out there which can solve simple math questions.

The reason Facebook isn't using something like that is if they did, their solution would be specifically cracked because they are a massive company with millions of users.

Are you sure you can't use reCAPTCHA? I think it is the best captcha on the internet right now.

But I thought of a completely different approach to the problem, which may be worth trying.

You could saddle Google, Twitter, Facebook and others with the problem by using OpenID for sign up. This way, spammers need to have a Google Account, for instance. I'm pretty sure they won't spam with that.

I've just finished porting the excellent CFFormProtect by Jake Munson to PHP. It's hosted at http://code.google.com/p/phpformprotect/

It uses a combination of tests including javascript-based checks for mouse movement, keyboard usage and time spent filling it out, as well as some basic checks for urls, spammy words and optional integration with Akismet and Project Honey Pot. I've found it to be an excellent deterrent that's pretty much completely invisible to legitimate users.
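To give a feel for one of those tests, here is a minimal sketch of a "time spent filling it out" check. It is not taken from phpformprotect; the function name plausible_fill_time and the 3-second and 1-hour bounds are assumptions.

```php
<?php
// Returns true when the form was submitted neither too fast nor too late.
// The 3-second and 1-hour bounds are hypothetical defaults.
function plausible_fill_time(int $renderedAt, int $submittedAt, int $minSeconds = 3, int $maxSeconds = 3600): bool
{
    $elapsed = $submittedAt - $renderedAt;
    return $elapsed >= $minSeconds && $elapsed <= $maxSeconds;
}
```

One way to use it is to store time() in the session when the form is rendered and compare it with time() on submission; bots that post instantly fail the lower bound.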

I'm sure the port needs work but it works for me. Feel free to contribute anything.

You can do that without a CAPTCHA: add a hidden form field and then check whether this field, which people can't see, has been filled in. You can do that with PHP:

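// Real users never see the hidden field, so any value in it suggests a bot.
if (isset($_POST['hidden_input']) && $_POST['hidden_input'] !== "") {
    echo('<p>You are a spam bot</p>');
}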

This works because spambots usually fill in every text field.

In the form you should add only

 <input type="text" id="hidden_input" name="hidden_input" style="display:none;"/>