One of the rules of today’s web is that if you have forms on your website, you need some kind of spam prevention in place. Here I will walk through several methods that can be used, and explain why CAPTCHA isn’t one of them.
Why not CAPTCHA?
Well, as far as spam prevention goes, a CAPTCHA challenge works. The main problem is that it considerably degrades the user experience: reading the garbled image takes extra effort and time, which may deter the user from completing the form altogether. The audio version of CAPTCHA was introduced to address that, but it is, seriously, even more troublesome. On top of that, a CAPTCHA image certainly cripples the aesthetics of a well-designed form. Despite all these shortcomings, CAPTCHA might still be the best method to counter manual spammers (humans), because they would get too annoyed to complete the challenge.
Apart from image CAPTCHA, we could use a different kind of challenge question. One popular approach is to present users with a simple arithmetic operation, e.g. “What is 4 + 2?” As with CAPTCHA, the question is randomized each time the form reloads. Another variant is to use questions that require textual answers, such as “What animal meows?”. Questions like these should be kept as simple as possible, both because they introduce a language barrier between the user and the system and because the user should not have to make any extra effort to answer them.
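As a rough illustration of the arithmetic variant, here is a minimal Python sketch. The function names make_challenge and check_answer are made up for this example, and it assumes the expected answer is kept on the server (for instance in the user's session) rather than sent to the browser.

```python
import random


def make_challenge(rng=random):
    """Return a (question, expected_answer) pair for a simple addition.

    The expected answer is meant to stay on the server (e.g. in the
    session); only the question text is shown on the form.
    """
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} + {b}?", str(a + b)


def check_answer(expected, submitted):
    """Compare the submitted answer to the stored one, ignoring spaces."""
    return submitted.strip() == expected
```

A new pair would be generated each time the form is rendered, so the question changes on every reload.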
Alternative “Are You Human?” Tests
The methods described above require user input in order to distinguish humans from spambots. This additional step can be eliminated by using a little bit of programming logic to validate the form submission on the server side.
Spambots usually love to fill out every field on a form. We can take advantage of this behaviour and trap the bots by setting up a field that is hidden from the user’s view, on the assumption that a legitimate human user would never fill it in. This field has to be a normal type="text" input with a tempting name like "email" or "website", except that it is hidden with the CSS rule display: none. The idea is to flag the form submission as spam if this field was filled in. To cater for screen reader users and browsers without CSS support, the field should carry a clear label telling them to leave it blank. Additionally, bots tend to post links and irrelevant keywords in textarea fields. A carefully crafted regex validation on these fields helps reject a substantial share of such submissions.
Another, more advanced approach is to attach a unique, dynamically generated token as a hidden field on your form and then check its validity upon submission. These tokens can be produced from session IDs or simply from timestamps. The idea behind this method is to ensure that the user viewing the form is the same user submitting it. With the timestamp method we can also calculate the time elapsed between the page being viewed and the form being submitted; a form that comes back implausibly quickly was probably not filled in by a human.
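The hidden-field trap and the link check can both live in one small server-side function. The following Python sketch is only an illustration: the field names "website" and "message", the limit of two links and the regex itself are assumptions made for this example, not values from any particular framework.

```python
import re
from typing import Mapping

HONEYPOT_FIELD = "website"  # hypothetical name of the hidden trap field
MAX_LINKS = 2               # hypothetical limit, tune for your own forms

# Anything that starts like a URL, up to the next whitespace.
LINK_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def is_spam(form: Mapping[str, str]) -> bool:
    """Flag a submission if the trap field is filled or links pile up."""
    # A human never sees the hidden field, so any value is suspicious.
    if form.get(HONEYPOT_FIELD, "").strip():
        return True
    # Too many links in the message body is a common spam signal.
    links = LINK_PATTERN.findall(form.get("message", ""))
    return len(links) > MAX_LINKS
```

A stricter or looser link limit is a judgment call; a contact form may tolerate no links at all, while a comment form probably should allow one or two.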
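Here is one way the token idea might look in Python. Treat it as a sketch under a few assumptions: the token combines a timestamp with the session ID and is signed with HMAC so it cannot be forged (the signing step is my addition, not something the method strictly requires), and SECRET_KEY, MIN_SECONDS and MAX_SECONDS are placeholder values.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # placeholder, keep the real key on the server
MIN_SECONDS = 3            # assumed: faster than this looks automated
MAX_SECONDS = 3600         # assumed: older tokens are treated as expired


def _sign(session_id, issued):
    """Sign the session ID and issue time with the server's secret."""
    msg = f"{session_id}:{issued}".encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def make_token(session_id, now=None):
    """Build the value for the hidden field when the form is rendered."""
    issued = int(time.time() if now is None else now)
    return f"{issued}:{_sign(session_id, issued)}"


def check_token(token, session_id, now=None):
    """Accept only a genuine token that is neither too fresh nor too old."""
    try:
        issued_text, signature = token.split(":", 1)
        issued = int(issued_text)
    except ValueError:
        return False  # malformed token
    expected = _sign(session_id, issued)
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False  # forged, or issued for a different session
    elapsed = (time.time() if now is None else now) - issued
    return MIN_SECONDS <= elapsed <= MAX_SECONDS
```

The now parameter exists only to make the functions easy to try out with fixed times; in a real form handler it would be left at its default.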
Modern Day Solutions
To keep pace with the advancement of spambots, the measures taken to combat spammers have also gone a step further, using a collaborative, distributed and intelligent approach, as can be seen in Akismet and Project Honey Pot. The Akismet service is fairly popular because it ships with WordPress, arguably the most used blogging platform in the world. It runs numerous tests on the form submission data against its own huge collection of blacklists and whitelists, and returns a thumbs up or thumbs down. The filter works by combining information about spam captured on all participating sites, and then using those spam rules to block future spam.
All in all, there will never be a silver bullet for web form spam. Spambots will keep getting smarter, and we almost certainly cannot stop manual spamming. Nevertheless, with modern services like Akismet and Project Honey Pot, the prospects certainly look promising.