Posted by Deliverator on July 29th, 2007

Discerning whether a comment comes from a human being or a spambot is a surprisingly difficult problem, and a large number of automated solutions have sprung up to keep bots from filling up the blogoverse with blogorrhea (at least the type generated by bots, to say nothing of human inanities). Up until recently, I used a combination of Akismet and a plugin called Did You Pass Math? to automatically discard the vast majority of comment spam. If a potential comment passed Akismet and demonstrated basic math skills, WordPress threw the comment into a moderation queue for further examination by a meat filter (myself). WordPress would email me, and I could discard or approve the comment in short order. Recently, I have started receiving comments in my final-stage moderation queue which indicate that robots have learned to add and subtract or, an equally startling possibility, that human beings have done the same! Today addition, tomorrow the world! Surely the apocalypse is nigh!
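
For the curious, the three-stage filter described above can be sketched roughly like this in Python. It is only an illustration: the function names, the keyword check standing in for Akismet, and the "a + b" question are all made up for the example and are not the actual plugins' code.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    DISCARDED = "discarded"
    QUEUED = "queued for moderation"


@dataclass
class Comment:
    author: str
    body: str
    math_answer: str  # what the commenter typed into the arithmetic box


def looks_like_spam(comment: Comment) -> bool:
    """Hypothetical stand-in for the Akismet check (not Akismet's real API)."""
    return "cheap pills" in comment.body.lower()


def passed_math(comment: Comment, a: int, b: int) -> bool:
    """Hypothetical stand-in for the arithmetic question, here 'a + b = ?'."""
    return comment.math_answer.strip() == str(a + b)


def triage(comment: Comment, a: int, b: int) -> Verdict:
    # Stage 1 is the spam classifier, stage 2 is the arithmetic question.
    if looks_like_spam(comment) or not passed_math(comment, a, b):
        return Verdict.DISCARDED
    # Stage 3: anything that survives waits for the meat filter.
    return Verdict.QUEUED
```

Note that nothing is ever auto-approved in this sketch: the best a comment can do is reach the queue, which is exactly why a bot that can add ends up in my inbox.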

I decided to swap out Did You Pass Math? for a more robust Captcha-based solution. I have not been a big fan of Captcha-based solutions, in part due to their almost universally poor implementations. Many Captcha implementations are extremely difficult for the average human being to “solve,” but surprisingly easy for special-purpose OCR software. In other cases, spammers looking to circumvent Captcha-based solutions will cleverly relay the Captcha images to the login pages of high-traffic porn sites and use porn-starved human beings to solve the Captcha for them. Additionally, many Captcha-based solutions make vital Internet services…like my blog…inaccessible to blind users. Enter reCAPTCHA, a free service from Carnegie Mellon University, the guys who quite literally invented the term CAPTCHA (or at least hold all the trademarks).

reCAPTCHA puts a couple of twists on the CAPTCHA concept:

-Make it difficult to impossible to redirect the CAPTCHA to another site to be solved by an unwitting human.
-Get your initial source material for generating the CAPTCHA image from books being scanned for the Internet Archive. Use only snippets which are giving the Archive’s OCR software problems. This text is by definition difficult for automated OCR software to solve.
-Distort the image in ways that make it even more difficult for OCR software, but which don’t increase the difficulty for human identification.
-Use human beings as “proof readers,” making every solving of a Captcha a meaningful contribution towards the preservation of human knowledge and not simply a task which wastes 15 seconds of your time and has you swearing under your breath.
-Provide an audio CAPTCHA system for blind users
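
The “proof reader” twist can be sketched in a few lines of Python. This is my rough understanding rather than reCAPTCHA's actual code: it assumes each challenge pairs a control word whose answer is already known with a snippet the OCR software could not read, and the function names and the three-vote threshold are invented for the example.

```python
from collections import Counter, defaultdict

# For each unreadable snippet, count the transcriptions humans have offered.
votes = defaultdict(Counter)


def check_answer(control_answer, control_truth, unknown_id, unknown_answer):
    """Pass the human only if the control word is right; if so, bank their
    reading of the unknown snippet as one proof-reading vote."""
    if control_answer.strip().lower() != control_truth.lower():
        return False
    votes[unknown_id][unknown_answer.strip().lower()] += 1
    return True


def accepted_reading(unknown_id, min_votes=3):
    """Return the most common transcription once enough humans agree
    (min_votes is a hypothetical threshold), otherwise None."""
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= min_votes else None
```

The point of the design is that a bot guessing at random fails the control word and never gets to vote, while every human who passes does a tiny bit of transcription for free.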

Anyways, I installed the WordPress plugin for reCAPTCHA today. Give it a try and let me know what you think….or at least try. I apologize in advance to any readers of this site who are both deaf and blind. I have a system in the works to address the problem based on Smell-O-Vision.