Archive for October, 2007

reCAPTCHA

So, some bright sparks at Carnegie Mellon found a way to modify the CAPTCHA anti-spam system to aid in the digitization of old books. They call it reCAPTCHA. OCR scanning of old books is reasonably reliable, but a lot of individual words still get misread. These guys distribute the words that are obviously misread to CAPTCHA interfaces all over the place, so that humans can verify them. It should be working on this blog right now.

Take a look at the comment interface underneath this post. Go ahead, scroll down and come back. I’ll wait.

There are two words down there. One is a control word, which is really just a traditional CAPTCHA: it gets checked against the known correct answer. The other is an OCR-error word. If multiple people solve the OCR-error word the same way, that reading gets accepted as official, and the sum total of preserved human knowledge has been increased, slightly. It also comes in WordPress plugin flavor.
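For the curious, here is a rough Python sketch of the two-word idea as I understand it. This is not reCAPTCHA's actual code: the names (`UnknownWord`, `check_submission`), the agreement threshold of three, and the case-insensitive matching are all my own assumptions, picked only to illustrate the logic.

```python
from collections import Counter

# Assumed threshold: how many matching answers make a reading "official".
# The real service's number is not stated in this post.
VOTES_NEEDED = 3


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(text.strip().lower().split())


class UnknownWord:
    """Hypothetical record for one OCR-error word awaiting human agreement."""

    def __init__(self):
        self.votes = Counter()   # reading -> number of people who typed it
        self.accepted = None     # set once enough people agree

    def record(self, guess):
        guess = normalize(guess)
        if self.accepted is not None or not guess:
            return
        self.votes[guess] += 1
        if self.votes[guess] >= VOTES_NEEDED:
            self.accepted = guess


def check_submission(control_answer, control_guess, unknown, unknown_guess):
    """Return True if the commenter passes the control word.

    The guess for the unknown word only counts as a vote when the
    control word was solved correctly.
    """
    if normalize(control_guess) != normalize(control_answer):
        return False
    unknown.record(unknown_guess)
    return True
```

Feed it three commenters who all pass the control word and all type "upon" for the mystery word, and `accepted` ends up as `"upon"`. Anyone who fails the control word is rejected and their vote is thrown away.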

So, I’ll be trying this system out for a while. Right now I’m using a moderation queue to vet new posters, allowing unlimited unmoderated posts from vetted posters. If reCAPTCHA works well enough, I might disable the moderation queue.

3 comments October 2nd, 2007

