“Because of the bad actions of a minority, we’re going to punish the majority by making them prove they’re human.”adactio.com/journal/10307
Yes, I’m talking about SEO spammers.
Huffduffer tends to get it worse than The Session, but even then it’s fairly manageable—just a sign-up or two here or there. This weekend though, there was a veritable spam tsunami. I was up late on Friday night playing a constant game of whack-a-mole with thousands of spam postings by newly-created accounts. (I’m afraid I inadvertently may have deleted some genuine new accounts in the trawl; if you signed up for Huffduffer last Friday and can’t access your account now, I’m really, really sorry.)
Normally these spam SEO accounts would have some pattern to them—either they’d be from the same block of IP addresses or they’d have similar emails. But these all looked different enough to thwart any quick fixes. I knew I’d be spending my Saturday writing some spam-blocking code.
Most “social” websites have a similar sign-up flow: you fill in a form with your details (including your email address), and then you have to go to your email client to click a link to verify that you are indeed who you claim to be. The cynical side of me thinks that this is mostly to verify that you providing a genuine email address so that the site can send you marketing crap.
Neither Huffduffer nor The Session includes that second step of confirming your email address. The only reason for providing your email address is so that you can reset your password if you ever forget it.
I’ve always felt that making a new user break out of the sign-up flow to go check their email was a bit shit. It also strikes me as following the same logic as CAPTCHAs (which I hate): “Because of the bad actions of a minority, we’re going to punish the majority by making them prove to us that they’re human.” It’s such a machine-centric way of thinking.
But after the splurge of spam on Huffduffer, I figured I’d have no choice but to introduce that extra step. Just as I was about to start coding, I thought to myself “No, this is wrong. There must be another way.”
I thought a bit more about the problem. The issue wasn’t so much about spam sign-ups per se. Like I said, there’s always been a steady trickle and it isn’t too onerous to find them and delete them. The problem was the sheer volume of spam posts in a short space of time.
I ended up writing some different code with this logic:
- When someone posts to Huffduffer, check to see if they’ve posted at least ten items in the past;
- If they have, grab the timestamps for the last ten posts;
- Calculate the cumulative elapsed time between those ten posts;
- If it’s less than 100 seconds (i.e. an average of one post every ten seconds), delete the user …and delete everything they’ve ever posted.
It worked. I watched as new spam sign-ups began to hammer the site with spam postings …only to self-destruct when they hit the critical mass of posts over time.
I’m still getting SEO spammers signing up but now they’re back to manageable levels. I’m glad that I didn’t end up having to punish genuine new users of Huffduffer for the actions of a few SEO marketing bottom-feeders.
“Spamduffing” - I love love love how Jeremy Keith approaches problems and always fights for the best thing. adactio.com/journal/10307
“Because of the bad actions of a minority, we’re going to punish the majority” adactio.com/journal/10307
I’m looking for more unique approaches to spam detection like adactio.com/journal/10307 Know any?