Tags: seo




Running The Session and Huffduffer is immensely rewarding …most of the time. There are occasions when the actions of a few bad apples make it a real pain in the bum.

Yes, I’m talking about SEO spammers.

Huffduffer tends to get it worse than The Session, but even then it’s fairly manageable—just a sign-up or two here or there. This weekend though, there was a veritable spam tsunami. I was up late on Friday night playing a constant game of whack-a-mole with thousands of spam postings by newly-created accounts. (I’m afraid I inadvertently may have deleted some genuine new accounts in the trawl; if you signed up for Huffduffer last Friday and can’t access your account now, I’m really, really sorry.)

Normally these spam SEO accounts would have some pattern to them—either they’d be from the same block of IP addresses or they’d have similar emails. But these all looked different enough to thwart any quick fixes. I knew I’d be spending my Saturday writing some spam-blocking code.

Most “social” websites have a similar sign-up flow: you fill in a form with your details (including your email address), and then you have to go to your email client to click a link to verify that you are indeed who you claim to be. The cynical side of me thinks that this is mostly to verify that you are providing a genuine email address so that the site can send you marketing crap.

Neither Huffduffer nor The Session includes that second step of confirming your email address. The only reason for providing your email address is so that you can reset your password if you ever forget it.

I’ve always felt that making a new user break out of the sign-up flow to go check their email was a bit shit. It also strikes me as following the same logic as CAPTCHAs (which I hate): “Because of the bad actions of a minority, we’re going to punish the majority by making them prove to us that they’re human.” It’s such a machine-centric way of thinking.

But after the splurge of spam on Huffduffer, I figured I’d have no choice but to introduce that extra step. Just as I was about to start coding, I thought to myself “No, this is wrong. There must be another way.”

I thought a bit more about the problem. The issue wasn’t so much about spam sign-ups per se. Like I said, there’s always been a steady trickle and it isn’t too onerous to find them and delete them. The problem was the sheer volume of spam posts in a short space of time.

I ended up writing some different code with this logic:

  1. When someone posts to Huffduffer, check to see if they’ve posted at least ten items in the past;
  2. If they have, grab the timestamps for the last ten posts;
  3. Calculate the cumulative elapsed time between those ten posts;
  4. If it’s less than 100 seconds (i.e. an average of one post every ten seconds), delete the user …and delete everything they’ve ever posted.
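
The steps above can be sketched in a few lines of Python. This is only an illustration of the logic, not the code running on Huffduffer: the function name, the constants and the idea of passing in a plain list of timestamps are all assumptions made for the example. The "cumulative elapsed time" between the last ten posts is taken to be the gap between the oldest and the newest of them.

```python
from datetime import datetime, timedelta

# Hypothetical names and thresholds, chosen to mirror the steps in the list.
BURST_SIZE = 10                     # how many recent posts to look at
MIN_SPAN = timedelta(seconds=100)   # ten posts in less than this looks automated


def looks_like_spam_burst(timestamps, burst_size=BURST_SIZE, min_span=MIN_SPAN):
    """Return True if the most recent `burst_size` posts span less than `min_span`."""
    # Step 1: accounts with fewer than ten posts are left alone.
    if len(timestamps) < burst_size:
        return False
    # Step 2: take the timestamps of the last ten posts.
    recent = sorted(timestamps)[-burst_size:]
    # Step 3: the sum of the gaps between them equals newest minus oldest.
    elapsed = recent[-1] - recent[0]
    # Step 4: flag the account if that total is under the threshold.
    return elapsed < min_span
```

The check would run whenever someone posts. If it returns True, the calling code (not shown here) would delete the user and everything they have posted.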

It worked. I watched as new spam sign-ups began to hammer the site with spam postings …only to self-destruct when they hit the critical mass of posts over time.

I’m still getting SEO spammers signing up but now they’re back to manageable levels. I’m glad that I didn’t end up having to punish genuine new users of Huffduffer for the actions of a few SEO marketing bottom-feeders.

Perfect Pitch

We were having a chat in the Clearleft office today about site stats and their relative uselessness; numbers about bounce rates are like eyetracking data—without knowing the context, they’re not going to tell you anything.

Anyway, I was reminded that I have an account over at Google Webmaster Tools set up for three of my sites: adactio.com, huffduffer.com and thesession.org. I logged in today for the first time in ages and started poking around.

I noticed that I had some unread messages. Who knew that Google Webmaster Tools has a messaging system? I guess all software really does evolve until it can send email.

One of the messages had the subject line Blocked URLs:

For legal reasons, we’ve excluded from our search results content located at or under the following URL/directory:


This content has been removed from all Google search results.

Cause: Someone has filed a DMCA complaint against your site.

What now?

I visited the URL and found a fairly tame discussion about Perfect Pitch. Here’s the only part of the discussion that references an external resource in a non-flattering light:

I think that is referring to www.PerfectPitch.com. I’m not saying anything about such commercially-oriented courses because I don’t know them, but I think we’d all be wise to bear in mind the general comments voiced in the first two posts on this thread.

That single reference to a third-party site is, apparently, enough to trigger a DMCA complaint.

Google link to the complaint on Chilling Effects but that just says “The cease-and-desist or legal threat you requested is not yet available.” It does, however, list the party who sent the complaint: Boucherle.

By a staggering coincidence, Gary Boucherle of American Educational Music, Inc. is registered as the owner of perfectpitch.com.

So let’s get this straight. In a discussion about perfect pitch, someone mentions the website perfectpitch.com. They don’t repost any materials from the site. They don’t even link to the site. They don’t really say anything particularly disparaging. But all it takes is for the owner of perfectpitch.com to abuse the Digital Millennium Copyright Act with a spurious complaint and just like that, Google removes the discussion from its search index.

To be fair, Google also explain how to file a counter-complaint. However, the part about agreeing to potentially show up in a court in California is somewhat off-putting for those of us, like me, who live outside the United States of America.

There is another possible explanation for this insane over-reaction; one that would explain why the offended party sent the complaint to Google rather than going down the more traditional route of threatening the ISP.

The Session has pretty good Google juice. The markup is pretty lean, the content is semantically structured and there’s plenty of inbound links. Could it be that the owner of perfectpitch.com sent a DMCA complaint to Google simply because another site was getting higher rankings for the phrase “perfect pitch”? If so, then that’s a whole new level of SEO snake-oilery.

Hmmm… that gives me an idea.

If you have a blog or other personal publishing platform, perhaps you would like to write a post titled Perfect Pitch? Feel free to republish anything from this post, which is also coincidentally titled Perfect Pitch. And feel free to republish the contents of the original discussion on The Session titled, you guessed it: Perfect Pitch.

Update: Thanks for the inbound links, everyone. The matter is now being resolved. I have received an apology from Gary Boucherle, who was being more stupid than evil.


Derek Powazek gave up smoking recently so any outward signs of irritability should be forgiven. That said, the anger in two of his recent posts is completely understandable: Spammers, Evildoers, and Opportunists and the follow-up, SEO FAQ.

His basic premise is that money spent on hiring someone who labels themselves as an SEO expert would be better spent on producing well marked-up relevant content. I think he’s right. In the comments, the more reasonable remarks are based on semantics. Good SEO, they argue, is all about producing well marked-up relevant content.

Fair enough. But does it really need its own separate label? Personally, I would always suggest hiring a good content strategist or copywriter over hiring an SEO consultant. Here’s why:

Google—or at least the search arm of the company—is dedicated to a simple goal: giving people the most relevant content for their search. Google search is facilitated by ‘bots and algorithms, but it is fundamentally very human-centric.

Search Engine Optimisation is an industry based around optimising for the ‘bots and algorithms at Google.

But if those searchbots are dedicated to finding the best content for humans, why not cut out the middleman and go straight to optimising for humans?

If you optimise for people, which usually involves producing well marked-up relevant content, then you will get the approval of the ‘bots and algorithms by default …because that’s exactly the kind of content that they are trying to find and rank. This is the approach taken by Aarron Walter in his excellent book Building Findable Websites.

On Twitter, Mike Migurski said:

I think SEO is just user-centered design for robots.

…which would make it robot-centred design. But that’s only half the story. SEO is really robot-centred design for robots that are practising user-centred design.

Ask yourself this: do you think Wikipedia ever hired an SEO consultant in order to get its high rankings on Google?

Dyson ball

When I was in Japan last year, I noticed that most advertisements don’t mention URLs. Instead, they simply show what to search for. The practice seems to be gaining ground over here too. Advertising for the government’s Act on CO2 campaign didn’t include a URL—just an entreaty to search for the phrase.

The current television advertising for the latest Dyson vacuum cleaner finishes with the message to search for “dyson ball.” Sure enough, the number one search result goes straight to the Dyson website …for now. That might change if Google were to implement any kind of smart synonym swapping. There would be quite a difference in scale if the word “ball” were interchangeable with the word “sphere.”