// Internet Duct Tape

Using Comment Spam to Measure Blog Rank

Posted in Technology, The War on Spam by engtech on May 22, 2007

Using the Technorati Rank as a measure of blogging hierarchy is so 2005. Deciding if a blog is part of the top 100 purely by the number of other blogs linking to it is one way to measure popularity, but there must be other ways. In nature you can track the population increases of Bambi, Thumper and friends by the correlated increase in the number of hunters going around killing their mothers. Could there be another way to measure blog worth other than Technorati?

If only there were some parasitic relationship that fed off the blogosphere the way predators feed off prey.

Of course! Spam.



I’m joking about the quantity of spam as a measure of blog worth. But what I’m not joking about is how much more spam I am getting now compared to a year ago. I’d like to think it’s because my blog is so much more popular now, but the sad truth is that spam is an epidemic that’s affecting bloggers from all walks of life. Even Robert Scoble.

[Image: Penny Arcade comic, Bob the door-to-door spam salesman]

The War on Spam

Comment spam is an infection and it is spreading further and further. It attacks our blogs and stands out like a rash. There are several over-the-counter remedies for comment spam, but sometimes the medicine is worse than the disease.

  • Force users to log in to a verified account
    • Which means no one will bother to comment unless the login is part of a larger network like a Google account or Typepad account
  • Captcha image response algorithms
    • Which means no one will bother to comment because they are impossible to read and a complete pain in the ass
    • (I’m talking about you, Typepad)
  • Simple captcha (math, unscrambled word)
    • Works except for the 90% of the time I forget to fill it out
  • Akismet filtering (what we use at wordpress.com)

Akismet says that 95% of all comments left on blogs are spam

Akismet – Building Spam into Haystacks

One of the limitations of being hosted at wordpress.com is that the only vaccine I have for fighting off comment spam is Akismet. Which is great when it works, but, uh, not so great when it doesn’t. Akismet does a very good job of telling ham from spam, but the problem is that it doesn’t do anything to decrease the sheer volume of spam you get. Akismet will help you lead a normal day-to-day life, but it won’t keep you from having the occasional sore on your lip for all the world to see.

I get around 1500 spam comments a day now. Sometimes Akismet isn’t strong enough or isn’t vaccinated against a new strain and I’ll have between 5 and 15 spam sores to manually delete for that day. Other times Akismet gets overzealous and starts flagging valid comments as spam (which often happens on blog posts where I ask people to post links). It’s easy enough to correct the situation if I can find out it happened. But finding that one valid comment is like trying to find a beauty mark on a leper: it ain’t pretty no matter which way you look at it.

That’s why I created the Akismet Auntie Spam for Firefox extension to make the anti-spam (ham) stick out more from all the obvious spam. In an update I never officially announced, our little old Auntie will now mark all Akismet-marked comments that have common spam words in red so that we can completely skip over them while dumpster diving through the caught spam folder. Akismet Auntie Spam helps me heal the lepers.
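The idea of pushing the obvious spam out of the way can be sketched in a few lines. This is not the extension's code (that runs inside Firefox); it is a rough Python illustration, and the word list and function names are made up for the example.

```python
import re

# Hypothetical word list; the extension's real list is not shown in this post.
SPAM_WORDS = frozenset({"viagra", "casino", "poker", "mortgage"})


def has_spam_word(text, spam_words=SPAM_WORDS):
    """Return True if the comment text contains any whole word from the list."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return any(word in spam_words for word in words)


def split_caught_comments(comments, spam_words=SPAM_WORDS):
    """Split caught comments into (obvious_spam, worth_a_look)."""
    obvious_spam, worth_a_look = [], []
    for comment in comments:
        if has_spam_word(comment, spam_words):
            obvious_spam.append(comment)
        else:
            worth_a_look.append(comment)
    return obvious_spam, worth_a_look
```

Only the second list needs a human eye, which is the whole point: the fewer comments left to skim, the better the odds of spotting the ham.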


How to Reduce the Volume of Spam

But that still doesn’t stop the fact that I’m getting 1500 spam comments a day. For someone who likes to write about productivity and making the most of your time, I am wasting entirely too much time being a good netizen and monitoring spam. We often call it the War on Spam but it’s a war I’m not winning. The only intelligent decision is to stop wasting my time and energy and to pull out. Like any social disease, the underlying problem is that I’m being way too promiscuous. Everything I’ve ever posted to my blog is tarted up in a short skirt on a dark alleyway, just waiting for trouble, with nothing but Akismet and hope to avoid the clap.

It’s not working.

So I’m following in the footsteps of many other members of the wordpress.com community and I’ve turned comments off for all posts that are over 60 days old. It isn’t because Akismet doesn’t do the job; it’s because even with Akismet doing most of the work, that last little bit takes too much of my time. It’s time for me to take my blog posts off the street and into a private school and hope they start running with a better crowd.

If the spam rash clears up appreciably, I’ll create an automated program like my Tag Cloud Generator for disabling comments on older posts so that everyone can enjoy having one less thing to worry about.
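The rule itself is simple enough to sketch. The snippet below is only an illustration of the 60-day cutoff, not the program mentioned above; the dictionary keys ("title", "published", "comments_open") are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def posts_to_close(posts, now, max_age_days=60):
    """Return posts that still accept comments and are older than the cutoff.

    `posts` is a list of dicts with hypothetical keys:
    "title", "published" (a datetime) and "comments_open" (a bool).
    """
    cutoff = now - timedelta(days=max_age_days)
    return [p for p in posts if p["comments_open"] and p["published"] < cutoff]
```

Passing `now` in explicitly, rather than reading the clock inside the function, makes it easy to try the rule against a fixed date before letting it loose on a real blog.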

43 Responses


  1. Javier Aroche said, on May 22, 2007 at 10:25 am

    I admin a Spanish WordPress-based blog (Maestros del Web) and we get about 3,000 spam comments a day. A month ago we had to close comments on the oldest posts (because we migrated from another CMS to WordPress) due to the high server load and problems with wp-cache. A few days ago I banned some IPs sending us spam that Akismet can’t catch :S Such is life…

    The spam is growing too much, and I don’t see anything that could stop it. My blog here at wordpress.com doesn’t get much spam (150 a day on average), but I also closed some posts due to strange spam.

  2. Ross said, on May 22, 2007 at 10:34 am

    I’ve been tempted to enable and toy w/ Akismet, but Spam Karma 2 has been such a pleasure to ‘work with’ that I haven’t bothered.

    # Total Spam Caught: 28408 (average karma: -5437.56)
    # Total Comments Approved: 1237 (average karma: 5.59)
    # Total Comments Moderated: 418

    So far I’ve had SK2 falsely identify about 4 comments as spam, and never has it let through an actual piece of spam. That’s w/ the default settings. I should knock on wood right about now..

  3. Arnold said, on May 22, 2007 at 10:36 am

    The big uptick in the spam is the one major downside of the Technorati favourites exchange programme. Actually, that indicates that it’s a programme that works in that, as you say, the spammers wouldn’t bother if your blog didn’t have a respectably high profile, would they?

    My biggest problem is that an increasing number of legit posts are caught by akismet. Some with multiple links, of course, but the vast majority for no good reason. Net effect being that I can’t just hit “delete all spam” and need to skim through them.

    Strangely, for me, is that one or two of the older comments are picking up nearly all of the spam comments. I have thought about disabling comments on those specific ones (and there are only two or three) but figure that it would then spill over into a whole range of comments.

  4. Mike said, on May 22, 2007 at 4:56 pm

    I found the solution to spam very easy. I have no spam, with no anti-spam measures in place. My solution? I just make my blog impossible to find, unpopular and void of content. Sure one day the spam bots will find me, but who will care?

  5. TheShortFatKid said, on May 22, 2007 at 6:59 pm

    I have a random question about comment spam. Why the spam with no links? E.g. the anonymous “I like your site” comments? Are they testing the waters with those messages? Just curious.

  6. haacked75 said, on May 22, 2007 at 7:50 pm

    Ha ha! What a great metric!

    Seriously though, you need Invisible Captcha! http://haacked.com/archive/2006/09/26/Lightweight_Invisible_CAPTCHA_Validator_Control.aspx

    Ever since implementing it, I have almost 0 comment SPAM.

    Now trackback spam is a different story altogether since the whole point of a trackback is that it IS generated by a computer – one blog server to another. So CAPTCHA doesn’t work well there.

  7. Webomatica said, on May 22, 2007 at 8:26 pm

    I hate spam too. My Akismet is at about 11,000.

    Tomorrow I hope to add two badges to my sidebar: the Technorati authority and the Akismet spams caught number.

    I have no solution for the spams either, so might as well flaunt the large number…

  8. engtech said, on May 22, 2007 at 9:11 pm

    @shortfatkid: The spam with no links is either to test if your blog is open to spam, or more likely a way of poisoning anti-spam software.

    Akismet says it’s spam
    but dumb people say it’s not spam
    which makes Akismet say it’s not spam

  9. spqr said, on May 22, 2007 at 9:25 pm

    I use to let many false e-mail addresses for poisoning the spammers :-)

  10. elainevigneault said, on May 22, 2007 at 11:10 pm

    I self host so I have more options. I use Bad Behavior, which cuts out a lot of spam by preventing spiders from crawling the site. But I certainly respect your decision to turn off comments to old posts. I turned off comments for a while partially because of the spam war. They’re back on now, partially thanks to Bad Behavior.

  11. ronet said, on May 23, 2007 at 1:44 am

    I use spam karma but I’m not getting any spam at the moment. What am I doing wrong?

  12. Mr Angry said, on May 23, 2007 at 4:35 am

    I like the idea of turning off comments for old posts, that had never occurred to me. Plus, some killer turns of phrase in there, man. Tarted up in a dark alley indeed.

  13. engtech said, on May 23, 2007 at 5:10 am

    @Javier: 3000 a day, you must be in the Spamerati Top 100 for sure.


    @Arnold: I have no doubt that some of the blogs participating in the Technorati Favorites Exchange might have also been associating with spammers. I know there are some SEO blogs where all you have to do is leave a comment and next thing you know you’ll start getting bombarded with all kinds of SEO spam.

    @Mike: Damn you and your relaxed good nature. I notice you didn’t leave a link to your blog in the comment. :)

    @Phil: Invisible Captcha sounds reet. But I have my hands tied because of my blogging platform (although I always enjoy finding ways to hack beyond the WordPress.com constraints, e.g. turning off all my comments using XML-RPC calls).

    @webomatica: I hear ladies are impressed with the size of a man’s spam.

    @mrangry: What’s funniest about your comment is that I had to pull it out of Akismet — and I would have missed it if I hadn’t drastically reduced the number of spam I get a day using this technique. I’m trying to improve my writing by using metaphors. :)

  14. Lloyd Budd said, on May 23, 2007 at 7:25 am

    You got to do what you got to, but it hurts me a little when good blogs turn off comments on older posts — it shortens the conversation.

  15. engtech said, on May 23, 2007 at 8:46 am

    I know Lloyd, but it was getting to the point where I was spending about 10-15 minutes a day on comment maintenance and I’d usually lose comments in Akismet never to be found again.
    In reality this change will affect the spammers more than anyone else. Most people don’t comment on the older posts. I’ve left trackbacks open so I don’t feel like the conversation is really closed.

  16. engtech said, on May 23, 2007 at 9:19 am

    If anyone is looking for an invitation to the me.dium beta test:
    http://me.dium.com/from/94966034/

    It’s a browser extension for Firefox that does instant messaging and sharing pages.

  17. engtech said, on May 23, 2007 at 10:57 am

    So I’ve been working on a program to manage comment settings for wordpress.com blogs. Still ironing out the kinks. It’ll be like the Tag Cloud Generator, a program you run on your computer. Four functions:

    – create a report with the comment status of all posts
    – turn all comments on
    – turn all comments off
    – turn off all comments on posts older than X days

    Only downside is it’s slow because of the limited functionality of XML-RPC (you can only do a full edit post like you were using Performancing or MarsEdit, you can’t just modify the comment settings). So it takes up as much bandwidth as downloading your blog and uploading it again (without images). That’s about 8 MB for my blog.
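For the curious, here is roughly what that round trip looks like. It is a sketch, not the program itself, and it leans on assumptions: that the blog answers the `metaWeblog.editPost` XML-RPC call, that post structs carry a `postid` and an `mt_allow_comments` field, and that 1 means open and 0 is read as closed. The helper names are made up.

```python
import xmlrpc.client


def comment_report(posts):
    """Count how many post structs have comments open or closed.

    Assumes each struct has an `mt_allow_comments` field where 1 means open;
    anything else (including a missing field) is counted as closed.
    """
    open_count = sum(1 for p in posts if p.get("mt_allow_comments") == 1)
    return {"open": open_count, "closed": len(posts) - open_count}


def close_comments(server, username, password, post):
    """Send the whole post back with comments switched off.

    `server` is expected to behave like xmlrpc.client.ServerProxy.
    The full struct travels both ways, which is why this is slow.
    """
    edited = dict(post)              # copy so the caller's struct is untouched
    edited["mt_allow_comments"] = 0  # assumption: 0 is read as "closed"
    # The final True is the publish flag; this sketch assumes published posts.
    return server.metaWeblog.editPost(
        post["postid"], username, password, edited, True
    )


def connect(endpoint_url):
    """Create the XML-RPC client; the endpoint URL is supplied by the caller."""
    return xmlrpc.client.ServerProxy(endpoint_url)
```

Because every edit re-sends the title and body along with the one flag that changed, the bandwidth cost scales with the size of the blog rather than with the number of settings touched.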

  18. adam said, on May 24, 2007 at 1:25 am

    i’m with lloyd, some of my favorite comments (dearly departed) happened a long time after the post was “new”. but you’re right, spam doesn’t hit the new posts near as badly as the old ones.

    having moved off wordpress.com, i am seeing that the spam bots don’t seem to follow my apache redirects the way a normal browser does. my spam’s been pretty negligible. i think they’re a bit attracted to the url.

  19. engtech said, on May 24, 2007 at 2:50 am

    @adam: I’m about to push the button for switching domains, so it’ll be interesting to see how that changes the spam situation.

  20. adam said, on May 24, 2007 at 5:54 am

    i didn’t get much of a reprieve when i added domain mapping (under a week IIRC), but i’m still waiting for the onslaught from merging two blogs.

  21. engtech said, on May 25, 2007 at 11:30 pm

    I’m noticing the domain mapping lull now. Since I got the new domain spam has dropped to less than 30 a day.

  22. Lorelle VanFossen said, on June 07, 2007 at 3:51 am

    Turning off comments on old posts doesn’t work. On several high profile blogs, my posts get spammed within seconds of publishing. Many more recent blog posts get hit – sometimes more than old, other times not. There is more to the whole comment spam issue than blaming old posts.

    You wanted popularity, now you have it. Increased comment spam is the result of a LOT of trackbacks. Hell, I probably sent you half your comment spam from my blog. :D

    I get a LOT more comment spam than you, my friend, and while there are spikes of new comment spam coming through occasionally as comment spammers improve and change their efforts, I use the Mass Edit Mode to quickly remove all of them.

    I use the search function, which needs serious improvement but works for now, to search for keywords, like my name, through the comment spam caught by Akismet. It takes very little time to track down false positives.

    Don’t be short-sighted and short sheet your readers who are discovering the value in your blog three, four, and six months or two years from now. They still want their say and you’ve just killed it.

    A blog is not about the “now”. It’s about the value you with. Those who write timeless content need to appreciate it when their blog post is dug by Digg two years after it was written, and make sure that folks can still have their say – even if it is “thank you”.

    Remember, Akismet and comment spam fighting tools are very new. I expect them to evolve and fight back with a vengeance in time.

  23. Sakimichi said, on June 08, 2007 at 10:42 am

    Haha, spammers are crazy i even got 12 spams in one day on my wp blog and its not a pretty sight indeed. Glad the team installed akismet ^_^.
    Is that a penny arcade comic, i just loled at it 8D

  24. engtech said, on June 08, 2007 at 9:55 pm

    @lorelle

    thanks for the long response. I can not argue with the numbers though. In one week I have only gotten 400 spams instead of 7000 to 12000. Out of those 400 spams I have found 20 false positives that did not have any keywords.
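Putting those numbers side by side is quick arithmetic. The snippet below only restates the figures from the comment above; the variable names are made up for the example.

```python
def reduction(before, after):
    """Fraction by which a count dropped."""
    return 1 - after / before


weekly_before_low, weekly_before_high = 7000, 12000
weekly_after = 400
false_positives = 20

low = reduction(weekly_before_low, weekly_after)
high = reduction(weekly_before_high, weekly_after)
false_positive_share = false_positives / weekly_after

print(f"spam reduction: {low:.1%} to {high:.1%}")
print(f"false positives among caught comments: {false_positive_share:.1%}")
```

That works out to a drop of roughly 94% to 97% in weekly spam, with about 1 in 20 of the remaining caught comments being a false positive.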

    I would not be able to manage my spam while on vacation unless I had taken that measure.

    I agree that it would be better to leave comments turned on for the older posts, but I do not think the cost to my time is worth it.

  25. [...] this a sign that I’m well positioned in the Spamerati? Of course, my numbers are nothing compared to bigger sites like Maestros del Web, where in [...]

  26. [...] around the content I’ve created. There’s been times when I’ve been tempted to disable commenting altogether because spam is too annoying. There’s been other times where I let comments languish without responding to them even [...]

  27. Humbert said, on November 27, 2008 at 4:20 am

    Good news! Thanks!,

  28. Nolan said, on November 27, 2008 at 6:24 am

    I don’t want to see threads like this whonews.,

  29. Peta said, on November 27, 2008 at 10:09 am

    A fantastic site, and brilliant effort. A great piece of work.,

  30. Louisa said, on November 27, 2008 at 1:42 pm

    I have loved your site for its useful and funny content and simple design.,

  31. Spike said, on November 27, 2008 at 3:18 pm

    I love the great insight!,

  32. Liberty said, on November 27, 2008 at 4:59 pm

    I love the great insight!,

  33. NogeLarloro said, on February 10, 2009 at 10:38 pm

    fascinating and educational, but would be suffering with something more on this topic?

  34. arhiderrr said, on February 28, 2009 at 1:44 pm

    Nice article

  35. Denis said, on April 30, 2009 at 5:57 pm

    I see that SPAM is a disease in the internet. This blog is suffering of it. Greetings.

  36. abertrocore said, on June 11, 2009 at 5:54 pm

    visit us!
    newsbox.cc
    newsbox.us
    nbstatus.wordpress.com
    NOW!

  37. raildingtoini said, on June 12, 2009 at 10:40 pm

    http://www.dug-portal.com/
    BESUCHT UNS JETZT – VISIT US NOW!

  38. [...] Engtech Fights Spam: Much hate for spammy comments and the spammers that spam them. [...]

  39. Alice said, on August 07, 2010 at 9:57 pm

    Fantastic website that helped me a lot in collecting pogs slammers hxxp://pogsslammers[com]

  40. Zeme said, on August 07, 2010 at 11:10 pm

    Fantastic website that helped me a lot to find the best Games ever hxxp://gamesboxes[com]

  41. Alice said, on August 07, 2010 at 11:49 pm

    Fantastic website that helped me a lot to find the best drugs books ever hxxp://drugsbooks[com]

  42. Krymaiustai said, on August 12, 2010 at 8:06 am

    weslo cardio glide is helpful to healthy and diet hxxp://weslocardioglide[com]

  43. Zelym said, on August 14, 2010 at 8:53 am

    also a good way to be closly with your baby hxxp://sonybabymonitor[com]


Comments are closed.
