June 05, 2004
Comment Spam and MT-Blacklist
Comment spam seems to have gotten even more rampant lately: twice last week I found 20-50 new comments, all of them spam. I’m also seeing more and more comment spam on other blogs (and Brayden King’s been griping about spam recently).
Certainly, it can be a nice boost for a blogger’s self-esteem to read through comments like “Nice site!” and “You are doing a great service to the web!” and “Excellent, that was really well explained and helpful!” But after a while it gets about as old as that generic feedback you get on eBay, like “Great transaction, would do business again, A+++++++!!” And there are only so many porn links a blog can take before it starts to look a little sleazy.
Well, I hope most of you Movable Type bloggers already know about this plug-in, but in case not (and it seems Dan Drezner only found out about it last month), let me direct your attention to Jay Allen’s indispensable spam-fighting plug-in, MT-Blacklist.
The technology is nothing to write home about: it simply scans incoming comments for website URLs that match a master blacklist. Currently, the plug-in doesn’t update the blacklist automatically; instead, it relies on the user to periodically retrieve the list by hand from the MT-Blacklist/Comment Spam Clearinghouse site and import it (I’m sure Jay’s working on a better solution).
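To make the idea concrete, here is a minimal sketch in Python of that kind of blacklist check. This is not Jay's code (the plug-in itself is written in Perl); the blacklist entries, function names, and URL pattern are all made up for illustration.

```python
import re

# Hypothetical blacklist entries, made up for illustration.
BLACKLIST = ["freeporn.example", "cheap-pills.example"]

# Rough pattern for URLs embedded in comment text (an assumption, not the plug-in's).
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def find_blacklisted_urls(comment_text, author_url="", blacklist=BLACKLIST):
    """Return every URL in the comment that contains a blacklisted string."""
    candidates = URL_RE.findall(comment_text)
    if author_url:
        # The commenter's own "website" field is a favorite spot for spam links.
        candidates.append(author_url)
    return [
        url for url in candidates
        if any(entry.lower() in url.lower() for entry in blacklist)
    ]


def is_spam(comment_text, author_url="", blacklist=BLACKLIST):
    """A comment is rejected if at least one of its URLs is blacklisted."""
    return bool(find_blacklisted_urls(comment_text, author_url, blacklist))
```

Calling `is_spam("Nice site! http://www.freeporn.example/index.html")` flags the comment, while a comment that links only to sites not on the list goes through untouched.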
But it works a lot better than banning IP addresses, which change all the time (resulting in a lot of false positives) and can be spoofed easily anyway. Jay Allen has a more detailed explanation of the folly of IP banning. Scanning for spammers’ website URLs makes more sense because the spammer needs that URL in the comment. Realize that the whole point of comment spam is not to convince readers, but to raise the spammer’s PageRank on Google (which rewards a site for the number of links pointing to it). Also, the spammer has to spend time and money to change their domain name to something not on the blacklist. To be sure, that’s neither hard nor expensive, but it’s not as quick and easy as changing your IP address, or as updating your blacklist.
Spam will still slip through even if you keep your blacklist constantly up to date. However, the real power of the plug-in isn’t in the filtering, but in the ability to delete comment spam quickly and easily. Movable Type doesn’t have very good comment management, so deleting comment spam can be a tedious affair. With MT-Blacklist, you can quickly search for comments that match your blacklist, a specific IP address, or a specific string. This way, you can quickly delete all the spam that did get through.
For example, if a lot of spam got through because you haven’t updated your blacklist in a while, you can update your blacklist and then de-spam against that updated list. Or if you see a bunch of spam all using the same e-mail address, username, IP, or even the same stock phrases, you can search for those as well. You can then check off the ones that are actually spam (so your search doesn’t have to be perfect), and then with one click, you can tell the plug-in to delete the comment spam and rebuild the affected entries. Note that if you’re deleting a lot of comments, you might want to back up your blog first (of course, you’re doing that regularly already, right?).
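Here is a rough Python sketch of that search-then-confirm workflow. The `Comment` record and both function names are hypothetical; this is not Movable Type's schema or the plug-in's actual interface, just the shape of the idea.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    # Hypothetical record layout, not Movable Type's actual schema.
    id: int
    author: str
    email: str
    ip: str
    text: str


def search_comments(comments, ip=None, email=None, phrase=None):
    """Return comments matching ANY of the criteria that were supplied."""
    hits = []
    for c in comments:
        if (
            (ip is not None and c.ip == ip)
            or (email is not None and c.email.lower() == email.lower())
            or (phrase is not None and phrase.lower() in c.text.lower())
        ):
            hits.append(c)
    return hits


def delete_checked(comments, checked_ids):
    """Drop only the comments the user ticked; the rest are kept."""
    checked = set(checked_ids)
    return [c for c in comments if c.id not in checked]


comments = [
    Comment(1, "Bob", "bob@example.org", "192.0.2.10", "Nice site!"),
    Comment(2, "Ann", "ann@example.org", "198.51.100.7", "I disagree with point two."),
    Comment(3, "Bob", "bob@example.org", "192.0.2.11", "You are doing a great service to the web!"),
    Comment(4, "Cy", "cy@example.org", "203.0.113.5", "Nice site, but the RSS feed is broken."),
]

# A deliberately loose search: same e-mail address OR a stock phrase.
candidates = search_comments(comments, email="bob@example.org", phrase="nice site")
# Comment 4 matched the phrase but is a real comment, so it is left unchecked.
remaining = delete_checked(comments, checked_ids=[1, 3])
```

The search is allowed to over-match because nothing is deleted until a human has checked off the results, which is why a sloppy phrase search is still safe to use.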
It’s not that hard to install. Like other plug-ins, it’s a matter of FTP’ing a few Perl scripts into the correct directories and making sure the permissions are correct, stuff you should be able to handle if you installed Movable Type in the first place (if someone else did it for you, you might want to get their help if you don’t know what FTP or permissions are). Note that the plug-in does not work with Movable Type 3.0. But then again, there’s very little reason for you to upgrade to 3.0, since it’s a developer’s release (see footnote 1), not a feature release.
Now, if you already have MT-Blacklist installed, you might have noticed a lot of recent spam getting through the blacklist where some letters in the spammer’s website URL have been replaced by their ASCII character codes, written as HTML numeric character references (e.g. the e in freeporn written as &#101;. Oh geez, that’s going to get a lot of false hits from Google now, isn’t it?). Note that the website URL still appears normally on the blog itself because it gets decoded before it’s displayed. It’s just in the Movable Type comment editing window and in the notification e-mails from your server that you’ll see the ASCII codes. Well, this is due to a vulnerability that Jay recently fixed in MT-Blacklist 1.64, so update your plug-in and then de-spam against your blacklist to get rid of them (yeah, I ran into this last week).
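To see why the encoded URLs slip past a naive check, and one way to close the gap, here is a small Python sketch. I don't know how Jay actually fixed it in 1.64; decoding with the standard library's `html.unescape` before matching is simply my assumption of a reasonable approach, and the blacklist entry is made up.

```python
import html


def normalize(text):
    """Decode HTML character references, then lowercase for comparison."""
    return html.unescape(text).lower()


def matches_blacklist(text, blacklist):
    """Check the decoded text, so entity-encoded URLs cannot hide."""
    decoded = normalize(text)
    return any(entry.lower() in decoded for entry in blacklist)


blacklist = ["freeporn.example"]  # hypothetical entry
encoded_url = "http://fr&#101;&#x65;porn.example/"

# A plain substring test on the raw text misses the encoded URL...
naive_hit = any(entry in encoded_url for entry in blacklist)
# ...while matching against the decoded text catches it.
decoded_hit = matches_blacklist(encoded_url, blacklist)
```

Both the decimal (`&#101;`) and hexadecimal (`&#x65;`) forms decode to the same letter, so it makes sense to normalize once up front rather than list every encoded variant in the blacklist.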
So anyway, while it’s an imperfect solution at best, it works pretty well for now. The arms race between spammers and spam-fighters will continue, and maybe comment registration will be the way to go, or maybe spammers will figure out that redirected comment URLs don’t affect their PageRank. We’ll see. But I’ll bet Jay Allen will still be in the middle of it all, fighting the good fight. If you find MT-Blacklist as useful as I have, please make a donation to Jay to help him out. Just hit the PayPal button at the bottom of his left sidebar on his site.
Update 6/8/04
Elise Bauer at Learning Movable Type has some other excellent suggestions that I hadn’t known about, like renaming mt-comments.cgi and mt-tb.cgi.
1 You may have noticed a bit of an uproar in the Movable Type community about Movable Type 3.0, specifically its new licensing scheme and the fact that this release doesn’t have any new features, even though Six Apart has been promising a lot of new features for 3.0. Well, Jay Allen has a good explanation that should clear up the confusion about what a developer’s release actually is: it is intended for plug-in developers, who can work with it to update their plug-ins. Regular users should not upgrade to it. He also thinks everyone overreacted to the licensing, and Brad Choate, author of MT-Textile, has a similar reaction. Personally, I’d already been leaning towards moving to an open-source PHP-based solution, so I’m going to wait and see.
June 05, 2004 11:25 AM in Blogging, Technology
Weblog: Urban Scrawl
Excerpt: Never a dull moment in the war against spam. The proliferation of anti-spam techniques such as MT-Blacklist (in use on this domain) have dealt a solid blow to spammers who tirelessly come up with new methods to increase their search...
Tracked: July 30, 2004 11:53 PM
Jay is working on a new MT3-compatible version of MT-Blacklist which will, I believe, have a lot of new features, including easier ways to submit new spam URLs as well as better ways to update your personal blacklist from his master list (although there are hacks already out there for existing users that update your list by parsing Jay’s RSS changes list).
Posted by demonsurfer at 06/06/04, 09:05 AM (link)
Personally, I am ready to file suit against Google for my pain and suffering. After all, it is they who are enabling this garbage in the first place.
If a woman can sue McDonald’s for making her fat and get millions of dollars to boot, then imagine what I would get from Google. Bwah ha ha ha ha!
Seriously though, I love MT-Blacklist. 6A should have done this a long time ago themselves.
Posted by Political Pulpit at 06/09/04, 09:36 PM (link)