Trackback spam eliminated

Wednesday Aug 24 2005

Trackback has always been a bit of a geeky thing. Not all bloggers are using it and some don't even understand what it is. Spammers however have found another opportunity to annoy the hell out of us: by spamming our trackback scripts. You might already have noticed that the trackback links on my site are rather different from those on most other weblogs. In fact they aren't even shown on my pages. Yet any legitimate visitor can obtain a trackback URL for any of my articles just fine. In this little article I'll explain how the 'hardened trackbacks', which will be included in the upcoming version of Pivot-Blacklist, work. It should be easy enough to implement the same principles for other software.


A tiny bit of Ajax


It seems that up till now, and hopefully for quite a long time to come, JavaScript is a pain in the ass for spammers. Writing a PHP or Perl script that sends massive amounts of GET requests to thousands of innocent weblogs is one thing, but these scripts choke miserably when we use JavaScript as a defensive instrument. The approach I used for the hardened trackbacks is somewhat similar to what I did to tackle referrer spammers.

On most weblogs, trackback URLs are shown on the pages like this:

The trackback URL for this entry is: http://www.some-site.com/blog/pivot/tb.php?id=456

Anyone who wants to send a trackback to the weblog pastes this URL into their weblog software, which sends a so-called trackback ping to it. Ping is bad terminology in my humble opinion, because a trackback ping is nothing more than a regular HTTP GET request. It has little or nothing to do with the network ping most of you may know. Along with the id parameter, a few others such as a title and an excerpt are added to the request string. The script that responds to this request adds the trackback to the weblog on which it was called. It can be seen as some sort of automated comment posting. A spammer's dream come true. All the spammer has to do is find out which GET parameters need to be used, use Google to harvest enormous amounts of trackback URLs and spam away. But not on this site!
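To show how little is involved in a classic trackback ping, here is a minimal sketch in JavaScript. The field names url, title, excerpt and blog_name are the standard TrackBack ones; the function name and the example values are made up for illustration, and this is not code taken from Pivot.

```javascript
// Build the request URL for a classic, unprotected trackback ping.
// buildTrackbackPing is a hypothetical helper; url, title, excerpt and
// blog_name are the standard TrackBack fields.
function buildTrackbackPing(trackbackUrl, post) {
    const params = new URLSearchParams({
        url: post.url,
        title: post.title,
        excerpt: post.excerpt,
        blog_name: post.blogName
    });
    // The trackback URL usually carries the entry id already, so append with '&'.
    const separator = trackbackUrl.includes('?') ? '&' : '?';
    return trackbackUrl + separator + params.toString();
}

// Example values, all made up.
const examplePing = buildTrackbackPing(
    'http://www.some-site.com/blog/pivot/tb.php?id=456',
    {
        url: 'http://example.org/my-post',
        title: 'Hello world',
        excerpt: 'A short excerpt',
        blogName: 'Example blog'
    }
);
console.log(examplePing);
```

Nothing in such a request proves that a human ever looked at the page, which is exactly the weakness the hardened trackbacks address.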

The trackback box


If you look at the source of this page you will find a section that looks like this:

<script type="text/javascript" src="/weblog/extensions/snippets/htrackback.js"></script>
<div id="tbgetter">
<a href="javascript:void(null);" onclick="javascript:loadFragmentInToElement('/weblog/extensions/blacklist/tbkey.php?id=271', 'tbgetter')">click to generate a trackback url</a><br />
<em>Note: generated url valid for only 15 minutes and javascript is required!</em>
</div>


First of all a JavaScript file is included. The file contains the following code:


function loadFragmentInToElement(fragment_url, element_id) {
    var xmlhttp = false;
    /*@cc_on @*/
    /*@if (@_jscript_version >= 5)
    // JScript gives us conditional compilation, so we can cope with old IE
    // versions and with security settings that block creation of the objects.
    try {
        xmlhttp = new ActiveXObject("Msxml2.XMLHTTP");
    } catch (e) {
        try {
            xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
        } catch (E) {
            xmlhttp = false;
        }
    }
    @end @*/
    // Every other browser offers a native XMLHttpRequest object.
    if (!xmlhttp && typeof XMLHttpRequest != 'undefined') {
        xmlhttp = new XMLHttpRequest();
    }
    // Without a request object there is nothing we can do.
    if (!xmlhttp) {
        return;
    }
    var element = document.getElementById(element_id);
    element.innerHTML = '...Creating a trackback key for you';
    xmlhttp.open("GET", fragment_url);
    // Once the complete response has arrived, put it into the element.
    xmlhttp.onreadystatechange = function() {
        if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
            element.innerHTML = xmlhttp.responseText;
        }
    };
    xmlhttp.send(null);
}


As you can see, the JavaScript file defines a function which uses the XMLHttpRequest object, either the native version or Microsoft's ActiveX incarnation of it. Both work fine. This object works like magic. It's used in many applications to fetch data from a web server without ever reloading the page. The function takes two arguments: first the URL to fetch, and second the id of the DOM element in which we want to insert whatever we receive back from the server.

Getting a key



In our example we've got a link in the trackback box that calls the function defined in the JavaScript file. It requests a PHP script named tbkey.php, which returns something like this:

http://www.i-marco.nl/weblog/pivot/tb.php?tb_id=286&key=44df06b7019f984d9554479d4ae30072


This content is inserted into the div we saw earlier. Just try it below this story to see what I mean. We now have a trackback URL we can use to send a trackback to my weblog. The URL looks quite a lot like a normal one except for one difference: it contains a unique key. The script named tbkey.php generates a unique MD5 hash and stores it on disk. The same hash is appended to the common trackback URL.
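The real tbkey.php is written in PHP and ships with Pivot-Blacklist. Purely as an illustration, the same idea could look roughly like this in server-side JavaScript (Node.js). The storage directory, the function names and the way random data is fed into MD5 are assumptions of this sketch, not taken from the actual script.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical storage location; the real script keeps its keys somewhere on disk.
const KEY_DIR = path.join(os.tmpdir(), 'tbkeys-sketch');

// Create a 32-character MD5 hex key from random bytes and record it as an
// empty file. The file's modification time doubles as the creation time.
function createTrackbackKey() {
    fs.mkdirSync(KEY_DIR, { recursive: true });
    const key = crypto.createHash('md5')
        .update(crypto.randomBytes(16))
        .digest('hex');
    fs.writeFileSync(path.join(KEY_DIR, key), '');
    return key;
}

// Append a freshly generated key to the common trackback URL of an entry.
function buildTrackbackUrl(baseUrl, entryId) {
    return baseUrl + '?tb_id=' + encodeURIComponent(entryId) +
        '&key=' + createTrackbackKey();
}

// Example call; the printed key differs on every run.
console.log(buildTrackbackUrl('http://www.some-site.com/blog/pivot/tb.php', 286));
```

Storing one small file per key keeps the bookkeeping trivial: whether a key exists and how old it is can both be read straight from the file system.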

Incoming trackback requests


We're almost there. When checking an incoming trackback request, Pivot (like most other weblog systems) accepts anything sent to it as long as all parameters are provided. On this site this isn't the case. Before accepting the trackback and storing it in the site's backend, we check whether the key provided in the request actually exists on disk. If it does, we also check that it's not older than 15 minutes. If both conditions are satisfied, the trackback is stored and the key is deleted from disk. In all other cases the trackback request is ignored.
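The check itself is small. Below is a rough sketch in server-side JavaScript (Node.js) rather than the PHP that Pivot-Blacklist actually uses. It assumes that every key is stored as a file named after the key in a hypothetical directory; the function name and the format check are also assumptions of this sketch.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical storage location, one file per issued key.
const KEY_DIR = path.join(os.tmpdir(), 'tbkeys-sketch');
const MAX_AGE_MS = 15 * 60 * 1000; // keys are valid for 15 minutes

// Returns true only for a key that exists on disk and is fresh enough.
// The key is deleted when it is looked up, so it can be used only once.
function consumeTrackbackKey(key, now = Date.now()) {
    // Accept nothing but 32 hex characters, so a forged key can never
    // point outside KEY_DIR.
    if (!/^[0-9a-f]{32}$/.test(String(key))) {
        return false;
    }
    const file = path.join(KEY_DIR, key);
    let stat;
    try {
        stat = fs.statSync(file);
    } catch (err) {
        return false; // unknown key, or one that has already been used
    }
    // Expired keys are removed as well; that cleanup is a choice of this sketch.
    fs.unlinkSync(file);
    return now - stat.mtimeMs <= MAX_AGE_MS;
}
```

A trackback handler would call consumeTrackbackKey with the key parameter from the request and silently ignore the request whenever it returns false.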

Mission accomplished.

If you want to see all the code you can download Pivot-Blacklist 0.9.1a, in which this code is included. Of course it's only usable with Pivot in its current form. It should however serve fine as an example for implementing the same principle in other weblog software.

