Bye bye referrer spammers (2-1)

Wednesday Aug 17 2005

By releasing the hardened referrers 0.3 extension for Pivot, I guess I've sent referrer spammers back to the drawing board for quite a while. No more filthy sites on our precious little last-referrers lists, at least for now. You might like to know what I did to accomplish this, so here's some background information on how it works. Somewhat clever programmers can probably apply the same principles to other systems that log referrers.

Quite a while ago I made some minor improvements to Shaun Inman's excellent ShortStat site statistics package. I realized that my site stats are completely free from pesky referrer spam. The reason is that some JavaScript is used to insert ShortStat into web pages. Originally I did this to get it to work on non-PHP pages, but I soon realized it had an extra advantage with respect to combatting ...


Pivot's [get_referrer]


By default, Pivot provides a template tag that calls a PHP script, which adds the incoming referrer to a local database. The contents of the database are displayed on the site if the owner uses the [show_referrers] tag in the site's templates. It generates a list which looks like this (the example below is actually a real-world example of the tag in use):



While working on Pivot Blacklist, I created a modified version of the PHP script that logs referrers; it uses the blacklist to scan incoming referrers. However, history has shown that spammers are too quick at registering new domains and coming up with stealth URLs that easily fool or outrun any centralized blacklist. Pivot Blacklist therefore turned out to be only 80-90% effective in combatting referrer spam. Pretty good, but not good enough, since every child-porn, bestiality or other disgusting link that appears on our front pages is one too many.

Hardened referrers



The hardened referrers extension still uses the core PHP script that logs the incoming referrer. However, it uses some logic in order to decide whether a referrer should be added to the database or not. Instead of blindly including the PHP script, I created some code that adds the following to any page of my site:


<script type="text/javascript" src="/weblog/pivot/../extensions/snippets/getkey.php"></script>
<script type="text/javascript">
// getkey.php (loaded above) defines the global variable refkey.
function getReferer() {
  // Request the 1x1 "image", passing the key and the URL-encoded referrer.
  document.write('<img src="/weblog/pivot/../extensions/snippets/refgetter.php?key=' +
    refkey + '&referer=' + encodeURIComponent(document.referrer) +
    '" width="1" height="1" alt="" />');
}
getReferer();
</script>


The first part of the protective effect is easy to understand. If the visitor isn't running a JavaScript interpreter (as is the case with a typical spam script), the code is never called and therefore no referrer is logged. The file refgetter.php isn't an image but a script that performs several checks. The incoming referrer is passed to it through JavaScript. The getkey.php script you see in the source code example generates an MD5 key, which is sent back like this:


var refkey = 'some_generated_md5_hash';


It's passed to the refgetter.php script, again through javascript. The script that generates the key actually stores it on disk with a validity of 10 seconds. This value could probably be lowered to 1 or 2 seconds even.
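The short-lived key can be sketched roughly as follows. This is not the extension's actual code: it is written in JavaScript for Node.js rather than PHP, it keeps keys in memory instead of on disk, and the names issueKey, keyScript and consumeKey are made up for illustration. Accepting each key only once is also an assumption.

```javascript
// Sketch only: the real getkey.php stores keys on disk; this uses an in-memory Map.
const crypto = require('crypto');

const KEY_TTL_MS = 10 * 1000; // the 10-second validity mentioned above
const issuedKeys = new Map(); // key -> expiry timestamp in milliseconds

// Issue a fresh MD5 key and remember when it expires (hypothetical helper).
function issueKey(now = Date.now()) {
  const key = crypto
    .createHash('md5')
    .update(crypto.randomBytes(16))
    .digest('hex');
  issuedKeys.set(key, now + KEY_TTL_MS);
  return key;
}

// Build the JavaScript line that the key script sends back to the browser.
function keyScript(key) {
  return "var refkey = '" + key + "';";
}

// Accept a key only if it was issued by us and has not expired yet.
function consumeKey(key, now = Date.now()) {
  const expiresAt = issuedKeys.get(key);
  if (expiresAt === undefined) return false; // unknown or already used
  issuedKeys.delete(key); // single use (an assumption, see above)
  return now <= expiresAt;
}
```

Lowering KEY_TTL_MS to 1000 or 2000 would correspond to the 1 or 2 seconds suggested above.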

Protection against abuse



Now a spammer may think he can simply call the refgetter.php script and insert a spammy referrer like that. One could try to call:


http://some-pivot-site.com/extensions/snippets/refgetter.php?referer=http://some-spammy-site.com/


The above will never work because refgetter.php demands that the real referrer (the HTTP Referer header of the request, not the GET parameter) be YOUR site. In any other case it won't do a thing.
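A check like this could look roughly like the sketch below. It is JavaScript for Node.js rather than the PHP the extension is written in, and OWN_HOST and requestCameFromOwnSite are hypothetical names, not taken from the extension.

```javascript
// Sketch only: OWN_HOST and the function name are hypothetical.
const OWN_HOST = 'some-pivot-site.com';

// True only when the HTTP Referer header of the request points at our own site.
function requestCameFromOwnSite(refererHeader, ownHost = OWN_HOST) {
  if (!refererHeader) return false; // no header at all: ignore the request
  try {
    // Compare the full hostname, so "some-pivot-site.com.evil.example" fails.
    return new URL(refererHeader).hostname === ownHost;
  } catch (err) {
    return false; // not a parseable absolute URL
  }
}
```

Comparing the parsed hostname instead of searching the string for the site name keeps a spammer from passing the check with a lookalike URL.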

Another thing a spammer could try is to download the source, find out that the real script that calls the logger is called hgetref.php, and call that script:


http://some-pivot-site.com/extensions/snippets/hgetref.php?referer=http://some-spammy-site.com/


This won't work, first of all because there's no valid key, and secondly because the hgetref script (the script that actually calls Pivot's referrer logger) cannot be called directly but may only be included.


Finally, Bob, the creator of Pivot, pointed out that someone could fake the referrer to be your site and then call:


http://some-pivot-site.com/extensions/snippets/refgetter.php?referer=Bob_hates_spammers.com


This is the real reason why I added the generated key with a very short time to live.

Pivot-Blacklist



As I pointed out in the first part of this story, Pivot Blacklist is about 80-90% effective in blocking referrers. We'd be stupid if we didn't use this power to make things even harder for spammers. Therefore the hardened referrers extension applies Pivot Blacklist's spam checking routine as a last line of defense. This means that even if someone took all the trouble of inventing a way to circumvent this system, the majority of his stuff would still be blocked. Therefore I guess it's pretty safe to conclude that we've beaten the referrer spammers on our Pivot sites for at least quite a while.
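Put together, the order of the checks might look like the sketch below. It is JavaScript for Node.js rather than PHP, the blacklist patterns are placeholders rather than Pivot Blacklist's real data, and isBlacklisted and shouldLogReferrer are hypothetical names. The results of the referrer check and the key check are passed in as plain booleans.

```javascript
// Sketch only: these patterns are placeholders, not Pivot Blacklist's real list.
const BLACKLIST = ['some-spammy-site', 'casino'];

// Case-insensitive substring match against every blacklist pattern.
function isBlacklisted(referrer, blacklist = BLACKLIST) {
  const haystack = String(referrer).toLowerCase();
  return blacklist.some((pattern) => haystack.includes(pattern.toLowerCase()));
}

// The blacklist is consulted last, after the referrer check and the key check.
function shouldLogReferrer({ cameFromOwnSite, keyIsValid, referrer }) {
  if (!cameFromOwnSite || !keyIsValid) return false;
  return !isBlacklisted(referrer);
}
```

With this ordering, a spammer who manages to pass both earlier checks still has to get past the blacklist before anything is written to the database.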

Of course these anti-spam measures can be circumvented by clever coders. There's no need to point this out in the reactions. Pivot Blacklist uses Elliott Back's excellent HashCash as a first line of defense against spam, and I could even write a script that bypasses that. The important point is: up till now, spammers haven't advanced to this kind of clever scripting.

Let's hope it will stay that way and if not, my message to the spammers is:

Until we meet again... I'll be waiting, bitches!

If you happen to be interested in the source code: download it from the corresponding Pivot Forum thread.

Trackbacks


Referrer Spam
After Bung Ph and Marco, the referrer spammers have now found my log too, and they are getting through to the list. For the time being I have removed the referrers from my log. Unfortunately I have to agree with Marco when he says that sadly little can be done about this…
Sent on 17 August '05 - 14:45, via Bakkel's weblog
