Saving some valuable bandwidth
I've just started blocking single words in referring sites via my .htaccess. Referrer spammers are stealing so much bandwidth that it's not even funny anymore. Pivot-Blacklist can generate .htaccess rules, but they're usually one rule per domain plus some regexps that still don't catch enough for my taste. The most aggressive thing you can do is block on single words. Sure, it will probably produce some false positives (a word like "buy" or "loan" matches anywhere in the referring URL), but I guess anything is better than paying your ass off for massive amounts of bandwidth stolen by referrer spam. Here's how...
updated!
# Single word blocks
RewriteEngine On
RewriteCond %{HTTP_REFERER} poker [OR]
RewriteCond %{HTTP_REFERER} medicine [NC,OR]
RewriteCond %{HTTP_REFERER} pills [NC,OR]
RewriteCond %{HTTP_REFERER} diet [NC,OR]
RewriteCond %{HTTP_REFERER} viagra [NC,OR]
RewriteCond %{HTTP_REFERER} mortgage [NC,OR]
RewriteCond %{HTTP_REFERER} casino [NC,OR]
RewriteCond %{HTTP_REFERER} insurance [NC,OR]
RewriteCond %{HTTP_REFERER} loan [NC,OR]
RewriteCond %{HTTP_REFERER} buy [NC,OR]
RewriteCond %{HTTP_REFERER} xanax [NC,OR]
RewriteCond %{HTTP_REFERER} meridia [NC,OR]
RewriteCond %{HTTP_REFERER} incest [NC,OR]
RewriteCond %{HTTP_REFERER} lesbian [NC,OR]
RewriteCond %{HTTP_REFERER} adult [NC,OR]
RewriteCond %{HTTP_REFERER} hentai [NC,OR]
RewriteCond %{HTTP_REFERER} tramadol [NC,OR]
RewriteCond %{HTTP_REFERER} phentermine [NC,OR]
RewriteCond %{HTTP_REFERER} gambling [NC,OR]
RewriteCond %{HTTP_REFERER} texas- [NC,OR]
RewriteCond %{HTTP_REFERER} holdem [NC,OR]
RewriteCond %{HTTP_REFERER} pharmacy [NC,OR]
RewriteCond %{HTTP_REFERER} ultram [NC]
RewriteRule .* - [F,L]
You can add as many words as you want. Every condition except the last needs the [OR] flag, and the last one must not have it. Otherwise it won't work. [NC] just makes the match case-insensitive and can go on any line, including the first. An easy way to do it is to copy/paste my example and add lines in the middle like:
RewriteCond %{HTTP_REFERER} your_word_to_block [NC,OR]
If the referrer contains the word, the request is blocked. This strategy catches a lot of sites: the viagra rule, for example, blocks things like www.buy-viagra.com but also stealth domains like www.7zulu.com/viagra.html, and so on.
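If the list gets long, the same matching can probably be written more compactly. The following is only a sketch, not what I actually run: it folds a few of the words into one regex alternation and adds a condition that skips referrers from your own site, which should cut down on false positives. The domain example.com is a placeholder, and the word list is shortened for illustration.

```apache
RewriteEngine On
# Assumption: example.com stands in for your own domain.
# Never block visitors who arrive from your own pages.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Block when the referrer contains any of these words (shortened list).
RewriteCond %{HTTP_REFERER} (poker|casino|viagra|pharmacy) [NC]
RewriteRule .* - [F,L]
```

Conditions without [OR] are ANDed, so this reads as "not from my own site AND contains one of the words". The trade-off is that one long alternation is harder to scan and edit than one word per line.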
I also added an extra rule so as not to be all that harsh:
ErrorDocument 403 /honeypot/403.php
This means that whoever is caught by the rules gets this page, which is slightly friendlier than a standard 403 Forbidden notice while still not taking up too many kilobytes.
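One thing to watch: the error page itself must not be caught by the blocking rules, or the visitor gets a 403 on the 403 document. A minimal sketch of one way to handle that, assuming the .htaccess sits in the document root and the error page lives under /honeypot/ as above:

```apache
ErrorDocument 403 /honeypot/403.php

RewriteEngine On
# Assumption: this file is in the document root, so the pattern has no leading slash.
# Let requests for the error page through before any blocking rule runs.
RewriteRule ^honeypot/ - [L]
# The referrer conditions and the [F] rule go below this line.
```

Another option is to keep the blocking rules in a subdirectory's .htaccess and put the error page outside that directory.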
To further annoy the spammer, I've put the following PHP code in my 403 document:
<?php
// The server variable is spelled HTTP_REFERER (single "R" in the middle).
$referrer = isset($_SERVER["HTTP_REFERER"]) ? $_SERVER["HTTP_REFERER"] : "";
// Both values come from the client, so escape them before printing.
echo "If you see this, your referrer: " . htmlspecialchars($referrer, ENT_QUOTES)
    . " has been marked as a spam referrer. Click <a href=\""
    . htmlspecialchars($_SERVER["REQUEST_URI"], ENT_QUOTES)
    . "\">HERE</a> to access the page anyway.<br /><br />";
flush();
$message = "piss off spammers";
for ($i = 0; $i < 10; $i++) {
    // Send one character every 5 seconds to keep the connection dragging on.
    for ($j = 0; $j < strlen($message); $j++) {
        echo $message[$j];
        flush();
        sleep(5);
    }
    echo "<br />";
}
?>
This makes the request sloooooooooooooow as hell... (try it here)
Filed under: cyberspace
At 29 August '05 - 22:36 Bram wrote:
At 30 August '05 - 10:13 Niwla wrote:
thanks for your rewrites
At 30 August '05 - 12:41 Marco wrote:
You have to set up your own 403 document and place it somewhere where it’s NOT affected by the rules. This means that you need something like this:
/weblog/.htaccess
In this file there’s (apart from all the blocking stuff) the following line:
ErrorDocument 403 /errors/403.php
The 403 doc must be freely accessible even by spam referrers or it will result in a 403 on your 403 document.
Hope this makes sense?
At 01 September '05 - 10:32 Niwla wrote:
On your 403-page:
“Try clicking HERE to access the page directly.”
Your “HERE” link points to http://www.i-marco.nl/honeypot/index.php
Shouldn’t that be the index of your weblog?
At 01 September '05 - 10:47 Marco wrote:
At 01 September '05 - 12:19 Niwla wrote:
At 05 July '07 - 16:31 Thomas wrote:
It even seems to co-exist with the existing /403.shtml that’s used for IP blocking (A couple of bad spiders.)
At 10 December '07 - 22:07 AskApache wrote:
Of course, the dangerous thing about either of these methods, I’ve learned, is that your server only has a certain number of sockets, threads, and processes, and delaying spambots like that can really slow down your site.
I’ve instead opted to issue a specific ErrorDocument for certain obvious spam, like ErrorDocument 402 /realspam.cgi, where the CGI is just a bash shell script that echoes a one-liner error message. Otherwise each spam request hogs your bandwidth and resources, even if you block them!