On RSS and the coming wave of content theft
Yesterday I spotted the admission of Mark Wade's Blog Marketing, Blog Promotion for Newbies to the 9rules Network. I was slightly shocked that Mark's site had some links to rather dark corners of the internet in its sidebar. Luckily, Mark responded to my comment by removing these links from his site. An excellent move, because the advertised products are like the plague for the blogosphere. What kind of links were those, you probably wonder? I'm not going to repost them here because I don't want to promote something that is quite likely to become a growing problem in the already over-hyped Web 2.0 world: content theft.
RSS is one of the big buzzwords of this time. It's part of what many people, including Wikipedia, call 'Web 2.0'. The aforementioned links promoted applications that use RSS in an (imho) malicious way. But let's get to the good stuff first!
What's great about RSS
RSS is a great way to do interesting things with other people's content. The most important thing for most of us is probably that RSS lets us read a lot of sites in an 'all-in-one' solution such as an RSS reader application, an RSS-aware browser or a service such as Bloglines. Probably the greatest thing since sliced bread. I read quite a lot of blogs through Bloglines myself. RSS is also an excellent tool to add value to your website by posting relevant headlines from other sites on your pages. Quite a few people say this also seems to affect search engine rankings, but I'm not really convinced of this side effect. I myself consider it a valuable extra service to readers. There's no need to be afraid they'll leave your website. If there's enough interesting material written by yourself, it will most probably keep them coming back for more.
Being a good blogger ain't easy
"Drive thousands of visitors to your website without ever writing a single line yourself!"
Typical marketing line for a malicious RSS product
The above quote will probably make you scratch your head and wonder... That can't be true, and if it's true it can't be right. Right? Right! As we all know, search engines will index anything they get to chew on, and AdSense banners will appear on any website with content that has previously been indexed by Google. Of course this got some clever people thinking. A lot of bloggers like myself spend a lot of time writing lengthy posts, hoping visitors will appreciate them and maybe even leave a comment or two to join a potentially interesting discussion. This is what blogging is all about. To cover some expenses, many of us use AdSense banners. I'm no exception. We write the content, Google spiders it eventually, and (sort of) relevant ads appear on our pages. Ads triggered by our own hard work, OUR content. Google's AdSense regulations prohibit disclosing the amount of money earned with it by publishers who use it, so I can't give you any numbers, but I can assure you one will need either a LOT of visitors or a LOT of sites to earn a decent amount of money through AdSense and similar programs. Of course this is where a lot of money-hungry people get discouraged. It actually takes hard work to make a living from blogging, or even to earn an extra buck or two with it.
RSS: Really Simple Stealing
Most people understand the concept of running a site with meaningful content and earning some cash with advertisements embedded in this content. So what if you don't want to write the content yourself? You'll just STEAL it! Since the rise of RSS, stealing has never been easier. RSS stands for Really Simple Syndication, but to some it means Really Simple Stealing. Both expansions have truth in them. In the early days, scraping a site was difficult. The content needed to be stripped from the often complex HTML, requiring quite advanced tools. Today there's no need for that anymore. Full RSS feeds, baby! The content's waiting to be stolen and placed on dumb webpages surrounded by ads that generate money with YOUR content. And there's nothing anyone can do about it. Of course Google isn't stupid, and they're trying to develop ways to detect content theft. However, SEO blackhats have already shown this can be circumvented. Just read this thread on DarkSEOTeam taking a stab at Google employee Matt Cutts by 'GoogleWashing' his own unique content, making it look like the content was originally written by the DarkSEOTeam. An amusing thread, but also a frightening one. To make life even easier there's a tool from hell available called ArticleBot (nope, I'm NOT going to link to it). This 'clever software' takes your content and tries to automatically rewrite it so that nobody can detect it was stolen from you.
So you thought you owned your content?
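To make the point about full feeds concrete, here is a minimal sketch of what any feed reader does. It assumes a plain RSS 2.0 document; the feed itself, the example.com URLs and the function name read_items are made up for illustration. Notice that no HTML stripping is needed at all: the post text sits in its own element, ready to be read by anyone.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, inlined so the example needs no network access.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>The complete text of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>The complete text of the second post.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return a (title, link, body) tuple for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            # In a full feed this element holds the whole post,
            # in a summary feed only an excerpt.
            item.findtext("description", default=""),
        ))
    return items


for title, link, body in read_items(FEED):
    print(title, link, body)
```

A feed that only carries excerpts goes through exactly the same code, but there is simply less in the description element to take.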
So here we are, spending valuable time writing interesting content for our viewers while lowlife scum is earning hard cash with it. Since the techniques I described in this article require quite some technical knowledge, some programmers with dollar signs in their eyes created what I call 'tools from hell' that enable any non-programmer (with the same dollar signs in their eyes) to 're-use' our content and earn money. If you thought comment spam was bad, wait for the rise of RSS abuse. The Matt Cutts example shows that the question 'Who owns the content anyway?' is going to be an interesting debate. What's going to happen next? Only time will tell.
On a final note: if you want to do some research on whether your content has already been stolen, you might want to check out CopyScape, a handy dandy service that can check whether someone has already plagiarized you. Beware of the false positives though!
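For the curious, here is a rough sketch of how a duplicate check could work in principle. I don't know how CopyScape does it internally, so take this as a generic technique (word shingles compared with Jaccard similarity) and not as their method. The function names shingles and similarity and the default shingle size of 3 are my own choices for the example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size` in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Texts shorter than one shingle are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "We write the content and Google spiders it"
suspect = "We write the content and someone steals it"
print(similarity(original, suspect))
```

A score near 1.0 means the two texts share almost all of their three-word sequences, a score near 0.0 means they share almost none. Quoted passages and common phrases also push the score up, which is one way the false positives mentioned above can creep in.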
Closing note
The great thing about RSS is the fact that it enables anyone, anywhere to have unrestricted access to your site's content. The bad thing about RSS is the fact that it enables anyone, anywhere to have unrestricted access to your site's content.
Small addendum: Simon Willis added a clever comment to his link to this article: It's a reason to avoid licenses on your content that allow commercial use. If you have a generic CC license that allows commercial use you're basically allowing people to scrape your blog! Use the Attribution-NonCommercial 2.5 license instead if you care about your work.
Filed under: cyberspace
At 05 October '05 - 17:58 anders wrote:
At 05 October '05 - 19:02 Marco wrote:
At 05 October '05 - 19:19 Wim wrote:
Paragraph 7 of Terms states: “However, You may accurately disclose the amount of Google’s gross payments to You pursuant to the Program.”
At 05 October '05 - 22:55 Web Feeds wrote:
I'm surprised you believe RSS should be called Web 2.0, though, as it's been around for over 10 years.
Web 2.0, in my opinion, means:
The site should not act as a “walled garden” – it should be easy to get data in and out of the system.
Users should own their own data on the site.
Purely web based – most successful Web 2.0 sites can be used almost entirely through the browser.
I also wouldn't worry too much about people copying your work, as bloggers build relationships with their readers; that's the difference.
It's actually quite hard to make money from AdSense by blogging, although there are certain niche sites that make quite a lot of money with maybe only 2,000 to 3,000 uniques a day.
Also posted at my site Marco
At 06 October '05 - 12:41 James E. Robinosn, III wrote:
Even the latest Atom 1.0 spec says this:
‘The atom:rights element SHOULD NOT be used to convey machine-readable licensing information.’
At 06 October '05 - 17:05 Marco wrote:
At 11 October '05 - 01:24 cypherpuke wrote:
Please. Yeah, duplicate pages can clog up search results and be generally annoying to surfers. But how does it affect authors? Nobody is going to leave your site to read the spammy version of your work. All that’s left is some perverse puritan moralism: “how dare someone else be making money with my words?!” Seriously, what’s so wrong with that?
At 11 October '05 - 07:53 Marco wrote:
Or do you feel it’s perfectly ok to sell other people’s music or other people’s artwork without their permission as well?
At 13 January '06 - 01:04 curious wrote: