Entrap: To lure into performing a previously or otherwise uncontemplated illegal act.
With the recent release of AntiLeech, an anti-splog plugin for WordPress by Owen Winkler, there finally exists a real method of fighting back against content-scraping thieves. AntiLeech attempts to serve up fake content to known splogs. The plugin identifies splogs by either their User-Agent or their IP address (user supplied). From the plugin page:
What does AntiLeech do? AntiLeech does not prevent the splogger bots from accessing your site. No, it does better than that. It produces a fake set of content especially for them that includes links back to your site (and mine, too, ok?) and sends it only to them.
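To make the idea concrete, here is a rough sketch of that selection step. It is written in Python for illustration only; AntiLeech itself is a WordPress plugin and this is not its code. The function names, the `examplescraper` User-Agent fragment, and the IP range are all hypothetical placeholders standing in for whatever lists you supply yourself.

```python
import ipaddress

# Hypothetical, user-supplied block lists (placeholder values only).
BLOCKED_AGENT_FRAGMENTS = ["examplescraper"]
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_known_scraper(user_agent: str, remote_ip: str) -> bool:
    """Return True if the request matches a listed User-Agent fragment or IP range."""
    agent = user_agent.lower()
    if any(fragment in agent for fragment in BLOCKED_AGENT_FRAGMENTS):
        return True
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


def choose_feed_body(user_agent: str, remote_ip: str, real_feed: str, decoy_feed: str) -> str:
    """Serve the decoy feed to listed scrapers and the real feed to everyone else."""
    return decoy_feed if is_known_scraper(user_agent, remote_ip) else real_feed
```

Everyone who is not on either list still receives the real feed, which is why the accuracy of those lists matters so much.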
AntiLeech also offers up the option of creating custom content to serve only to splog bots. This option is what lets me wildly prognosticate on how to entrap would-be sploggers. By using AntiLeech, a splog will publish on their site a fake piece of content that you or I write. This content can be anything….
Consider: sploggers splog for money. They put advertisements (like Adsense) next to the stolen content in the hopes that surfers will click on them as they read the stolen content. When anybody signs up for these advertising services they agree to be bound by a set of rules and regulations collectively called the ‘Terms of Service.’ If one party (the splog) were to break a few of the rules found in the agreement, and if the other party (Google) found out, the agreement should be terminated.
Here are a few choice rules that publishers who use Google’s Adsense program must adhere to. Site content may not include:
- Excessive profanity
- Violence, racial intolerance, or advocate against any individual, group, or organization
- Hacking/cracking content
- Illicit drugs and drug paraphernalia
- Pornography, adult, or mature content
- Gambling or casino-related content
- Excessive advertising
- Any other content that promotes illegal activity or infringes on the legal rights of others
- Incentives for clicking on ads or links, performing searches, surfing websites, reading emails, or completing surveys
- Sales or promotion of weapons, such as firearms, ammunition, balisongs, butterfly knives, and brass knuckles, alcohol, tobacco or tobacco-related products, prescription drugs, and products that are replicas or imitations of designer goods
My suggestion: write something that completely violates these rules in an overt way. Go crazy, use your imagination. This is your chance to play a very very dirty prank.
Take your new creative writing piece and use it with AntiLeech so that whenever a splog comes to take your content, it takes your agreement-violating creative writing piece instead. Should Google ever discover that the splog is placing ads next to this disgusting drivel, they’ll cancel the offending site's account (or warn its owner).
With the money gone, or at least under the constant threat that content taken from your site will have them promoting illegal behaviour, the splogger will have to be much more careful and/or drop your site from the list of those he or she steals from. Either way it takes more of their time, which costs them money.
Sploggers are lazy: if it takes too much energy to make money this way, they’ll stop. If they can’t make money this way, they’ll stop. The same principle also applies to splogger hosts (if they use decent, law-abiding ones, that is). For example, the DreamHost TOS explicitly states that customers who deal with child pornography and/or “tools or methods to send unsolicited e-mail or usenet postings (spam)” will break their agreement, resulting in a terminated account. Consider the mockup of some fake content (shown at the top of this post) that would get a lot of websites in trouble.
MaxPower thinks it would be more than a little crass to use something as horrible as child pornography to take down a spammer. However, helping out a splog promote spam and spamming methods, firearms, alcohol, gambling, hacking, drugs… — no problem!
Here is the kicker: since this vile content never ever ever appears on your site, you are in the clear. After all, the splog came for your content, and you gave them some content. What's it to you if the site owner wasn’t watching what his/her bot was doing? Is there even any evidence (available publicly) that the content that appeared on another site came from yours? Even if there was… so what?
As a justice-seeking, AntiLeech-wielding blog owner vigilante, all you need do is figure out when a splog has stolen your content, and report them to the right place. You could track splog acceptance of this fake content using a personal googlewhack (like the digital fingerprint plugin for WordPress published by MaxPower). Even if nothing happens, the splog gets useless content that will offend any normal web user into hitting the back button.
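One way to picture the personal googlewhack idea: generate a nonsense token that appears nowhere else on the web, drop it into the content the splog receives, and search for it later. The sketch below is only an illustration in Python and is not how the digital fingerprint plugin is implemented; `make_fingerprint`, `contains_fingerprint`, the secret, and the post identifiers are all made-up names and values.

```python
import hashlib
import hmac


def make_fingerprint(secret: bytes, post_id: str, length: int = 16) -> str:
    """Derive a stable, nonsense-looking token for one post from a private secret."""
    digest = hmac.new(secret, post_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "fp" + digest[:length]


def contains_fingerprint(page_text: str, fingerprint: str) -> bool:
    """Check whether a fetched page (for example a search hit) carries the token."""
    return fingerprint in page_text
```

Because the token is derived from a secret only you hold, you can regenerate it for any post later and check a suspect page without keeping a lookup table.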
Problems I foresee:
- Feeding the wrong content to a good guy
  - Conceivably, you could use AntiLeech to target the wrong IP/user-agent. Only RSS readers would see this, so it would never appear on your site in the traditional sense. One way to get around this is to add a statement at the start of the fake text to indicate its intent. This would have to be carefully worded so as not to alert those companies who will be investigating the splog for TOS breaking.
- Some splogs strip out all HTML before reposting
  - Many splogs won’t just copy your post verbatim; they strip out HREF and IMG tags. Your dirty message bomb needs to spell out the link in the visible text (e.g. www.websitelink.com) so that the splog takes the spelled-out link as text regardless of your markup.
- FeedBurner and AntiLeech
  - FeedBurner is a service that takes the feed published at your site and makes it available at a different address for potential subscribers. At this time, I’m not sure if FeedBurner hurts AntiLeech performance with respect to feeding splogs fake content. More investigation is required.
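The point above about splogs stripping HREF and IMG tags is easy to check for yourself. The following is a small Python sketch, assuming the splog does something roughly like a plain tag-stripper; real scrapers vary, and `TextOnly` and `strip_tags` are names invented for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text between tags, discarding all markup and attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html: str) -> str:
    """Return the visible text of an HTML fragment, roughly as a tag-stripping scraper would."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# The address lives only in the href attribute here, so it is lost when tags are stripped.
markup_only = '<a href="http://www.websitelink.com">click here</a>'

# Here the address is also the visible link text, so it survives as plain text.
spelled_out = 'Visit <a href="http://www.websitelink.com">www.websitelink.com</a>'
```

Running `strip_tags` on both fragments shows the difference: the first keeps only "click here", while the second still contains the spelled-out address.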

05 Oct 06
10:29 am
Feeding the wrong content to a good guy
A funny story. When I first posted the plugin, I pointed at Val’s site, and she was livid about her stolen content. She wrote out complicated instructions for blocking Bitacle using methods built into the Apache server, which are less resource intensive and more thorough than my plugin. But in the process, she included my home IP address in the list of IPs that Bitacle uses. Why? Because I had used her site to test her rules (with her knowledge), and she had gathered all of the IPs in her server log from the Bitacle bot, which I was impersonating.
The next day I was browsing the web at home, and found a few sites that I could no longer access. Funny how information travels.
So, bottom line: Yes, it’s possible (although rare) that you feed the wrong content to someone you know. Be sure they have a way to contact you and let you know about the problem.
FeedBurner and AntiLeech
I’ll say it explicitly: FeedBurner absolutely does harm performance of AntiLeech, even when using AntiLeech’s methods to redirect to FeedBurner. Why?
FeedBurner essentially scrapes, reformats, and republishes your content just like a splogger would, except you’ve (usually) asked them to. Because of that, there is no way for AntiLeech to embed a proper image tag into the FeedBurner output for tracking bots that steal your content from FeedBurner.
If you are not using FeedBurner, I suggest that you add FeedBurner to your list of excluded user-agents, because this has now become a way that sploggers are using to circumvent your protections. They sign your feed up for a FeedBurner account, then use FeedBurner to indirectly scrape your content for them.
FeedBurner really needs to add protections to its service for people who want to use it. When asked via email about adding this feature, their response was not encouraging.
05 Oct 06
11:20 am
“They sign your feed up for a FeedBurner account, then use FeedBurner to indirectly scrape your content for them.”
Ooooo that's dirty (and clever). Thanks for giving the straight goods on using FeedBurner together with AntiLeech. Plagiarism Today author Jonathan Bailey was told that FeedBurner was working on methods of blocking IPs; who knows if they will come through.
And I agree with you that mistakes can happen regarding serving up poison content and that there needs to be some mechanism to allow people to report it. However, even if this was impossible, I would still use AntiLeech because the benefits far outweigh the potential drawbacks.
BTW AntiLeech protected this post from a splog already. I’ll post the details later.
06 Oct 06
5:46 am
Should we set up a site like http://phishtank.com for splogs? I’ve been thinking about that since they released the site, wondering why nobody has done it yet.
This way there is a central place to pull splog information and possibly responses that get them pulled from the network.
I’ve been working on a web crawler designed to locate splogs, fraud and phishing sites. If anyone wants to help build such a site I’d be willing to contribute the data found. I still have work to do on the code so there’s nothing to show yet.
Wayne
06 Oct 06
8:49 am
I like http://madeforads.com/. Check it out.
06 Oct 06
4:39 pm
Ah ! wicked idea :))
07 Oct 06
2:05 am
Brilliant post!
Have you heard about Numly.com? They allow you to register your works online and return a verifiable Numly Number for proof of copyright submission. To my knowledge, sploggers do not strip their Numly Numbers from their posts. This provides proof of where they picked up the original copyrighted post.
09 Oct 06
5:32 am
Is there an “unofficial list” of splog user-agents? Might be good to share.
30 Oct 06
10:59 am
[...] Using Owen’s fantastic WordPress plugin antileech, I have successfully taken control over new posts appearing on a splog intent on copying them. Instead of getting actual post content, the splog gets a message written by me. Right now that message advocates gambling and buying firearms, two things expressly forbidden by the Adsense TOS (read Fight dirty by entrapping splogs using antiLeech). [...]
21 Nov 06
11:05 pm
Heh, yeah, did I ever mention how stupid I feel about that whole IP thing?
Go Antileech! It’s great:
en.bitacle.org/v/445z-txl7did0/well-to-tell-you-the-truth-we-do-things-a-little-differently-here-.html
en.bitacle.org/v/446z-txl7did0/a-b-c-d-i-j-k-m-m-z-.html
16 Feb 07
6:32 pm
[...] MaxPower suggests fighting dirty by using AntiLeech to hand ‘questionable’ content to the splogs - such as profanity, racial hatred, pornography. Just the kind of stuff that Google loves to ban its Adsense users from displaying. [...]
21 Apr 10
9:31 am
[...] you like to fight fire with fire, there is also anti-leech. This is for bots that take your information and post it as their own on a blog. What is it? [...]