Recently, it has come to my attention that there is a new type of insipid blogger. Unlike splogs, which are created to promote commercial websites and contain mostly garbage text, these new blogs look polished and contain seemingly pertinent content of value.
The problem is, the text within has already been published by someone else. I call these plagiarists doppelbloggers.
Why doppelblogger? A doppelganger is a double or lookalike of a living person. The word can also describe the phenomenon of catching your own image out of the corner of your eye. For our purposes, creative freedom allows doppelblog: a double or lookalike of a working web page. By extension, a doppelblog is written by a doppelblogger. As a refresher, here is plagiarism as defined by Wikipedia:
Plagiarism most commonly refers to the passing off of another person’s work as one’s own, whether deliberate or accidental. Accidental plagiarism is usually the result of poor citation or referencing, or of poor preparation or a misunderstanding of what constitutes plagiarism. Deliberate plagiarism is a purposeful attempt to claim another person’s work as one’s own.
Any unacknowledged use of words, ideas, information, research, or findings not one’s own, taken from any source including a published or unpublished book, the Internet, a lecture or conversation, a film, television or radio show, even if paraphrased, constitutes plagiarism. [source]
A doppelblogger’s motivation could be self-promotion or money. By appearing as an ‘expert’ they hope to gain prestige and build up their reputation as someone ‘in the know’. Doppelbloggers may also be motivated by money: they gain more traffic as people begin to value their contributions, and with more traffic comes revenue via advertising systems such as AdSense.
Doppelbloggers are hard to spot. While your typical splog’s content often consists of random words interspersed with keywords specifically designed to rank the page higher in the search engines, a doppelblog’s content is simply taken from other sources and posted online.
Doppelbloggers find it easy to get content for their blogs. There exists a superabundance of seemingly faceless websites that provide content to web readers, everything from small websites to super-sites like Wikipedia. All a doppelblogger need do to create web content is search such sites for material relevant to the readers they are trying to attract. Once found, they simply cut and paste it into their own site. Sometimes they are careful to observe appropriate copyright usage and acknowledge the original work by placing a small link to it or even just naming the source. An example would be, “Hey everyone, I found this over at www.website.com and I think it’s great! [entire contents of article inserted here].”
Acknowledgment of sources is the exception, not the rule.
There is no “blogging police”; it’s up to individuals to find cases where their work is being used without permission. Since the vast majority of bloggers create content for their websites on a part-time basis, they don’t have time to police the internet looking for copies of their work. Doppelbloggers know this, and they prey upon these types of websites by copying and pasting from the original source to the doppelblog. Doppelbloggers also rely on people’s ignorance of copyright and plagiarism. Even bloggers who discover they have been ripped off often don’t know exactly what to do.
What really gets us riled up here is that we are extremely tedious about citing our sources. You’ll note that when I quoted the definition for plagiarism, I linked to the place where I found the definition. [...] We care about what we write and give credit where credit is due, so yes, it pisses us off to see that others — particularly Latino blogs — are not doing the same. [vivirlatino.com]
Let’s be clear: the right thing to do is confront a doppelblogger. Once confronted, they often plead innocence and ignorance and offer to remove the offending content. I call this the ‘accident-on-purpose’ defense. From the point of view of a doppelblogger, it’s easy to remove something you haven’t written to appease an angry plagiarism victim, as more content is only a few clicks away.
Doppelbloggers can also be very passive-aggressive about the way they operate. Remember, plagiarists copy and paste to prop up their own status and gain traffic to their site. A doppelblogger’s reputation would seriously suffer if it was found out that their content consisted mostly of plagiarised work. To avoid this, they remove or change the offending article so that it is no longer easily recognisable as a plagiarised work. They then respond to allegations of plagiarism with accusations and legal threats, trying to intimidate or scare the original content creator into doing nothing.
The first thing to do when you find out that you have been plagiarised is to make a copy of the offending work. If a doppelblogger responds to your notice of copyright infringement all you need do is refer to the copy you have made of their site showing the infringing work.
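For the technically inclined, here is a minimal sketch in Python of one way to keep such a copy. It assumes you have already downloaded the page yourself (for example with your browser’s Save As) and have its raw bytes in hand. The function name save_evidence, the file naming scheme, and the record fields are my own illustrative choices, not any standard. It stores the page next to a small record of the URL, the capture time, and a SHA-256 hash, so you can later show that the copy has not been altered.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def save_evidence(url, page_bytes, folder):
    """Store a copy of a page plus a record of where and when it was captured.

    Hypothetical helper: the naming scheme and record fields are illustrative.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # The hash lets you show later that the saved copy was not modified.
    digest = hashlib.sha256(page_bytes).hexdigest()
    # UTC timestamp with no dots, so it is safe to use in a file name.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

    page_path = folder / f"{stamp}-{digest[:12]}.html"
    page_path.write_bytes(page_bytes)

    record = {
        "url": url,
        "captured_utc": stamp,
        "sha256": digest,
        "file": page_path.name,
    }
    # Write the record beside the page, same name with a .json extension.
    page_path.with_suffix(".json").write_text(json.dumps(record, indent=2))
    return record
```

A screenshot is still worth taking as well, since a saved HTML file alone may not show how the page looked to visitors.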
The website Vivirlatino recently discovered that a doppelblogger was copying their content. You can read all about it, but it’s the same tired story. Once ‘outed’, the doppelblogger eliminated the offending work without any official acknowledgment or apology. The doppelblogger then posted a comment to Vivirlatino’s website under a different name, trying to sway the court of public opinion. Of course, this too was caught.
I firmly believe that doppelbloggers think all other bloggers must be stupid; after all, why write content when you can just take it? What they fail to realise is that most website owners are web savvy, even if they don’t have enough time to police the internet looking for infringements of their work.
Spotting a doppelblog isn’t easy, but it is possible. If you suspect that what you are reading isn’t an original piece of work, do what I do: simply grab a chunk of the text and throw it into Google with quotes around it. If you get some hits, you can probably bet that something is fishy. I contend that it is your duty to send a quick email or leave a comment to let the real copyright holder know of the plagiarism, if you can find them. Good karma and all that aside, wouldn’t you like to know if your work was being stolen? Consider the following quote from an article on resume writing:
Begin sentences with action verbs. Portray yourself as someone who is active, uses their brain, and gets things done. Stick with the past tense, even for descriptions of currently held positions, to avoid confusion.
Google returns 28 word-for-word matches for this search phrase. Each match corresponds to a copy of the original article in its entirety. Who is the original author? It’s impossible to tell, but without a doubt 27 of those search results are plagiarists. Doppelbloggers specialise in finding content like this to post as original on their own websites. Unsuspecting readers think they are being treated to valuable new information. Ads are clicked. Everybody wins except the original author.
OK, I lied. Nobody wins. The following quote from Disconnect the Dots author Ja (who was commenting on an earlier post here at maxpower regarding a plagiarism issue) sums up many web users’ frustrations:
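If you check suspicious text often, the quote-and-search trick is easy to script. What follows is only a sketch in Python: the helper names exact_phrase_query and search_url are made up for illustration, and the only thing assumed about Google is the ordinary search URL with a q parameter. It takes the first few words of a passage, wraps them in quotes, and builds a link you can open in your browser.

```python
from urllib.parse import urlencode


def exact_phrase_query(text, max_words=12):
    """Return the first max_words words of text wrapped in double quotes.

    Hypothetical helper, named for illustration only.
    """
    phrase = " ".join(text.split()[:max_words])
    # Drop straight double quotes so they cannot end the quoted phrase early.
    phrase = phrase.replace('"', "")
    return f'"{phrase}"'


def search_url(text, max_words=12, base="https://www.google.com/search"):
    """Build a search link for an exact-phrase match on the start of text."""
    return base + "?" + urlencode({"q": exact_phrase_query(text, max_words)})


sample = (
    "Begin sentences with action verbs. Portray yourself as someone "
    "who is active, uses their brain, and gets things done."
)
print(search_url(sample, max_words=6))
```

A shorter phrase casts a wider net, while a longer one cuts down on coincidental matches. Somewhere around a dozen words from the middle of a paragraph is a reasonable place to start.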
In the end web-pollution makes it impossible to find stuff with all this crazy linking and content regurgitating clogging up any effort to find what you’re actually looking for not to mention that it seemingly causes multiple locations of the same goddamned stuff.
He believes that all this duplicate fluff will drown out the real, new, and original content. I tend to agree. Many people also think that duplicate content is worth talking about. The situation appears to be some sort of search engine arms race: steal as much as you can until you are caught, then find a new method.
In a doppelblogger case found here at MaxPower, the plagiarist was caught red-handed. When confronted, the doppelblogger claimed ignorance of the law and changed (slightly) the stolen content. Next, a comment posted on this site purporting to be from a ‘friend’ of the doppelblogger suggested that because the plagiarist lived in Sweden he was untouchable by the law (the direct quote is: “good luck sending a cease and desist letter to a guy living in Sweden (do they even have the DMCA there?)”).
Doppelbloggers hide behind a veil of anonymity to shield themselves from being outed. This shield must make them feel safe, knowing that their real identity can’t be found. This is simply not true: unless a doppelblogger is very good, online identities are easy to piece together. Even if you can’t find a doppelblogger’s real name, you can often find a trail of handles (or usernames) that they use across different websites on the net. Using Google as your Dr. Watson, much can be learned.
My doppelblogger may or may not live in Sweden (I highly doubt it). I don’t really care, because what is important is that the doppelblogger’s site is hosted in the USA, home of hamburgers, pork rinds, NASCAR and the DMCA.
Plagiarism Today author Jonathan Bailey has an excellent writeup on using the DMCA to stop plagiarists. I highly recommend giving it a read. From the PT page referenced: “The DMCA has a notice-and-takedown provision which requires hosts to remove or disable access to items which have been reported to be infringing on copyright.” You may disagree with the principles behind the DMCA, a much despised law, but you might as well put it to good use and fight the good fight.
Since the doppelblogger who copied me has already fessed up, changed the content, and moved on, there is nothing, in the legal sense, that I can do. This is because the DMCA can only be used against a copyright violator if you own the rights to the original work. In a non-legalese sense, there is something that can be done about caught doppelbloggers: shame them. A few examples of shaming a doppelblogger are listed at the bottom of this page.
While shaming someone certainly feels like a good thing to do (it’s an emotional, visceral response), it isn’t always the best thing to do. Consider the age-old adage, “any publicity is good publicity.” Do you really want to send more visitors to the doppelblog?
If a doppelblogger copies content to enhance their reputation, then outing them may be very effective, especially if you can outrank them in the search engine results (or at least come close). A semi-related example: if you were to search for Best Buy, among the top results is the bestbuysux.org website chronicling why Best Buy sucks.
If you intend to publish a doppelblogger’s transgressions, be sure that things don’t break down into a pissing contest. Your goal should be to report the truth in order to shame the plagiarist into action. Given that the good Dr. Watson (Google) can be very effective in determining online identities, creating a page which documents a doppelblogger’s copyright offences, who they are, and what web communities they hang out in could seriously damage an individual’s online reputation. As a doppelblogger, imagine for a moment a potential employer googling your name only to find proof of your previous plagiarism and copyright violations. As the Internet becomes more and more pervasive in everyday life, you had better believe that every human resources department will be googling candidates’ names to learn as much as they can about a potential employee.
If you do go down the path of shaming a doppelblogger, make sure you intend to follow through. I heard on the radio today that the etymology (root meaning) of the word ‘success’ is literally ‘to follow through.’ Therefore, if you want to be successful against plagiarists, be prepared to ‘play the long game.’ A good shaming could take weeks or months.
In conclusion, doppelbloggers do what they do for a variety of reasons, including fortune and fame. They are lazy and risk not fame but infamy. To paraphrase Plagiarism Today: fighting plagiarism is about preserving your work and your efforts as your own. Don’t let others make money off of your words, thoughts, and ideas by calling them their own.
Appendix
I haven’t talked much about how to prevent plagiarism because others have done it so well; see the further reading section below for a list of excellent resources regarding plagiarism. In addition, a short list of valuable plugins for WordPress is presented. After what I have learned about this subject, I strongly encourage their use. Lastly, I’d really appreciate some feedback on some of the ideas I presented here. Do you think shaming works? Are there more examples than the ones posted below? I touched on this, but it’s got me thinking: if the only one who can legally do something about plagiarism is the content owner, then how do we go about getting duplicate content sites and their authors punished? Perhaps Google and Yahoo secretly condone this form of plagiarism; they like their ads clicked ($) too.
Examples of Doppelbloggers caught in the wild
- Different Poets get their content stolen and combined
- Blogger shuts down his doppelblogger
- Latino website gets plagiarised by a site representing ‘blown pride’
- The Pop eye finds out that random thoughts aren’t so random
References and further reading
What Do You Do When Someone Steals Your Content
Content Theft from Feeds - It’s Time To Take Action
Finding Stolen Content and Copyright Infringements
Scraping for Content
10 Big Myths about copyright explained
Wordpress plugins that help with copyright issues
- Numly Numbers
- Blog Copyright WordPress Plugin
- WP-CC Plugin for Wordpress
- WPLicense WordPress Plugin
- Automatically add copyright message to Feeds
28 Apr 06
9:40 pm
You keep silencing me, but I am still here. I know i am getting under your skin and you cannot do anything about it. :-).
Have a nice day,
John
28 Apr 06
9:58 pm
Dude, I moved your comment to a more relevant location. Thanks for stopping by. To everyone else, please ignore the plagiarist. He is ignorant of the law, good taste, and irony.
30 Apr 06
6:59 am
I wouldn’t doubt it… with Google we’re talking about a company that sells words to the highest bidder for advertisement, including trademarked phrases. In a lot of ways they’ve started to become the new Microsoft of the web… their search engine has gone to shit yet they don’t care to make it better because everyone uses it by default these days (kinda like MS and IE, etc) anyway so why bother while they can be spending money on acquiring other companies and putting out free new stuff to keep a loyal fanbase thinking they’re cool/hip while they’re just really devious… okay maybe they’re more like a combination between MS/Apple, heh.
Anyway, as far as all this content being all over the place… it’s going to get worse. Structured Blogging and Microformats are heating up and I’m long overdue for writing up a little expose on that situation and my concerns with both. So stay tuned for that to pop up on dtd should I ever get through all this work I need to get done at the moment.
Great article, btw!!
And if it’s not covered by any of the plugins you mentioned (or anyone just wants to check it out): http://microformats.org/wiki/rel-license
30 Apr 06
11:38 pm
Excellent! Absolutely brilliant! Thank you!
And thanks for the references to my articles.
01 May 06
12:05 am
[...] Max Power’s article, “You’ve Heard of Splogs. Meet Dopplebloggers”, is a must read if you give any thought about the “good side” of content theft as well as the bad side. [...]
03 May 06
1:58 pm
[...] Kirk Montgomery tells us on his blog about the new scourge of blogs, the doppelbloggers: people who devote themselves to filling their blogs with plagiarized content. [...]
03 May 06
8:02 pm
A major annoyance, I’ve been seeing this, along with the splogs. It’s even easier than cut and paste, however, as you could just import someone’s RSS feed… heck, hundreds of RSS feeds. So, it’s important not to give out your full content in RSS - in my opinion.
Maybe there needs to be some kind of public blacklist. Not sure how much it would help now, but could be useful down the line.
03 May 06
8:17 pm
I’ve been reading about people who detest feeds that aren’t full content. I kind of agree with them, I kind of don’t. While providing only some of the content prevents plagiarism, it doesn’t protect against cut and pasters and it pisses people off. RSS splogs are easier to find via the search engines, but that doesn’t mean easier to catch… it’s a toss-up either way.
Thanks for the comment, I hope doppelblogger catches on.
03 May 06
9:29 pm
Yeah, I’m not bothered at all by excerpts in the RSS. I’m usually pretty sure if I want to read on or not, and I like the excerpts because I can scan a whole lot at once. If I want to read on, I’m happy to go through to the site. I can see it being a toss-up though!
04 May 06
5:33 am
What’s really annoying in search engines is the RSS feed services that just take the RSS feeds and redistribute them, so even if you do go searching for the dopplesplogsters by tracing RSS scraping, all you’ll really find are these RSS repeater services, tagging services, whatever. Search engines like Google, along with certain other services like the Wayback Machine and such, not only index data but cache it. With all the money they have (Google, that is) you’d think they could put a system into place to compare material they’re indexing with material already stored and red flag something that matches up too well. This may not seem feasible, but it could be a subscription service where you register your site with them and pay X amount per month for them to keep it protected. That would control the number of people signing up (which could make it feasible) AND earn them some money directly for providing an ACTUAL service. Really it’s the least they could do, since they are a major cause of all this.
Another interesting theory is creating web crawlers that analyze the writing _style_ of your blog (yes, it’s possible). Then if a single-author blog overall has major discrepancies in style from post to post… or if it has style matching perfectly that of another blog… again, red flag. I think way too much.
But anyway, while I read hardly any feeds anymore, I used to read a ton and I have to admit that I liked having the full text in the feed so I could just read them without having to wait for the sites to load up. The worst though were the ones that provided only a title or less than a line of text… a decent excerpt or description at least does the job of helping one figure out if they want to read it, mark it for later, or just skip it. So that seems to be a pretty good compromise if done correctly.
What feed readers do you guys use? The only decent one that was even close to matching what I wanted out of a feed reader was Onfolio (not specifically a feed reader but had a great one built in) which I used for a while until Microsoft completely ruined the excellent product by buying it with the intention of adding a stripped down free version to their Microsoft yadda yadda toolbar something which I will never ever use. So I’m back in the hunt for a decent one. Any suggestions? Last time it was a major undertaking to find one I was satisfied with.
Jā
09 May 06
11:39 am
I use gregarious as my feed reader. But to be honest, I like visiting the pages rather than viewing the text. As to full text feeds, I would like to get 1 ad per article in the feed (at most); however, nobody wants to provide this service. I think this is because I don’t write about things that directly translate into products (such as reviewing digital cameras or other electronic products).
13 May 06
7:23 pm
[...] A doppelblogger is someone who plagiarizes the content of another blogger for personal gain or recognition. Term coined by Max Powers. [...]
23 May 06
2:01 pm
[...] The general population of bloggers are absolutely sick of sploggers and dopplebloggers (see an excellent post here on maxpower and one with some good comments from a very popular blog earlier in the year right here) and to say Google is completely responsible would be inaccurate… but not too far off. [...]
15 Oct 06
4:12 pm
What’s the point in writing a blog if you are just copying? Blogs are supposed to be like a journal of YOUR life. The world is turning into a really fake place.