

THE INFAMOUS DUPLICATE CONTENT PENALTY THEORIES…
Which Version? Which Penalty? Is There Any Truth To Any Of It?
The meaning of Duplicate Content Penalty seems to vary, depending on whose blog you read. One version holds that search engines penalize a site when its content is substantially duplicated on other websites. Another holds that only the duplicating site is penalized. Then there's the version in which article reprinters are penalized when their individual articles are reprinted across a number of sites. On top of that, "penalty" itself is a tricky term: it can imply that search engines will ban your site, refuse to list duplicate pages in SERPs, or de-index (de-list) the duplicates altogether.
Well, here's what we know for sure. A penalty is imposed upon sites which substantially duplicate en masse — or mirror — another site in its entirety, with little or no variation in the content of each page, or of the file directory structure. The mirroring site will eventually be diminished in a search engine's SERPs (search engine results pages).
But from there, it gets murkier. The reality is that there are thousands of sites which consist almost entirely of reprint content, and many do, in fact, have extraordinarily high PageRanks. Major newspapers such as The New York Times (PR 10) and the Los Angeles Times (PR 8), while offering original content, are also fed by news wires such as Associated Press (PR 9) and Reuters (PR 8). Clearly none of these sites is incurring a duplicate content penalty. Not to mention auxiliary news sources like Google, Yahoo, MSN, et al., which would have to penalize themselves for duplication right along with the zillions of other news sources.
If you're an article reprinter, you want your original article to appear higher in SERPs than its reprints, of course, since you want your original to be perceived as the authoritative version. Plus, one of your goals in reprinting was to send targeted traffic to your own website. But if you've got lots of reprint articles floating around out there, you may face a cold reality one day. A search for your reprints could reveal that nasty note appended to your SERPs.
"In order to show you the most relevant results, we have omitted some entries very similar to the 22 already displayed. If you wish, you can repeat the search with the omitted results included."
Devastating, isn't it? You could submit an article to 200-300 publisher sites, and for a few weeks, enjoy 200-300 SERP listings. Cool! But search again a few weeks later, and you're likely to find that your listings have been filtered to around 20 or so. And that's if you're lucky. Article reprinters now have to work much harder to generate the inbound links that improve PageRank, not to mention the search engine traffic that 200-300 SERP listings would have brought. But you can get away with a certain amount of duplication without incurring a mirror penalty. The simple fact is this: no search engine is currently mathematically capable of comparing any single page in its index to every other single page in its index.
Imagine the horsepower it would take for some omniscient search engine analyzer to comparatively crunch every page on the Web!
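To put a rough number on that horsepower: a brute-force check of every page against every other page needs n(n-1)/2 comparisons for an index of n pages. The sketch below is only back-of-envelope arithmetic in Python. The function name and the index sizes are hypothetical round numbers chosen for illustration, not figures from any search engine.

```python
def pairwise_comparisons(page_count: int) -> int:
    """Number of distinct page pairs in an index of page_count pages."""
    return page_count * (page_count - 1) // 2


# Hypothetical index sizes, chosen only for illustration.
for pages in (1_000, 1_000_000, 1_000_000_000):
    print(f"{pages:,} pages -> {pairwise_comparisons(pages):,} pairs")
```

The pair count grows with the square of the index size, so multiplying the number of pages by a thousand multiplies the brute-force work by roughly a million.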
Let's say it's comparing two pages. Page A is the original, posted on Site A. Site B, an article reprint site, posts Page B, a duplicate of Page A (with Site A's blessing, of course). Now, let's make it more complicated. Omniscient Analyzer discovers that Page A is brand new, which it could presumably determine from file date headers. How could Omniscient Analyzer be certain which page is the duplicate? Suppose Page A is simply a new file with a different filename, but contains content that has been on the Web for five years, perhaps because the website was recently redesigned.
And where does Omniscient Analyzer draw the line with "similar results"? If Page A is X% similar to Page B, what value of X triggers the filter? Could you simply change the word order in a couple of sentences (say, the lead sentence) and trick Omniscient Analyzer? Can article submitters avoid the penalty by slightly rewriting the article? By renaming versions of the reprinted article? It's worth a shot.
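One way to picture an "X% similar" score is to break each page into overlapping three-word sequences and measure how many sequences the two pages share. The Python sketch below does exactly that. It is purely illustrative: nothing here is known to match what any search engine actually does, and the function names, the three-word window, and the sample sentences are all assumptions made up for this example.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one window: treat it as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Shared shingles divided by all shingles (Jaccard), from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts count as identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical lead sentence and a version with the word order shuffled.
original = "Article marketing is a proven way to build inbound links to your site."
reworded = "To build inbound links to your site, article marketing is a proven way."

print(f"identical copy: {similarity(original, original):.0%}")
print(f"reordered lead: {similarity(original, reworded):.0%}")
```

Under this particular measure, reordering the clauses of a sentence lowers the score but leaves most of the three-word sequences intact, which suggests that light shuffling alone may not make a reprint look very different from its original.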
Some experiments to try…
For starters, don't submit the same article to every reprint site on the internet! "Customize" the article. Sure, it's work. But remember, you can't even buy inbound links this valuable. Experiment with making HTML page titles different from actual article titles. Introduce your reprints with slightly different leads. Include different intra-site links — especially in your original — to increase relevancy.
Draft a couple of versions of the article when you first write it (while it's still fresh), each with a slightly different angle. Create versions with reordered paragraphs, writing new segues if necessary. Shift sentences around in your last paragraph. Write different versions of your final sentence, using different keywords and different adjectives.
Most of all, don't be discouraged and don't be dissuaded from reprinting your articles. Article marketing costs nothing and it generates free traffic. It's still a killer technique for generating inbound links and increasing search engine traffic. (If you're not convinced, see Article Marketing: The Purist Approach To SEO & Brand-Building.) And there's always the PageRank boost to consider, especially if your reprint articles are posted on sites with high PageRanks and your own site's link popularity is low; i.e., not many websites link to your site. High PageRank sites which point to your site are essentially casting a vote for your site, and link popularity is crucial in influencing your site's position in SERPs. If it comes down to it, you can always yank your article from reprint sites if the duplicate content filter drops your "authoritative" original to a status of "Omitted Results."
Given the burgeoning number of article reprint sites, no doubt the search engines (particularly Google, which stakes its reputation on relevancy) are working on algorithms to diminish the influence of inbound links from article directories on PageRank. But for now, article marketing remains a safe promotional strategy.