SEO Duplicate Web Content Penalty Myth Exploded
The “duplicate content penalty” myth is one of the biggest obstacles I face in getting web professionals to embrace reprint content. The myth is that search engines will penalize a site if much of its content is also on other websites.
Clarification: there is a real duplicate content penalty for content that is duplicated with minor or no variation across the pages of a single site. There is also a “mirror” penalty for a site that substantially duplicates another single site. What I’m talking about here is the reprint of individual pages of content on multiple sites, rather than the wholesale copying of a site.
Another clarification: “penalty” is a loaded concept in SEO. “Penalty” means that search engines will punish a website for violations of the engine’s terms of service. The punishment can mean making it less likely that the site will appear in search results. Punishment can also mean removal from the search engine’s index of web pages (“de-indexing” or “delisting”).
How have I exploded the “duplicate content penalty” myth?
- PageRank. Many thousands of high-PageRank sites reprint content or provide content for reprint. The most obvious case is the news wires, such as Reuters (PR 8) and the Associated Press (PR 9), whose stories are reprinted on sites such as http://www.nytimes.com (PR 10).
- The proliferation of content reprint sites. There are now hundreds of websites devoted to reprint content because it’s a cheap, easy magnet for web traffic, especially search engine traffic.
- Experience. I’ve seen significant search engine traffic both from distributing content to be reprinted and from reprinting content on my own site.
How I Doubled Search Engine Traffic with Reprint Content
When I first started distributing content for my main site, I was stunned by the highly targeted traffic I got from visitors clicking on the link at the end of the article. Search engine traffic also slowly increased both from the links and from having content on the site.
But I was even more stunned by the search engine traffic I got when I started putting reprint articles on the site in September. I had written quite a number of reprint articles for clients and accumulated a few webmaster “fans” who looked out for my articles to reprint them. I wanted to make it easier for them to find all the reprint articles I had written.
I didn’t want to draw too much attention to these articles, which had nothing to do with the main subject of the site, web content. So I secluded the articles in one section of the site.
The articles got a surprising amount of search engine traffic. The traffic was overwhelmingly from Google, and for long multiple-word search strings that just happened to be in the article word for word.
Why was I surprised by all the search engine traffic?
- The articles had so little link popularity. Their link popularity came primarily from a single link on the homepage to the “reprint content” page. That page linked to category pages, which linked to the articles themselves, putting each article three clicks from the homepage. The sitemap was enormous, with well over 100 links, so its PageRank contribution was minimal. Since these articles had been on the site for such a short time, I strongly doubt they got any links from other sites.
- The articles had so much competition. These articles had been reprinted far more widely than the average reprint article, which is lucky if it makes it into a few dedicated reprint sites. As part of my service I had done most of the legwork of reprinting my clients’ articles for them. In fact, I guarantee at least 100 reprints on Google-indexed web pages either for each article or group of articles. So that’s up to 100 web pages, sometimes more, that were competing with my web page to appear in search engine results for the search string.
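The "three clicks from the homepage" figure above can be checked for any site with a breadth-first search over its internal links. The following is a minimal sketch, not a description of any search engine's method. The URLs in `site_links` are hypothetical, and the sitemap is left out for simplicity.

```python
from collections import deque


def click_depth(links, start):
    """Return the minimum number of clicks from `start` to each reachable page.

    `links` maps a page URL to the list of pages it links to.
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:  # first visit is the shortest path in BFS
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


# Hypothetical structure: homepage -> reprint page -> category -> article.
site_links = {
    "/": ["/reprint-content/"],
    "/reprint-content/": ["/reprint-content/category-a/"],
    "/reprint-content/category-a/": ["/reprint-content/category-a/article-1/"],
}

depths = click_depth(site_links, "/")
print(depths["/reprint-content/category-a/article-1/"])
```

Pages with a larger click depth generally receive less internal link popularity, which is why the traffic to these deeply buried articles was a surprise.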
Why Do Reprint Articles Get Search Engine Traffic?
You would think Google would just pick one web page with the article as the authoritative edition and send all the traffic to it.
But that’s not how Google works. All the search engines look at factors beyond just the content on the web page. They look at links. Google, at least, claims to look at 100 factors in total. Many of these must relate to the content on the page, but not all of them do.
The whole experience has given me great insight into what factors Google uses in addition to what we would consider the page itself, and the relative importance of each.
- Web page titles (the text in the HTML title tag) are extremely important as tie-breakers between two otherwise equally matched pages. Most reprinters waste the HTML title by using the article title as the web page title. Set yourself apart by creating unique five-to-ten-word web page titles that include target keywords.
- Content tweaks. You can also introduce the article with a unique, keyword-laden editor’s note, and finish the article off with some keyword-laced comments.
- Intra-site link popularity and anchor text (that is, for links to the article page from other web pages on the site) are also important. If you can’t link to the page from the homepage, keep it as close to the homepage as possible and weed out extraneous links (try putting all your site policies on a single page).
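The page title advice above can be turned into a quick check to run before publishing a reprint. This is only a sketch under my own assumptions: the function name `check_page_title`, the example title, and the keyword list are hypothetical, and the rules simply mirror the guidelines in the list (five to ten words, not a copy of the article title, target keywords present).

```python
def check_page_title(page_title, article_title, keywords):
    """Return a list of problems with a proposed HTML page title.

    An empty list means the title follows the guidelines above.
    """
    problems = []

    # Guideline: five to ten words.
    word_count = len(page_title.split())
    if not 5 <= word_count <= 10:
        problems.append(f"title has {word_count} words; aim for 5 to 10")

    # Guideline: do not reuse the article title as the page title.
    if page_title.strip().lower() == article_title.strip().lower():
        problems.append("page title just repeats the article title")

    # Guideline: include the target keywords (case-insensitive phrase match).
    lowered = page_title.lower()
    missing = [k for k in keywords if k.lower() not in lowered]
    if missing:
        problems.append("missing keywords: " + ", ".join(missing))

    return problems


# Hypothetical example values.
print(check_page_title(
    "Reprint Articles and Free Web Content for Webmasters",
    "SEO Duplicate Web Content Penalty Myth Exploded",
    ["reprint articles", "web content"],
))
```

A title that passes this check is not guaranteed to rank. The check only catches the most common way reprinters waste the title tag.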
Reprint articles, like the search engine traffic they bring, cost nothing. Don’t look a gift horse in the mouth. Forget the “duplicate content penalty.” Get in on content reprints and share the search engine wealth.