The Duplicate Content Penalty Myth

One thing that has plagued the SEO industry for years has been a lack of consistency when it comes to SEO terms and definitions. One of the most prevalent misnomers being bandied about is the phrase "duplicate content penalty." I’m here to tell you that there is no such thing as a search engine penalty for duplicate content. At least not the way many people believe there is.

Don’t get me wrong; I’m not saying that the search engines like and appreciate duplicate content — they don’t. But they don’t specifically penalize websites that happen to have some duplicate content.

Duplicate content has been and always will be a natural part of the Web. It’s nothing to be afraid of. If your site has some dupe content for whatever reason, you don’t have to lose sleep every night worrying about the wrath of the Google gods. They’re not going to shoot lightning bolts at your site from the sky, nor are they going to banish your entire website from ever showing up when someone searches for what you offer. The duplicate content probably won’t show up in searches, but that’s not the same thing as a penalty.

Let me explain.

The search engines want to index and show to their users (the searchers) as much unique content as algorithmically possible. That’s their job, and they do it quite well considering what they have to work with: spammers using invisible or irrelevant content, technically challenged websites that crawlers can’t easily find, copycat scraper sites that exist only to obtain AdSense clicks, and a whole host of other such nonsense.

There’s no doubt that duplicate content is a problem for search engines. If a searcher is looking for a particular type of product or service and is presented with pages and pages of results that provide the same basic information, then the engine has failed to do its job properly. In order to supply its users with a variety of information on their search query, search engines have created duplicate content "filters" (not penalties) that attempt to weed out the information they already know about. Certainly, if your page is one of those that is filtered, it may very well feel like a penalty to you, but it’s not – it’s a filter.
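
To make the filter-versus-penalty distinction concrete, it can help to see the idea in miniature. The engines don't disclose how their filters actually work, so the sketch below is purely illustrative (the function names and similarity threshold are invented for this example): it compares pages by the overlap of their word shingles and simply omits near-duplicates from a single result set. The "filtered" page isn't punished in any way; it just isn't shown alongside its near-twin.

```python
# Toy near-duplicate "filter" -- illustrative only; real search engines
# use far more sophisticated, unpublished methods at web scale.

def shingles(text: str, k: int = 3) -> set[str]:
    """Return the set of overlapping k-word shingles in the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def filter_duplicates(pages: list[str], threshold: float = 0.7) -> list[str]:
    """Keep each page unless it is too similar to a page already kept."""
    kept_pages: list[str] = []
    kept_shingles: list[set[str]] = []
    for page in pages:
        s = shingles(page)
        if all(jaccard(s, seen) < threshold for seen in kept_shingles):
            kept_pages.append(page)
            kept_shingles.append(s)
    return kept_pages

results = [
    "red widgets for sale in every size and color imaginable",
    "blue widgets for sale in every size and color imaginable",  # near-duplicate
    "a completely different page about widget repair services",
]
# The blue-widget page is omitted from this result set, but nothing about
# the page itself has been penalized; it remains eligible elsewhere.
print(filter_duplicates(results))
```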

Search engine penalties are reserved for pages and sites that are purposely attempting to trick the search engines in one form or another. Penalties can be meted out algorithmically when obvious deceptions exist on a page, or they can be personally handed out by a search engineer who discovers an infraction through spam reports and other means. To many people’s surprise, penalties rarely happen to the average website. Most that receive a penalty know exactly what they did to deserve it.

Honestly, the search engines are not out to get you. Matt Cutts isn’t plotting new ways to take food off your table. If you have a page on your site that sells red widgets and another very similar page selling blue widgets, you aren’t going to find your site banished off the face of Google because of this. The worst thing that will happen is that only the red widget page may show up in the search results instead of both pages showing up.

On the other hand, if you’ve created a Mad Libs spam site — i.e., one that uses a pre-written template where specific keyword phrases are substituted out for other ones — the pages in question might get filtered out completely. Not so much because of their dupe content (although that’s part of it), but because it’s search engine spam (low-quality pages with little value to people, created solely for search engine rankings).

The bottom line is that the engines are actively seeking out lousy content and removing it from their main results. If this sounds like your site, don’t be surprised to wake up one day and find you’ve lost some or all of your rankings. It’s time to bite the bullet and use those pages as PPC landing pages instead. There’s definitely some irony in the fact that those types of pages are welcome in Google if you’re willing to pay for each clickthrough you receive, but those are obvious moneymaker pages, and Google has a right to demand its cut.

Regionalized pages are another duplicate-content "spam" model that has been losing ground with the engines lately. These consist of hundreds of pages or sites selling the same basic thing, each targeted to a different city in the US. Unfortunately, there’s no easy answer for how to create high-quality pages that accomplish the same thing.

Suffice it to say that just about any content that is easily created without much human intervention (i.e., automated) is not a great candidate for organic SEO purposes.

Another duplicate-content issue that many are concerned about is the republishing of online articles. Reprinting someone’s article on your site is not going to cause a penalty. At best, your page with the article will show up in a search related to it; at worst, it won’t. No big deal either way.

If your own bylined articles are getting published elsewhere, that’s a good thing. There’s no need for you to provide a different version to other sites or to disallow republishing altogether. The more sites that host your article, the more chances you will have to build your credibility as well as to gain links back to your site through a short bio at the end of the article. If the site hosting your article shows up in the results instead of yours, so be it. There’s nothing wrong with that, as readers can easily click through to your site from your bio; the pros far outweigh the cons. In many cases, Google still shows numerous instances of articles in searches, but even if it eventually shows only one version, that’s still okay.

When it comes to duplicate content, the search engines are not penalizing you or thinking that you’re a spammer; they’re simply trying to show some variety in their search results pages.

Jill Whalen is owner of High Rankings, a search engine optimization firm founded in 1995. She speaks and writes regularly on SEO issues and also maintains the High Rankings Forums, where a community of over 10,000 members discusses SEO topics. The 100% Organic column appears Thursdays at Search Engine Land.

Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.

About The Author: Jill Whalen is a pioneer in SEO, beginning in the field in the early 1990s and founding High Rankings in 1995. If you enjoy Jill’s articles at Search Engine Land, be sure to subscribe to her High Rankings Advisor Search Marketing Newsletter for SEO articles, SEM advice and discounts on industry events and products.

  • http://searchengineland.com Danny Sullivan

    Over at SEOmoz, Rand recently wrote The Illustrated Guide to Duplicate Content in the Search Engines, which touches on some of the things here, as well. He covers things he’s heard from various search engines and tries to illustrate how in some cases, the original document should do fine.

  • http://www.VenturesWithoutCapital.com Bruce Judson

    Jill:

    Thanks for this very, very useful article. As you note, this has been a source of great confusion, particularly since so many services (whether correctly or incorrectly) advocate distributing your content among multiple Websites.

    I know your column focuses on organic search, but I just wanted to note that it is also getting increasingly hard to send “banished content” to Google for cost competitive PPC activity. On the PPC side, Google’s “Quality Score” is now designed to ensure that users clicking on advertising results also have a quality experience.

    Your readers may be interested in an article I wrote on the best references available for guiding advertisers to ensure their content yields high “Quality Scores” and low PPC costs. It’s available at Ventures Without Capital, and is titled “The Secret to Top Google Quality Scores: Comparing Google Slap Reports”.

    Thanks again for your really valuable insights.

  • http://www.paulzhao.com Paul Zhao

    How do you feel about sites that have unique content on all pages, but have some duplicated content?

    Example: A site with a small paragraph of their “legal terms” on all its pages.

    Thanks,
    Paul Zhao

  • hounddog

    Jill,

    (1) I agree that the terminology is problematic. Just as “penalty” has some wrong connotations, “filter” also seems inadequate to describe some of what we see. For example, “original” pages that are not eliminated, but are ranked below lower quality duplicates. Or original pages that show up only in the “omitted” results.

    It does seem that there’s something besides “eliminating duplicate content from the result set” going on. Maybe not a penalty, and maybe just the SEs doing a less-than-perfect job at filtering, but in any case, “filter” doesn’t seem to fully describe the observed results.

    (2) Saying that “If the site hosting your article shows up in the results instead of yours, so be it. There’s nothing wrong with that, as readers can easily click through to your site from your bio; the pros far outweigh the cons.” may be accurate for some scenarios. But if the page in question is being monetized by AdSense (or other advertising/affiliate links), you surely must account for the risk that the other instances of your article will outrank yours or cause yours to be filtered, and may siphon off some of this advertising revenue. I think it’s too broad a generalization to conclude that the pros outweigh the cons in all cases.

  • http://www.highrankings.com/ Jill

    Example: A site with a small paragraph of their “legal terms” on all its pages.

    It shouldn’t be a problem. If you’re concerned or worried, simply make the legal disclaimer into an image instead of real HTML text.

  • http://www.seo4fun.com/blog/2007/01/05/seo-myth-there-is-no-duplicate-content-penalty.html Halfdeck

    Completely disagree.

    http://www.seo4fun.com/blog/2007/01/05/seo-myth-there-is-no-duplicate-content-penalty.html

    “Duplicate content penalty is a myth” is the real myth.

    Adam Lasnik says:

    “As I noted in the original post, penalties in the context of duplicate content are rare.”

    The keyword there is rare.

  • Steve Amundsen

    Good Comments, Jill. You are very correct that duplicate content is not really a penalty. What is a de facto “penalty” is poor or little original content. Besides, does anyone really believe that Google cannot discern which site has the original content versus duplicate or copied content? Original content is king. Long live the king.

  • http://www.demib.dk Mikkel deMib Svendsen

    To a site owner it doesn’t really matter if you call it a “penalty” or a “filter” – getting pages removed or not ranked well at all feels the same. It hurts.

    And the fact is, if you have a website with real duplicate or identical content it won’t perform as well as a “clean” website with a “one dimensional” architecture.

    The main problem with duplicate content filtering is that you leave it up to the engines to decide what to keep and what to filter out. NEVER leave it to the engines to figure out your site! In my experience they NEVER make the choices you would have.

    Also, by having duplicate content you risk wasting your links. If each unique page you have can be found at several URLs on your site, some links may go to versions of the page that are being filtered out. Not good!

    The bottom line is that there is absolutely NO reason not to create a good site architecture and avoid duplicate content. Not just for the sake of engines but for the users as well.

  • http://www.globalwarming-awareness2007online.com NunoH

    A myth?
    Remember this?

    Quality guidelines – specific guidelines

    * Avoid hidden text or hidden links.
    * Don’t employ cloaking or sneaky redirects.
    * Don’t send automated queries to Google.
    * Don’t load pages with irrelevant words.
    * Don’t create multiple pages, subdomains, or domains with substantially duplicate content.

    Don’t pretend to be a bigger expert than the ones who actually make the technology.

    http://www.google.com/support/webmasters/bin/answer.py?answer=35769

  • http://www.ppcdiscussions.com jeremy mayes

    “It’s time to bite the bullet and use them as PPC landing pages instead.”

    Was that supposed to be a joke?

  • Jill

    [quote]The bottom line is that there is absolutely NO reason not to create a good site architecture and avoid duplicate content. Not just for the sake of engines but for the users as well.[/quote]

    Mikkel, absolutely, positively agree!

  • Fridaynite

    Think about the 50 or 60 Google patents on duplicate content. Maybe there are no penalties, but I am sure that there are filters which send your duplicate sites down to place 950.

  • http://www.neoranking.com neoranking

    Has anyone considered the impact of having good-quality content articles that are duplicated from another source on a single domain which has strong links associated with it? Would such a site be penalized? I think if one focuses their effort on adding value to his/her visitors instead of just pleasing the algorithm alone, any site will do just fine in the long run.

    In other words, if your website contains a small collection of content duplicated from another source but yet adds value to the visitors to your website, there is no need to worry about any penalty.

    From personal experience: my own content was copied on another website. Not all of it, but almost 80% of the content was taken, and the other site was doing much better in terms of traffic while the original site was penalized. I know because I created the scenario, and the second site, filled with duplicated content, was able to rank better purely because it managed to attract some good links.

    There is really no end to this, so just focus on serving your site visitors better.

  • http://sample-as-that.blogspot.com ciaran

    Sorry Jill – I just don’t agree (well, not entirely)

    I know from your email newsletters (which I usually enjoy) that you enjoy a good myth debunk, but this is a myth which turns out to be true.

    Search engines DO penalise (not just filter) for duplicate content – such as when domains aren’t handled correctly (not necessarily as a spamming technique, but simply due to human error) and you end up with multiple versions of the same content. I’m saying this from first-hand experience.

    And saying that it’s not a big thing if someone who reposts your article ranks higher than you seems a very lax attitude to take.

    Sorry – but I think you got this one wrong.

  • http://www.wordtravels.com tomp

    I run a website which has just been penalised (or filtered) by Google, and Google traffic has gone down by 90% – we’re losing around 30,000 page views per day as a result. Our pages are still in Google but way down the listings, and the top results pages include a few really bad computer-generated sites. We are a small publisher and write all our own good-quality content. The content is well-regarded and is not deliberately optimised for search engines or packed with unnecessary keywords. Lots of good sites (including Google) link to our site, and most pages have a high PageRank.

    I think the problem may be that we syndicate the content to other sites, who use it to add value to their sites; some take XML and integrate it themselves, and around 20 others get a white-label version of our site. We also have a certain amount of duplicate content on our own site – snippets of reviews link to a full version on another page – which seems reasonable to me, but this was a recent change, as some pages were getting too long. And there are standard disclaimers etc. on every page.

    There are still a few searches for which we are still No. 1, or on the first page, but only a fraction of what was the case before.

    Do you have any advice for me and specifically for sites which syndicate content?

  • http://www.semreportcard.com/yahoos-trademark-policy-recklessly-abuses-brands-online/ semreportcard

    Let me throw in my two cents on a related issue: duplicate, or “mirror”, sites.

    The search engines do penalize duplicate sites (content) through the use of filters–Yahoo much more so than Google.

    The majority of my clients are network marketing brands, aka MLM, aka direct selling companies.

    What many of these companies have done is use software to generate replicated (duplicate) websites for new reps on either sub-domains or sub-folders. Unfortunately, too many of them do this on their corporate domain, and the results are as follows:

    Google: Google dramatically stifles the rankings of these pages, even the originally published site, but still indexes them and displays them in the results. The effects of the filters are overcome through the acquisition of many quality links. As a result, the index page of the corporate domain tends to prevail as the “official home page.” The threshold point where the filters kick in on Google tends to be a couple hundred duplicate pages.

    Yahoo: Yahoo can be very unforgiving to duplicate content pages. I have seen some clients’ web pages (including the home page) rank outside of the top 1,000 results for publishing a hundred replicated (duplicate) sites. The only exception here seems to be when you type the name of the company with their domain extension, for example, “Company.com.” This search in Yahoo tends to rank the corporate site #1 across the board.

 
