One thing that has plagued the SEO industry for years has been a lack of consistency when it comes to SEO terms and definitions. One of the most prevalent misnomers being bandied about is the phrase "duplicate content penalty." I’m here to tell you that there is no such thing as a search engine penalty for duplicate content. At least not the way many people believe there is.
Don’t get me wrong; I’m not saying that the search engines like and appreciate duplicate content — they don’t. But they don’t specifically penalize websites that happen to have some duplicate content.
Duplicate content has been and always will be a natural part of the Web. It’s nothing to be afraid of. If your site has some dupe content for whatever reason, you don’t have to lose sleep every night worrying about the wrath of the Google gods. They’re not going to shoot lightning bolts at your site from the sky, nor are they going to banish your entire website from ever showing up when someone searches for what you offer. The duplicate content probably won’t show up in searches, but that’s not the same thing as a penalty.
Let me explain.
The search engines want to index and show to their users (the searchers) as much unique content as algorithmically possible. That’s their job, and they do it quite well considering what they have to work with: spammers using invisible or irrelevant content, technically challenged websites that crawlers can’t easily find, copycat scraper sites that exist only to obtain AdSense clicks, and a whole host of other such nonsense.
There’s no doubt that duplicate content is a problem for search engines. If a searcher is looking for a particular type of product or service and is presented with pages and pages of results that provide the same basic information, then the engine has failed to do its job properly. To give their users a variety of information for a given query, search engines have created duplicate content "filters" (not penalties) that attempt to weed out pages repeating information they have already indexed. Certainly, if your page is one of those that gets filtered, it may very well feel like a penalty to you, but it’s not; it’s a filter.
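To make the filter idea concrete, here is a toy sketch in Python. It is only an illustration under stated assumptions, not a description of how any real search engine works: the word-shingle comparison, the 0.5 threshold, the function names, and the sample URLs are all hypothetical choices made for this example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two pages have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_duplicates(pages, threshold=0.5):
    """Split ranked (url, text) pairs into shown and filtered URLs.

    The first page of any near-duplicate group is shown; later pages that
    look too similar to a shown page are left out of the results. Nothing
    is removed from the site and no other page is affected.
    """
    shown, filtered, kept_shingles = [], [], []
    for url, text in pages:
        current = shingles(text)
        if any(jaccard(current, kept) >= threshold for kept in kept_shingles):
            filtered.append(url)  # hypothetical threshold decides "too similar"
        else:
            kept_shingles.append(current)
            shown.append(url)
    return shown, filtered


# Hypothetical pages, listed in the order they would otherwise rank.
pages = [
    ("/red-widgets", "Our red widgets are built to last and ship free anywhere in the US"),
    ("/blue-widgets", "Our blue widgets are built to last and ship free anywhere in the US"),
    ("/gadget-guide", "A guide to choosing the right gadget for your workshop"),
]
shown, filtered = filter_duplicates(pages)
```

In this sketch the second page is simply left out of the results because it repeats the first almost word for word, while the unrelated third page still shows. Nothing happens to the site as a whole, which is the difference between a filter and a penalty.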
Search engine penalties are reserved for pages and sites that are purposely attempting to trick the search engines in one form or another. Penalties can be meted out algorithmically when obvious deceptions exist on a page, or they can be personally handed out by a search engineer who discovers an infraction through spam reports and other means. To many people’s surprise, penalties rarely happen to the average website. Most that receive a penalty know exactly what they did to deserve it.
Honestly, the search engines are not out to get you. Matt Cutts isn’t plotting new ways to take food off your table. If you have a page on your site that sells red widgets and another very similar page selling blue widgets, you aren’t going to find your site banished off the face of Google because of this. The worst thing that will happen is that only the red widget page may show up in the search results instead of both pages showing up.
On the other hand, if you’ve created a Mad Libs spam site (i.e., one that uses a pre-written template in which specific keyword phrases are swapped for others), the pages in question might get filtered out completely. Not so much because of their dupe content (although that’s part of it), but because they’re search engine spam: low-quality pages with little value to people, created solely for search engine rankings.
The bottom line is that the engines are actively seeking out lousy content and removing it from their main results. If this sounds like your site, don’t be surprised to wake up one day and find you’ve lost some or all of your rankings. It’s time to bite the bullet and use those pages as PPC landing pages instead. There’s definitely some irony in the fact that those types of pages are welcome in Google if you’re willing to pay for each clickthrough you receive, but those are obvious moneymaker pages, and Google has a right to demand its cut.
Regionalized pages are another duplicate-content "spam" model that has been losing ground with the engines lately. These consist of hundreds of pages or sites selling the same basic thing, each one targeted to a different city in the US. Unfortunately, there’s no easy answer to how to create high-quality pages that do the same thing.
Suffice it to say that just about any content that is easily created without much human intervention (i.e., automated) is not a great candidate for organic SEO purposes.
Another duplicate-content issue that many are concerned about is the republishing of online articles. Reprinting someone’s article on your site is not going to cause a penalty. At best, your page with the article will show up in a search related to it; at worst, it won’t. No big deal either way.
If your own bylined articles are getting published elsewhere, that’s a good thing. There’s no need for you to provide a different version to other sites or to not allow them to be republished at all. The more sites that host your article, the more chances you will have to build your credibility as well as to gain links back to your site through a short bio at the end of the article. If the site your article is hosted on shows up instead of yours, so be it. There’s nothing wrong with that, since readers can easily click through to your site from your bio; the pros far outweigh the cons. In many cases, Google still shows numerous instances of articles in searches, but even if it eventually shows only one version, that’s still okay.
When it comes to duplicate content, the search engines are not penalizing you or thinking that you’re a spammer; they’re simply trying to show some variety in their search results pages.
Jill Whalen is the owner of High Rankings, a search engine optimization firm founded in 1995. She speaks and writes regularly on SEO issues and also maintains the High Ranking Forums, where a community of over 10,000 members discusses SEO topics. The 100% Organic column appears Thursdays at Search Engine Land.
Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.