Google’s Matt Cutts: 25-30% Of The Web’s Content Is Duplicate Content & That’s Okay

Matt Cutts, Google’s head of search spam, posted a video today about duplicate content and its repercussions within Google’s search results.

Matt said that somewhere between 25% and 30% of the content on the web is duplicative. Of all the web pages and content across the internet, over one-quarter is repetitive or duplicative.

But Cutts says you don’t have to worry about it. Google doesn’t treat duplicate content as spam. It is true that Google only wants to show one of those pages in their search results, which may feel like a penalty if your content is not chosen — but it is not.

Google takes all the duplicates and groups them into a cluster. Then Google will show the best of the results in that cluster.
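To make the cluster-then-pick idea concrete, here is a minimal sketch in Python. It is an illustration only, not Google's method, which the video does not describe in detail. It assumes duplicates are exact matches after lowercasing and collapsing whitespace, and the `score` field, the function names and the example URLs are all hypothetical.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text):
    """Hash the text after lowercasing and collapsing whitespace (assumed normalization)."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_duplicates(pages):
    """Group pages whose normalized text is identical into clusters."""
    clusters = defaultdict(list)
    for page in pages:
        clusters[fingerprint(page["text"])].append(page)
    return list(clusters.values())


def pick_representatives(pages):
    """Return the URL of the highest-scoring page in each cluster."""
    return [
        max(cluster, key=lambda page: page["score"])["url"]
        for cluster in cluster_duplicates(pages)
    ]


# Hypothetical pages; "score" stands in for whatever quality signal decides "best".
pages = [
    {"url": "https://store-a.example/widget", "text": "A sturdy blue widget.", "score": 0.4},
    {"url": "https://maker.example/widget", "text": "a sturdy  blue widget.", "score": 0.9},
    {"url": "https://blog.example/review", "text": "My hands-on widget review.", "score": 0.5},
]
representatives = pick_representatives(pages)
```

In this toy data the two widget pages fall into one cluster and only the higher-scoring one is kept, while the review page stands alone. The page that is not shown has not been penalized; it simply was not the one picked from its cluster.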

Matt Cutts did say Google reserves the right to penalize a site that duplicates content excessively and in a manipulative manner. But overall, duplicate content is normal and not spam.

Here is the video:


About The Author: Barry is Search Engine Land's News Editor and owns RustyBrick, a NY-based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry's personal blog is named Cartoon Barry.

  • http://www.lastres0rt.com Rachel Keslensky – Last Res0rt

    I would suspect a nontrivial number of that duplicate content is “social network security” — i.e. you tweet something, that tweet gets posted to facebook or repeated on another service, etc.

    Now, how to keep the 9gag version of your favorite comic suppressed… that’s another issue.

  • http://www.thelaw.com/ Michael Wechsler

    As long as my content is in the 70-75%, that’s OK with me too. ;-)

  • http://www.clickfire.com/ Emory Rowland

    “…which may feel like a penalty if your content is not chosen — but it is not.”

    It’s comforting to know that my site doesn’t have a penalty. It’s just not chosen to come up.

  • http://creativerty.com/ Rob jH

    Good, as most of his advice videos are duplicate content, lol

  • Synthia Rose

    Well, I hope this doesn’t encourage more copying. And what does this mean for DRM violations. Will notices of those be ignored now?

  • Alex Polonsky

    Google should ban Yahoo Search since it’s 100% duplicate content from Bing.

  • Christian Noel

    I see what Cutts is saying. However, if it is dupe content on your site you should still go through the process of signaling to Google which version YOU would prefer to show in the SERP. Not Google. Leave as little in their hands as possible. Leaving it up to them to figure it out honestly is silly. Period.

  • http://www.homeshikari.com/ swathisharma925

    its a trap don’t click on it

  • http://melbourne.fortuneinnovations.com/ Steve Zissou

    Thanks for posting with great insight. I like it very much.

  • http://www.weboutsourcing-gateway.com/ Web Outsourcing Gateway

    We feel the same @synthiarose:disqus. After all, there may be no guarantee that any content is 100% unique, because at some point we may have quoted them from others. Also, we may have just added a new thought from an already existing topic, thus the need to cite the previous topic.

    But those who are just copying others’ contents with the thought that they could get away with it, we can just hope that they will get penalized, or at least not encouraged that what they were doing is acceptable.

  • Isaac

    Now I’m just confused…

  • http://www.thesofaandchair.co.uk/ Tom Goodwin

    There’s a clear distinction between scraper style duplication and hand written content curation which genuinely adds value to the web. Glad to see Matt Cutts offer some clarity on this. Focus on quality, it’s really just common sense

  • dubert11

    “Google takes all the duplicates and groups them into a cluster. Then Google will show the best of the results in that cluster.”

    “It is true that Google only wants to show one of those pages in their search results”

    So if you have the same product description as the manufacturer or amazon or a store with more authority than you, you’re out of the running if that’s all you have.

    So an interesting question is: what if you have the manufacturer’s description on the page, but also some unique content? Does that manufacturer description hurt you?

  • http://www.nathanielbailey.co.uk/ Nathaniel Bailey

    “you’re out of the running if that’s all you have”

    That’s where you need to pay attention to the part where Matt says “Google will show the best of the results”.

    In other words, you need to add value to your content. For example, if you have the same product description as Amazon and hundreds of other stores selling the same products, you need to have something on your description pages and site to make them better than the rest!

    An example of that might be product reviews (original reviews, not copy-pasted from Amazon, as that would defeat the point lol). Product videos are another great idea, as they mean your potential buyers can see your products in operation before ordering!

  • http://www.nathanielbailey.co.uk/ Nathaniel Bailey

    So Cutts has yet again told us nothing new… Just the same old ‘give your site something to better the competition’ which works for everything in his eyes lol

  • http://www.nathanielbailey.co.uk/ Nathaniel Bailey

    Having duplicate content like product descriptions and stealing someone else’s content or creating a duplicate site are totally different, and reporting such things will still have the same effect.

  • http://www.weblineindia.com/blog/twitter-in-talks-of-coming-out-with-nearby-feature/ Richard Boss

    Now I am totally confused. First, Matt Cutts said that duplicate content is not spammy unless and until it shows the same content to searchers for the same query. Now he says that the 25-30% of the web’s content that is duplicate is okay.

    In short, Google accepts duplicate content and decides which version users will find easiest to read. So readability must be good.

  • http://www.brickmarketing.com/ Nick Stamoulis

    I think Google understands that people share, quote, copy, re-post, etc. content all the time. That’s kind of the point of the web! It’s one thing to copy a small section of a post (and cite it) and another to completely steal a piece of content.

  • donthe

    This is nonsense. You will hurt your rankings by posting duplicate content.

    Why do we assume that Google’s head of webspam and its Public Relations director, Matt Cutts, is aware of every aspect of Google’s ranking algorithms? Especially as they change over 500 times a year.

  • dubert11

    Thank you for the reply, Nathaniel.

    I agree that unique, value adding content is a minimum requirement.

    What I’m hoping to find out is whether or not *also* including the manufacturers’ descriptions (along with that unique content) hurts you. It could be interpreted that by including the manufacturer’s description along with your unique content still results in you getting grouped in with the other sites that have those same descriptions, effectively penalizing you.

    If that’s the case, it would be better to remove those non-unique descriptions (which I like to include since they add value to the visitor) and limit the page to only the unique content. I hate to remove something valuable to the visitor just to try to get more traffic from Google.

    Has anyone done any testing around this?

    Does Google decide a page is unique/non-unique or are they able to decide part of a page is unique while another part of the page is non-unique? Maybe a different way to phrase the question is does google rank pages or parts of pages?

  • Art Enke

    Unintentional duplicate content (like tag pages) may not be treated as spam but if your content is getting truncated and not displaying in results, that is something to worry about. Am I not right?

  • http://www.brand.com/blog James R. Halloran

    Man, one confusing message after another! Is he talking about duplicate content as a whole all across the Web? Or just duplicate content on one site?

  • http://topspot-official.blogspot.com/ Daniel Benny Simanjuntak

    If you publish content on your site and the site is regularly indexed by Google, then all privileges to that content are yours. It is fairly obvious that “Good Sites” never try to copy others’ content.

  • Scott Davis

    Um… Amazon steals their product descriptions and photos from their users… Amazon is a content thief. Suppose it’s a good thing that ‘Google Shopping’ is now all paid listings.

  • http://www.nathanielbailey.co.uk/ Nathaniel Bailey

    No not really because you give Amazon your content feed, so you are giving them permission to use your content.

    Content theft is when someone uses your content without consent!

  • http://www.nathanielbailey.co.uk/ Nathaniel Bailey

    It’s always best to use your own unique content but if you must use the standard product descriptions from manufacturers I would advise using it after your own content.

    And try to outweigh the manufacturer’s content with your own, i.e. if you have 300 words of manufacturer’s content, try to use double that of your own content (so 600 words or more).

    I’m not quite sure how one would test this, but I do know our clients’ products with unique content perform much better than those with manufacturer’s content.

    On a side note, Google have said it’s good to quote sources, so I shouldn’t see quoting/using parts of manufacturer’s content alongside your own…

  • Scott Davis

    Amazon takes your product description, product pricing, images and everything else and if your product actually starts to sell well, Amazon then takes the buy box from you and sells it themselves. I’m quite familiar with dealing with Amazon, unfortunately. And Amazon requires you to give them permission to use your product info and images if you sell through them. I suppose my point should have been, “don’t use Amazon” sell through ebay, craigslist or someone else (like google shopping).

  • Rian

    This is a bunch of garbage! Isn’t this what Panda was about? Checking if you have quality content! Obviously if a lot of your content is duplicate it won’t be considered high quality.

  • http://www.nathanielbailey.co.uk/ Nathaniel Bailey

    Or simply create a different feed for Amazon!?

  • Jon Ryan

    If it’s a toss-up between your page and another page getting chosen, I could almost guarantee you that the one with a Google ad will be chosen, because it is revenue for Google. If I owned Google, that’s what I would do.

  • kamal

    Don’t believe Matt Cutts’s words. Whatever Matt tells us is wrong every time. When someone asked him about a PR update, he replied that it was not possible that PR would be updated this year, but PR was updated on Dec 6, 2013.

  • Nate Wheeler

    So in other words, “Copy content if you’d like, we just won’t show your copied content in search results; copy too much content and we won’t show your entire site in search results”.

  • Durant Imboden

    As usual, Matt Cutts’s message boils down to: “Use common sense, and don’t be stupid, greedy, or malicious.” Why should anyone find that controversial or offensive?

  • http://www.mixchatroom.co.uk/ usmansarwer

    Great explanation, sir, but I have one question for you. I want to start an SMS site, and I have seen duplicate content on every such site. If everyone has the same content, will it affect my site when I start link building, or is it OK? Please reply, thanks.

  • http://www.seo.kirbyworks.net/ Kirby Hopper

    I don’t think readability is the main criterion, but rather relevance. Google will decide which of all the duplicate content is the most relevant to the search query and then decide how it stacks up against all the competition in terms of relevancy and authority.

  • John

    I’ve got a question. I run a nationwide business that provides a valuable service to customers in just about every city in the U.S. We contract out to service providers who operate within each city. The value we provide as “middlemen” is a layer of technology and customer service that is unprecedented in the industry and is something our contracted service providers neither can nor want to provide. They love us, our customers love us; it’s a win for everyone involved.

    The problem we run into is that we have landing pages for each city we service across the country. As you can imagine, this yields a fair amount of duplicate content because a) the industry is not content driven: People don’t really want to read articles about it and b) the service is the same in every city.

    Now, at the end of this video Mr. Cutts says “…unless you’re creating a page for every city or state in the country…” which is a bit unnerving since our business is heavily reliant on organic SEO traffic from Google.

    So the question is: What’s the “right” thing to do here? I don’t exaggerate when I say every single one of our competitors has a similar site structure (a mostly duplicate page keyword targeted for every city/state) and they all rank well and have done so for years. How can I “legally” (in the eyes of Google) SEO my site in such a way that I can target search terms like ” in , ” for any given location across the country? I want to build a legitimate, long lasting business, but there’s not a chance in hell I could compete in the SERPs without geo targeted pages for the cities we service.

  • Pat Grady

    30% is low. 30% is low.

  • Dave

    That’s the irony of the Panda and Penguin updates. If you get hit by Panda or Penguin, then the Panda update will work in the opposite direction. Your original content will become the duplicate, and the one who has cut and pasted your content will show above yours.

    And then they will give this type of nonsense comment that the content is “not chosen” :(

  • Dave

    One day Matt says this http://searchengineland.com/googles-matt-cutts-stitching-content-is-bad-seo-quality-content-178904 and the other day http://searchengineland.com/googles-matt-cutts-25-30-of-the-webs-content-is-duplicate-content-thats-okay-180063 he just gives the scrapers hope that they can scrape content and get ranked, as they may be the chosen one.

    The scrapers are not only copying all of a website’s content but also the design of the website. If Google is going to treat a Panda/Penguin-hit site that has original content as being of a lower standard than a scraper site, then scrapers are always going to win.

    If anyone wants to copy content, look for a site that has been recently hit by Google updates, scrape the content and get ranked above the site hit by the update.

  • Unique SEO Tips

    Viewed the video thrice already. I’m a bit confused with his explanation.

  • http://www.miracl3.com/ Alxdavd

    I don’t know why Matt Cutts published this type of post after 4 Penguin releases and the Hummingbird release. Maybe it is a bit confusing, so I suggest to all my friends: please don’t copy content from anywhere. Anyone who does will surely be in trouble after the next Google update.

    Thanks
    Miracl3

  • http://remkovanderzwaag.nl/ Remko van der Zwaag

    I think the BIG question we all have is still not answered by Google. Okay, Google clusters stuff, which makes sense and we already knew that, but: are internal duplicates canonicalized to the page that is ranking? In other words: is page value automatically flowing from the duplicate pages to the ranking page (like it would when we take manual measures, e.g. rel=canonical) if the duplicate content occurs on your own website?

    From my own experience I’d say no, which is why duplicate content is still a big deal.

  • http://www.rankinstyle.com/ Jacques Bouchard

    I’ve never seen a Yahoo Search result show on Google. Have you?

  • http://www.mixchatroom.co.uk/ usmansarwer

    Hi sir, I have a question for you, but I didn’t get an answer, so I am repeating it. I want to make an SMS poetry website, and I have seen that many websites have almost the same content. If I copy data from different sites, will Google Panda or any other update affect my site? Please answer my question, thanks.

  • seoraviraj

    Actually, Matt Cutts knows that every person/business in the same vertical has similar thoughts. That’s why he is OK with 25-30% duplicate web content.

  • http://www.skiusainc.com/ Ehtesham_SKI USA Inc

    What I think about Google’s algorithm for content is that it may work like this: if I copied content from someone’s post and noted the reference I copied it from, then there is probably a good chance that the source post will show first in the SERP for the keyword, rather than the copied content.

  • http://www.turkcealtyaziliizle.com/ turkcealtyaziliizle.com

    I believe Matt, because the Turkish results show it. Turkish results have a lot of duplicate content. The same content is always on the first page.

  • http://kustcom.com/ Garratt Campton

    If you’ve been doing well for years, why are you worried? Cutts has said the same garbage thing for the last 6 years as well. It hasn’t affected you yet. This is simply a “yellow pages” you’re talking about, so I don’t see the problem. You will, however, need to SEO the heck out of it, as yellow pages is pretty darn bad at SEO.

  • Rhonda Holscher

    I just had an SEO analysis done of my site, and was told to create duplicate pages for geotagging. At the end of Matt’s video it says not to do this for every city in the United States. Okay, well, I don’t want to do that; believe me, I don’t want that much work. But my question is: How many times is it safe to duplicate the page for geotagging without being penalized?
