Do You Have Duplicate Content Issues Across Domains? Google Will Now Alert You

Today, Google Webmaster Tools launched a new message alert to let site owners know when a particular URL doesn’t appear in search results because Google sees it as a duplicate of a URL on a different domain. In the blog post announcing the feature and in an in-depth help topic, Google provides details on how it identifies clusters of duplicate content and chooses a “canonical” version from each cluster to display in search results.

“When we discover a group of pages with duplicate content, Google uses algorithms to select one representative URL for that content. A group of pages may contain URLs from the same site or from different sites.”

Google notes that when it chooses a representative URL from a different domain, it calls this “cross-domain URL selection”.
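
To make the idea of a duplicate cluster that spans domains concrete, here is a minimal Python sketch. It is not Google’s algorithm, which the announcement does not describe in detail. It assumes that “duplicate” simply means identical text after whitespace and case normalization, and the function names and example URLs are hypothetical.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def cluster_duplicates(pages):
    """Group URLs whose text is identical after whitespace/case normalization.

    `pages` maps URL -> page text. This exact-match rule is an assumption
    made for illustration; real duplicate detection is far more tolerant.
    """
    clusters = defaultdict(list)
    for url, body in pages.items():
        normalized = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        clusters[digest].append(url)
    # Only groups with more than one URL are duplicate clusters.
    return [sorted(urls) for urls in clusters.values() if len(urls) > 1]


def is_cross_domain(cluster):
    """Return True when the cluster contains URLs from more than one host."""
    hosts = {urlsplit(url).hostname for url in cluster}
    return len(hosts) > 1


# Hypothetical example data.
pages = {
    "http://example.com/widgets": "Blue widgets for sale.",
    "http://example.co.uk/widgets": "Blue  widgets for sale.\n",
    "http://example.com/about": "About our company.",
}
clusters = cluster_duplicates(pages)
```

A cluster for which `is_cross_domain` returns True is the kind of group in which a search engine could pick a representative URL from another domain.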

In cases where multiple URLs contain the same content (for instance, due to infrastructure configuration, optional parameters, syndication, or internationalization), many options exist for site owners to indicate to Google which version is canonical.

However, in some cases, the site owner doesn’t use these options to specify a preferred version or Google may select a different version than the site owner specifies.

This new feature alerts site owners when Google’s “algorithms select an external URL instead of one from their website”. Google says common reasons for this include:

  • Site owner-specified – if you’ve moved your domain or have implemented the rel=canonical attribute to indicate that a page on another domain is canonical, then this alert is simply confirmation that Google is indexing as you’ve specified.
  • Regional sites – if you have the same content on multiple regional sites (for instance, the same English content on a .com (for US), a .co.uk, and a .com.au), Google may cluster pages with identical content across sites and use relevance signals to determine which to display per query.
  • Incorrect canonicalization – in this case, a page may inadvertently use the rel=canonical attribute to specify a page on a different domain as canonical.
  • Misconfigured server – a hosting misconfiguration (which sometimes happens with shared hosting in particular) may cause two different domains to display the same content.
  • Hacked site – sites are sometimes hacked to point to other domains.
  • Scraped content – the blog says that “in rare situations”, Google may select a URL from a site that has scraped your content.
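
For the rel=canonical cases in the list above, a site owner could check a page’s own markup before waiting for an alert. The following is a rough Python sketch using only the standard library. The class and function names are hypothetical, it only inspects `<link rel="canonical">` elements in the HTML (not HTTP headers or redirects), and it treats any difference in host name as a different domain.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values; a missing value is None.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def cross_domain_canonical(page_url, html):
    """Return the canonical URL if it points to a different host, else None."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    # Resolve relative hrefs against the page's own URL before comparing.
    target = urljoin(page_url, finder.canonical)
    if urlsplit(target).hostname != urlsplit(page_url).hostname:
        return target
    return None
```

A non-None result is not necessarily a mistake: it is expected after a domain move or deliberate syndication. Note that this simple host comparison also flags `www.example.com` versus `example.com`, so a real check would probably need a list of hosts you consider your own.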

This alert is available within the message center, so you’ll only see it if your site has this issue. Google is currently only reporting on URLs from the Top Pages report. This feature offers great insight for site owners who otherwise would have no idea why a particular page doesn’t appear in search results. I’ll be posting a follow-up shortly with more details on some of these scenarios and what you can do if you get an alert.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

About The Author: Vanessa Fox is a Contributing Editor at Search Engine Land. She built Google Webmaster Central and went on to found software and consulting company Nine By Blue and create Blueprint Search Analytics, which she later sold. Her book, Marketing in the Age of Google (updated edition, May 2012), provides a foundation for incorporating search strategy into organizations of all levels. Follow her on Twitter at @vanessafox.

  • http://www.linkedin.com/in/hschachter Harris Schachter

    Should be a handy feature, both for confirmation of the canonical tags you set, and for issues which were previously unknown.

    A bit of confusion though, “Incorrect canonicalization – in this case, a page may inadvertently use the rel=canonical attribute to specify a page on a different page as canonical.”

    Did you mean “…to specify a page on a different domain…” ?

  • http://ninebyblue.com/ Vanessa Fox

    Heh, yes. Thanks.

  • http://pdpoints.com Owen

    Suppose we run a premium blog and make important posts sticky on the home page with their full content. Would that still count as duplicate content of what was originally posted at the post URL?

    Sorry if I’m wrong !! Just wanted to make sure to remove those sticky writings.
