The Importance Of XML Sitemaps In The Age Of Panda

Columnist Janet Driscoll Miller reminds us that in an age of content syndication, a well-maintained XML sitemap is key to establishing your site as the original source of your content.

In the early days of search engines, I wasn’t much of a believer in XML sitemaps. But over time, I began to see firsthand how they can benefit websites.

XML sitemaps serve as a way to communicate directly with the search engines, alerting them to new or changed content very quickly and helping to ensure that the content is indexed faster.

For content publishers, it’s become critical to help Google specifically understand if your site is the original publisher of content. Why? Panda.

Content Syndication, Duplicate Content & Panda

It’s not uncommon for publishers to syndicate their content on other websites. Further, it’s also not uncommon for publishers to have their site’s content “curated” by other websites without a formal syndication agreement.

Unfortunately, the definition of content curation is fuzzy at best. In a quick Google search for a recent Search Engine Land article, I found over 47 copies of the article on other sites. (Editor’s note: these are not authorized copies.)

For every publisher site offering syndicated content or having content curated by others (with or without permission), the stakes could not be higher with Google. The Panda algorithm update focused in part on removing duplicate content from search engine results pages — meaning that if a site is not deemed the content originator, it’s at risk of being excluded from the results altogether.

XML sitemaps are just one tool that can help content creators establish their stake as the content originator.

Just how profound can XML sitemaps be for indicating content origination?

In theory, the content originator would likely have the earliest indexed timestamp for the content. But take this example, from a publisher that is not using XML sitemaps, into consideration. The curating or syndicating site is having the same content indexed nearly 40 minutes earlier than the original content:

[Screenshot: search result for the original content]

[Screenshot: search result for the same content on the curating site]

How To Get Started

So, how should you get started? First, you’ll need to create an XML sitemap for your site. Some content management systems (CMS) have an integrated capability to auto-generate XML sitemaps. For WordPress users, I recommend using the Yoast SEO plugin, as WordPress does not have built-in sitemap generation capability. (If you are already using Yoast for SEO, make sure you have updated to the most recent version.)

Ideally, you’ll want to use a plugin for your CMS (or innate CMS functionality) to create a sitemap, because these tools will normally update your sitemap automatically as content is added or changed. If you don’t use a CMS, you can still create an XML sitemap with a tool like xml-sitemaps.com, but you’ll need to update the sitemap manually on a regular basis to ensure that its information is correct and up to date.
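If you maintain your sitemap by hand or with a script, it helps to see how little is in the file. The following is a minimal sketch, not a replacement for a CMS plugin: it builds a sitemap with Python's standard library, using the urlset, url, loc and lastmod elements defined at sitemaps.org. The function name build_sitemap and the example.com pages are assumptions made for illustration.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # <lastmod> records when the page last changed, as YYYY-MM-DD.
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages; a real script would read these from your site.
    pages = [
        ("https://www.example.com/", date(2015, 1, 5)),
        ("https://www.example.com/articles/xml-sitemaps", date(2015, 1, 12)),
    ]
    print(build_sitemap(pages))
```

Re-running a script like this whenever content is published or edited keeps the lastmod values current, which is the part a plugin would otherwise handle for you.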

If you have a particularly large website, you may also need to employ a sitemap index. A single sitemap file can list no more than 50,000 URLs, so if your site has more than 50,000 URLs, you’ll need to split them across several sitemaps and use an index to tie those sitemaps together. You can learn how to create indices and more about sitemaps at sitemaps.org.
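To make the idea of an index concrete, here is a rough sketch of the splitting step. It divides a list of URLs into groups of at most 50,000 and writes a sitemap index that points at one sitemap file per group, using the sitemapindex, sitemap and loc elements from sitemaps.org. The helper names and the sitemap-1.xml style file names are assumptions, not a required convention.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML that lists each child sitemap's URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical site with 120,000 pages.
    urls = [f"https://www.example.com/page/{n}" for n in range(120_000)]
    chunks = chunk_urls(urls)
    # Hypothetical file naming: one child sitemap per chunk.
    children = [
        f"https://www.example.com/sitemap-{i}.xml"
        for i in range(1, len(chunks) + 1)
    ]
    print(build_sitemap_index(children))
```

Each group would then be written out as its own sitemap file, and only the index needs to be registered with the search engines.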

After you’ve created your sitemaps (and potentially sitemap indices), you’ll need to register them with the various search engines. Both Google and Bing encourage webmasters to register sitemaps and RSS feeds, through Google Webmaster Tools and Bing Webmaster Tools respectively.

Taking this step helps the search engines identify where your sitemap is — meaning that as soon as the sitemap is updated, the search engines can react faster to index the new content. Also, content curators or syndicators may be using your RSS feeds to automatically pull your content into their sites.

Registering your sitemap (or RSS feed) with Google and Bing gives the search engines a signal that your content has been created or updated before they find it on the other sites. It’s really a very simple process with both engines. To submit a sitemap to Google:

  1. Ensure that the XML Sitemap is on your web server and accessible via its URL.
  2. Log in to Google Webmaster Tools.
  3. Under “Crawl,” choose “Sitemaps.”
  4. Click on the red button in the upper right marked “Add/Test Sitemap.” Enter the URL of the sitemap and click “Submit Sitemap.”

To register a sitemap with Bing:

  1. Ensure that the XML Sitemap is on your web server and accessible via its URL.
  2. Log in to Bing Webmaster Tools.
  3. Click on “Configure My Site” and “Sitemaps.”
  4. Enter the full URL of the sitemap in the “Submit a Sitemap” text box.
  5. Click “Submit.”

Another great reason to register sitemaps with Google specifically is to catch sitemap errors. Google Webmaster Tools reports the status of each sitemap and any errors it finds:

[Screenshot: Sitemaps report in Google Webmaster Tools]
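You can also catch the most basic problems yourself before submitting. The sketch below is a simple local check, assuming you already have the sitemap's contents as bytes; it does not reproduce the checks the search engines run. It only confirms that the file is well-formed XML, has the expected root element, stays within 50,000 URLs, and gives every entry an absolute URL. The function name check_sitemap is an assumption.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def check_sitemap(xml_bytes):
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    entries = root.findall(f"{{{SITEMAP_NS}}}url")
    if len(entries) > MAX_URLS_PER_SITEMAP:
        problems.append(f"too many URLs: {len(entries)}")

    for number, entry in enumerate(entries, start=1):
        loc = (entry.findtext(f"{{{SITEMAP_NS}}}loc") or "").strip()
        if not loc:
            problems.append(f"entry {number}: missing <loc>")
        elif not loc.startswith(("http://", "https://")):
            problems.append(f"entry {number}: <loc> is not an absolute URL")
    return problems


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>https://www.example.com/</loc></url>"
        b"<url><loc>/relative/path</loc></url>"
        b"</urlset>"
    )
    for problem in check_sitemap(sample):
        print(problem)
```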

For sites with multiple types of content, there are also additional sitemap types that can be used, including image, video and mobile sitemaps.
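As an illustration of one of these types, an image sitemap is an ordinary sitemap whose url entries also list the images on each page, using Google's image extension namespace. This is a minimal sketch with only the image:image and image:loc elements; the function name build_image_sitemap and the example.com URLs are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Namespace for Google's image sitemap extension.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize image elements with the conventional "image:" prefix.
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """Return sitemap XML; `pages` maps a page URL to its list of image URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical page with two images.
    pages = {
        "https://www.example.com/gallery": [
            "https://www.example.com/img/photo-1.jpg",
            "https://www.example.com/img/photo-2.jpg",
        ]
    }
    print(build_image_sitemap(pages))
```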


Opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.


About the author

Janet Driscoll Miller
Contributor
Janet Driscoll Miller is the President and CEO of Marketing Mojo and has been working in digital marketing for over twenty-five years. She is the author of "Data-First Marketing: How to Compete and Win in the Age of Analytics" and is a frequent speaker on digital marketing and data analytics. She specializes in technical SEO, digital marketing analysis and management, accurate attribution, and data analytics.
