Pro Tip: 3 important XML sitemap checks to improve your SEO

Evaluate which important URLs are missing, which need to be removed, and assess how Google indexes your XML sitemap URLs.


An XML sitemap is like a roadmap that shows search engines the URLs within your website. Checking it regularly is vital, both to prevent incorrect URLs from being crawled and potentially indexed and to keep important URLs from being missed.

Here are three checks that you should be making:

1. Are any important URLs missing?

The first step is to check that your key URLs are in there.

Your XML sitemap may be static, where it’s a snapshot of the website at the time it was created. If so, there’s a chance that it will be outdated. A dynamic sitemap is better as it automatically updates, but settings should be checked to ensure key sections/URLs are not excluded.

How to check: Compare URLs on a web crawl with URLs from your XML sitemaps. You can use crawlers like Screaming Frog, Deepcrawl or Sitebulb for this as they give the option to include the sitemap within a crawl.
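If you would rather script the comparison than rely on a crawler's built-in report, the idea can be sketched in a few lines of Python. This is a minimal sketch, not any tool's actual method: it assumes you already have the crawled URLs as a list (for example, from a crawler export) and the sitemap as a string, and the function names and example.com URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare(crawled, in_sitemap):
    """Compare crawled URLs with sitemap URLs using set differences."""
    crawled = set(crawled)
    return {
        # Found by following links, but absent from the sitemap.
        "missing_from_sitemap": sorted(crawled - in_sitemap),
        # Listed in the sitemap, but never reached by the crawl.
        "not_found_in_crawl": sorted(in_sitemap - crawled),
    }


# Hypothetical sitemap and crawl data, for illustration only.
example_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-page</loc></url>
</urlset>"""

example_crawl = ["https://example.com/", "https://example.com/new-page"]

report = compare(example_crawl, sitemap_urls(example_sitemap))
```

URLs under "missing_from_sitemap" are candidates to add. URLs under "not_found_in_crawl" may be orphaned and are worth a closer look in the next check.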


2. Do any URLs need to be removed?

Generally, avoid the following in your XML sitemaps:

  • 3xx / 4xx / 5xx URLs
  • Canonicalised URLs
  • URLs blocked by robots.txt
  • Noindexed URLs
  • Paginated URLs
  • Orphaned URLs

An XML sitemap should normally contain only indexable URLs that return a 200 response code and are linked within the website. Including the URL types listed above wastes crawl budget and can cause issues, such as orphaned URLs being indexed.


How to check: The same crawl used in the first check will also highlight problem URLs from the above list.
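As a rough illustration of how that list could be applied to crawl data, the sketch below flags reasons a sitemap URL might need removing. The field names (status, canonical, blocked_by_robots, noindex, inlinks) are assumptions, not any crawler's real export columns, so map them to whatever your tool provides. Paginated URLs are left out because detecting them depends on each site's URL pattern.

```python
def sitemap_problems(row):
    """Return the reasons a crawled URL may not belong in the sitemap.

    `row` is a dict with hypothetical keys: url, status, canonical,
    blocked_by_robots, noindex and inlinks.
    """
    problems = []
    if row["status"] != 200:
        problems.append(f"non-200 status ({row['status']})")
    canonical = row.get("canonical")
    if canonical and canonical != row["url"]:
        problems.append("canonicalised to another URL")
    if row.get("blocked_by_robots"):
        problems.append("blocked by robots.txt")
    if row.get("noindex"):
        problems.append("noindexed")
    if row.get("inlinks", 0) == 0:
        problems.append("orphaned (no internal links)")
    return problems


# Hypothetical crawl rows for URLs that appear in the sitemap.
rows = [
    {"url": "https://example.com/a", "status": 200,
     "canonical": "https://example.com/a", "inlinks": 3},
    {"url": "https://example.com/b", "status": 301, "inlinks": 0},
    {"url": "https://example.com/c", "status": 200, "noindex": True,
     "inlinks": 5},
]

# Keep only the URLs that have at least one problem.
flagged = {row["url"]: sitemap_problems(row) for row in rows}
flagged = {url: reasons for url, reasons in flagged.items() if reasons}
```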

3. Has Google indexed all my XML sitemap URLs?

To get a better idea of which URLs are indexed, submit your sitemap in Search Console. Go to Index > Sitemaps, select your sitemap and click See Index Coverage to view the Coverage report.


The “Errors” section highlights issues such as 404 URLs. The “Excluded” section shows the reasons why other URLs are not indexed, such as:

  • Duplicate, submitted URL not selected as canonical
  • Crawled – currently not indexed
  • Discovered – currently not indexed

URLs listed under these reasons can point to thin or duplicate content, poorly linked or orphaned URLs, or a problem accessing them.


Use URL Inspection to test the live URL. If the live test shows no problem accessing the page, that is a good indication that the quality and internal linking of these pages should be reviewed.


For larger websites, splitting URLs into smaller/child sitemaps and submitting them individually gives you a more focused Coverage report, helping you to better understand and prioritize issues.
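One way to produce child sitemaps is to group URLs by site section. The sketch below is only one possible approach: it assumes that grouping by the first path segment suits your site, and the function names and example.com URLs are hypothetical. It builds the XML as strings and leaves writing and hosting the files to you.

```python
from collections import defaultdict
from urllib.parse import urlsplit
from xml.sax.saxutils import escape

# Namespace defined by the sitemaps.org protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def group_by_section(urls):
    """Group URLs by first path segment (an assumed splitting rule)."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        groups[segments[0] if segments else "root"].append(url)
    return dict(groups)


def urlset_xml(urls):
    """Build one child sitemap; escape() handles '&' in query strings."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return f'<urlset xmlns="{NS}">\n{entries}\n</urlset>'


def sitemap_index_xml(child_sitemap_urls):
    """Build a sitemap index that points at each child sitemap."""
    entries = "\n".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>"
        for u in child_sitemap_urls
    )
    return f'<sitemapindex xmlns="{NS}">\n{entries}\n</sitemapindex>'


# Hypothetical URLs, for illustration only.
urls = [
    "https://example.com/",
    "https://example.com/blog/post-a",
    "https://example.com/blog/post-b",
    "https://example.com/products/widget?colour=red&size=m",
]

sections = group_by_section(urls)
children = {name: urlset_xml(group) for name, group in sections.items()}
index = sitemap_index_xml(
    f"https://example.com/sitemap-{name}.xml" for name in sections
)
```

Each child sitemap can then be submitted on its own in Search Console, so the Coverage report for, say, the blog section is not mixed with product pages.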

Pro Tip is a special feature for SEOs in our community to share a specific tactic others can use to elevate their performance. You can submit your own here.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About the author

Michelle Race
Contributor
Michelle Race is Head of Technical at Ricemedia, a digital marketing agency based in Birmingham that has been helping businesses succeed online for over 19 years. She is experienced in performing complex Technical Audits, absolutely loves Search Console and is passionate about helping companies diagnose and solve their Technical SEO issues.
