Does Google respect the URL parameters tool?

E-commerce sites should not make assumptions about how Google crawls their parameters. Check the log files to confirm activity.


Anyone who works on an e-commerce site is probably familiar with the “URL Parameters tool.” This is a feature in Google Search Console that SEOs have long used to help control how their websites are crawled. In this tool, you tell Google what your different URL parameters do and how Google should crawl them (“Let Googlebot decide,” “No URLs,” etc.). Google has provided extensive documentation on the different settings that can be configured and how the crawl commands interact with each other.

However, Google recently moved this tool to the ambiguous “Legacy tools and reports” section. Ever since, I’ve wondered what that means for the tool. Is this just a way of categorizing an older feature? Does Google plan on sunsetting it eventually? Does Google even still use the commands here?

Something else I’ve found interesting: when reviewing client log files, we’ve encountered examples where Google didn’t appear to be abiding by the rules set in the URL Parameters tool.

To find out more, I decided to perform a test. I took one of our test sites and found URL parameters that Google was crawling. Using Google’s Index Coverage report, I was able to confirm Googlebot was crawling the following parameters:

?cat
?utm_source
?utm_medium
?utm_campaign
?ref

[Screenshot: Index Coverage report showing the crawled parameters]
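If you want to run a similar check against your own logs, here is a minimal Python sketch of the kind of log parsing involved. It assumes a combined-format access log named access.log and the parameter names listed above; swap both for your own setup and treat it as a starting point rather than a finished script.

```python
# Minimal sketch: count Googlebot requests per URL parameter in an access log.
# Assumes a combined-format log file named "access.log" (hypothetical) and the
# parameter names from the list above; adjust both to your own site.
import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

WATCHED_PARAMS = {"cat", "utm_source", "utm_medium", "utm_campaign", "ref"}
LINE_RE = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP/[^"]*".*"(?P<agent>[^"]*)"\s*$')

hits = Counter()
with open("access.log") as log:
    for line in log:
        match = LINE_RE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # skip lines that aren't Googlebot requests
        query = urlparse(match.group("path")).query
        for param in parse_qs(query, keep_blank_values=True):
            if param in WATCHED_PARAMS:
                hits[param] += 1

for param, count in hits.most_common():
    print(f"?{param}: {count} Googlebot requests")
```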

On June 26, I added these parameters to Google’s URL Parameters tool and instructed Googlebot to crawl “No URLs” for each of them.

[Screenshot: URL Parameters tool set to crawl “No URLs” for these parameters]

I then waited and monitored Google’s crawl of the site. After collecting a couple of weeks’ worth of data, we can see that Google was still crawling these URL parameters. The primary parameter we found activity on was “?cat” URLs:

[Screenshot: log file entries showing Googlebot requests for “?cat” URLs]

Zooming out a bit further, you can see that these are verified Googlebot events that occurred on June 27 or later, after the crawl settings had been configured: 

[Screenshot: verified Googlebot log entries dated June 27 and later]
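For anyone who wants to reproduce the “verified Googlebot” part of this, here is a rough sketch of the reverse-and-forward DNS check Google documents for verifying its crawler. The IP in the example call is just an illustrative address; use the IPs pulled from your own log files.

```python
# Rough sketch of verifying that a log entry really came from Googlebot:
# reverse-resolve the IP, check the hostname, then forward-resolve it again.
import socket

def is_verified_googlebot(ip: str) -> bool:
    try:
        host, _, _ = socket.gethostbyaddr(ip)  # reverse DNS lookup
    except OSError:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        resolved = socket.gethostbyname(host)  # forward-confirm the hostname
    except OSError:
        return False
    return resolved == ip

# Illustrative usage; replace the IP with one from your own logs.
print(is_verified_googlebot("66.249.66.1"))
```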

We were also able to confirm crawl activity on both “?cat” and “?utm” URLs using Google’s URL Inspection Tool. Notice how the URLs show “Last crawl” dates after the new rules went into place.

[Screenshots: URL Inspection Tool results showing last crawl dates for “?cat” and “?utm” URLs]

What does this mean for SEOs? 

While we’re not seeing overwhelming crawl activity, it is an indicator that Google might not always respect the rules in the URL Parameters tool. Keep in mind that this is a smaller site (around 600 pages), so the scale at which these URL parameters get crawled is much lower than on a large e-commerce site.

Of course, this isn’t to say that Google always ignores the URL Parameters tool. However, in this particular instance, we can see that it can be the case. If you run an e-commerce site, I’d recommend not making assumptions about how Google is crawling your parameters; check the log files to confirm crawl activity. Overall, if you’re looking to limit the crawl of a particular parameter, I’d rely on robots.txt first and foremost.
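For reference, here is a minimal, hypothetical robots.txt sketch that blocks crawling of one tracking parameter. The exact patterns depend on your own URL structure, so test patterns like these before relying on them.

```
# Hypothetical example: block crawling of URLs that carry the utm_source parameter
User-agent: *
Disallow: /*?*utm_source=
```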


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About the author

Chris Long
Contributor
Chris Long is the VP of marketing at Go Fish Digital. Chris works with unique problems and advanced search situations to help his clients improve organic traffic through a deep understanding of Google’s algorithm and Web technology. Chris is a contributor for Moz, Search Engine Land, and The Next Web. He is also a speaker at industry conferences such as SMX East and the State Of Search. You can connect with him on Twitter and LinkedIn.
