Googlebot Makes An Appearance In Web Analytics Reports

A few days ago, I noticed some strange Google Analytics data: Googlebot appeared as a browser in the reports. Although this might sound like a minor detail when it comes to SEO, it is a major change in the Web Analytics field. As Avinash Kaushik and I wrote in the SEMJ journal article Web Analytics 2.0: Empowering Customer Centricity, an important advantage of all JavaScript-based solutions (Google Analytics, Omniture, Yahoo Web Analytics…) is:

The JavaScript is not read by crawlers, which generates high amounts of traffic and are not representative of customers’ behavior. Crawlers can be excluded from the analysis; however, it is a time consuming task, and many of them are not recognizable.

To check whether this bot is really from Google, and not some kind of user agent switcher, I drilled down on the data and here is what I found.

Googlebot appears in Google Analytics reports

First of all, as we can see below, the Googlebot is recognized as a browser (version 2.1):

Googlebot Browser on Google Analytics

Second, when we drill down to the network location report we find the following:

Googlebot Network properties

How does it affect the data?

If we look at the behavior of this bot, we see a very low time on site, very low pages/visit, and very high percentage of new visits. This might be due to the fact that the bot does not fetch cookies, which is essential to accurate analytics tracking. Below are some numbers:

Googlebot Behavior

Statistically speaking, this means that the Googlebot is an outlier: a data point that lies outside the overall pattern of a distribution, and one that can therefore distort the numbers. In the example above, just a few visits with very low time on site and a very high percentage of new visits can significantly distort the overall average time on site and percentage of new visitors, which is clearly bad for someone looking at the overall behavior of visitors.
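To make the outlier effect concrete, here is a minimal sketch in Python. The visit durations are invented for illustration and do not come from the screenshots; the function name is also hypothetical.

```python
from statistics import mean

# Hypothetical time-on-site values in seconds (illustrative only,
# not taken from the report shown above).
human_time_on_site = [180, 240, 200, 220, 160]
bot_time_on_site = [0, 0]  # a bot that leaves immediately


def average_with_and_without_bot(human, bot):
    """Return (average over human visits only, average including bot visits)."""
    return mean(human), mean(human + bot)


humans_only, with_bot = average_with_and_without_bot(
    human_time_on_site, bot_time_on_site
)
print(f"Humans only: {humans_only:.1f}s, including bot: {with_bot:.1f}s")
```

With these made-up numbers, two zero-second visits among seven are enough to pull the average time on site down by more than a quarter.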

How to exclude Googlebots from your Google Analytics data

Here is a filter that can be applied to Google Analytics profiles to exclude this Googlebot from messing with your data.

Exclude Googlebot Filter on Google Analytics
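The exact filter settings are only visible in the screenshot, so the following is a sketch of the same idea applied to exported visit data rather than a reproduction of the filter itself. It assumes each visit is a dictionary with a "browser" field and that the bot can be matched with the pattern "googlebot"; both are assumptions for illustration.

```python
import re

# Assumed pattern: any browser name containing "googlebot", case-insensitive.
GOOGLEBOT_PATTERN = re.compile(r"googlebot", re.IGNORECASE)


def exclude_googlebot(visits):
    """Return only the visits whose browser field does not match the pattern."""
    return [
        visit
        for visit in visits
        if not GOOGLEBOT_PATTERN.search(visit.get("browser", ""))
    ]


# Hypothetical exported rows.
sample_visits = [
    {"browser": "Firefox", "time_on_site": 120},
    {"browser": "Googlebot", "time_on_site": 0},
    {"browser": "Internet Explorer", "time_on_site": 95},
]
clean_visits = exclude_googlebot(sample_visits)
```

Matching on the browser name alone will also drop any other client that merely calls itself Googlebot, which is usually the desired outcome for reporting purposes.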

What lies ahead?

Google has been officially scanning JavaScript since 2008. So maybe this has been a low-priority or low-usage technique until now, used only in very specific cases. But recently we have seen an increase in this practice, so the big question is whether this is a trend that will grow over time or just a few specific tests run by Google. Editor’s note: Google declined to comment when asked for more information.

For now, we can only hope that this kind of data is not being collected by analytics packages through the back door. If it has been, it might have been skewing the data quite a bit, given Googlebot’s atypical time on site and percentage of new visits.

Disclosure: The data used on the screenshots above was extracted from the Web Analytics Association website. If you would like to take a look at this data, it is currently available to all members as part of the Web Analytics Championship.

Postscript: Google Analytics posted a response in the comments:

“The official Google bot does not execute Google Analytics JavaScript. We’re not sure what it is exactly, it could be anyone’s bot, some intern’s experiment, or other such traffic.”

I agree with this comment in that the official Googlebot reads JavaScript but does not execute it. Besides, it does not store or send cookies, which means that Pages/Visit would be exactly 1 and time on site exactly 0. Lastly, if the official Googlebot did execute JavaScript, we would have seen massive amounts of visits.
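The cookie argument can be sketched as a toy model in Python. This is a simplification, not how any analytics package is actually implemented: it assumes hits are grouped into visits purely by a cookie ID, that a hit arriving without a cookie starts a brand-new visit, and that time on site is the gap between the first and last hit of a visit. All names are hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Hit:
    timestamp: float           # seconds since some arbitrary start
    cookie_id: Optional[str]   # None when the client never returns a cookie


def summarize_visits(hits):
    """Return (number of visits, pages per visit, average time on site)."""
    fresh_ids = count(1)
    visits = {}
    for hit in hits:
        # Without a cookie, every hit looks like a new visitor.
        visitor = hit.cookie_id or f"new-{next(fresh_ids)}"
        visits.setdefault(visitor, []).append(hit.timestamp)
    total = len(visits)
    pages_per_visit = sum(len(times) for times in visits.values()) / total
    avg_time = sum(max(times) - min(times) for times in visits.values()) / total
    return total, pages_per_visit, avg_time


# A cookieless client requesting three pages versus a browser that keeps its cookie.
bot_summary = summarize_visits([Hit(0, None), Hit(10, None), Hit(20, None)])
human_summary = summarize_visits([Hit(0, "abc"), Hit(30, "abc"), Hit(90, "abc")])
```

Under these assumptions the cookieless client produces three separate one-page visits with zero time on site, while the same three hits from a cookie-keeping browser collapse into a single visit.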

It is also important to note that although we used Google Analytics as an example, we mean all JavaScript-based solutions, including Omniture, Yahoo Web Analytics, WebTrends and others.

Please note that this issue requires additional investigation, both with regard to Google Analytics and to how Google Search uses the Googlebot.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About the author

Daniel Waisberg
Contributor
Daniel Waisberg has been an advocate at Google since 2013. He worked in the analytics team for six years, focusing on data analysis and visualization best practices; he is now part of the search relations team, where he's focused on Google Search Console. Before joining Google, he worked as an analytics consultant and contributed to Search Engine Land & MarTech.
