Google On How They Know When To Slow Or Stop Crawling Your Web Site




Today at SMX East, Google’s Webmaster Trends Analyst, Gary Illyes, shared with the audience two technical ways Google determines when GoogleBot, its crawler, should slow down or stop crawling your website.

One of the more important factors in SEO is ensuring that search engine crawlers can access your web pages. If they cannot access your web pages, you will most likely have a hard time ranking in the search results.

Google said that it uses many signals to determine if it should stop crawling your website, outside of the obvious ones, such as the disavow file, robots.txt and nofollow tags.
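For context on the robots.txt side of that, here is a minimal sketch of how a well-behaved crawler checks robots.txt before fetching a URL, using Python's standard library. The user agent string and URLs are placeholders for illustration, not anything Google has published:

```python
# Minimal sketch of a robots.txt check, using Python's standard library.
# The user agent and URLs below are hypothetical placeholders.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()  # fetch and parse the live robots.txt

# A polite crawler skips any URL the file disallows for its user agent.
if rp.can_fetch("ExampleBot", "https://www.example.com/private/page.html"):
    print("allowed to crawl")
else:
    print("robots.txt disallows this URL; skip it")
```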

Gary from Google said the following two signals are especially important to Google's crawling decisions:

Connect Time

Google will look at how long it takes to connect to the server and web page. If that connect time keeps getting longer, Google will back off and slow or stop crawling your web pages. Google doesn’t want to end up taking down your server, so it uses connect time as a crawling factor.
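Google has not published the exact algorithm, but the idea is easy to illustrate. The sketch below, with made-up thresholds and delays, times a raw TCP connection and widens the crawl delay as connections get slower:

```python
# Hypothetical sketch of connect-time-based crawl throttling.
# Thresholds and delays are invented for illustration only.
import socket
import time

def connect_time(host, port=443, timeout=10.0):
    """Return the seconds it takes to open a TCP connection."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        return time.monotonic() - start

def next_crawl_delay(host, base_delay=1.0):
    """Grow the delay between requests as the server slows down."""
    try:
        elapsed = connect_time(host)
    except OSError:
        return None  # could not connect at all: stop crawling for now
    if elapsed > 2.0:   # server is struggling: back way off
        return base_delay * 10
    if elapsed > 0.5:   # slower than usual: ease up
        return base_delay * 3
    return base_delay   # healthy: keep the normal pace

delay = next_crawl_delay("www.example.com")
print("stop crawling" if delay is None else f"wait {delay:.1f}s before next fetch")
```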

HTTP Status Codes

Google will also stop or slow its crawling when it receives server status codes in the 5xx range. The 5xx server error status codes generally mean the server is having trouble responding. You can find a full list of them on Wikipedia.

Google says when it sees these codes, it backs off, so as not to cause more issues for your server.
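A common way to implement that kind of back-off on the client side is an exponential delay on 5xx responses. The sketch below uses Python's standard library with invented retry limits and wait times; it illustrates the general pattern, not GoogleBot's actual behavior:

```python
# Hypothetical sketch of backing off on 5xx server errors.
# Retry counts and wait times are invented for illustration only.
import time
import urllib.error
import urllib.request

def polite_fetch(url, max_tries=4):
    """Fetch a URL, backing off exponentially when the server returns 5xx."""
    for attempt in range(max_tries):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if 500 <= err.code < 600:
                time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, 8s between tries
            else:
                raise  # 4xx and other errors are not a server-load signal
    return None  # server kept erroring: give up and come back later

polite_fetch("https://www.example.com/")
```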

In both cases, GoogleBot will come back later; it simply backs off when these two signals indicate trouble, so that its crawling doesn’t cause further issues for your users when they try to access your website.


About the author

Barry Schwartz
Staff
Barry Schwartz is a Contributing Editor to Search Engine Land and a member of the programming team for SMX events. He owns RustyBrick, a NY-based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry can be followed on Twitter.
