What PPC Practitioners Should Know About Robots.txt Files
Search engines use a computer program known as a bot to crawl and index the Web. A robots.txt file is an instruction manual that tells a bot what can and cannot be crawled on your site.
An improperly configured robots.txt file can:
- Lower your quality scores
- Cause your ads not to be approved
- Lower your organic rankings
- Create a variety of other problems
Robots.txt files are usually discussed in terms of SEO. Because SEO and PPC should work together, this column examines what PPC users should know about robots.txt files so that they do not cause problems with either their paid search accounts or their organic rankings.
The AdWords Robot
Google uses a bot called “adsbot-Google” to crawl destination URLs for quality score purposes.
If the bot cannot crawl your page, the page will usually be treated as not relevant. Because Google isn’t being allowed to crawl the page, it cannot examine the page to determine whether it’s relevant.
Adsbot-Google interprets a robots.txt file differently from most other bots.
A global disallow is a rule addressed to every bot that says a page or file must not be crawled. Most bots that see a global disallow will not examine the page at all.
Adsbot-Google ignores global disallows. It assumes that you made a mistake: since you are buying traffic to the page and have not called out adsbot-Google specifically, it ignores the disallow and reads the page anyway.
However, if you call out the bot in your robots.txt file specifically, then adsbot-Google will follow the instructions.
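The difference between the two behaviors can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not Google's actual implementation: the helper names (standard_bot_allowed, groups_naming, adsbot_allowed) and the example.com URL are made up for this sketch, and the user-agent is written as AdsBot-Google, which robots.txt parsers match without regard to case.

```python
from urllib.robotparser import RobotFileParser

ADSBOT = "AdsBot-Google"


def standard_bot_allowed(robots_txt, agent, url):
    """Check a URL the way a conventional bot would, wildcard groups included."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def groups_naming(robots_txt, agent):
    """Return robots.txt text holding only the groups that name `agent` explicitly."""
    groups, agents, rules = [], [], []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() == "user-agent":
            if rules:  # a User-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif agents:
            rules.append(f"{field}: {value}")
    if agents:
        groups.append((agents, rules))

    kept = []
    for names, group_rules in groups:
        if agent.lower() in names:
            kept += [f"User-agent: {agent}", *group_rules, ""]
    return "\n".join(kept)


def adsbot_allowed(robots_txt, url):
    """Assumed model of the behavior described above: wildcard groups are ignored."""
    return standard_bot_allowed(groups_naming(robots_txt, ADSBOT), ADSBOT, url)


GLOBAL_BLOCK = "User-agent: *\nDisallow: /\n"
EXPLICIT_BLOCK = "User-agent: AdsBot-Google\nDisallow: /\n"
PAGE = "https://www.example.com/landing/offer.html"  # hypothetical landing page

print(standard_bot_allowed(GLOBAL_BLOCK, "Googlebot", PAGE))  # False: organic bot obeys the global disallow
print(adsbot_allowed(GLOBAL_BLOCK, PAGE))                     # True: global disallow is ignored
print(adsbot_allowed(EXPLICIT_BLOCK, PAGE))                   # False: the bot was called out by name
```

The last line is the case to watch for: once a group names the ads bot directly, the landing page is no longer crawled for quality score purposes.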
Usually, you don’t purposefully block adsbot-Google.
What does happen, though, is that IT or another department reviews bandwidth usage by robot and sees a bot they don’t recognize using a lot of bandwidth as it crawls your site. Since they don’t know what it is, they block the bot. This will cause a large drop in landing page quality scores.
The easiest way for non-techies to see this is with Google Webmaster Tools. You can create a Webmaster Tools account, and then see if your robots.txt file is blocking adsbot-Google from crawling your site.
In addition, Google Webmaster Tools will let you see crawl errors on your site. A problem that many larger PPC accounts run into is that they end up sending traffic to broken links as the site and URLs change over time.
You can also use a free spider to check for broken links in your AdWords account.
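If you would rather script a quick check than run a spider, a rough sketch follows. It assumes you have exported your destination URLs to a plain list. The function names (http_status, broken_urls) are illustrative, and some servers reject HEAD requests, so treat a flagged URL as something to re-check by hand rather than as a confirmed broken link.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status for url, or None if the request fails outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, and so on


def broken_urls(urls, status_of=http_status):
    """Return (url, status) pairs whose status is missing or 400 and above."""
    report = []
    for url in urls:
        status = status_of(url)
        if status is None or status >= 400:
            report.append((url, status))
    return report
```

Calling broken_urls(destination_urls) returns only the URLs that need attention. The status_of parameter exists so the check can be exercised without touching the network.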
The Microsoft AdCenter Robot
Microsoft also has a robot that is used for ad approval purposes. This robot is called “adidxbot” or “MSNPTC/1.0”.
This robot follows the standard robots.txt conventions. If you use a global disallow to block bots from crawling parts of your site, then this bot will not see those pages and you will have ad approval issues.
While Bing also has a Webmaster Center, it does not have a way to see if you are blocking their ads bot.
Testing Landing Pages & Causing Duplicate Content
Often with landing page testing, you create several versions of the same page with different layouts, buttons, headlines, and benefits.
However, much of the content is the same across all of the pages. If all of these pages are indexed by the robots involved in organic rankings, your organic rankings can suffer. Therefore, you want to make sure that your test pages are blocked from the bots that crawl for organic purposes, but can still be crawled for PPC purposes.
This is much easier in AdWords than in Microsoft’s AdCenter.
For testing landing pages in AdWords, you can simply put all your test pages in a single folder and then use a global disallow to block that folder. Since adsbot-Google ignores global disallows, it will crawl the page; however, the organic bots will obey the robots.txt file and not crawl your pages.
With AdCenter, you need to put the test pages in a folder, and then block all the standard bots except for “adidxbot” from crawling that folder.
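As a rough sketch, the two setups might look like the robots.txt contents below, checked here with Python's standard robots.txt parser. The folder name /landing-tests/ and the helper can_crawl are assumptions made for illustration. The parser follows the standard conventions, so it shows what adidxbot and the organic bots would do. It does not model adsbot-Google's habit of ignoring global disallows.

```python
from urllib.robotparser import RobotFileParser

# AdWords only: a global disallow on the (hypothetical) test folder is enough,
# because adsbot-Google is described above as ignoring global disallows.
ADWORDS_TEST_RULES = """\
User-agent: *
Disallow: /landing-tests/
"""

# AdCenter as well: give adidxbot its own group with an empty Disallow
# (meaning nothing is blocked), and keep every other bot out of the folder.
ADCENTER_TEST_RULES = """\
User-agent: adidxbot
Disallow:

User-agent: *
Disallow: /landing-tests/
"""


def can_crawl(robots_txt, agent, path):
    """Return True if a standards-following bot named `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


TEST_PAGE = "/landing-tests/variant-b.html"

print(can_crawl(ADWORDS_TEST_RULES, "adidxbot", TEST_PAGE))   # False: ad approval issues
print(can_crawl(ADCENTER_TEST_RULES, "adidxbot", TEST_PAGE))  # True: the ads bot can read the page
print(can_crawl(ADCENTER_TEST_RULES, "Googlebot", TEST_PAGE)) # False: organic bot stays out
print(can_crawl(ADCENTER_TEST_RULES, "Googlebot", "/products/"))  # True: rest of the site is open
```

The first result shows why the AdWords-only file is not enough for AdCenter: a bot that follows the standard conventions treats the global disallow as applying to itself.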
By adding one extra step to your testing process, blocking your test pages from the organic bots while keeping them accessible to the paid search bots, you can test landing pages without affecting your organic rankings.
Yet More Information
If you understand the basic concept of blocking the appropriate bots, yet need more help understanding how robots.txt files work, please see this excellent article on understanding robots.txt.
Over the past few years, I’ve seen many instances of SEOs messing up a company’s paid search program or the paid search team causing organic rankings to decline. These two programs are complementary to each other (see my last column on Should You Bid On A Keyword If You Rank Organically For That Term?) and can help each other out in many different ways.
At SMX East, I am putting together a brand new session on PPC & SEO: Can’t We All Just Get Along?, where Todd Friesen, Tim Mayer, and I will look at how these two programs can be complementary to each other and how to make them both work for you to increase your overall exposure. If you want to learn more about AdWords, I will be teaching an Advanced AdWords Course at the beginning of the conference.
SEO and PPC can help each other in many ways. They can also hurt each other if the two sides aren’t working together properly. The first step your PPC department can take in helping out your SEO department is to not damage their organic rankings with your testing. You need to test. Testing is essential for your account to improve.
However, taking a few extra minutes to ensure your robots.txt file is configured properly will help make sure your paid search landing pages are being crawled correctly while not causing organic penalties at the same time.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.