While most of the time we want search engine crawlers to grab and index as much content from our web sites as possible, there are situations where we want to prevent crawlers from accessing certain pages or parts of a web site. For example, you don’t want crawlers poking around on non-public parts of your web site. Nor do you want them trying to index scripts, utilities or other types of code. And finally, you may have duplicate content on your web site, and want to ensure that a crawler only gets one copy (the “canonical” version, in search engine parlance).
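As a rough sketch of how those three cases might be expressed, the snippet below embeds a small robots.txt file and checks it with Python's standard `urllib.robotparser` module. The directory names (`/private/`, `/cgi-bin/`, `/print/`), the `www.example.com` host, and the `ExampleBot` user agent are all hypothetical stand-ins, not paths or crawlers taken from any real site. Keep in mind that robots.txt is a request that well-behaved crawlers honor, not an access control.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt covering the three cases described above:
# non-public pages, scripts, and duplicate (printer-friendly) copies.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
Disallow: /print/
"""

SITE = "https://www.example.com"  # placeholder host, an assumption


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether a crawler calling itself "ExampleBot" may fetch each path.
for path in (
    "/articles/robots.html",        # ordinary public page
    "/private/report.html",         # non-public area
    "/cgi-bin/search.pl",           # script
    "/print/articles/robots.html",  # duplicate printer-friendly copy
):
    allowed = parser.can_fetch("ExampleBot", SITE + path)
    print(f"{path}: {'allowed' if allowed else 'blocked'}")
```

Because the rules sit under `User-agent: *`, they apply to any crawler that does not have a more specific section of its own. Each `Disallow` line is matched as a path prefix, so everything beneath the listed directory is covered.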
Today’s Search Illustrated shows how you can use the “robots.txt” file as a “keep out” notice for search engine crawlers:
Graphic by Elliance, an eMarketing firm specializing in results-driven search engine marketing, web site design, and outbound eMarketing campaigns. The firm is the creator of the ennect online marketing toolkit. The Search Illustrated column appears Tuesdays at Search Engine Land (and today only, on Wednesday… :-).
Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.