While most of the time we want search engine crawlers to grab and index as much content from our web sites as possible, there are situations where we want to prevent crawlers from accessing certain pages or parts of a web site. For example, you don’t want crawlers poking around on non-public parts of your web site. Nor do you want them trying to index scripts, utilities or other types of code. And finally, you may have duplicate content on your web site, and want to ensure that a crawler only gets one copy (the “canonical” version, in search engine parlance).
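As a rough sketch of the three cases above (non-public areas, scripts, and duplicate copies), the snippet below feeds a small robots.txt to Python's standard `urllib.robotparser` and asks which paths a well-behaved crawler may fetch. The directory names (`/private/`, `/cgi-bin/`, `/print/`) and the helper names are assumptions made up for illustration; they are not taken from the graphic or from any particular site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every path here is an illustrative assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
Disallow: /print/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, user_agent: str = "*") -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for path in (
        "/index.html",           # public page
        "/private/report.html",  # non-public area
        "/cgi-bin/form.pl",      # script
        "/print/article.html",   # duplicate (printer-friendly) copy
    ):
        verdict = "allowed" if is_allowed(parser, path) else "blocked"
        print(f"{path}: {verdict}")
```

Keep in mind that robots.txt is a request that compliant crawlers honor, not an access control; anything that must stay private still needs real authentication.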

Today’s Search Illustrated shows how you can use the “robots.txt” file as a “keep out” notice for search engine crawlers:

[Figure: robots_txt_explained_500w.gif]

Graphic by Elliance, an eMarketing firm specializing in results-driven search engine marketing, web site design, and outbound eMarketing campaigns. The firm is the creator of the ennect online marketing toolkit. The Search Illustrated column appears Tuesdays at Search Engine Land (and today only, on Wednesday… :-).

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.


Comments

4 Comments on Search Illustrated: Blocking Search Engines With Robots.txt

Daniel Waisberg,

Should I understand from this post that the only kinds of pages you recommend blocking from crawlers are private pages, scripts, and duplicate pages? Or are these just examples?

Do you also believe that internal search pages should be blocked? I understand that there is some controversy on the subject…

Thank you.
Daniel Waisberg



Danny Sullivan,

Yes, Daniel, I think that was the point of the illustration — to be a general guide of examples as to what someone might block rather than explicit instructions saying you must do this.

As for search pages, you are correct. Google warns that these should be blocked.



GolfVacationInsider.com,

Is it possible to block a portion of a page, or only the entire page?

Thanks in advance for any insight you can provide.



Danny Sullivan,

Yahoo supports blocking of parts of pages. See Yahoo Supports New Robots-Nocontent Tag To Block Indexing Within A Page for more about this.



 
