While most of the time we want search engine crawlers to grab and index as much content from our web sites as possible, there are situations where we want to prevent crawlers from accessing certain pages or parts of a web site. For example, you don’t want crawlers poking around on non-public parts of your web site. Nor do you want them trying to index scripts, utilities or other types of code. And finally, you may have duplicate content on your web site, and want to ensure that a crawler only gets one copy (the “canonical” version, in search engine parlance).
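The three cases above (non-public areas, scripts, and duplicate copies) can be sketched as a small robots.txt file and checked with Python's standard-library parser. This is a minimal sketch, not a recommended configuration: the directory names, the `ExampleBot` user agent, and the `www.example.com` host are all assumptions made up for illustration. Keep in mind that robots.txt is a request that well-behaved crawlers honor, not an access control mechanism.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. Every path below is an illustrative
# assumption, not taken from the article or from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
Disallow: /print/
"""

# Parse the rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules above let `agent` crawl `path`."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/articles/robots.html",        # public article
                 "/private/report.html",         # non-public area
                 "/cgi-bin/form.pl",             # script
                 "/print/articles/robots.html"): # duplicate (print) copy
        print(path, is_allowed(path))
```

Here `/print/` stands in for a duplicate, printer-friendly copy of each article, so a crawler that follows the rules only sees the canonical version.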

Today’s Search Illustrated shows how you can use the “robots.txt” file as a “keep out” notice for search engine crawlers:

[Image: robots_txt_explained_500w.gif]

Graphic by Elliance, an eMarketing firm specializing in results-driven search engine marketing, web site design, and outbound eMarketing campaigns. The firm is the creator of the ennect online marketing toolkit. The Search Illustrated column appears Tuesdays at Search Engine Land (and today only, on Wednesday… :-).

Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.

  • http://www.esnips.com/web/WebAnalyticsGraphs Daniel Waisberg

    Should I understand from this post that the only kinds of pages you recommend blocking from crawlers are private pages, scripts, and duplicate pages? Or are these just examples?

    Do you also believe that internal search pages should be blocked? I understand that there is some controversy on the subject…

    Thank you.
    Daniel Waisberg

  • http://searchengineland.com Danny Sullivan

    Yes, Daniel, I think that was the point of the illustration — to be a general guide of examples as to what someone might block rather than explicit instructions saying you must do this.

    As for search pages, you are correct. Google warns that these should be blocked.

  • http://www.golfvacationinsider.com GolfVacationInsider.com

    Is it possible to block a portion of a page, or only the entire page?

    Thanks in advance for any insight you can provide.

  • http://searchengineland.com Danny Sullivan

    Yahoo supports blocking of parts of pages. See Yahoo Supports New Robots-Nocontent Tag To Block Indexing Within A Page for more about this.

 
