While most of the time we want search engine crawlers to grab and index as much content from our web sites as possible, there are situations where we want to prevent crawlers from accessing certain pages or parts of a web site. For example, you don’t want crawlers poking around on non-public parts of your web site. Nor do you want them trying to index scripts, utilities or other types of code. And finally, you may have duplicate content on your web site, and want to ensure that a crawler only gets one copy (the “canonical” version, in search engine parlance).

Today’s Search Illustrated shows how you can use the “robots.txt” file as a “keep out” notice for search engine crawlers:

[Image: robots_txt_explained_500w.gif]

Graphic by Elliance, an eMarketing firm specializing in results-driven search engine marketing, web site design, and outbound eMarketing campaigns. The firm is the creator of the ennect online marketing toolkit. The Search Illustrated column appears Tuesdays at Search Engine Land (and today only, on Wednesday… :-).
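For readers who want to try this out, here is a minimal sketch of a "keep out" notice covering the three cases mentioned at the top of the article: non-public areas, scripts, and duplicate copies of content. The paths (/private/, /cgi-bin/, /print/), the example.com host, and the "ExampleBot" crawler name are all assumptions made up for illustration; your own site will use different ones. The sketch uses Python's standard urllib.robotparser module to check how a rule-abiding crawler would read the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. Every path here is an illustrative
# assumption, not a recommendation for any particular site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
Disallow: /print/
"""

# Parse the rules from the string instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if a crawler that obeys robots.txt may fetch the path.

    "ExampleBot" and the example.com host are placeholders.
    """
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/articles/robots-txt.html",
                 "/private/report.html",
                 "/cgi-bin/form.pl",
                 "/print/articles/robots-txt.html"):
        verdict = "allowed" if is_allowed(path) else "blocked"
        print(path, "->", verdict)
```

Keep in mind that robots.txt is a request, not an access control: well-behaved crawlers honor it, but it does not stop anyone from requesting a URL directly, so truly private material still needs real protection.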

Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.




  • http://www.esnips.com/web/WebAnalyticsGraphs Daniel Waisberg

    Should I understand from this post that the only kinds of pages you recommend blocking from crawlers are private pages, scripts, and duplicate pages? Or are these just examples?

    Do you also believe that internal search pages should be blocked? I understand that there is some controversy on the subject…

    Thank you.
    Daniel Waisberg

  • http://searchengineland.com Danny Sullivan

    Yes, Daniel, I think that was the point of the illustration — to be a general guide of examples as to what someone might block rather than explicit instructions saying you must do this.

    As for search pages, you are correct. Google warns that these should be blocked.

  • http://www.golfvacationinsider.com GolfVacationInsider.com

    Is it possible to block a portion of a page, or only the entire page?

    Thanks in advance for any insight you can provide.

  • http://searchengineland.com Danny Sullivan

    Yahoo supports blocking of parts of pages. See Yahoo Supports New Robots-Nocontent Tag To Block Indexing Within A Page for more about this.

 
