You have probably heard the phrase “information architecture” but may not have given it much thought. That’s a mistake! In today’s article I am going to talk about why SEOs should care about the concept.
The Information Architecture Institute defines information architecture as “the art and science of organizing and labeling websites, intranets, online communities and software to support usability.” For most people, it is intuitive that a web site that is easy for end users to navigate, so they can find what they want, will be good for business.
What may be less intuitive is how this affects SEO. Maile Ohye recently did a nice post about this at the Google Webmaster Central blog titled “To infinity and beyond? No!” This post provides a great example of a scenario where a search engine crawler could get tripped up. Here is what Maile described:
“The classic example of an “infinite space” is a calendar with a “Next Month” link. It may be possible to keep following those “Next Month” links forever! Of course, that’s not what you want Googlebot to do. Googlebot is smart enough to figure out some of those on its own, but there are a lot of ways to create an infinite space and we may not detect all of them.”
Now you may not have a calendar application on your web site, but you may still have some information architecture issues. Anything you do to make crawling (and user navigation!) more difficult is an information architecture problem. If the crawler spends time on your site crawling pages that are basically useless, that is time not spent crawling the pages you really want indexed.
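To make the calendar example concrete, here is a minimal sketch of one way to keep a “Next Month” link from creating an infinite space: stop rendering the link once it points too far into the future. The function names, the /calendar/YYYY/MM/ URL pattern, and the 12-month window are all assumptions made for illustration, not something taken from Maile's post.

```python
from datetime import date
from typing import Optional

# Hypothetical policy: only link this many months past the current month.
MAX_MONTHS_AHEAD = 12


def add_month(d: date) -> date:
    """Return the first day of the month after d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def next_month_link(shown: date, today: date) -> Optional[str]:
    """Return the URL for the "Next Month" link, or None once the window ends.

    shown is any day in the month currently displayed; today is the
    current date. The URL pattern is an assumed example.
    """
    nxt = add_month(shown)
    months_ahead = (nxt.year - today.year) * 12 + (nxt.month - today.month)
    if months_ahead > MAX_MONTHS_AHEAD:
        return None  # no link is rendered, so a crawler has nothing to follow
    return f"/calendar/{nxt.year:04d}/{nxt.month:02d}/"
```

With this in place, a crawler that starts on the current month can follow at most 12 “Next Month” links before the trail ends, instead of walking forward forever.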
This is a particularly thorny issue with sites that are not yet fully indexed. If you are pursuing a long tail search term strategy, and have created lots of different pages, you want the crawler to discover all of them as quickly as possible. Crawler traps such as the calendar example raised by Maile are not the only way to cause problems for yourself. Here are a few other well known examples:
Duplicate content. If you have an article on your site that can be reached using multiple URLs, you have a problem. Search engines want to show only one version of a page with a given set of content, and if a crawler fetches the same content on 3 different pages of your site, the time it spent crawling 2 of those pages is wasted.
Print pages. These are inherently duplicate content as well. Time spent crawling print pages could be spent crawling other pages of your site. If you implement print pages, try using a unique URL folder structure for those pages, so you can modify your robots.txt file to tell crawlers not to spend time crawling them.
Canonical issues. The classic canonical problem is allowing users to address your site using both http://yourdomain.com and http://www.yourdomain.com. Crawlers will see these as two unique web sites which are copies of each other, resulting in a massive duplicate content problem. If you have this problem, learn what a canonical redirect is, and implement it as quickly as you can.
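The print page and canonical fixes above can be sketched in a few lines. This is only an illustration under stated assumptions: the /print/ folder name, the choice of the www host as the canonical one, and the helper names crawl_allowed and canonical_redirect are all hypothetical. The robots.txt rules are checked with Python's standard urllib.robotparser module, and the redirect helper only computes the target URL; your web server or application would still need to answer with a permanent (301) redirect to it.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# Assumption: every print page lives under a dedicated /print/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

# Assumption: the www host is the one you want indexed.
CANONICAL_HOST = "www.yourdomain.com"
ALIAS_HOSTS = {"yourdomain.com"}


def crawl_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Check a URL against the robots.txt rules above."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def canonical_redirect(url: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the host is already canonical."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ALIAS_HOSTS:
        # Keep scheme, path, and query; swap only the host name.
        return urlunsplit(parts._replace(netloc=CANONICAL_HOST))
    return None
```

For example, crawl_allowed("http://www.yourdomain.com/print/article-1") is False while the same article outside the /print/ folder is allowed, and canonical_redirect("http://yourdomain.com/articles/a?x=1") returns the www version of the same address.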
So now let’s talk about a couple of things you really want to do:
1. Implement a clean and simple navigation hierarchy. As a general rule of thumb, if it’s easy for a user to understand and navigate, then it’s probably easy for the crawler to navigate. The reverse also holds: if it’s hard for the user to navigate, it may well be hard for the search engine crawler to navigate.
Make sure that your navigation always appears in the same place on your web pages, and is consistently structured. It’s OK if you have local navigation specific to certain sections of the site, but make sure it also always appears in the same place. The easier you make it for the crawler to understand what is going on, the better off you will be.
2. Implement smart cross linking. Amazon does a great job at this. Here is an example page, using the book Netherland. They always include something that shows “Customers Who Bought This Item Also Bought”. Here is what it looks like for this book:
This type of cross linking facilitates discovery of pages of your site for the crawler, and is an extremely valuable technique if you have large sites with lots of pages.
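If you want to experiment with this kind of cross linking on your own site, here is a minimal sketch that builds an “also bought” list from order history by counting how often two items appear in the same order. To be clear, this is not Amazon's method, which I have no visibility into; the function name also_bought, the item IDs, and the simple co-occurrence counting are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, Iterable, List


def also_bought(orders: Iterable[Iterable[str]], top_n: int = 3) -> Dict[str, List[str]]:
    """Map each item to the items most often found in the same order."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for order in orders:
        # Deduplicate within an order, then count every ordered pair once.
        for item, other in permutations(sorted(set(order)), 2):
            counts[item][other] += 1
    related: Dict[str, List[str]] = {}
    for item, counter in counts.items():
        # Most frequent first; ties broken alphabetically for stable output.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        related[item] = [other for other, _ in ranked[:top_n]]
    return related


# Hypothetical order history using made-up item IDs.
orders = [
    ["netherland", "book-a"],
    ["netherland", "book-a", "book-b"],
    ["book-b", "book-c"],
]
links = also_bought(orders)
```

Each list in links can then be rendered as plain HTML links on the matching product page, which gives both users and crawlers additional paths to pages deep in the site.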
Understanding the basics of how to put together a user-friendly information architecture can really improve how users interact with your site. This alone justifies the effort to think this through (and test your site) carefully. The less obvious side benefit is that this is also very, very good for crawlers.
Eric Enge is the president of Stone Temple Consulting, an SEO consultancy outside of Boston. Eric is also co-founder of Moving Traffic Inc., the publisher of Custom Search Guide. The Industrial Strength column appears Mondays at Search Engine Land.
Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.