In June 2011, I spoke at SMX Advanced about SEO issues that I commonly run into during technical SEO site evaluations. The part of my presentation that dealt with Microsoft’s Internet Information Services (IIS) generated a lot of comments and questions afterward, so this column addresses some of those questions about how to improve technical SEO on the Microsoft stack.

First, a caveat: The majority of my experience has been with Linux- and BSD-based operating systems, starting with SunOS way back at Berkeley, so I’m definitely not an expert on deploying servers on Windows and/or .NET.

I’ve asked Microsoft-stack expert Colin Cochrane to correct anything Windows-related that I have stated incorrectly. (Thank you, Colin. Your link is in the mail.) Any remaining errors in this article are definitely mine, and not his.

After completing technical SEO assessments on numerous sites running on IIS and .NET, I believe that it is a very scalable and production-worthy platform, but I have found that its default settings are far from optimal from a technical SEO point of view.

This article describes the most common issues I’ve seen. Several of these issues cause canonicalization problems, as described in more detail in this article about Google’s parameter handling feature.

Oh, and here is a second caveat: Please be sure to test any changes on a staging server before rolling them out to production. I would hate for something to happen to your website because I made a typo or worded something unclearly.

1.  Default Pages (Default.aspx)

The problem

Directory pages are available at two URLs, one with and one without the default page. For example, these two URLs would lead to the same page:
  • http://www.site.com/directory/
  • http://www.site.com/directory/Default.aspx
In this example, the default page is Default.aspx, though it could be configured to be a different name.

Why it is bad

  • Link diffusion. Inbound links to the page could point at either of these two URLs. It would be much better to focus the inbound links on only one URL.
  • Crawl inefficiency. Crawlers have to crawl two URLs to get one page for each directory on the site.

The usual way to deal with duplicate URLs like these is to permanently (with a 301) redirect one URL to the other. However, in this case, it will result in an infinite redirect loop.

The culprit

Redirecting one URL to the other leads to a redirect loop because both of these URLs look exactly the same to the .NET application. For directory URLs, the default page is always appended before the request reaches the application, so the application can’t tell whether it should redirect the URL or not.

Fixing it

The easiest way to fix this is to put a link rel=canonical tag on these pages and point to whichever URL you want to be the canonical. It’s not as good as a permanent redirect, but it will work in a pinch if you don’t want to mess around with your server configuration.

A more permanent fix is to use a 3rd party URL rewriter, which will redirect the URL before it gets to the .NET application. Some URL rewriters I have seen used successfully on sites are URLRewrite (for IIS7 only), URLRewriter, and ISAPI Rewrite 2.
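As a rough sketch of the rewriter approach, here is how such a redirect might look with URLRewrite. It assumes the default document is Default.aspx, that the site runs at the root of the host name, and that the URL Rewrite module is installed; the rule name is made up for illustration.

```xml
<!-- Sketch only. Assumes the IIS URL Rewrite module is installed,
     the default document is Default.aspx, and the site is at the host root. -->
<system.webServer>
  <rewrite>
    <rules>
      <!-- "RemoveDefaultDocument" is an illustrative name -->
      <rule name="RemoveDefaultDocument" stopProcessing="true">
        <!-- Match Default.aspx at the root or at the end of any directory path -->
        <match url="^(.*/)?Default\.aspx$" />
        <conditions>
          <!-- Redirect GET requests only, so form posts to Default.aspx still work -->
          <add input="{REQUEST_METHOD}" pattern="^GET$" />
        </conditions>
        <!-- {R:1} is the directory part (empty for the root), so this sends a 301 to /directory/ -->
        <action type="Redirect" url="/{R:1}" redirectType="Permanent" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```

Because the rule only matches requests that explicitly name Default.aspx, a request for /directory/ is left alone, which is what avoids the redirect loop.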

2.  Case Insensitive URLs

The problem

The path part of the URLs served by IIS is case-insensitive. So any of these URLs will usually lead to the same page:

  • http://www.site.com/directory/default.aspx
  • http://www.site.com/Directory/Default.ASPX
  • http://www.site.com/DIRECTORY/DeFaUlT.aSpX

Why it is bad

  • Crawl inefficiency. Google and Bing will crawl all of the different case variations that they see in links, even though they all lead to the same page.
  • Link diffusion. Inbound links could go to any of the variations of the same URL. I’ve even seen different capitalizations of URLs used in internal links within a website.
  • Robots.txt problems. Because the robots.txt file is case-sensitive and your URLs aren’t, crawlers may be accessing URLs that you thought were blocked.

The culprit

My guess is that it has something to do with the Windows path handling in general, which is also case-insensitive.

Some ideas for fixing it

Similar to the first issue, the easiest way to resolve this is to use a link rel=canonical tag that points to the URL with the correct capitalization.

The URL rewriters listed above are the best option for normalizing the case. They can be configured to permanently redirect a URL to the right capitalization. If you pick an easy method for canonicalizing URLs, like converting everything to lower case, it can be implemented with one general rule.

Here is an example rule that rewrites a URL to all lower case that will work with URLRewrite:

<rule name="LowerCaseRule">
  <match url="[A-Z]" ignoreCase="false" />
  <action type="Redirect" url="{ToLower:{URL}}" appendQueryString="true" />
</rule>

If you implement something like this, keep in mind that some URLs may require upper case, such as the Bing authorization file BingSiteAuth.xml. URLs like these need to be added to the rule as exceptions.
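One way to express such an exception is a negated condition on the rule. The fragment below is a sketch under that assumption: it is the same lower-case rule with one extra condition for the Bing file, and it belongs inside the rewrite rules section of web.config.

```xml
<!-- Sketch only: the lower-case rule plus a negated condition for one exception. -->
<rule name="LowerCaseRule" stopProcessing="true">
  <match url="[A-Z]" ignoreCase="false" />
  <conditions>
    <!-- Skip the redirect for the Bing authorization file, which keeps its capitals -->
    <add input="{URL}" pattern="BingSiteAuth\.xml$" negate="true" />
  </conditions>
  <action type="Redirect" url="{ToLower:{URL}}" redirectType="Permanent" appendQueryString="true" />
</rule>
```

Further exceptions can be added as additional negated conditions, since all conditions must match by default before the redirect fires.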

Here is a post containing 10 very useful rewriting rules, one of which converts URLs to lowercase.

3.  Handling Page Not Found Errors & Internal Server Errors

The problem

In its default configuration, ASP.NET handles errors (like page not found or internal server problems) by redirecting with a 302 temporary redirect to an error page, which usually returns a 200 response.

Why it’s bad

  • Crawl inefficiency. Because a 302 redirect is a temporary redirect, search engines will continue to check that URL often in hopes of one day getting a page at that URL instead of a redirect. And if the target page returns a 200 response, then the search engines will index the initial URL, which means your site might start ranking with URLs that lead searchers to error pages.

This means that pages that are removed from the site or pages that throw an error will continue to be crawled as if they were regular pages, so the crawler spends time on these URLs instead of on actual pages with useful content.

And because the page-not-found page gets so much traffic and has so many URLs pointing to it, it tends to get crawled pretty frequently, which further reduces crawl efficiency.

  • “Non-graceful” site failure. If your site starts returning an error — due to a temporary database problem, for example — large portions of your site could get de-duplicated out of the index because they are suddenly redirecting to the same URL.

The culprit

This is the default behavior in ASP.NET.

Some ideas for fixing it

Fortunately, this issue has a fix that is pretty straightforward and requires only a minor change to the web.config file.

Here is part of an example web.config file that prevents these redirects:

<customErrors mode="RemoteOnly" defaultRedirect="GeneralErrorPage.aspx" redirectMode="ResponseRewrite">
  <error statusCode="404" redirect="404ErrorPage.aspx" />
</customErrors>
The attribute redirectMode needs to be set to ResponseRewrite instead of its default value of ResponseRedirect.

redirectMode is not available in all versions of .NET (it was added in .NET Framework 3.5 SP1), so you may need to update first. More detail can be found in this article.
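For orientation, here is a minimal sketch of where that element sits in web.config. The page names are simply the ones from the example above, and the rest of the file is omitted.

```xml
<!-- Sketch only: page names are taken from the example above. -->
<configuration>
  <system.web>
    <!-- ResponseRewrite serves the error page at the requested URL instead of issuing a 302 -->
    <customErrors mode="RemoteOnly"
                  defaultRedirect="GeneralErrorPage.aspx"
                  redirectMode="ResponseRewrite">
      <error statusCode="404" redirect="404ErrorPage.aspx" />
    </customErrors>
  </system.web>
</configuration>
```

Since mode is RemoteOnly, custom error pages are shown only to remote visitors, so test from a machine other than the server. Request a URL that does not exist and check the response headers: there should be no 302, and the status should be a 404. If the error page still answers with a 200, the page itself may need to set the status code.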

4. Browser-dependent code

The problem

.NET has some hooks that make it pretty easy to write code that changes a page depending on the user agent requesting it.

Why it’s bad

  • Cloaking. Pages that change based on the user agent (e.g. Googlebot or Firefox) are dangerous for a lot of reasons, but from an SEO perspective the danger is unintentional cloaking of content, which can result in a severe penalty being put on your site.
By default, there is nothing user agent-dependent about the code that is served by IIS/.NET. But because the functionality is there, it is possible that browser-dependent code exists in your site.

The culprit

I believe this functionality dates back to the late 1990s/early 2000s when browsers had widely different support for web standards. If you are feeling nostalgic for those days, here is an old browser compatibility chart that you can look at until the feeling goes away.

Some ideas for fixing it

Chances are there is nothing to fix, but if you want to look at your source code for potential browser-dependent logic, here is an article with sample code that should give you an idea of what to look for.
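As one hedged illustration of what to look for on the configuration side, ASP.NET reads browser definition files (files with a .browser extension, typically in an App_Browsers folder) that map user agent strings to capabilities. The fragment below is a made-up example, not taken from any real site; the file name, the id, and the user agent pattern are all hypothetical.

```xml
<!-- Hypothetical App_Browsers/Example.browser file, shown only as a pattern to look for. -->
<browsers>
  <browser id="ExampleBot" parentID="Mozilla">
    <identification>
      <!-- Regular expression tested against the User-Agent header -->
      <userAgent match="ExampleBot" />
    </identification>
    <capabilities>
      <!-- Code that branches on Request.Browser.Crawler would now treat this agent differently -->
      <capability name="crawler" value="true" />
    </capabilities>
  </browser>
</browsers>
```

In the page code itself, the usual things to search for are references to Request.Browser and Request.UserAgent, since any branch on those values can change what a crawler sees.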

Conclusion

I hope this article helps you make your IIS installation more search engine-friendly. I have spoken with some very smart Windows developers who initially swore to me that there was no fix for some of the issues in this list, so there is a pretty good chance that your development team isn’t aware of all of these issues or even that these fixes exist.

Of course, these are only a few of the issues that I see with IIS on a regular basis. Others include cacheability of the site, character encoding issues, and URL redirects.

The easiest way to pinpoint these types of issues is by looking at your server logs.

(Blatant Product Placement/Disclaimer: It just so happens that at Nine By Blue, where I work, I created server log analysis software for just this purpose when I got tired of looking for all of these issues manually, so if you’re interested in that product for either your IIS or Apache logs, ask me about an invite to our private alpha.)

I guess the real lesson of this article is that IIS and .NET are a great help to SEO job security.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

About The Author: is Director of Technical Projects at Nine By Blue, where he helps on-line businesses develop search traffic acquisition strategies from both a technical and a content-oriented point of view.

  • http://martinnormark.com Martin H. Normark

    I believe any platform has its own SEO problems. While I’m no SEO expert, I can’t say whether IIS and .NET has more problems than other platforms, but reading some of the problems makes me think that the people using the platform is not aware of what they’re doing, how search engines work, and to some extend how the web works.

    The problem with ASP.NET (Not ASP.NET MVC) is that it’s a huge abstraction of what web development is: requests and responses – which draws this picture of a website that fools developers into thinking about pages all the time, just like Desktop developers has views etc.

    Anyway, I think some of the solutions you outlined deserves further explanation – which is why I created a page showing each solution alive and kicking. You can see it here: http://iis-seo.martinnormark.com/

  • http://www.toddnemet.com Todd Nemet

    Martin, I just got done reading your article. Thank you very much for taking the time to write out and explain those implementations. I am sure that a lot of webmasters will find it very helpful.

    You say you aren’t an SEO expert — and I am definitely not a Windows expert — so this is a good example of the way that developers and SEO can collaborate to get quick results.

    Other platforms certainly have their own SEO problems. (I’ve seen Java-based sites with case-insensitivity issues and WordPress generates a lot of near-duplicate content with the way they handle comments.) IIS/.NET is very popular and every site I’ve evaluated that uses it has at least a few of these problems.

    Colin, who worked with me on this article, sent me some notes about ASP.NET MVC to include. I felt that the article was already getting too long, so I didn’t include it.

    Thanks again!

  • http://martinnormark.com Martin H. Normark

    Hi Todd, I guess SEOs and Developers need to team up very early in the process, which is rarely the case for most.

    There’s simply too many important details we, developers, don’t pay attention to.

    The thing I like about ASP.NET MVC, is that it removes the abstractions and lets you focus on how requests are handled, how URLs look, and what you return. In a way, it’s old-school – but the control you get is amazing.

    You can still get that control in ASP.NET, but you need more work to fight the built-in features.

    Luckily most problems can be solved by the IIS Rewrite module I mention in the article.

    I’m sure there’s even more SEO problems with IIS/.NET, if you have any – don’t hesitate to contact me via the article, and I’ll see what I can do to further address SEO problems with IIS by adding the more solutions to the article.

  • https://www.zeta-uploader.com Axbm

    Your Regex link has one square bracket at the end which results in a 404.

  • https://www.zeta-uploader.com Axbm

    There is also the “IIS SEO Toolkit” at http://www.iis.net/download/SEOToolkit which helped me a lot in spotting those issues you mentioned.

  • Matt McGee

    Thx – think the link is fixed now.

  • http://www.chrisfaron.com chris faron

    Great post Todd, I often find a brick wall when trying to find solutions in optimizing IIS for SEO

 
