Redirects: Good, Bad & Conditional



Whenever you make changes to a website, one of the most important considerations should be how to use “redirects” to alert the search engines to your changes and avoid a negative impact on your search rankings. Whether you’re moving pages around, switching CMS platforms, or just wanting to avoid duplicate content and PageRank dilution, you’ll want to employ redirects so as not to squander any link juice (PageRank) your site has acquired. There are multiple ways of redirecting, and it’s important to get it right if you want the SEO benefit without the risk of falling outside search engine guidelines (as is the case with “conditional redirects”).

Programmers and sysadmins who are not SEO-savvy will likely default to using a “temporary redirect,” also known as a “302 redirect.” Unfortunately, such a redirect does not transfer link juice from the redirected URL to the destination URL. It isn’t that the programmers are intentionally negligent. It’s simply a case of them “not knowing what they don’t know.” Just gently inform them that what they really need to be using is a “permanent redirect,” or a “301 redirect.” If they ask why, just tell them “Because the SEO consultant said so.”
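For the Apache users, here’s a minimal sketch of the difference using mod_alias’s Redirect directive (the paths and destination are hypothetical):

# A bare Redirect defaults to 302 (temporary): visitors get passed along,
# but link juice does not
Redirect /old-page.html https://www.example.com/new-page.html

# Adding "permanent" (or the status code 301) makes it a permanent redirect,
# which transfers link juice to the destination URL
Redirect permanent /old-page.html https://www.example.com/new-page.html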

What would be some of the “use cases” for a 301 redirect? I mentioned some in my opening paragraph, but let’s examine several scenarios in greater detail. Generally speaking, if any of your URLs are going to change, you’ll want to employ 301 redirects: if you are changing domain names (tiredoldbrand.com to newbrand.com), or if you are migrating to a new content management system (CMS) and the URLs of your pages are all changing as a result. You’ll even want to do it if you are “retiring” certain pages to an archive URL (e.g., the current year’s Holiday Gift Guide once the holiday buying season is over), although I’d make the case that you should maintain such a page at a date-free URL forever and let the link juice accumulate at that URL for use in future years’ editions, and not redirect at all.

Then there are the situations you will want to mitigate with 301s, where multiple URLs serve up the same content, thus creating multiple copies of the same page in the search engines’ indices. Duplicate content is bad enough, but the bigger issue is “PageRank dilution,” where the votes (links) are spread across the various versions instead of all aggregating to one single, definitive, “canonical” URL. This can happen when tracking codes are appended to a URL (e.g., “?source=SMXad”); at the time of writing, duplicate copies of pages with tracking codes appended to their URLs can be found in Google’s index on, ironically, Google’s own site (yes, it happens to the best of us, even to Google!). It can also happen when essential parameters are not always ordered consistently (e.g., “?subsection=5&section=2” versus “?section=2&subsection=5”), when parameters are used as flags but the setting does not substantively change the content (e.g., “?photos=ON” versus “?photos=OFF”), or when multiple domains or subdomains respond with the same content but no redirect (e.g., “jcpenney.com/jcp/default.aspx” and “www1.jcpenney.com/jcp/default.asp” and “jcp.com/jcp/default.asp” and “jcpenny.com/jcp/default.asp” versus “www.jcpenney.com/jcp/default.asp”). In all of the above cases, 301 redirects pointing to the canonical URL would save the day.

Usually a single redirect “rule” can be written to match against a large number of URLs. This is referred to as “pattern matching,” and it allows you to use wildcards (such as the asterisk character) and to capture a portion of the requested URL for reuse in the rewritten URL. This is possible whether you are running Apache or Microsoft IIS as your web server. Consider some of the above-mentioned examples, and how to handle each of them using Apache’s mod_rewrite module (which comes bundled with Apache):

# Changing domain names
RewriteCond %{HTTP_HOST} tiredoldbrand\.com$ [NC]
RewriteRule ^(.*)$ https://www.newbrand.com/$1 [R=301,QSA,L]

# Removing tracking parameter (but tracked URL still registers in the analytics). Assumes no other parameters.
RewriteCond %{QUERY_STRING} ^source=
# The trailing "?" discards the query string; without it, mod_rewrite would
# re-append "?source=..." to the destination and the redirect would loop
RewriteRule ^(.*)$ $1? [R=301,L]

# Reordering parameters
RewriteCond %{QUERY_STRING} ^subsection=([0-9]+)&section=([0-9]+)$
RewriteRule ^(.*)$ $1?section=%2&subsection=%1 [R=301,L]

# Redirecting the non-www subdomain to www
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,QSA,L]

The above examples change somewhat if you are on Microsoft IIS Server. For example, for those of you using the ISAPI_Rewrite plugin for IIS Server, use “RP” in place of “R=301.”
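For instance, the domain-change rule above might look something like the following in ISAPI_Rewrite 2’s httpd.ini. This is a rough sketch only; the exact syntax varies by version (ISAPI_Rewrite 3 accepts Apache-style syntax instead), so consult your version’s documentation:

# Changing domain names (ISAPI_Rewrite 2 style: "I" = case-insensitive,
# "RP" = permanent redirect)
RewriteCond Host: (?:www\.)?tiredoldbrand\.com
RewriteRule (.*) https\://www.newbrand.com$1 [I,RP]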

Sometimes it’s not possible to pattern match and a lookup table is required. This can be accomplished easily by creating a text file and referencing it using the “RewriteMap” directive. You could even reference a script that does some fancy search-and-replace work, rather than a text file, like so:

# Search-and-replace on the query string part of the URL, using my own Perl script
RewriteMap scriptmap prg:/usr/local/bin/searchandreplacescript
RewriteCond %{QUERY_STRING} ^(.+)$
RewriteRule ^(.+)$ $1?${scriptmap:%1} [R=301,L]
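The Perl script itself is yours to write, but here’s a minimal sketch of what such a prg: map script could look like, assuming (hypothetically) that its job is to strip a “source” tracking parameter from whatever query string mod_rewrite feeds it. A prg: map script must loop forever, reading one lookup key per line on stdin and writing one answer per line on stdout, with output buffering turned off:

#!/usr/bin/perl
# Hypothetical search-and-replace map: strips a "source" parameter from
# the query string handed over by mod_rewrite
$| = 1;                           # unbuffered output -- required for prg: maps
while (my $qs = <STDIN>) {
    chomp $qs;
    $qs =~ s/&?source=[^&]*//;    # drop the tracking parameter
    $qs =~ s/^&//;                # tidy up a stray leading ampersand
    print "$qs\n";                # returned to the rule as ${scriptmap:%1}
}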

Intrigued by all this geeky stuff and want more about pattern matching and rewrite rules? Then you might want to check out my PowerPoint deck from my presentation on “Unraveling URLs” at SMX West.

There’s another type of redirect that bears mentioning: the “conditional redirect.” It comes with a warning: it could get you in big trouble with Google. Matt Cutts, the head of Google’s webspam team, advised during his keynote at SMX Advanced that folks not employ conditional redirects, due to the risk of a Google penalty or ban. For those unfamiliar with the term, it refers to serving a 301 redirect selectively, such as only to search engine spiders like Googlebot. Obviously, when you start serving up different content to humans than you do to spiders (and yes, this includes differing redirects), you get into dangerous territory with the search engines. You might have the purest of “white hat” intentions, but the risk remains.

Consider the above-mentioned case of two URLs with substantively similar content, one containing “photos=OFF” and the other “photos=ON,” both of which receive some number of links. You could make a compelling argument that, to eliminate duplicate content filtering and PageRank dilution, both versions should collapse into one, but only for Googlebot; after all, if you redirected ALL requests, then low-bandwidth users could not toggle the loading of thumbnail product images off and on. However, this is a false choice. You don’t actually need a redirect in the first place, let alone a conditional one. You could add rel=nofollow to all links pointing to the photos=OFF URLs, so no link juice is “spent” on that version. Then make photos=ON implied, so that photos load by default when the parameter is not specified, and remove photos=ON from the URLs of any and all internal links.
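To mop up any external links that still point at photos=ON URLs, an unconditional rule like this sketch would collapse them for every requester (it assumes photos is the only parameter on those URLs):

# photos=ON is now the implied default, so collapse those URLs to the clean
# canonical URL for all visitors; photos=OFF keeps working for low-bandwidth
# users
RewriteCond %{QUERY_STRING} ^photos=ON$ [NC]
RewriteRule ^(.*)$ $1? [R=301,L]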

Or consider the case of needing to retain a tracking parameter in the URL throughout the user session. By employing a conditional redirect, spiders requesting the tracking URL could be redirected instead to the canonical URL (namely, the URL minus the tracking parameter), thus maintaining the integrity of your tracking system for visitors while not needlessly tracking spiders or creating numerous copies of pages for the engines to index. But again, I bet you could find another way through this without having to resort to conditional redirects. For instance, as soon as the request comes in for the tracked URL, you could store the tracking parameter in a cookie at the same time as you do an unconditional redirect to the canonical URL.
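mod_rewrite can even set that cookie for you via its CO flag. A minimal sketch, assuming “source” is the only parameter present and a hypothetical cookie domain of .example.com:

# Capture the tracking code into a cookie, then 301 everyone (humans and
# spiders alike) to the canonical, parameter-free URL
RewriteCond %{QUERY_STRING} ^source=([^&]+)$
RewriteRule ^(.*)$ $1? [CO=source:%1:.example.com,R=301,L]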

No question, conditional redirects can solve difficult problems for legitimate businesses. Indeed, one of the world’s largest retailers is using conditional redirects, as I discovered the day before I presented at SMX Advanced. The retailer is conditionally redirecting bots that request affiliate URLs to the canonical product URL, so as to capture the link juice from its affiliates. Seems like a smart thing to do, but the approach is fraught with risk.

Instead, the retailer could choose to redirect ALL visitors, humans and spiders alike, to the canonical URL without the affiliate ID. Some affiliates might get in a huff once they noticed that their links were now passing PageRank to the merchant. But you know what? It’s within the affiliate’s power not to pass PageRank; they can simply “nofollow” the links. Thus, it becomes merely a public relations exercise for the merchant to manage their affiliates through the switchover to unconditional 301s and to remind them of their right to nofollow the links.

Another scenario I learned of just recently comes from a leading media property that conditionally redirects two types of URLs. The first type carries tracking parameters to differentiate clickthroughs on links that lead to the same content. The use of conditional redirects allows the company to collapse duplicates and aggregate PageRank. But guess what? If the redirect were unconditional, the clicks could still be differentiated, because the clicktracked URL would still register in the log files.

The same would hold true if the tracking parameter differentiated marketing programs or traffic sources. When a request comes in for a URL containing “?source=blog,” for instance, it’s not necessary to send human visitors to a different destination. Even if the traffic source needs to be carried through the user’s session and then included as a hidden field on the site’s Contact Us inquiry form, that can be accomplished by storing the source in a cookie or session variable. No conditional redirect required.
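Here’s a hypothetical sketch of the receiving end in Perl (the cookie name “source” and field name “traffic_source” are assumptions, not anyone’s actual implementation): a CGI script reads the cookie set at redirect time and drops the value into the hidden field on the inquiry form.

#!/usr/bin/perl
# Hypothetical sketch: recover the traffic source from the cookie set at
# redirect time and emit it as a hidden form field
use strict;
use warnings;

my ($source) = (($ENV{HTTP_COOKIE} || '') =~ /(?:^|;\s*)source=([^;]+)/);
$source = 'direct' unless defined $source;   # fall back when no cookie is set
$source =~ s/[^\w.-]//g;                     # sanitize before echoing into HTML

print "Content-type: text/html\n\n";
print qq{<input type="hidden" name="traffic_source" value="$source">\n};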

The other type of URL being conditionally redirected by this company leads to co-branded partner sites. And you guessed it: the content is substantially duplicated. Here it isn’t a simple matter of switching to an unconditional redirect, because the partner’s share of the ad revenue is calculated by keeping the visitor on a separate subdomain. By unconditionally redirecting to the canonical URL on the main subdomain, each session would go from multiple pageviews to a single one. Partner revenue would drop significantly; the partners would not be happy campers. Cookies could be employed to track the entire session (not a minor undertaking for them to switch to), and even then, a non-negligible number of visitors have Norton Internet Security or a similar utility installed that zeroes out the referrer line in web page requests, thus obscuring click revenue that would rightfully be due to the partner. A bit of a predicament. For this one I haven’t figured out a workaround.

One other scenario where a case could be made for conditional redirects is when the site’s URLs are being rewritten to be spider-friendly. This can be a major initiative that I’ve seen take many months, even years, to roll out across a large, complex website. One retailer of outdoor gear spent over two years and more than 1,000 man-hours implementing URL rewrites and still isn’t finished. It’s not always feasible to implement rewrites on all URLs, or to replace every occurrence of a spider-unfriendly URL. The larger the site, and the less flexible the CMS, the bigger the headache. In such a situation, conditional redirects could help collapse duplicates and channel PageRank to the spider-friendly version of each URL until all occurrences of the spider-unfriendly URLs are replaced across the site. Alternatively, duplicates could be eliminated with a robots.txt disallow or a meta robots noindex of all spider-unfriendly URLs (see the sketch below), but this wouldn’t be feasible while the URL rewriting is still in progress, and it wouldn’t allow PageRank to be channeled to the rewritten version of each URL.
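For completeness, the disallow alternative might look something like this in robots.txt, assuming (hypothetically) that the spider-unfriendly URLs are recognizable by a session-ID parameter. Note that wildcard patterns are honored by Google but not by every crawler:

# Keep spider-unfriendly dynamic URLs out of the index during the migration
User-agent: *
Disallow: /*?*sessionid=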

There’s one workaround I will leave you with that negates the use of redirects altogether, including conditional ones. It’s useful specifically for tracking, and it involves appending tracking information to URLs in such a way that tracked URLs are automatically collapsed by the engines. No, it doesn’t involve JavaScript in the links themselves. Curiously, I don’t ever hear this method being discussed. The method makes use of the # (hash, or pound, character), which is normally used to direct visitors to an anchored part of a web page. Simply append a # to your URL, followed by the tracking code or ID. For example: www.example.com/widgets.php#partner42. Search engines will ignore the # and everything after it; thus, PageRank is aggregated and duplicates are avoided. (One caveat: browsers don’t send the fragment to the server, so reading the tracking code back out has to happen in the visitor’s browser rather than in your server logs.)

Hopefully this has challenged you to think critically about redirects—temporary, permanent and conditional—and their implications for SEO. Opt for permanent (301) over temporary (302) if you want the link juice to transfer. Conditional redirects should be avoided, especially if your risk tolerance for penalization is low. If you take a good hard look at your “need” for conditional redirects, I think you may find you don’t really need them at all.




About the author

Stephan Spencer
Contributor
Stephan Spencer is the creator of the 3-day immersive SEO seminar Traffic Control; an author of the O’Reilly books The Art of SEO, Google Power Search, and Social eCommerce; founder of the SEO agency Netconcepts (acquired in 2010); inventor of the SEO proxy technology GravityStream; and the host of two podcast shows Get Yourself Optimized and Marketing Speak.
