Search Engine Land

Why Canonicalization Matters From A Linking Perspective

Karen DeJarnette on September 6, 2011 at 12:14 pm

Search engine optimization (SEO) can be like any other technical field of study. It is filled with specialized jargon that, to a newbie, can be more than intimidating. I recall that feeling was especially strong when I first encountered the term canonicalization.

It is a 16-letter, seven-syllable monster of a term. I first heard it spoken, and had to ask the person who said it to repeat it. It didn’t help. (It had been a long day!)

The truth of the matter is that canonicalization is not all that complicated to understand if the explanation is lucid. So let’s try to explain what it means, why it’s important, and what it has to do with linking.

What Is Canonicalization?

In mathematics, when the same data can be represented in multiple ways, it is best to standardize that representation by establishing the data’s canonical form, the one primary form in which it will be used. In the computer science field, the act of defining the canonical form of data is called canonicalization.

Simply put, canonicalization defines the one primary way you’ll write a piece of data, such as a URL string. As a webmaster, you can choose which canonical form to use for a given URL on your site, but once selected, the chosen form should always be the way that URL is written.

Why Canonicalization Is Important

Fundamentally, you need to know that search engines do not index pages by their content. They index URLs. The content associated with the indexed URLs is brought into the search engine database, but it is the URLs that hold the rankings.

What complicates matters in search (and why canonicalization is important) is that the same content page can have multiple URLs associated with it.

I’m not talking about when Web spammers scrape your content and publish it on their own website. I’m talking about variations of URLs on your website all pointing to the same page.

For example, the following hypothetical URLs would likely all point to the same page (in this case, the home page of a site):

  • example.com
  • www.example.com
  • www.example.com/
  • www.example.com/index.html
  • www.example.com/index.html?var1=105
  • www.example.com/index.html?var1=105&var2=abc

As you can see, a valid URL may either include or omit the subdomain prefix “www.”, a trailing slash after the top-level domain, the default webpage name for a folder, and/or one or more URL parameter suffixes (there are even more, but these are the most common). They can also be used in various combinations. The possible permutations of the above examples can quickly add up to a large number of URLs all pointing to the same content page.
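To make the idea concrete, here is a minimal Python sketch that folds the variations above into one form. The policy it applies (the http scheme, the “www.” prefix, no default page name, no URL parameters) and the function name canonicalize are assumptions chosen for illustration, not a recommendation for every site; dropping every parameter is only safe when the parameters do not change the content.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical policy: http scheme, "www." host, no default document,
# and no URL parameters. Adjust to whatever form your site has chosen.
DEFAULT_DOCUMENTS = {"index.html", "index.htm"}


def canonicalize(url: str) -> str:
    """Map a URL variation onto the single form chosen above."""
    if "://" not in url:
        url = "http://" + url  # lets urlsplit recognize the host
    parts = urlsplit(url)

    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host

    # Split the path into its folder and final segment.
    folder, _, page = (parts.path or "/").rpartition("/")
    if page.lower() in DEFAULT_DOCUMENTS:
        page = ""  # drop the default page name for the folder

    # The empty strings discard the query string and fragment.
    return urlunsplit(("http", host, folder + "/" + page, "", ""))
```

Under these assumptions, each of the six home page variations listed above comes back as http://www.example.com/.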

And this is not only a problem for home pages. Deep link pages can have similar problems, such as the following hypothetical examples:

  • www.example.com/folder1/
  • www.example.com/folder1/index.html
  • www.example.com/folder1/index.html?product=49
  • www.example.com/folder1/?userID=tinytim

When search engine crawlers encounter multiple URLs successfully pointing to the same content page, the overall potential PageRank for that content page is split among the URLs crawled. After all, even though the content is exactly the same, each crawled URL will have its own number of backlinks, so the PageRank for a given piece of content will differ among the URLs crawled.

Metaphorically speaking, imagine a full pitcher of water (the total potential page rank) and several empty cups of various sizes (your non-canonicalized URLs).

When you split up the water from the pitcher among the cups, you are technically still working with the same amount of water, but each cup only has a percentage of the total. None of the cups contains as much water as the pitcher could.

When it comes to PageRank, if your site’s pages are not canonicalized, you’re not using your full potential for page ranking. Not only are your URLs competing against those of your rivals from other websites, they are also competing against URL variations within your own website!

Wouldn’t it be better if you could consolidate your page rank in one URL as you might pour all of those cups of water back into one pitcher? That’s why we need to canonicalize our sites.
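The pitcher and cups can be put into rough numbers. The Python sketch below uses made-up backlink counts as a stand-in for PageRank (real PageRank is not a simple link count), and the names backlinks, consolidate, and to_home are hypothetical. It only shows how much of the total each URL variation holds before and after consolidation.

```python
from collections import defaultdict

# Hypothetical backlink counts per crawled URL variation (made-up numbers).
backlinks = {
    "example.com": 12,
    "www.example.com": 40,
    "www.example.com/": 25,
    "www.example.com/index.html": 8,
}


def consolidate(counts, canonical_of):
    """Sum the counts of every variation under its canonical URL."""
    totals = defaultdict(int)
    for url, count in counts.items():
        totals[canonical_of(url)] += count
    return dict(totals)


def to_home(url):
    # Assumption: every variation listed above is the home page.
    return "www.example.com/"


totals = consolidate(backlinks, to_home)
fullest_cup = max(backlinks.values())
```

With these numbers, the fullest cup holds 40 of the 85 links, while the consolidated canonical URL holds all 85.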

Canonicalization’s Connection To Linking

“Yeah, yeah, this is all well and good. But where’s the connection to linking?” you ask. Well, as you are a webmaster, you do have a degree of control over how at least some pages link to you.

After all, your intrasite links, not to mention your site navigation scheme links (and for that matter, the links in your XML-based Sitemap file) are all controlled by you.

This means you need to comb through your site (or your content management system, aka CMS) and see how the link to each page is referenced. You need to ensure each link to a given page always uses the exact same URL form.
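One way to do that combing is to scan each page’s HTML for internal links that are not written in the canonical form. The following Python sketch uses only the standard library; the host example.com, the function names, and the set of canonical URLs passed in are assumptions for illustration. It treats relative links as non-canonical, in keeping with a policy of absolute URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def is_internal(href, bare_host="example.com"):
    # Relative links have no host; treat them as internal.
    host = urlsplit(href).hostname
    return host is None or host in (bare_host, "www." + bare_host)


def non_canonical_links(html, canonical_urls):
    """Return internal links that are not written in a canonical form."""
    collector = LinkCollector()
    collector.feed(html)
    return [
        href
        for href in collector.hrefs
        if is_internal(href) and href not in canonical_urls
    ]
```

Run against each template or page, the returned list is the set of links to rewrite.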

I personally advocate using absolute (aka full) URLs in links, if only because of the plague of content scrapers. As those people are too lazy to create their own content, they are also usually too lazy to examine and change stolen content source code.

If your content is scraped, readers of that content will be brought back to your site when they click the inline links you created (you do create inline links when relevant opportunities appear, right?).

Admittedly, there are times when your site architecture requires that you use URL parameters. In that case, you can also create rel=canonical tags in the <head> section of your pages. The href attribute of this tag defines the canonical URL for the page, so even if the page is normally reached through a URL with parameters, its canonical URL is still declared.
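As a small illustration, a page template could emit the tag with a helper like the one below. This is a Python sketch; the function name canonical_link_tag is hypothetical, and the output belongs inside the page’s head element.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Build a rel=canonical link tag for the given canonical URL."""
    # Escape the URL so characters such as & and " stay valid in the attribute.
    return '<link rel="canonical" href="%s">' % escape(canonical_url, quote=True)
```

A parameterized page such as www.example.com/folder1/?userID=tinytim would then carry a tag whose href is the canonical folder URL rather than the URL that was requested.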

Note that search engines have stated they will look at rel=canonical as a hint, not as a mandate. As such, this is not the magic canonicalization bullet for your site. You still need to be consistent with your canonical intrasite linking.

Also, for URL parameter users, be sure to check out both the Google and Bing Webmaster Tools. Both have added options enabling webmasters to define specific URL parameters to be ignored during crawls.
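To see what ignoring a parameter amounts to, the Python sketch below removes selected parameters from a URL while keeping the rest. It is a local analogy rather than the webmaster tools’ own mechanism, and the parameter names in IGNORED_PARAMS are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change the page content.
IGNORED_PARAMS = {"userID", "var1", "var2"}


def strip_ignored_params(url, ignored=IGNORED_PARAMS):
    """Remove ignored parameters; keep the others in their original order."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

A parameter that selects content, such as product=49 in the earlier example, is deliberately left out of the ignored set.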

 

Google also allows you to select whether or not you want to use the subdomain prefix “www.” in your preferred URL. I’d guess that option will eventually come to Bing as well.

Lastly, for links you don’t control, such as inbound links from other sites, you can set up 301 permanent redirects for all non-canonical URL forms to the canonical URL for each page.

Just be sure you use a 301 permanent redirect. As the 301 is a permanent redirect, search engines interpret this to mean they can safely transfer all of the page rank value from the original (non-canonical) URL to the new (canonical) one.

Note that while 302 temporary redirects will redirect users to a canonical URL, search engines will not transfer any acquired page rank! (I have written in more detail about using 301 redirects here.)
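In practice the redirect is usually configured in the web server, but the decision itself is simple enough to sketch in Python. The REDIRECTS map and the function name redirect_response below are hypothetical; the point is only that the status sent is 301 (Moved Permanently), not 302.

```python
from http import HTTPStatus

# Hypothetical redirect map: non-canonical form -> canonical URL.
REDIRECTS = {
    "http://example.com/": "http://www.example.com/",
    "http://www.example.com/index.html": "http://www.example.com/",
}


def redirect_response(requested_url):
    """Return (status, headers) for a non-canonical URL, or None to serve the page."""
    target = REDIRECTS.get(requested_url)
    if target is None:
        return None  # already canonical (or unknown): no redirect
    # 301 Moved Permanently, deliberately not HTTPStatus.FOUND (302).
    return int(HTTPStatus.MOVED_PERMANENTLY), {"Location": target}
```

A request for http://example.com/ would get a 301 with a Location header pointing at http://www.example.com/, while a request for the canonical URL itself is served normally.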

If you’re really detail-oriented, you could even look at backlink tools, such as the aforementioned search engines’ webmaster tools or a third-party tool such as Open Site Explorer, to see who is linking to you and work with the errant webmasters who are not using your canonical URL in their outbound links.

After all, as good as a 301 redirect is for canonicalization, a redirect also introduces a potential page load speed delay (although that’s not likely as detrimental to your page rank as non-canonicalized URLs).

The bottom line is this: you have the ability to consolidate the PageRank for your content pages into canonical URLs.

Depending upon how badly your multiple URLs are dividing up your PageRank today, given how competitive (not to mention how valuable) top ranking can be for a given query, why wouldn’t you take the steps needed to consolidate the page rank of your content pages into one canonical URL?

Canonicalization may be a seven-syllable monster, but it’s not that complicated, and doing something about it could improve your position in the SERPs.

Image from Shutterstock, used under license.

Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About The Author

Karen DeJarnette
Karen DeJarnette is a senior SEO Strategy Analyst at Expedia Inc. Previously, she was in-house SEO at MSN.com and was part of Microsoft’s Live Search and Bing Webmaster Center teams, serving as the primary contributor to the Bing Webmaster Center blog and then later as an in-house SEO for the Bing content properties. She also randomly adds to her own blog, The SEO Ace.

© 2022 Third Door Media, Inc. All rights reserved.