Understanding SEO Friendly URL Syntax Practices


Poor URL structure is a frequent SEO issue, one that can impair rankings, keep pages out of search engine indexes, and drain ranking authority from your other pages or even the entire website.

Some content management systems bake poor URL structures right into their websites. Lax rules can be a culprit; for example, failing to encode spaces or special characters.

Meanwhile, some CMS platforms devise URLs using illegal characters that should not appear in addresses. Others generate multiple URLs for pages, creating duplicate content.

While it is true that search engines go to great lengths to read and index even the worst URLs, attention to URL management and optimization will provide both SEO and usability advantages.

Good URL Structure

A few years ago, Dr. Peter J. Meyers put together a cheat sheet on the anatomy of a URL. It’s a good one to keep handy.

[Image: SEOmoz URL syntax cheat sheet]

Look at this URL:  https://www.mobilesmart.com/phones/android/samsung-galaxy-s3

  • It is easy to read and understand. If I saw this address pasted into a blog or forum, I would likely click on it.
  • It is SEO optimized with breadcrumb style keywords. Search engines look for keywords in URLs; it’s a known ranking factor. This layout, going from general to specific, is ideal for enterprise SEO.
  • The URL includes its own anchor text. If this address were pasted into a blog or other web page as a link, that link would possess well-optimized anchor text.

Old style dynamic addresses are legal and acceptable, though they have drawbacks.

  • They tend to be longer and difficult to read because they contain both parameter names plus values.
  • Pairing parameter names with values adds extra words. This may dilute the SEO value derived from keywords within the URLs.
  • This type of address may contain information better transmitted outside of the URL. A user ID, session ID, sort code, print code and many other possible parameters could create duplicate content, security or other issues.
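
To make the last point concrete, here is a minimal Python sketch that removes parameters that carry state rather than identify content. The example address and the parameter names (`sessionid`, `userid`, `sort`, `print`) are hypothetical; substitute whatever your own platform emits.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that carry state rather than identify content.
NON_CONTENT_PARAMS = {"sessionid", "userid", "sort", "print"}

def strip_non_content_params(url):
    """Return the URL with non-content query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in NON_CONTENT_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

dynamic = "https://www.example.com/product.php?category=phones&id=1234&sessionid=abc123&sort=price"
print(strip_non_content_params(dynamic))
# https://www.example.com/product.php?category=phones&id=1234
```

Even after cleanup, the address still pairs each parameter name with its value, which is the readability cost described above.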

Diagnosing URL Issues

To find URL based issues:

  1. Check for errors and warnings then determine if URLs are the culprit.
  2. Audit all URLs for proper syntax.

To check for errors, begin with the Google and Bing webmaster tools reports. Look for duplicate content, then examine the webpage addresses themselves and their locations. Numerous third-party SEO tools can locate SEO issues as well.

Canonical issues, parameters that do not change page content, loose adherence to coding standards, or any number of other causes can create duplicate content.

Options for dealing with duplicate content include:

  • Reconfigure the content management platform to generate one consistent URL for each page of content.
  • 301 redirect duplicate URLs to the correct version.
  • Add canonical tags to webpages that direct search engines to group duplicate content and combine their ranking signals.
  • Configure URL parameters in webmaster tools and direct search engines to ignore any parameters that cause duplicate content.

I worked with a newspaper that used unique numerical identifiers, outside of parameters, to serve articles as webpages. It did not matter what the URL contained, as long as the identifier was somewhere in the address. Unfortunately, link hooks were written into templates inconsistently, resulting in thousands upon thousands of duplicate content pages. We had to pore over each template, rewrite each link hook as an SEO friendly URL, then catalog all the legacy URLs and 301-redirect them to the new optimized addresses.
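
The cataloging step in a cleanup like this can be sketched in a few lines of Python. This is not the newspaper's actual system; the six-digit identifier pattern, the sample paths, and the `CANONICAL` lookup table are all assumptions made for illustration.

```python
import re

# Assumption: article IDs are runs of six or more digits anywhere in the path.
ARTICLE_ID = re.compile(r"\d{6,}")

# Hypothetical catalog mapping each article ID to its one optimized URL.
CANONICAL = {
    "204518": "/news/local/city-council-approves-budget",
}

def redirect_target(legacy_path):
    """Return the canonical path for a legacy URL, or None if no known ID is found."""
    match = ARTICLE_ID.search(legacy_path)
    if match is None:
        return None
    return CANONICAL.get(match.group())

legacy_paths = [
    "/article/204518/",
    "/story.html/204518/print",
    "/news/204518-city-council",
    "/about-us",
]

# Build the 301 map: legacy path -> canonical path, skipping paths with no match.
redirects = {}
for path in legacy_paths:
    target = redirect_target(path)
    if target is not None:
        redirects[path] = target

for source, target in redirects.items():
    print(f"301 {source} -> {target}")
```

The resulting map can then be translated into whatever redirect mechanism your server or CMS supports.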

When auditing URL syntax, I prefer to export every webpage address into a spreadsheet or database. If you’re thinking about using Google site: queries, don’t bother; many of the issues you will look for do not appear in search results.

Search For Reserved & Unsafe Characters

Reserved Characters

[Image: table of reserved URL characters]

Each character has a specific use. Should they appear, determine if they are used properly, should be encoded, or if the URL needs reconfiguration.

Unsafe Characters

[Image: table of unsafe URL characters]

Encode unsafe characters unless used for a specific purpose. The % symbol does not require encoding when used to encode a character. The # symbol does not require encoding when used to mark a fragment (an in-page anchor).

Miscellaneous Characters

[Image: table of miscellaneous URL characters]

Strictly speaking, these characters do not require encoding. In reality, many CMS platforms will encode these automatically. If you want links that contain these characters to remain consistent when shared from website to website, it’s a safe bet to encode these.
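
A character audit over an exported list of addresses might look like the Python sketch below. The unsafe set follows RFC 1738, which I am assuming matches the unsafe category above; the sample path is made up.

```python
import re
from urllib.parse import quote

# Unsafe characters per RFC 1738 (an assumption about the category above).
UNSAFE = set(' <>"#%{}|\\^~[]`')

# A % followed by two hex digits is a valid escape, not a stray character.
VALID_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

def unsafe_chars(path):
    """Return the unsafe characters in a path, ignoring valid %XX escapes."""
    stripped = VALID_ESCAPE.sub("", path)
    return sorted(set(stripped) & UNSAFE)

def encode_path(raw_path):
    """Percent-encode a raw (not yet encoded) path, keeping '/' as the separator."""
    return quote(raw_path, safe="/")

raw = "/phones/samsung galaxy s3|16gb"
print(unsafe_chars(raw))   # [' ', '|']
print(encode_path(raw))    # /phones/samsung%20galaxy%20s3%7C16gb
```

Run `encode_path` only on paths that are not already encoded; applied twice, it would encode the % signs produced by the first pass.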

Search For The Pound Symbol, #

Search engines ignore the # and everything after it in a URL. If using the #, make sure the webpage appears as you want it crawled and indexed when the # and everything that follows is removed. If the # changes content you want indexed, you will need to find a different URL structure. For example,

  • /celebrities.html#bill-clinton
  • /celebrities.html#bette-davis
  • /celebrities.html#deadmau5

Based on these webpage addresses, let’s assume the webpages are all different. This will be a problem because search engines will index only /celebrities.html.

A better URL would be /celebrities/deadmau5 or /celebrities/bill-clinton.
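
You can preview how fragment URLs collapse with Python's standard `urldefrag` function. This is a small sketch using the three example addresses above, not a model of any particular crawler.

```python
from urllib.parse import urldefrag

urls = [
    "/celebrities.html#bill-clinton",
    "/celebrities.html#bette-davis",
    "/celebrities.html#deadmau5",
]

# urldefrag splits each address into (url_without_fragment, fragment).
without_fragments = {urldefrag(u).url for u in urls}
print(without_fragments)  # {'/celebrities.html'}
```

Three addresses reduce to one, which is the only one left to index.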

Search For Underscores, _

Underscores, while legal, are problematic for SEO. It’s an issue search engines have always dealt with but never solved. Search engines see underscores as connectors. To separate words, use dashes.

So, in practical terms, while hello-dolly is hello dolly, hello_dolly is hello_dolly.

Always use dashes, -, to separate words.
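
Regular expressions happen to treat the underscore the same way, which makes for a handy illustration. The Python sketch below is an analogy only, not a description of how any search engine tokenizes URLs; the `hyphenate` helper is a hypothetical name.

```python
import re

def words(slug):
    """Split a slug into words; \\w includes the underscore, so '_' joins and '-' splits."""
    return re.findall(r"\w+", slug)

def hyphenate(slug):
    """Replace each run of underscores with a single dash."""
    return re.sub(r"_+", "-", slug)

print(words("hello-dolly"))      # ['hello', 'dolly']
print(words("hello_dolly"))      # ['hello_dolly']
print(hyphenate("hello_dolly"))  # hello-dolly
```

If you rename existing URLs this way, remember to 301-redirect the old underscore versions.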

Search For Mixed Case

URLs are, in general, case-sensitive (with the exception of the scheme and machine names). Mixed case URLs can be a source of duplicate content. These are not the same URL:

  • https://example.com/Hello-Dolly
  • https://example.com/hello-dolly

The easiest way to deal with mixed case URLs is to have your website automatically rewrite all URLs to lower case. With this one change, you never have to worry if the search engines are dealing with it automatically or not.

Another great reason to rewrite all URLs to lower case is it will simplify any case sensitive SEO and analytics reports. That alone is pure gold.
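
To spot case-only duplicates in an exported URL list, a short Python sketch like this one may help. The function names are hypothetical, and the choice to leave the query string untouched is an assumption, since parameter values can be legitimately case-sensitive.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit

def lowercase_url(url):
    """Lowercase the scheme, host, and path; leave query and fragment as they are."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(
        scheme=parts.scheme.lower(),
        netloc=parts.netloc.lower(),
        path=parts.path.lower(),
    ))

def case_duplicates(urls):
    """Group URLs that differ only by case; return groups with more than one member."""
    groups = defaultdict(list)
    for url in urls:
        groups[lowercase_url(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}

exported = [
    "https://example.com/Hello-Dolly",
    "https://example.com/hello-dolly",
    "https://example.com/about",
]
print(case_duplicates(exported))
```

Each group's key is the lower case address the other members should redirect to.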

Check Your CMS Platform’s Settings

A major clothing retailer I worked with uses a popular enterprise retail CMS. When this client came on board, it had some of the nastiest URLs I’d ever seen. I wanted to blame the CMS, except other retailers using the same platform had gorgeous webpage addresses.

In our CMS audit, we found that the client left the optimized URL field blank, leaving the CMS to default to non-optimized, sketchy addresses.

Creating Optimized URLs

How you get from your present URL structure to an SEO optimized one depends on your content management system.

  • In WordPress, the administrator selects the permalink structure and defines and creates category slugs. Writers can edit the slug that will become an article or page URL (if the slug is part of the permalink structure).
  • In some CMS programs you can create almost any URL structure you want by placing links into templates. You just include the correct page identifier and pull from accompanying variables (e.g., https://domain.com/{article_category}/{article_id}/{article_slug}).
  • Your developer may have to create dynamic URL rewrites.
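
As a rough illustration of the template approach, the Python sketch below fills the {article_category}/{article_id}/{article_slug} pattern from article data. The `slugify` and `article_url` helpers and the sample values are hypothetical; a real CMS will have its own templating syntax.

```python
import re

def slugify(text):
    """Lowercase the text and join its ASCII letter and digit runs with dashes."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))

def article_url(category, article_id, title, domain="domain.com"):
    """Build https://{domain}/{article_category}/{article_id}/{article_slug}."""
    return f"https://{domain}/{slugify(category)}/{article_id}/{slugify(title)}"

print(article_url("Local News", 204518, "City Council Approves Budget"))
# https://domain.com/local-news/204518/city-council-approves-budget
```

This simple `slugify` drops accented and non-Latin characters entirely, so sites publishing in other languages would need a transliteration step first.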

To find the best path for your business, speak with your Web developer or CMS vendor.

Meet The Minimum Syntax Standard

Not all websites can optimize their URLs; some CMS platforms will not support it. However, no website should use illegal URLs. If you cannot optimize, make certain that you at least meet the standard.

Image credit: SEOmoz Anatomy of a URL

Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.


About the author

Tom Schmitz
Contributor
Thomas Schmitz is a longtime digital marketing professional who works with startups, SMBs, enterprise, media and not-for-profit organizations. Regarded as an expert in inbound and content marketing, search engine optimization and social media, Tom's an innovative growth creator and turnaround specialist. Follow Tom at @TomSchmitz.
