It’s Not Just Google That Treats Underscores Like Dashes

Last week’s news that Google is now treating underscores in URLs as word separators, as it does with hyphens, quickly spread through the SEO and webmaster communities. But what about the other search engines?

I immediately contacted them to find out how they treat underscores and hyphens. Finally, the results are in. Yahoo and Microsoft (and now also Ask.com), the other two of the big three, confirmed that they do treat underscores the same as dashes or hyphens in the URL.

Let me step back and explain this a bit more.

Some SEOs believe that the keywords in the URL of a page have some limited impact on the ranking of that page in the search engines. So if you sold blue widgets, and you had a page at www.domain.com/blue-widgets.html, those keywords are sometimes perceived to help, all other ranking factors being equal.

In the past, Google treated hyphens but not underscores in a URL as a word separator. So in our example above, the blue-widgets part would be seen as two different words: blue & widgets.

If it were like this, blue_widgets, then Google would have seen it as one single word: blue_widgets.
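To make the difference concrete, here is a minimal sketch of the two behaviors. It is only an illustration: the function name slug_tokens and its separators parameter are made up for this example, and nothing here reflects how any search engine actually implements URL tokenization.

```python
import re

def slug_tokens(slug, separators="-"):
    """Split a URL slug into lowercase words on the given separator characters.

    Hypothetical helper for illustration only; not any search engine's real code.
    """
    # Build a character class such as [\-] or [\-_] from the separator string.
    pattern = "[" + re.escape(separators) + "]+"
    return [token for token in re.split(pattern, slug.lower()) if token]

# Hyphen is the only separator: the underscore version stays one word.
print(slug_tokens("blue-widgets"))        # ['blue', 'widgets']
print(slug_tokens("blue_widgets"))        # ['blue_widgets']

# Hyphen and underscore are both separators: both versions give two words.
print(slug_tokens("blue_widgets", "-_"))  # ['blue', 'widgets']
```

With the hyphen as the only separator, blue_widgets survives as a single token, which matches the older behavior described above. Adding the underscore to the separator set makes both forms yield the same two words.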

Now Google treats underscores the same way as hyphens. As for Microsoft, Ramez Naam told me:

We treat underscores as word separators in URLs. Always have.

Priyank Shanker Garg from Yahoo told me:

For URL tokenization (separating words in URLs), we treat dashes or underscores identically, but these are not our only tokens and we take a more general approach to finding words in URL.

I also asked Ask.com, but they’ve yet to send a reply.

Postscript: Peter Linsley of Ask.com has now responded. Ask.com also treats underscores as word separators:

For the record, we also treat underscores as word separators in URLs.

Postscript: We have an update from Google’s Matt Cutts: “Actually, Dashes Aren’t The Same As Underscores Yet.” We will keep you posted on this.

About The Author: Barry Schwartz is Search Engine Land's News Editor and owns RustyBrick, a NY based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry's personal blog is named Cartoon Barry.

  • http://www.trulybored.com gamermk

    Somehow I think that we are going to see the number 301 used more in the next month than it has been used in the past year…

  • jimbeetle

    So in our example above, http://www.domain.com/blue_widgets.html, would be seen as one word, or as “bluewidgets.”

    Almost, but…Google actually looked at it as “blue_widgets,” including the underscore. See Matt Cutts’ somewhat nerdy take on it from a couple of years ago.

  • http://www.naturalsearchblog.com Silver

    jimbeetle, it should be noted that Google is constantly evolving, so Matt’s earlier advice in this instance may no longer apply.

    He’s the one who revealed that Google is now treating underscores as word-breaks.

    I’d guess that this change was made to improve performance for the majority of users. There’s lots of documents where “word1_word2” should be made to be more relevant to user searches for “word1 word2”, but relatively few users who would type in “word1_word2”. So, this change makes sense, and Google is king at doing what provides best usability.

  • http://www.seo-blog.com MichaelDuz

    To the very few of us that continually conduct our own experiments this of course has been known for some time; Keywords in URLs and URLs (Update)

    These experiments also show if the keywords are actually indexed and if concatenated keywords in urls are recognized.

    You can’t beat testing a hypothesis yourself…. :)

    - Michael

  • http://www.dannedelko.com Dan Nedelko

    This is not much in the way of news. As others have commented this tokenization has been around for quite some time.

    The rule here is if you’re in the search marketing game, you should have known this for quite some time.

  • jimbeetle

    jimbeetle, it should be noted that Google is constantly evolving, so Matt’s earlier advice in this instance may no longer apply.

    I was referring to the way G looked at “blue_widgets” in the past. It saw it as “blue_widgets” and not as Barry stated, “bluewidgets”.

  • http://www.psymple.com Asia

    It’s interesting how we all jump when Google finally makes this decision, yet Yahoo has always recognized the underscore as separation. This was approached at last year’s PubCon. I specifically asked the question regarding underscores vs dashes and Matt responded that I should do nothing as Google would implement the separation in the near future. Yahoo’s Tim Meyer stated it was already noticed on their end. I’m not a fan of Yahoo, but I prefer to give credit where credit is due.

    I advise that we should all take out a one page blog post on how Yahoo beat Google in the Underscore War.

  • http://www.naturalsearchblog.com Silver

    One reason one might still opt for going with dashes instead of underscores could be that there are likely 2nd-tier and 3rd-tier search engine sites which don’t treat the underscores as white-space characters. Having those less-important sites linking to you is useful, and if they don’t place your links on nice, semantically-related pages, it doesn’t get you as much.

    Also, it should be noted that Google still gives different search results if you search for “blue_widgets” versus “blue widgets”. They apparently give exact-match priority to searches including the underscores.

    So, the functionality that Matt liked which included exact-match of terms including underscores is really still supported, while they now also do good relevancy for the multi-word search term cases.

  • http://www,igorthetroll.com Igor The Troll

    Sorry to do this but Google is lying!

    Google still treats underscores as one word!

    You may not like the url as the example but the reason for the url is because Google and especially Adam Lasnik lie…

    http://www.google.com/search?hl=en&q=adam_lasnik_the_google_drag_queen&btnG=Search

    The cache date is July 30

    http://72.14.235.104/search?q=cache:fDezWvI9tN8J:www.geekentertainment.tv/2007/07/18/dontcha-wish-your-cell-phone-was-hot-like-me/feed/+adam_lasnik_the_google_drag_queen&hl=en&ct=clnk&cd=1

    Comments on: Dontcha Wish Your Cell Phone Was Hot Like Me?iPhone nay but Google yea! Check out Adam Lasnsnik The Google Drag Queen! http://www.igorthetroll.com/Adam_Lasnik_The_Google_Drag_Queen.jpg
    http://www.geekentertainment.tv/2007/07/18/dontcha-wish-your-cell-phone-was-hot-like-me/feed/

    This one shows the result for the keywords!
    ———————————————
    But this one does not show!

    http://www.google.com/search?q=adam+lasnik+the+google+drag+queen&hl=en&pwst=1&start=30&sa=N&filter=0

    So this is a lie again and again by Google.

  • http://www.igorthetroll.com Igor The Troll

    Thanks for posting the comments.

    Looks like the blog is being taken down
    http://www.geekentertainment.tv

    Is it going the way of
    http://www.threadwatch.org

    Is big G now a bad G

    I just spoke with Randfish at seomoz,org and he told me, “That Matt C, said to him that the underscore equal dash fix is new and will take time to propagate.”

    Wag of a finger at Matt, he should have waited to make the announcement of the fix until the changes have taken place…

    We need to check if the change does happen or some more B.S. on Google part.

    Google been a bad boy lately, lots of lies and hiding stuff.

    Will fill you in on more hot stuff later…

    Igor

  • WebmasterT

    I think that Matt’s statement could mean that the algo doesn’t always treat the – and _ the same. The reason being, _ are not permitted in domain names. For instance Google has been rumored to parse urls. So IMO, the domain parse may just remove the hyphen so it can use the same function to parse all domains. Then when it encounters hyphens in a file or folder name it replaces the – with a space.

    There is also good reason to believe that if URLs are parsed, then / = ? ; . could also be used as delimiters for keywords or for analyzing particular URI tokens within the parsed URL, i.e. the protocol, domain and TLD being analyzed (possibly used in determining trust). IMO, there is no reason not to assume the characters in the list above are either removed or replaced with a space. FWIW, it is likely that making this assumption has little or no downside, and if correct it has huge upside. Isn’t that what we are paid to know and figure out? Which makes any comment from Matt, though interesting, inconsequential to the implementation of the strategy. IME, when analyzing and parsing URIs, the algo has to deal with or use this character list in order to delimit and analyze the URI tokens and keyword matches. When you try to actually do it using regex it becomes apparent these are the keys to analyzing and parsing URIs for keyword matches.
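For anyone who wants to try the regex exercise WebmasterT describes, a rough sketch follows. Everything in it is an assumption made for illustration: the name url_keyword_tokens is invented, the delimiter set is the commenter's list (/ = ? ; .) plus the hyphen and underscore, and dropping hyphens from the host name follows the commenter's guess rather than any documented search engine behavior.

```python
import re
from urllib.parse import urlsplit

# Assumed delimiter set: the commenter's list (/ = ? ; .) plus hyphen and underscore.
DELIMITERS = re.compile(r"[/=?;.\-_]+")

def url_keyword_tokens(url):
    """Break a URL into scheme, host labels and path/query keyword tokens.

    Hypothetical helper for illustration only; not a search engine's algorithm.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Per the commenter's guess, hyphens inside the host are simply dropped.
    host_tokens = [label.replace("-", "") for label in host.split(".") if label]
    # Path and query are split on every assumed delimiter.
    rest = parts.path
    if parts.query:
        rest += "?" + parts.query
    path_tokens = [token for token in DELIMITERS.split(rest.lower()) if token]
    return {"scheme": parts.scheme, "host": host_tokens, "path": path_tokens}

print(url_keyword_tokens("http://www.domain.com/shop/blue_widgets.html?color=dark-blue"))
```

For the sample URL, the path and query come out as shop, blue, widgets, html, color, dark and blue, while the host is split into www, domain and com.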
