Search Engine Land

Are Your Language Detection Methods Blocking Search Engine Crawlers?

Bill Hunt on November 24, 2009 at 1:30 pm

At a recent international search marketing conference in London, the most frequent question asked by the audience was “How do I get my content found and indexed by global and local search engines?”

During the breaks I talked to a few people who indicated little or none of their local market content was being indexed by the major engines. Close examination of these sites revealed that they were all using some form of language detection. In two cases they were doing language detection because they saw Google doing it and assumed that this was the best approach.

However, if you are like me and travel a lot to other countries, you know that assumption can lead to a big problem: Just because I am physically located in a particular country doesn’t necessarily mean that its native language is the content I wish to see. There are other factors that come into play, such as my browser default language, the language I use for queries and so on.

Let’s take a deeper look at dynamic language and location detection and explore some of the things you should do to make the process work better.

What is your default dynamic language response?

Browser-level language detection is the most popular method of determining a language preference. Your web server simply looks at the language preference the visitor’s browser submits in the Accept-Language header, then locates and serves the content that matches that language code. For example, a person who downloads the French-language version of Firefox will typically have their default language preference set to “French” or “French_France” (sent in the header as fr or fr-FR). When they visit your site, the server will read the preference and automatically redirect the visitor to the French version of the site.
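As a rough illustration of what the server does with that header, here is a minimal Python sketch. The function names and the set of supported languages are my own assumptions for the example, not part of any particular web server, and a production site would normally rely on its framework’s built-in negotiation instead.

```python
def parse_accept_language(header):
    """Return the language tags in an Accept-Language header, most preferred first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without a q-value gets full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        prefs.append((tag.strip().lower(), quality, position))
    # Highest weight first; ties keep the order the browser sent.
    prefs.sort(key=lambda pref: (-pref[1], pref[2]))
    return [tag for tag, quality, _ in prefs if quality > 0]


def pick_language(header, supported, default=None):
    """Pick the first preferred language the site offers, or the default."""
    for tag in parse_accept_language(header or ""):
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # "fr-fr" falls back to "fr"
        if primary in supported:
            return primary
    return default
```

With a hypothetical site that offers en, fr and de, a French Firefox header such as "fr-FR,fr;q=0.9,en;q=0.8" resolves to fr, while a request with no header at all resolves to whatever default is passed in.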

While using the Accept-Language header can be a good starting point for determining the language of the user, it is often misused to “assume” their location. There can be many advantages to determining a searcher’s language preference: serving local content, choosing the local currency, or even formatting phone numbers in a way that suits the visitor. But there are also potentially catastrophic implications for your search marketing program if you default to browser language preference without considering other factors.

The problem for search marketers is that most search engine crawlers do not send an Accept-Language header and therefore express no language preference. Because crawlers do not send a preference when requesting pages, they are served the “default” language of the server. Do you know the default language of your server? Many web servers, especially .NET and IIS Server, serve English by default, meaning search engine crawlers will only see the English-language version of your site, regardless of how much content you have in other languages.
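To make the failure concrete, the sketch below contrasts two routing rules. Everything in it is hypothetical: the /en/, /fr/, /de/ URL layout, the English default, and both function names are assumptions for illustration, and the header parsing is deliberately simplified to the first listed language. The second rule is only one possible way to soften the problem, not a recommendation from any search engine.

```python
SUPPORTED = {"en", "fr", "de"}  # hypothetical language versions of the site
DEFAULT_LANGUAGE = "en"         # hypothetical server default


def primary_language(accept_language):
    """First language in the header reduced to its primary subtag, or None."""
    if not accept_language:
        return None  # crawlers typically end up here
    first = accept_language.split(",")[0].split(";")[0].strip().lower()
    return first.split("-")[0] or None


def naive_route(path, accept_language):
    """Always redirect to the detected language, falling back to the default."""
    lang = primary_language(accept_language)
    if lang not in SUPPORTED:
        lang = DEFAULT_LANGUAGE
    return "/" + lang + "/"


def safer_route(path, accept_language):
    """Honour a URL that already names a language; only guess elsewhere."""
    first_segment = path.strip("/").split("/")[0]
    if first_segment in SUPPORTED:
        return path  # the visitor or crawler asked for this version explicitly
    lang = primary_language(accept_language)
    return "/" + (lang if lang in SUPPORTED else DEFAULT_LANGUAGE) + "/"
```

A crawler requesting /fr/produits with no header is sent to /en/ by naive_route, so the French page is never seen, whereas safer_route returns /fr/produits unchanged.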

Is your IP detection default location keeping crawlers from finding your local content?

All of those problem sites I encountered at the conference were using IP location detection scripts. Essentially, these scripts receive the IP location of the site visitor and serve them predetermined content based on the country and/or city where the visitor has connected to the Internet. For example, I am writing this article while in Berlin, Germany. When I go to Google.com, the server detects I am in Germany, routes me to Google.de and presents the home page in German. As a native English speaker, I have to take steps to counter this and select Google in English to get to the content that I need. This kind of detection is a major problem for search engine crawlers.

The problem is that most search engine crawlers crawl from a specific country location. While I have seen Google crawlers occasionally come from Zurich, they are primarily crawling from the main data center in Mountain View, California. Because of the crawler locations, no matter where they hit your site, the detection scripts will only route them to English or US-centric content, again making content in other languages and countries invisible to the crawlers.

Testing the defaults and making exceptions

To truly understand what your servers are doing you need to test them so that you are confident that they are serving the right content for all situations, especially to the crawlers. Just asking your IT team what is happening is never enough proof of the right settings.
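Part of this testing can be scripted. The following is a minimal sketch using only Python’s standard library: it requests the same URL with several header combinations and reports where each request ends up after redirects. The list of variants is my own assumption and should be adapted to your markets. Note that it only varies the headers, so it cannot show what a visitor or crawler connecting from another country’s IP address would see.

```python
import urllib.error
import urllib.request

# Hypothetical test matrix: label -> request headers to send.
VARIANTS = {
    "no language header (crawler-like)": {},
    "French browser": {"Accept-Language": "fr-FR,fr;q=0.9"},
    "German browser": {"Accept-Language": "de-DE,de;q=0.9"},
    "Googlebot user agent": {
        "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                      "+http://www.google.com/bot.html)"
    },
}


def build_requests(url):
    """Create one request object per variant without sending anything."""
    return {
        label: urllib.request.Request(url, headers=headers)
        for label, headers in VARIANTS.items()
    }


def final_urls(url, timeout=10):
    """Fetch the URL once per variant and record the URL after any redirects."""
    results = {}
    for label, request in build_requests(url).items():
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                results[label] = response.geturl()
        except urllib.error.URLError as error:
            results[label] = "error: " + str(error.reason)
    return results
```

Calling final_urls on a local-language page of your own site and comparing the results per variant shows quickly whether a request without a language header is being pushed to the default version.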

The most reliable test is to have co-workers or customers visit the sites from various locations, with different language settings turned on and off, to see what happens in each situation. You should check not only the default settings of your server but also any subscription services your company may deploy, such as Akamai’s Global Traffic Management IP Intelligence or cyScape’s CountryHawk IP detection solution. For both of these tools, as well as any other scripts found on the web, you need to ensure that the redirection rules include exceptions for the user agent names of the major search crawlers, so that the crawlers can access the page they requested.
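The exception logic described above might look something like the sketch below. It is not how Akamai’s or cyScape’s products are configured; the crawler token list, the country-to-path table and the function names are all assumptions for illustration, and the country code is taken as an input because the IP lookup itself depends on whichever service you use.

```python
# Illustrative substrings found in the user agents of major crawlers.
# This list is an assumption and needs to be kept current for your markets.
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp", "baiduspider", "yandex")

# Hypothetical mapping from a visitor's country code to a local home page.
COUNTRY_HOME = {"DE": "/de/", "FR": "/fr/", "US": "/en/"}


def is_search_crawler(user_agent):
    """True if the user agent string contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def resolve_path(requested_path, visitor_country, user_agent):
    """Return the path to serve: crawlers always get exactly what they asked for."""
    if is_search_crawler(user_agent):
        return requested_path
    # Other visitors are routed by country; unknown countries are not redirected.
    return COUNTRY_HOME.get(visitor_country, requested_path)
```

With this rule, a Googlebot request for /fr/produits arriving from a US address is served /fr/produits, while a browser in Germany requesting the same page is still routed to /de/.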

Finally, you should develop local language/country XML sitemaps for each local version of the site and register them with the search engines so that the crawlers have a direct way to access the pages and index your content.
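As a starting point, one sitemap per local version can be generated with a few lines of Python. This is a sketch under assumptions: the sitemap-<locale>.xml file naming, the example.com URLs and the function names are hypothetical, and only the required loc element of the sitemaps.org format is written.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document listing the given URLs."""
    ET.register_namespace("", SITEMAP_NS)  # write the namespace without a prefix
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(entry, "{%s}loc" % SITEMAP_NS).text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def build_local_sitemaps(pages_by_locale):
    """Map a hypothetical file name per locale to its sitemap XML."""
    return {
        "sitemap-%s.xml" % locale: build_sitemap(urls)
        for locale, urls in pages_by_locale.items()
    }


if __name__ == "__main__":
    sitemaps = build_local_sitemaps({
        "fr-fr": ["https://www.example.com/fr/", "https://www.example.com/fr/produits"],
        "de-de": ["https://www.example.com/de/"],
    })
    for name, xml_text in sitemaps.items():
        with open(name, "w", encoding="utf-8") as handle:
            handle.write(xml_text)
```

Each generated file can then be submitted separately, which gives the crawlers a list of local URLs that does not depend on any language or location detection.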


Opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.



About The Author

Bill Hunt
Bill Hunt is currently the President of Back Azimuth Consulting and co-author of Search Engine Marketing Inc. His personal blog is whunt.com.

Related Topics

Channel: SEO
