Cutting Through The Confusion Of Google’s Guidance To Multilingual Website Owners
Google speakers at both the International Search Summit San Jose and SMX West went to town on how multilingual website owners should proceed when geo-targeting their sites using the canonical and hreflang tags.
Susan Moskwa provided a very helpful session on the Monday at the International Search Summit event and was followed by Maile Ohye at SMX West (with Susan’s support from the audience).
Suddenly, A Lot Of Code To Implement
Why all this effort? Well frankly, not many have been adopting Google’s latest advice. Reason? Not many actually understand what it’s all about.
The reaction from many delegates at the Summit was that few webmasters would actually have the capacity to implement the code as Google suggests.
Strong opinions were expressed by some, especially my fellow columnist and regular ISS speaker Bill Hunt, who firmly expressed his view that “Panda is going out and making decisions about content before the normal indexing processes are taking account of geo-targeting settings. This has been significantly and negatively affecting global websites”.
Nothing To Do With Panda!
Google has firmly denied that the hreflang tag is a response to Panda, both in conference sessions and recently when I quizzed UK-based Google engineer Pierre Farr. The hreflang tag, according to Google, has absolutely nothing to do with Panda.
Regardless of the debate over the background to the new tags, it is the case that Google is launching and promoting many more tags for various purposes.
I have joked to people that SEOs may need to change their job titles in future to “Google Markup Manager”, as just keeping tabs on all the tags and microformats will require some in-depth training. Nor will there be time to spend on other SEO matters!
So here’s the simplest way I can present the guidance from the sessions! If I’ve over-simplified, feel free to pick me up on the details in the comments!
Originally, the canonical tag was added to enable publishers to identify that content was, indeed, duplicate – perhaps because of session parameters in the URL string or because a site had multiple URL routes to the same content.
The canonical tag basically says, “Yes this content is the same, but this is the single URL I’d like to represent this content despite the various forms in which I present it.”
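As a rough illustration of that idea, here is a minimal Python sketch that maps several URL variants of one page to a single canonical link element. The example.com URLs and the session parameter names are assumptions made up for the example; a real site would substitute its own.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that track the visit but do not change the content.
SESSION_PARAMS = {"sessionid", "sid"}


def canonical_url(url: str) -> str:
    """Drop session-style parameters so every variant maps to one URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # Rebuild the URL without the dropped parameters and without any fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Build the link element that names the single preferred URL."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}" />'


# Three routes to the same content (made-up URLs).
variants = [
    "http://www.example.com/shoes?sid=abc123",
    "http://www.example.com/shoes?sessionid=xyz",
    "http://www.example.com/shoes",
]
tags = {canonical_tag(u) for u in variants}
print(tags)
```

All three variants produce the same tag, which is the point: however the visitor arrived, the page names one URL to represent the content.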
The above remains true. Google has confused users by repeatedly saying that you can use the canonical with the hreflang tags. Whilst this is true, the canonical should not point to a different URL where the language of that URL is different.
There is a possible exception to this which is when a page is translated dynamically on the fly by a tool such as Google Translate, which would mean that Googlebot would see the original content – not that which was translated for the user.
The canonical should also be used even where the templates are in different languages but the main content is the same.
For years, we’ve been saying that the geo-selector of a website is a critical component in ensuring its success. With the hreflang tags, Google is effectively asking website owners to include, within the code of each page, directions to all the related content for other countries.
This means that for an 80-country site, there would indeed be 80 lines of extra code sitting in the head of each page.
The purpose for Google is that it then has a clearer idea that certain content is related and should not be shown at the same time in results.
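To give a feel for what those extra lines look like, here is a small Python sketch that generates one hreflang link element per country version. The four versions and their example.com URLs are hypothetical; an 80-country site would simply have 80 entries in the same table.

```python
from html import escape
from typing import Dict, List

# Hypothetical country versions: hreflang value -> URL of that version.
VERSIONS: Dict[str, str] = {
    "en-us": "http://www.example.com/us/",
    "en-gb": "http://www.example.com/uk/",
    "es-es": "http://www.example.com/es/",
    "es-mx": "http://www.example.com/mx/",
}


def hreflang_tags(versions: Dict[str, str]) -> List[str]:
    """Return one link element per language/country version, in a stable order."""
    return [
        f'<link rel="alternate" hreflang="{escape(code, quote=True)}" '
        f'href="{escape(url, quote=True)}" />'
        for code, url in sorted(versions.items())
    ]


block = hreflang_tags(VERSIONS)
print("\n".join(block))
```

Generating the block from one shared table, rather than editing each page by hand, is one way to keep the lines consistent across every version of the page.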
The Effect Of Combining Canonical Tags & Hreflang Tags
Bearing in mind that the canonical tag should only be used with content in the same language, when would we use both?
Well firstly, the use of both would involve what I usually call world languages such as English, Spanish, French or Portuguese. These languages are used in many countries and, whilst there are variations between the use of these languages in those countries, the variations are sometimes small.
Additionally, multinational publishers often save costs by using one version of the language for all countries speaking that general language, thus ignoring the regional variations. In other words, for Spain and Mexico, Google is presented with exactly the same content, letter for letter.
The canonical acknowledges that this is the same content. The Hreflang tag identifies which URL should be displayed in different sets of results.
So, in other words, canonical + Hreflang = same content + different URL.
Google knows the content is the same, but displays the correct URL for the Google domain being searched (e.g. searchers on google.com.mx will see the relevant URLs for Mexico displayed in the results).
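The Spain and Mexico case can be sketched in Python as follows. This is only an illustration of the combination as described in this article, not a statement of how Google processes the tags; the example.com URLs and the choice of the Spanish page as the preferred version are assumptions.

```python
from html import escape
from typing import Dict, List


def head_tags(preferred_url: str, regional_urls: Dict[str, str]) -> List[str]:
    """Canonical names the one preferred copy; hreflang lists the URL per market."""
    tags = [f'<link rel="canonical" href="{escape(preferred_url, quote=True)}" />']
    for code, url in sorted(regional_urls.items()):
        tags.append(
            f'<link rel="alternate" hreflang="{escape(code, quote=True)}" '
            f'href="{escape(url, quote=True)}" />'
        )
    return tags


# Hypothetical URLs serving letter-for-letter identical Spanish content.
SPANISH = {
    "es-es": "http://www.example.com/es/",
    "es-mx": "http://www.example.com/mx/",
}

# Assumption for the example: the Spain URL is the preferred copy.
tags = head_tags(SPANISH["es-es"], SPANISH)
print("\n".join(tags))
```

The output is one canonical line saying "this is the same content" followed by one hreflang line per market saying "but show this URL here".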
With Dot Coms Or Local Domains?
The simple answer is that the use of Hreflang and canonical tags applies to both local domains and dot coms, though Google’s examples tend to show dot coms.
By the way, neither the canonical nor the hreflang tag has a direct impact on ranking – canonicals do not share the link equity of the domestic market with the new markets targeted.
It’s A Mystery
Bear in mind what the Google team clearly says: that you don’t need to identify the language, because they use language detection for that purpose; that you don’t need to indicate the location for local domains; that you can set the geo-location of a dot com in Webmaster Central; and that they can figure out which is the most important content. So why the heck is this rather intensive tagging needed all of a sudden?
I’m sorry Google, but as someone who spends most of their time advising the owners of multilingual websites on how best to manage their content, there is only one reason which rationalizes the whole thing.
Bearing in mind that “Panda” is a process, separate from the indexing and ranking of sites, which examines the quality of content and throws away poor-quality content and sites, this is a rather clumsy Panda fix.
Perhaps Google Will Explain The Why?
Canonicals and Hreflang tags are visible on the page to Panda and say “Please leave me in – I’m not just a duplicate and have a specific local market purpose and this is the market.”
Many large websites rely on machine translation (not a good solution for SEO at any time) and they are particularly affected by Panda.
Google, if you disagree with me, please explain why all of this extra coding is suddenly needed. Or simply come clean and explain why you need us all to do this for Google and then we’ll all understand things more clearly.
Both Pierre Farr and John Mueller of Google are speaking on this subject at the Munich and London International Search Summits. We look forward to developing this discussion with them!