Can You Now Trust Google To Crawl Ajax Sites?
On October 14, Google announced it no longer recommends the Ajax crawling scheme they published in 2009. Columnist Mark Munroe dives into the question of whether this means you can now count on Google to successfully crawl and index an Ajax site.
Web designers and engineers love Ajax for building Single Page Applications (SPA) with popular frameworks like Angular and React. Pure Ajax implementations can provide a smooth, interactive web application that performs more like a dedicated desktop application.
Back in 2009, Google came up with a solution to make Ajax crawlable. That method either creates “escaped fragment” URLs (ugly URLs) or, more recently, clean URLs paired with a meta fragment tag (`<meta name="fragment" content="!">`) on the page.
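To make the scheme concrete, here is a minimal sketch of the URL translation it defined. The helper name is my own for illustration; the mapping itself (hash-bang fragment to an `_escaped_fragment_` query parameter) is what the 2009 proposal specified.

```javascript
// Sketch of the 2009 Ajax crawling scheme's URL mapping (hypothetical
// helper, not Google's code): a "pretty" URL with a #! fragment is
// translated by the crawler into an "ugly" _escaped_fragment_ URL that
// the server can answer with a static HTML snapshot.
function toEscapedFragmentUrl(prettyUrl) {
  const bangIndex = prettyUrl.indexOf('#!');
  if (bangIndex === -1) return prettyUrl; // no hash-bang: nothing to translate
  const base = prettyUrl.slice(0, bangIndex);
  const fragment = prettyUrl.slice(bangIndex + 2);
  const separator = base.includes('?') ? '&' : '?';
  // Special characters in the fragment value are percent-encoded.
  return base + separator + '_escaped_fragment_=' + encodeURIComponent(fragment);
}

// Clean URLs (no #!) opt in instead with the meta tag in the <head>:
//   <meta name="fragment" content="!">
// which tells the crawler to fetch url + '?_escaped_fragment_='.
console.log(toEscapedFragmentUrl('http://example.com/recipes#!/pasta/42'));
// → http://example.com/recipes?_escaped_fragment_=%2Fpasta%2F42
```

Either way, the burden is on the site to detect the escaped-fragment request and serve a pre-rendered snapshot in response.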
A popular food site switched to Angular, believing that Google could crawl it. They lost about 70 percent of their organic traffic and are still recovering from that debacle. Ultimately, the site went to pre-rendering HTML snapshots, the recommended Ajax crawling solution at the time.
And then, on October 14, Google said this:
We are no longer recommending the AJAX crawling proposal we made back in 2009.
Note that they are still supporting their old proposal. (There have been some articles announcing that they are no longer supporting it, but that is not true — they are simply no longer recommending that approach.)
In deprecating the old recommendation, they seemed to be saying they can now crawl Ajax.
Then, just a week after the announcement, a client with a newly launched site asked me to check it out. This was an Angular site, again an SPA Ajax implementation.
Upon examining Google’s index and cache, we saw some partially indexed pages without all the content getting crawled. I reiterated my earlier recommendation of using HTML snapshots or progressive enhancement.
This site was built with Angular, which does not yet support server-side rendering (where the server initially renders the page so it can serve up a complete HTML document). That makes progressive enhancement difficult to support, so HTML snapshots are still the best solution for them.
She replied, “But why? Everything I read tells me Google can crawl Ajax.”
Can they? Let’s take a deeper look at the new recommendation in regard to Ajax.
Google’s New Ajax Recommendations
In explaining why they are deprecating the old recommendation, they say (emphasis mine):
We are generally able to render and understand your web pages like modern browsers.
Many people might be quick to conclude that they can now crawl Ajax without a problem. But look at the language: “generally able”? Would you bet your business revenue on the knowledge that Google is “generally able” to understand your page?
Could it be I am just picking on semantics? Let’s examine the announcement further. Later in their announcement, they state in regard to Ajax:
Since the assumptions for our 2009 proposal are no longer valid, we recommend following the principles of progressive enhancement.
I worried that I was perhaps overanalyzing Google’s words, but then…
John Mueller Confirms Google Still Has Trouble With Ajax
On October 27 (less than two weeks after the Google announcement), John Mueller, on his Webmaster Central Hangout, confirmed that Google indeed still has problems with Ajax.
You can view the exchange at around 1:08:00 into the video, where there was a question relating to a specific Angular implementation:
They still have trouble with rendering, though they expect to get better over time. John recommended some actions to help debug the issues.
Ultimately, he recommended using HTML snapshots until Google gets better at Ajax (yes, the very method that was just officially deprecated).
So, What To Do?
Progressive enhancement. Progressive enhancement requires server-side rendering, which Angular does not yet support (though the upcoming Angular 2.0 will). React, by contrast, supports server-side rendering today.
This is, however, more work than simply creating HTML snapshots. You need to make sure you render any required links so Google can crawl and index additional content that is loaded into the page.
Nevertheless, for sites using an Ajax framework, this would be my recommended approach. (And, of course, it is Google’s recommended approach.)
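The core idea can be sketched without any framework (the helper below is hypothetical, not Angular or React code): the server emits real HTML with real `<a href>` links, so content and navigation exist before any JavaScript runs, and client-side script then enhances the page into an SPA.

```javascript
// Hypothetical server-side render function illustrating progressive
// enhancement: every link is a crawlable href in the initial HTML,
// not a JavaScript-only click handler added after page load.
function renderRecipeList(recipes) {
  const items = recipes
    .map(r => `  <li><a href="/recipes/${r.id}">${r.title}</a></li>`)
    .join('\n');
  return `<ul id="recipes">\n${items}\n</ul>`;
}

const html = renderRecipeList([
  { id: 1, title: 'Pasta' },
  { id: 2, title: 'Soup' },
]);
console.log(html);
// Googlebot can follow /recipes/1 and /recipes/2 from the raw HTML,
// with no rendering of client-side JavaScript required.
```

A client-side script can later attach handlers to these same links to intercept navigation and load content via Ajax, without the crawler ever depending on that behavior.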
Pre-rendering HTML snapshots. Again, don’t be confused if you have heard or read that Google no longer supports this method. They will continue to support it for the foreseeable future. They are just no longer recommending it.
This method works; however, writing the code to pre-render and serve up the snapshots is not trivial. The good news is, there are several vendors out there such as prerender.io who will do the work for you at a relatively low cost. That is probably the simplest approach.
And if you use a platform that does not support server-side rendering, snapshots may be your only option.
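The serving side of the snapshot approach usually amounts to a simple decision: is this request from a crawler (or an escaped-fragment request from the old scheme)? The logic below is an assumed illustration of that decision, not the actual code any vendor such as prerender.io uses.

```javascript
// Hypothetical snapshot-routing check: serve the pre-rendered HTML
// snapshot to known crawlers or to _escaped_fragment_ requests, and
// serve the normal JavaScript application shell to everyone else.
const BOT_PATTERNS = [/googlebot/i, /bingbot/i, /baiduspider/i, /yandex/i];

function shouldServeSnapshot(userAgent, url) {
  if (url.includes('_escaped_fragment_')) return true; // 2009 crawling scheme
  return BOT_PATTERNS.some(pattern => pattern.test(userAgent || ''));
}

console.log(shouldServeSnapshot(
  'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
  '/recipes'
)); // → true
```

In a real deployment this check would sit in front-end middleware or the web server config, forwarding matched requests to a snapshot store or a pre-rendering service.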
Better Safe Than Sorry
Even if I had seen evidence that Google was consistently crawling Ajax sites, I would still be wary. It takes far more resources and much more time to fully render a page than to simply serve up HTML.
What will happen to sites with hundreds of thousands or millions of pages? How will it impact crawl budget? Will the crawl rate remain consistent?
Before recommending this approach, I’d rather wait and see strong evidence that Google can and does consistently crawl large, pure Ajax Single Page Applications with no negative impact on crawl rate, indexing and rankings. Please do share your own experiences.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.