SEO For Semantic Search Engines

A new generation of search engines is starting to become publicly available, so it’s time to start thinking about how it will affect SEO efforts.

The new search engines I’m talking about are the semantic search engines, meaning search engines that can be queried using natural language (not keywords, as when using Google). Behind the scenes, these search engines try to understand the meaning behind the text of web pages, so when you query them, they work out what your query means and find answers based on the meaning they’ve extracted. It’s all very neat, and there are many examples: Powerset (which Microsoft recently acquired), Hakia, True Knowledge, Cognition, and a few others.


Since these are still early days, I won’t be telling you how to do SEO for this breed of search engines—we can’t just yet for various reasons. Instead, I want to get you to start thinking about these search engines and what they mean for SEO in the future. I will illustrate my points with two simple examples.

While we’re at it, two quick notes. First, these search engines are definitely still in beta and are of the quality of mobile phones about 10 years ago: they work most of the time, but they break regularly enough to remind you that the technologies, and the companies behind them, are still in development. Second, and relatedly, their coverage is not great; in fact, Powerset and Cognition are developing their algorithms using only Wikipedia as their corpus.

On to our two examples:

Who built the empire state building?

In January I visited the Empire State Building in New York City for the first time. I wanted to learn more about the building, so I tried this natural language query at four engines: “Who built the empire state building?” Have a look at these result pages from each engine:

There is a lot to analyze, and it’s interesting to compare how different these results are, both from Google and from one another.

Hakia is confused but takes an excellent stab at answering the question by referring us to the Empire State Building’s official website. This is the second best feature of Hakia: it tries to answer you directly (I’ll illustrate Hakia’s best feature in the next example). Direct answers are the best innovation in search since PageRank, and it all started with Google’s calculator feature. So our first SEO point is this: how do we get to be the answer for this kind of functionality? As an aside, True Knowledge (mentioned above) is actually much better at this, in that it directly answers queries all the time, and (amazingly) can deduce facts it doesn’t know from other facts in its database (see their demo video or ping me if you would like an invite to the beta).

Powerset answers the question immediately using its Factz feature. The person responsible is apparently John J. Raskob. Clicking through to the Powerset result (which is a Powerset-hosted Wikipedia copy), we can see the highlighting of the sentence from which Powerset derived the answer. Very cool! Hakia and Cognition also do this highlighting of results.

Cognition’s highlighting in its first hit is probably the best, as it’s quite thorough. It also takes a wide interpretation of the verb “build,” highlighting sentences that talk about the designers and the construction company. Very, very neat, and it raises an interesting question, which is our second SEO point: if these search engines have a cached copy of our content so they can highlight the answers, will they ever send traffic to our sites? Think about it from the user’s point of view: why leave a search engine’s super-functional “result” page when you already have the answer?

Look at Google’s results: they are nowhere near as useful as Powerset’s or even Hakia’s best shot. Also, the number one hit does contain the right answer, but it seems to be number one because it exactly matches the query, not because Google understood the query or the page. Thus, our third SEO point: you can optimize a page to respond well to both keyword-based queries and natural language queries.

Next question:

Who makes Diet Coke?

This question is interesting because it has a specific answer and shows a range of responses from the search engines:

Google and Powerset win this round hands down: Google because it can understand synonyms and parse the structured data in Wikipedia, and Powerset because it understands the meaning properly. The highlights in Powerset’s result clearly show Powerset’s algorithm in action. It works really well.

What else can you do to start thinking about doing SEO for semantic search engines? Consider these points:

Did you notice how the highlights were for sentences or fragments? That tells me that these search engines are analyzing text in those terms. Another clue comes from Powerset measuring its analysis performance in seconds per sentence; it was at one second per sentence about a year ago. So when you write content optimized for them, think about how each sentence can answer a specific query related to the page. If content was previously king, it’s now emperor!

Think about reputation management. Is Coke good for you? Who is Bill Gates? (Answer at Hakia, and that’s their best feature; they call it the Gallery). Facts about brands and people are straightforward to discern. Imagine a semantic search engine highlighting a sentence that says “Coke has been shown to be bad for kids.” It doesn’t need to be accurate, but it’s an answer nonetheless.

Which raises this question: how will these search engines determine authority? Once they leave the confines of Wikipedia, which is reasonably accurate, and begin to index the full web, will they return sites in their results simply because a page happened to have the right sentence structure to answer a query? Sadly, from my experience, Hakia seems to be doing this already. Not being able to accurately identify authority could be deadly for the success of these semantic search engines.

What about links? We’ve been link building for years and so far not one of these search engines has talked about links! What on Earth will we be doing in the future? It could still be links as a way to measure authority, but it could be something else. Watch this space!
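To make the first point above a little more concrete (each sentence answering a specific query), here is a minimal Python sketch of how you might audit a page yourself. To be clear, this is not how Powerset, Hakia, or Cognition work: it only counts the content words a question and a sentence have in common, which is far cruder than real semantic analysis. The function names, the tiny stopword list, and the "Acme" sample page are all made up for illustration.

```python
import re

# Tiny illustrative stopword list (an assumption, not any engine's real list).
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "who", "what", "when",
    "where", "why", "how", "of", "to", "in", "on", "for", "and", "or",
    "by", "it", "its", "did", "does", "do",
}

def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def content_words(text):
    """Return the set of lowercase tokens that are not stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}

def best_sentence(page_text, question):
    """Return (sentence, score) for the sentence sharing the most
    content words with the question, or (None, 0) if nothing overlaps."""
    wanted = content_words(question)
    best, best_score = None, 0
    for sentence in split_sentences(page_text):
        score = len(wanted & content_words(sentence))
        if score > best_score:
            best, best_score = sentence, score
    return best, best_score

# Hypothetical page copy, invented for this example.
page = (
    "Acme Widgets was founded in 1999. "
    "Jane Doe built the first Acme widget in her garage. "
    "The company now ships worldwide."
)

print(best_sentence(page, "Who built the first Acme widget?"))
# ('Jane Doe built the first Acme widget in her garage.', 4)
```

If a question you care about comes back with `(None, 0)` or a low score, that is a hint that no single sentence on the page answers it directly. Note that the naive splitter will break on abbreviations and initials, and plain word overlap misses synonyms and word forms ("makes" versus "made"), which is exactly the gap the semantic engines are trying to close.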

One way or another, semantic search engines will be part of the future of search engines in terms of natural language queries and indexing. This is new to our industry and we have to sit up and pay attention. Failure to do so may mean that you will miss the next big thing. On the other hand, all these search engines could go bust and we remain stuck with our keywords for a while longer. You choose, but to me, the answer is clear.

Pierre Far recently launched Social Alerter and maintains a set of SEO tools. He has a PhD from the University of Cambridge, UK, and works as an innovation consultant.




About the author

Pierre Far
Contributor
Pierre Far is a digital product management consultant, speaker, startup advisor, award judge, and founder of several online businesses. Previously, he held several roles at Google and the technology sector in the UK, including product management, community management, innovation consulting, and online marketing after completing a Ph.D. in microbial genetics from the University of Cambridge, UK.
