Yandex launches new algorithm named Palekh to improve search results for long-tail queries
Did Yandex's new algorithm Palekh just go head to head with Google's RankBrain?
Yandex announced on their Russian blog that they have launched a new algorithm aimed at improving how they handle long-tail queries. The new algorithm is named Palekh, which is the name of a world-famous Russian city that has a firebird on its coat of arms.
The firebird has a long tail, and Yandex, the largest Russian search engine, used that as the code name for its long-tail query work. Long-tail queries are longer, more specific searches made up of several words, and they are increasingly common in voice search. Yandex says about 100 million queries per day fall under the “long-tail” classification within their search engine.
The Palekh algorithm allows Yandex to understand the meaning behind every query, and not just look for similar words, which reminds me of Google RankBrain. I asked Yandex if it is similar to Google’s RankBrain, and they said they “don’t know exactly what’s the technology behind Google’s RankBrain, although these technologies do look quite similar.”
With Palekh, Yandex has started to use neural networks as one of its 1,500 ranking factors. A Yandex spokesperson told us they have “managed to teach our neural networks to see the connections between a query and a document even if they don’t contain common words.” They did this by “converting the words from billions of search queries into numbers (with groups of 300 each) and putting them in 300-dimensional space — now every document has its own vector in that space,” they told us. “If the numbers of a query and numbers of a document are near each other in that space, then the result is relevant,” they added.
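To make the idea of "nearness" in a vector space concrete, here is a minimal sketch in Python. It is not Yandex's implementation: the toy three-dimensional word vectors, the word-averaging step, and the use of cosine similarity as the distance measure are all assumptions chosen for illustration, and the function names are hypothetical. Yandex describes 300 dimensions learned by a neural network from billions of queries.

```python
import math

# Hypothetical hand-made word vectors (3 dimensions for readability;
# the article describes a 300-dimensional space learned from real queries).
WORD_VECTORS = {
    "film": [0.90, 0.10, 0.00],
    "movie": [0.85, 0.15, 0.05],
    "space": [0.10, 0.90, 0.10],
    "astronaut": [0.15, 0.85, 0.20],
    "recipe": [0.00, 0.10, 0.95],
    "soup": [0.05, 0.00, 0.90],
}


def embed(text):
    """Map a text to one vector by averaging the vectors of its known words.

    Averaging is an assumption made for this sketch, not a documented
    detail of Palekh.
    """
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        raise ValueError("no known words in text")
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Sort documents from most to least similar to the query."""
    query_vector = embed(query)
    scored = [(cosine_similarity(query_vector, embed(doc)), doc) for doc in documents]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # The query and the top document share no words, yet their vectors are close.
    for score, doc in rank("film space", ["recipe soup", "movie astronaut"]):
        print(f"{score:.3f}  {doc}")
```

In this toy setup the query "film space" ranks "movie astronaut" above "recipe soup" even though the query and that document have no words in common, which is the behavior the spokesperson describes.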
When I asked if they are using machine learning, Yandex said they do, and explained that teaching their “neural network based on these queries will lead to some advancements in answering conversational based queries in the future.” They added that they “also have many targets (long click prediction, CTR, “click or not click” models and so on) that are teaching our neural network — our research has showed that using more targets is more effective.”