Yahoo! Lets You “Build Your Own Search Service”
Yahoo! has just unveiled the next phase in their plan to spur search innovation by providing search-related resources to developers. The Yahoo! Build Your Own Search Service (BOSS) enables developers to access Yahoo! search results, combine them with other sources, rerank them, and define their appearance. Yahoo! says they are making BOSS available in an attempt to spur innovation in the search space and disrupt the market. They point out that, unlike other web companies, a search startup faces many obstacles: from cost (it takes a lot of machines to process all the web’s data) to expertise (some of the world’s smartest PhDs work at the major search engines) to historical data (which search companies can only get hold of by that old-fashioned method of waiting).
Below, more on what Yahoo! is making available through BOSS and how likely it is that the launch will really create the next Google.
Is BOSS different from the search APIs offered by Google and Microsoft?
At first glance, the BOSS API seems somewhat similar to
Google’s custom search API
and Microsoft’s Live Search API. It enables developers to request search
results (from web search, news, and images), reorder them, and style them.
However, Yahoo! points out key differences, mostly based on the overall intent
of the BOSS program (to power new search startups). Yahoo!’s API allows unlimited queries, a necessary feature
for developers who use the API to build a search engine. And it allows for mashups
of its data with other data sources.
Hakia, for instance, is using the API to
blend Yahoo! results with their own and run all results through their proprietary
algorithm, SemanticRank. They are also displaying Yahoo!’s image results in a
mashup with their own web index results. Neither Google nor Microsoft allows such
flexibility, and both require branding of the search results.
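To make the idea of a results mashup concrete, here is a minimal sketch of blending two ranked lists. It is not Hakia's method or the BOSS response format: the `blend_results` function, the `url` and `source` fields, and the sample data are all assumptions made for illustration.

```python
from itertools import zip_longest


def blend_results(primary, secondary):
    """Interleave two ranked result lists, keeping the first copy of each URL.

    Both arguments are lists of dicts with a "url" key (a hypothetical
    shape, not the actual BOSS response format).
    """
    seen = set()
    blended = []
    # Walk both lists in rank order: primary #1, secondary #1, primary #2, ...
    for pair in zip_longest(primary, secondary):
        for result in pair:
            # zip_longest pads the shorter list with None.
            if result is None or result["url"] in seen:
                continue
            seen.add(result["url"])
            blended.append(result)
    return blended


# Made-up sample data standing in for API results and a startup's own index.
yahoo_results = [
    {"url": "http://example.com/a", "source": "yahoo"},
    {"url": "http://example.com/b", "source": "yahoo"},
]
own_results = [
    {"url": "http://example.com/b", "source": "own"},
    {"url": "http://example.com/c", "source": "own"},
    {"url": "http://example.com/d", "source": "own"},
]

blended = blend_results(yahoo_results, own_results)
print([r["url"] for r in blended])
```

A real implementation would then pass the blended list through its own ranking step rather than keep the simple interleaved order.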
Yahoo! doesn’t provide access to its ranking signals, but it does allow
developers to add their own signals to the set of results. Me.dium is using the API to order results based on social signals, notably what pages its
users have accessed recently.
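Adding your own signal to a result set can be as simple as a stable re-sort. The sketch below is an illustration only, not Me.dium's implementation: the `rerank_by_signal` function, the result fields, and the visit counts are hypothetical.

```python
def rerank_by_signal(results, signal_scores):
    """Reorder results by an external score, highest first.

    Results with no score count as 0. Python's sort is stable, so results
    with equal scores keep the order the search engine returned them in.
    """
    return sorted(results, key=lambda r: -signal_scores.get(r["url"], 0))


# Made-up results in the order a search API might return them.
results = [
    {"url": "http://example.com/a", "title": "A"},
    {"url": "http://example.com/b", "title": "B"},
    {"url": "http://example.com/c", "title": "C"},
    {"url": "http://example.com/d", "title": "D"},
]

# Hypothetical social signal: how many users visited each page recently.
recent_visits = {
    "http://example.com/c": 12,
    "http://example.com/b": 3,
}

reranked = rerank_by_signal(results, recent_visits)
print([r["title"] for r in reranked])  # prints ['C', 'B', 'A', 'D']
```

Because unscored results fall back to the original order, the external signal only promotes pages it knows about and leaves the rest of the engine's ranking intact.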
Yahoo! provided two examples of custom ranking: search results reordered for
popularity based on matches to popular Delicious results and search results
reordered based on topical matches to recently edited Wikipedia
pages. Below, you can see the before and after results for the recency
On the one hand, intent isn’t as important as practical application, and the
current feasible applications of BOSS seem somewhat similar to what Google and
Microsoft offer. Its current feature set seems ideal for a comprehensive site search implementation, for example. However, intent does become
important when considering how Yahoo! might evolve the program, and they say
this is only the first phase of planned features. They say they’re looking to
the developer community to determine what their roadmap should be.
Yahoo! is explicitly looking to disrupt the search market by helping search
startups overcome some of those obstacles inherent in the search business. They
reduce the burden of crawling and the limitations caused by a lack of historical
data by providing search results, and they allow for innovation by enabling
developers to create their own ranking, look and feel, and mashups.
Will BOSS power innovation in search?
who are truly looking to innovate in the search space may feel that they need
access to the raw content of a web index, not simply to the results.
Hakia, a satisfied customer of the API, isn’t replacing their crawl.
They need to crawl the web themselves to implement their natural search
innovations using what they term “QDEXing”. Hakia president Melek Pulatkonak
told me that “Hakia views the BOSS initiative as a means of accelerating our
efforts to QDEX the entire Web, and therefore become the first full-scale
semantic search engine. Yahoo! Search BOSS is the best partnership offer for
developments like the one in Hakia, and is an unprecedented initiative in the
market.” (It remains to be seen if
of Powerset will speed up or slow down competitor Powerset’s roadmap.)
Me.dium is also primarily using its own index, and in any case, isn’t looking to topple Google. As Chris Sherman points out,
they’re more akin to StumbleUpon. However, both implementations highlight ways
the API could help developers try innovative things in search, particularly if
those innovations revolve around ranking and display.
What’s the revenue model?
In addition to search startups needing more access than BOSS provides, they
also may want to control their revenue stream. While BOSS isn’t launching with
ads, and developers can monetize any way they want for now, Yahoo! does plan to
require Yahoo! ads be displayed beside the search results at some point. Yahoo!
told me that “Over the next several months, a BOSS monetization capability,
using Yahoo! search advertising and potentially other models, will be made
available for partners and developers to create a search revenue stream for
their business.” At that point, developers will be locked into using whatever
Are there other options?
Yahoo! isn’t the only company looking to spur innovation by making
large-scale data available. For instance, CommonCrawl says that their mission is
to "build, maintain and make widely available a comprehensive crawl of the
Internet for the purpose of enabling a new wave of innovation, education and
research" and that they plan to operate as a non-profit. This project isn’t aimed at
powering search innovation specifically, but rather at the entire realm of
information fields. Gil Elbaz, founder of CommonCrawl, told me, "we think a common
crawl of the web will be a great resource for anyone trying to innovate in
Alexa also makes a web index available, although
it’s much smaller than Yahoo!’s and isn’t free.
Yahoo! is providing a separate API for academic use. For now, the data
available is the same as in the public API, but the academic API will provide
more results per request (1000 results rather than 50). A “custom” API, for
which Yahoo! will work more closely with partners, is still in the works.
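To put the per-request limits in perspective, here is a small arithmetic sketch. The 50 and 1000 figures come from the paragraph above; the `requests_needed` helper is hypothetical, and whether the public API lets you page this deep into a result set is an assumption.

```python
import math


def requests_needed(total_results, per_request):
    """Number of API requests needed to fetch total_results at a given page size."""
    if per_request <= 0:
        raise ValueError("per_request must be positive")
    return math.ceil(total_results / per_request)


# Fetching 1000 results at each limit.
public_requests = requests_needed(1000, 50)      # public API: 50 per request
academic_requests = requests_needed(1000, 1000)  # academic API: 1000 per request
print(public_requests, academic_requests)  # prints 20 1
```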
Uses other than consumer search?
I asked Yahoo! how they felt about other uses of this API. For instance,
search marketers can think of many tools that could be powered by search results
data. For now, the terms of service require the data be used only for consumer
“You are permitted to use the Services only for the purpose of incorporating
and displaying Web Search Results from such Services as part of a Search Product
deployed on your Web site (“Your Offering”). A “Search Product” means a service
which provides a response to a search query, keyword or other request served
from an index or indexes of data related to Web pages generated, in whole or in
part, by the application of an algorithmic search engine.”
They’ll be monitoring use by looking at things such as how many queries
resulted in clicks. However, they said that they were looking at potentially
making other data offerings available that may be of interest to SEOs.
Overall, this is an interesting idea from Yahoo! Can it shake up the status quo market share? I’m not so sure about that. But it is another sign of Yahoo!’s commitment to the developer community and of their willingness to think creatively about market share (although they may be thinking more about ways to find distribution channels beyond toolbar deals than they are about helping competing search engines be successful).