Wikiseek: Leveraging Wikipedia For Web Search, Poorly
The Wikipedia search engine has arrived —
Wikiseek — but it’s not the Wikipedia search engine you’re thinking of.
Wikiseek is completely different from the
Search Wikia project
backed by Wikipedia founder Jimmy Wales that I
wrote about last
month. Below, a look at the disappointing new service along with a revisit to
how it is different from Search Wikia.
The idea behind Wikiseek is simple. Crawl only content referred to within
Wikipedia itself, which the site boasts
will make it better:
The contents of Wikiseek are restricted to Wikipedia pages and only those
sites which are referenced within Wikipedia, making it an authoritative source
of information less subject to spam and SEO schemes.
The idea of restricting searches to a subset of pages from across the web
isn’t new. Eurekster’s long-standing
Swicki service allows it. The Google Custom Search engine service allows it,
and there are even some popular services built on it.
In fact, Google once ran a web-wide, editor-selected index of pages almost
exactly like what Wikiseek is doing with Wikipedia.
The Google Directory, when launched
in 2000, allowed you to search against all the pages from sites that were
referenced in the Google Directory. Those were sites that had been hand-approved
by editors of the underlying Open Directory Project.
To be clear, a search at the Google Directory didn’t just match the title and
descriptions of sites that were listed within the directory. Instead, Google
would see that a site was listed, then a search would hit the full-text content
of all pages from that particular site and other sites listed in the directory,
which had been found through Google’s crawling of the web. It allowed you to
effectively search against only approved sites.
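To make the mechanism concrete, here is a minimal sketch of a site-restricted search in Python. It is not how Google or Wikiseek actually built theirs. The function names, the sample URLs and the simple all-terms matching rule are assumptions for illustration only. The point is the two-step idea: collect the hosts a curated source links to, then run a full-text match only over crawled pages from those hosts.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def build_allowlist(referenced_urls):
    """Collect the hosts of every site the curated source links to."""
    return {host_of(u) for u in referenced_urls if host_of(u)}


def restricted_search(query, crawled_pages, allowlist):
    """Full-text match, but only over pages whose host is on the allowlist."""
    terms = query.lower().split()
    hits = []
    for url, text in crawled_pages.items():
        if host_of(url) not in allowlist:
            continue  # page belongs to a site the curated source never cites
        body = text.lower()
        if all(term in body for term in terms):
            hits.append(url)
    return sorted(hits)


# Hypothetical data: one curated link and three crawled pages.
referenced = ["http://www.example.org/about"]
crawled = {
    "http://example.org/guide": "A guide to search engines",
    "http://example.org/faq": "Frequently asked questions",
    "http://spam.example.net/buy": "search engines cheap search engines",
}

allow = build_allowlist(referenced)
results = restricted_search("search engines", crawled, allow)
```

Note that the listed page itself (`/about`) does not need to match. Any crawled page from an approved host is searchable, which is the distinction drawn above between searching directory listings and searching the full text of listed sites.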
That search feature was little understood, little used and was dropped at
some point over the past few years. One reason was probably that
Google’s basic relevancy was good enough that people didn’t feel they needed
to drill down into an editor-approved area. A bigger issue was likely Google’s
poor promotion of the service.
Now Wikiseek is back with effectively the same idea. Hit only pages from
sites listed in Wikipedia and you’ll have better relevancy. From the draft press
release I was sent in advance of tomorrow’s official launch:
The service is expected to have significantly less Search Engine
Optimization (SEO) spam because only “authoritative