The Wikipedia search engine has arrived. It’s called Wikiseek, but it’s not the Wikipedia search engine you’re thinking of. Wikiseek is completely different from the Search Wikia project backed by Wikipedia founder Jimmy Wales that I wrote about last month. Below, a look at the disappointing new service, along with a second look at how it differs from Search Wikia.
The idea behind Wikiseek is simple. Crawl only content referred to within Wikipedia itself, which the site boasts will make it better:
The contents of Wikiseek are restricted to Wikipedia pages and only those sites which are referenced within Wikipedia, making it an authoritative source of information less subject to spam and SEO schemes.
The idea of restricting searches to a subset of pages from across the web isn’t new. Eurekster’s long-standing Swicki service allows it. The Google Custom Search Engine service allows it too, and there are even some popular services using Google that are listed here.
In fact, Google once ran a web-wide, editor-selected index of pages almost exactly like what Wikiseek is doing with Wikipedia. The Google Directory, when launched in 2000, allowed you to search against all the pages from sites that were referenced in the Google Directory. Those were sites that had been hand-approved by editors of the underlying Open Directory Project.
To be clear, a search at the Google Directory didn’t just match the title and descriptions of sites that were listed within the directory. Instead, Google would see that a site was listed, then a search would hit the full-text content of all pages from that particular site and other sites listed in the directory, which had been found through Google’s crawling of the web. It allowed you to effectively search against only approved sites.
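For the technically inclined, here is a rough sketch of that kind of restricted search. It is not how Google or Wikiseek actually implemented it. It only illustrates the idea of matching full-text content while keeping results from approved sites. The function names, the toy page data and the simple all-terms matching are assumptions made up for illustration.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the lowercase hostname of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def restricted_search(query, pages, approved_urls):
    """Full-text search limited to pages hosted on approved sites.

    pages: hypothetical crawl results, mapping page URL -> page text.
    approved_urls: URLs listed in the directory (or referenced in Wikipedia).
    """
    approved_hosts = {host_of(u) for u in approved_urls}
    terms = query.lower().split()
    hits = []
    for url, text in pages.items():
        # Skip any page whose site was never listed by an editor.
        if host_of(url) not in approved_hosts:
            continue
        body = text.lower()
        # Naive matching: every query term must appear somewhere in the page.
        if all(term in body for term in terms):
            hits.append(url)
    return sorted(hits)


if __name__ == "__main__":
    pages = {
        "http://example.org/a": "Wikipedia search engines compared",
        "http://example.org/b": "Cooking tips",
        "http://spam.example.com/x": "cheap wikipedia search engines",
    }
    approved = ["http://example.org/"]
    print(restricted_search("search wikipedia", pages, approved))
```

Note that the spam page matches the query terms but never shows up, because its host was never approved. That filter is the whole pitch.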
That search feature was little understood, little used and was dropped at some point over the past few years. One reason was probably that Google’s basic relevancy was good enough that people didn’t feel they needed to drill down into an editor-approved area. A bigger issue was likely Google’s poor promotion of the service.
Now Wikiseek is back with effectively the same idea. Hit only pages from sites listed in Wikipedia and you’ll have better relevancy. From the draft press release I was sent in advance of tomorrow’s official launch:
The service is expected to have significantly less Search Engine Optimization (SEO) spam because only “authoritative