<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Search Engine Land &#187; Search Engines: Hakia</title>
	<atom:link href="http://searchengineland.com/library/search-engines/search-engines-hakia/feed" rel="self" type="application/rss+xml" />
	<link>http://searchengineland.com</link>
	<description>Search Engine Land: News On Search Engines, Search Engine Optimization (SEO) &#38; Search Engine Marketing (SEM)</description>
	<lastBuildDate>Fri, 10 Feb 2012 01:45:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Kayak Says Bing Isn&#8217;t Playing Air Fair, Hakia Sees Near Mirror Image</title>
		<link>http://searchengineland.com/kayak-says-bing-isnt-playing-air-fair-hakia-sees-near-mirror-image-21542</link>
		<comments>http://searchengineland.com/kayak-says-bing-isnt-playing-air-fair-hakia-sees-near-mirror-image-21542#comments</comments>
		<pubDate>Thu, 25 Jun 2009 17:52:54 +0000</pubDate>
		<dc:creator>Elisabeth Osmeloski</dc:creator>
				<category><![CDATA[Microsoft: Bing Travel]]></category>
		<category><![CDATA[Search Engines: Hakia]]></category>
		<category><![CDATA[Search Engines: Travel Search Engines]]></category>
		<category><![CDATA[Top News]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=21542</guid>
		<description><![CDATA[Kayak travel search launched a formal attack on Microsoft&#8217;s Bing Travel yesterday in an article that appeared on Wired, Kayak to Bing: Stop Copying Us!. Apparently, Kayak&#8217;s executives are more than a little steamed by the eerily similar look and feel, and even went so far as to put Microsoft on legal notice (presumably a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.kayak.com">Kayak</a> travel search launched a formal attack on Microsoft&#8217;s <a href="http://www.bing.com/travel">Bing Travel</a> yesterday in an article that appeared on Wired, <a href="http://www.wired.com/epicenter/2009/06/kayak-bing/">Kayak to Bing: Stop Copying Us!</a> Apparently, Kayak&#8217;s executives are more than a little steamed by the eerily similar look and feel, and even went so far as to put Microsoft on legal notice (presumably a cease &amp; desist) to knock it off.</p>
<p>Wired implies that Kayak is concerned that the similarities will cause brand confusion among its users. Wired even noted an &#8220;uncomfortable&#8221; comparison in its own <a href="http://www.wired.com/epicenter/2009/05/microsofts-bing-hides-its-best-features/">review of Bing</a>, and points out that others have noticed as well. You can see further evidence of others comparing <a href="http://www.bing.com/search?q=bing+AND+kayak">Bing vs Kayak</a> from a variety of sources, including TechCrunch&#8217;s <a href="http://www.techcrunch.com/2009/06/04/bing-travel-arrives/">early review</a>, where commenters pointed out that Bing Travel gave lower airfares than Kayak, though both use the same airline data sources.</p>
<p>Microsoft spokesperson Whitney Burk denied the accusations, telling Wired:</p>
<blockquote>“We are discussing the matter with Kayak,” Burk said in emailed statement. “Bing Travel is based on independent development by Microsoft and Farecast.com, which Microsoft acquired in 2008. Any contrary allegations are without merit.”</blockquote>
<p>However, the Farecast technology that Microsoft acquired is not the issue here. Since Kayak does not run a similar fare prediction service, it is not claiming an infringement on technology, just on the user interface, namely Bing&#8217;s use of &#8220;sliders&#8221;.</p>
<p>Sliders have become a popular tool on many &#8220;Web 2.0&#8221; based sites, and Farecast did initially use them in its design. Last year, another start-up in airfare comparison, <a href="http://www.insidetrip.com">InsideTrip.com</a>, debuted (May 2008) using sliders much like Kayak and Bing do. In Greg Sterling&#8217;s <a href="http://searchengineland.com/insidetrip-seeks-to-add-more-depth-and-dimension-to-travel-search-13493">write-up</a>, you can see the similarity:</p>
<p style="text-align: center;"><img class="aligncenter" title="InsideTrip First Look by Greg Sterling" src="http://farm3.static.flickr.com/2004/2307944482_e422da93bb.jpg" alt="" width="500" height="349" /></p>
<p>Note that InsideTrip was also founded by Dave Pelter, who held an executive position at Farecast.com for two years. Is it that InsideTrip.com has not gained (or rather, sustained) enough traction for Kayak users to be affected by its similar slider options?</p>
<p style="text-align: center;"><a href="http://siteanalytics.compete.com/insidetrip.com/?metric=uv"><img class="aligncenter" src="http://grapher.compete.com/insidetrip.com_uv_310.png" alt="" width="310" height="170" /></a></p>
<p><strong>Hakia holds similar views </strong></p>
<p>Kayak isn&#8217;t the only one crying foul. <a href="http://www.hakia.com">hakia</a> also <a href="http://blog.hakia.com/?p=726">blogged when Bing launched</a>, claiming that Bing&#8217;s categorized search is similar to its own hakia Galleries and using the example of a search for &#8220;Obama&#8221;:</p>
<blockquote>&#8220;<a href="http://www.hakia.com/search.aspx?q=obama&amp;source=tb&amp;ver=1.0">hakia Galleries</a> prove 17 aspects of this query. We save the user time by answering 17 Obama related questions in one search. Compare the hakia Obama gallery with the same <a href="http://www.bing.com/search?q=obama&amp;form=QBLH">search at Bing.com</a> (Bing provides only 7 aspects of this search query).&#8221;</blockquote>
<p>Hakia&#8217;s COO, Melek Pulatkonak, also recently told <a href="http://www.investors.com/NewsAndAnalysis/Article.aspx?id=480529&amp;Ntt=hakia">Investor&#8217;s Business Daily</a> that they&#8217;d been in early partnership talks with Microsoft in July, and stated:</p>
<p>&#8220;We were approached by Microsoft to show them how the hakia galleries worked, and we did, and now they have a similar feature — we showed them how to do it,&#8221; she said. &#8220;We were surprised that it is a featured part of and the most differentiated part of Bing.&#8221;</p>
<p>Microsoft also denies that hakia had any part in the development of Bing&#8217;s categorized search, which Microsoft refers to as &#8220;faceted search&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/kayak-says-bing-isnt-playing-air-fair-hakia-sees-near-mirror-image-21542/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Hakia Debuts &#8220;Digital Newspaper&#8221; Start Page</title>
		<link>http://searchengineland.com/hakia-debuts-digital-newspaper-start-page-15941</link>
		<comments>http://searchengineland.com/hakia-debuts-digital-newspaper-start-page-15941#comments</comments>
		<pubDate>Tue, 23 Dec 2008 17:31:07 +0000</pubDate>
		<dc:creator>Greg Sterling</dc:creator>
				<category><![CDATA[Search Engines: Hakia]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=15941</guid>
		<description><![CDATA[If you&#8217;re a general search engine that competes with Google (Yahoo &#38; Microsoft), as you know, it&#8217;s very hard to make headway in that game. But if you recharacterize or reposition yourself as other than a &#8220;search engine&#8221; you might gain some adoption. To some degree that&#8217;s what Kosmix did by building out multi-media rich [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;re a general search engine that competes with Google (Yahoo &amp; Microsoft), as you know, it&#8217;s very hard to make headway in that game. But if you recharacterize or reposition yourself as something other than a &#8220;search engine&#8221; you might gain some adoption. To some degree that&#8217;s what <a href="http://www.kosmix.com/">Kosmix</a> did by building out multi-media rich Wikipedia-like &#8220;topic pages&#8221; (a better version of the US version of Yahoo Glue Pages). <a href="http://hakia.com/">Hakia</a> faces this same conundrum. The company has been trying for some time to tell people that its &#8220;semantic search&#8221; delivers better results than Google. However, people largely haven&#8217;t been listening.</p>
<p>Now the company is introducing a customizable start page (much like iGoogle, Netvibes or MyYahoo) called <a href="http://my.hakia.com/default.aspx">MyHakia</a>. The company is calling it a &#8220;<a href="http://blog.hakia.com/?p=541">digital newspaper</a>,&#8221; which is a good marketing hook. <span id="more-15941"></span></p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2008/12/picture-41.png"><img class="alignnone size-full wp-image-15942" title="picture-41" src="http://searchengineland.com/figz/wp-content/seloads/2008/12/picture-41.png" alt="" width="500" height="467" /></a></p>
<p>The hope is that use of this page will expose people to Hakia&#8217;s search results.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/hakia-debuts-digital-newspaper-start-page-15941/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hakia Relaunches Site With &#8220;Trusted Results&#8221;</title>
		<link>http://searchengineland.com/hakia-relaunches-site-with-trusted-results-14949</link>
		<comments>http://searchengineland.com/hakia-relaunches-site-with-trusted-results-14949#comments</comments>
		<pubDate>Mon, 06 Oct 2008 14:00:21 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Search Engines: Hakia]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=14949</guid>
		<description><![CDATA[Today at SMX East, natural language search engine Hakia has launched a new search experience that enables searchers to view categorized results, as well as view &#8220;Trusted Results&#8221; from &#8220;Credible Sites&#8221;. The Trusted Results program is an initiative Hakia has developed with information professionals and librarians. As they note on their site: A popular Web [...]]]></description>
			<content:encoded><![CDATA[<p>Today at <a href="http://blog.hakia.com/?p=445">SMX East</a>, <a href="http://www.hakia.com">natural language search engine Hakia</a> has launched a <a href="http://blog.hakia.com/?p=453">new search experience</a> that enables searchers to view categorized results, as well as view &#8220;Trusted Results&#8221; from &#8220;Credible Sites&#8221;.</p>
<p>The <a href="http://club.hakia.com/lib/">Trusted Results program</a> is an initiative Hakia has developed with information professionals and librarians. As they note on their site:</p>
<blockquote>A popular Web source may not always be credible, and a credible Web source may not always be popular. hakia is the first search engine to integrate librarians’ collective knowledge of credible Web sites into search results to guide searchers.</blockquote>
<p>So far, these results are available for health, medical, and environmental topics, and they are looking to expand coverage. Below, more information about what&#8217;s new at Hakia, how to vet sites for their new program, and how the changes stack up compared to the rest of the search industry.</p>
<p><span id="more-14949"></span></p>
<p>Every search engine, whether it&#8217;s trying to become the default search experience or provide a niche vertical search, operates within the shadow of Google. Google helped usher in a searching culture, and not only are searchers more likely to &#8220;Google&#8221; something than they are to go elsewhere to search for something, but the Google experience is the search behavior they tend to expect.</p>
<p>So how can other search engines compete? Certainly distribution and awareness are driving factors in getting searchers to try something new, but user experience is key in getting them to stay.</p>
<p><strong>Building a better search engine</strong></p>
<p>The plan for building a better search engine often includes the following building blocks:</p>
<ul>
<li><strong>Comprehensiveness.</strong> <a href="http://searchengineland.com/cuil-launches-can-this-search-start-up-really-best-google-14459.php">Cuil</a> stressed this when they launched earlier this year, but they seemed to have lost footing with freshness, which is a key obstacle to overcome when managing so much information.</li>
<li><strong>Relevance. </strong> Google is widely credited with gaining popularity due to its breakthroughs in algorithmically measuring relevance across the web. <a href="http://searchengineland.com/what-is-google-pagerank-a-guide-for-searchers-webmasters-11068.php">PageRank</a> was a major factor, but understanding query intent and usefulness of sites based on that query are at least as important to relevance quality as isolated page popularity.</li>
<li><strong>User experience. </strong>Many search engines have experimented with an experience that&#8217;s differentiated from Google. While these interfaces may be a step forward for search, the fact that they&#8217;re different from Google (which is the behavior searchers expect) makes their adoption more difficult. Ask was praised for its innovative approach to UI with <a href="http://searchengineland.com/ask-relaunches-now-ask-3d-11379.php">Ask 3D</a>, but as of today, they&#8217;ve <a href="http://bits.blogs.nytimes.com/2008/10/06/askcom-revamps-search-engine/">replaced that interface</a> with what appears to be <a href="http://searchengineland.com/askcom-goes-back-to-1996-with-new-release-14951.php">more like their previous approach</a>.</li>
</ul>
<p><strong>Hakia&#8217;s approach to relevance</strong></p>
<p>Hakia&#8217;s initial launch tackled relevance by applying semantic technology. <a href="http://searchengineland.com/social-networking-through-search-hakia-helps-you-meet-others-12586.php">As I noted from my talk</a> with them last year:</p>
<blockquote>They point out that while the traditional search engines bring back good results most of the time, it’s impossible to know if pages that weren’t returned (because they have too few links to them, for instance) would have been more relevant for the query. By understanding the concepts on web pages rather than relying on things like external links and anchor text, they feel they can have a better sense of what page across the entire web is most useful to a searcher.</blockquote>
<p><strong>Human-powered search results</strong></p>
<p>With this latest launch, they&#8217;re expanding their focus on relevance by <a href="http://blog.hakia.com/?p=419">engaging with librarians</a> to manually compile <a href="http://club.hakia.com/lib/">lists of &#8220;trusted&#8221; sites</a> for particular categories. They say that &#8220;Google (and others like Google) don’t make the distinction of what is credible (or quality) what is not.&#8221; They note <a href="http://news.bbc.co.uk/1/hi/technology/7613201.stm">Sir Tim Berners-Lee&#8217;s recent comments</a> when launching the World Wide Web Foundation that new systems are needed that give trustworthiness labels to web sites that have been proven to be reliable sources.</p>
<p><a title="Hakia &quot;Credible Sites&quot; by Search Engine Land, on Flickr" href="http://www.flickr.com/photos/23148333@N06/2917259353/"><img src="http://farm4.static.flickr.com/3130/2917259353_6622e0aabd.jpg" alt="Hakia &quot;Credible Sites&quot;" width="500" height="368" /></a></p>
<p>Of course, Hakia isn&#8217;t the only search engine that&#8217;s partially human powered. <a href="http://searchengineland.com/mahalo-launches-with-human-crafted-search-results-11341.php">Mahalo</a> uses both paid and volunteer guides (although they aren&#8217;t necessarily experts in the subject matter) and even Google itself recently entered the &#8220;credible sources&#8221; fray with <a href="http://searchengineland.com/googles-knol-launches-like-wikipedia-with-moderation-14434.php">Knol</a>.</p>
<p><strong>Differentiated user experience</strong></p>
<p>Hakia is also launching a new way of navigating search results. The <strong>All results</strong> tab displays web results, credible sites and news results. Searchers can also view only pages from credible sites, images, or news, and have easy access to <a href="http://blog.hakia.com/?p=10">galleries</a>.</p>
<p><a title="web by Search Engine Land, on Flickr" href="http://www.flickr.com/photos/23148333@N06/2918103424/"><img src="http://farm4.static.flickr.com/3191/2918103424_aef7a6f5c2.jpg" alt="web" width="500" height="334" /></a></p>
<p>Many search engines have been experimenting with results beyond the traditional &#8220;10 blue links&#8221;. However, these new experiences don&#8217;t yet seem to have <a href="http://searchengineland.com/the-google-challengers-2008-edition-13049.php">pulled searchers away from Google</a>. Google itself, of course, continues to experiment with new experiences as well, including continued evolution of its <a href="http://searchengineland.com/google-universal-search-2008-edition-13256.php">universal search results</a> as well as more radical changes such as those seen with <a href="http://www.searchmash.com/">SearchMash</a>.</p>
<p>It&#8217;s great to see search innovation continue and this is just another example of how some part of that innovation is finding the right balance between algorithmic and manual assessment.  I&#8217;ve joked lately in talks that searchers see Google as a &#8220;truth machine&#8221;, when in reality it&#8217;s just presenting what it finds with no commentary on its validity (with Knol being an exception). As we rely more on the web, we may have more need for the Berners-Lee vision of trust labels, and search engines like Hakia are providing a glimpse of what that might look like.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/hakia-relaunches-site-with-trusted-results-14949/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cuil Launches &#8212; Can This Search Start-Up Really Best Google?</title>
		<link>http://searchengineland.com/cuil-launches-can-this-search-start-up-really-best-google-14459</link>
		<comments>http://searchengineland.com/cuil-launches-can-this-search-start-up-really-best-google-14459#comments</comments>
		<pubDate>Mon, 28 Jul 2008 04:01:00 +0000</pubDate>
		<dc:creator>Danny Sullivan</dc:creator>
				<category><![CDATA[Search Engines: Cuil]]></category>
		<category><![CDATA[Search Engines: Hakia]]></category>

		<guid isPermaLink="false">http://searchengineland.com/beta/cuil-launches-can-this-search-start-up-really-best-google-14459.php</guid>
		<description><![CDATA[Can any start-up search engine &#8220;be the next Google?&#8221; Many have wondered this, and today&#8217;s launch of Cuil (pronounced &#8220;cool&#8221;) may provide the best test case since Google itself overtook more established search engines. Cuil provides what appears to be a comprehensive index of the web, offers a unique display presentation, and emerges at a [...]]]></description>
			<content:encoded><![CDATA[<p><a title="Cuil Home Page by search-engine-land, on Flickr" href="http://www.flickr.com/photos/searchengineland/2708529277/">
<img src="http://farm4.static.flickr.com/3258/2708529277_4f804a9c21_o.jpg" border="0" alt="Cuil Home Page" width="389" height="164" /></a></p>
<p>Can any start-up search engine &#8220;be the next Google?&#8221; Many have wondered this, and today&#8217;s launch of <a href="http://www.cuil.com/">Cuil</a> (pronounced &#8220;cool&#8221;) may provide the best test case since Google itself overtook more established search engines. Cuil provides what appears to be a comprehensive index of the web, offers a unique display presentation, and emerges at a time when people might be ready to embrace a quality &#8220;underdog&#8221; service.</p>
<p>The big questions now are how does the relevancy hold up and can word-of-mouth really still build significant share? [<strong>Note</strong>: The Cuil site was supposed to be live for searches as of 9:01pm Pacific time on July 27, but so far I'm still seeing only a holding page. I'd expect this to change fairly soon].</p>
<p><span id="more-14459"></span></p>
<p><strong>Why Care About Cuil?</strong></p>
<p>There&#8217;s no end of companies that have been trying to take on Google as a search destination. Earlier this year, my <a href="http://searchengineland.com/080103-084033.php">Google Challengers: 2008 Edition</a> article covered some of these, like <a href="http://www.hakia.com/">Hakia</a>, <a href="http://searchengineland.com/070530-180000.php">Mahalo</a>, and <a href="http://search.wikia.com/wiki/Search_Wikia">Search Wikia</a>. You can add to that list other companies like <a href="http://gigablast.com/">Gigablast</a> and <a href="http://www.exalead.com/">Exalead</a>. None of them have made a dent in Google&#8217;s share.</p>
<p>Indeed, the established players of Yahoo, Microsoft, and Ask.com &#8212; all of whom have established quality search products &#8212; haven&#8217;t dented Google either. So what makes Cuil worthy of special attention?</p>
<p>For one, Cuil has an impressive pedigree with its three founders: Tom Costello of IBM&#8217;s WebFountain project, plus Anna Patterson and Russell Power of Google&#8217;s TeraGoogle project, Google&#8217;s massive search index. Cuil also counts AltaVista founder Louis Monier &#8212; who later went to eBay and then Google &#8212; as part of the team.</p>
<p>These people know search. In particular, they know on-the-firing-line, heavy-duty, industrial-strength search. Not only that, they&#8217;re unleashing what appears to be a comprehensive service that anyone can use. Indeed, Google <a href="http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html">already did</a> a blog post in reaction to Cuil and its size claims on Friday, before Cuil even launched or those claims became public. If Google&#8217;s paying that much attention, then anyone should.</p>
<p><strong>What Cuil Offers</strong></p>
<p>There are four major areas that Cuil is putting out to distinguish itself from other services. These are:</p>
<ul>
<li>Big web index</li>
<li>Unique relevance algorithm</li>
<li>Unique results display</li>
<li>Privacy</li>
</ul>
<p>I&#8217;m going to dive into each of these areas in depth, to examine the importance of them as well as dissect some of the misconceptions and PR spin that they also have.</p>
<p><strong>Size Wars Return?</strong></p>
<p>Cuil is claiming to have the largest index of the web: 120 billion pages indexed (with a total of 186 billion seen by its crawler; spam and duplicate content are among things excluded from what gets indexed). In talking with them, Cuil estimated they were three times the size of Google. Sounds pretty awesome, right?</p>
<p>Sigh. Yes, size matters. You want to have a comprehensive collection of documents from across the web. But having a lot of documents doesn&#8217;t mean you are most relevant. As <a href="http://searchenginewatch.com/showPage.html?page=3551586">I wrote</a> back in September 2005, when Google famously dropped the number of documents it had indexed:</p>
<blockquote><p>Last century, in December 1995 to be exact, AltaVista burst upon the
search engine scene with what was at that time a giant index of 21 million
pages, well above rivals that were in the 1 million to 2 million range.
The web was growing fast, and the more pages you had, the greater the odds
you really were going to find that needle in a haystack. Bigger did to
some degree mean better.</p>
<p>That fact wasn&#8217;t wasted on the PR folks. Games to seem bigger began in
earnest. Lycos <a href="http://searchenginewatch.com/sereport/article.php/2162421">would talk</a> about the number of pages it &#8220;knew&#8221; about, even if these
weren&#8217;t actually indexed or in any way accessible to searchers through its
search engine. That irritated search engine Excite so much that it even
posted a page on how to count URLs, as you can see archived <a href="http://web.archive.org/web/19961121225924/http:/www.excite.com/ice/counting.html">here</a>.</p>
<p>While size initially DID mean bigger was better, that soon disappeared
when the scale of indexes grew from counting millions of pages to tens of
millions. Bigger no longer meant better because for many queries, you
could get overwhelmed with matches.</p>
<p>I&#8217;ve <a href="http://searchenginewatch.com/searchday/article.php/3071371">long played</a> with the needle-in-the-haystack metaphor to explain this.
You want to find the needle? You need to have the whole haystack, size
proponents will say. But if I dump the entire haystack on your head, can
you find the needle then? Just being biggest isn&#8217;t good enough.</p>
<p>That&#8217;s why I and others have been saying don&#8217;t fixate on size for as
long as <a href="http://searchenginewatch.com/sereport/article.php/2165301">1997</a> and <a href="http://searchenginewatch.com/sereport/article.php/2166151">1998</a>. Bigger no longer meant better, regardless of the many size
wars that continued to erupt. Remember, Google &#8212; when it came to popular
attention in 1998 and 1999 &#8212; was one of the tiniest search engines at
around 20 to 85 million pages. Despite that supposed lack of
comprehensiveness, it grew and grew because of the quality of its results.</p>
<p>Why have the size wars persisted? Search engines have seen an index
size announcement as a quick, effective way to give the impression they
were more relevant. In lieu of a relevancy figure, size figures could be
trotted out and the search engine with the biggest bar on the chart wins!</p></blockquote>
<p>Given this history, seeing Cuil trot out size figures is incredibly disheartening and a step backwards, not forwards. Time better spent on other things (such as measuring the <strong>RELEVANCY</strong> of the results) will instead
get consumed by those trying to count pages. Without even running queries and trying to perform comparison counts, I already have issues with Cuil&#8217;s claims. For example:</p>
<ul>
<li>Cuil told us that Google was at 40 billion documents. According to whom?
According to what Cuil says reporters have told them they hear
from Google. OK, I talk regularly with both Google and the reporters that cover them.
I&#8217;ve never heard this figure put out there. After our initial talk, Cuil
added that comparison testing makes them
believe that Google hasn&#8217;t grown.</li>
<li>Yahoo was said to be at 20 billion. Cuil said this is based on where
Yahoo said it was back in 2005, with the assumption that if they&#8217;d gotten
bigger, they would have announced this. Bad assumption given that since
2005, the search size detente has kept both Google and Yahoo from talking
about size figures.</li>
<li>Microsoft was said to be at 12 billion. Actually, Microsoft
<a href="http://searchengineland.com/070927-000001.php">said</a> it was at
20 billion last September &#8212; but if that hard figure isn&#8217;t being used by
Cuil, then you start doubting the other ones they&#8217;ve put out. In a
follow-up, Cuil said they believe Microsoft has fallen back to a smaller
index of 12 billion, based on its testing.</li>
</ul>
<p>We can also start testing in short order, however. Just run a query, see what Google reports as a count for it, then run the same thing on Cuil. If Cuil regularly reports more, they win. Or not. This is what people especially started doing in droves during the last size battle between Google and Yahoo, and then issues about duplicate content and spam started coming up.</p>
<p>Assuming you get beyond that, any advantage Cuil has on the size front right now will be short-lived, if they make size an issue. Google will simply crawl more documents and ensure that whatever Cuil is, Google will be +1.</p>
<p>We asked Cuil about this, why Google wouldn&#8217;t just match them. &#8220;If they wanted to triple size of their index, they&#8217;d have to triple the size of every server and cluster. It&#8217;s not easy or fast,&#8221; said Patterson.</p>
<p>In a follow-up, Cuil added that Google being as large as they estimated it to be now was largely down to Patterson&#8217;s work at Google, and since she&#8217;s no longer there, increasing the index size will be a &#8220;non-trivial&#8221; exercise.</p>
<p>Perhaps. And perhaps the infrastructure that Cuil has built does make it easier for them to more cheaply index documents from across the web than Google. But Google has plenty of money and engineering expertise of its own. It&#8217;s foolish to think they wouldn&#8217;t counter what might be perceived as a weakness. They responded to Yahoo in 2005; they&#8217;d do the same with Cuil. And for what? Even if Cuil is bigger than Google, it doesn&#8217;t mean Cuil is more relevant. Nor does it mean adding more documents in an &#8220;I&#8217;m bigger than you&#8221; game would improve the state of search overall.</p>
<p>Unfortunately, Google started reacting to Cuil&#8217;s claims even before Cuil made them. <a href="http://searchengineland.com/080725-161058.php">In a post
on Friday</a>, Google just so happened to decide it was time to mention they &#8220;knew&#8221; of 1 trillion items on the web. That will confuse some people into thinking Google has indexed 1 trillion documents, even though they don&#8217;t say this. What Google did say clearly was:</p>
<blockquote>We don&#8217;t index every one of those trillion pages &#8212; many of them are
similar to each other, or represent auto-generated content similar to the
calendar example that isn&#8217;t very useful to searchers. But we&#8217;re proud to
have the most comprehensive index of any search engine, and our goal always
has been to index all the world&#8217;s data.</blockquote>
<p>My response to Google &#8212; and to Cuil &#8212; and to any search engine that tries to do the size battle is what
<a href="http://searchengineland.com/080725-161058.php">I said</a> on Friday:</p>
<blockquote>There&#8217;s no exact answer to what&#8217;s a useful page &#8212; and so in turn,
there&#8217;s no one exact answer to who has the &#8220;most&#8221; of them collected. Tell me
you have a good chunk of the web, and I&#8217;m fine. But when Google or any
search engine start making size claims, my hackles go way up. There are
better things to focus on.</blockquote>
<p>As a side note, one issue with any large index is keeping it fresh. Cuil says that they crawl 1 to 1.5 billion pages per day, which means it would take 3 months to refresh everything they&#8217;ve currently spidered. However, some important pages are crawled on a weekly basis, they said. That&#8217;s good &#8212; but Google has pages that can be added in near-real time thanks to its <a href="http://searchengineland.com/080130-103249.php">instant layer</a>.</p>
<p><strong>So Long, PageRank?</strong></p>
<p>Cuil is making a big push that it ranks pages by content, rather than popularity. The idea here is to poke at how Google is commonly viewed as just rewarding the pages that have the most PageRank value.</p>
<p>The problem is that PageRank is just part of the way Google ranks pages. It looks at a variety of other factors, so that ranking is not just a popularity contest (see <a href="http://searchengineland.com/070426-011828.php">What Is Google PageRank? A Guide For Searchers &amp; Webmasters</a> for more about this).</p>
<p>The other issue is that despite the PR pitch, Cuil is indeed using popularity to rank results, as far as I can tell.</p>
<p>For example, in a search for [harry potter], the Harry Potter &amp; The Order Of The Phoenix movie web site comes up first on Cuil. This is out of thousands of possible pages. How on earth can Cuil know just from the content on the page itself that the movie site should be in the top results, especially in a web environment where people can (and will) custom tailor content to mislead search algorithms?</p>
<p>The answer is link analysis &#8212; counting links and effectively seeing who is pointed at the most. The twist is that it is done by measuring the links from pages relevant to what someone searched on.</p>
<p>Let&#8217;s go back to the [harry potter] search. When you do that at Cuil, it finds all the pages that it thinks are related to those two words. This means pages that use those words, as well as pages that have other words on them, such as &#8220;harry potter books&#8221; or &#8220;gryffindor.&#8221; It figures out these relationships by seeing what type of words commonly appear across the entire set of pages it finds. Since &#8220;gryffindor&#8221; appears often on pages that also say &#8220;harry potter,&#8221; it can tell these two words (well, three words&#8211; but two different query terms) are related.</p>
<p>Cuil then looks at the entire set to see which pages are linked to from them. Those with many or important links are likely to do better. Since the Harry Potter movie page has a lot of links pointing at it, it comes up
higher in the results. Cuil even has a name for this &#8212; IdeaRank.</p>
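<p>To make the idea concrete, here is a minimal sketch of subject-specific link counting. It is not Cuil&#8217;s IdeaRank or Ask&#8217;s ExpertRank, whose details are not public; the page data, the example URLs and the function name <code>topic_link_rank</code> are all made up for illustration. It also simplifies by treating a page as on-topic only if it contains every query term, and it skips the related-word step described above.</p>

```python
from collections import Counter

# Hypothetical mini-web: each page has some text and a list of outbound links.
PAGES = {
    "movie.example": {"text": "Harry Potter and the Order of the Phoenix movie", "links": []},
    "fan1.example": {"text": "harry potter fan site gryffindor", "links": ["movie.example", "books.example"]},
    "fan2.example": {"text": "Harry Potter news", "links": ["movie.example"]},
    "books.example": {"text": "Harry Potter books", "links": ["movie.example"]},
    "cars.example": {"text": "used cars", "links": ["books.example", "books.example"]},
}


def topic_link_rank(pages, query_terms):
    """Rank pages by counting only the links that come from on-topic pages."""
    terms = [t.lower() for t in query_terms]
    # Step 1: the on-topic set is every page whose text mentions all query terms.
    on_topic = {
        url for url, page in pages.items()
        if all(t in page["text"].lower() for t in terms)
    }
    # Step 2: count inbound links, but only those that start inside that set.
    votes = Counter()
    for url in on_topic:
        for target in pages[url]["links"]:
            if target != url and target in pages:
                votes[target] += 1
    # Step 3: most-linked pages first; ties broken by URL for stable output.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


RANKED = topic_link_rank(PAGES, ["harry", "potter"])
```

<p>In this toy data the movie page collects three links from on-topic pages and the books page collects one. The two links from the off-topic cars page are ignored, which is the whole point of the &#8220;subject-specific&#8221; twist.</p>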
<p>If this sounds familiar to some people, that&#8217;s because this particular flavor of link analysis was popularized by Teoma, which was later acquired by Ask. When Teoma appeared, it tried to distinguish itself against Google by saying it analyzed only the &#8220;subject-specific&#8221; links to do ranking. This is still <a href="http://about.ask.com/en/docs/about/webmasters.shtml">played up</a> at Ask today:</p>
<blockquote>Our ExpertRank algorithm goes beyond mere link popularity (which ranks
pages based on the sheer volume of links pointing to a particular page) to
determine popularity among pages considered to be experts on the topic of
your search. This is known as subject-specific popularity. Identifying
topics (also known as &#8220;clusters&#8221;), the experts on those topics, and the
popularity of millions of pages amongst those experts &#8212; at the exact moment
your search query is conducted &#8212; requires many additional calculations that
other search engines do not perform. The result is world-class relevance
that often offers a unique editorial flavor compared to other search
engines.</blockquote>
<p>Fair to say, despite Ask&#8217;s supposed improved analysis, it never trounced Google. Moreover, there are plenty who assume &#8211; <a href="http://searchengineland.com/070716-000001.php">including myself</a> &#8211; that Google itself does subject-specific link analysis.</p>
<p>So the rank by content twist at Cuil? As I&#8217;ve said, more twist than substance. But the content analysis is used in other ways, as I&#8217;ll get into next.</p>
<p><strong>Three Column &#8220;Magazine&#8221; Display</strong></p>
<p>Probably the most dramatic difference between Cuil and Google is how Cuil runs search results across three columns, rather than all in a straight line:</p>
<p><a title="Cuil Search Results by search-engine-land, on Flickr" href="http://www.flickr.com/photos/searchengineland/2708531681/">
<img src="http://farm4.static.flickr.com/3130/2708531681_58389811e6.jpg" border="0" alt="Cuil Search Results" width="500" height="415" /></a></p>
<p>It&#8217;s appealing in one sense that you can see more results all at once. In fact, Cuil said that in user testing, the display had an impact on which result was seen as &#8220;number one.&#8221; Some viewed the result in the top left
corner as most important &#8212; others went to the one in the top right. When all nine results can be seen on a large screen, some assume the one in the middle is best.</p>
<p>In the top right corner, there&#8217;s a &#8220;Related Searches&#8221; box that allows you to refine your search and drill into specific topics:</p>
<p><a title="Cuil Refinement Option by search-engine-land, on Flickr" href="http://www.flickr.com/photos/searchengineland/2708532497/">
<img src="http://farm4.static.flickr.com/3034/2708532497_229eec5def_o.jpg" border="0" alt="Cuil Refinement Option" width="313" height="297" /></a></p>
<p><a title="Cuil Refinement Option &lt;/p&gt; &lt;p&gt;As covered earlier,&lt;br /&gt; Cuil is able to understand there are sets of pages related to certain terms.&lt;br /&gt; Clustering like this isn't new -- it's been in the search space for years,&lt;br /&gt; with &lt;a href=" href="http://www.flickr.com/photos/searchengineland/2708532497/">Vivisimo a long-time pioneer</a> and Yahoo recently tapping into this to some degreewith its <a href="http://searchengineland.com/070725-233903.php">Yahoo
Search Assist service</a>.</p>
<p>Somewhat related to this, &#8220;tabs&#8221; appear at the top of the search page listing related searches:</p>
<p><a title="Cuil Query Tabs by search-engine-land, on Flickr" href="http://www.flickr.com/photos/searchengineland/2709353008/">
<img src="http://farm4.static.flickr.com/3046/2709353008_1074b6dd69.jpg" border="0" alt="Cuil Query Tabs" width="500" height="74" /></a></p>
<p>Select one of these tabs, and you get back results on that particular topic. Tabs that appear reflect how popular those phrases are on the web.</p>
<p>Underneath the display, Cuil is also working to do what it views as a twist. It&#8217;s trying to diversify the results by topic, so that in the case of Harry Potter, for example, you might get pages about the book as well as the
movie and the author, rather than just results that are all about the book.</p>
<p>&#8220;We&#8217;re trying to choose pages that aren&#8217;t all on the same topic and show you the diversity of the web,&#8221; said Patterson.</p>
<p>Again, however, it&#8217;s not like a Google search lacks diversity. A search on Harry Potter there also brings back different types of results.</p>
<p>Actually, a far more dramatic example of results diversity is what someone like Hakia does. Unlike Cuil, which divides pages into topics based on word patterns, <a href="http://searchengineland.com/071031-200015.php">Hakia is doing real semantic analysis</a> &#8212; trying to understand what words actually mean and what pages are about. As a result, it groups results for [harry
potter] into various categories such as:</p>
<ul>
<li>The Books</li>
<li>Headlines</li>
<li>News &amp; Interviews</li>
<li>The Soundtrack</li>
<li>The Games</li>
<li>Photographs &amp; Pictures</li>
<li>Blogs &amp; Fan Sites</li>
<li>Myths &amp; Controversies</li>
<li>Merchandise</li>
</ul>
<p>Over at Mahalo, you get similar groupings &#8212; though these are done through human work rather than through concept analysis.</p>
<p>Where are the ads? At launch, there will be some public service ads at the bottom of the page, so that people are used to ads being in that spot. As for revenue generating ones, the company is considering whether it should build its own ad network or partner with someone else.</p>
<p>Related to display, Cuil automatically suggests search topics as you type into the box on its home page, queries that come from looking at the most popular related words from across the web. It will also suggest actual web sites to take you to, showing an icon next to their name:</p>
<p><a title="Cuil Suggested Queries by search-engine-land, on Flickr" href="http://www.flickr.com/photos/searchengineland/2708536777/">
<img src="http://farm4.static.flickr.com/3158/2708536777_b6b149b42b_o.jpg" border="0" alt="Cuil Suggested Queries" width="273" height="261" /></a></p>
<p><strong>The Privacy Card</strong></p>
<p>Cuil says that it&#8217;s not logging IP information or keeping any type of material that could be traced back to individual searchers. In contrast, all three major search engines do log IP addresses and cookie data along with searches and, in the case of Google, even allow searchers to store search history over time.</p>
<p>That may be reassuring to some searchers, but to date, even scare stories about what Google could do (not that it does) haven&#8217;t kept searchers away from it. <a href="http://searchengineland.com/080618-155829.php">Ask</a> and <a href="http://searchengineland.com/070723-084924.php">Microsoft</a> have both tried to play the privacy card against Google and gained nothing for it. Small player Ixquick <a href="http://ixquick.com/eng/protect_privacy.html">has been</a> playing the card even longer and has gotten no visible traction out of it.</p>
<p>Another issue is that by not allowing searchers to voluntarily allow for personalized results, Cuil might be missing out on an advancement in search where <a href="http://searchengineland.com/080528-131813.php">Google&#8217;s ahead of the pack</a>.</p>
<p><strong>Under The Hood</strong></p>
<p>Behind the scenes, Cuil talks about an infrastructure that&#8217;s designed to be faster and more efficient than those at Google or the competing search engines. Supposedly, that means Cuil can do things cheaper and better than the other players.</p>
<p>I&#8217;ll leave this to others who are tech heads to dissect more in the future. For my part, I&#8217;ll just say that I&#8217;ve heard this line many times over the years. AltaVista would say how it was better in architecture than Lycos.
Inktomi would then talk about how it was more distributed and cheaper than AltaVista. Then FAST would say it was even more distributed than Inktomi. And Google, you know, was so with it that you could break circuit boards and things just kept working.</p>
<p>I&#8217;d also get briefed on how super-duper the Google infrastructure was, for example, then a few years later, there would be a completely new one introduced that lets them do things not possible under the old one.</p>
<p>So call me jaded. Everyone&#8217;s always saying they&#8217;ve got the best, fastest and least expensive way of doing stuff. If they do, proving it has been difficult, nor has it seemed to make much difference in market share.</p>
<p>I&#8217;ll leave with a few stats, however. Right now Cuil runs off of two data centers, using a combined 1,000 machines each running 8 CPUs. Another 280 machines split between data centers are used to serve results rather than index the web.</p>
<p><strong>The Name</strong></p>
<p>Does something seem missing with Cuil&#8217;s name? They lost the second L that was part of their original name, Cuill.</p>
<p>&#8220;We had a moment of silence for the departing L,&#8221; said Patterson, explaining that people found it easier to remember the name with one L rather than two.</p>
<p>The name means &#8220;wisdom&#8221; in Gaelic, from the <a href="http://www.druidry.org/obod/trees/hazel.html">legend of Finn MacCuil</a>. That brings in another Teoma connection. Teoma is apparently a Gaelic word for &#8220;cunning.&#8221;</p>
<p><strong>Will Cuil Succeed?</strong></p>
<p>Does Cuil think it can beat Google in the search space? They won&#8217;t come out and say that, but you can&#8217;t help but feel that&#8217;s the goal when talking to them. How about beating at least Microsoft or Yahoo?</p>
<p>&#8220;We&#8217;ll take that for a start. If we do that in the next year and a half, I&#8217;ll be an extremely happy person,&#8221; Patterson said.</p>
<p>But is even that realistic? Microsoft and Yahoo themselves both have mature search products. What&#8217;s especially important is that the big three also offer more than the web search that Cuil is providing at launch. News search, image search, video search, local search &#8212; these are just some of the verticals that Cuil lacks but which do get used by searchers. Not offering these makes Cuil feel too focused on what &#8220;old school&#8221; search used to be and missing out on the <a href="http://searchengineland.com/071127-091128.php">Search 3.0</a> vertical and blended search revolution that has been going on.</p>
<p>Clustering of pages; subject-specific link analysis &#8212; these are things others have tried and gained no market share with. Having a comprehensive index is great, but no one prior to launch has been able to play with the service and measure core relevancy. Cuil itself said it had no metrics to show it is more relevant than Google. So why would anyone think it could gain share?</p>
<p>I tend to think Cuil&#8217;s hitting the timing right to pick up a little share, maybe a point or two (which is huge compared to other start-ups). I don&#8217;t think word-of-mouth is dead, and I&#8217;m cautiously optimistic that Cuil will have good relevancy even without having tried it yet (if not, all bets are off). I think you&#8217;ve still got a core of early adopters and tech geeks out on the web who want the &#8220;next Google,&#8221; especially at a time when the
existing Google seems so big and threatening to others. A good underdog can fill a need, if it&#8217;s a quality underdog &#8212; and neither Yahoo nor Microsoft has that underdog spirit.</p>
<p>It&#8217;s possible that Cuil could be a wild success that eclipses Yahoo and Microsoft and does threaten Google itself, of course. Anything&#8217;s possible. But I think that&#8217;s unlikely for reasons I <a href="http://searchengineland.com/080103-084033.php">wrote before</a>:</p>
<blockquote><p>Google came along at a very special time, as I&#8217;ve long written. It had
better technology at a time when all the search engines had abandoned
improving search, since that was seen as a loss leader. The money was in
portal features.</p>
<p>Today, search is a multi-billion dollar industry. If someone with a
serious search threat comes along, you buy them (such as with YouTube), or
you start to develop your own rival if it seems a real threat. Google&#8217;s not
omnipotent &#8212; but you&#8217;ve already got a space where it&#8217;s Google, Yahoo,
Microsoft, and Ask all seriously fighting it out (and the latter three,
despite their funding and experience, still struggle against Google as being
synonymous as a trusted search brand for most users).</p>
<p>To date, Google is the real exception of &#8220;a better mousetrap wins.&#8221; It&#8217;s
far more likely the companies above, if they do gain traction, will end up
being purchased for a large amount by one of the existing &#8220;search utility
companies.&#8221;</p></blockquote>
<p>Is Cuil open to being purchased, as <a href="http://searchengineland.com/080701-144250.php">happened to search start-up Powerset</a> earlier this month? Powerset was seen by many as a Google-killer (though <a href="http://searchengineland.com/080512-000100.php">not by me</a> and several others). In the end, despite the hype that Powerset itself helped fuel (Cuil&#8217;s been careful to avoid this), it got gobbled up. What if Microsoft&#8217;s Steve Ballmer came knocking?</p>
<p>&#8220;I don&#8217;t know what he&#8217;d be knocking for, whether it be acquisition or partnership or whatever. We do intend on being polite. We believe in getting to know people and making friends because you never know
what deal may come down the line,&#8221; Patterson said.</p>
<p>For related discussion, <a href="http://www.techmeme.com/#a080728p3">see Techmeme</a>.</p>
<p><strong>Postscript:</strong> <a href="http://searchengineland.com/080728-024035.php">See Cuil Fast Test &#8211; Relevancy Isn&#8217;t A Google Killer</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/cuil-launches-can-this-search-start-up-really-best-google-14459/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Powerset Launches &#8220;Understanding Engine&#8221; For Wikipedia Content</title>
		<link>http://searchengineland.com/powerset-launches-understanding-engine-for-wikipedia-content-13970</link>
		<comments>http://searchengineland.com/powerset-launches-understanding-engine-for-wikipedia-content-13970#comments</comments>
		<pubDate>Mon, 12 May 2008 04:01:00 +0000</pubDate>
		<dc:creator>Danny Sullivan</dc:creator>
				<category><![CDATA[Search Engines: Hakia]]></category>
		<category><![CDATA[Search Engines: Powerset]]></category>
		<category><![CDATA[Search Engines: Wikipedia]]></category>
		<category><![CDATA[Search Features: Natural Language]]></category>
		<category><![CDATA[Search Features: Query Refinement]]></category>

		<guid isPermaLink="false">http://searchengineland.com/beta/powerset-launches-understanding-engine-for-wikipedia-content-13970.php</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>After nearly two years in the making &#8212; and plenty of hype &#8211;
<a href="http://www.powerset.com/">Powerset</a> has
finally rolled out a &quot;natural language&quot; search engine. It&#8217;s not a Google killer.
It&#8217;s barely a business model right now. But at least it&#8217;s something the world
can finally play with, and under the hood, there&#8217;s lots of potential.</p>
<p>By the time you read this, the Powerset site should have changed into a tool
that allows you to search
against material within Wikipedia. Why bother using Powerset rather than using Wikipedia&#8217;s own search tool or even Google
<a href="http://www.google.com/advanced_search?q=site:en.wikipedia.org">set to look only within Wikipedia
pages</a>? The Powerset pitch is that you&#8217;ll get better results because
Powerset&#8217;s technology has read
and understood what every word within Wikipedia actually means.</p>
<p><span id="more-13970"></span></p>
<p><b>An Understanding Engine, Not Natural Language Search</b></p>
<p>To understand that more, I beg that you forget you ever heard &quot;natural language&quot;
being associated with Powerset. That&#8217;s not really describing what they do in
comparison to regular search engines.</p>
<p>To explain, you have to understand that Google and the other major search
engines are largely stupid.
They don&#8217;t really understand the content on the pages that they &quot;read.&quot; If they see the word &quot;walk&quot; in a sentence, they don&#8217;t know if walk is
being used as a verb or a noun. In very general terms, they don&#8217;t even know that
words are words. Words are more or less patterns to them &#8212; collections of
letters &#8212; and when someone
searches, they try to find the pages that have those patterns in them or in
links to those pages.</p>
<p>That&#8217;s VERY simplified, OK? The major search engines DO have some smarts, some
ability to know that walk is related to walking or that walk and run might be
similar words. But this is largely done through statistical guessing, rather
than comprehending what the individual words actually mean, especially in terms
of their exact grammatical usage.</p>
<p>Powerset is different. It says that its technology reads and comprehends each
word on a page. It looks at each sentence. It understands the words in each
sentence and how they relate to each other. It works out what that sentence
really means, all the facts that are being presented. This means it knows what
any page is really about.</p>
<p>In lieu of a better phrase, call it an &quot;understanding engine.&quot; Maybe that&#8217;s
not the right phrase, but natural language search isn&#8217;t it, either.
&quot;Understanding engine&quot; at least highlights what is unique about Powerset: because
it actually understands what pages are about, it can extract facts from those pages and
comprehend how those facts, as well as those pages, relate to each other.</p>
<p><b>Wikipedia Discovery Tool</b></p>
<p>One of the chief uses for Powerset is employing it as a Wikipedia discovery
or query refinement tool. To use the Powerset example they gave me during a briefing last week, consider a
search for [henry viii]. What&#8217;s someone interested in when they search on
that topic, given Henry did a lot of things during his reign?</p>
<p>Over at Google, we get query refinement suggestions at the bottom of the
page, like this:</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2482533093/" title="Google Query Refinement by search-engine-land, on Flickr">
<img src="http://farm4.static.flickr.com/3140/2482533093_c3ff0f2415.jpg" width="500" height="87" alt="Google Query Refinement" border="0" /></a></p>
<p>At Yahoo</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2483347718/" title="Yahoo Query Refinement by search-engine-land, on Flickr">
<img src="http://farm3.static.flickr.com/2223/2483347718_0a4874a907.jpg" width="500" height="136" alt="Yahoo Query Refinement" border="0" /></a></p>
<p>At Microsoft</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2483347764/" title="Microsoft Query Refinement by search-engine-land, on Flickr">
<img src="http://farm3.static.flickr.com/2064/2483347764_2632c0c9d2_o.jpg" width="158" height="223" alt="Microsoft Query Refinement" border="0" /></a></p>
<p>Most of these are generated by looking at the relationships between those who have
searched for one topic and then may have gone off and done another search. Yahoo
has the most sophisticated of the pack (see
<a href="http://searchengineland.com/070725-233903.php">Search Suggestions On
Steroids: Yahoo Search Assist</a>), but it still hasn&#8217;t actually
&quot;read&quot; about Henry VIII and tried to group him into subtopics, in the way a human
might.</p>
<p>That&#8217;s what Powerset tries. Here&#8217;s what you get in a search for Henry VIII:</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2483348160/" title="Powerset Query Refinement by search-engine-land, on Flickr">
<img src="http://farm3.static.flickr.com/2226/2483348160_7de4731125.jpg" width="500" height="369" alt="Powerset Query Refinement" border="0" /></a></p>
<p>Notice the tabs at the top, where it recognizes Henry VIII could refer to the
person, the opera, the play, or even a television drama. OK, so not too amazing
when you think about it. But look further to the &quot;Factz&quot; area. Here you can see
that Powerset, after reading through Wikipedia, has figured out that Henry VIII
&quot;dissolved&quot; things like monasteries or that he &quot;granted&quot; things like land. And
yes, he &quot;married&quot; a few people.</p>
<p>There are even more facts that can be found like this:</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2482533815/" title="Powerset Factz by search-engine-land, on Flickr">
<img src="http://farm3.static.flickr.com/2182/2482533815_e10b047083.jpg" width="431" height="500" alt="Powerset Factz" border="0" /></a></p>
<p>This is nice refinement. Running down the list, you can quickly scan the many
facts that define Henry&#8217;s life. And from the list, with a click, you can drill
in more about topics and jump right to particular pages within Wikipedia:</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2482533897/" title="Powerset Factz by search-engine-land, on Flickr">
<img src="http://farm4.static.flickr.com/3204/2482533897_dc0af3e3dd_o.jpg" width="468" height="120" alt="Powerset Factz" border="0" /></a></p>
<p>See how there&#8217;s a link to the
<a href="http://en.wikipedia.org/wiki/Falmouth,_Cornwall">Falmouth, Cornwall</a>
page? Powerset has seen that there&#8217;s something Henry VIII built mentioned on
that page, Pendennis Castle. That&#8217;s not covered on the main
<a href="http://en.wikipedia.org/wiki/Henry_VIII_of_England">Henry VIII page</a>,
but because Powerset has read both pages and understands what they are about, it
can link the facts together.</p>
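<p>As a rough illustration of what linking facts across pages could look like, here is a small sketch. It is not how Powerset works internally, which the company did not detail; the facts below are typed in by hand rather than extracted from text, and the name <code>facts_about</code> is purely hypothetical. The hard part, reading sentences and turning them into facts, is exactly what this sketch leaves out.</p>

```python
from collections import defaultdict

# Hand-written (subject, verb, object, source page) facts. A real system would
# have to extract these from sentences, which this sketch does not attempt.
FACTS = [
    ("Henry VIII", "dissolved", "monasteries", "Henry VIII of England"),
    ("Henry VIII", "granted", "land", "Henry VIII of England"),
    ("Henry VIII", "married", "Anne Boleyn", "Henry VIII of England"),
    ("Henry VIII", "built", "Pendennis Castle", "Falmouth, Cornwall"),
]


def facts_about(subject, facts):
    """Group every fact about a subject by verb, keeping the source page."""
    grouped = defaultdict(list)
    for subj, verb, obj, page in facts:
        # Case-insensitive match so [henry viii] finds "Henry VIII".
        if subj.lower() == subject.lower():
            grouped[verb].append((obj, page))
    return dict(grouped)


HENRY = facts_about("henry viii", FACTS)
```

<p>Because each fact carries the page it came from, a lookup for Henry VIII can surface the Pendennis Castle fact even though it lives on the Falmouth page rather than his own.</p>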
<p><b>Overkill For Now?</b></p>
<p>In short, the refinement is cool. What&#8217;s not to love about it? For one, it
might be overkill. During the demo, Powerset made a big deal of how it
could build information from across various Wikipedia pages that isn&#8217;t written
on any single one of them. For example, a search for [hulk hogan]
brought this up:</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2483348582/" title="Powerset Factz by search-engine-land, on Flickr">
<img src="http://farm3.static.flickr.com/2074/2483348582_a6e475ff9f.jpg" width="500" height="101" alt="Powerset Factz" border="0" /></a></p>
<p>See how those who Hulk Hogan has defeated are itemized? It&#8217;s nice &#8212; but do
you really trust that all the defeats have been captured? I wouldn&#8217;t. I&#8217;d
probably still go looking for an authoritative list that had been reviewed by a
human. Moreover, I can get lists
like that without great refinement. A search for
<a href="http://www.google.com/search?q=hulk hogan victories">hulk hogan
victories</a> on Google brings me to this
<a href="http://prowrestling.about.com/od/thewrestlers/p/hulkhogan.htm">nice
page</a> on About.com listing his world title victories.</p>
<p>In addition, while Powerset did a nice job of breaking down Henry VIII
according to Wikipedia, Wikipedia&#8217;s human editors do a pretty nice job right in
the opening paragraphs to the Henry VIII page:</p>
<blockquote>
<p><i><b>Henry VIII</b> (<a href="http://en.wikipedia.org/wiki/June_28" title="June 28">28
June</a> <a href="http://en.wikipedia.org/wiki/1491" title="1491">1491</a> –
<a href="http://en.wikipedia.org/wiki/January_28" title="January 28">28
January</a> <a href="http://en.wikipedia.org/wiki/1547" title="1547">1547</a>)
was
<a href="http://en.wikipedia.org/wiki/Kingdom_of_England" title="Kingdom of England">
King of England</a> and
<a href="http://en.wikipedia.org/wiki/Lordship_of_Ireland" title="Lordship of Ireland">
Lord of Ireland</a>, later
<a href="http://en.wikipedia.org/wiki/King_of_Ireland" class="mw-redirect" title="King of Ireland">
King of Ireland</a> and claimant to the
<a href="http://en.wikipedia.org/wiki/Kingdom_of_France" title="Kingdom of France">
Kingdom of France</a>, from
<a href="http://en.wikipedia.org/wiki/April_21" title="April 21">21 April</a>
<a href="http://en.wikipedia.org/wiki/1509" title="1509">1509</a> until his
death. Henry was the second monarch of the
<a href="http://en.wikipedia.org/wiki/House_of_Tudor" class="mw-redirect" title="House of Tudor">
House of Tudor</a>, succeeding his father,
<a href="http://en.wikipedia.org/wiki/Henry_VII_of_England" title="Henry VII of England">
Henry VII</a>.</i></p>
<p><i>Henry VIII was a significant figure in the history of the English monarchy.
Although in the first parts of his reign he energetically suppressed the
<a href="http://en.wikipedia.org/wiki/English_Reformation" title="English Reformation">
Reformation</a> of the
<a href="http://en.wikipedia.org/wiki/Church_of_England" title="Church of England">
Anglican Church</a>, which had been building steam since
<a href="http://en.wikipedia.org/wiki/John_Wycliffe" title="John Wycliffe">
John Wycliffe</a> of the fourteenth century, he is more often known for his
ecclesiastical struggles with Rome. These struggles ultimately led to him
separating the Anglican Church from Roman authority, the
<a href="http://en.wikipedia.org/wiki/Dissolution_of_the_Monasteries" title="Dissolution of the Monasteries">
Dissolution of the Monasteries</a>, and establishing the English monarch as
the
<a href="http://en.wikipedia.org/wiki/Supreme_Head_of_the_Church_of_England" class="mw-redirect" title="Supreme Head of the Church of England">
Supreme Head of the Church of England</a>. Although some claim he became a
Protestant on his death-bed, he advocated Catholic ceremony and doctrine
throughout his life; royal backing of the English Reformation was left to his
heirs,
<a href="http://en.wikipedia.org/wiki/Edward_VI" class="mw-redirect" title="Edward VI">
Edward VI</a> and
<a href="http://en.wikipedia.org/wiki/Elizabeth_I" class="mw-redirect" title="Elizabeth I">
Elizabeth I</a>. Henry also oversaw the legal union of
<a href="http://en.wikipedia.org/wiki/England" title="England">England</a> and
<a href="http://en.wikipedia.org/wiki/Wales" title="Wales">Wales</a> (see
<a href="http://en.wikipedia.org/wiki/Laws_in_Wales_Acts_1535–1542" title="Laws in Wales Acts 1535–1542">
Laws in Wales Acts 1535–1542</a>). He is noted in popular culture for being
<a href="http://en.wikipedia.org/wiki/Wives_of_Henry_VIII" title="Wives of Henry VIII">
married six times</a>.</i></p>
</blockquote>
<p>I suspect most people hitting Wikipedia are already going to find an opening
paragraph like that, which does a
pretty good job guiding them in refining their topics about Henry VIII and pointing them to
facts.</p>
<p>That&#8217;s a problem for Powerset, which told me it hopes to attract lots of
those Wikipedia users to its own site, where they&#8217;ll eventually be shown ads
alongside the content (ads aren&#8217;t present at launch).</p>
<p>Powerset was at pains to explain how popular Wikipedia is and what a well
used resource it has become. Agreed &#8212; and plenty of those people wind up there
because they&#8217;ve done searches at Google. About 70 percent of Wikipedia users
come via search engines, according to Powerset itself. That&#8217;s a huge audience
that is NOT going to magically be routed to Powerset instead. Yes, some know to go directly
to Wikipedia. No doubt some of these users will hear of the new
Powerset tool and go there. However, it will be a
stunning achievement if these are more than a fraction of those who hit the main Wikipedia site.</p>
<p><b>Article Outlines</b></p>
<p>Powerset has another trick up its sleeve that might pull in the people. For
any page you visit, there&#8217;s an &quot;Article Outline&quot; box that appears within it,
like this:</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2482534405/" title="Powerset Article Outlines by search-engine-land, on Flickr">
<img src="http://farm3.static.flickr.com/2227/2482534405_ec6970e13b.jpg" width="500" height="302" alt="Powerset Article Outlines" border="0" /></a></p>
<p>It&#8217;s very slick. Select an item, and you&#8217;re jumped to the spot within the
document related to it:</p>
<p><a href="http://www.flickr.com/photos/searchengineland/2483349268/" title="Powerset Article Outlines by search-engine-land, on Flickr">
<img src="http://farm3.static.flickr.com/2137/2483349268_b6ffd573fd.jpg" width="500" height="151" alt="Powerset Article Outlines" border="0" /></a></p>
<p>I think it&#8217;s self-evident that Powerset adds some nice value to Wikipedia.
Indeed, everyone would probably be smart to go to it directly rather than
Wikipedia itself. But as I&#8217;ve covered above, that&#8217;s not what I expect to happen.</p>
<p><b>Future In Site Search?</b></p>
<p>If Powerset fails to capture a wide audience, then what&#8217;s the way forward for
it? One area is to
provide better site-specific searching. Powerset&#8217;s technology can be applied to
any set of documents, to make it easier for people to find what they are looking
for within them. Site specific search allows those visiting a particular web
site to look just within that site. That market, along with enterprise search
(making intranets searchable) continues to grow. And the audience doing those
types of search are likely more inclined to seek out refinement options and
other exploratory tools than they are when performing general searches.</p>
<p>Powerset said this is a market they&#8217;re interested in, so perhaps we&#8217;ll see it
grow in that area. But for those expecting it to produce Google-wealth, keep in
mind that long-time and mature enterprise search player FAST
<a href="http://searchengineland.com/080108-080050.php">sold</a> for $1.2
billion earlier this year. Yes, that&#8217;s a huge amount of money, but it&#8217;s not
the multibillions Yahoo was going to go for, and it&#8217;s much less than what Google&#8217;s valued at.</p>
<p>Speaking of Yahoo, it used to be the leading candidate to
acquire Powerset, especially given some close ties between the companies (Powerset
has a number of former Yahoos on staff). Given Yahoo&#8217;s current troubles and
unstable state, I wouldn&#8217;t expect much here.</p>
<p>Could a tie-up with a major player like Google or Microsoft happen? Sure.
Aside from site search, the technology that allows machines to automatically
comprehend what text documents are about ought to have other applications and be
worth something. What those are and how much it is worth isn&#8217;t clear. Powerset&#8217;s
been smart to <a href="http://searchengineland.com/070209-093707.php">snap up</a> many licenses and patents around the technology that
should make it attractive to a larger search player like Google or Microsoft to
acquire. Within one of these organizations, I suspect more innovative things
would come.</p>
<p>FYI, I wrote the above paragraph last Friday, before the rumors (see
<a href="http://www.news.com/8301-13953_3-9940887-80.html">here</a> on News.com
and <a href="http://www.techmeme.com/080510/p13#a080510p13">here</a> on
Techmeme) that Microsoft might want to buy it came out over the weekend.
Actually, I started writing this article several months ago, and even that draft
covered how it might be an acquisition target. It&#8217;s a fairly obvious move to
expect any of the majors to take a look, and when I talked to Powerset several
months ago, I was given the impression that all the majors had taken a look.</p>
<p>Since then, of course, no one has acquired it &#8212; plus the company went
through a <a href="http://searchengineland.com/071102-133736.php">management
shake-up</a> last year. It was already under fire for not getting a product out
for so long. Add to these strikes against it as a potential Google killer the fact that it takes
Powerset about a month to comprehend Wikipedia&#8217;s 2.5 million topic pages. In
that time, many of those pages will have changed &#8212; thus needing to be
reread. Powerset&#8217;s impressive, but with the web having in excess of 20 BILLION
constantly changing pages, this is no overnight secret weapon that Microsoft might
buy and employ to take the search lead.</p>
<p>Indeed, what Powerset says it
has developed &#8212; along with patents locked up to protect it &#8212; is overkill for
what&#8217;s needed today. It will be more useful probably five years from now, in
ways we&#8217;re not even envisioning. For those players thinking long-term, which
include both Google and Microsoft, sure &#8212; it might well make sense to buy.</p>
<p>By the way, the Powerset launch will no doubt inspire interest in another
&quot;natural language&quot; search engine, Hakia. Someday I want to revisit Hakia and
explain more about why I also dislike the term &quot;natural language&quot; being applied
to it. In the meantime, you can read Vanessa Fox&#8217;s excellent article from last
October on the service, <a href="http://searchengineland.com/071031-200015.php">
Social Networking Through Search: Hakia Helps You Meet Others</a>. And if you
need a deflation of natural language hype, see
<a href="http://searchengineland.com/080103-084033.php">The Google Challengers:
2008 Edition</a>. In the section on Powerset, I summarize a long rant I did on
the history and hype of natural language search.</p>
<p>For related discussion, <a href="http://www.techmeme.com/080512/p1#a080512p1">see Techmeme</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/powerset-launches-understanding-engine-for-wikipedia-content-13970/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hakia Goes For &#8220;Quality&#8221; Over &#8220;Popularity&#8221;</title>
		<link>http://searchengineland.com/hakia-goes-for-quality-over-popularity-13775</link>
		<comments>http://searchengineland.com/hakia-goes-for-quality-over-popularity-13775#comments</comments>
		<pubDate>Tue, 15 Apr 2008 14:00:10 +0000</pubDate>
		<dc:creator>Greg Sterling</dc:creator>
				<category><![CDATA[Search Engines: Hakia]]></category>
		<category><![CDATA[Search Engines: Health & Medical Search Engines]]></category>
		<category><![CDATA[Search Engines: Other Search Engines]]></category>

		<guid isPermaLink="false">http://searchengineland.com/beta/hakia-goes-for-quality-over-popularity-13775.php</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>If <a href="http://hakia.com/">Hakia</a> were using an automotive analogy, the site might be saying to Google, &#8220;We&#8217;re BMW, you&#8217;re Chevrolet.&#8221; The Hakia blog <a href="http://blog.hakia.com/?p=275">explains</a> how the engine is taking a &#8220;quality&#8221; approach, trying to assess the credibility of sites in ranking them, together with the help of professional librarians. Hakia specifically discusses this in the context of health-related search and contrasts its approach with that of &#8220;popularity,&#8221; a general reference to Google&#8217;s original PageRank algorithm.</p>
<p><span id="more-13775"></span>
The company says that it will roll out &#8220;Quality Search&#8221; in a range of verticals &#8212; &#8220;law, finance, science, and in many other content-rich verticals&#8221; &#8212; based upon expert sources and librarian-aided indexing.</p>
<p>Stepping back, the irony here is that it&#8217;s a bit of a return to the &#8220;directory&#8221; approach of old. At the highest level, humans originally compiled lists of websites (e.g., the original Yahoo directory). That was eventually replaced by machine algorithms when the internet got to be too large to categorize everything with an editorial staff. Enter Google.</p>
<p>But when the internet became so large that information overload became somewhat overwhelming and routine, the trend of human-powered search or &#8220;social search&#8221; emerged to rectify some of the seeming randomness and inefficiency of these giant indexes. Social search, generally speaking, thus sought to inject a community filter into search results (e.g., Eurekster).</p>
<p>Hakia&#8217;s blog post and approach suggest a return to the &#8220;top-down&#8221; editorial efforts of the earlier days, albeit with the knowledge base and capabilities of today&#8217;s internet. (This simplification probably doesn&#8217;t fully capture what they&#8217;re doing.)</p>
<p>One can also see the <a href="http://searchengineland.com/071127-091128.php">blended/universal search trend</a> as an effort to get to &#8220;answers&#8221; beyond the information overwhelm that intrudes into the prior &#8220;10 blue links&#8221; approach that defined general web search for years.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/hakia-goes-for-quality-over-popularity-13775/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Google Challengers: 2008 Edition</title>
		<link>http://searchengineland.com/the-google-challengers-2008-edition-13049</link>
		<comments>http://searchengineland.com/the-google-challengers-2008-edition-13049#comments</comments>
		<pubDate>Thu, 03 Jan 2008 12:40:33 +0000</pubDate>
		<dc:creator>Danny Sullivan</dc:creator>
				<category><![CDATA[Blekko]]></category>
		<category><![CDATA[Google: Business Issues]]></category>
		<category><![CDATA[Search Engines: Hakia]]></category>
		<category><![CDATA[Search Engines: Mahalo]]></category>
		<category><![CDATA[Search Engines: Other Search Engines]]></category>
		<category><![CDATA[Search Engines: Search Wikia]]></category>

		<guid isPermaLink="false">http://searchengineland.com/beta/the-google-challengers-2008-edition-13049.php</guid>
		<description><![CDATA[Rich Skrenta &#8212; who, aside from creating the first computer virus, is more notable to search as a cofounder of the Open Directory Project and the Topix news search engine &#8212; has announced he&#8217;s founded a search start-up. A stealth one, as TechCrunch puts it. Don&#8217;t we already have several stealth search start-ups? Yep. Here&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>Rich Skrenta &#8212; who, aside from creating the first computer virus, is more
notable to search as a cofounder of the Open Directory Project and the Topix
news search engine &#8212; has announced he&#8217;s founded a search start-up. A stealth
one, as TechCrunch puts it. Don&#8217;t we already have several stealth search
start-ups? Yep. Here&#8217;s a guide to who&#8217;s who.</p>
<p><span id="more-13049"></span></p>
<p><strong>Blekko</strong></p>
<p>What we know so far about <a href="http://www.blekko.com/">Blekko</a> isn&#8217;t
much, and TechCrunch has the most details in its
<a title="Permanent Link to The Next Google Search Challenger: Blekko" rel="bookmark" href="http://www.techcrunch.com/2008/01/02/the-next-google-search-challenger-blekko/">
The Next Google Search Challenger: Blekko</a> post from yesterday. Apparently
Rich founded the company in September 2006, along with five other former Topix
employees, <a href="http://searchengineland.com/070627-084257.php">after he left
Topix in June</a>.</p>
<p>Rich told TechCrunch that nothing public is likely before 2009. I agree
with Michael Arrington at TechCrunch that Rich has a track record that makes him
well worth watching. <a href="http://www.dmoz.org/">The Open Directory</a> was
an initial success, though the model didn&#8217;t scale well. Some of that was within
the founders&#8217; control but had
<a href="http://www.skrenta.com/2006/12/dmoz_had_9_lives_used_up_yet.html">more
to do</a> with AOL&#8217;s lack of backing. The company should be dragged into the
International Court Of Search Crimes and be forced to sell the ODP to someone
who will support it properly. <a href="http://www.topix.net/">Topix</a> has
built a reputation and is still standing and succeeding &#8212; though I&#8217;d say it
still has far to go to seriously threaten Google or Yahoo.</p>
<p>Rich adds a bit more in his
<a href="http://www.skrenta.com/2008/01/why_search.html">Why Search?</a> post
today:</p>
<blockquote>Having just spent 5 years in the media space, I&#8217;ve come away with the idea
that editorial differentiation is possible. But the editorial voice of a
search engine is in the index&#8230;so it has to be <em>algorithmic editorial
differentiation</em>.</blockquote>
<p>So far, it doesn&#8217;t sound like a social networking play like some of the others.
We&#8217;ll be watching, Rich. Also see discussion today
<a href="http://www.techmeme.com/080102/p114#a080102p114">on Techmeme</a>.</p>
<p><strong>Powerset</strong></p>
<p><a href="http://www.powerset.com/">Powerset</a> is now a classic example of
why you WANT to be a stealth start-up and say little. That&#8217;s because when you
get too much early press &#8212; in part through your own doing &#8212; then fail to
deliver anything, the hype can swing back at you hard.</p>
<p>The company came to light back in October 2006
<a href="http://venturebeat.com/2006/10/02/bold-start-up-powerset-about-to-raise-10m-to-take-on-google/">
via VentureBeat</a>, with the twist being that natural language search would be
the way forward. That caused me to write a
<a href="http://blog.searchenginewatch.com/blog/061005-095006">long rant</a>
about the hype of natural language search in reaction. From the top of that:</p>
<blockquote><p>This is a rant. It&#8217;s a rant from
over 10 years of watching people trot out natural language search as the
&#8220;killer&#8221; solution to the current state of search, something that&#8217;s happening
once again with
Powerset. That&#8217;s a search engine you can&#8217;t even use at the moment, but the
hype will no doubt continue. To counteract that, my thoughts on and some
history about natural language search.</p>
<div id="a026282more"><p>Natural language search makes a compelling pitch for those who really
don&#8217;t know search or haven&#8217;t heard the natural language mantra before.
I&#8217;ve seen the pitch time and time again. You:</p>
<ul>
<li>Pick out an example that shows how &#8220;bad&#8221; search is on an existing
search engine</li>
<li>Demonstrate how natural language search would work better on your
service</li>
<li>Sit back and collect the press attention</li>
</ul>
</div>
</blockquote>
<p>I then went on to detail how natural language search had been hyped and tried
over the years. The short story is this: It doesn&#8217;t take much natural language
analysis to figure out what someone wants when they type in &#8220;britney spears
nude&#8221; or &#8220;hotmail.&#8221; In addition, by and large I don&#8217;t believe enough people will
change their basic search habits to enter long sentences when searching any time
soon.</p>
<p>Since that time, we&#8217;ve pretty much had nothing out of Powerset other than the
<a href="http://www.techmeme.com/070917/p117#a070917p117">launch</a> of Powerset
Labs in September 2007. That launch hasn&#8217;t produced any cool applications that
I&#8217;ve seen or heard about, nor much buzz. Instead, in November, we got a
<a href="http://searchengineland.com/071102-133736.php">management shake-up</a>.</p>
<p>For a more formal chronicle of the company&#8217;s developments, check out
<a href="http://venturebeat.com/index.php?tag=co:powerset">this area at
VentureBeat</a> and <a href="http://www.techcrunch.com/?s=powerset">these search
results at TechCrunch</a>.</p>
<p>Finally, while I&#8217;m harsh above on Powerset, I actually had a long visit with
the company in the middle of last year and was deeply impressed with the effort
going on there. I&#8217;m still working on a long write-up to explain what&#8217;s
happening. But in a nutshell, Powerset is trying to literally comprehend or
understand each page on the web.</p>
<p>Today&#8217;s search engines don&#8217;t know what a page is about by reading words.
They&#8217;re more or less doing pattern matching &#8212; finding pages that contain words
similar to what you search for (or pages relevant to those words based on
linkage). Powerset literally is trying to read and understand what a page is
about the way a human reads a page and knows it is on various subjects.</p>
<p>I don&#8217;t see that as making it a better search engine than Google. Instead, I
think it may eventually give it the ability to create a unique &#8220;auto-Wikipedia&#8221;
style site, assembling knowledge pages on any subject automatically. I also
think that there will eventually be some search benefit in comprehension of
pages, but I suspect exactly how that plays out depends on being part of an
existing search engine with a more traditional model. With the array of patents
Powerset has lined up, I suspect it will eventually get acquired by Google,
Yahoo, or Microsoft rather than roll out its own product. But we&#8217;ll see.</p>
<p><strong>Hakia</strong></p>
<p>Like Powerset, <a href="http://hakia.com/">Hakia</a> has played the natural
language search game. Unlike Powerset, it has a product anyone can use &#8212; live
since at least the middle of 2006.</p>
<p>Again, I&#8217;ve been working on a long write-up on the inner workings of Hakia
and have yet to finish it. It&#8217;s complicated, and I mainly want to cover what I
find to be the real use of their technology &#8212; the ability to create custom
&#8220;gallery&#8221; pages and understand those are related to particular searches.</p>
<p>It&#8217;s easier to show you what&#8217;s impressive. Search for
<a href="http://hakia.com/search.aspx?q=hillary+clinton">hillary clinton</a>,
and you get a nice page showing news, her official site, biography pages, blogs
&amp; fan sites, news &amp; interviews, and more. It&#8217;s very Mahalo-like, except it
doesn&#8217;t require human editors like Mahalo and predates Mahalo by a year.</p>
<p>That categorization is something I know the major search engines could do, if
they wanted. So far, they don&#8217;t. And so far, despite Hakia talking about its
<a href="http://blog.hakia.com/?p=211">rising traffic</a>, it has yet to make a
serious mark. Moreover, in October, it made a serious shift to allow social
interaction with its results. That&#8217;s a sign that the original plan that &#8220;natural
language will win all&#8221; has failed; therefore, another twist is needed.</p>
<p><a href="http://searchengineland.com/071031-200015.php">Social Networking
Through Search: Hakia Helps You Meet Others</a> from Vanessa Fox here at Search
Engine Land covers the change, gets into the natural language indexing
stuff I mentioned earlier that makes Hakia unique, and has examples of gallery
pages.</p>
<p><strong>Mahalo</strong></p>
<p>Credit to Jason Calacanis. He said he wanted to take on Google, then wasted no
time getting <a href="http://searchengineland.com/070530-180000.php">Mahalo</a>
rolled out. OK, he also says he&#8217;s not taking on Google &#8212; just focusing on the
top searches that he thinks would be better with human review. Sure, you aren&#8217;t
taking on Google, Jason.</p>
<p>To date, Jason reports that Mahalo&#8217;s traffic is growing and strong. But to
date, I&#8217;ve certainly seen no webmasters talking about what a traffic driver Mahalo
is. It would be too early to call it a raging success, but it&#8217;s a nice
alternative to have. Indeed, later this month I&#8217;ll finally finish my Search 4.0
piece that picks up from the conclusion of my
<a href="http://searchengineland.com/071127-091128.php">Search 3.0: The Blended
&amp; Vertical Search Revolution</a> article last November. I&#8217;ll show some examples
of how the human element at Mahalo can and has kicked some Google and
traditional search engine butt &#8212; though also how it isn&#8217;t the panacea some
expect.</p>
<p>Some of our
<a href="http://searchengineland.com/lands/search-engines-mahalo.php">past
coverage of Mahalo</a>:</p>
<ul>
<li><a href="http://searchengineland.com/070530-180000.php">Mahalo Launches
With Human-Crafted Search Results</a></li>
<li><a href="http://searchengineland.com/070613-084941.php">Mahalo Greenhouse:
Get Paid For Writing Search Results</a></li>
<li><a href="http://searchengineland.com/070711-101653.php">Search Spam Fight
- Mahalo: 1; Squidoo: 0</a></li>
<li><a href="http://searchengineland.com/070810-193355.php">Mahalo Follow:
Toolbar Gives You Human-Powered Alternatives To Searching, Surfing</a></li>
<li><a href="http://searchengineland.com/070827-121805.php">The Promise &amp;
Reality Of Mixing The Social Graph With Search Engines</a></li>
<li><a href="http://searchengineland.com/071212-060000.php">Mahalo Adds The
Social Graph To Search</a></li>
</ul>
<p><strong>Search Wikia / Wikia Search</strong></p>
<p>Wikipedia founder (as he prefers to be called; Wikipedia itself calls him
cofounder) Jimmy Wales made waves a year ago when he said he&#8217;d take on &#8220;closed&#8221; Google
with humans and a transparent search engine. His project, called
<a href="http://search.wikia.com/wiki/Search_Wikia">Search Wikia</a> (but, confusingly, also called Wikia Search), has grabbed attention from the press
over the past year. Slamming Google as a
<a href="http://www.boingboing.net/2008/01/01/wikiinspired-transpa.html">scary
closed thing</a> gets you good mileage, especially when you helped establish
Wikipedia, a threat Google takes so seriously that it may launch its own
Wikipedia-style site, <a href="http://searchengineland.com/071213-213400.php">
Google Knol</a>.</p>
<p>Now Wikia Search is at hand. A private &#8220;pre-alpha&#8221; test
<a href="http://searchengineland.com/071224-084959.php">started</a> in late
December, an invite-only thing I still find odd for a service that&#8217;s supposedly
all about the &#8220;transparency.&#8221; But on Monday, the general public will finally get
a look at whatever Wales and his team have concocted. In the meantime, while
Wales still hasn&#8217;t posted any news since July 27 to the &#8220;news&#8221; section of Search
Wikia, press reports tell us so far:</p>
<ul>
<li>Only a tiny 50 to 100 million pages will be indexed at launch. The major
search engines today have tens of billions of pages indexed. (<a href="http://ap.google.com/article/ALeqM5iVpozoN4SEv7fIbj-dSXBPinksWAD8TTR2T00">AP</a>)</li>
<li>There will be a high degree of human editorial influence, though whether
that&#8217;s over the algorithm or the search results on a per-query basis remains
to be seen (<a href="http://www.crn.com/software/205207267">CMP</a>)</li>
<li>An early <a href="http://www.flickr.com/photos/shuttterview/2001866209/">
screenshot</a> suggested that Search Wikia might be evolving more into a
Facebook-style service, perhaps with some ways for users to share results (<a href="http://www.matthewbuckland.com/?p=359">Matthew
Buckland</a> &amp;
<a href="http://blog.wired.com/business/2007/11/rumor-wikipedia.html">Wired</a>)</li>
</ul>
<p>Some of our
<a href="http://searchengineland.com/lands/search-engines-search-wikia.php">past
coverage of Search Wikia</a>:</p>
<ul>
<li><a href="http://searchengineland.com/061229-193718.php">Q&amp;A With Jimmy
Wales On Search Wikia</a></li>
<li><a href="http://searchengineland.com/070727-123006.php">Search Wikia Takes
Steps To Crawl; Acquires Grub</a></li>
<li><a href="http://searchengineland.com/070803-131149.php">Search Wikia Gets
Open Source Categorization Software</a></li>
<li><a href="http://searchengineland.com/071224-084959.php">Search Wikia
Launches In 2007 With Private Beta</a></li>
</ul>
<p><strong>Cuill</strong></p>
<p>Arguably the stealthiest of the stealth start-ups,
<a href="http://cuill.com/">Cuill</a> (pronounced &#8220;cool&#8221;) has an impressive
pedigree with its three founders: Tom Costello of IBM&#8217;s WebFountain project and
Anna Patterson and Russell Power of Google&#8217;s TeraGoogle project, its massive
search index. And last year, AltaVista founder Louis Monier &#8212; who later
went to eBay as its first eBay Fellow, then to Google &#8212; jumped ship from Google
to join Cuill.</p>
<p>I talked with Cuill earlier this year to understand a bit more about what
they are doing, but the details are still being held very closely. The main
difference between Cuill and everyone else I&#8217;ve named above is that Cuill is
founded by people who understand and have dealt with firsthand the challenge of
indexing billions of documents.</p>
<p>Cuill recently
<a href="http://www.techcrunch.com/2007/09/10/greylock-partners-invests-in-stealth-search-engine-cuill/">
took on more funding</a>. Louis is also going to be doing a
<a href="http://searchengineland.com/071217-053500.php">keynote</a> at our
<a href="http://searchmarketingexpo.com/west/">SMX West</a> search marketing
conference, held in Santa Clara, California from Feb. 26-28. I&#8217;m thrilled to be having
him since there are only a handful of people who have worked for the &#8220;old&#8221;
Google (AltaVista), the current Google (when he was at the Big G), and a
potential future Google (Cuill).</p>
<p><strong>And The Winner Is&#8230;</strong></p>
<p>If you think the future of search is in smart automation, Cuill&#8217;s definitely
one to watch, and perhaps Blekko as well. If you think it&#8217;s in the growing role of
humans, Mahalo and Search Wikia are your better candidates. The reality is that
success will likely be a blend of the two. For the human services, a real open
source index would be a big help &#8212; see
<a href="http://searchengineland.com/071106-102435.php">Google: As Open As It
Wants To Be (i.e., When It&#8217;s Convenient)</a> for more about this.</p>
<p>But the reality is that all of these services will have an incredibly tough
time beating Google.</p>
<p>Google came along at a very special time, as I&#8217;ve long written. It had better
technology at a time when all the search engines had abandoned improving search,
since that was seen as a loss leader. The money was in portal features.</p>
<p>Today, search is a multi-billion dollar industry. If someone with a serious
search threat comes along, you buy them (such as with YouTube), or you start to
develop your own rival if it seems a real threat. Google&#8217;s not omnipotent &#8212; but
you&#8217;ve already got a space where it&#8217;s Google, Yahoo, Microsoft, and Ask all
seriously fighting it out (and the latter three, despite their funding and
experience, still struggle against Google being synonymous with trusted
search for most users).</p>
<p>To date, Google is the real exception where &#8220;a better mousetrap wins.&#8221; It&#8217;s far
more likely the companies above, if they do gain traction, will end up being
purchased for a large amount by one of the existing &#8220;search utility companies.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/the-google-challengers-2008-edition-13049/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Changes At Natural Language Search Company Powerset</title>
		<link>http://searchengineland.com/changes-at-natural-language-search-company-powerset-12604</link>
		<comments>http://searchengineland.com/changes-at-natural-language-search-company-powerset-12604#comments</comments>
		<pubDate>Fri, 02 Nov 2007 17:37:36 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Business Issues: Acquisitions & Investments]]></category>
		<category><![CDATA[Search Engines: Hakia]]></category>
		<category><![CDATA[Search Features: Natural Language]]></category>

		<guid isPermaLink="false">http://searchengineland.com/beta/changes-at-natural-language-search-company-powerset-12604.php</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>Big management changes are going on at Powerset, which has <a href="http://searchengineland.com/070209-093707.php">received much attention</a> for its potential in using natural language processing for search. Barney Pell, who has been CEO at Powerset, <a href="http://www.barneypell.com/archives/2007/11/management_chan.html">posted today in his blog</a> that he is transitioning to CTO, that Steve Newcomb, who had been COO, is leaving the company, and that Ron Kaplan, who had been CTO and Chief Science Officer, is now solely Chief Science Officer. The company is currently looking for a CEO.</p>
<p>Several companies have been taking the natural language angle in creating the next generation search engine, but turning the potential into production has proven tricky. Powerset had hoped to launch their search engine this year but now thinks it <a href="http://venturebeat.com/2007/11/02/powerset-the-hyped-search-engine-company-sees-shakeup/">could be as late as 2009</a> (although they are hoping for a mid-2008 launch).</p>
<p>Hakia, another player in the space, hopes to have their <a href="http://searchengineland.com/071031-200015.php">own brand of natural language processing</a> fully powering their search engine next year.</p>
<p>Pell is optimistic about Powerset&#8217;s future. In his blog post, he says he&#8217;d love to talk to potential CEOs and closes by saying:</p>
<blockquote>&#8220;The changes we are making now position us for a next phase that promises to be really exciting. We will bring our technology out in real products that users will enjoy and that will trigger changes across the entire ecosystem of search. I think the next year is going to be an amazing time for Powerset and I am as passionate as ever about Powerset, our technology, our team and our future.&#8221;</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/changes-at-natural-language-search-company-powerset-12604/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Social Networking Through Search: Hakia Helps You Meet Others</title>
		<link>http://searchengineland.com/social-networking-through-search-hakia-helps-you-meet-others-12586</link>
		<comments>http://searchengineland.com/social-networking-through-search-hakia-helps-you-meet-others-12586#comments</comments>
		<pubDate>Thu, 01 Nov 2007 00:00:15 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Search Engines: Hakia]]></category>
		<category><![CDATA[Search Engines: Other Search Engines]]></category>
		<category><![CDATA[Search Features: Natural Language]]></category>

		<guid isPermaLink="false">http://searchengineland.com/beta/social-networking-through-search-hakia-helps-you-meet-others-12586.php</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.hakia.com/">Hakia</a>, a natural language search engine, has just added a new spin to search: social networking. Their new Meet Others feature lets you connect with others who are searching for the same things you are.</p>
<p>Since Hakia processes queries differently than old school search engines such as Google, you aren&#8217;t just matched up with people who typed in the exact query you did &#8212; you&#8217;re matched with a larger set of searchers that Hakia thinks are looking for the same things you are based on natural language processing. For instance, if you&#8217;re searching for &#8220;what drug treats a headache,&#8221; Hakia processes the semantic relationships between words and may deduce that someone searching for &#8220;what medicine relieves migraines&#8221; is a match. And that type of processing is the crux of how Hakia wants to differentiate itself.</p>
<p>I recently sat down with President and COO Melek Pulatkonak and CEO Dr. Riza Berkan to talk about what they&#8217;re doing in the search space and where they see things heading. More on how they&#8217;re tackling natural language processing below. First, a rundown of what was launched today.</p>
<p><span id="more-12586"></span>
In addition to working on a completely new way of indexing and ranking the web behind the scenes, Hakia is also working on providing a unique search experience. In July, they launched the <a href="http://company.hakia.com/scoopbar/scoopbarinfo.html">Hakia ScoopBar</a>, which highlights the sections of the pages that are relevant for your query when you click through to them from the search results.</p>
<p>Today, with Meet Others, they hope to add a social networking component to search. The feature is entirely opt-in. Once you do a search on Hakia, you&#8217;ll see a Meet Others icon above the search results. Click that to access a room designed for those doing similar searches. You can post a message and then provide details about how you want to be contacted (masked email or instant messaging via MSN or Skype). You can also contact others who have posted messages to the room. The freshest and most highly rated posts stay in the room longest. Older and less popular posts fall off as searchers make new posts. Hakia says they monitor abuse and have safeguards in place for spam (for instance, your post is authenticated through email).</p>
<p>What about privacy? Since the feature is opt-in, no one will see what you are searching for unless you decide to click the Meet Others icon and post a message. Even then, no registration is involved so your post isn&#8217;t associated with a username. And you decide what contact information you want to make available to others. If you choose email, Hakia masks it so anyone contacting you doesn&#8217;t see the address. (However, your IM details, if you choose to post them, are public.) You can also remove your post at any time, which removes any contact information you&#8217;ve made available.</p>
<p>Hakia likens this system a bit to craigslist.org. They want to bring people together. They use the example of someone looking for concert tickets. Someone else may post a message about tickets available for sale. Hakia can bring the buyer and seller together.</p>
<p>Here&#8217;s social networking search in action. Doing a search for &#8220;Looking for collectible pokemon cards&#8221; brings up results that include a Meet Others icon to the right of the search button.</p>
<p><a href="http://www.flickr.com/photos/vanessafox/1810711788/" title="Photo Sharing"><img src="http://farm3.static.flickr.com/2009/1810711788_663eae730b_o.jpg" width="256" height="79" alt="Hakia Meet Others" /></a></p>
<p>Click that to see who has posted about that query.</p>
<p><a href="http://www.flickr.com/photos/vanessafox/1809852187/" title="Photo Sharing"><img src="http://farm3.static.flickr.com/2330/1809852187_fc78566b14.jpg" width="500" height="391" alt="Hakia Meet Others" /></a></p>
<p>Then post a message of your own or choose a contact option for someone who&#8217;s already posted.</p>
<p><a href="http://www.flickr.com/photos/vanessafox/1810696240/" title="Photo Sharing"><img src="http://farm3.static.flickr.com/2227/1810696240_26d037027f.jpg" width="500" height="409" alt="Hakia Meet Others : Chatting" /></a></p>
<p>As noted above, there&#8217;s no set time limit for how long a message remains in the room. It varies depending on how many messages get posted and how they get rated.</p>
<p>Is this an innovation in search or a recipe for disaster? How many people will find this valuable and how many will find it just plain creepy? In today&#8217;s climate of extreme social networking, Hakia just might be onto something, but the proof will be in the adoption.</p>
<p>Getting searchers comfortable with the notion of chatting with others about their searches isn&#8217;t Hakia&#8217;s only adoption obstacle. As I said in my <a href="http://www.vanessafoxnude.com/2007/08/07/for-all-your-song-related-zebra-and-chicken-needs-look-no-farther-than-hakia/">review of one of Hakia&#8217;s CDs</a> (they&#8217;re talented musicians in addition to computer scientists, who sing about zebras and finding your childhood on eBay for twenty-five cents), regardless of how revolutionary their technology may be, the big challenge will be getting people to change the way they search because their technology isn&#8217;t at its best for the 2.8-word queries that Google has taught the world to type in. With those types of queries, Hakia performs just like any other search engine. Their differentiation comes with natural language processing, best used for longer queries that are typed more like the way people talk or write.</p>
<p>And what of this differentiation? Hakia provides interesting results now, but the jury&#8217;s still out on just how different and valuable what they are working on really is. They aren&#8217;t launching the search experience that&#8217;s powered by the core technology they are working on until sometime next year. They say what&#8217;s currently launched leverages the technologies they&#8217;re developing, so you can get a sense of what the final product will be like.</p>
<p>They say they are working on an entirely new infrastructure (different from what traditional search engines employ) called QDEXing (query detection and extraction). They search the web for concepts, rather than words, when satisfying a search. They point out that while the traditional search engines bring back good results most of the time, it&#8217;s impossible to know if pages that weren&#8217;t returned (because they have too few links to them, for instance) would have been more relevant for the query. By understanding the concepts on web pages rather than relying on things like external links and anchor text, they feel they can have a better sense of which page across the entire web is most useful to a searcher.</p>
<p>Traditional search engines use inverted indexing to catalog the words on a page. At the simplest level, when someone does a search, the engine looks through the index and finds the pages listed for those query words. Hakia, instead, uses QDEXing to determine what questions each page can answer. When someone does a search, Hakia finds that question and then returns the pages that answer it.</p>
<p>They use a number of scoring factors (such as linguistic and referential methods) to determine page quality. For instance, they say they can detect if a page was written by a person or was autogenerated. They only store pages that they feel meet their quality bar.</p>
<p>So, how will they get searchers to try it out? They may start with vertical databases that don&#8217;t do as well with traditional search technology. And they&#8217;ve cataloged the 700,000 most popular search results into galleries, which are algorithmically generated and have human review. Groups of hand-edited results for popular queries? Haven&#8217;t I heard this story before somewhere? Hakia says they&#8217;re different from sites like <a href="http://searchengineland.com/070530-180000.php">Mahalo</a> and Wikipedia in that the results are algorithmically generated, so there&#8217;s less of an editorial component. They provide a well-balanced representation of query results &#8212; not just the most popular. And most importantly, the content is search results, not reference material. Everything about what Hakia does is about improving the search. What does a gallery look like? Well, I randomly chose a query &#8212; &#8220;Buffy the Vampire Slayer&#8221;. <a href="http://www.vanessafoxnude.com/category/buffy/">No reason</a>. Hakia returns a gallery page with things like headlines, television profile, the channel it&#8217;s on, pictures, and fan sites. Hakia algorithmically generates the categories based on overall semantic processing of pages about the topic.</p>
<p><a href="http://www.flickr.com/photos/vanessafox/1810103687/" title="Photo Sharing"><img src="http://farm3.static.flickr.com/2175/1810103687_42f3f9ca80_o.gif" width="525" height="490" alt="Buffy's Gallery at Hakia" /></a></p>
<p>Of course, there are other players on the natural language processing bandwagon, with <a href="http://searchengineland.com/070209-093707.php">Powerset</a> one of the most hyped of the bunch. Will Hakia&#8217;s approach of providing a unique user experience set them apart? Well, the Buffy page <em>is</em> kind of cool. Whether or not searchers can get comfortable with a new search experience and different way of querying remains to be seen.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/social-networking-through-search-hakia-helps-you-meet-others-12586/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mahalo Launches With Human-Crafted Search Results</title>
		<link>http://searchengineland.com/mahalo-launches-with-human-crafted-search-results-11341</link>
		<comments>http://searchengineland.com/mahalo-launches-with-human-crafted-search-results-11341#comments</comments>
		<pubDate>Wed, 30 May 2007 22:00:00 +0000</pubDate>
		<dc:creator>Danny Sullivan</dc:creator>
				<category><![CDATA[Search Engines: Hakia]]></category>
		<category><![CDATA[Search Engines: Mahalo]]></category>

		<guid isPermaLink="false">http://searchengineland.com/beta/mahalo-launches-with-human-crafted-search-results-11341.php</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>
<img src="http://searchengineland.com/mahalo-logo.gif" width="213" height="69" align="left" hspace="7" vspace="2" /></p>
<p><a href="http://mahalo.com/">Mahalo</a>, the expected people-powered search
engine backed by Jason Calacanis, has now gone live in an early &quot;Alpha&quot; test
release. In Mahalo, human editors have crafted the top search results for
popular queries.</p>
<p>For example, search for [paris hotels], and human editors at Mahalo have
assembled a page that lists actual hotels in Paris rather than hotel
aggregation/booking sites that
<a href="http://www.google.com/search?q=paris hotels">you see</a> at Google.</p>
<p>I ran a few queries when talking with Jason about the service last week. Often, the results
were impressive. In some other cases, the humans had gone into overkill, listing
so many related categories of information that I felt like I was using Yahoo
<a href="http://forums.searchenginewatch.com/showthread.php?t=597">back in</a>
1999.</p>
<p><span id="more-11341"></span></p>
<p><b>History: Humans &amp; Search</b></p>
<p>Human-crafted search results aren&#8217;t a new idea, of course. This is precisely
how Ask Jeeves used to work. In fact, it was this human element that was crucial
to the early success that Ask had back when it initially became popular as an
up-and-coming search engine in 1998, alongside Google. Ask&#8217;s editors would &#8212; as
Mahalo editors do now &#8212; look at the most popular searches and create search
results that editors hoped would answer the information need.</p>
<p>Ask&#8217;s problem was scaling. Having so many editors cost money. In contrast,
Google&#8217;s link-based automated approach provided good relevancy for both popular
and unusual (or <a href="http://searchengineland.com/061221-085419.php">
long-tail</a>) queries.</p>
<p>Over time, the machine has reigned supreme when it comes to the
<a href="http://searchengineland.com/lands/search-engines.php">major search
engines</a>. Yahoo&#8217;s human-powered directory has been
<a href="http://blog.searchenginewatch.com/blog/050308-101342">buried</a> in
various ways over the years, while Microsoft&#8217;s once-heavy reliance on human
editing of top results was
<a href="http://searchengineland.com/070308-102703.php">long-abandoned</a> in
the technological chase after Google.</p>
<p><b>Scaling Humans</b></p>
<p>Cost-wise,  Calacanis is optimistic he has things covered.</p>
<p>&quot;We have 40 people working on it in Santa Monica and will have 100 by the end
of the year. We think we can control the costs,&quot; he said. &quot;We can actually make
enough money to keep this going.&quot;</p>
<p>In particular, Jason cited his experience in keeping 300 bloggers
going through the <a href="http://www.weblogsinc.com/">Weblogs network</a> he
used to run as useful. </p>
<p>&quot;It&#8217;s not hard for me to keep them [so many editors]
focused on a goal.&quot; As for funding, if the Google AdSense units currently on the
site don&#8217;t cover costs, Calacanis says investors ranging from News Corp. to
AOL&#8217;s Ted Leonsis have given him enough money to run the company for at least
five years. </p>
<p>In terms of searches targeted, Jason said the focus remains firmly on the
most popular queries that are performed by many people, rather than trying to
have a human-crafted answer for everything.</p>
<p>&quot;The goal of the site is not to be a comprehensive search engine. We&#8217;re very
comprehensive for the most popular search terms. After that, we give the others
over to Google or Yahoo,&quot;  Calacanis said.</p>
<p><b>Alternative, But Not Replacement, To Google</b></p>
<p>Give over to Google or Yahoo? Two explanations here. First, Jason doesn&#8217;t
expect that Mahalo will be used instead of a major search engine like Google or
Yahoo. Instead, he hopes it will become a tool people selectively tap into when
they want answers for a common, popular subject. He expects those looking to do hard-core
research or hunt for unusual information to turn to a
traditional search engine.</p>
<p>&quot;I don&#8217;t think it&#8217;s possible for people to abandon them,&quot; he said, giving the
example of someone looking for information about a specific cell phone battery
as being too specific for Mahalo to cover.</p>
<p>&quot;It&#8217;s not the goal to wipe out Google or be the
first choice. Google is the new ocean. You are much better to work with them. We
don&#8217;t see them as competitive. We&#8217;ve got 4,000 search terms, we&#8217;re not going to
replace them, we know that. But if you compare any of these terms, it&#8217;s better
to start with us,&quot; he said, his &quot;Google as ocean&quot; comment echoing Topix&#8217;s Rich
Skrenta&#8217;s &quot;Google is the environment&quot;
<a href="http://www.skrenta.com/2007/01/winnertakeall_google_and_the_t.html">
observation</a> from earlier this year.</p>
<p dir="ltr">Of course, some people will search for things at Mahalo without
knowing it has no answers of its own. In that case, Mahalo will provide Google-powered
results, so as not to disappoint. Google&#8217;s not a formal partner in doing this, by the way. Instead, Mahalo is simply using the
<a href="https://www.google.com/adsense/static/en_US/WsOverview.html">AdSense
For Search</a> service that any publisher can tap into. And no &#8212; Mahalo doesn&#8217;t
plan its own ad system, Calacanis said.</p>
<p><b>Gaining Acceptance</b></p>
<p dir="ltr">While the goal might not be to replace Google, Calacanis clearly
wants his service to get used. How does he plan to grow share? There&#8217;s no
spending millions on ads similar to what Microsoft and Ask.com
<a href="http://searchengineland.com/070515-084119.php">have done</a>. Instead,
he&#8217;s looking to follow the Google model, word-of-mouth.</p>
<p dir="ltr">Plenty of search start-ups have assumed word-of-mouth about their
hot new idea or twist on search would be enough to make their companies thrive,
then been disappointed as they disappeared into obscurity. Why does Calacanis
think Mahalo will be different from these others?</p>
<p>&quot;I
don&#8217;t think they delivered a good product. Compare any of our [human-crafted] search results to
Google search results head-to-head. We will be five-to-ten times better,&quot; he
explained.</p>
<p>What if Mahalo somehow beat all the odds and seriously threatened Google?
Doesn&#8217;t that potentially weaken Mahalo, which is depending on Google to do the
hard work of crawling the web and providing relevant results for all those tail
terms that Mahalo won&#8217;t target?</p>
<p>Calacanis sees this as unlikely &#8212; but that&#8217;s also where Search Wikia &#8212;
a project backed by Wikipedia&#8217;s Jimmy Wales &#8212; might come in. Wales is focused
more on building an open-source crawl of the web that anyone could use (see <a href="http://searchengineland.com/061229-193718.php">Q&amp;A With Jimmy
Wales On Search Wikia</a> for more on this). For that reason, Calacanis doesn&#8217;t
necessarily see himself as &quot;beating&quot; Wales to the punch with a new human-powered
service and in fact sees the two projects as perhaps complementary.</p>
<p>&quot;If his open source results are good and unique and better than Google&#8217;s,
we&#8217;ll use them. We&#8217;re defaulting to Google for long tail [query results] because they are the
best search out there. I hope that he comes up with something great, because if
it is open source, we&#8217;d have a great solution,&quot; Calacanis said.</p>
<p><b>Crafting Queries; Avoiding The Destination Trap</b></p>
<p>Mahalo isn&#8217;t just relying on human editors. There&#8217;s the ability for users of
the site to submit content that should be included on a page, if the editors
have overlooked it. But unlike human-powered Wikipedia, these suggestions don&#8217;t
go live automatically. An editor has to agree to a change and implement it.</p>
<p>&quot;We basically spend four to eight hours on a search term, and that gets
us to what I&#8217;d consider 60 to 70 percent complete [information for that term].
We&#8217;ll rely on the audience to do the rest, though we won&#8217;t let them edit the
page,&quot; Calacanis said.</p>
<p>Pages will link not just to web sites but also to video content, news stories
(right now from Google News) and other information. It&#8217;s impressive, especially
some of the categorical groupings the humans do, though
<a href="http://hakia.com/">Hakia</a> does some pretty similar and impressive
work using technology rather than humans (try a search for
<a href="http://hakia.com/search.aspx?q=iphone">iphone</a>, for example &#8212; and I
hope to finally make time to do my planned write-up on Hakia later in June).</p>
<p>Some Mahalo pages, such as for iPhone, will have &quot;Fast Facts&quot; sections to
provide direct answers &#8212; though Calacanis said this type of information will
purposely be kept to a minimum, so as to avoid the problem that About.com fell
into.</p>
<p>In particular, About.com originally launched as the Mining Co. back in 1997,
billed as a human-powered alternative to finding stuff on the web &#8212; human
guides would &quot;mine&quot; the best information out there, hence the site&#8217;s original
name.</p>
<p>I never viewed the Mining Co. as a search engine but rather a destination
site. Calacanis had the same view, when I asked why he thought Mahalo would be a
successful search engine when the Mining Co. &#8212; About.com &#8212; failed to find
success in that particular role.</p>
<p>&quot;They strayed from being a guide to the web to being a
landing page,&quot;  Calacanis explained. &quot;We are not going to do that&#8230;.one of the rules we have is
&#8216;Don&#8217;t
compete with the destination&#8217;,&quot; he said. Internally, Mahalo heavily debated
having Fast Facts but decided including three or four would be helpful.</p>
<p>Overall, Mahalo aims to cover 25,000 top search terms. About 4,000 have been
created already, and the goal is to do 500 per week scaling up to 1,000 per
week. This includes revisiting and updating existing terms.</p>
<p><b>Humans Versus Machines</b></p>
<p>That revisiting is important. I recently
<a href="http://dailysearchcast.com/070522-151946.html">talked</a> with Tim
Mayer, vice president of product management at Yahoo, about the idea of Yahoo
doing more hand-crafting of results. It seems like a no-brainer idea, but
Mayer reminded me that what&#8217;s relevant for a query can often change over time.
Google&#8217;s Udi Manber, vice president of engineering, made similar remarks when I
spoke with him about human-crafted results while visiting Google
yesterday.</p>
<p>One example he pointed out was how Google&#8217;s human quality reviewers &#8212; people
that Google pays to provide a human double-check on the quality of its results,
so they can then better tune the search algorithm &#8212; started to downgrade
results for [cars] when information about the movie Cars started turning up. The
algorithm had picked up that the movie was important to that term before some of
the human reviewers were aware of it.</p>
<p>Overall, the best solution probably isn&#8217;t all human or all machine but some
combination of the two. </p>
<p>&quot;Humans Are Better&quot; was literally the motto of the Open Directory Project
when it launched back in 1998, but that human model hasn&#8217;t scaled well. That,
along with other human-powered failures, is enough to make Mahalo seem
interesting but ultimately not likely to succeed.</p>
<p>Then again, Wikipedia stands so far as a shining example of how humans can
indeed come together and produce a quality resource. It&#8217;s not perfect,
obviously, but it has lots of great information. As I
<a href="http://searchengineland.com/061229-193718.php">said</a> about Wales&#8217;
project at the end of December, so I think about Mahalo &#8212; it&#8217;s good to see
humans getting more of a role in search, and perhaps both projects will
ultimately find a way to better blend the best of both worlds, human and machine.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/mahalo-launches-with-human-crafted-search-results-11341/feed</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
	</channel>
</rss>

