NetBase Debuts “Semantic Search Showcase” With HealthBase

NetBase is an enterprise-facing software and search company that appears to have one of the most advanced search platforms on the market. During a briefing earlier this week, NetBase marketing and product VP Jens Tellefsen asserted that no other search provider in the consumer or enterprise segment was as advanced, which is an audacious claim.

Tellefsen went to considerable lengths to back up the assertion, however. He said, “The closest thing we’ve seen is what Powerset was trying to do.” But he added that Powerset (subsequently acquired by Microsoft) was essentially an elaborate proof of concept, while NetBase is a fully functioning search technology platform that is being used today by major publishers, enterprises and the US government.

To “come out,” in a manner of speaking, and demonstrate its capabilities to a broader public, NetBase has launched the vertical search site HealthBase, a kind of “technology showcase” for the company’s “content intelligence” platform and semantic search capabilities. I was told that if HealthBase gets a positive response, the company may move into the consumer search business. But that’s not the main point of the site at the moment. Indeed, there’s a very “enterprise-y” quality to the look and feel of HealthBase.


According to the press release that came out this morning:

healthBase is the first example of Content Intelligence that is open and available to the public. The showcase uses Content Intelligence technology to automatically find treatments for any health condition or disease; pros and cons of any treatment, medication and food, and more. Like all NetBase-powered applications, healthBase enables users to get summarized answers and insights automatically from millions of online sources.

Each question takes seconds to answer and is equivalent to someone manually reading thousands of documents. As no manual work is required to build the semantic index, healthBase can search on and find answers to tens of thousands of health conditions, diseases, treatments, medications, supplements, foods and even plants.

Tellefsen said that while companies such as Healthline appear to offer “semantic search,” their results are the product of “months and months of human effort, tagging documents, and so on.” By contrast, he explained, the HealthBase index and content were compiled and created “in a couple of days” without any human intervention. He said this approach can be “replicated across domains,” meaning other verticals.

NetBase does its own crawl, which, depending on the implementation, can include the Internet and/or specific private databases. In the case of HealthBase, the company has crawled a limited group of sites that includes PubMed, WebMD, the Mayo Clinic, Healthline, Yahoo Health and a number of others.

In explaining the back end, Tellefsen said that NetBase “reads and understands” sentences and the causal connections and relationships between words in those sentences. This enables content and search results to be organized in ways that make them more intelligible and accessible. It also makes it possible to discover information that might otherwise be deeply buried within search results or within the documents those results point to.

Here’s an example results page for “hypertension”:

[Screenshot: HealthBase results page for “hypertension”]

One might look at this page and say “that’s just clustering.” And other companies have made similar claims about parsing and “understanding” content. But validation seems to come from NetBase customers. The company’s platform and technology have been in the market for several years (in various forms since 2004) and are being used today by P&G, the US Army, Reed Elsevier and others. To independently test NetBase’s claims, you’d have to systematically run lots of searches across a number of top health sites and compare results. However, I was impressed with the material I saw and the demonstration I received.

Here’s a video that offers a similar demo and discussion of HealthBase:
[Embedded YouTube video]

About The Author: Greg Sterling is a Contributing Editor at Search Engine Land. He writes a personal blog, Screenwerk, about SoLoMo issues and connecting the dots between online and offline. He also posts at Internet2Go, which is focused on the mobile Internet. Follow him @gsterling.

  • gdprice

    Here are a few quick comments I posted on ResourceShelf today re: HealthBase.

    1) First, a bit of a stickler.
    Yes, it’s the first day for HealthBase (and things can change quickly), but it would be useful if HealthBase provided a complete list of the sources it’s crawling. They have a small list on the first page of each section (and that’s a good start), but a complete list would be even more helpful to researchers. We do give kudos to HealthBase for providing a “source list” to show where the results come from. However, the kudos only go so far. Why? If you search for “causes of H1N1 (swine flu),” clicking the source list takes you to the source but makes you rerun the entire search. Not very helpful.

    2) One of the sources not listed on the first page, but which we did find in results, is Wikipedia. We’ll keep the “is Wikipedia useful for health researchers” argument out of it for now. We did a search for “poor posture” in the “causes and conditions” tab. OK, no problem. We then selected “Joint.” The second result was from Wikipedia, dated April 2009. One of Wikipedia’s strengths (and maybe a weakness in some cases) is its currency. It would be useful to let users know whether this is the most current version of the material available. When we found other Wikipedia material, it carried other dates. Here’s an example. We searched for diabetes (using the “causes of conditions” tab). Under the “infection” tab we found a Wikipedia result from May 2009.

    3) Finally, when searching for journal material (like what you find in PubMed), it takes some clicking around to find the full citation to the abstract (if available) and the bibliographic information. HealthBase could and should make this easier. In fact, they could work with database vendors and document delivery services to provide full-text access to the article.

  • dmarsch

    On a limited set of queries, they do quite well. But if you move beyond their examples, it looks like things break down pretty quickly.

    For example, a commenter on TechCrunch pointed out that “Causes of AIDS” include:
    Strong Magnetic Field
    AUDIOVISUAL
    Akhmed Zakayev

    I am not a doctor, but I don’t think that’s factually correct.

    Some other fun ones… “Treatments for Autism” include “animal model” and “mouse model” and, disturbingly, “vaccines” and “thimerosal” are listed as prime causes of autism, even though all science points to the contrary.

    HealthBase looks adept at associating terms, but actually claiming to have a semantic understanding of the terms’ meanings is a stretch. I’m not sure they should be bragging about how it only took them a few days to throw this together… it may have paid off to put some more effort into it before debuting it.

  • John Rehling

    As a founding engineer at the company (no longer there since 2007), I can verify that there is a semantic understanding of statements that the engine parses. However, by ranking results that come from a single misparsed sentence inappropriately high, they produced the great majority of the bad results cited in the post-launch buzz.

    The problems, and their fixes, here:

    http://nlpconfidential.blogspot.com/2009/09/medicine-for-healthbase.html
