  • gdprice

    Here are a few quick comments I posted on ResourceShelf today re: HealthBase.

    1) First, a bit of a stickler.
    Yes, it’s the first day for HealthBase (and things can change quickly), but it would be useful if HealthBase provided a complete list of the sources it’s crawling. They have a small list on the first page of each section (and that’s a good start), but a complete list would be even more helpful to researchers. We do give kudos to HealthBase for providing a “source list” to show where the results come from. However, the kudos only go so far. Why? If you search for “causes of H1N1 (swine flu),” clicking the source list takes you to the source but makes you rerun the entire search. Not very helpful.

    2) One of the sources not listed on the first page, but which we did find in results, is Wikipedia. We’ll keep the “is Wikipedia useful for health researchers” argument out of it for now. We did a search for “poor posture” in the “causes and conditions” tab. OK, no problem. We then selected “Joint.” The second result was from Wikipedia, dated April 2009. One of Wikipedia’s strengths (and maybe a weakness in some cases) is its currency. It would be useful to let users know whether this is (is it?) the most current version of the material available. Other Wikipedia results we found carried other dates. Here’s an example. We searched (using the “causes of conditions” tab) for diabetes. Under the “infection” tab we found a Wikipedia result from May 2009.

    3) Finally, when searching for journal material (like what you find in PubMed), it takes some clicking around to find the full citation, the abstract (if available), and bibliographic information. HealthBase could and should make this easier. In fact, they could work with database vendors and document delivery services to provide full-text access to the article.

  • dmarsch

    On a limited set of queries, they do quite well. But if you move beyond their examples, it looks like things break down pretty quickly.

    For example, a commenter on TechCrunch pointed out that “Causes of AIDS” include:
    Strong Magnetic Field
    AUDIOVISUAL
    Akhmed Zakayev

    I am not a doctor, but I don’t think that’s factually correct.

    Some other fun ones… “Treatments for Autism” include “animal model” and “mouse model,” and, disturbingly, “vaccines” and “thimerosal” are listed as prime causes of autism, even though all science points to the contrary.

    HealthBase looks adept at associating terms, but actually claiming to have a semantic understanding of the terms’ meanings is a stretch. I’m not sure they should be bragging about how it only took them a few days to throw this together… it might have paid off to put some more effort into it before debuting it.

  • John Rehling

    As a founding engineer at the company (no longer there since 2007), I can verify that there is a semantic understanding of the statements that the engine parses. However, by ranking results that come from a single misparsed sentence inappropriately high, they produced the great majority of the bad results cited in the post-launch buzz.

    The problems, and their fixes, are described here:

    http://nlpconfidential.blogspot.com/2009/09/medicine-for-healthbase.html