• http://www.seo-theory.com/ Michael Martinez

    Bill, your overview leaves it unclear to me whether you understand what they mean by a “Web object”. At the very least, you are not providing a definition for a “Web object”.

    In the context of these papers, Web objects are concepts or topics and they are independent of Web pages. The proposed methodologies are looking at organizing data by topic.

    That is, given a topic (a Web Object), the search engine needs to extract as much information as possible about that topic and organize it into a coherent presentation unit.

    This is very high-level stuff that essentially proposes aggregating all known sources of information about any given topic under a unified structure (an “object model”).

    You won’t be able to influence “rankings” through links. Attributions would be a better marker of value, but the model breaks down once you get outside the academic paper archive they tested this methodology against. Business, news, and hobbyist content is rarely organized for peer review (although, oddly enough, fan fiction sites do often incorporate peer review structures and methods into their presentations).

    I don’t see much applicability in this brand of information retrieval science to the World Wide Web. It’s very specialized but they may be able to build on the principles and develop methods for extracting peer review information from non-academic sources.

  • http://www.seo-theory.com/ Michael Martinez

    Wish we could edit our comments:

    “the model breaks down once you get outside the academic paper archive…” should have used the plural form “archives”, not “archive”.

  • http://www.seobythesea.com Bill Slawski

    Hi Michael,

    Sorry I didn’t include a definition. I thought it was obvious what an object is, both from the papers and from the description of extracting information from pages and integrating it into an object. Here’s a definition from the first linked paper:

    We define the concept of Web Objects as the principle data units about which Web information is to be collected, indexed, and ranked. Web objects are usually recognizable concepts, such as authors, papers, conferences, or journals that have relevance to the application domain. A Web object is generally represented by a set of attributes $A = \{a_1, a_2, \ldots, a_m\}$. The attribute set for a specific object type is predefined based on the requirements in the domain.

    They’ve been testing this model on products in addition to the academic papers.
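
As a rough illustration of the quoted definition (not the papers' actual implementation), a Web object can be pictured as a record with a predefined attribute set per object type, filled in from values extracted from several pages. The `WebObject` class, the `ATTRIBUTES` table, the attribute names, and the keep-the-first-value merge rule are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass, field

# Hypothetical attribute sets. The paper says only that the attribute set
# for each object type is predefined from the needs of the domain.
ATTRIBUTES = {
    "paper": ("title", "authors", "venue", "year"),
    "product": ("name", "brand", "price"),
}


@dataclass
class WebObject:
    object_type: str
    values: dict = field(default_factory=dict)

    def merge(self, extracted: dict) -> None:
        """Fold in values extracted from one page.

        Attributes outside the predefined set are ignored, and the first
        value seen for an attribute is kept (an assumption of this sketch).
        """
        allowed = ATTRIBUTES[self.object_type]
        for name, value in extracted.items():
            if name in allowed and name not in self.values:
                self.values[name] = value


# Two pages mention the same paper; their extracted values are integrated
# into a single object.
paper = WebObject("paper")
paper.merge({"title": "Example Paper", "year": 2005, "color": "n/a"})
paper.merge({"venue": "Example Conference", "year": 2006})
print(paper.values)
```

The point of the sketch is only that the unit being collected is the object, not the page: each page contributes some attribute values, and the object accumulates them.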

  • http://www.seo-theory.com/ Michael Martinez

    I think it’s an interesting paper, but I know many people look to you (not to put the pressure on you) to condense all the academic-speak down to something everyone else can shake and nod their heads over. That’s why I was being nit-picky.

  • http://www.seobythesea.com Bill Slawski

    Sounds fair, Michael. I did try to ease people into the concept by introducing the searches that it is being used in, and then by describing the process of finding information, extracting it, and integrating it together into a Web Object.

    You’re probably quite right that I should have rephrased what a Web Object is within a sentence or two, perhaps at the end of the post.