Three new search-related patents were granted yesterday by the US Patent and Trademark Office to Yahoo, Google, and IBM.
The Yahoo patent looks at how to measure the buzz around categories, topics, and terms based on their use at the Yahoo portal. Google’s filing focuses on matching anchor text to find related documents in different languages. IBM’s document discusses an unstructured information management system and a two-level searching technique.
The Yahoo patent had me wondering how Yahoo presently measures trends in the topics people search for on its search engine and portal, select in its directory, and reveal through their use of the many services it offers, and how the company might be analyzing and using that information.
Web site activity monitoring system with tracking by categories and terms Invented by Janet Yoo, Kian-Tat Lim, Stanley Ben Wong, and Elliott Yasnokvsky Assigned to Yahoo US Patent 7,146,416 Granted December 5, 2006 Filed September 1, 2000
A traffic monitor provides statistics of traffic using an activity input for receiving data related to activity on a server system. Events being monitored are binned by topic or term, where the terms are associated with categories. The categories can be a hierarchy of categories and subcategories, with terms being in one or more categories. The categorized events include page views and search requests and the results might be normalized over a field of events and a result output for outputting results of the normalizer as the statistical analyses of traffic.
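To make the abstract a little more concrete, here is a minimal Python sketch of binning events by term and category and then normalizing the counts. It is an illustration only, not Yahoo's implementation: the `TERM_CATEGORIES` table, the event tuples, the function names, and the choice to normalize by the total number of monitored events are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical term-to-category table. A term may belong to several
# categories, and "/" separates a category from its subcategory.
TERM_CATEGORIES = {
    "xbox": ["Games/Consoles"],
    "playstation": ["Games/Consoles"],
    "jaguar": ["Autos/Makes", "Animals/Cats"],
}

# The abstract names page views and search requests as categorized events.
MONITORED_KINDS = {"page_view", "search"}


def bin_events(events):
    """Bin monitored events by term and by category, crediting parent categories too."""
    term_counts = Counter()
    category_counts = Counter()
    total = 0
    for kind, term in events:
        if kind not in MONITORED_KINDS:
            continue  # ignore activity this sketch does not track
        total += 1
        term = term.lower()
        term_counts[term] += 1
        for path in TERM_CATEGORIES.get(term, []):
            parts = path.split("/")
            # Credit the category itself and each of its ancestors.
            for depth in range(1, len(parts) + 1):
                category_counts["/".join(parts[:depth])] += 1
    return term_counts, category_counts, total


def normalize(counts, total):
    """Express each count as a share of all monitored events (an assumed normalizer)."""
    if total == 0:
        return {}
    return {key: count / total for key, count in counts.items()}


events = [
    ("search", "Xbox"),
    ("page_view", "xbox"),
    ("search", "jaguar"),
    ("search", "playstation"),
    ("login", "xbox"),  # not a monitored kind, so it is skipped
]
term_counts, category_counts, total = bin_events(events)
term_share = normalize(term_counts, total)
category_share = normalize(category_counts, total)
```

Normalizing matters because raw counts rise and fall with overall traffic. A share of all events gives a rough sense of whether a term or category is drawing more attention relative to everything else.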
Google’s Cross Language Information Retrieval
While this patent was originally filed back in 2001, it’s hard to tell if the process it describes is actually in use. If it were, I would expect a few more French-language results on a query for “eiffel tower” (without the quotation marks) than I presently receive (I have no specific language set in my Google preferences). But one search is a pretty small sample size.
Systems and methods for using anchor text as parallel corpora for cross-language information retrieval Invented by Luis Gravano and Monika H. Henzinger Assigned to Google US Patent 7,146,358 Granted December 5, 2006 Filed August 28, 2001
A system performs cross-language query translations. The system receives a search query that includes terms in a first language and determines possible translations of the terms of the search query into a second language.
The system also locates documents for use as parallel corpora to aid in the translation by:
(1) locating documents in the first language that contain references that match the terms of the search query and identify documents in the second language;
(2) locating documents in the first language that contain references that match the terms of the query and refer to other documents in the first language and identify documents in the second language that contain references to the other documents; or
(3) locating documents in the first language that match the terms of the query and identify documents in the second language that contain references to the documents in the first language.
The system may use the second language documents as parallel corpora to disambiguate among the possible translations of the terms of the search query and identify one of the possible translations as a likely translation of the search query into the second language.
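The first of the three options above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the method as claimed in the patent: the `links` tuples, the `candidates` dictionary, the word-overlap anchor matching, and the frequency-based tie-breaking are all hypothetical choices made for the example.

```python
from collections import Counter


def likely_translation(query_terms, candidates, links, target_lang):
    """Pick one translation per query term using linked second-language text.

    candidates: term -> list of possible translations (e.g. from a dictionary)
    links: (anchor_text, target_language, target_text) tuples describing links
           found in first-language documents
    """
    query = {term.lower() for term in query_terms}

    # Collect words from second-language documents whose incoming anchor
    # text shares at least one word with the query.
    corpus = Counter()
    for anchor_text, lang, target_text in links:
        anchor_words = set(anchor_text.lower().split())
        if lang == target_lang and query & anchor_words:
            corpus.update(target_text.lower().split())

    translation = []
    for term in query_terms:
        options = candidates.get(term.lower(), [])
        if not options:
            translation.append(term)  # no known translation, keep the term
            continue
        # Prefer the candidate seen most often in the collected text.
        translation.append(max(options, key=lambda word: corpus[word]))
    return translation


links = [
    ("bank loan", "it", "la banca offre un prestito personale"),
    ("bank account", "it", "aprire un conto in banca"),
    ("river bank walk", "it", "passeggiata lungo la riva del fiume"),
    ("bank holiday", "fr", "jour ferie"),  # wrong target language, ignored
]
candidates = {"bank": ["riva", "banca"], "loan": ["prestito"]}
result = likely_translation(["bank", "loan"], candidates, links, "it")
```

In this toy data the ambiguous English word "bank" resolves to the Italian "banca" rather than "riva", because the Italian pages reached through matching anchors mention "banca" more often.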
IBM’s Unstructured Information Management
This patent describes an Unstructured Information Management (UIM) system that can take data from different sources and in different formats (structured and unstructured information) and enable natural language search of that information. For more on IBM’s approach to UIMA, see the IBM Systems Journal issue from a couple of years ago that focused solely on that topic – Unstructured Information Management. I wrote an entry about another recent UIMA patent at SEO by the Sea in November.
Invented by Andrei Z. Broder, David Carmel, Michael Herscovici, Aya Soffer, and Jason Zien Assigned to IBM US Patent 7,146,361 Granted December 5, 2006 Filed May 30, 2003
Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. A search query includes a search operator containing of a plurality of search sub-expressions each having an associated weight value. The search engine returns a document or documents having a weight value sum that exceeds a threshold weight value sum. The search operator is implemented as a Boolean predicate that functions as a Weighted AND (WAND).
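The Weighted AND idea in the abstract can be illustrated with a short Python sketch. It covers only the predicate itself, not the two-level search or the rest of the architecture. The document sets, query weights, and function names are made up for the example, and treating "exceeds" as "meets or exceeds" (`>=`) is an assumption.

```python
def wand(doc_terms, weighted_terms, threshold):
    """Return True when the weights of the matched sub-expressions reach the threshold.

    doc_terms: set of terms present in a document
    weighted_terms: list of (term, weight) pairs standing in for sub-expressions
    """
    total = sum(weight for term, weight in weighted_terms if term in doc_terms)
    return total >= threshold  # assumption: "exceeds" read as meets-or-exceeds


def search(docs, weighted_terms, threshold):
    """Return the ids of documents that satisfy the WAND predicate, sorted by id."""
    return sorted(
        doc_id for doc_id, terms in docs.items()
        if wand(terms, weighted_terms, threshold)
    )


# Hypothetical toy index: document id -> set of terms.
docs = {
    "d1": {"weighted", "and", "search"},
    "d2": {"search"},
    "d3": {"weighted", "search", "engine"},
}
query = [("weighted", 2.0), ("search", 1.0), ("engine", 1.0)]

partial_match = search(docs, query, threshold=3.0)  # d1 scores 3.0, d3 scores 4.0
strict_match = search(docs, query, threshold=4.0)   # only d3 matches every term
loose_match = search(docs, query, threshold=1.0)    # any single match is enough
```

Moving the threshold changes how the operator behaves. Set to the sum of all weights it acts like a plain AND, set to the smallest weight it acts like an OR, and values in between require some weighted portion of the query to match.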
Disclaimer: Patents are filed to protect ideas and methods developed as part of the intellectual property of a company, and may be used to exclude others from using the same, or similar processes, but the granting of a patent or publication of a patent application doesn’t necessarily mean that the processes involved have been fully developed, or will be in the future. Yet, the documents can provide some insight into the ideas that an organization is working upon, and may act as a starting point for more research.
Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.