Google Customized Search Engines to Harness The Wisdom of Experts?

Back in October, 2006, Google announced on the Official Google Blog that they were enabling people to create their own custom search engines.

If you asked yourself why they were doing this, and how it might provide benefits to individual site owners, searchers as a whole, and Google itself, there are some answers that came out yesterday at the US Patent Office…

Google has published a series of five new patent applications on “programmable search engines,” with Ramanathan V. Guha listed as the inventor (his name was also on the announcement linked to above on the Google Blog). From reading through the patent filings, I think it’s safe to assume that the “programmable search engines” described are Google’s custom search engines, though the applications may describe aspects that differ somewhat or have not been fully developed yet.

Ramanathan Guha is listed as the sole inventor on these documents, and he has an interesting history. He joined Google in May of 2005, and had been a principal scientist for Apple Computer and for Netscape, a co-founder and the CTO of Epinions, and one of the developers of the RDF Site Summary (RSS) 0.9 standard. He has a rich resume of other accomplishments as well.

These are the patent filings covering the programmable search engines published this week:

The easiest way to learn about the features of Google’s custom search engines is to create one or two, so I’m not going to go into depth describing what the patent filings say about those. The sections involving the background of the invention are pretty interesting, though. I’m going to summarize parts of those to see if they can provide us with some insight into why these were developed and offered by Google.

Search as an unchangeable black box

We’re told that work on information retrieval systems is mainly focused upon improving search result quality, which is typically measured in terms of precision (how many of the returned results are relevant) and recall (how many of the relevant results are returned). While there may be other quantifiable ways to measure performance, those are two of the main ones.
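As a rough illustration of those two measures, here is a minimal Python sketch. The function name, the example URLs, and the relevance judgments are all made up for illustration; none of them come from the patent filings.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: list of result identifiers the engine returned
    relevant:  set of identifiers judged relevant (assumed to be known)
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four results returned, three pages judged relevant.
retrieved = ["a.example/review", "b.example/shop", "c.example/forum", "d.example/ad"]
relevant = {"a.example/review", "c.example/forum", "e.example/manual"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.50 recall=0.67
```

Two of the four returned pages are relevant, so precision is one half; two of the three relevant pages were found, so recall is two thirds.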

Techniques used by Web search engines involve designs which encompass basic indexing algorithms and representation of documents, query analysis and modification, relevance ranking and results presentation, and many other methods. However they function, the processes search engines use are controlled internally, and can’t be changed by outside entities.

In other words, search engines operate as black boxes, receiving and processing queries using complex and preprogrammed algorithms and models which rank relevance to provide and order search results. Even if parts of the process are known, the search engine will only operate according to those algorithms and models.

Difficulties with User Intent

The relevance of search results depends upon a user’s search intent: why are they searching, and why do they need the information? Two different people using the same query may be looking for completely different answers.

Attempts to solve this problem are often based upon relatively weak indicators, such as static user preferences, or predefined ways of refining queries, often amounting to educated guesses of user interest based on the query terms. These approaches can fail because of the highly variable nature of intent and situational facts that query terms may not clearly indicate.

Context and Informational Needs

The patent filing presents an example of a search using the query “Canon Digital Rebel.”

Does a searcher using that query want to buy the camera? Do they already own it and want technical support? Are they comparing it with other cameras, or are they interested in learning how to use it?

Those situational facts, and a searcher’s information need, cannot be reliably determined either by analyzing query terms or by looking at previously stored preference data about the user.

The Failure of Inferring Intent by Tracking

Intent might also be inferred by tracking and analyzing prior user queries so that a model of a user’s interests might be created. Search queries from individual users might be collected, so that interests may be determined based on the frequency of keywords appearing in those queries, as well as by looking at which search results the user accesses. See, for instance, Retroactive Answering of Search Queries (pdf).

One potential problem is the assumption that queries can accurately reflect a user’s short-term or long-term interests.

Another potential problem is the assumption that there is a direct and identifiable relationship between a given information need, such as shopping for a digital camera, and the query terms being used to meet that need. We’re told that assumption is incorrect because the same query terms can be used by the same user (or by different users) with quite different information needs.
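To make the weakness of frequency-based tracking concrete, here is a small Python sketch of an interest model built from keyword counts. The query logs and the function name are hypothetical, and this is only one simple way such a model could be built, not the approach described in any of the filings.

```python
from collections import Counter


def interest_profile(queries):
    """Count how often each lowercased term appears across past queries."""
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    return counts


# Hypothetical query logs for two users with different information needs.
shopper = ["canon digital rebel", "canon digital rebel price"]
owner = ["canon digital rebel", "canon digital rebel error 99"]

shopper_profile = interest_profile(shopper)
owner_profile = interest_profile(owner)

# Compare the three most frequent terms in each profile.
shopper_top = {term for term, _ in shopper_profile.most_common(3)}
owner_top = {term for term, _ in owner_profile.most_common(3)}
print(shopper_top == owner_top)
```

Both profiles are dominated by the same three terms, even though one user wants to buy the camera and the other needs support. The terms that would separate them appear only once, which is the kind of weak signal the filing warns about.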

Turning to Specialized Web Sites

Because people can’t consistently rely on search engines to locate information to satisfy their informational needs, they often visit sites offering highly specialized information about particular topics, built by individuals, groups, or organizations with an expertise in those subjects.

These sites, known as vertical content sites, often include specially created content providing in-depth information on a topic, as well as organized collections of links to related sources of information.

So, a site about digital cameras may include:

  • Product reviews,
  • Guidance on how to purchase a digital camera,
  • Links to camera manufacturers’ sites,
  • Price comparison engines,
  • Other sources of expert opinion, and
  • Other helpful information.

People running these sites, subject domain experts, often have considerable knowledge about the value of other sites on the Web. Using their expertise, these content developers can also best structure their site’s content to address the variety of different information needs of users.

A Need to Share Search with Subject Matter Experts

Someone visits one of these vertical content sites, where they find a good amount of useful information related to their needs. They may then return to a general search engine to find more relevant information. But when they do, the expertise they found at the vertical content site is no longer available to them from the search engine.

It’s not unusual for vertical content sites to provide search fields letting people access a general search engine. But those just pass search queries back to the general search engine.

Can the expertise of the owner of the vertical content site become available to a search engine during a searcher’s query, to provide more meaningful search results? If the search engine were a custom one, with some aspects of it programmed by the vertical site owner, it might allow their expertise to be shared with the searcher, with other similar sites using custom search engines, and with the search engine.
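One way to picture a site owner “programming” part of a search is the Python sketch below, where the owner supplies a list of trusted hosts, each tagged with the information need it serves. The host names, the labels, and the `customize` function are assumptions made for illustration; they are not Google’s actual format or API.

```python
from urllib.parse import urlparse

# Hypothetical choices made by the owner of a digital camera site:
# trusted hosts, each tagged with the information need it serves.
SITE_LABELS = {
    "reviews.example": "review",
    "support.example": "support",
    "shop.example": "buy",
}


def customize(results, site_labels, wanted_label=None):
    """Keep only results from owner-approved hosts, in their original order.

    results:      list of URLs from a general engine (assumed already ranked)
    wanted_label: optional label picked by the searcher to narrow the results
    Returns a list of (url, label) pairs.
    """
    customized = []
    for url in results:
        label = site_labels.get(urlparse(url).hostname)
        if label is None:
            continue  # host was not selected by the site owner
        if wanted_label is not None and label != wanted_label:
            continue  # host serves a different information need
        customized.append((url, label))
    return customized


general_results = [
    "https://shop.example/rebel",
    "https://spam.example/rebel-deals",
    "https://reviews.example/canon-digital-rebel",
    "https://support.example/rebel/firmware",
]

print(customize(general_results, SITE_LABELS))
print(customize(general_results, SITE_LABELS, wanted_label="support"))
```

The first call drops the host the owner never vetted; the second also narrows the results to the searcher’s stated need, which the query terms alone could not reveal.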

Aggregated context information might also be collected from a number of these programmable search engines, and become available to searchers even when they are entering a search at the general search engine instead of at a vertical search site.
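A rough Python sketch of that kind of aggregation follows. It assumes each custom engine contributes a mapping of hosts to labels, and that a host is trusted for a label once a minimum number of independent engines agree. The data, the function names, and the `min_votes` threshold are all hypothetical and are not taken from the patent filings.

```python
from collections import Counter, defaultdict

# Hypothetical annotations from three independent custom search engines.
ENGINES = [
    {"reviews.example": "review", "shop.example": "buy"},
    {"reviews.example": "review", "support.example": "support"},
    {"reviews.example": "review", "shop.example": "buy", "blog.example": "review"},
]


def aggregate_labels(engines):
    """Map each label to a Counter of hosts, with one vote per engine."""
    votes = defaultdict(Counter)
    for site_labels in engines:
        for host, label in site_labels.items():
            votes[label][host] += 1
    return votes


def top_hosts(votes, label, min_votes=2):
    """Hosts that at least min_votes engines gave this label, most votes first."""
    return [host for host, n in votes[label].most_common() if n >= min_votes]


votes = aggregate_labels(ENGINES)
print(top_hosts(votes, "review"))
print(top_hosts(votes, "buy"))
```

Requiring agreement from more than one engine is a design choice made for this sketch: a host labeled by only a single engine, like `blog.example` here, does not influence the aggregated view.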

Other Aspects of Using Programmable Search Engines

In short, custom search engines at vertical sites allow people to search using content sources decided upon and possibly annotated by the site owners. Information collected from the source choices, from the labeling and annotation of those sources, and from the use of those custom searches may help inform results at other custom search engines involving related searches, as well as query suggestions offered by Google on search results pages from regular Web searches.

Two other important topics, advertising and spam or bias, are each discussed in their own patent applications.

Of course, Google would want to show advertisements with search results. Can the context (or user intent) taken from such searches be used to inform the content of advertisements shown to searchers, or associated with the content shown on one of these vertical search pages?

There is a potential that people will try to abuse a system like this. The patent application focusing primarily upon “spam related and biased content” describes filtering processes that may be used to avoid abuse.

Conclusion

If you haven’t tried out Google’s custom search engines, you’ll find that they are very easy to set up and to use. If you own a site that focuses upon a particular subject, and consider yourself an expert on that subject, your expertise in setting up a custom search engine may influence results on other custom search engines from Google, and suggestions on Google’s results pages in response to certain queries.

The only issue that I have with these patent applications is that they appear to assume that people setting up custom search engines on specific topics are experts on those subjects. Yet, if you visit a site on a topic, and find value and expertise upon the site, you may find value and expertise in a custom search set up on that site, too.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

About The Author: Bill Slawski is the Director of Search Marketing for Go Fish Digital and the editor of SEO by the Sea. He has been doing SEO and web promotion since the mid-90s, and was a legal and technical administrator in the highest level trial court in Delaware.

  • http://seo-theory.blogspot.com/ Michael Martinez

    Just one more thing to be spammed and manipulated, I suppose.

  • http://www.seobythesea.com Bill Slawski

    Maybe, Michael.

    But I think that people with expertise in a lot of areas could come up with some pretty useful custom search engines in areas that interest them, and interest visitors to their sites.

  • http://www.topranksearch.com David

    Coop is way cool and the way to go – Thanks to Google for sharing the love once more

    David

  • http://ouseful.info Tony Hirst

    “Yet, if you visit a site on a topic, and find value and expertise upon the site, you may find value and expertise in a custom search set up on that site, too.”

    I’ve started exploring this with the idea of naturally occurring ‘search hubs’, which are in their most basic form naturally occurring link collections that provide a base set of domains (or pages) that can be searched over.

    Think: naturally powered Rollyo search rolls…

    For example, the links from a blog post on a topic provide a tiny custom search set on that topic. The links to a page or domain provide a wider set of links potentially related to that topic (though this is spammable, of course).

    Rather more useful are searches over the pages/domains collected by a user under a particular delicious tag, for example.

    I started playing with the idea of a search hub powered search engine intermediary service at searchfeedr (reviewed here), which will be powered by Yahoo Pipes in its next incarnation, I think (for example, here are a couple of searchfeedr inspired search Pipes).

    tony

  • http://www.seobythesea.com Bill Slawski

    Nice ideas, Tony.

    I’d love to have a custom search that provided results for Search Engine Land, and the pages pointed to as headlines here in our Daily Search Cap posts. It would be a great way of putting the editorial expertise here to another good use.

    Then again, I’d also like to see people able to create their own customized searches as onebox results for Google’s personalized search feature – with the ability to add either single pages or whole domains to their personal customized search directly from search results.

  • http://ouseful.info Tony Hirst

    “I’d love to have a custom search that provided results for Search Engine Land, and the pages pointed to as headlines here in our Daily Search Cap posts. It would be a great way of putting the editorial expertise here to another good use.”

    So why isn’t that just a site search/search limited to site:searchengineland.com?

  • http://www.seobythesea.com Bill Slawski

    “So why isn’t that just a site search/search limited to site:searchengineland.com?”

    Because there are a large number of “headlines only” links in the Daily Searchcap to sites that aren’t on Search Engine Land, but rather exist on other domains. These are individual pages that have been pointed to for the content contained on those pages.
