Improved Information Retrieval – Looking at Context with Susan Dumais


Desktop and file search can be very different from web search, and the user’s context plays an important role in deciding what is valuable when creating a search algorithm. But understanding context may be helpful to web search, too.

Microsoft’s Susan Dumais has done an extensive amount of research on how users interact with search applications for the desktop and for Microsoft’s Vista. She recently visited Yahoo as part of their Big Thinker Series. The presentation was at their Yahoo! Mission College location on Tuesday, December 12, 2006 (via Gary Price). A Microsoft patent application from this morning expands upon the presentation.

A video of that presentation is now available through Yahoo Video. It discusses ideas about improving search based upon user context, and covers rich metadata, tagging, memory landmarks, refinding things, and keeping found things found.

As Susan Dumais notes in the presentation, information retrieval isn’t done for its own sake. It needs to be thought about in the context of the individuals and groups that it is created for. While we can think about queries in terms of informational, transactional, and navigational uses, people are also researching, learning, and being entertained when searching.

A lot of the presentation focuses upon research that has been documented in a few papers and articles.

We’re told that the research on refinding information has influenced the Microsoft Live interface, but that it also looks at different information silos, which require some different ways of thinking about search, such as the web, email, files, applications, photos, contacts, and calendaring.

The future of search is going to involve more than just the web. It will also cover intranets and a searcher’s own computer. Because that involves a searcher’s own content, Microsoft’s researchers believe they can provide a richer user experience, including things like end user tagging, while still providing a single unified point of access for finding information within the context of performing other tasks.

As part of the research that Microsoft did while looking at search in different contexts, they found some interesting information about desktop search:

  • Queries tend to be very short – shorter than on the web.
  • Query syntax allows for a more advanced search interface.
  • Three most popular advanced operators:
    • filtering
    • resorting
    • new query
  • People opened email often, in an enterprise environment
  • Different search characteristics were exhibited for home workers
  • About half the things opened were things that people received in the last month.
  • Different kinds of content had different half-lives. For websites, half of the things opened were things looked at in the last couple of weeks.
  • Date is by far the most common sort order – time is really important in retrieving your own information.
  • Very few “best match” searches – people already know what they are looking for.
  • Metadata is very useful, but the quality is variable. Some applications enforce better metadata collection than others, such as email.
  • Useful data is dependent upon applications – for instance, in calendars, the most important date isn’t when you received a notification, but rather the date of the meeting.
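The last few findings (date as the dominant sort order, and the "right" date depending on the application) can be illustrated with a small Python sketch. This is only an illustration of the idea and not how Microsoft's desktop search is implemented. The `Item` class, its field names, and the `relevant_date` and `search` functions are all hypothetical names made up for this example, and the matching is a plain substring test rather than a real index.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Item:
    """A hypothetical personal-content item (not a real desktop search API)."""
    title: str
    kind: str                               # e.g. "email", "file", "calendar"
    received: datetime                      # when the item arrived or was saved
    event_date: Optional[datetime] = None   # only meaningful for calendar items


def relevant_date(item: Item) -> datetime:
    """Pick the date that matters for this kind of item.

    Assumption for this sketch: calendar entries are ordered by the date of
    the meeting, everything else by when it was received.
    """
    if item.kind == "calendar" and item.event_date is not None:
        return item.event_date
    return item.received


def search(items: List[Item], query: str) -> List[Item]:
    """Substring match on the title, then sort newest first by relevant date."""
    q = query.lower()
    hits = [item for item in items if q in item.title.lower()]
    return sorted(hits, key=relevant_date, reverse=True)


items = [
    Item("Budget review notes", "file", datetime(2006, 11, 1)),
    Item("Budget review meeting", "calendar",
         datetime(2006, 10, 1), event_date=datetime(2006, 12, 15)),
    Item("Re: budget review", "email", datetime(2006, 12, 1)),
    Item("Holiday photos", "file", datetime(2006, 12, 20)),
]

for item in search(items, "budget"):
    print(relevant_date(item).date(), item.kind, item.title)
```

In this sketch the meeting notification arrived first, yet the calendar entry is listed at the top because it is ordered by the meeting date rather than the date it was received.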

There’s more in the presentation about personalized search, memory landmarks and timelines, and the benefits of user tagging. It also includes a very brief comparison of the different desktop search methods from Google, Yahoo, and Microsoft.

Coincidentally, I noticed a new patent application published this morning from Microsoft, with Susan Dumais listed as one of the inventors, that covers a fair amount of the information discussed in the presentation.

Analysis of topic dynamics of web search
Invented by Susan T. Dumais, Eric J. Horvitz, Xuehua Shen
Assigned to Microsoft
US Patent Application 20070005646
Published January 4, 2007
Filed: June 30, 2005

Here’s a snippet from the description of the document that starts to discuss some of what it includes:

[0001] The Web provides opportunities for gathering and analyzing large data sets that reflect users’ interactions with web-based services. Analysis and synthesis of the rich data provided by these logs promises to lead to insights about user goals, the development of techniques that provide higher-quality search results based on enhanced content selection and ranking algorithms, and new forms of search personalization. The ability to model and predict users search and browsing behaviors has been explored by developers in several areas. The analysis of URL access patterns has been used to improve Web cache performance and to guide pre-fetching. In general, models developed for caching and pre-fetching average over large numbers of users, and exploit the consistency in access patterns for individual URLs or sites, but do not consider topical consistency. Another line of investigation has explored the paths that users take in browsing and searching web sites. This includes clustering techniques to group users with similar access patterns, with the goal of identifying common user needs. This technology involves detailed analysis of individual web sites. There has been some recent work exploring how page importance computations can be specialized to different users and topics.

If you want to dive into the patent filing, I’d recommend watching the presentation first. Because I had viewed the presentation before tackling the filing, I found myself anticipating things that might be included within it instead of struggling to understand what it was attempting to get at.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.



Bill Slawski




