What are people looking for when they type “Google” into Google? What do they want to see when they use “eBay” as a query? How does a Google or Yahoo learn from their log files, and other user information? What does it tell them about user intent?
In an interview posted earlier today with Luke Wroblewski, Yahoo’s Principal Designer for Social Media, we’re told that the amount of user data that Yahoo has to work with while designing may be almost overwhelming. The search engines haven’t made that point often, yet it’s the second time today I’ve heard it. The first was during a presentation from Google senior research scientist Dan Russell.
Last Tuesday, Keri Morgret went to the December BayCHI, where Dan Russell spoke about How People Use Search Engines. She reports on some of her thoughts as well as some interesting observations about eye-tracking studies at Google. In her post, she includes a link to a video of an earlier version of the speech, How do Google searchers behave? Improving search by divining intent. The questions I started this post with are ones he asks in his presentation. He’s one of the guys who tries to find answers to questions like that.
Dan Russell joined Google from the IBM Almaden Research Center, and also worked for Xerox PARC and Apple’s Advanced Technology Group, as well as teaching at both Stanford and Santa Clara Universities. His homepage includes three sets of slides used during different versions of this presentation. The video weighs in at almost an hour-and-a-half, so if you want the shortened version, or want to read along, you may want to download one of those big PDF files.
The presentation is long, but it provides some nice glimpses into how Google works. I tried to find out how long Dan Russell had been working at Google when he gave the Stanford presentation linked to above, but it wasn’t easy to locate a date. He hadn’t been there so long that he skipped over the insights he had when he first started looking at searcher intent at Google. His thoughts:
1. Intuitions are terrible when trying to figure out what people are searching for.
2. In particular, your intuitions are terrible.
3. That’s why Google does studies.
4. Fallacy 1: “I do it this way” so others do, too.
5. Fallacy 2: “My Mom does it this way” so others do, too.
6. Deep truth: You are statistically insignificant.
7. Deeper truth: (As a computer scientist/student/audience member) you are a couple of sigma from the norm.
8. So are your friends.
Understanding User Behavior and Types of Queries
While Google focuses upon search as its core mission, we’re told that the effort is of little use if we “can’t understand what the question is.” What are users looking for? That’s true not only for organic search, but also for things like images, Google Earth, print, and video, which don’t have the benefit of PageRank and link structure to index. Most of these services are in beta, so they can afford to get things wrong for now. Applications can be changed very quickly, or as Dan Russell calls it, rapid prototyping on a delivery model. That works especially well if they can look at the usage data of those services and somehow understand it.
An example is Google Video, which at first provided the most popular results, with snippets. But it wasn’t getting the clickthroughs that they expected. So they quickly changed it to a richer display and the clickthroughs increased dramatically.
Another example was Google Maps. A couple of months after it started, the links were on the right side. After looking at log files, they decided to move them to the left, use a larger font, and add a tab for more details. They found that user behavior is influenced by small changes, and saw significant differences in clickthrough rates for things like minor changes in font size. We’re told that measuring millions of clickthroughs provides interesting results.
Another issue arises when making changes to something like Maps Local: those changes need to be echoed in places like Maps Local for Mobile. Another challenge that they face is training a culture to use a new interface; smaller changes are easier for people to get used to.
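To make the point about millions of clickthroughs concrete, here is a minimal sketch of how two versions of a page might be compared. It uses a standard two-proportion z-test, which is my own choice for illustration; the presentation doesn't say which statistics Google applies, and the function name and the numbers mentioned afterwards are hypothetical.

```python
import math

def ctr_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-proportion z-test on the clickthrough rates of designs A and B.

    Illustrative only: the test and all names here are assumptions,
    not something described in the presentation.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both designs perform alike.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value
```

With made-up numbers, a move from 3.0% to 3.1% clickthrough is hard to tell from noise across 1,000 views per design, but the same rates across a million views per design give a clear signal. That is one way to read the remark that small changes show up once you measure at scale.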
It’s not unusual to see queries broken down into three different types, as described by Andrei Broder in his paper, A taxonomy of web search. We get to see some percentages, and a greater breakdown of query types in this presentation describing what people are searching for:
Navigational – 15%
Transactional – 22%
- Obtain 8%
- Interact 6%
- Entertain 4%
- Download 4%
Informational – 63%
- List 3%
- Locate 24%
- Advice 2%
- Undirected 31%
- Directed 3%
What patterns might emerge when people search?
We’re told that it is usually a two step process:
1. Searchers find a good site, and
2. Look for information there.
Another strategy is teleporting, or going directly to somewhere else.
The reasons for teleporting:
- Users don’t realize they can search directly for the information
- Difficulties in formulating a query
- The user trusts the source that they are going to
Presumably, that two step process can be a good strategy if you know something about the resource, use its search engine if it has one, and understand the structure of that site. A video of a user session is shown at this point, illustrating someone exploring a site search, snippets, and the possibility of refining their query back at the search engine.
Information about Query Sessions
Instead of just looking at individual searches, considering user sessions is an important part of the analysis of searches.
How often do people reformulate queries during a user session, and what do they do when they reformulate them?
1. Spell correct helps lots of people, and shortens their sessions.
2. People often make a minor change to a word, or add a word, which may not provide the best results (people often get stuck in inefficient queries, and don’t change those much).
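As a rough illustration of the reformulation patterns listed above, consecutive queries in a session could be labeled along these lines. This is only a sketch: the categories, the 0.75 similarity threshold, and the function name are my assumptions, not anything Google is reported to use.

```python
from difflib import SequenceMatcher

def classify_reformulation(previous, current, similarity=0.75):
    """Label how `current` differs from `previous` (hypothetical categories)."""
    old, new = previous.lower().split(), current.lower().split()
    if old == new:
        return "repeat"
    # Every earlier word is kept and at least one new word appears.
    if len(new) > len(old) and all(word in new for word in old):
        return "added word"
    if len(new) == len(old):
        changed = [(a, b) for a, b in zip(old, new) if a != b]
        if len(changed) == 1:
            a, b = changed[0]
            # A near-identical word suggests a spelling fix (assumed threshold).
            if SequenceMatcher(None, a, b).ratio() >= similarity:
                return "spelling-like change"
            return "changed word"
    return "new query"
```

Counting these labels across many sessions would give a crude picture of how often searchers fix a spelling, tack on a word, or start over.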
Some other things that they see:
1. The more words used in a query, the longer sessions tend to last. Their assumption here is that more sophisticated queries involve people spending more time searching.
2. People use longer sessions on weekends.
3. The longer the sessions, the more often they see multi-tasking (multiple search subjects in a session) and interruptions.
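Observations like these depend on first carving a query log into sessions. The sketch below shows one way that might be done, assuming a log of (user, timestamp, query) records and a 30-minute inactivity cutoff. The record layout, the cutoff, and the function names are all my assumptions; the presentation doesn't describe how Google defines a session.

```python
from datetime import datetime, timedelta

# Assumption: 30 minutes without a query ends a session (not from the talk).
SESSION_GAP = timedelta(minutes=30)

def split_sessions(log, gap=SESSION_GAP):
    """Group (user_id, timestamp, query) records into per-user sessions."""
    sessions = []
    open_session = {}  # user_id -> the event list currently being extended
    for user, ts, query in sorted(log, key=lambda record: (record[0], record[1])):
        events = open_session.get(user)
        # Start a new session for a new user or after a long pause.
        if events is None or ts - events[-1][0] > gap:
            events = []
            open_session[user] = events
            sessions.append((user, events))
        events.append((ts, query))
    return sessions

def summarize(session):
    """Compute the per-session figures discussed above."""
    user, events = session
    first, last = events[0][0], events[-1][0]
    word_counts = [len(query.split()) for _, query in events]
    return {
        "user": user,
        "queries": len(events),
        "duration_seconds": (last - first).total_seconds(),
        "mean_words": sum(word_counts) / len(word_counts),
        "weekend": first.weekday() >= 5,  # Saturday is 5, Sunday is 6
    }
```

With summaries like these in hand, one could compare session length against words per query, or weekend sessions against weekday ones, rather than guessing at the relationship.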
Advanced searchers make up an extremely tiny fraction of the folks who search. Some of the characteristics of an advanced searcher are that they:
- Have lots of meta-knowledge about content and sites
- Take notes (on the machine, or paper, or bookmarks)
- Try alternative word sequences
- Use quotations correctly
While they may take advantage of these techniques, they don’t use them very often.
The Challenges of Analysis of User Data
I started this post mentioning the problem of having an almost overwhelming amount of user data. Dan Russell shifts gears at this point of the presentation, and starts talking about how to analyze and reduce that data to manageable levels, so that instead of relying upon intuition, they are making meaningful use of the information.
There are two parts to meeting that challenge. The first is building a scalable data analysis system, where a portion of the data can be looked at, and parallel systems can be used to analyze the rest of the records. That type of analysis is described in a Google paper – Interpreting the Data: Parallel Analysis with Sawzall.
The second part of the analysis involves usability. The importance of field studies and lab-based usability tests of prototypes is covered in detail.
For instance, a user would perform a realistic task with a prototype, while thinking aloud. Researchers would watch to see where users:
- Have problems.
- Fail to complete a task, or take too long.
- Make an important mistake and don’t realize it.
- Misunderstand an important part of the UI.
Efforts are made to avoid helping or influencing the user, and to focus upon their actions rather than their opinions. Eyetracking is often used in this type of testing.
Field studies, consisting of interviews at the places where people actually use their computers, are also conducted, along with diary and ethnographic studies.
We’re told that a large percentage of people who use search engines have very different mental images of how search engines work than people who work on search engines. An analogy used – someone opens the hood of your car, and points out a part, and asks you what it does. How likely are you to know?
The presentation describes a number of the issues they see when conducting field studies, and how they try to act upon them. The bigger issue here is how to take these types of studies and perform them in a manner which might be statistically significant. How useful might it be to get a greater sense of demographics involving different user segments and different cultures?
Some Questions and Conclusions
Some good questions that aren’t necessarily answered by this presentation:
Is one click good? Better than two clicks?
Is no click better than one click – such as when the answer is provided in the snippet?
What’s the best way to help searchers avoid distractions?
Should some diversity be mixed into results for breadth?
Why are SERPs so boring? Or are they?
How did a standard evolve across the major search engines in how search results pages look?
We’re told that Google’s focus in understanding how well they are doing in meeting searchers’ intentions has transformed from a static IR-styled analysis of query results towards longer-run, session analysis of how users interact with the search engine. This approach involves incorporating data from many kinds of studies, and using many different approaches instead of looking at a single point of data.
They don’t want to make decisions without lots of testing, and they don’t want to rely upon intuition or upon a world view centered around Silicon Valley and Stanford.
Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.