Yahoo Paper: Finding The Local “Center” Of Search Queries
A new research paper from Yahoo and Cornell University, with search legend
Jon Kleinberg as one of the coauthors, provides a fascinating look at how a
search query such as "red sox" or "hurricane dean" can be centered around a
physical location, including one that changes over time.
The paper, Spatial Variation in Search Engine Queries, made use of Yahoo query
logs to see if queries could be traced back to particular areas. Each person
doing a query has an
internet IP address. Those IP addresses (with some filtering done to deal with
people using the same IPs) were mapped, so that each query was linked to a point
on Earth (or specifically to North America, the region covered in this study).
The image above shows an example of this. Queries for [red sox] happen across
the US but occur with the most frequency (shown in red) around Boston, home to
the Red Sox.
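To make the idea of a query "center" concrete, here is a minimal sketch that averages the mapped locations of a query's searchers on a sphere. It is only an illustration: the function name and the sample coordinates are hypothetical, and this simple averaging is not necessarily the method the paper uses.

```python
import math

def geographic_center(points):
    """Return (lat, lon) in degrees for the mean direction of the given points.

    points: iterable of (latitude, longitude) pairs in degrees.
    This is a plain spherical average, assumed here for illustration only.
    """
    x = y = z = 0.0
    count = 0
    for lat_deg, lon_deg in points:
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        # Convert each location to a unit vector and accumulate.
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
        count += 1
    if count == 0:
        raise ValueError("need at least one point")
    # Only the direction of the summed vector matters, so no normalization
    # is needed before converting back to latitude and longitude.
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)

# Hypothetical searcher locations, most of them clustered near Boston.
sample_points = [(42.36, -71.06), (42.40, -71.10), (42.30, -71.00), (40.71, -74.01)]
center = geographic_center(sample_points)
```

Note that a center computed this way is an average, so it need not coincide with the location of any individual searcher.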
Similarly, other sports team queries center around the various cities that host
them.
One of the most interesting parts of the paper was how the "center" of a query
can move. Consider this illustration of searches for [hurricane dean]:
The chart shows how the center of the queries moved almost in line with where
the actual storm headed. OK, so how can the center of these queries be in water?
Who’s searching in the middle of the ocean? My assumption (the paper isn’t
clear here) is that you have people along the various coasts that were searching
— and so the center of all these searches sometimes mapped to being between the
Another type of localized query that can be mapped is the "distinctive query,"
one that occurs at high frequency in, or is fairly unique to, a certain area.
The map
below shows some of these, such as [gilroy dispatch] happening around the Gilroy
All this mapping of queries is fun and interesting, but can it improve
search? Usually, the challenge has been to know what web pages match a
particular area, not which queries.
The paper doesn’t provide any concrete suggestions in its conclusion. But
there are a number of ways I can see it being helpful. IP detection isn’t
perfect — but if you can tell that only certain queries tend to come from
certain areas, then that might help search engines better target local
information to someone with an IP address that can’t be depended upon for
Knowing the "centers" of queries might also help search engines better
understand what "centers" should be used when mapping results. A local query
using a city name often ranks results by how close they are to the geographic
center of the city. But if query mapping shows a different center, perhaps that
could be used.
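As a rough sketch of that last idea, the snippet below ranks local listings by great-circle distance to whichever center is supplied. Everything here is assumed for illustration: the listings, both sets of coordinates, and the function names are hypothetical, and no search engine is known to rank this way.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to guard against tiny floating-point overshoot above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))

def rank_by_distance(listings, center):
    """Sort (name, (lat, lon)) listings from nearest to farthest from center."""
    return sorted(listings, key=lambda item: haversine_km(item[1], center))

# Hypothetical listings and two candidate centers for the same city query.
listings = [("Listing A", (37.78, -122.42)), ("Listing B", (37.70, -122.40))]
city_center = (37.77, -122.42)    # assumed geographic center of the city
query_center = (37.69, -122.41)   # assumed center derived from query logs

by_city = [name for name, _ in rank_by_distance(listings, city_center)]
by_query = [name for name, _ in rank_by_distance(listings, query_center)]
```

With these made-up numbers, the two centers produce opposite orderings, which is the kind of difference a query-derived center could introduce.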