Apr. 9, 2007 at 5:33am Eastern by Greg Sterling

Local Data: Not Sexy Just Critical

A few weeks ago Steven Aldrich, a VP of strategy for small business software (and now marketing) company Intuit, delivered a speech in which he presented an amazing statistic. According to Aldrich, roughly 6 million businesses are started annually in the U.S. but another approximately 5.6 million go under. Think about it.

From the point of view of these small businesses, there's a fundamental challenge to survive. From the point of view of search engines and directories trying to reflect and catalog their fleeting existence, there's another kind of challenge – a data challenge.

According to the U.S. Census Bureau there are 23,343,821 “firms” in this country. Of those, 5,697,759 firms have one or more employees. But almost 99% of U.S. businesses have fewer than 20 employees and most have fewer than four. These data are mirrored by similar statistics (where they exist) around the world.
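To put those figures in proportion, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above (Aldrich's rounded estimates and the Census counts); the variable names are just illustrative, and the results are no more precise than the inputs.

```python
# Figures as quoted in the column; the first two are Aldrich's rough estimates.
started_per_year = 6_000_000   # businesses started annually (approximate)
closed_per_year = 5_600_000    # businesses that go under annually (approximate)
total_firms = 23_343_821       # U.S. Census Bureau count of "firms"
employer_firms = 5_697_759     # firms with one or more employees

# Net change in the number of businesses each year.
net_new = started_per_year - closed_per_year

# How many businesses close for every one that opens.
closure_ratio = closed_per_year / started_per_year

# Share of all firms that have at least one employee.
employer_share = employer_firms / total_firms

print(f"Net new businesses per year: {net_new:,}")
print(f"Closures per business started: {closure_ratio:.1%}")
print(f"Firms with employees: {employer_share:.1%}")
```

On these numbers, the count of businesses grows by only about 400,000 a year while roughly 11.6 million listings either appear or disappear, and only about a quarter of all firms have any employees.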

Beyond this, the majority of small businesses conduct a majority of their buying and selling (B2B and B2C) within 50 miles (even 20 miles) of their physical location. What all this means is that most U.S. business is fundamentally local. And the essence of what we broadly call "local search" is about capturing data on where these businesses are and what they do.

This is not sexy. But it's the "bread and butter" of local.

No matter how fabulous the maps and the "Asynchronous JavaScript and XML" (Ajax) interfaces, if the basic local data aren't there or are flawed the application invariably disappoints. We've all had the experience of looking for a local business we know is there and not finding it; or, conversely, looking for local "cafes" and only finding Starbucks listings.

Starbucks is there because it's easier to get Starbucks' location information than it is to collect data on independent local coffee houses. Getting good and accurate "local local" data, partly because of Aldrich's equation above, is very hard.

Every top-tier U.S. local search competitor relies at its core on one or more of the big commercial databases. There are basically a handful of major providers in the U.S.:

I recently tried to do an assessment of the cost and quality of the first three and it's quite challenging. Within the industry there's a lot of criticism and, to borrow a sports phrase, "trash talk" about the relative quality and freshness of the data. And the databases tend to be very costly (although not across the board).

But if the databases are imperfect it's because collecting data on millions of businesses is extremely difficult. And most people – even some of those working in local search – often fail to appreciate the Herculean task of doing so.

Most of these commercial databases are built from telephone company records (or phone directories) and then supplemented in various ways. InfoUSA, for its part, has an Omaha, Nebraska call center where it does out-calling to verify the accuracy of the business listings information in its database. But even this can fail to correct all the potential errors.

These core databases form what might be called the foundational layer of local search. But they certainly don't complete the structure. The other layers include information gleaned from crawling the Internet and user-submitted content (from both businesses and consumers).

Crawling captures local information that can be missed in the telco databases. And some of that online local data is fresher and more accurate. But crawling can also yield inaccuracy (because it recapitulates mistakes published elsewhere). Thus getting the data directly from the source or the community is the ultimate prize.

Given that there are many local businesses (more than 50%) that still don't have websites – even though their data may in fact be somewhere online – you would have thought that the search engines and directories would have been very aggressive and accommodating in encouraging them to directly input their information. To its credit, Yahoo has for some time allowed business owners and more recently consumers to correct and update information. (See, for example, "edit this listing.") And Google not long ago enhanced its Local Business Center and greatly expanded the information that could be included.

Indeed, most search and directory sites now have places where local businesses can directly input information. (See Stacy Williams' five-part article on the subject.) But those screens are commonly buried and not easy to access.

The final and, in many ways, most promising layer of local data is from the community. Sites like Citysearch and Yelp, among others, are helping build out the local database with user-generated content that provides intangibles (opinions, recommendations) that are becoming an increasingly important part of the local search experience. The community, as suggested above, can also rectify inaccurate listings information.

In a bold experiment, a couple of years ago, UK-based entrepreneur Paul Youlten created Yellowikis, a global directory site to be populated entirely by the community, Wikipedia style. Yet this is a long-term project, especially when there are so many competing directory products in the market.

All these data sources are symbiotic rather than mutually exclusive. An empty container is unlikely to be filled entirely by a community (Yellowikis notwithstanding); there must be something there to react to and modify. Also, increasingly, a skeletal database of business listings and contact information is not going to satisfy users, who are typically also looking for recommendations and other tools to help them make buying decisions.

And there are other non-traditional data sources, such as Urban Mapping, that help complete the application.
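For readers who think in code, the layering described above can be sketched roughly as follows. This is only an illustration, not how any search engine or directory actually does it: the `Listing` type, the source names, the matching rule and the precedence order (owner-submitted data over crawled data over the commercial database) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical precedence: a higher number wins when sources disagree.
SOURCE_PRIORITY = {"commercial_db": 0, "crawl": 1, "owner_submitted": 2}


@dataclass
class Listing:
    name: str
    phone: Optional[str]
    address: Optional[str]
    source: str


def normalize_key(name: str) -> str:
    """Crude matching key: lowercase letters and digits only (an assumption)."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def merge_layers(listings):
    """Merge records for the same business, letting higher-priority sources
    overwrite fields while keeping anything they leave blank."""
    merged = {}
    # Process the lowest-priority layer first so later layers can override it.
    for item in sorted(listings, key=lambda entry: SOURCE_PRIORITY[entry.source]):
        key = normalize_key(item.name)
        current = merged.get(key)
        if current is None:
            merged[key] = Listing(item.name, item.phone, item.address, item.source)
            continue
        current.phone = item.phone or current.phone
        current.address = item.address or current.address
        current.source = item.source  # last layer that contributed


    return merged


# Made-up sample records for one business seen in three layers.
listings = [
    Listing("Joe's Cafe", "555-0100", None, "commercial_db"),
    Listing("Joes Cafe", None, "12 Main St", "crawl"),
    Listing("Joe's Cafe", "555-0199", None, "owner_submitted"),
]
result = merge_layers(listings)
record = result["joescafe"]
print(record.phone, record.address)
```

In this toy case the owner's phone number replaces the one from the commercial database, while the address found by the crawler survives because no higher layer supplied one. Real matching is far harder than stripping punctuation from a name, which is exactly the messiness the column is about.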

These layers and the challenge of capturing and updating information illustrate one of the least visible (or most visible) but critically important aspects of local search: the data. It's messy, often ugly and typically hard to get. Yet, as I've argued, it's the heart and soul of local and one of the things that makes it a good deal more complicated than search in general.

Greg Sterling is the founding principal of Sterling Market Intelligence and publishes Screenwerk, a blog focusing on the relationship between the Internet and traditional media, with an emphasis on the local search marketplace. The Locals Only column appears on Mondays at Search Engine Land.




Reader Comments

This is great Greg.

I'm helping overcome the issue in Australia of placing business listings on online business directories by creating a definitive list with quick links to critical pages.

http://michaelvisser.com.au/tools/australian-online-business-directories/

We've run a national directory for a small country for many years and this factor is one of the major stumbling blocks preventing us from going international. On the local level it is easier to be accurate and inclusive. You can pretty much just walk all the streets and knock all the doors. Going "big" quickly knocks the bottom out of that particular barrel.

Sven N., Malta
http://planetsoftpages.com/

Greg -- That's a great article and it highlights a pretty big problem. Allmenus.com is another site that is tackling the local data issue -- we already have over 40,000 local restaurants online throughout the country, and we're adding more every day. Soon, we'll be adding ratings and reviews, too. However, it's a huge problem for us to get all the local information that we need -- we end up employing an army of people to gather everything.
