Apr 30, 2007 at 12:01am ET by Chris Sherman
Google has announced an initiative with state agencies in Arizona, California, Utah and Virginia to help expose government information to web search engines. Often, government information is stored in database systems that are difficult if not impossible for search engine crawlers to access and index.
Google is working with technologists from the state agencies to help surface this invisible, or deep web, content. The approach is simple yet elegant: it relies on the sitemaps protocol, which allows Google or any other search engine to discover and index government information.
Search engine crawlers rely on links to find content on the web. Much of this content is static, stored as pages on web servers. By contrast, databases display content dynamically, responding to user queries and commands. Since crawlers can’t type, it’s difficult for search engines to access content in a database.
However, most web pages displayed by a database have a unique URL. If this URL is saved as a link, search engine crawlers can effectively follow the link and see the same content a human user would—and index the content of the page.
This is where sitemaps come into play. By using the sitemaps protocol to list the URLs that queries to a database would produce, an agency in effect simulates those queries for the crawler, and the search engine can get around the barriers normally posed by dynamic content.
The sitemaps themselves are not indexed, so these collections of URL strings will not surface in search results. Instead, searchers will see search results based on full-text indexing of database content.
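As a rough illustration of the idea, here is a minimal Python sketch of how an agency might turn database record identifiers into a sitemap that a crawler can follow. The base URL, the `id` query parameter and the record numbers are hypothetical, chosen only for the example; the article does not describe how the participating states actually generate their sitemaps.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, record_ids, param="id"):
    """Return sitemap XML listing one URL per database record.

    base_url, record_ids and param are illustrative assumptions:
    each record is assumed to be reachable at base_url?param=<record id>.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for record_id in record_ids:
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        # The URL a human user would reach by querying the database.
        loc.text = f"{base_url}?{urlencode({param: record_id})}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical licensee lookup page and record numbers.
    print(build_sitemap("https://agency.example.gov/licensee", [101, 102, 103]))
```

Each `<loc>` entry stands in for a query a person would otherwise have to type, so the crawler can fetch the resulting page like any other link.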
The approach is elegant because “Google’s not doing anything other than our typical approach to crawling,” said J.L. Needham, manager of public sector content partnerships.
Needham said that the amount of government information accessible now is relatively limited, but that Google plans to continue working with government agencies, eventually surfacing millions of pages of previously hidden content. Among the types of information searchers can find currently are job postings provided by Utah’s Department of Workforce Services, colonial history resources provided by the Library of Virginia, info on education and health services in California, and profiles of real estate professionals from the Arizona Department of Real Estate’s database of licensed agents.
Needham emphasized that these efforts were focused strictly on publicly available information, not private or personal records maintained by state governments.
Google is also helping these state governments beef up their site search tools, using the free Google Custom Search service to create customized search tools for users.
Needham said Google welcomes the opportunity to work with other government agencies to make their information repositories more accessible. “We would be happy if government spent a bit more time on SEO—the focus is almost entirely on the web site and the search tool on the web site,” he said.
For information on how a government agency can make it easier to search for hard-to-find public information, visit http://www.google.com/publicsector.