Jan 4, 2008 at 11:14am ET by Bill Slawski
Robots reading cereal boxes in the supermarket? Googlebot at the art museum? Street signs and building addresses snatched from Street View images for local search, image search, and product search?
Three new patent applications published at the U.S. Patent and Trademark Office this week explore the intricacies of reading text in images taken from Google’s Street View project and some interesting steps beyond those. I described a number of the implications behind the patent filings in an SEO by the Sea post from last night: Google on Reading Text in Images from Street Views, Store Shelves, and Museum Interiors.
Let’s take a slightly different look.
One of the most fun blog posts of last year was a spoof titled Google Interiors – the day my house became searchable. The satire seems to have come a little closer to reality, with the publication of these three patent filings.
The patent applications involved are:
The most sensational aspects of the documents come towards the end where we are told that robots might be used to take pictures of products on store shelves and in museums. A snippet from the filings:
In addition to street scenes, indexing can be applied to other image sets. In one implementation, a store (e.g., a grocery store or hardware store) is indexed. Images of items within the store are captured, for example, using a small motorized vehicle or robot. The aisles of the store are traversed and images of products are captured in a similar manner as discussed above. Additionally, as discussed above, location information is associated with each image. Text is extracted from the product images. In particular, extracted text can be filtered using a product name database in order to focus character recognition results on product names.
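To make the last sentence of that snippet concrete, here is a minimal sketch of filtering recognized text against a product name list. The filings do not say how the filtering is done; the fuzzy matching below, the tiny PRODUCT_NAMES list, and the function name are all assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical product-name database; a real one would be far larger.
PRODUCT_NAMES = ["corn flakes", "oat rings", "tomato soup"]


def filter_to_product_names(ocr_strings, names=PRODUCT_NAMES, cutoff=0.8):
    """Keep only OCR strings that closely match a known product name.

    Fuzzy matching is an assumption here, used so that small recognition
    errors (one wrong character) still map to the intended name.
    """
    matches = []
    for raw in ocr_strings:
        candidate = raw.strip().lower()
        best = get_close_matches(candidate, names, n=1, cutoff=cutoff)
        if best:
            matches.append(best[0])
    return matches


if __name__ == "__main__":
    # Text as it might come off a shelf image, including a misread
    # character and a label fragment that is not a product name.
    print(filter_to_product_names(["Corn Flakcs", "NET WT 12 OZ", "Tomato Soup"]))
```

Strings that resemble nothing in the list, such as the net weight fragment, are simply dropped, which is one plausible reading of what "focus character recognition results on product names" could mean.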
There’s a science fiction element to this world of robots running amok in supermarkets, but there’s also a lot of science involved in the documents. The descriptions of how text might be taken from Street View images cover a number of techniques that account for problems with images, such as low contrast caused by shadows and shading. The use of consecutive images from the Street View cameras can also enhance the reading of text that might be blurry or partially hidden from view in one or more shots.
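One simple way to picture how consecutive shots could help is a vote across the readings of the same sign from several frames. This is a sketch of that idea only, not the method in the filings, which work at the image level; the function name and the min_votes threshold are assumptions.

```python
from collections import Counter


def vote_on_reading(frame_readings, min_votes=2):
    """Return the most common non-empty reading, or None if none recurs.

    frame_readings holds one OCR result per frame for the same sign;
    empty strings or None stand for frames where nothing was readable.
    """
    cleaned = [r.strip().upper() for r in frame_readings if r and r.strip()]
    if not cleaned:
        return None
    text, votes = Counter(cleaned).most_common(1)[0]
    # Require agreement between at least min_votes frames (an assumed rule).
    return text if votes >= min_votes else None


if __name__ == "__main__":
    # One frame misreads a character and two frames yield nothing usable.
    print(vote_on_reading(["MAIN ST", "MA1N ST", "", "Main St", None]))
```

A reading that shows up in only one frame is treated as unconfirmed, so a blurry or partly hidden shot cannot decide the result on its own.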
Here’s a screenshot from the patent filings, which shows a number of places where text might be extracted from one image:
Some of the image techniques described in this document were first hinted at in the patent applications behind Google’s Book project, which I wrote about in the summer of 2006 in Patent applications provide window into Google Book Search and Gmail. Those documents discuss the use of optical character recognition to both read the text within books and to understand differences in the structural elements of that text, so that, for instance, chapter headings in books or article titles in magazines might be seen and indexed differently than body text from those documents.
These text recognition and extraction techniques will work with digital still images and with video images. A number of the techniques described work best with video, where there might be multiple images of a view from slightly different angles. If the Street View filming apparatus also includes a laser distance measuring device, as described in the patent filings, the distance data may also help to eliminate false positives in recognizing text.
It’s been an old saw for years that Google couldn’t recognize text that was displayed in images while indexing pages on the Web. These patent filings hint that Google may be able to do much more with images than we can imagine.
Some of the things that this technology could be used for:
It’s difficult to tell if and when we might see Googlebot in grocery stores, but we probably should start wondering how well Google might be able to handle text within images on the Web these days.
Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.