MIT’s Technology Review published an interview with Google Director of Research Peter Norvig that explores his (and presumably Google’s) thinking about problems in search and the “next-generation” search functionality Google is working on. There’s nothing strikingly new in the interview, but it’s an interesting overview and a window into some of the company’s current projects.
Among them, Norvig emphasizes speech recognition and processing, both in mobile (e.g., Goog411) and hypothetically on the desktop. He also discusses getting users to provide more information (“natural language”) or interact more with search to help disambiguate queries, enabling Google to deliver more tailored results. Norvig also discusses trying to develop a better understanding of the contents of documents (including video).
Here are some interesting excerpts from Norvig’s responses:
Re projects with the most funding: The two biggest projects are machine translation and the speech project. Translation and speech went all the way from one or two people working on them to, now, live systems . . . We wanted speech technology that could serve as an interface for phones and also index audio text. After looking at the existing technology, we decided to build our own. We thought that, having the data and computational resources that we do, we could help advance the field. Currently, we are up to state-of-the-art with what we built on our own, and we have the computational infrastructure to improve further. As we get more data from more interaction with users and from uploaded videos, our systems will improve because the data trains the algorithms over time.
Re the problems in search: One is understanding users’ needs more. The other is understanding the contents of documents, whether they be Web pages or video.
Re more user input/interaction with search: One of the things we’re looking at is finding ways to get the user more involved, to have them tell us more of what they want. People type the query “map,” and then they get upset if it’s not the map they were thinking of. So, people may be willing to talk more than type. Or maybe they’re willing to take a suggestion if we offer something that they didn’t type a query for, but is related.
Re mobile search: [T]here are search interactions other than main Web search. When you’re on cell phones, you can only see one link at a time. It really changes the game. There’s much more impetus for us to be correct, so we’re thinking about that kind of interaction there, and how you could use audio to present information.
Re natural language search: I think there’s a whole range of what you can mean as natural-language search. The first part of that range, we’ve been doing for a while. For instance, we understand synonyms and that the two words in San Francisco should go together. But then there’s Las Vegas and Vegas, which mean the same thing, and New York and York don’t mean the same thing. Those are the kinds of things we figure out. Another component of natural-language search is to parse a longer query into components. And the farthest along is typing in a full sentence in English and getting a full sentence as an answer. That sort of thing we’re not doing yet. We are answering some kinds of questions. You can query “population of Japan,” and we’ll pull that out. But for the majority of questions, that’s not what people want. They don’t want the burden of having to express it as a full sentence.
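The early-stage techniques Norvig describes — grouping multi-word phrases like “San Francisco” and knowing that “Vegas” means “Las Vegas” while “York” does not mean “New York” — can be sketched with a toy query normalizer. This is purely illustrative: the phrase list and synonym table below are made-up stand-ins, not Google’s actual data or method.

```python
# Toy sketch of two early "natural language" steps from the interview:
# grouping known multi-word phrases and expanding known synonyms.
# PHRASES and SYNONYMS are illustrative stand-ins, not real search data.

PHRASES = {("san", "francisco"), ("las", "vegas"), ("new", "york")}

# "Vegas" alone maps to "Las Vegas"; note "york" is deliberately absent,
# since "York" on its own does not mean "New York".
SYNONYMS = {"vegas": "las vegas"}

def normalize_query(query):
    """Lowercase the query, join known two-word phrases, expand synonyms."""
    tokens = query.lower().split()
    terms = []
    i = 0
    while i < len(tokens):
        # Prefer a two-word phrase match over single-token handling.
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in PHRASES:
            terms.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            terms.append(SYNONYMS.get(tokens[i], tokens[i]))
            i += 1
    return terms

print(normalize_query("hotels in Las Vegas"))  # ['hotels', 'in', 'las vegas']
print(normalize_query("flights to Vegas"))     # ['flights', 'to', 'las vegas']
print(normalize_query("York"))                 # ['york'] -- not expanded
```

A real system would learn phrase and synonym statistics from query logs and document corpora rather than hard-coding them, which is presumably where the data and computational resources Norvig mentions come in.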