Q&A With Google Personalization Gurus Sep Kamvar and Marissa Mayer
Last week I had the chance to talk to Marissa Mayer, Google VP, Search Products & User Experience, and Sep Kamvar, engineering lead for personalization at Google, about the inclusion of Web History into personalized search. On the face of it, this is another beta announcement from Google that will impact a relatively small number of users and will have no immediate implications for advertisers, but on a strategic level, this is a puzzle piece of substantial proportion that provides some real clues as to how Google might start tying the parts of their expanding kingdom together. So I’m temporarily putting part two of the Shopping Engine Comparison on hold to give you the highlights of my chat with Marissa and Sep.
As far as the immediate impact goes, it is old news (at least a few days old) and Danny covered it pretty extensively in his post. What I’d like to do today is look at future implications, both for the user and the search marketer, and share some quotes and comments from both Sep and Marissa that provide clues about what Google’s plans around personalization might be.
Treading slowly but surely down the personalization path
First of all, this recent announcement is a carefully measured step forward into the realm of personalization, but it does follow pretty quickly on the heels of Google’s previous announcement regarding how the personalization suite became more of a default “on” experience for Google users. I think this indicates that we can expect a number of incremental announcements from Google over the next year, pushing personalization more and more aggressively towards their user base. The inclusion of Web History marks a particularly significant announcement however, as it represents a “missing piece” that dramatically alters the nature of personalization in a number of ways.
This announcement is intended to give Google a test bed and to give Sep Kamvar and his team a clean data set to play in. While Google has always collected user click stream information (if PageRank was enabled), they didn’t have explicit consent from users to use it for anything, just a nondescript general statement in the EULA. This clears the road for them to play with the data with a clear conscience.
Marissa: We’ve seen search history and search clicks improve personalized search, so it makes sense to us that Web history should improve it as well. With that said, we don’t have a large body of data on which to experiment and achieve those increases, so we wanted the first few users to consent to it and allow us to build a small user base that we can do a lot of research on to try to understand how to use additional Web history data, beyond the search history data, to enhance the personalized search results. There’s no real reason for us to turn it on for several hundred million users if we don’t know exactly how to use it to give them the best possible gain. So we’re starting small with people who are consenting to opt-in to give Sep (Kamvar) that early set to experiment with.
Cranking up disambiguation
So from the user’s perspective, what will the inclusion of Web History mean? Well, not much right now, but in the long term it provides a much clearer data set from which to disambiguate intent. The problem with relying solely on search history is that it restricts personalization to this rather slim slice of online behavior. Personalization only works if you’ve conducted searches in a particular area previously. This currently limits the appearance of personalization to 2 results out of 10 in 1 in 5 searches, a threshold that Marissa admits was arbitrarily chosen and will likely change in the future. Search history gives a decidedly one-dimensional view of a single user’s online activity. But when you combine that with web history, you get a much clearer picture. You can then see what type of activity a relatively ambiguous query (e.g., digital camera) led to. Was it transactional activity, shopping around on online stores, was it checking out reviews on dpreview.com, or was it looking for technical support information for one particular model? With this additional information, Google can significantly bump the relevancy on the next search you do for digital cameras.
May I suggest?
But further to that, it moves Google much closer to being a recommendation engine. You can now let personalization act as a guide in areas where you may not have searched before. Let’s assume you’re planning a trip to Italy, in the Cinque Terre region. You’ve visited a few sites and done some initial planning, checking flight costs and visiting the official tourism site looking for activities in the region. Let’s further say that you’re an avid hiker, and visit Trails.com regularly. By putting this together, Google can now lift results into a generic search for Cinque Terre tailored specifically to your interests. And the more you search, the more accurate the recommendations become.
Sep: The more we give users the opportunity to give us data, the more we can do with it and the greater our coverage will be. One thing that we noticed was that the longer people use personalized search, the more queries that it impacts and that’s for two reasons: The first is that the more we know about their search history, the better it becomes. The second is that people who are just starting to use search history have been used to unpersonalized search, so they tend to oversimplify their query. For example, they type in something like “Boston Public Library” when in their head they’re thinking “Public Library”. They translated because they think the search engine needs that extra step of specificity. When they use personalized search they realize that the search engine will give them the right result for “Boston Public Library”. They tend to make less specific queries over time, shorter queries and it’s quicker and easier for them. And this affects more and more of their searches. So two things happen. One is that personalized search becomes more effective with the more history we collect and the second is that users’ behavior changes, the queries become easier and more natural to them, rather than what they think the search engine needs to understand.
Show me the money!
By this point, every marketer in the audience has drops of saliva the size of large grapes hanging from the corner of their mouths. Obviously, this has huge potential upside for the marketing network on Google. But personalization is squarely focused on the organic results…for now.
Marissa: We’re interested first and foremost in enhancing the search results using this new data and I believe that’s the first emphasis at this point. We’re focusing there at this point. It’s possible in the future that we would personalize ads but I really feel the object of search results are where we should focus first before returning to the monetization upside of this.
Users, don’t get too comfortable, and advertisers, don’t get too disappointed. A follow-up comment from Marissa made it clear that what’s good for organic is also good for advertising, and while organic is ahead of the curve in this particular case, advertising isn’t that far behind.
Marissa: One of the things that matters to me a lot is that search and the ads match. It’s actually really hard to keep them in sync. The search team comes up with a great synonym expansion and then we have the problem that our ads don’t search for those same synonyms. It goes on and on in this way. And then the ads become geographically focused and relevant. Every now and then you’ll see ads targeted to your location and shouldn’t the search results be as well? The idea is that both the ads and the results have to have access to the same input and the same information entity, operating under the same premise – are we returning back results that are local to the user, or are we returning results that have the same expansion techniques applied. I think it would feel wrong to the user if the ads understood something about them that the search results didn’t and vice versa. We’re trying to keep the two in step, but with that said, our current agenda is to let the search results get a little ahead of the curve there and actually figure out how the search results will become further personalized before we put too much effort into the ad personalization. Obviously given that advertising is the business that it is, and given a lot of the market research that has been done, there are already clear and easy ways to do personalization of advertising. Search personalization is far less explored.
Other potential personalization scenarios
How else will personalization impact the search results page? There were a couple of additional things that I thought about right away. One area where Google has consistently maintained a lead is in maintaining top-of-page relevancy. They only show sponsored results in the Golden Triangle when they’re pretty comfortable those results represent the best match to the user’s intent. With personalization providing significantly more data to disambiguate intent, will Google use this to further refine whether top sponsored ads appear or not? Marissa indicated it’s still very early in the game and they haven’t explored the possible applications of personalization that much yet. I have to guess that Nick Fox and his ad quality team will definitely be looking at Sep’s progress on disambiguation with more than a little interest though.
Personalization should also allow Google to become more confident in presenting vertical results. It seems to me that, overall, the presentation of OneBox results has been dialed down recently. Again, these vertical results, because they appear in the Golden Triangle, are subject to the same strict relevancy standards as the top sponsored ads. If personalization helps Google get inside the head of the user and determine their intent, would this mean that they could more confidently put them into a vertical search experience, whether it was local, news, product or even further defined verticals, such as travel or finance? Right now, the vertical market still represents a hand off point from Google to a number of specialized players, and it would certainly be to Google’s benefit to eliminate the hand off and keep users engaged with a Google vertical property, especially if that could be accomplished seamlessly through personalization. While the possibilities are certainly intriguing, according to Marissa it’s just one of many things they haven’t given a lot of thought to yet.
Marissa: There are a lot of different things that we could do with this data. I’ll be totally honest. Verticals isn’t something that has been first and foremost in our minds so I don’t really think there’s a strong vertical angle here at the moment.
But is it win-win?
Ultimately, it comes down to this question: are users willing to make the trade-off between privacy and increased functionality? And that’s where the logic of Google’s approach starts to show itself. To be honest, I was a little bit apprehensive when I opted in and explored the web history tab. That was a lot of personal data that I was making Google privy to. And it was historical data from before my opt-in, so obviously Google has always been tracking me. My opting in just allowed me to see the data they’ve always been collecting. For me, there had better be a significant user experience benefit to make it a fair trade. Google is counting on the “out of the box” usefulness of web history to entice even a small number of users to opt in.
Marissa: We’re very excited about this because, one, we think that it offers our users transparency into the data that we have so if you’re a user that has the PageRank feature turned on and you have search history turned on you can see this all in one account. But also, and more importantly, it offers up a huge opportunity for further personalization. We think that we have seen big gains in our personalized search algorithm and its relevance based on search history and search clicks and while that’s certainly an important part of it, having access to someone’s full Web history will allow us to do a better job personalizing their search results to them. Also, it’s just a useful feature in and of itself. To be able to look up, easily, any site that you’ve seen. Yes, there is a history feature in the browser but our UI offers you more context, offers you time and dates of visit. After using it myself for the past two weeks, it does make it easier for you to look up, “Hey, what was that page I saw the other day?” So we’re really excited about it. That said, we do understand with this feature, as with all personalization features, a very large privacy trade-off is being made and so users do need to explicitly consent to this feature.
But there’s also the longer-term benefit of seeing your search results improve. And that’s going to be a much more incremental win for the user. As that gains traction, expect Google to get more aggressive with pushing Web History out to users, much as happened with Search History:
Marissa: It’s the same curve. In the beginning, search history was very much an opt in feature because we weren’t sure the type of relevancy gains we’d get from it, so we had to start with opt in. Sep and his team went to work and they made great use of that data and ultimately we saw large relevancy gains, so we said, look, for the average user they should have search history being collected for them because we can ultimately make their results that much better. So slowly, over time, as we got that initial set of data we learned what we could do with it and we got much more comfortable and confident that this was a good trade-off for users to make. There was a small trade-off on privacy but they’re going to get dramatically better search results. That was something that made sense to us over time.
I suspect that the same curve will follow here. So today, and I believe you and I agree on this, probably Web history will make search better but we really don’t know that, at least not on a broad scale. So we start slow with an opt-in and make sure that the users are happy and knowledgeable and they’re making this trade-off in a knowing way. We’ll use that data to help understand what kind of gains we can get from it. As we get comfortable and confident that the relevancy gains that we’re seeing are commensurate with the types of privacy trade-offs we’re presenting users with, we’ll ultimately likely become more aggressive. That said, we’re never going to get to the state where we’re returning something without the user being aware of it and we’ll always make a strong attempt to inform.
The move last week by Google was a relatively minor ripple today for the vast majority of Google users, but it marks another significant development in what I believe is the biggest news in search since the mid 90’s. One day, we’ll look back at last Thursday’s announcement as an important milestone in online history.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.