Researcher Jim Jansen On The “Sex” Of Search Queries & Personalization

Gord Hotchkiss on October 30, 2009 at 7:00 am | Reading time: 9 minutes

In this column, I’ll follow up on my conversation with Dr. Jim Jansen from Penn State and his recent investigation into behavior patterns that lie within a large data set of visitor and search advertising campaign data from a high-traffic ecommerce site. In part one, Jim and I explored whether a search funnel actually exists. Surprisingly, Jim found that more generic queries, considered by marketers to be “top of funnel” queries, may be the only search activity required. He found these terms tended to generate equivalent or higher ROI than longer, more transactional queries.

Today, I’d like to cover a couple of additional topics that came up in our conversation: personalization in terms of the “maleness” or “femaleness” of the query used, and how personalization may play out on both the desktop and on mobile devices.

Let’s start with the “sex” of queries. Jansen did an interesting segmentation of the queries in the dataset, using Microsoft’s demographic tool:

Jansen: We took queries from this particular search engine marketing campaign and classified them based on gender probability using Microsoft’s demographic tool, which will classify a query by its probability of being male or female. We looked at it this way: not whether the searcher was male or female but did the particular query fit a gender stereotype: did it have a kind of a male, for example, feel to it or stereotype implications?

Having done previous work with personalization (and gender specificity does fall into the broad category of personalization), Jansen had his own hunches about what he would find. As it turned out, his hunches were wrong:

Jansen: The results to me were counterintuitive from what I expected. Usually, the idea of personalization is that the more personalized you get, the higher the payoff, the efficiency and effectiveness is. [But when we looked at the data] in terms of sales, far and away the most profitable were the set of queries that were totally gender-neutral. We took the queries and divided them into seven categories: “very strongly male,” “generally male,” “slightly male,” “gender neutral,” “slightly female,” “strongly female,” “very female.” By two orders of magnitude, the most profitable were the ones that were totally gender-neutral.

Jansen offered examples of “gender neutral” terms:

Jansen: We defined gender-neutral queries as those the Microsoft tool classified up to like 59% either side. So we had a fairly big spread here. Here are some examples of queries based off the Microsoft tool: “electronic chess.” The Microsoft tool classified that 100% male. For a gender-neutral query: “atomic desk clock” and “water purifier.”
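
To make that cutoff concrete, here is a minimal sketch of the neutral-versus-leaning rule Jansen describes. It is not his code. It assumes the tool's output can be reduced to a single probability that a query is "male," and the function name and the two "leaning" labels are hypothetical. The interview does not give the boundaries for the other six categories, so the sketch does not try to reproduce them.

```python
# Cutoff taken from the interview: "up to like 59% either side".
NEUTRAL_CUTOFF = 0.59


def classify_query(p_male: float) -> str:
    """Label a query from its probability of being 'male' (0.0 to 1.0).

    Hypothetical helper: only the 59% neutral band comes from the interview.
    """
    if not 0.0 <= p_male <= 1.0:
        raise ValueError("p_male must be between 0 and 1")
    p_female = 1.0 - p_male
    # Neutral if neither side exceeds the cutoff.
    if max(p_male, p_female) <= NEUTRAL_CUTOFF:
        return "gender neutral"
    return "male-leaning" if p_male > p_female else "female-leaning"


# "electronic chess" was reported as 100% male.
print(classify_query(1.0))
# An illustrative (assumed) score inside the neutral band.
print(classify_query(0.55))
```

Under this rule, anything scored from roughly 41% to 59% male lands in the neutral group, which is the "fairly big spread" Jansen mentions.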

At this point, the mystery of why “gender neutral” queries performed at a significantly higher level remains to be solved, but Jansen has some thoughts:

Jansen: One thing that is coming out in the personalization research is that at a certain level, we have totally unique differences. You can personalize to a general category and to a certain level, but beyond that, it’s either not doing much good or may actually get in the way. And that may be something that is happening here—that these particular, very targeted gender keyword phrases are just not attracting the audience that the more gender-neutral queries and keywords are.

Again, it’s a “why” thing. We spend a lot of time in web search trying to personalize to the individual level and really haven’t got very far. But now people are trying to do things like personalize to the task rather than the individual person, and there’s some interesting things happening there. Spell checks and query reformulations and things like that are very task-oriented rather than individual searcher oriented.

Dr. Jansen’s point, that personalization might be better aimed at the tasks we’re engaged in than at the people we are, led to further speculation about where personalization might take us in the future.

Jansen: [Personalization] is just so hard to do. You know, Gord is different than Jim, and Gord today is different than Gord was five years ago. Personalizing at the individual level is just very difficult and may not even be a fruitful area to pursue.

We’re nonlinear creatures, we’re changing all the time. I can’t even keep up with all my changes and I can’t imagine some technology trying to do it. It just seems an unbelievably challenging, hard task to do.

I brought up the point that even we don’t know why we do the things we do, because so much of our decision making is driven by unconscious factors. It’s a thought that’s crossed Jansen’s mind as well:

Jansen: I’ve commented on that before in terms of recommending a movie or book to me. I don’t even know what books and movies I like until I see them. Sometimes I pick up a book and say, “Oh, I’m going to really love this,” only to get a chapter into it and realize “Okay, this is horrible.” And I think you see that in the Netflix challenge. So many organizations have labored for a decade now, and finally it looks like perhaps this year someone may win by combining 30 different approaches simultaneously to the very simple problem of “Recommend a movie.” It’s just amazing the computational variations that are going on.

From personalization, our conversation then veered to mobile (not such a long detour, really). To me, the intersection of search and functionality has the most potential on our mobile devices. But the need to “get it right” is substantially higher, given the inherent challenges of handheld devices: limited screen real estate and input challenges.

Jansen: Everybody is saying (again), “This is the year mobile searching’s going to take off.” It’s been going on for four or five years now, and really, at least here in the US, it hasn’t really happened yet. But what I think is going to make it hit the mainstream is this combination of localized search. When you have a mobile device, the technology has so much more information about you: it’s got your location to within a couple feet, the context that you’re in can really start entering the picture and information gets pushed to you. I’m thinking tagged buildings and restaurants and cultural events and on and on. And so with my mobile device, where I can talk into it, I don’t even have to type anything. I want “what’s going on in the area?” and it automatically knows my location and the time and perhaps something about me and the things that I’ve searched on before. “Oh, you like coffee shops where there’s some music playing. Guess what? Boom. There’s five right near, in your area that have live entertainment right then.” So I think in that respect it’ll be a little more narrowed search, but the technology will have so much more information about us that in a way it makes the job easier. The problem’s going to be the interface and the presentation of the results.

Imagine being able to walk through a town… I live in Charlottesville, Virginia. Tons of history here from 400 years ago when Europeans first settled here, Thomas Jefferson, James Madison, etc., etc. Being able just to walk down Main Street and have tagged buildings interface with my mobile device… I’m a big history buff and so getting that particular information, one, pushed to me or at least available to push when I ask for it is a wonderful, wonderful area of personalization. This idea of localized search and mobile devices and mobile search may be the thing that brings it all together and makes mobile search happen.

Given the direction of the conversation, I had to ask Jim about the privacy implications of all this functionality. Let’s assume that Google is the likely candidate to assemble this search “utopia.” What price might we have to pay to enable Google’s effectiveness as our own personal digital concierge, or, more sinisterly, our “Big Brother”?

Jansen: You know, the “Big Brother” label has certain negative connotations, so I don’t want to say that Google is Big Brother-ish in that regard. But certainly I think with their movement into free voice and free directory assistance, they will soon have a voice data archive that will allow them to do some amazing things with voice search, which would be an awesome feature for mobile devices. Being able to talk into a mobile device, have it recognize you nearly 100% of the time and execute the search.

My final question for Jim was how much of a priority Google should make innovation in the mobile search space:

Jansen: Google of course is the one that knows what they’re doing, but certainly I think it would be naive not to be exploring that particular area. And I think the contrast from what you said about Microsoft and the desktop, the desktop is just so busy. You’re getting so many different signals in terms of business, personal things, my kids use my computer sometimes. And so the context is so large on the desktop, but the mobile device, it’s narrower. You know, you have some telephone calls, you can do some GPS things, so the context is narrower but very, very rich in that very narrow domain. I think it’s a really hot area of search.

The entire interview transcript has been posted to my blog. As always, a conversation with Jim Jansen never fails to be interesting.

Opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.
