Both VentureBeat and the New York Times are out today with an update on Powerset, which has won the rights to natural language search technology from Xerox PARC. Powerset hopes that technology will turn it into the new Google when it finally launches. VentureBeat also gives us news of at least one person from Yahoo now working there: Tim Converse, who left Yahoo in December. I know others from Yahoo are at Powerset as well.
Last time we had news about Powerset, I did a big long rant – Hello Natural Language Search, My Old Over-Hyped Search Friend — about the hype and history often associated with natural language search. VentureBeat mentions that rant, as does TechCrunch today in its post, Powerset Hype To Boiling Point. So I felt I should give an update.
Unlike Matt Marshall over at VentureBeat, I’ve yet to see Powerset (right now, you can only see it within the Powerset offices, and I’m in England, not Silicon Valley). I have been in touch with CEO Barney Pell, and I’m looking forward to using a working demo in the future.
I also have to stress that since I haven’t seen the tool firsthand, I may yet be blown away and as impressed as some others are. Then again, right now I still feel that it’s more hype than actual Google threat.
Technically, I’m "off" from writing on Fridays. So for now, I’ll just share what I sent over to both VentureBeat and Barney last night:
Frankly, I can use Hakia right now and have been very impressed with much of what they deliver, though not all of that is due to the natural language stuff. I have an article I’m working up on it to talk about how they are better for some queries, in some respects.
I knew about some of the Yahoo people going over, including Tim. Interesting, and it certainly adds more to the Powerset story to consider, but I’m still firmly in the "prove it by actually rolling it out" category.
who acquired ibm
[a search cited as Powerset doing better than Google in Matt's story]
OK, I see Surfaid, Lenovo listed, so that’s two "IBM" things acquired. I’m not sure who searches for that [particular phrase] to find out who acquired particular IBM units. That type of query feels to me like something someone suggested you try, and if so, it’s usually suggested because that query is known to produce something good. Go search for travel, cars and casino and compare those to Google and see if you’re still impressed. Those are real searches.
Also, what’s the size of the index Powerset is searching against? Barney?
Google has in excess of 10 billion documents from across the entire web, and that’s a low estimate. If Powerset is hitting fewer documents, and especially a subset of, say, business documents, that filtering alone might give you better results.
So later, you ask Powerset: "What do liberal democrats say about healthcare policy?" That sounds really compelling. But guess what? On many pages that say liberal democrats, you’ll also find Hillary Clinton mentioned.
- What do liberal democrats say about healthcare policy: 1.4 million matches
- What do liberal democrats say about healthcare policy hillary clinton: 900,000 matches
Maybe that special document I needed was in the 500,000 that Google’s not picking up, because it’s not making the association. Chances are, it’s not.
"Powerset showed it can answer more complex questions, such as “Who did IBM acquire in 1996?” Here, Google completely breaks down."
[quote from the story]
Breaks down because what did Powerset show? I mean, a screenshot? A list of what was there?
FYI, if I do these:
Both bring me to Wikipedia first [via Google] which answers the question in short order and is anything other than a failure.
Now ask yourself. If you’re looking for an IBM acquisition in 1996, don’t you think you’re likely to wonder about a particular one? Or if you’re after things they’ve acquired, are you that likely to put in a particular year?
Sure, some refinement suggestions would help. But you’re picking out one query and saying Google breaks down when that might not even be a query someone would do.
Clearly, Powerset faces challenges. Even if its technology does prove to be useful, it isn’t clear how long it will keep its lead in the face of an onslaught from Google. Another challenge is changing peoples’ search behavior, which is used to keyword searches.
[quote from the story]
What lead? With respect, it doesn’t have any lead. It has some experts, a product pitch and a product that only a few can play with, and only internally. When it launches, it then faces a serious challenge to take on even Ask, much less Google. I’m not saying it can’t be done, but wow, you’ve already got it winning while it hasn’t even entered the race. Not only does it have to attract traffic, but it needs to prove it can scale to handle that traffic and the processing. Will it stand up to doing all this natural language processing (which often will be unnecessary) in the face of a lot of traffic?
NOTE: Matt also replied to me:
By lead, I actually meant lead in natural language search. Clearly they don’t have a lead in search. (though, even in natural language, I suppose you could say that is speculation, given that [Google] tells me they’ve got people working on this).