In two weeks, we’ve had two "open" initiatives from Google: OpenSocial, to free social networking data from behind the Facebook walled garden, and the Open Handset Alliance, to free cell phones from a myriad of complicated mobile OS platforms and carriers who want to restrict features. I’ve seen some people writing about open as the new black, with Google showing its fashion sense by dressing in the latest color. But lest anyone think that Google’s wardrobe is being replaced with an all-open line-up, it’s worth remembering that recently, open mainly fits Google when it’s behind competitively in a space. Let’s consider the places where staying closed is what suits Google best.
Everyone’s all excited about the "social graph" these days, that terrible term that simply means social network data or social linkage. If you know how people are connected, there are all types of interesting things you can do, from applications that let people compare movie tastes to targeting ads.
Facebook is seen as having the best social data, which is why Google wants to tap into it through OpenSocial (see here and here for background). But what about the "web graph," that similarly terrible term that simply means how pages are linked together? Years ago, Google tapped into the web graph to improve its relevancy and now remains the top search engine in the world.
While the major search engines thankfully no longer play the page count game, I’ve got no doubts about Google’s claim to be much larger than its rivals in general. It has been scaling up its index — the collection of documents it gathers from the web — consistently since the company started.
That large index gives Google a huge advantage over rivals. It knows more about what’s on the web than anyone else. So why not share? Why not start an Open Index Alliance where there’s a coordinated effort to crawl and index all the documents in the world, allowing anyone to tap into the raw data?
An absurd idea? Why? Just having the same collection of documents doesn’t mean a competitor would be as good as Google. It will still come down to the search algorithm, that system that sifts through all the data and decides which documents are best. Google’s index isn’t the secret sauce — it’s the Google algorithm that’s important.
Still, the index is important competitively. Deny that to competitors and they have to go through the significant time and expense of building their own. That takes away from time that can go into improving the more important search algorithms. In addition, if competitors can’t build as large an index, they’ve got less data to work with and potentially are further behind Google.
Still, why should Google be open to helping others by opening up web search? Two reasons. First, it’s consistent. If Google’s going to push for those with existing advantages to open up through efforts like OpenSocial and the Open Handset Alliance, an Open Index Alliance just seems like fair play.
Second, it’s not just competitors. There are researchers who would like to tap into a huge collection of web documents. To my knowledge, this isn’t something that currently happens. Some researchers might be able to tap into a small portion, but certainly not everything Google has on tap.
I was able to attend Foo Camp this year, and a session on creating an open index was one of the highlights for me. Doug Cutting talked about his work on things like Nutch and Lucene. At the table listening with great interest were both Jason Calacanis of Mahalo and Jimmy Wales of Search Wikia. Both want a massive index they can tap into. Wales, in particular, gets much press pushing on Google for not being open with its index. Others at the session raised the issue of researchers wanting to use the data. And midway through, Google’s Larry Page arrived and joined in.
Maybe some of that conversation will flow back to Google and an open index might happen. Personally, I’m not counting on it. I still feel Google sees its web index as a crown jewel to be hoarded and protected, not fully shared — and if I’m correct, that’s deeply ironic given the "open" push in other areas.
There’s probably no deeper example of Google being closed than when it comes to book search. Google’s efforts to scan books are well known at this point. But Google keeps coming under fire for agreements said to restrict those scans from being used by its competitors. For background, see:
- Battle For Books: Evil Google Versus The Altruistic Open Content Alliance
- The Politics of Book Search: Some Research Libraries Decline to Offer Books to Microsoft, Google
To be fair, Microsoft has also added similar restrictions. But if Google’s on an "open" kick, why not join the Open Content Alliance?
The OCA was started in 2005 as a rival to Google’s book scanning efforts, with Yahoo and Microsoft as major backers and the Internet Archive participating and leading as a neutral party. Of course, Internet Archive founder Brewster Kahle has had plenty of non-neutral things to say about Google’s book scanning efforts, concerned that Google is gobbling up too much for itself.
Well, what better way to counter the "closed" PR than to join the OCA? And more importantly, it would be better for everyone if scanning the world’s books were done in a coordinated manner, rather than the probable duplication of efforts that is going on right now.
Firefox Default Search
I was bemused to see Jeremy Shoemaker come under fire for questioning why Google is the default in many versions of Firefox, since he was right in his assessment. A Firefox developer responded that Google’s the default because users want it and that Yahoo couldn’t buy the default slot. As a Firefox user, I must have missed the vote that was held. But sarcasm aside, Google DID buy the spot — except for some Asian versions of Firefox (Yahoo bought those).
Firefox is important because when Internet Explorer 7 came out, Google was very much of the opinion that users need to get more "choice" about the search provider selected — i.e., the search defaults should be more open.
Google & Dell’s Revenue-Generating URL Error Pages Drawing Fire gives more background on this, plus it highlights that the idea of open choice is fine when Google’s at a disadvantage (in IE7) but restricting choice is fine when Google benefits (with Firefox or Dell’s branded search).
Perhaps it’s time for an Open Search Default Alliance, where the OSDA can figure out how users can best control their search settings rather than this being left to business interests. That might prevent companies like Verizon from deciding, as it recently did, that they’d like to monetize error traffic by seizing control of searches. Since Verizon partners with Yahoo, the OSDA could actually help Google. Of course, it might hurt it in other cases. But it’s all about being open, right?
AdWords & AdSense
Talking about openness when it comes to AdWords and AdSense in the context of cooperative activities is a stretch, I admit. But if you’re going to talk about Google being open in general, you have to address the inherent closed nature of these programs.
How much will an ad cost you through AdWords? Depends on the mysteries of quality score. We at Search Engine Land were recently told that buying an ad for our own name would cost a minimum of $5 per click, presumably because we aren’t relevant enough. Right. Because as you know, most people on Google searching for us by name probably don’t want to reach our site. Heh. But competitors buying our name are apparently more relevant (or pay more), since they outrank us.
Hey, I understand the concept of an account history, and how over time, things should get better for us. But it can be maddening to people, and the closed nature of how AdWords operates (that black box that just got poked at again, this time by Robert X. Cringely) doesn’t fit the "open" trend some think Google is following.
As for AdSense, how much does Google keep back from publishers? If you’re big, like Ask.com, you’ll know the slice you’re getting. But most people are going to get whatever Google decides to give. AdSense isn’t an "open" marketplace where publishers set prices and see how much advertisers are willing to pay, with Google taking a known and set percentage. Google will take whatever it wants, and publishers are left guessing. So much for open.
If I come off as harsh, well, I also do a lot of defending of Google, as I just did yesterday about Cringely’s post. My goal in this, as with many posts, is to provide some balance. And those thinking Google has swallowed the open Kool-Aid need to think again.
Google does do plenty of things I find encouraging on the open front. They hooked up with the Open Invention Network a few months back. Google was the driving force to get search engines united around the Sitemaps standard. There are no doubt many other examples of where Google is involved with collective, open-source style projects.
But these things are far from an institutional mandate, from what I’ve seen so far — and the latest, most prominent efforts come because being open makes good business sense for Google, not because it makes good sense in general.