Google “Three Times Larger” Than Nearest Rival & More Q&A With Google’s Marissa Mayer


As part of today’s Google Press Day 2007, Marissa Mayer, Google vice president of search products and user experience, covered the "Past, Present & Future Of Search." Much of this is known to Search Engine Land readers and also already covered by Google during its Searchology day last month. But, there was a size tidbit on Google estimating itself three times larger than its nearest rival and some other things out of Q&A I thought worth live blogging. More below.

Google has four key components of search:

  1. Comprehensiveness
  2. Relevance
  3. Speed
  4. User Experience

Comprehensiveness

Marissa is celebrating her 8th anniversary this week, and the index was 30 million pages when she joined. Today, it’s "tens of billions" of pages indexed.

On June 26, 2000, "Giga Google" was a company milestone: a billion documents indexed.

Relevance

Users had such a good experience that Google quickly spread via word of mouth. Marissa presses on, despite the power going out, taking down her slides and microphone.

Speed

Snappy under-a-second responsiveness helped Google grow. While physics may ultimately limit how fast Google can be, they’d like responses to be nearly the speed of light.

Q&A

With the power not returning, Marissa went right to Q&A. The quotes below are close to what she said, though they might not be exact in all cases. Snarky comments are all my own.

BBC: How many pages are indexed today, and how does that compare?

Today we release our page count in orders of magnitude, and "we believe we are three times larger than our next nearest rival." (Yahoo would be the nearest rival. It also doesn’t quote numbers, and why both of them don’t is covered in my 2005 article about Google dropping its home page size count. Let’s hope they don’t go back to it. Oh, and Yahoo will likely dispute they are three times behind. Expect a "we’re on par with our nearest rival" comment from Yahoo in the future).

Guardian: Jason Calacanis made good points that the internet is polluted. What do you think about Mahalo as his solution?

To date we rely largely on automation…. I actually think the right answer is a blend of both, to get the incredible scale that automation can operate on and have the human intelligence, particularly in Asia (this response about Asia is largely because Google hasn’t done well in the face of human-powered Naver in South Korea and Yahoo Answers growth there, so now everyone thinks human-powered must be the Asian or South Korean solution).

What about the semantic web (heh, what about it?).

Given the sheer scale of the data, "we [are] actually able to find interesting patterns in the data." (I.e., we don’t need no stinkin’ human tagging when we can extract information from the rich textual documents themselves).

Yandex does live journal blogging. How about Google?

Yes, and Google does blog search. Plus, they do universal search to blend these results with other results as relevant, and blog search may come to that in the future.

How many data centers?

They don’t release figures. Having their own fiber optic network helps them improve user experience.

Do you think people could opt in to hold their data longer than 18 months to have a hyper experience?

Marissa kind of stumbled on the answer, saying it’s something they’ll look at. She stumbled because dude, they already keep data longer than that through opt-in via web history and other stuff. Read these for more on that:

* Google Responds To EU: Cutting Raw Log Retention Time; Reconsidering Cookie Expiration

* Google Bad On Privacy? Maybe It’s Privacy International’s Report That Sucks

* Google Search History Expands, Becomes Web History

* Google Anonymizing Search Records To Protect Privacy

With the power staying out, lunch was declared, with Q&A and presentations to return after folks are fed. So, perhaps more later.

Back to the formal presentation, where things left off with speed. Comments from me are in [brackets]….

Home page is clean, briefly revisits story of dumb luck that Sergey didn’t know much HTML, so the home page was kept simple. Shows examples of home page over the years.

Search results page has stayed clean over time. Covers spelling correction feature and related query suggestions [though less offered on this front generally than say by Ask.com and its Ask 3D interface].

Covers different results for different locations [a pet hate of mine, inasmuch as sometimes people in one country WANT to see exactly what others are seeing, though in many cases, automatically making results country specific is helpful]. Cote d’Or shown as an example of how in France, you tend to have the pages dominated by the region rather than the chocolate (as is the lead result in Google Belgium), as that’s what most people in France want.

Covers how Google Universal Search is designed to help people get the right vertical search results even if they don’t know all of the ones that Google offers.

Covers iGoogle personalized home page and taking in personalized information to customize search results.

Question time again….

Is the opt-in program allowed to keep people’s data for so long, say five or six years (versus, say, what might be legally allowed)?

Personalized search is opt-in. You decide whether you want personalized results. That info is anonymized after 18 months [actually, log data is anonymized, but the Google personalized search results for each user are not anonymized, to my knowledge].

How important is geographic info for video?

Tagging is very important for video. Geo tagging of images and video can become increasingly important for making country specific results.

Should authoritative sites take preference over, say, blog entries or Wikipedia [i.e., these aren't trustworthy]?

With Wikipedia, seeing the linkage helps Google know that particular entries might be deemed trustworthy, "it’s happening because people like the content and are linking to it." [Plus, by law, Google is required to list Wikipedia first in many countries. Heh -- joke, joke, folks].

Question on Googlebombs and images.

On image search, you can narrow results by resolution size (that was the answer to the second question). As for Googlebombs, explains how unusual words can be susceptible to trickery, "the few Google bombs that have happened have become quite famous." [For more, see Google Kills Bush's Miserable Failure Search & Other Google Bombs and Google Says Stephen Colbert Is No Longer The Greatest Living American].

How far along is searching video by looking at the actual images?

Preliminary efforts, very much at the research phase, they’re not really ready for prime consumption. Personally, she thinks it’s much more likely they would develop a good voice-to-text mechanism [actually, Google should have this already -- they used this for the initial launch of Google Video]. Says they are further along with the voice recognition part.



Danny Sullivan is editor-in-chief of Search Engine Land. He’s a widely cited authority on search engines and search marketing issues who has covered the space since 1996. Danny also oversees Search Engine Land’s SMX: Search Marketing Expo conference series. He maintains a personal blog called Daggle, can be found on Facebook, Google Buzz and microblogs on Twitter as @dannysullivan.
