Google “Three Times Larger” Than Nearest Rival & More Q&A With Google’s Marissa Mayer

Danny Sullivan on June 19, 2007 at 6:43 am

As part of today’s Google Press Day 2007, Marissa Mayer, Google vice president
of search products and user experience, covered the "Past, Present & Future Of
Search." Much of this is known to Search Engine Land readers and also already
covered by Google during its Searchology day last month. But, there was a size
tidbit on Google estimating itself three times larger than its nearest rival
and some other things out of Q&A I thought worth live blogging. More below.

Google has four key components of search:

Comprehensiveness
Relevance
Speed
User Experience

Comprehensiveness

Marissa is celebrating her 8th anniversary this week, and the index was 30
million pages when she joined. Today, it’s "tens of billions" of pages indexed.

On June 26, 2000, Google hit a company milestone, "Giga Google": a billion
documents indexed.

Relevance

Users had such a good experience that Google quickly spread via word of mouth.
Marissa presses on, despite the power going out, taking down her slides and
microphone.

Speed

Snappy under-a-second responsiveness helped Google grow. While physics may
ultimately limit how fast Google can be, they’d like responses to be nearly the
speed of light.

Q&A

With the power not returning, Marissa went right to Q&A. The quotes below are
close to what she said, though they might not be exact in all cases. Snarky
comments are all my own.

BBC: How many pages are indexed today, and how does that compare?

Today we release our page count in orders of magnitude, and "we believe we
are three times larger than our next nearest rival." (Yahoo would be the
nearest rival. It also doesn’t quote numbers, and why both of them don’t is
covered in my 2005 article about Google dropping its home page size count.
Let’s hope they don’t go back to it. Oh, and Yahoo will likely dispute they are
three times behind. Expect a "we’re on par with our nearest rival" comment from
Yahoo in the future).

Guardian: Jason Calacanis made good points that the internet is polluted. What
do you think about Mahalo as his solution?

To date we rely largely on automation…. I actually think the right answer
is a blend of both, to get the incredible scale that automation can operate on
and have the human intelligence, particularly in Asia (this response about
Asia largely because Google hasn’t done well in the face of human-powered
Naver in South Korea and Yahoo Answers growth there, so now everyone thinks
human-powered must be the Asian or South Korean solution).

What about the semantic web (heh, what about it?).

Because of the sheer scale of the data, "we’re actually able to find
interesting patterns in the data." (I.e., we don’t need no stinkin’ human
tagging when we can extract information from the rich textual documents
themselves).

Yandex does live journal blogging. How about Google?

Yes, and Google does blog search. Plus, they do universal search to blend
these results with other results as relevant, and blog search may come to that
in the future.

How many data centers?

They don’t release figures. Having their own fiber optic network helps them
improve user experience.

Do you think people could opt in to hold their data longer than 18 months to
have a hyper experience?

Marissa kind of stumbled on the answer, saying it’s something they’ll look
at. She stumbled because dude, they already keep data longer than that through
opt-in via web history and other stuff. Read these for more on that:

* Google Responds To EU: Cutting Raw Log Retention Time; Reconsidering Cookie Expiration

* Google Bad On Privacy? Maybe It’s Privacy International’s Report That Sucks

* Google Search History Expands, Becomes Web History

* Google Anonymizing Search Records To Protect Privacy

With the power staying out, lunch was declared, with Q&A and presentations to
return after folks are fed. So, perhaps more later.

Back to the formal presentation, where things left off with speed. Comments
from me are in [brackets]….

Home page is clean, briefly revisits story of dumb luck that Sergey didn’t
know much HTML, so the home page was kept simple. Shows examples of home page
over the years.

Search results page has stayed clean over time. Covers spelling correction
feature and related query suggestions [though less is offered on this front
generally than, say, by Ask.com and its Ask 3D interface].

Covers different results for different locations [a pet hate of mine, inasmuch
as sometimes people in one country WANT to see exactly what others are seeing,
though in many cases, automatically making results country specific is
helpful]. Cote d’Or is shown as an example: in France, the page tends to be
dominated by the region rather than the chocolate (which is the lead result in
Google Belgium), as that’s what most people in France want.

Covers how Google Universal Search is designed to help people get the right
vertical search results even if they don’t know all of the ones that Google
offers.

Covers iGoogle personalized home page and taking in personalized information
to customize search results.

Question time again….

Is the opt-in program allowed to keep people’s data for so long, say five or
six years (versus, say, what might be legally allowed)?

Personalized search is opt-in. You decide if you want to opt in to those
results. That info is anonymized after 18 months [actually, log data is
anonymized, but the Google personalized search results for each user are not
anonymized, to my knowledge].

How important is geographic info for video?

Tagging is very important for video. Geo tagging of images and video can
become increasingly important for making country specific results.

Should authoritative sites take preference over, say, blog entries or
Wikipedia? [I.e., these aren’t trustworthy]

With Wikipedia, seeing the linkage helps Google know that particular entries
might be deemed trustworthy, "it’s happening because people like the content and
are linking to it." [Plus, by law, Google is required to list Wikipedia first in
many countries. Heh — joke, joke, folks].

Question on Googlebombs and images.

On image search, you can narrow results by resolution size (that was the answer
to the second part of the question). As for Googlebombs, explains how unusual
words can be susceptible to trickery, "the few Google bombs that have happened
have become quite famous." [For more, see Google Kills Bush’s Miserable Failure
Search & Other Google Bombs and Google Says Stephen Colbert Is No Longer The
Greatest Living American].

How far along is searching video by looking at the actual images?

Preliminary efforts, very much at the research phase, they’re not really
ready for prime consumption. Personally, she thinks it’s much more likely they
would develop a good voice-to-text mechanism [actually, Google should have this
already — they used this for the initial launch of Google Video]. Says they are
further along with the voice recognition part.
