How The “Focus On First” Helps Hide Google’s Relevancy Problems
In January, I was invited to speak to Google’s search quality team about issues I had with Google’s search results. My topic? For queries where I know a subject really well, I often found Google provided some pretty poor results in the top listings along with the good ones. I wanted more perfection!
With the focus this week on Google Instant Search providing fast answers, I figured it was worth revisiting the quality of results. This post is largely built off the things I presented back in January, issues that are still continuing with Google. It explains how a “focus on first” can help cover the fact that other things listed in Google’s top results aren’t always that great.
Not Ego Search, Expert Search
My choice of queries has a strong degree of self-interest. My examples are searches where I’d expect to see Search Engine Land rank well. Clearly, anyone listed above us is S.P.A.M.: as the old joke goes, Someone Positioned Above Me!
Certainly, one of the most common ways people will measure a search engine is through an “ego search.” You search for yourself, and if you find your own content — a blog, a LinkedIn profile or whatever — you decide if the search engine is relevant or not. It’s a well known behavior, even if it’s sometimes a terrible way to measure relevancy.
I’m not talking about ego searching, however. I’m talking about a subject expert reviewing a set of results to determine if what shows is up to snuff. It’s something that Google’s army of quality raters cannot do. These are people Google hires to review results, to decide what seems good, bad and just outright spam. That aggregate data is used to help Google improve its automated ranking algorithm. But if someone’s not a subject expert, they’re largely making their best guess — and it could be a guess that’s way off.
Before I dive in, let me stress that Google does an excellent job in general on search. It usually does help me, and millions of others, find what they are looking for. And part of that quality comes from an honest effort to improve and take criticism toward that goal, such as asking me to effectively let them have it during my talk in January.
“Search Engines” — More Than Poking Fun
Let’s start with a classic, a search for search engines. I’ve poked at Google about this query for almost a year now, including at the Google Instant press conference yesterday. Google — the most popular search engine in the world — doesn’t rank itself in the first page of results. In fact, as I look while writing this, it’s not even in the top 100 results.
That’s absurd. Yes, it plays well as a counter to the crowd that wants to argue that Google favors its own services. Look, we don’t even list ourselves! But it’s a flaw, an error, a bad set of results for Google not to be there.
What do we get for the first page?
- Wikipedia page
- Search Engine Guide page
- Search Engine Watch page
- Search Engine Colossus page
That’s 10 sites, 6 of them actual search engines and 4 of them about search engines.
AltaVista: 2nd Best Search Engine In The World, Says Google
Listing AltaVista makes no sense, from a relevancy standpoint. It WAS one of the earliest search engines, launched before Google. It had its own search technology. It later gave up the search space, got purchased by Yahoo and drew its results from Yahoo, which last month itself gave up its own search technology to carry listings from Bing.
AltaVista is Bing, twice removed, twice watered down. Yahoo — which says it’s still a player in search — has expended no effort to somehow maintain AltaVista as an appealing search engine. It’s a has-been, but one that Google considers the second best recommendation out of over 100 million pages that it considered ranking for a search on “search engine.”
That’s dumb. It’s also dumb to list both Dogpile and WebCrawler, meta search engines owned by the same company that do effectively the same thing. One is sufficient. Get some variety in there.
Old Man Links Rule
What’s happening is that Google rewards longevity. AltaVista was around ages ago, gained a lot of links over time, and in particular links from other aged sites. Google relies heavily on links to determine rankings, and links from old sites to old sites can trump anything. It’s like a link gerontocracy.
You can see this in action with three of the sites listed that are about search engines, rather than actually being search engines. All of them are older than Search Engine Land. All of them have gained links over time that help them do well for this query. One of them I know extremely well — Search Engine Watch.
For those not familiar, I created Search Engine Watch back in 1996 and sold it in 1997, but was hired to keep running it as editor. I did that through December 2006, when I left after failing to agree on a contract after it was sold again to its current owners.
The entire editorial staff of Search Engine Watch left with me that month, moving over here to Search Engine Land. All the editorial authority that Search Engine Watch had been built on was gone, moved to a new domain. But Google was unable to understand that.
(Note that this isn’t a reflection on the current staff of Search Engine Watch or any suggestion they shouldn’t be listed now. There’s a great group of talented people who now oversee the site there. I’m talking about what happened back when the old editorial staff left).
You Can’t Take Your Google Reputation With You
Google, which does things to fight spam and improve relevancy such as discrediting links in some cases when a domain was sold, was unable to transfer any reputation that I or my staff had built in covering the search engine industry to where we continued to cover that industry.
I think it’s fair to say that Search Engine Land is one of the top sites about search engines. But if you search on that topic, the site remains buried somewhere in the 40s on Google, behind even more old search engines and outdated articles. It ought to rank higher. Someone who knows a lot about search agreed with that when I talked with him about relevancy issues last year: Sergey Brin, cofounder of Google.
I’m acutely aware that there’s a lot of self-interest in writing about this. But I’m not alone in this type of situation. Consider Andy Beal, who once ran a site called Search Engine Lowdown and then had to abandon it, to start over at Marketing Pilgrim. None of his reputation transferred to that new domain. Google’s link-based system can’t handle that.
This will grow as a problem for others. You have any number of sites that switch hands, have personnel changes or change entirely in focus, yet Google doesn’t register a change, a transfer of authority. Over time, the new sites should grow. Make no mistake, Search Engine Land gets plenty of traffic from search engines. But after three years, it still hasn’t cracked the top ten for “search engines”?
Hey, maybe I need to take more of my own advice and get out there and do some link building! Interestingly, I never had to do much of that at Search Engine Watch. It had a good reputation and drew links naturally. That’s largely the same with Search Engine Land. We’re just a much newer site, and Google loves the old.
Actually, there’s an exception to that. Google also loves the fresh. It’s an odd spectrum. It rewards old links heavily, but then it also will give a boost to recently published content, for a short period of time. What it needs to do is focus on a better balance for that big gap in between.
Get Me Some Links!
Back to building links, if I really wanted to do it right, clearly I should start an SEO company and get all my clients to link to me, perhaps not even realizing it. After all, that’s what Google rewards for one of the most competitive searches out there, a search on SEO.
Here’s the screenshot from my presentation for results on SEO from back in January, which is virtually the same as you’ll see for the same search today:
I went through each result during my presentation to further explain my comments.
Wikipedia is required by law to have a top listing on Google, as I’ve often joked. But do we really need it twice? Especially when the second page is simply a listing of various things that SEO might refer to (and this still happens today on Google).
Google lists its own long-standing page about SEO, which I think is a very good choice. Other good choices to me were SEO Book and SEOmoz. I know both sites well. Both are good resources. I was less familiar with SEO Chat, hence my “suppose” about it being included.
SEO-USA.org seemed to make sense, if Google was going for diversity in results. Not everything called SEO refers to search engine optimization.
Down at the bottom, my questions about this set of results really kicked in. A tool I’d never heard of was ranking at number 10. Number 10 out of over 100 million possible matches. And at number nine, an SEO firm I’d never heard of. But they must be really good, for Google to rank them number nine, above other things such as the WebmasterWorld Forums (which, as I noted, didn’t make the list).
These Are High Quality Links?
Hmm. How did that company end up there? I brought up my next slide:
Turns out, this company had links from the oddest of places. Some seemed to be clients. I think one was a comment link from The News Tribune. As for the one from Brooke Skye Lesbians, that appeared to be part of the default links in the WordPress theme this site was using.
When I look at something like that, it’s hard to believe the “good links count” line that Google puts out. By the way, this particular site no longer ranks. Maybe my talk brought greater attention. Maybe something else happened. I noticed in writing this that some of those links are now gone. Either the company did a clean-up, or whatever tactics it used to gain links were relatively short term.
Looking At Links Yourself
But no matter. There were plenty of other firms to take its place, using similar tactics. Test it yourself. If you see an SEO firm listed, simply take the URL of the firm and search for it with the command “link:” in front of it, like this:
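For anyone who wants to check more than a handful of firms, here is a minimal sketch in Python of building that query. It assumes the “link:” operator works as described above and that results are reached through the usual google.com/search address with a q parameter. The function names are my own invention for illustration, and example.com is a placeholder, not any firm from the results discussed here.

```python
from urllib.parse import urlencode


def link_query(site: str) -> str:
    """Return the query text: the "link:" operator followed by the site."""
    return "link:" + site.strip()


def search_url(site: str) -> str:
    """Build a search URL for the backlink query (URL shape is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": link_query(site)})


if __name__ == "__main__":
    # example.com is a placeholder domain, used only for illustration.
    print(link_query("example.com"))
    print(search_url("example.com"))
```

Typing the query text straight into the search box does the same job. The function only saves retyping when you are working through a list of sites.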
Especially be sure to do this with some of the companies on the second and third page of results. Some at Google will tell you that what’s listed on the second or third page of results isn’t that important, since few people go past the first page of results. I totally agree about that behavior. And yet, what Google considers to be the 11th or 15th or 25th most important page on a subject out of millions of choices still ought to be damn good. And often, they’re not. Often, they’re garbage.
Do some of those link searches. You don’t need to be an old fart SEO to understand when you’re seeing really odd backlink profiles, and to understand that getting links from anywhere still seems to work, and for a topic that Google ought to have under extremely tight scrutiny: SEO.
When poor links like this can produce top rankings, it gets harder and harder to convince people that they should focus on content, that content wins out. I still believe that, by the way. If you’re in it for the long-term, a focus on content is right, in my opinion. But you can see why so many get tempted by the short term gains. They can work.
Broken Things Even Rank
Here’s another search to try: pagerank. Run through some of the PageRank tools that are listed in the first few pages of results. Many of them simply don’t work. But they’re old, been out there, and Google still rewards them. Don’t know much about PageRank? Read my article: What Is Google PageRank? A Guide For Searchers & Webmasters. It’s pretty comprehensive and still fairly fresh. Pity it’s so hard to find on Google.
I could do this all day. Give me a query, and in a few minutes I can usually find at least one result that doesn’t match the quality you’d expect to be in the first page of results on Google. If it’s an area I’m an expert in, I can do it even faster, and find more outliers. And if you go to the second page of results, it can sometimes be laughable.
Saved By The Focus On First
Google survives this because for the most part, a few good answers are good enough. As Google research director Peter Norvig said recently in a Slate interview:
If I do a search of the New York Times, I want nytimes.com to be the top result. But what should the 10th result be? There is no right answer to that. If a hardware error means we dropped one result and somebody had a different result at No. 10, there’s no way of saying that’s right or wrong.
Actually, there can be results at number 10 that plenty of people would agree are wrong for various reasons, and some even higher than that, when talking about non-navigational queries. But I agree with Norvig in general. Sometimes having enough good things is good enough, given what a searcher perceives.
Beyond perception, Google has another way to isolate itself even further from the bad picks that slip through: Google Personalized Search.
Personalized search starts to elevate the sites you like. If you ego search, it’s fantastic — your sites can do really well. You’re probably not aware that your results are personalized. You’re probably not aware that others are seeing results you might think aren’t so great. So you’re happy.
Now add Google Instant into the mix. At its press conference, Google emphasized how people would move their eyes from what they entered into the search box to the first result that was listed, using that first result as a way to judge whether all the results they might get matched their query. Google’s really just got to make that first result hum, for most people, most of the time. If results 2-10 are so-so, it’s not a mission critical matter.
It shouldn’t be that way, however. We ought to get 10 solid results on the first page. That’s what I expect from Google. But maybe I expect too much. Maybe good is good enough, especially given how people search.
My Wish List
Still, I hope. I’d like all 10 results for a query to be absolutely great. I’d also like them to be from 10 different sites. It’s long overdue to end the “indented” listings where two pages from a particular site might show, one indented under the other, as with Wikipedia in my SEO example above.
In addition, if you actually do click to go to the second page of results, the quality of those listings still ought to be pretty high. They shouldn’t be laughable. Otherwise, don’t give me a second page of results.
Further, if you are going to “page” your way through results, Google needs to stop serving up pages from the same web sites that you’ve already rejected. For example, Google might show me a Wikipedia page (or two) on its first page of results. If I go to the second page of results, Wikipedia might show up again. And again and again, as I drill into the results. If I bypassed it the first time, I don’t want it continually shoved in my face. Give me something different!
Finally, for years Google has deliberately degraded the backlink data it shares. You can’t look up a site on Google and see all the people linking to it. It has done this because, we’ve been told, potentially people might abuse this to game Google’s algorithm.
News flash. They abuse you already. All suppressing that data does is make it impossible for some of the most knowledgeable people outside of Google to really understand exactly how Google might be gamed in violation of its guidelines. If Google wants people to report spam, then give them the proper tools. Full link data now. The time is overdue.