Of Climategate, Googlegate & When Stories Get Too Long

Daily Telegraph writer James Delingpole got worked up yesterday because his colleague Christopher Booker’s story on the “Climategate” scandal mysteriously disappeared from Google. Skullduggery, he pondered? Nothing so dramatic, says Google. The article’s page simply grew too large in file size to stay in Google News.

Let’s do the breakdown. Booker’s story of November 28 covered the controversy over how academics at the University of East Anglia were apparently trying to keep anti-global warming views from other academics from getting widespread attention.

Booker’s story, Climate change: this is the worst scientific scandal of our generation, even had a Google connection right from the start, leading off:

A week after my colleague James Delingpole, on his Telegraph blog, coined the term “Climategate” to describe the scandal revealed by the leaked emails from the University of East Anglia’s Climatic Research Unit, Google was showing that the word now appears across the internet more than nine million times.

Or maybe 1 million times, if you search using +climategate to eliminate possible synonyms and alternative spellings that may or may not be related. Then again, over at Bing, it’s 54 million matches, dropping to 1 million also if you do +climategate. Search engine number counts are slippery little devils, and I don’t recommend citing them as proof of anything.

But still, 1 million, 10 million, let’s not quibble. This was a big story, whether or not you give it the “Climategate” name that Delingpole coined. So big that when Booker’s article disappeared from Google News, Delingpole suspected the worst, writing:

What is going on at Google? I only ask because last night when I typed “Global Warming” into Google News the top item was Christopher Booker’s superb analysis of the Climategate scandal.

It’s still the most-read article of the Telegraph’s entire online operation – 430 comments and counting – yet mysteriously when you try the same search now it doesn’t even feature. Instead, the top-featured item is a blogger pushing Al Gore’s AGW agenda.

Perhaps there’s nothing sinister in this. Perhaps some Google-savvy reader can enlighten me…..

UPDATE: Richard North has some interesting thoughts on this. He too suspects some sort of skullduggery.

I’m quoting his entire piece, because I’m going to dissect it bit by bit for how absurd it is, before I even get to the official Google explanation. And hat tip to David Dalka for alerting me to this story, by the way.

Wow, the story is no longer the top item? Well, stories at the Daily Telegraph’s site itself change throughout the day. Heck, the Daily Telegraph’s print edition changes each day. And so, too, does Google News change. At least hourly, in fact. Here are two articles we’ve published this past month that explain more about this:

Shouldn’t Delingpole have known this before ringing the alarm bells? After all, he appears familiar with Google News, in that he understood when Booker’s article appeared there. Surely he’s seen other articles move on and off? And working for the Daily Telegraph, surely he could actually ask someone on staff who’s familiar with its presence in Google for some further background (I’m virtually certain they have someone like this).

But no. Instead, it has to be painted as a plot. Look, Booker’s article is gone and in its place, conveniently, a pro-Gore blogger pushing that global warming is real.

Let’s get this straight. There’s no lack of eyes on Google News. There’s also no lack of story selection it could edit, if it wanted to. Perhaps some stories that are critical of Google, maybe? Maybe some stories to push some particular Google-liberal-whatever agenda in its home country of the United States, maybe?

No, instead what Google does is decide to wipe out one particular article on the global warming issue. That’s where it’s going to shoot its credibility wad. Oh, and make sure to do it to the Daily Telegraph, which has attacked Google for showing its stories. That’ll introduce some nice irony. When the Telegraph complains about the missing story, Google can just say “Oh, thought you didn’t want us destroying your business model.”

Yeah, that’s the ticket. Yeah, kill that one story, and Google will somehow manage to keep the truth from getting out there. Because it’s not like there aren’t those other 1 million to 54 million pages that mention “Climategate” on the web, depending on which count you want to believe.

Oh, but Delingpole says maybe it’s nothing. But then again, he notes someone else is exploring the issue and “too suspects some sort of skullduggery.” In other words, Delingpole gives the impression he believes there IS skullduggery, and the implication is that Google’s to blame.

That other person, Richard North, actually doesn’t blame Google over the mystery but instead wonders if someone has hacked the Daily Telegraph site and managed to get this one page blocked. He writes:

This cannot be accidental – there is a quite deliberate attempt to prevent this piece being listed. Repeating the exercise on Bing.com and Yahoo.co.uk news pages gets similar nil results. Yet other headlines from comment pieces from The Sunday Telegraph show up immediately.

James Dellingpole has picked up the problem (great minds) but my guess is that this isn’t a Google issue. The problem probably lies closer to home – there looks to be an enemy in the camp, who has probably been using this, or something like it.

The same piece not found in three different search engines? Yes, that’s odd. It’s very much a classic sign of an indexing problem. But if you had access to block one story, you’d probably try to block many of them. Plus, let’s be reasonable, blocking Booker’s story wouldn’t keep his particular views from getting out.

That’s particularly the case in that when I looked today, I couldn’t find the story at the Daily Telegraph, but I did find copies of the identical story syndicated on other sites. I also found it right at the top of regular Google in a search for it by the headline. So it wasn’t blocked from Google. It just wasn’t showing in Google News.

That’s odd, so I checked with Google. Remember Delingpole talking about all those comments the story got? That’s the culprit, Google says. There were so many that the story ballooned over the 1MB size, causing it to be dropped from Google News as too large (too large in file size, not too big of a story topic!). From the statement I was sent:

The article attracted so many comments that it exceeded a threshold for the page being too large (it’s more than 1.3 MB of HTML at this point). We’re looking at whether it makes sense to allow larger pages in the future. As with Google Search, our goal for Google News is to give users the most relevant, objective results, which is why we generate them automatically and without human intervention.

Google web search used to be this way many years ago. It used to index the first 101K of an article and ignore the rest. Of course, pages that were too big still got listed, unlike what happens with Google News, apparently.
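The difference between those two approaches, dropping an oversized page entirely versus indexing only its first chunk, can be sketched in a few lines of Python. This is purely illustrative: the function names are made up, the byte limits are a rough reading of the “1MB” and “101K” figures above, and none of it is Google’s actual code.

```python
# Hypothetical illustration only: names and exact byte limits are assumptions.
NEWS_MAX_BYTES = 1_000_000     # assumed reading of the "1MB" Google News limit
WEB_INDEX_BYTES = 101 * 1024   # assumed reading of the old "101K" web search cutoff


def news_accepts(html: bytes) -> bool:
    """Drop-style policy: a page over the limit is excluded entirely."""
    return len(html) <= NEWS_MAX_BYTES


def web_indexable_part(html: bytes) -> bytes:
    """Truncate-style policy: the page stays listed, but only its start is indexed."""
    return html[:WEB_INDEX_BYTES]


if __name__ == "__main__":
    article = b"a" * 200_000       # stand-in for the story itself
    comments = b"c" * 1_100_000    # stand-in for hundreds of reader comments
    page = article + comments      # roughly 1.3MB in total

    print(news_accepts(article))   # the story alone fits under the limit
    print(news_accepts(page))      # the story plus comments does not
    print(len(web_indexable_part(page)))
```

Under the first policy, reader comments alone can push an otherwise acceptable story out. Under the second, the story stays listed and only the tail end of the comments goes unread.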

Now if you want some mysteries, here are two. First, the article is flagged by the Daily Telegraph with a meta robots tag that tells Google not to show a cached version. And yet, I can see a cached copy. What’s up with that?
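For reference, here’s roughly how a crawler could check for that kind of tag. It’s a sketch, assuming the page uses the standard noarchive directive in a meta robots tag. The sample markup and the names RobotsMetaParser and allows_cached_copy are hypothetical, and the Telegraph’s actual markup isn’t reproduced here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def allows_cached_copy(html: str) -> bool:
    """Return False when the page asks search engines not to show a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" not in parser.directives


if __name__ == "__main__":
    # Assumed sample markup, for illustration only.
    sample = '<html><head><meta name="robots" content="noarchive"></head></html>'
    print(allows_cached_copy(sample))
```

A crawler that honors the directive would skip showing a cached link for any page where this check comes back False, which is why seeing a cached copy here is a puzzle.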

Also, a known issue with Google News is that once it visits a news story, it doesn’t come back for updates. So how did it realize the story got too big for inclusion? (Postscript: Brent Payne, who oversees SEO for the Tribune papers, tweets that this is no longer an issue).

I’ve got follow-up questions out to Google on both of these issues. Perhaps they’ll have a convenient explanation to cover up any lingering doubts that could turn this into a conspiracy.

Seriously, there probably are good explanations. And if I sound harsh on Delingpole, it’s because I get stories like this all the time where ordinary common sense should eliminate the conspiracy theories. Someone’s little-known page on a little-cared-about topic goes missing, and it turns into a “Google’s out to get me” situation. As if Google even knew who they were.

In this case, it’s a well-known story on a politically-charged topic. But it’s not the first well-known story on a politically-charged topic where Google might have felt temptation to assert editorial control. So why would it start here? And why with just one story? And with all the brains at Google, fully aware of how news flows on the web, would they really be stupid enough to think no one would notice?

No, these things don’t add up. But having to debunk what the Daily Telegraph could have investigated itself, rather than just blogged and alleged, leaves me kind of grumpy. Delingpole doesn’t mention trying to contact Google in any way in his piece. There’s no “waiting to hear back from Google” or anything like that.

Maybe a Matt cartoon on the entire Googlegate affair would cheer me up.

Postscript: A comment below, as well as this email I received, raise an issue with the related searches (called Google Suggest) that appear when you start to type in the search box:

On 11/25/09, I could type “cli” into Google and “climategate” would pop up as the top suggestion, with around 3 million results. The next day, I could type as much as “climategat” and no suggestions whatsoever. Someone at Google had deleted it as a search suggestion. Then Sunday afternoon, it was back again as a suggestion with around 13.3 million results. Today, it has disappeared again as a search suggestion and only 11.4 million results.

I checked with Google and got back this statement:

Google has not ever removed the query [climategate] or variations of the query from Google Suggest.

Google Suggest uses a variety of algorithms in order to come up with relevant suggestions while the user is typing. We do remove certain clearly pornographic or hateful or malicious slur terms from Suggest.

My assumption is that on one day, if a lot of people were searching for climategate, then that might appear as a suggestion. Then if queries dropped off, the suggestion might go away. Then it might return if more people started searching again. I’m checking to see if I can get more clarification.
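That assumption can be sketched as a simple popularity cutoff. To be clear, this is a guess at the general idea, not how Google Suggest actually works: the threshold, the daily query counts and the function name are all invented for illustration.

```python
# Hypothetical illustration only: the threshold and all counts are invented.
MIN_DAILY_QUERIES = 1000


def is_suggested(daily_query_count: int, threshold: int = MIN_DAILY_QUERIES) -> bool:
    """A query shows up as a suggestion only while it is popular enough."""
    return daily_query_count >= threshold


if __name__ == "__main__":
    # Made-up daily search volumes for a single query over four days.
    daily_counts = {"day 1": 5000, "day 2": 400, "day 3": 3000, "day 4": 700}
    for day, count in daily_counts.items():
        status = "suggested" if is_suggested(count) else "not suggested"
        print(f"{day}: {count} queries, {status}")
```

With a cutoff like this, a term can appear, vanish and reappear purely because search volume moved across the line, with no one at Google touching anything.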


About The Author: Danny Sullivan is a Founding Editor of Search Engine Land. He’s a widely cited authority on search engines and search marketing issues who has covered the space since 1996. Danny also serves as Chief Content Officer for Third Door Media, which publishes Search Engine Land and produces the SMX: Search Marketing Expo conference series. He has a personal blog called Daggle (and keeps his disclosures page there). He can be found on Facebook and Google+, and microblogs on Twitter as @dannysullivan.

  • sierra

    While the nefarious interpretation is clearly unwarranted, thresholding pages by size is stupid. If a page has over 1,000 comments, it indicates some level of popularity, engagement, and relevance. Distinguishing the core content from the comments is trivial. Why doesn’t Google perform this simple analysis?

  • todderchek

    Thanks, Danny. I don’t believe there is necessarily a conspiracy, but there are two other components to the story for which I haven’t heard a reasonable explanation. Until Thursday, Google search engine autosuggested “climategate” when the first few letters were typed, which disappeared around Friday. It is now back on, though only after typing “climateg”, whereas before “clim” was enough to provoke the suggestion. In a similar vein, YouTube (owned by Google) autosuggested “hide the decline”, a popular video mocking climate scientists involved in the scandal until about the same time. It was also removed from the most viewed picks, even though at the time it had over 200,000 views in two days and the views beat the socks off of other videos listed.

  • tzwjwc

    Nice explanation, but let’s be reasonable. The ‘Climax Blues Band’ is autosuggested and Climategate is not. Cynic or not, that doesn’t pass the common sense test. Something is up.

  • painlord2k

    I agree with the article, Google is probably innocent.
    You could find “climategate” auto-suggested in Google, with different languages set.
    But not England.
    As we don’t know the algorithms used to compute the auto-suggests, the ranking and so, it is entirely probable that “climategate” was dropped from the autosuggest in UK and lesser than this in US because it went from no exist to very popular in few hours/days. This, probably, raised a red flag somewhere, that dropped the word from a few places. The effect appear to be more marked in UK, where they use “climategate” more than in US where they gave it other names also. In Japan it was suggested, like in Italy.
    If they wanted to censure “climategate” they would have censured it everywhere, not only in UK and partially in US.
    Anyway, “never blame evil when incompetence can be a reason”.

  • kukahat

    comedy time of the day:
    maybe it was Hitler behind global warming? http://www.youtube.com/watch?v=jGdbHW9Nlds
