When Google News Fails, Here’s How To Fix It

In today’s world of instant gratification, with Twitter often “scooping” traditional news sources, we still turn to professional journalists for accurate, timely news, and confirmation of the events that transpired. While “breaking” news offers instant awareness, we still want to read news accounts reported by trained pros, who have dug deeply for facts and have published stories that have been vetted by qualified editors.

Nonetheless, we want “fresh” news, and increasingly we want to sample viewpoints from a diverse range of sources. We definitely don’t want yesterday’s “fish wrappers,” as dated print newspapers were once called. So we turn to online news aggregators, and one of the most popular, with more than a billion users per week, according to Google, is Google News. With good reason: Google News offers 72 editions in 30 languages, drawing content from more than 50,000 sources.

Only one problem: In recent months, many of the “fresh” news stories featured on the Google News homepage have been days, or even months, out of date.

For example, here were the two “top” stories in the Technology category on December 2:

[Screenshot: top two Technology stories on Google News, December 2]

While the Microsoft story was relatively fresh, the Apple iTunes story was four days old.

Similarly, on November 24, the two top stories for the Science category:

[Screenshot: top two Science stories on Google News, November 24]

The space station story was two days old; the climate change story had been published five days previously. “Current” news? Yes. “Fresh” news? Not.

So what’s the story?

All The News That’s Fit To Automate

Google finds and indexes news using the same crawler it uses for other web content (it previously used a specialized crawler called Googlebot-news). At the bottom of the Google News homepage, you see this message: “The selection and placement of stories on this page were determined automatically by a computer program. The time or date displayed (including in the Timeline of Articles feature) reflects when an article was added to or updated in Google News.”

But crawling and indexing news is just part of the story. Many other factors determine what you see on the Google News homepage apart from the date and time a news story was “added to or updated in Google News.”

For one thing, thanks to personalization and your web history, most “editions” of Google News are somewhat tailored to each individual user. This means that if you’ve consistently read stories about climate change, for example, Google’s personalization algorithms may select sources you’ve read or points of view you’ve previously favored rather than present you with the freshest article in its news index.

Other factors play a role in what’s selected for your Google News homepage, including:

  • Local news, determined largely by your computer IP address, and the proximity of local news sources near you.
  • Editors’ Picks, a list of the top five stories for a particular publication that are self-selected and submitted to Google via a custom RSS feed.
  • Authorship, an article associated with a Google profile of a journalist.
  • Social discussions in Google+ (but not from other social sites, such as Twitter or Facebook).

All of these signals can come into play in determining the stories you see on your version of the Google News homepage. And, since news is constantly changing and Google is also constantly updating its index, your “front page” is anything but static.

At Google, Getting A Good Date Can Be A Problem

One of the most difficult challenges Google faces is in determining the “true date” of a web page or news story. While there are structured metadata standards (such as Schema.org types) that allow publishers to associate a specific date and time with a web page or news story, very few organizations are using them. So Google must rely on other clues, such as the date when a webpage file was saved or changed on a web server, or on-page dates inserted by either a writer or a perhaps less-than-reliable bit of code.
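To make the structured-metadata option concrete, here is a minimal Python sketch of how a script might read a publisher's declared dates. It assumes the publisher embeds a Schema.org NewsArticle as JSON-LD with the datePublished and dateModified properties; the sample page, headline, dates, and function name are invented for illustration, and nothing here describes how Google actually extracts dates.

```python
import json
import re
from datetime import datetime

# Hypothetical page fragment; the headline and dates are made up for illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Example headline",
  "datePublished": "2012-07-20T06:15:00-06:00",
  "dateModified": "2012-11-29T09:30:00-07:00"
}
</script>
</head><body>...</body></html>
"""

# Matches the contents of each JSON-LD script element.
JSON_LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def extract_article_dates(html):
    """Return (published, modified) from the first JSON-LD object that
    declares datePublished; either value may be None."""
    for match in JSON_LD_RE.finditer(html):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than fail
        if not isinstance(data, dict):
            continue
        published = data.get("datePublished")
        modified = data.get("dateModified")
        if published:
            return (
                datetime.fromisoformat(published),
                datetime.fromisoformat(modified) if modified else None,
            )
    return (None, None)


published, modified = extract_article_dates(SAMPLE_HTML)
print("published:", published.isoformat())
print("modified: ", modified.isoformat())
```

When a page declares both values, a consumer can treat datePublished as the story's “true date” and dateModified as a separate signal, instead of inferring either from file timestamps.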

With web pages, Google generally wants to display the freshest version of a page. Once a news story has been published, however, it seldom changes (apart from corrections or minor updates). Instead, as events unfold, newer stories are written about the news event – and it’s these fresh stories that readers want, rather than earlier versions.

Problem: Some content management systems “update” news stories with newer “related” headlines. Google’s crawler detects these minor changes, and then must make a judgment call about whether the change warrants making the story “fresh” again. In most cases, minor changes are disregarded, but things can get really weird thanks to Google’s Editors’ Picks feature. These are headlines that a publication selects itself and submits to Google via a custom RSS feed. Here’s what the Editors’ Picks from my local paper, the Denver Post, looked like on the Google News homepage on November 29:

[Screenshot: Denver Post Editors’ Picks on the Google News homepage, November 29]

See that top story on the Aurora shooting? It was published more than four months ago, and was a Denver Post Editors’ Pick in July (again, note that this was a story specifically flagged by the Denver Post, not something that Google’s crawler identified as an important story). The story itself hasn’t changed since then; however, the Denver Post content management system “updated” the page with a relatively recent related headline. Google’s news crawler detected this minor change, and thus the story became a top headline again, even though it was months out of date. Most of the other “top stories” were also days out of date.
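The Denver Post case shows why a change to a page is not the same as a change to a story. The Python sketch below illustrates one possible way to tell the two apart: fingerprint only the article body and ignore surrounding modules such as related headlines. The page dictionaries, field names, and function names are all hypothetical, and this is an illustration of the idea, not a description of Google’s crawler.

```python
import hashlib


def body_fingerprint(page):
    """Hash only the article body, after collapsing whitespace and case,
    so that changes elsewhere on the page do not affect the result."""
    normalized = " ".join(page["body"].split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def story_changed(old_page, new_page):
    """True only when the article body itself differs between two crawls."""
    return body_fingerprint(old_page) != body_fingerprint(new_page)


# Hypothetical crawl snapshots of the same story, months apart.
july_crawl = {
    "body": "Original report on the event.",
    "related_headlines": ["Earlier related story"],
}
november_crawl = {
    "body": "Original report on the event.",
    "related_headlines": ["Much newer related story"],
}

# The related-headlines module changed, but the story did not.
print(story_changed(july_crawl, november_crawl))
```

Under this approach, the page in the example would not be treated as fresh, because only the related-headlines module differs between the two snapshots.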

This shouldn’t have happened – Google requires editors’ pick feeds to be updated every 48 hours. In the case of the Post, editors’ pick stories are flagged manually but then added automatically to the RSS feed submitted to Google. According to the Post, their content management system was recently updated and had a few glitches with feed management that they’ve subsequently ironed out.
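A publisher could guard against this kind of glitch with a simple freshness check on its own feed. The Python sketch below assumes a plain RSS 2.0 feed whose items carry pubDate elements, and it treats the 48-hour window mentioned above as the limit. The sample feed, titles, and function names are hypothetical, and Google’s actual Editors’ Picks feed requirements may differ.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: the feed must contain an item newer than this window.
MAX_AGE = timedelta(hours=48)

# Hypothetical feed; titles and dates are made up for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Editors' Picks</title>
  <item><title>Older story</title>
    <pubDate>Fri, 20 Jul 2012 12:00:00 GMT</pubDate></item>
  <item><title>Newer story</title>
    <pubDate>Wed, 28 Nov 2012 18:00:00 GMT</pubDate></item>
</channel></rss>"""


def newest_pub_date(feed_xml):
    """Return the most recent pubDate in the feed, or None if there are none."""
    root = ET.fromstring(feed_xml)
    dates = [
        parsedate_to_datetime(el.text.strip())
        for el in root.iter("pubDate")
        if el.text
    ]
    return max(dates) if dates else None


def feed_is_stale(feed_xml, now, max_age=MAX_AGE):
    """True when the feed has no dated items or its newest item is too old."""
    newest = newest_pub_date(feed_xml)
    return newest is None or now - newest > max_age


check_time = datetime(2012, 11, 29, 12, 0, tzinfo=timezone.utc)
print(feed_is_stale(SAMPLE_FEED, check_time))
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the check easy to test against known dates.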

So yes, from time to time you’ll see “stale” news appearing on the Google News homepage. As with anything automated, it’s virtually impossible to anticipate every glitch. The good news is that Google provides many tools that let you influence what you see and take some control over what’s largely an automated, behind-the-scenes process.

Customizing Google News

Google is already doing a lot behind the scenes to tailor your news experience. Foremost is personalization, but you can go beyond what Google automatically detects and manually specify preferences for things like location, news sources and other factors that are important to you. Check out Google News personalization basics for details on what you can change and how.

What can you do to make personalization even more personal?

For starters, you can customize the sections you see, adding topics that interest you – news about search marketing, sports teams, local politics – pretty much anything you can search for. You can also add ready-made topics from the Google News Custom sections directory, saving you the time and effort of fine-tuning your preferences.

Want the freshest, most in-depth coverage Google News has to offer on a particular news story? Dig deeper by clicking on its expandable story box. To do this, mouse over the right side of a headline to display the “Click to see related articles” button:

[Screenshot: the “Click to see related articles” button on a Google News headline]

Click on this, and you’ll see more headlines, images, related links, and yet another link, “see realtime coverage”:

[Screenshot: an expanded Google News story box]

This is a hidden gem – click on “see realtime coverage” and you’ll get streaming headlines as Google crawls and adds them to its index. But that’s not all – you’ll also see a timeline of the number of sources covering the story over the past month; links to “In Depth” coverage (similar to the “Spotlight” stories on the main news homepage menu), which displays stories that may not be super-fresh but that offer deep or longer-lasting insight; and, in some cases, related links (for example, in the story about North Korea, links to stories from South Korean news sources).

Prefer your news on the run? As with most sites, the mobile version of Google News operates somewhat differently from the desktop version, and it varies subtly depending on your device. Check out browsing and searching mobile news, a guide to personalization for your device, and this list of supported devices for tips on how to best access news on the go.

Be sure to also check out advanced Google News search. Although it’s not easy to find, Google does offer advanced news search that’s pretty similar to other advanced search options, including the ability to limit your search by quasi-Boolean operators, location of words in articles, source, and date added to the index. Regular Google News search only allows you to search articles from the past 30 days; advanced news search retrieves content from the defunct Google News archives, with articles dating back in some cases to the 1800s.

Finally, if you consider it “news” you can also access Google+ posts in Google News. You’ll need an account and must be signed in, but once you’ve set things up you’ll see “relevant” posts on the topics you’re following in Google News.

Want to learn more about Google News and how it works? Check out Google’s News Help, or Search Engine Land’s extensive Google News archive of coverage.




About the author

Chris Sherman
Contributor
Chris Sherman (@CJSherman) is a founding editor of Search Engine Land and is now retired.
