Information gain: Here’s what this new SEO ‘buzzword’ really means

Here's what information gain means for machine learning, Google patents and information foraging theory – and how to leverage it.


The new buzzword in SEO is information gain. And like all new buzzwords, SEOs are throwing it around like we’ve just discovered fire.

But there’s a massive problem.

Information gain means different things to different people.

In this article, you’ll learn about information gain and how to use it to your advantage.

The 3 schools of information gain: Humans, machines and search engines

The term information gain is used in three contexts:

  • Machine learning.
  • A Google patent.
  • Information foraging theory.

In machine learning, information gain measures how much a split reduces uncertainty, and it is used to train decision trees. Unless you are a computer programmer, we can largely leave that can of worms unopened (for now).
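For the curious, here is a minimal sketch of the machine learning meaning. It assumes the common entropy-based definition: the information gain of a split is the entropy of the parent group minus the size-weighted entropy of the groups the split produces. The click data and the function names are hypothetical and only there for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical data: did a visitor click ("yes") or not ("no")?
parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]  # each group is pure
useless_split = [["yes", "no"], ["yes", "no"]]  # each group is as mixed as the parent

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

A split that separates the classes cleanly gains the most information, while a split that leaves every group as mixed as before gains none. A decision tree picks the split with the highest gain at each step.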

When SEOs talk about information gain, they mainly refer to the Google patent.

Google was granted a patent in 2022 covering an information gain score applied to documents.

This patent showed that Google had developed a way to measure the “sameness” of content and either promote or demote it accordingly.

This is a great way for Google to deal with content that is essentially unoriginal or simply copied from another source and reworded. 
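To make "sameness" concrete, here is a toy sketch. It is not the method described in the patent, which is not reproduced here. It simply assumes that novelty can be approximated by the share of a page's distinct words that a reader has not already seen in other documents. The function names and sample sentences are made up for illustration.

```python
def word_set(text):
    """Lowercase a text and return its set of distinct words."""
    return set(text.lower().split())


def novelty_score(candidate, already_seen):
    """Share of the candidate's distinct words that appear in none of the seen documents."""
    seen = set()
    for doc in already_seen:
        seen |= word_set(doc)
    words = word_set(candidate)
    if not words:
        return 0.0  # an empty page adds nothing new
    return len(words - seen) / len(words)


# Made-up sample texts, not real search results.
already_seen = ["march is the best time to visit jamaica"]
reworded = "the best time to visit jamaica is march"
original = "we visited in march and the beaches were quiet and the rain held off"

print(novelty_score(reworded, already_seen))  # 0.0: same words, new order
print(novelty_score(original, already_seen))
```

The reworded sentence scores zero because it adds no words the reader has not already seen, while the first-hand account scores much higher. A real system would compare meaning rather than raw words, but the principle is the same: rewording alone adds nothing.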

But what about information gain in relation to the information foraging theory?

Information foraging theory was documented in the book of the same name, written by Peter Pirolli.

It applies the models of how animals search for food (optimal foraging theory) to how humans search for information (which we’ll talk about later).

As you can see, we have three different meanings for the same term. 

With regard to SEO, the Google patent is fairly easy to understand – just make your content unique.

However, information foraging is more complex, so we need to examine it more thoroughly.

Why information foraging matters for SEO

Recently, Google started discussing information foraging theory in its Decoding Decisions report (the “messy middle”).

Indeed, information foraging theory seems to be the direction Google is heading, and to quote their report directly:

“An explosion in product choice and information has made it harder to feel confident about making the right decision.”

Or, to put it another way, there’s just too much information out there.

If we have too much information, it takes longer to make a purchase decision, and this isn’t good for anyone.

If you think about this, you can see why Google SGE might help.

With a generative AI response to a search query, a search user can grasp the subject immediately without needing to click through to a website.

This initial information should help a user to make their next search decision.

Take this result from a search in Perplexity.

Perplexity - Best gym shoes for bad knees

Within seconds, my knowledge of the best gym shoes for bad knees has increased, and I have plenty of links and suggestions to follow.

My next click will be to look at the suggested shoes, not to read another five blog articles.

If SGE works similarly, you can see how it will radically change commerce.

We’re no longer optimizing for Google. We’re optimizing for AI.

Dig deeper: LLM optimization: Can you influence generative AI outputs?



From SEO to information gain optimization 

Google has been involved in AI for a long time, and AI is part of many of its systems.

They used BERT to improve their understanding of language, and I’m sure many more systems are in use.

The point is that Google is trying to understand content to serve search engine users better. Therefore, Google itself is reading your content.

Sure, not like a human does, but they are reading it. 

So, it makes sense to increase the information Google gains from our content, just as we would for human readers.

In essence, we become information optimizers. 

Our job as SEOs is to continually increase the rate of information gain.

The rate of gain, explained

In information foraging theory, the rate of information gain is defined as:

  • Rate of gain = Information value / Cost associated with obtaining that information
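As a quick worked example of the formula above, the sketch below compares two hypothetical pages that carry the same information value at different reading costs. The value scores, the use of seconds as the cost unit and the function name are all assumptions made for illustration. The theory does not prescribe units.

```python
def rate_of_gain(information_value, cost):
    """Rate of gain = information value / cost of obtaining that information."""
    if cost <= 0:
        raise ValueError("cost must be greater than zero")
    return information_value / cost


# Hypothetical pages: same value (8 on an arbitrary scale), different cost in seconds.
scorecard_page = rate_of_gain(information_value=8, cost=20)
wall_of_text_page = rate_of_gain(information_value=8, cost=120)

print(scorecard_page > wall_of_text_page)  # True
```

Both pages hold the same information, but the one that can be scanned in 20 seconds delivers six times the rate of gain of the one that takes two minutes to read.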

You see, search engines incur a cost for indexing and retrieving documents, and humans incur a cost for finding and processing information too.

When we use our brains, we consume calories, and the body is highly efficient at not wasting them.

We use heuristics (mental shortcuts) to filter the world and make decisions.

Information foraging theory suggests that we seek to do the same. We attempt to gain as much information as possible from a source in as little time as possible.

To do this, we go through a five-stage process.

Goal

  • What information do we need?

Patch

  • We decide which source of information would best deliver our goal. This could mean that we go to Tripadvisor, TikTok, YouTube or any website or search engine that comes to mind.

Forage

  • Here, we search for the information we need on the platform of choice. For this example, we’ll stick with Google. You type keywords into the search engine to try to find the information you need.

Scent

  • When we head to search engines, we’re looking for the scent of good information sources: signals such as reviews, higher rankings and page titles that encourage clicks.
  • We click on sites, scan information and decide whether to invest time reading the resource.

Diet

  • We consume information from multiple sources before making decisions. This is what Google refers to as the messy middle of search. 
  • For brands and sites, being part of your consumers’ information diet increases the likelihood that they will come to depend on you for information and trust you.

As we know, that trust leads to purchases or increased clicks (which can lead to advertising or affiliate revenue). So this means that SEO should include optimizing around information scent.

But if you’ve read the above, you can see that Google search works in a similar way, just as a machine version.

Information optimization: The new science

If we’re going to optimize around information gain, we need to understand that it requires a greater understanding of two factors:

  • Machine learning. 
  • Human learning.

We already know that Google wants original, experience-based information from the best sources.

They also want to reduce the cost of extracting that information.

Yes, Google wants an easy life. So, how do we do this on a practical level?

Simply put, we make extracting information easier for both machines and humans (at the same time), and here’s how.

The optimal website maximizes the value gained per interaction 

Fast websites might matter, but contrary to what many think, speed alone isn’t enough. If the information gain rate from a website is low, or the perceived cost of getting the information is high, the person will leave.

Here’s an example.

I’ve asked ChatGPT for some information about a hotel in Paris. It gives me the information the best way it can.

ChatGPT output - Paris hotel

It gives a lot of information I can easily extract at a low cognitive cost. But how should a website deal with this?

Tripadvisor has a whole page dedicated to the hotel. Look at how they’ve optimized one section for information gain rate.

Tripadvisor - Paris hotel page

The content – which uses symbols and scorecards and lists room types – is designed for humans (and machines) to gain the most information in the least time and at the lowest cost.

And it’s this that we have to get our heads around to help search users.

But we need to destroy some myths around content.

Good content is context-based

I read a lot of good content, but most of it’s in my inbox in the form of blogs people have written that are not designed to gain traction from search.

Good content for SEO is wildly different.

When we search online, we have an emotional need state that requires solving.

Kantar and Google did some research a while ago.

Google & Kantar research on search intent

The study identified the need states above. Searchers came to search engines looking to have them resolved.

Some words that stand out across the need states are:

  • Quick.
  • Laser focused.
  • Specific phrases.
  • To the point.
  • Simplicity.
  • Uncomplicated.
  • Trust.
  • Ratings.
  • Reviews.
  • Competence.
  • Location.

These are the attributes that search users look for in content online.

Strikingly, we can see how Tripadvisor’s content displays these attributes, and we can also see how applying them to content would increase the information gain rate for humans and machines.

But how can we start to take the approach of information optimization to content?

Well, here’s a four-part process to get you started. 

Part 1: Content structure

Look at how your page should be structured for search to increase the information gain rate.

A good example is the Tui website:

Tui website - faceted search buttons

They’ve used faceted search “buttons” to help users find what they are looking for.

Consider how best to design your page for humans and search engines to increase the information gain.

UX matters, as does the information on the page.

Part 2: Information architecture 

Consider how you want your information to be structured for maximum information gain.

You might consider giving information early and quickly, for example:

“When is the best time to travel to Jamaica?” 

“March is the best time to travel to Jamaica.”

Look at your content and aim to add some, if not all, of the following attributes.

  • Quick.
  • Laser focused.
  • Specific phrases.
  • To the point.
  • Simplicity.
  • Uncomplicated.
  • Trust.
  • Ratings.
  • Reviews.
  • Competence.

Part 3: Content design

The next factor is the design of the content.

Consider how best to add value, such as adding unique images to your posts to help explain information or data.

Backlinko graphics

Backlinko uses images like the above to convey data in an interesting format.

This leads us to the final part.

Part 4: Content difference

If you do all of the above, you should have content that is very different from what already exists.

But if it isn’t, make sure you change that.

There are 1,000 different ways to say the same thing, but it takes creativity and careful thought to work out how best to present your unique angles and viewpoints.

But here’s a little challenge.

Head to a site like Backlinko or HubSpot and look at their content.

Find an article and apply the above four-part system, and think about how you would improve it based on your unique views or experience.

This could serve as a suitable workshop for agencies and in-house staff to explore information gain and how best to apply it.

Because in the era of generative content, information gain is king.


Contributing authors are invited to create content for Search Engine Land and are chosen for their expertise and contribution to the search community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. The opinions they express are their own.


About the author

Andrew Holland
Contributor
Andrew Holland is the Director of SEO at JBH, an award-winning Digital PR and SEO agency. His SEO background stems from a 17-year career in the police, where a number of years were spent within intelligence, utilizing SEO within internal systems to catch criminals. After developing chronic asthma, Andrew left the police and launched a freelance SEO and digital marketing career, and has worked with a wide range of clients, from SMEs to 8-figure businesses. Today, he directs the SEO delivery for one of the UK's fastest-growing Digital PR and SEO agencies, building organic visibility, trust and fame for consumer brands.
