How Google indexes passages of a page and what it means for SEO

Google will begin passage-based indexing later this year, starting in English. It is a ranking change, not an indexing change.


Google announced a slew of changes to search on Thursday. Here we delve deeper into the passage-based indexing announcement.

Passage-based indexing updates. “Very specific searches can be the hardest to get right,” said Google, “since sometimes the single sentence that answers your question might be buried deep in a web page. We’ve recently made a breakthrough in ranking and are now able to not just index web pages, but individual passages from the pages. By better understanding the relevancy of specific passages, not just the overall page, we can find that needle-in-a-haystack information you’re looking for.”

Google said passage-based indexing will affect 7% of search queries across all languages when fully rolled out globally.

What it looks like in search. Google provided these visuals to demonstrate the change:

With new passage understanding capabilities, Google can understand that the specific passage (R) is a lot more relevant to a specific query than a broader page on that topic (L).

Google said this at the 18:05 mark of the video: “We’ve recently made another breakthrough, and are now able to not just index webpages, but individual passages from those pages. This helps us find that needle in a haystack because now the whole of that one passage is relevant. So, for example, let’s say you search for something pretty niche like ‘how can I determine if my house windows are UV glass.’ This is a pretty tricky query, and we get lots of webpages that talk about UV glass and how you need a special film, but none of this really helps the layperson take action. Our new algorithm can zoom right into this one passage on a DIY forum that answers the question (apparently, you can use the reflection of a flame to tell) and ignores the rest of the posts on the page that aren’t quite as helpful. Now, you’re not gonna do this query necessarily, but we all look for very specific things sometimes. And starting next month, this technology will improve 7% of search queries across all languages, and that’s just the beginning.”

Is Google indexing sections or parts of pages?

We asked Google whether it is now indexing passages or sections of a page. It is not. Google still indexes full pages, but its systems will now consider the content and meaning of individual passages when determining what is most relevant, whereas previously they largely looked at the page overall, a Google spokesperson told us.

It is more of a ranking change than an indexing change

So indexing really has not changed here. It is more of a change to how Google ranks content based on what it finds on your web page. Google is not, I repeat, not, indexing individual passages on the page. It is, however, better at zeroing in on what is on the page and surfacing those passages for ranking purposes.

What signals does Google look at here?

Previously, Google’s systems would look at some of the “stronger signals about a page, for example, page titles or headings, to understand what results were most relevant to a query. While those are still important factors, this new system is helpful for identifying pages that have one individual section that matches particularly well to your query, even if the rest of the page is about a slightly different or overall less relevant topic,” Google told us.

Will header tags be more important?

Does this mean header tags or the equivalent are more important now? Google didn’t have the answer for me on this. But I suspect while title tags are pretty important signals, headers in this case might be more important when this rolls out. Again, Google generally does not talk about specific ranking signals and Google did not comment on headers as a ranking signal.

Google told us they have “always had an understanding of keywords and phrases in documents, but often things like page title were very strong signals that helped us provide the best overall pages.” Now Google can find that “needle in a haystack” and surface the most relevant result based on information within passages. Again, it is hard to say which specific signals are important here.

Isn’t this like Featured snippets?

How does this differ from featured snippets, where Google shows a passage of your content as an answer at the top of the Google search results? Google said its “systems determine the relevance of any web document via understanding of passages. Featured snippets, on the other hand, identifies the most relevant passage in a document we’ve overall determined to be relevant to the query.”
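To make that distinction concrete, here is a minimal sketch in Python. It is not Google's implementation: the word-overlap `score`, the function names `featured_snippet` and `document_relevance`, and the sample forum thread are all hypothetical, invented for illustration. The only point is the order of operations. A featured snippet picks a passage from a document that has already been judged relevant, while passage understanding feeds into how relevant the document is judged to be in the first place.

```python
import re

def score(query, text):
    """Toy relevance: fraction of the query's words that appear in the text."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    t = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(q & t) / len(q) if q else 0.0

def featured_snippet(query, doc):
    """Given a document ALREADY judged relevant, choose which passage to display."""
    return max(doc["passages"], key=lambda p: score(query, p))

def document_relevance(query, doc):
    """Let the best-matching passage determine how relevant the document is."""
    return max((score(query, p) for p in doc["passages"]), default=0.0)

# Hypothetical DIY forum thread, loosely based on Google's UV glass example.
THREAD = {
    "passages": [
        "I replaced my windows last year and love them.",
        "To tell if windows are UV glass, look at the reflection of a flame.",
        "Does anyone know a good contractor?",
    ]
}
QUERY = "how to tell if windows are uv glass"

if __name__ == "__main__":
    print(featured_snippet(QUERY, THREAD))
    print(document_relevance(QUERY, THREAD))
```

Both functions look at passages, but they answer different questions: one decides what to show from a page, the other decides how highly the page should rank.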

Where is this passages algorithm most useful?

Google said “this is helpful for queries where the specific bit of information the person is looking for is hidden in a single passage on a page that is not necessarily the main topic of that page.”

Let’s say someone searches [how does BERT work in google search]. Previously, Google might have returned a bunch of results that seem relevant overall, such as a news story about BERT coming to Google Search. That news story might not directly answer the question.

Now suppose you have a really broad page about, let’s say, how Google Search works, and that page contains one passage that actually explains how BERT works. Even though the rest of the page isn’t super relevant, and those other BERT and Google Search pages might seem more relevant, Google’s new systems can zoom in on that one bit and rank that page higher.
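The BERT scenario above can be sketched as a toy comparison. This is an illustration under loud assumptions, not how Google scores pages: the word-overlap measure, the function names, and both sample pages are hypothetical. It contrasts a score that looks only at a page-wide signal (the title) with one that also lets the single best passage count toward the page.

```python
import re

def tokens(text):
    """Lowercase word tokens; a crude stand-in for real language understanding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(query, text):
    """Fraction of the query's words that also appear in the text."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0

def page_level_score(query, page):
    """Judge the page by a page-wide signal only (here, its title)."""
    return overlap(query, page["title"])

def passage_aware_score(query, page):
    """Also let the single best-matching passage count toward the page."""
    best_passage = max((overlap(query, p) for p in page["passages"]), default=0.0)
    return max(page_level_score(query, page), best_passage)

def rank(query, pages, score):
    """Return page titles ordered from highest to lowest score."""
    ordered = sorted(pages, key=lambda page: score(query, page), reverse=True)
    return [page["title"] for page in ordered]

QUERY = "how does BERT work in google search"

# Hypothetical pages, invented for illustration.
PAGES = [
    {"title": "BERT is coming to Google Search",
     "passages": ["Google announced that BERT will roll out to Search this week."]},
    {"title": "Inside a search engine: crawling, indexing and ranking",
     "passages": ["Crawlers discover pages by following links.",
                  "How does BERT work in Google Search? "
                  "It looks at the words around each word in a query."]},
]

if __name__ == "__main__":
    print(rank(QUERY, PAGES, page_level_score))
    print(rank(QUERY, PAGES, passage_aware_score))
```

With the title-only score, the news story ranks first because its title shares more words with the query. With the passage-aware score, the broad page moves to the top because one of its passages matches the query almost word for word.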

Goes live later this year

Google said this will start rolling out later this year, beginning with English in the U.S., with more languages and locations to follow. Once it is rolled out globally, it will affect about 7% of queries on Google Search.


About the author

Barry Schwartz
Staff
Barry Schwartz is a Contributing Editor to Search Engine Land and a member of the programming team for SMX events. He owns RustyBrick, a NY based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry can be followed on Twitter here.
