Google changes Core Web Vitals metrics; How to use lab and field data for optimization



Google’s mobile page experience ranking factor, originally thought to be part of May’s Core Ranking Update, will be rolling out in a matter of days. Are you ready? If not, take heart.

There are several weeks reserved for observation and adjustment: “Page experience won’t play its full role as part of those systems until the end of August,” said Google. After that, the desktop page experience ranking factor will be next to roll out and will be fully launched before the end of the year.

About the Core Web Vitals metrics

Core Web Vitals are broad performance metrics related to speed that factor into achieving a stable, viewable, and usable experience within a device viewport, including offscreen content up to 9,000 vertical pixels. Faster is better, which typically means lower metric values are better.

Field data, which is considered for rankings, will vary depending on real-world user device power, screen size, and network connectivity. Lab data uses default values for these and (except in the case of PageSpeed Insights) can be calibrated by developers to simulate all manner of conditions.

Lab data is not considered for rankings.

Core Web Vitals performance metrics are complex and imperfect, and fixing page experience snags can be perplexing. Even now, Google has made last-minute changes, upgrading all its tools with sharpened formulas in response to cases brought by developers in the field.

You can generally look forward to improved scores if you have been affected by the metrics that have undergone reengineering. Particularly helpful are modifications to the way Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS) are measured.

Changes to First Contentful Paint

The threshold for achieving a “good” First Contentful Paint (FCP) score increased from 1.0 to 1.8 seconds. FCP is not itself a Core Web Vital, but components of it contribute to the ones that are. It accounts for Time to First Byte, more a reflection of your server response time than anything you manipulate directly with code, plus the time it takes to process render-blocking resources such as CSS, which you can.

Changes to Largest Contentful Paint

LCP, a significant milestone in the lifecycle of a page, originally didn’t include some offscreen elements. Now LCP pinpoints the largest element even if it is later removed from the page DOM after being discovered, or when several images of the same size all qualify.

Such situations occur when carousels load and cache content for offscreen slides. Another helpful modification is that background images are now ignored by LCP as well.
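If you want to see which element a page currently reports, a minimal sketch using the standard PerformanceObserver API (run from the browser console or a page script) will log each LCP candidate as the browser discovers it:

// Log LCP candidates as the browser discovers them.
new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    // Each entry is a candidate; the last one reported before user input is the page's LCP.
    console.log('LCP candidate at', Math.round(entry.startTime), 'ms:', entry.element);
  }
}).observe({type: 'largest-contentful-paint', buffered: true});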

Changes to Cumulative Layout Shift

To prevent situations like extremely long browsing sessions undermining CLS scores, layout shifts are now grouped into smaller session “windows,” capped at 5 seconds each and marked as ended by a 1-second gap without shifts. The page’s score is its worst window, in effect its worst 5 seconds of layout shifting.

That’s a much better representation of shifting than tallying wholly uncapped sessions, which can last 20 minutes or more and blow scores way out of proportion.
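To make the windowing concrete, here is a rough JavaScript sketch of the idea, an illustration of the logic rather than Chrome’s actual implementation:

// Illustration of session-window CLS: group shifts into windows that end after
// a 1-second gap or once the window spans 5 seconds; report the worst window.
function windowedCLS(shifts) {
  // shifts: [{time: msSinceNavigation, value: layoutShiftScore}, ...]
  let worst = 0;
  let windowStart = 0;
  let lastShift = -Infinity;
  let windowSum = 0;
  for (const {time, value} of shifts) {
    if (time - lastShift > 1000 || time - windowStart > 5000) {
      windowStart = time; // start a new window
      windowSum = 0;
    }
    windowSum += value;
    lastShift = time;
    worst = Math.max(worst, windowSum);
  }
  return worst;
}

// Three early shifts land in one window (0.10 total); a shift 20 seconds later
// starts a new window (0.04), so the reported CLS is roughly 0.10.
windowedCLS([
  {time: 500, value: 0.05},
  {time: 900, value: 0.02},
  {time: 1600, value: 0.03},
  {time: 22000, value: 0.04},
]);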

What’s old is new

Google won’t be using the old calculation as part of the page experience ranking factor. For outlier use cases, however, whole-session scores can still be useful. Because these data are retrievable by API, the old score calculation can live a second life for those who want it. You’ll be able to retrieve it independently or by querying Google’s open CrUX repository (SQL) as uncapped_cumulative_layout_shift.

-- Histogram bins of uncapped CLS for one origin, from the May 2021 CrUX dataset
SELECT
  uncapped_cls
FROM
  `chrome-ux-report.all.202105`,
  UNNEST(
    experimental.uncapped_cumulative_layout_shift.histogram.bin
  )
AS uncapped_cls
WHERE
  origin = 'https://searchengineland.com'

When PageSpeed Insights (PSI) data is useful

The trick is learning more than one official way to retrieve your scores, which is further complicated by how to think about the data you’re viewing. PageSpeed Insights (PSI), often highlighted by SEO practitioners, doesn’t provide enough information on its own to tell the whole story.

PSI is designed to give developers a comprehensive snapshot for troubleshooting performance problems. When available from CrUX, field data aggregated over the previous 4-week period is useful for making comparisons. When both lab and field data appear, you will almost certainly see a difference between the two.

Variance occurs naturally between testing sessions and when comparing tests from different devices and/or networks, so field data varies as widely as a given website’s audience. PSI field data therefore represents a range of data aggregated over the previous 28 days, up to the most recently completed full day’s worth of data.
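One way to see both kinds of data side by side is the public PageSpeed Insights API; this minimal sketch (with searchengineland.com standing in for your own URL) pulls the field CrUX metrics and one lab audit from the same response:

// Query the PageSpeed Insights v5 API for a URL and compare field vs. lab data.
const testUrl = 'https://searchengineland.com/';
const endpoint =
  'https://www.googleapis.com/pagespeedonline/v5/runPagespeed?strategy=mobile&url=' +
  encodeURIComponent(testUrl);

fetch(endpoint)
  .then((response) => response.json())
  .then((data) => {
    // Field data: CrUX metrics aggregated over the trailing 28 days (when available).
    console.log(data.loadingExperience && data.loadingExperience.metrics);
    // Lab data: a single simulated Lighthouse run performed for this request.
    console.log(data.lighthouseResult.audits['largest-contentful-paint']);
  });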

The CrUX of it

Google’s page experience ranking factor could conceivably rely on the same trailing 28-day aggregated scores. It’s unlikely to, however, as it would be far more performant to rely instead on the preceding month’s prepared 28-day BigQuery dataset. In that case, we can expect any ranking changes to take effect on the second Tuesday of each new month.

That is, BigQuery data for CrUX reports undergo a performance optimization process that prepares the preceding month’s data for public consumption. Such indexing, and possibly caching of certain query responses, allows CrUX users to query history all the way back to late 2017, when data was first collected. Announcements that the previous month is ready for queries arrive on the second Tuesday of each new month.

Lab data provides richer feedback

Lighthouse lab scores in PSI are “calibrated to be representative of your upper percentiles” for worst-case scenarios, such as underpowered browsers on sluggish networks. Google purposefully calibrates it this way so developers have richer feedback to more easily troubleshoot problem areas that can occur but are less common in the real world.

If lab scores were indicative of more average conditions, they wouldn’t reveal the performance bottlenecks developers need to see to improve page experience under stress conditions. Lighthouse outside of PSI, in DevTools or packaged via NPM as an open Node project, can be calibrated to simulate various scenarios.

Field data provides real-world usage examples

It’s incredibly important to read scores with an understanding of which data-gathering method produced them. In the case of PSI you may see both lab and field data, and the two shouldn’t be confused as the same thing. Field data is collected for CrUX reports, and you can collect it on your own as well. Field data is indicative of your audience as recorded by browsers during real-world usage of your website.

Field data is important because it is what Google uses for its page experience ranking factor. Field data scores will almost invariably be better than lab data scores for the same page. Field data is stable through time once prepared for long-term storage (and can be refined by several criteria), with recent data newly prepared on a monthly basis, while lab data can differ with each new test.

When browsers have the requisite permissions to transmit scores, field data gets sent and collected for use by PSI, by any Core Web Vitals tool that uses the open CrUX API, or by those who write Core Web Vitals JavaScript into web pages. The only way to examine field data in real time is to write that JavaScript yourself and collect it for your own use in a browser console or repository, or have it sent to Google Analytics.
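As a sketch of that approach, Google’s open-source web-vitals JavaScript library can report each metric as it becomes final; here the values are forwarded to Google Analytics, assuming the gtag.js snippet is already on the page:

// Collect field Core Web Vitals with the web-vitals library and send them to Google Analytics.
import {getCLS, getFID, getLCP} from 'web-vitals';

function sendToAnalytics({name, value, id}) {
  gtag('event', name, {
    // CLS is a small unitless score, so scale it up to survive integer rounding.
    value: Math.round(name === 'CLS' ? value * 1000 : value),
    event_label: id,        // groups multiple reports from the same page load
    non_interaction: true,  // keeps these events out of bounce rate calculations
  });
}

getCLS(sendToAnalytics);
getFID(sendToAnalytics);
getLCP(sendToAnalytics);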

Use lab data for optimization

The open-source Lighthouse project is what powers lab data. It is implemented in DevTools and can also be installed as a package that comes with its own command-line interface (CLI). Lighthouse in DevTools can be configured to simulate decreased or boosted power and speed relative to the “upper percentile” default.

You might want to simulate varying power and speed, for example, if you have the wherewithal to deliver more elaborate experiences above certain simulated thresholds as part of a progressive enhancement strategy.

CLI usage

Lab data is also available working from a local machine, perhaps integrated into a workflow as part of pre-production testing and a continuous integration process. This type of setup requires you to have the Chrome web browser and Node installed.

The CLI will spawn a Chrome browser process for access to the rendering engine and the Lighthouse library. It will tabulate the data returned by, and afterward kill, that Chrome process. Options for opening the resulting report and for calibrating device and network settings are available as command flags.

$ lighthouse <url> [OPTIONS]

The most common options will be --view, which automatically opens the report in your system default web browser; the --throttling flags for simulating different device environments; and --only-categories for limiting tests to the performance category, which is where the determinant factors for Core Web Vitals live. Although running the SEO tests is useful, many SEO improvements don’t move the needle for Core Web Vitals.
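For example, a command combining those options (with example.com standing in for your own URL) might look like:

$ lighthouse https://example.com --view --only-categories=performance --throttling-method=simulate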


Why we care

We really only care about these specific concerns if we’re developers learning to troubleshoot and improve performance factors that affect page experience. Even when we’re unconcerned about SEO, these factors are incredibly important for how our pages, and the applications that render them, including native applications with WebViews, are experienced by actual users in the field.

All too often, an application that gets heavy usage results in countless hours of frustration and can negatively impact the bottom line. The NYTimes native apps on phone-sized screens, for example, load content and script slowly under even the best network conditions. That results in delays when scrolling and clicking, even after a top story is rendered in the viewport. Constant layout shifting can further make the apps truly awful to navigate when adverts load at inconveniently delayed times.

Google has plenty of case studies which show the positive effect on revenue when performance fixes bring more positive experiences. If not for the value of NYTimes content, its apps might be further derided and suffer far less usage than they currently enjoy. The apps should definitely be improved, which would be a welcome change for so many readers.

If you are inclined to dive deep on Core Web Vitals, then be sure to attend our upcoming SMX Advanced conference sessions next week. For those of you who are developers, the SEO for Developers Workshops on the 17th and 18th explore these topics in greater depth and detail.




About the author

Detlef Johnson
Contributor
Detlef Johnson is the SEO for Developers Expert for Search Engine Land and SMX. He is also a member of the programming team for SMX events and writes the SEO for Developers series on Search Engine Land. Detlef is one of the original group of pioneering webmasters who established the professional SEO field more than 25 years ago. Since then he has worked for major search engine technology providers such as PositionTech, managed programming and marketing teams for Chicago Tribune, and advised numerous entities including several Fortune companies. Detlef lends a strong understanding of Technical SEO and a passion for Web development to company reports and special freelance services.
