5 ways to use the Wayback Machine for SEO

Use the tool to recover lost information and glean insights into the direction of competitor strategies.

Sometimes a simple tool can give you incredibly powerful insights. 

The Wayback Machine is one such tool.

The Wayback Machine captures snapshots of web pages over time and stores them in its public archive. Anyone can use the Wayback Machine to view previous versions of pages or entire sites.

Here are five smart ways you can use the Wayback Machine for SEO.


1. Find legacy URLs from old versions of the site

One of the most useful ways to use the Wayback Machine is to find historical URLs that have never been redirected.

The Wayback Machine collects information about your site over time. So it might have access to URL data from 10+ years ago.

This is especially important for sites that have been around for a long time. It’s likely that the stakeholder who managed the site years ago has changed roles or left the company and may not have used SEO best practices during site migrations.

The Wayback Machine can be a lifesaver here. You can quickly find old URLs that were never redirected to live versions. 

For example, the “Headphones” page from Bose (https://www.bose.com/products/headphones/) from 2003 was never redirected:

[Image: archived 2003 version of the Bose “Headphones” page in the Wayback Machine]

Using the Wayback Machine, it’s easy to discover legacy versions of key content from previous site versions. You can then find URLs to redirect that you would have likely never discovered otherwise. 

Want to take this to the next level? Read Patrick Stox’s article on using the Wayback Machine API to find historical redirects. By querying the API, you can bulk export legacy URLs. This can be much more efficient for larger sites. 
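As a rough illustration of the bulk export described above, the sketch below builds a query against the Wayback Machine's CDX endpoint and keeps the archived URLs that are missing from a list of live URLs. It is not taken from Patrick Stox's article. The function names, the example.com URLs and the choice of query parameters are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain: str, limit: int = 1000) -> str:
    """Build a CDX query URL that lists one capture per unique URL."""
    params = {
        "url": domain,
        "matchType": "domain",  # include subdomains of the domain
        "output": "json",
        "fl": "original,timestamp,statuscode",
        "collapse": "urlkey",  # one row per unique URL
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(rows: list[list[str]]) -> list[dict[str, str]]:
    """Turn CDX JSON output (a header row, then data rows) into dicts."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def legacy_urls(rows: list[list[str]], live_urls: list[str]) -> list[str]:
    """Return archived URLs that are absent from the list of live URLs."""
    live = {url.rstrip("/") for url in live_urls}
    seen: set[str] = set()
    result = []
    for record in parse_cdx_rows(rows):
        url = record["original"].rstrip("/")
        if url not in live and url not in seen:
            seen.add(url)
            result.append(url)
    return result


def fetch_cdx_rows(domain: str) -> list[list[str]]:
    """Download the CDX rows for a domain (requires network access)."""
    with urlopen(build_cdx_query(domain), timeout=30) as response:
        return json.load(response)


# Hypothetical usage: compare archived URLs with a crawl of the live site.
# candidates = legacy_urls(fetch_cdx_rows("example.com"), live_site_urls)
```

Archived URLs often differ from live ones only by scheme, a "www" prefix or an explicit port, so a real comparison would likely need more normalization than the trailing-slash handling shown here. Each candidate should also be requested on the live site to confirm it actually returns a 404 before a redirect is added.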

2. Find previous page content

Website content changes over time. This happens for a variety of reasons (e.g., SEO, CRO, site migration or highlighting different aspects of a product). There is always an inherent risk in any content change, especially a significant one.

This is where the Wayback Machine comes into play.

If you’re seeing large losses in rankings after changing content, you can check the Wayback Machine to surface previous versions of those pages. Restoring the content to its original version could help your content regain lost visibility.

For example, NYMag’s “Best Pillows For Neck Pain” article has lost organic visibility since mid-2020 for terms such as “neck support pillow.” This has resulted in organic traffic loss over time. 

[Image: organic visibility trend for the NYMag article]

Comparing the page to early 2020, we see that they have changed the content since then. The 2020 version included a quote from a chiropractor from the American Chiropractic Association in the introduction and kept the products above the fold. 

[Image: early 2020 version of the NYMag article in the Wayback Machine]

However, in the current version, they have added more content to the introduction, pushed the products below the fold and moved the American Chiropractic Association quote further down the page. 

[Image: current version of the NYMag article]

While this might not be the sole cause of the ranking drops, reviewing the content from the period of peak rankings gives them a basis for testing whether restoring some of it helps improve visibility.
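Finding the snapshot closest to a period of peak rankings can also be scripted. The sketch below uses the Wayback Machine's availability endpoint, which returns the capture nearest to a requested date. The function names and the example.com URL are hypothetical, and the response handling assumes the JSON shape the endpoint currently returns.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_query(page_url: str, timestamp: str) -> str:
    """Build a lookup URL; timestamp is YYYYMMDD (or a longer prefix form)."""
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{query}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the URL of the closest available capture, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"]


def find_snapshot(page_url: str, timestamp: str) -> Optional[str]:
    """Query the endpoint for one page and date (requires network access)."""
    with urlopen(build_availability_query(page_url, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))


# Hypothetical usage: find the capture nearest to January 15, 2020.
# print(find_snapshot("example.com/best-pillows", "20200115"))
```

The closest capture may be weeks away from the requested date, so it is worth checking the timestamp in the returned URL before treating that version as the one that ranked.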

3. Find old robots.txt files

Another great use of the Wayback Machine is checking how your robots.txt has changed from previous versions. This can be particularly helpful during a site migration if your robots.txt file has changed and you don’t have a version of the original file. 

Fortunately, the Wayback Machine crawls robots.txt files frequently. Just look at how many times IBM’s robots.txt file was crawled in 2012:

[Image: Wayback Machine captures of IBM’s robots.txt file in 2012]

Using this, you can analyze how the robots.txt has changed over time. For instance, IBM’s robots.txt file looks quite different from what it used to. Here is the file back in 2012: 

[Image: IBM’s robots.txt file as archived in 2012]

Looking at the site’s current robots.txt file, you can see that the directives have changed:

[Image: IBM’s current robots.txt file]

Using the Wayback Machine can be an extremely effective way to look up old versions of your robots.txt file. This is especially useful if the information has been lost during a site migration.
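Once an old copy of the file has been saved from the Wayback Machine, comparing it with the current one is straightforward. The sketch below is one minimal way to do it with Python's standard difflib module. The function names and the sample rules are made up for illustration, and the rule comparison deliberately ignores which User-agent group a Disallow line belongs to.

```python
import difflib


def diff_robots(old_text: str, new_text: str) -> list[str]:
    """Return a unified diff between two versions of a robots.txt file."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="robots.txt (archived)",
            tofile="robots.txt (current)",
            lineterm="",
        )
    )


def disallow_rules(text: str) -> set[str]:
    """Collect Disallow paths, ignoring comments and empty rules."""
    rules = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:
                rules.add(path)
    return rules


def rule_changes(old_text: str, new_text: str) -> dict[str, list[str]]:
    """Summarize which Disallow paths were added or removed."""
    old_rules = disallow_rules(old_text)
    new_rules = disallow_rules(new_text)
    return {
        "added": sorted(new_rules - old_rules),
        "removed": sorted(old_rules - new_rules),
    }


# Illustrative input only; these are not IBM's actual rules.
archived = "User-agent: *\nDisallow: /tmp/\nDisallow: /search\n"
current = "User-agent: *\nDisallow: /search\nDisallow: /cart/ # checkout\n"

print("\n".join(diff_robots(archived, current)))
print(rule_changes(archived, current))
```

The unified diff is useful for a line-by-line review, while the added and removed lists give a quick summary of which paths became blocked or unblocked after a migration.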

4. See what sections competitors are adding to their pages

Sites in competitive spaces routinely add or update content. For your highest priority keywords, your competitors are likely making frequent updates to their pages to try to improve their visibility. It can be difficult to track these changes. 

Fortunately, the Wayback Machine allows you to understand what types of updates competitors are making to their content.

For example, we can use the Wayback Machine to look at the best cast iron skillet page from Serious Eats on June 27, 2021:

[Image: Serious Eats best cast iron skillet page as archived on June 27, 2021]

Looking at the page today, we can immediately see that they have made some dramatic changes to the page: 

[Image: current version of the Serious Eats best cast iron skillet page]

By reviewing the existing page, we can see that they have:

  • Added an “Editor’s Note” to the top of the article
  • Moved the “The Winners” section to the top of the page
  • Implemented internal links at the top of the first paragraph
  • Made “The Winners” section more visual
  • Added an FAQs section

This is extremely valuable information to have when performing a competitive analysis. These changes can now inform the editorial strategy we apply to our own page. 

Determining the content differences can be difficult. It requires a manual review. However, you can use tools like Diffchecker to easily spot the content changes.  

[Image: Diffchecker comparison of the two page versions]
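For a quick first pass before a manual review, the section headings of the two versions can be compared programmatically. The sketch below is a simple approach using Python's standard html.parser module, assuming the archived and current HTML have already been downloaded. The class and function names are hypothetical, and it only looks at h2 and h3 headings.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every h2 and h3 heading in a page."""

    HEADING_TAGS = ("h2", "h3")

    def __init__(self) -> None:
        super().__init__()
        self.headings: list[str] = []
        self._in_heading = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._in_heading = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_heading:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._in_heading:
            # Join text fragments and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headings.append(text)
            self._in_heading = False


def extract_headings(html: str) -> list[str]:
    """Return heading texts in document order."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


def heading_changes(old_html: str, new_html: str) -> dict[str, list[str]]:
    """List headings that appear in only one of the two versions."""
    old = extract_headings(old_html)
    new = extract_headings(new_html)
    return {
        "added": [h for h in new if h not in old],
        "removed": [h for h in old if h not in new],
    }


# Illustrative markup only; not the actual Serious Eats HTML.
old_page = "<h1>Best Skillets</h1><h2>The Winners</h2><p>text</p><h2>How We Tested</h2>"
new_page = "<h2>Editor's Note</h2><h2>The <em>Winners</em></h2><h2>FAQs</h2>"
print(heading_changes(old_page, new_page))
```

This only surfaces added or removed sections. Changes in section order, internal links or visual treatment, like those in the list above, still need a manual look or a text diff.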

5. See how frequently competitors are updating content

Use the Wayback Machine to determine how frequently competitors update content.

This is especially useful if you’re in a SERP landscape where content freshness matters for visibility. 

For example, take CNET’s high-ranking page for The Best Android Phone Of 2022. At the top of the article, you can see the timestamp for when the article was last updated:

[Image: last-updated timestamp at the top of the CNET article]

Because technology is extremely fast-moving, it’s likely that freshness matters for terms such as “best android phones” since the products are frequently changing. Therefore, we might want to research how often we need to update our own content to stay competitive. 

Using the Wayback Machine, we can construct a timeline of how frequently CNET updates these articles. By looking at the previous timestamp on the page, we can look up the most recent capture in the Wayback Machine that precedes that date. For example, to find the update that happened before March 5, 2022, we can look up what the version on March 2, 2022, looked like.

By repeating this process, we can develop a timeline for how frequently CNET updates this page: 

Based on the data, it’s safe to say that CNET updates this article on a monthly cadence. We might want to apply the same update frequency to our content to stay competitive with CNET. 
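The arithmetic behind such a timeline is simple enough to script. The sketch below takes a list of 14-digit Wayback Machine timestamps at which the page content changed and computes the gaps between them. The function names and the sample timestamps are invented for illustration and are not CNET's actual update dates. One possible source for the timestamps is a CDX query with collapse=digest, which collapses adjacent captures whose content is identical.

```python
from datetime import datetime
from statistics import mean
from typing import Optional


def parse_timestamp(timestamp: str) -> datetime:
    """Parse a 14-digit Wayback Machine timestamp (YYYYMMDDhhmmss)."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")


def update_gaps_in_days(timestamps: list[str]) -> list[int]:
    """Return the whole-day gaps between consecutive distinct timestamps."""
    dates = sorted(parse_timestamp(ts) for ts in set(timestamps))
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]


def average_update_interval(timestamps: list[str]) -> Optional[float]:
    """Return the mean gap in days, or None with fewer than two dates."""
    gaps = update_gaps_in_days(timestamps)
    return mean(gaps) if gaps else None


# Made-up timestamps standing in for captures where the content changed.
changed_captures = ["20220105000000", "20220204000000", "20220305000000"]
print(update_gaps_in_days(changed_captures))
print(average_update_interval(changed_captures))
```

Captures only show when the crawler visited, so the true update date falls somewhere between two captures. Pages with dynamic elements can also appear to change on every capture, which is why the on-page timestamp remains the more reliable signal when it is available.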

Back to the Wayback

In a world where the web is ever-changing, the Wayback Machine is invaluable.

You can use this tool in multiple ways to recover lost information and glean insights into the direction of competitor strategies.

Make sure the Wayback Machine is in your SEO toolkit.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About the author

Chris Long
Contributor
Chris Long is the VP of marketing at Go Fish Digital. Chris works with unique problems and advanced search situations to help his clients improve organic traffic through a deep understanding of Google’s algorithm and Web technology. Chris is a contributor for Moz, Search Engine Land, and The Next Web. He is also a speaker at industry conferences such as SMX East and the State Of Search. You can connect with him on Twitter and LinkedIn.