Finding Search Engine Freshness & Crawl Dates
A reader emailed me today noticing that Google was showing a date next to his
listing, which made me think this was a good time to revisit how, when
and where search engines show crawl dates for pages. These dates are a useful
way for site owners to understand how often they are being revisited or for
anyone to "squeeze the loaf" of a search engine to see how fresh it is. Here’s a
search engine-by-search engine rundown on date display. I’ll also cover how
we’ve sadly lost, over the years, the crawl dates that used to be embedded next to listings.
But that’s not all! Read now and you’ll even get a free at-a-glance table
explaining how dates are displayed. Read now: web server operators are standing by!
When you do a search, some pages may show a date below the description of a
listing, as illustrated below:
I thought Google had long done this for certain pages that it revisits on a
super-frequent basis. And when I did a search for
cars today, I saw a date
like this coming up for the cars.com listing as shown above. An hour later, the date was
gone. I then tried that search again using a particular Google data center,
rather than whatever data center was assigned to my browser randomly. Doing the
search at that data center gave me dates again.
I’m checking with Google on how long dates have been showing and why they may
come and go as I saw today. I’ll postscript what I’m told at the end of this article.
The example above shows that only some pages have dates. In contrast, the
Google Cache can
give you dates for nearly any web page.
The Google Cache allows you to view a copy of a page that is stored on
Google’s servers, rather than from the website directly. (Don’t like Google caching this
for your site? Learn how to prevent it
here. Don’t see a cached link option? Then the site owner is blocking caching.)
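For site owners wondering whether one of their own pages is opting out of caching, here is a minimal Python sketch. It assumes the page blocks caching with the standard robots meta tag directive `noarchive`, which is the usual mechanism but not necessarily the only one a site might use. The class and function names (`RobotsMetaParser`, `blocks_caching`) are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> style tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        # Assumption: only the generic "robots" and Google-specific "googlebot" names are checked.
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def blocks_caching(html: str) -> bool:
    """Return True if the page's meta tags include the noarchive directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives
```

Feed it the HTML source of a page. A `True` result would explain a missing "Cached" link, though a `False` result does not prove the page is cached.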
Going back to our search for
cars and the screenshot
above, you’ll see that the disney.go.com listing doesn’t have a date next to it.
To find the date the page was visited, you have to click on the link that says "Cached" under the description of that
listing. That makes the cached page load like
this. At the top of that page, you’ll see this:
See the date and time, which I’ve put in bold? That’s when the page was last visited by Google.
Before September 2006, that date reflected the last time Google found the page to have
changed, not when it was last visited. In other words, if Google visited the
page in January 2005, then revisited it throughout the year but the page never
changed, the cached date would keep saying January 2005.
Since September 2006,
that’s been different. The date was altered to reflect the last time Google
visited the page — a good change to make. Google explains more about this on
the Google Webmaster Central blog
here, and Google’s Matt Cutts also did a video about it.
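The difference between the two behaviors can be sketched in a few lines of Python. This is only a toy model of what is described above, not how Google actually computes anything. The function name, the visit list and the specific days are all invented for illustration.

```python
from datetime import date


def cached_date(visits, use_last_visit):
    """Pick the date a cached page would display.

    visits: chronological list of (visit_date, page_changed) pairs.
    use_last_visit=False models the pre-September 2006 behavior
    (date of the last visit on which the page had changed);
    use_last_visit=True models the later behavior (date of the last visit).
    """
    if use_last_visit:
        return visits[-1][0]
    changed = [day for day, page_changed in visits if page_changed]
    # Assumption: if no change was ever recorded, fall back to the first visit.
    return changed[-1] if changed else visits[0][0]


# Hypothetical crawl history: the page changed in January 2005, then stayed the same.
history = [
    (date(2005, 1, 15), True),
    (date(2005, 6, 1), False),
    (date(2005, 12, 20), False),
]

old_style = cached_date(history, use_last_visit=False)  # date(2005, 1, 15)
new_style = cached_date(history, use_last_visit=True)   # date(2005, 12, 20)
```

Under the old rule the cached copy keeps saying January 2005 however often the page is revisited. Under the new rule the date follows the most recent visit.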
The options above allow anyone to see the freshness of any pages within
Google, one page at a time (as long as they are cached). What if you want to get industrial strength
and view the freshness of all your own pages at once?
Unfortunately, the Google Webmaster
Central tools don’t let you see the last time all your pages were spidered.
But that’s something they’re considering for the future. The tools will,
however, show you any problems Google had in reaching any of your pages and the
last time a crawl error happened for those pages. Using the "Crawl rate" option
found under the Diagnostics tab, you can also see a general graph of crawling
activity to your site.
There is one other type of date that you might see associated with
listings that has nothing to do with when the page was crawled. Look here:
See the "3 visits – Feb 14" part? That’s coming from
Personalized Search and shows that I’ve clicked on that listing 3 times,
with the last visit being on Feb. 14. My
Google Ramps Up
Personalized Search article from earlier this month explains more about how
Google Personalized Search works and how to disable it if you don’t like it being on,
which now happens much more often.
Microsoft Live Search
Microsoft Live Search operates like
Google. Some pages show dates next to them, as I’ve highlighted below:
As with Google, this seems to happen with pages that are being spidered
frequently, but I’ll check on this. Does a page lack a date? Then click on the
"cached page" link. When the cached page loads, you’ll see something like this
at the top of it:
This is a version of
http://www.pixar.com/theater/trailers/cars/index.html as it looked when
our crawler examined the site on 2/16/2007. The page you see below is the
version in our index that was used to rank this page in the results to your
The date (which I’ve put in bold above) tells you when the page was last spidered.
Don’t see a cached page
option? The site owner is probably blocking caching. Are you a site owner who wants to
block caching? Visit the
help area and
search for "cache" to find more info. I’d point you to the right place, but it
remains impossible to link to particular pages in Microsoft’s absurd help area.
[Postscript: Microsoft sent this information: "We only show the last-crawl date when it is within a few days. This is a decision to draw attention to the freshest content without highlighting older content. Crawl dates for other documents can be found by looking at the cached page."]
At Ask.com, you can only get dates by looking at the cached pages,
similar to how that works at Google and Microsoft. Click on the "Cached" link that you’ll see
next to the URL of a listing, as highlighted below:
At the top of the page, you’ll see something like this with the date and time
(shown in bold below) that the page was last visited:
Below is a cache or saved snapshot of
http://www.cars.com/ as we found it on February 19, 2007 1:24:56 AM.
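If you are checking more than a handful of cached pages, pulling the date out by eye gets tedious. The sketch below parses the two banner wordings quoted in this article (Ask.com and Microsoft Live Search) into Python `datetime` values. It assumes the banners keep exactly that wording and that month names are in English. The names `BANNER_FORMATS` and `crawl_date_from_banner` are made up for illustration.

```python
import re
from datetime import datetime

# (pattern, strptime format) pairs for the two banner styles quoted in the article.
BANNER_FORMATS = [
    # Ask.com: "... as we found it on February 19, 2007 1:24:56 AM."
    (re.compile(r"as we found it on (\w+ \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)"),
     "%B %d, %Y %I:%M:%S %p"),
    # Microsoft Live Search: "... examined the site on 2/16/2007."
    (re.compile(r"examined the site on (\d{1,2}/\d{1,2}/\d{4})"),
     "%m/%d/%Y"),
]


def crawl_date_from_banner(banner: str) -> datetime:
    """Extract the crawl date from a cached-page banner, or raise ValueError."""
    text = " ".join(banner.split())  # collapse line breaks and repeated spaces
    for pattern, fmt in BANNER_FORMATS:
        match = pattern.search(text)
        if match:
            return datetime.strptime(match.group(1), fmt)
    raise ValueError("unrecognized cache banner")


ask_banner = ("Below is a cache or saved snapshot of "
              "http://www.cars.com/ as we found it on February 19, 2007 1:24:56 AM.")
ask_crawled = crawl_date_from_banner(ask_banner)  # datetime(2007, 2, 19, 1, 24, 56)
```

The Microsoft banner carries only a date, so it parses to midnight on that day. Any other banner wording would need its own pattern added to the list.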
At Yahoo, you can only get dates one way, through using
Yahoo Site Explorer. You’ll
have to create an account for your web site, then authenticate your account,
then you’ll be shown last crawl dates as I’ve highlighted in the first listing below:
More than any other search engine, Yahoo makes it easy for a site owner to
see the freshness of many pages all at once. However, the huge disadvantage from a
searcher perspective is that you can’t spot check the freshness of any page you want.
The Date & Freshness Table
I love nothing more than doing tables, so let’s put everything above into a table:
|For Errors & Home
Ideally, I’d like to see that top row — "Dates Next To Listings?" — be
completely "Yes." Some site owners block caching, which makes it hard to measure
freshness. Putting the dates right next to the listings makes it easy for anyone
who cares to see at a glance if a search engine is stale or fresh.
In fact, I have to laugh. I’ve been asking for this for years. On the old
features chart I used to maintain about dates, I
wrote in 2001:
Along with the page description, some search engines show the date when a web
page was created or modified. As noted above, these dates may not always be
reliable. However, they do provide a useful clue as to how fresh or stale a
search engine’s listings are. Thus, search engines that show a date deserve
praise for doing so.
That was from 2001! Nearly six years later, it’s still the case that dates
aren’t being shown. In fact, it’s a reversal. Back in 2001, the major search
engines of AltaVista, HotBot (Inktomi) and Northern Light all showed dates for
all listings right within search results. Fast forward to today, and none of the
major search engines do.
The reason is simple enough. Over time, the search engines either couldn’t
maintain freshness or didn’t want to show they were sometimes stale. So dates
either went away or never got added. C’mon gang — time to bring them back right
into the search results. If they aren’t there by default, make it an option
people can enable.
In the meantime, there’s a favorite tactic for those search watchers who want
to track freshness. Google’s Matt Cutts wrote
about this back in 2005, describing exactly the technique I and others have long
used. You simply find a page that you know carries a date that’s constantly
updated. Look at the cached page and see what the time and date says on it.
But Yahoo doesn’t show a date on cached pages! No, it doesn’t, but you’re not
looking for the date that the search engine inserts. You want the date on the
page itself. For example,
here’s the cached page over at Yahoo for CNN:
See the part I highlighted in red, that says:
UPDATED: 3:53 a.m. EST, February 27, 2007
That’s the date that CNN had on its own page when the Yahoo spider last
visited. When I looked, the date and time was 3:10 pm EST on February 27 — so
the page is only 12 hours old. Not bad in this case, but I wouldn’t expect a
major news site to be much out of date.
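The arithmetic behind this check is simple enough to script. Here is a rough Python sketch that pulls a CNN-style "UPDATED" stamp out of cached page text and subtracts it from the time you looked. The sample stamp and check time are illustrative, chosen to fall on the same day so that the gap matches the roughly 12 hours mentioned above. The stamp format, the EST-only handling and the function names are assumptions, not anything CNN or Yahoo documents.

```python
import re
from datetime import datetime, timedelta

# Assumed stamp layout: "UPDATED: 3:53 a.m. EST, February 27, 2007"
STAMP = re.compile(
    r"UPDATED:\s*(\d{1,2}:\d{2})\s*([ap])\.m\.\s*EST,\s*(\w+ \d{1,2}, \d{4})"
)


def page_timestamp(cached_text: str) -> datetime:
    """Return the date the page printed about itself (naive, treated as EST)."""
    match = STAMP.search(cached_text)
    if match is None:
        raise ValueError("no on-page timestamp found")
    clock, am_pm, day = match.groups()
    return datetime.strptime(f"{day} {clock} {am_pm.upper()}M", "%B %d, %Y %I:%M %p")


def cache_age(cached_text: str, checked_at: datetime) -> timedelta:
    """How old the cached copy was when checked (both times assumed to be EST)."""
    return checked_at - page_timestamp(cached_text)


sample = "UPDATED: 3:53 a.m. EST, February 27, 2007"   # illustrative stamp
looked_at = datetime(2007, 2, 27, 15, 10)               # 3:10 pm EST, same day
age = cache_age(sample, looked_at)                      # 11 hours, 17 minutes
```

Run the same check against a few frequently updated pages and you get a rough freshness reading for any search engine that offers cached copies, even one that doesn’t print its own crawl date.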
Return Of The Freshness Guarantee?
Finally, I’ll leave you with this trip down memory lane. Back in June 1999,
AltaVista once offered a freshness guarantee that was quickly broken. As I wrote:
"AltaVista search is able to
make its Freshness Guarantee: no search site will have fresher results than
AltaVista unveiled its first
"Freshness Guarantee" back when it relaunched in June, promising that its
entire index would be refreshed at least once per month. That guarantee was
almost immediately broken, as even AltaVista President Rod Schrock admitted
when we talked recently. "We turned our attention to this new system," Schrock
OK, fair enough — they wanted
to build something even better. But this new guarantee has already been
broken, as described above. If claims like these are going to be made, then
they should actually be met. And not to meet them in the midst of a huge media
blitz is an incredible blunder.
Freshness is one important component to what makes a good search engine. It’s
not the only thing. Having fresh results means nothing if the results aren’t
relevant. And some pages don’t need to be spidered that often. But putting dates
next to listings is an easy form of search "food" labeling that can give
reassurance about the major search engines. Surely it’s time for dates to make a comeback.