Finding Search Engine Freshness & Crawl Dates

A reader emailed me today noticing that Google was showing a date next to his
listing, which made me think this was a good time to revisit how, when
and where search engines show crawl dates for pages. These dates are a useful
way for site owners to understand how often they are being revisited or for
anyone to "squeeze the loaf" of a search engine to see how fresh it is. Here’s a
search engine-by-search engine rundown on date display. I’ll also cover how
we’ve sadly lost crawl dates being embedded next to listings, over the years.
But that’s not all! Read now and you’ll even get a free at-a-glance table
explaining how dates are displayed. Read now — web server operators are
standing by!

Google

When you do a search, some pages may show a date below the description of a
listing, as illustrated below:


Crawl Dates At Google

I thought Google had long done this for certain pages that it revisits on a
super-frequent basis. And when I did a search for cars today, I saw a date
like this coming up for the cars.com listing, as shown above. An hour later,
the date was gone. I then tried that search again using a particular Google
data center, rather than whatever data center was assigned to my browser
randomly. Doing the same search at that data center gave me dates again.

I’m checking with Google on how long dates have been showing and why they may
come and go as I saw today. I’ll postscript what I’m told at the end of this
story.

The example above shows that only some pages have dates. In contrast, the
Google Cache can give you dates for nearly any web page.

The Google Cache allows you to view a copy of a page that is stored on
Google’s servers, rather than from the website directly. (Don’t like Google
caching this for your site? Learn how to prevent it here and here. Don’t see a
cached link option? Then the site owner is blocking caching.)
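If you want to check whether a page is asking not to be cached, here is a minimal Python sketch. It assumes the block is done with the widely documented "noarchive" robots meta tag, which may or may not be the method the pages linked above describe. The class and function names are my own, purely for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self, agents=("robots", "googlebot")):
        super().__init__()
        self.agents = agents
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in self.agents:
            # A content value looks like "index, noarchive"; split it up.
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def blocks_caching(html):
    """Return True if the page carries a noarchive robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


# Hypothetical page source, not taken from any real site.
sample = '<html><head><meta name="robots" content="index, NOARCHIVE"></head></html>'
print(blocks_caching(sample))  # True
```

This only looks at meta tags in the HTML you hand it. A site could also signal the same thing in other ways that this sketch does not check.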

Going back to our search for cars and the screenshot above, you’ll see that
the disney.go.com listing doesn’t have a date next to it. To find the date the
page was visited, you have to click on the link that says "Cached" under the
description of that listing. That makes the cached page load like this. At the
top of that page, you’ll see this:

This is Google’s cache of http://disney.go.com/disneypictures/cars/ as
retrieved on 22 Feb 2007 14:34:08 GMT.

See the date and time, which I’ve put in bold? That’s when the page was last visited by Google.
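If you’re checking more than a handful of pages, you could pull that date out with a script. The following is a rough Python sketch that assumes the banner wording is exactly as quoted above; the pattern and the function name are mine, and Google may word the banner differently elsewhere.

```python
import re
from datetime import datetime, timezone

# Matches the banner wording quoted above. The wording is an assumption
# and may not hold for every cached page.
BANNER_RE = re.compile(
    r"as retrieved on\s+(\d{1,2}\s+[A-Za-z]{3}\s+\d{4}\s+\d{2}:\d{2}:\d{2})\s+GMT"
)


def google_cache_date(banner_text):
    """Return the crawl time from a cache banner as a UTC datetime, or None."""
    match = BANNER_RE.search(banner_text)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces inside the matched date.
    stamp = " ".join(match.group(1).split())
    # %b expects English month abbreviations under the default C locale.
    parsed = datetime.strptime(stamp, "%d %b %Y %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


banner = (
    "This is Google's cache of http://disney.go.com/disneypictures/cars/ "
    "as retrieved on 22 Feb 2007 14:34:08 GMT."
)
print(google_cache_date(banner))  # 2007-02-22 14:34:08+00:00
```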

FYI, before September 2006, that date reflected the last time Google found the
page to have changed, not when it was last visited. In other words, if Google
visited the page in January 2005, then revisited it throughout the year but
the page never changed, the cached date would keep saying January 2005.

Since September 2006, that’s been different. The date was altered to reflect
the last time Google visited the page, a good change to make. Google explains
more about this on the Google Webmaster Central blog here, and Google’s Matt
Cutts also did a video about it here.
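To make the difference between the two rules concrete, here is a small Python sketch. The crawl log is entirely hypothetical and the function names are mine. It only illustrates the behavior as described above, not how Google actually implements it.

```python
from datetime import date


def cache_date_last_changed(visits):
    """Old rule (before September 2006): the date of the most recent visit
    on which the content differed from the previous visit."""
    shown = None
    previous = object()  # sentinel so the first visit counts as a change
    for day, content in visits:
        if content != previous:
            shown = day
            previous = content
    return shown


def cache_date_last_visited(visits):
    """New rule (since September 2006): the date of the most recent visit."""
    return visits[-1][0] if visits else None


# Hypothetical crawl log: (visit date, content seen), oldest first.
visits = [
    (date(2005, 1, 10), "version A"),
    (date(2005, 6, 2), "version A"),
    (date(2005, 12, 20), "version A"),
]
print(cache_date_last_changed(visits))  # 2005-01-10
print(cache_date_last_visited(visits))  # 2005-12-20
```

Under the old rule, an unchanged page looks almost a year stale even though it was spidered in December.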

The options above allow anyone to see the freshness of any pages within
Google, one page at a time (as long as they are cached). What if you want to
get industrial strength and view the freshness of all your own pages at once?
Unfortunately, the Google Webmaster Central tools don’t let you see the last
time all your pages were spidered. But that’s something they’re considering
for the future. The tools will, however, show you any problems Google had in
reaching any of your pages and the last time a crawl error happened for those
pages. Using the "Crawl rate" option found under the Diagnostics tab, you can
also see a general graph of crawling activity to your site.

There is one other type of date that you might see associated with listings
that has nothing to do with when the page was crawled. Look here:


Google Personalized Search Last Visit Date

See the "3 visits – Feb 14" part? That’s coming from Google Personalized
Search and shows that I’ve clicked on that listing 3 times, with the last
visit being on Feb. 14. My Google Ramps Up Personalized Search article from
earlier this month explains more about how Google Personalized Search works
and how it can be disabled if you don’t like having it on, as now happens much
more often.

Microsoft Live Search

Microsoft Live Search operates like
Google. Some pages show dates next to them, as I’ve highlighted below:


Crawl Date At Microsoft Windows Live

As with Google, this seems to happen with pages that are being spidered
frequently, but I’ll check on this. Does a page lack a date? Then click on the
"cached page" link. When the cached page loads, you’ll see something like this
at the top of it:

This is a version of http://www.pixar.com/theater/trailers/cars/index.html as
it looked when our crawler examined the site on 2/16/2007. The page you see
below is the version in our index that was used to rank this page in the
results to your recent query.

The date (which I’ve put in bold above) tells you when the page was last spidered.

Don’t see a cached page option? The site owner is probably blocking caching.
Are you a site owner who wants to block caching? Visit the help area at Live
and search for "cache" to find more info. I’d point you to the right place,
but it remains impossible to link to particular pages in Microsoft’s absurd
help system.

[Postscript: Microsoft sent this information: "We only show the last-crawl date when it is within a few days. This is a decision to draw attention to the freshest content without highlighting older content. Crawl dates for other documents can be found by looking at the cached page."]

Ask.com

At Ask.com, you can only get dates by looking at the cached pages,
similar to how that works at Google and Microsoft. Click on the "Cached" link that you’ll see
next to the URL of a listing, as highlighted below:


Crawl Date At Ask.com

At the top of the page, you’ll see something like this with the date and time
(shown in bold below) that the page was last visited:

Below is a cache or saved snapshot of http://www.cars.com/ as we found it on
February 19, 2007 1:24:56 AM.
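The Microsoft and Ask.com banners use different date formats, so a script needs one pattern per engine. Here’s a hedged Python sketch built only from the two banners quoted in this article. The names are mine, the month/day order for Microsoft is inferred from "2/16/2007", and neither banner states a time zone, so the results are left as naive datetimes.

```python
import re
from datetime import datetime

# (engine, pattern capturing the date text, strptime format).
# Patterns are assumptions based on the two banners quoted in this article.
BANNER_FORMATS = [
    (
        "Microsoft",
        re.compile(r"examined the site on (\d{1,2}/\d{1,2}/\d{4})"),
        "%m/%d/%Y",
    ),
    (
        "Ask",
        re.compile(
            r"as we found it on "
            r"([A-Z][a-z]+ \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)"
        ),
        "%B %d, %Y %I:%M:%S %p",
    ),
]


def cached_banner_date(banner_text):
    """Return (engine, naive datetime) for a recognized banner, else None."""
    # Collapse line breaks so a wrapped banner still matches.
    text = " ".join(banner_text.split())
    for engine, pattern, fmt in BANNER_FORMATS:
        match = pattern.search(text)
        if match:
            return engine, datetime.strptime(match.group(1), fmt)
    return None


ask_banner = (
    "Below is a cache or saved snapshot of http://www.cars.com/ "
    "as we found it on February 19, 2007 1:24:56 AM."
)
live_banner = "as it looked when our crawler examined the site on 2/16/2007."
print(cached_banner_date(ask_banner))
print(cached_banner_date(live_banner))
```

Note that the Microsoft banner carries only a date, so its time comes back as midnight.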

Yahoo

At Yahoo, you can only get dates one way, through using Yahoo Site Explorer.
You’ll have to create an account for your web site, then authenticate your
account, then you’ll be shown last crawl dates as I’ve highlighted in the
first listing below:


Crawl Date At Yahoo Site Explorer

More than any other search engine, Yahoo makes it easy for a site owner to
see the freshness of many pages all at once. However, the huge disadvantage from a
searcher perspective is that you can’t spot check the freshness of any page you
randomly select.

The Date & Freshness Table

I love nothing more than doing tables, so let’s put everything above into a
nice one:

Feature                    Ask       Google                  Microsoft  Yahoo
Dates Next To Listings?    No        Some                    Some       No
Dates On Cached Pages?     Yes       Yes                     Yes        No
Dates In Webmaster Tools?  No Tools  For Errors & Home Page  No Tools   Yes

Ideally, I’d like to see that top row — "Dates Next To Listings?" — be
completely "Yes." Some site owners block caching, which makes it hard to measure
freshness. Putting the dates right next to the listings makes it easy for anyone
who cares to see at a glance if a search engine is stale or fresh.

In fact, I have to laugh. I’ve been asking for this for years. On the old
features chart I used to maintain about dates, I wrote in 2001:

Along with the page description, some search engines show the date when a web
page was created or modified. As noted above, these dates may not always be
reliable. However, they do provide a useful clue as to how fresh or stale a
search engine’s listings are. Thus, search engines that show a date deserve
praise for doing so.

That was from 2001! Nearly six years later, it’s still the case that dates
aren’t being shown. In fact, it’s a reversal. Back in 2001, the major search
engines of AltaVista, HotBot (Inktomi) and Northern Light all showed dates for
all listings right within search results. Fast forward to today, and none of the
major search engines do.

The reason is simple enough. Over time, the search engines either couldn’t
maintain freshness or didn’t want to show they were sometimes stale. So dates
either went away or never got added. C’mon gang — time to bring them back right
into the search results. If they aren’t there by default, make it an option
people can enable.

Verifying Freshness

In the meantime, there’s a favorite tactic for those search watchers who want
to track freshness. Google’s Matt Cutts once wrote about this back in 2005,
describing exactly a technique I and others have long used. You simply find a
page that you know carries a date that’s constantly updated. Look at the
cached page and see what the time and date says on it.

But Yahoo doesn’t show a date on cached pages! No, it doesn’t, but you’re not
looking for the date that the search engine inserts. You want the date on the
page itself. For example, here’s the cached page over at Yahoo for CNN:


Finding Dates On Cached Pages

See the part I highlighted in red, that says:

UPDATED: 3:53 a.m. EST, February 26, 2007

That’s the date that CNN had on its own page when the Yahoo spider last
visited. When I looked, the date and time was 3:10 pm EST on February 27, so
going by those two timestamps the cached copy is about 35 hours old. Not bad
in this case, but I wouldn’t expect a major news site to be much out of date.
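The same check can be scripted. This Python sketch reads the page’s own "UPDATED:" line out of the cached copy and subtracts it from the time you looked, using the two timestamps quoted above. The pattern is an assumption based on that single CNN line, the function names are mine, and EST is treated as a fixed UTC-5 offset.

```python
import re
from datetime import datetime, timedelta, timezone

# EST as a fixed UTC-5 offset; this sketch ignores daylight saving time.
EST = timezone(timedelta(hours=-5), "EST")

# Assumed format, taken from the single CNN line quoted above.
UPDATED_RE = re.compile(
    r"UPDATED:\s*(\d{1,2}):(\d{2})\s*([ap])\.m\.\s*EST,\s*"
    r"([A-Z][a-z]+ \d{1,2}, \d{4})"
)


def page_timestamp(cached_text):
    """Return the page's own 'UPDATED' time as an aware datetime, or None."""
    match = UPDATED_RE.search(cached_text)
    if match is None:
        return None
    hour, minute, meridiem, day = match.groups()
    # Convert 12-hour clock to 24-hour: 12 a.m. -> 0, 12 p.m. -> 12.
    hour24 = int(hour) % 12 + (12 if meridiem == "p" else 0)
    parsed = datetime.strptime(day, "%B %d, %Y")
    return parsed.replace(hour=hour24, minute=int(minute), tzinfo=EST)


def cache_age(cached_text, checked_at):
    """Return how old the cached copy is at checked_at, or None."""
    stamp = page_timestamp(cached_text)
    return None if stamp is None else checked_at - stamp


cached = "UPDATED: 3:53 a.m. EST, February 26, 2007"
checked = datetime(2007, 2, 27, 15, 10, tzinfo=EST)
print(cache_age(cached, checked))  # 1 day, 11:17:00
```

Run it against the same page at each search engine and you get a like-for-like freshness comparison, even where the engine shows no crawl date of its own.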

Return Of The Freshness Guarantee?

Finally, I’ll leave you with this trip down memory lane. Back in June 1999,
AltaVista once offered a freshness guarantee that was quickly broken. As I
wrote at the time:

"AltaVista search is able to
make its Freshness Guarantee: no search site will have fresher results than
AltaVista."

AltaVista unveiled its first
"Freshness Guarantee" back when it relaunched in June, promising that its
entire index would be refreshed at least once per month. That guarantee was
almost immediately broken, as even AltaVista President Rod Schrock admitted
when we talked recently. "We turned our attention to this new system," Schrock
said.

OK, fair enough — they wanted
to build something even better. But this new guarantee has already been
broken, as described above. If claims like these are going to be made, then
they should actually be met. And not to meet them in the midst of a huge media
blitz is an incredible blunder.

Freshness is one important component of what makes a good search engine. It’s
not the only thing. Having fresh results means nothing if the results aren’t
relevant. And some pages don’t need to be spidered that often. But putting
dates next to listings is an easy form of search "food" labeling that can give
reassurance about the major search engines. Surely it’s time for dates to make
a comeback.

About The Author: Danny Sullivan is a Founding Editor of Search Engine Land. He’s a widely cited authority on search engines and search marketing issues who has covered the space since 1996. Danny also serves as Chief Content Officer for Third Door Media, which publishes Search Engine Land and produces the SMX: Search Marketing Expo conference series. He has a personal blog called Daggle (and keeps his disclosures page there). He can be found on Facebook and Google+, and microblogs on Twitter as @dannysullivan.

  • http://www.mattcutts.com/blog/ Matt Cutts

    I’m less concerned about showing the crawl dates on the search results. I would love if Yahoo would start showing the crawl date on their cached pages, because that would allow apples-to-apples comparisons.

  • http://www.followthedrinkinggourd.org Joel Bresler

    Hi, thanks for the interesting article.

    I recently launched a web site on the American folk song Follow the Drinking Gourd and have been tracking home page and site-wide caching information ever since. Bottom line: Since March 1st, Google and Yahoo have lagged on average about half a week behind on the home page, while MSN is running a week or so behind.

    The really interesting differences in caching show up in how frequently the search engines are refreshing the other pages in this site. Aside from the home page, the average followthedrinkinggourd.org page had a whopping cache lag of 39 days in Google, vs. 13 days in Yahoo and just 10 in MSN.

    Follow the Drinking Gourd: Site Caching statistics

    I hope the information proves useful.

    All the best,

    Joel B

  • http://www.fitness-superstore.co.uk Fitness superstore

    I always factor in how often a website is updated, for me this far outweighs pagerank
