Content Curation: Pruning Old “Fresh” Content To Reveal The Evergreen

Most sites deliver regularly-produced content as part of their architecture. At its simplest, it might just be a blog section through which company news, updates and outreach are posted. At its best, however, it’s a carefully pruned source of evergreen SEO traffic and a lean component of your site indexing strategy.

The General SEO Theory


Most SEOs that have been in the industry for a while would largely agree that overall domain SEO performance can be generalised to be a balance between the total backlink profile of the site (its “Authority”) and the total number of pages indexed on the domain (the total “Domain Sprawl”).

Obviously, this is a massive generalisation. But, in my experience, it is a fundamental axiom for SEO that’s stood the test of time and is a component of SEO that still moves the needle when SEO strategies built around it are executed.

With that in mind, it’s apparent that any content which is indexed (and therefore contributing to domain sprawl) has to pull its own weight or it will drag down the overall performance of the site.

This requirement is the basis of site architecture optimisation but is often ignored in the day-to-day management of a site’s SEO campaign. Why? Freshly generated content will always be a crucial part of a good domain architecture, but without a regular review of effectiveness, it can simply drift into long-term domain sprawl with little traffic return to show for it.
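As a rough illustration of what a regular review of effectiveness could look like, here is a minimal Python sketch. It assumes you can export each indexed page's first-published date and a yearly organic visit count from your own analytics. The `Page` fields, the two-year age threshold and the ten-visit floor are all hypothetical choices for illustration, not recommended values.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Page:
    url: str
    first_published: date
    organic_visits_last_year: int  # hypothetical metric from an analytics export


def prune_candidates(pages, today, min_age_days=730, min_visits=10):
    """Return pages at least min_age_days old that earned fewer than min_visits.

    Both thresholds are illustrative assumptions; tune them to your own site.
    """
    return [
        p for p in pages
        if (today - p.first_published).days >= min_age_days
        and p.organic_visits_last_year < min_visits
    ]


# Made-up example data: one old, low-traffic page, one evergreen page,
# and one page that is too new to judge.
pages = [
    Page("/news/2001/budget-cuts", date(2001, 3, 7), 2),
    Page("/guides/evergreen-topic", date(2009, 5, 1), 4200),
    Page("/news/last-week", date(2013, 1, 2), 0),
]

for page in prune_candidates(pages, today=date(2013, 1, 10)):
    print(page.url)
```

Only the old, low-traffic page is flagged here. The evergreen page earns its place in the index, and the week-old page has not had time to prove itself either way.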

There are a couple of neat, quick search tricks we can use to find older content that’s failing to deliver return for its indexed SEO value. Let’s walk through them.

Finding The Old “Fresh”

Given that we’re in the business of reducing our indexed pages, the first tool you should turn to is Google. Using a combination of site index operators and indexed date range tools provided by Google, we can get a list of indexed URLs within any range of dates we’d like.

Let’s take a look at the BBC to see how this might throw up some opportunities to prune.

Chaining our operators and setting an earliest first index date of two years ago, we can dig out some cruft pretty easily:

BBC Indexing Issue

I think we can agree that pages like the two examples surfaced by this search are not necessarily of the highest value in the index (though they do now have extra backlink value, of course!). They are not necessarily “out of date” (their contained data is dynamically updated by the BBC), but their indexing makes them landing pages without the surrounding BBC website. This makes for a very poor searcher experience; thus, this content is best cleaned away from the index.

On a side note, we can then quite easily find the BBC’s XML sitemap listing these pages. Listing them there isn’t ideal, as it would override any canonicals and simply promotes the indexing of what are effectively frame pages.
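To check whether your own sitemap is promoting pages you would rather prune, a short Python sketch along these lines may help. It assumes a standard sitemaps.org `urlset` document. The sample XML, the `example.com` URLs and the `/vote2001/` pattern are hypothetical stand-ins, not the BBC's real sitemap.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, inlined so the sketch is self-contained.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/news/today/</loc></url>
  <url><loc>https://example.com/vote2001/budget/frame_1.html</loc></url>
  <url><loc>https://example.com/vote2001/health/frame_2.html</loc></url>
</urlset>
""".strip()


def sitemap_urls(xml_text):
    """Return every <loc> value found in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def flagged(urls, pattern):
    """Return the URLs matching a regex that describes prune candidates."""
    rx = re.compile(pattern)
    return [u for u in urls if rx.search(u)]


for url in flagged(sitemap_urls(SAMPLE_SITEMAP), r"/vote2001/"):
    print(url)
```

Any URL this prints is one the sitemap is still asking search engines to index, so it is a candidate for removal from the sitemap before you canonicalise it.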

We can take a more extreme view of the indexing date range to dig out some better examples of old content that could well be retired to benefit the domain as a whole. Tweaking the search to set the latest first index date to 2001 hits a goldmine.

By further refining our operators, we can dig out all examples of particular types of content that would do well to be removed from Google’s index. In this example, chaining an intitle operator of “VOTE2001” with the site operator and setting a comprehensive date range will allow you to hoover up all example URLs that should be canonicalised into more useful content in the same section.
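The operator chaining described above can be sketched as a small Python helper that assembles the query string. Two assumptions to flag: the domain `bbc.co.uk` is my choice for illustration, and the `before:` and `after:` operators stand in for the date range tool used in the walkthrough, so results may not match that tool exactly.

```python
from urllib.parse import urlencode


def build_query(domain, title_phrase=None, after=None, before=None):
    """Chain site:, intitle: and date operators into one query string.

    Dates are expected as YYYY-MM-DD strings. The before:/after: operators
    are an assumed substitute for Google's date range search tool.
    """
    parts = [f"site:{domain}"]
    if title_phrase:
        parts.append(f'intitle:"{title_phrase}"')
    if after:
        parts.append(f"after:{after}")
    if before:
        parts.append(f"before:{before}")
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL for the query, with the query URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query("bbc.co.uk", title_phrase="VOTE2001", before="2002-01-01")
print(query)
print(search_url(query))
```

Keeping the query builder separate from the URL builder makes it easy to paste the raw query into a browser while you refine it, then generate URLs only once the pattern is settled.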

Old BBC Indexed Site & Copy

Additionally, if your underlying CMS is flexible enough, you may be able to use the URL/content patterns identified using this process to export and redirect all relevant content instead. Just beware that if you use a hard 301 redirect, and you leave link references to the content elsewhere on the site that are likely to be visited by real people, then you would be creating a poor user experience by forcing a redirect on them.
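As a sketch of turning an identified URL pattern into a mapping of old pages to replacement pages, something like the following Python could be a starting point. The `/vote2001/<section>/...` pattern and the `/news/politics/<section>/` targets are hypothetical; substitute whatever your own search results reveal.

```python
import re

# Hypothetical pattern for the old content, capturing the section name.
OLD_PATTERN = re.compile(r"^/vote2001/(?P<section>[a-z_]+)/")


def replacement_for(path):
    """Return the assumed current section page for an old path, or None."""
    match = OLD_PATTERN.match(path)
    if match is None:
        return None
    return f"/news/politics/{match.group('section')}/"


def build_mapping(paths):
    """Map each old path that matches the pattern to its replacement."""
    mapping = {}
    for path in paths:
        target = replacement_for(path)
        if target is not None:
            mapping[path] = target
    return mapping


# Made-up paths standing in for a CMS export.
exported_paths = [
    "/vote2001/budget/story_123.stm",
    "/vote2001/health/story_456.stm",
    "/sport/football/",
]

for old, new in build_mapping(exported_paths).items():
    print(old, "->", new)
```

The mapping itself is neutral: it can feed a set of 301 rules or a set of canonical targets, so the choice between the two can be made separately.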

This is why I favour a canonical solution, as only the search engines are “redirected” — this means that for the edge cases where the old content is still relevant to users arriving from elsewhere on the site (like your own internal search function), the content is still accessible. Crucially, your old content is no longer pulling SEO value away from the rest of the domain.
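For the canonical approach, the old page stays live and simply declares its preferred replacement in its `<head>`. A minimal Python sketch of generating that element follows. The helper name and the `example.com` URL are hypothetical, and the check for an absolute URL reflects my understanding of common guidance rather than anything stated in this article.

```python
from html import escape
from urllib.parse import urlsplit


def canonical_tag(target_url):
    """Return a <link rel="canonical"> element pointing at target_url.

    Raises ValueError for relative URLs, on the assumption that canonical
    targets should be given as absolute URLs.
    """
    parts = urlsplit(target_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"canonical target must be an absolute URL: {target_url!r}")
    # Escape the URL so characters such as & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(target_url, quote=True)}">'


print(canonical_tag("https://example.com/news/politics/budget/"))
```

Because the old URL still returns its own content, visitors following internal links or site search see the page as before; only search engines are asked to consolidate on the target.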

For a site as large as the BBC’s, with a history as long, this technique can dramatically improve rankings across the board for more relevant terms, to the benefit of tomorrow’s fresh content.

(Stock image via Shutterstock.com. Used under license.)

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.



About The Author: Chris Liversidge has over twelve years’ web development experience and is the founder of QueryClick Search Marketing, a UK agency specialising in SEO, PPC and Conversion Rate Optimisation strategies that deliver industry-leading ROI.

  • http://www.seo-theory.com/ Michael Martinez

    “Most SEOs that have been in the industry for a while would largely agree that overall domain SEO performance can be generalised to be a balance between the total backlink profile of the site (its ‘Authority’) and the total number of pages indexed on the domain (the total ‘Domain Sprawl’).”

    I stopped reading at this ridiculous statement.

  • http://www.highrankings.com/ Jill Whalen

    Correct, Michael. Most SEOs would NOT agree to that statement.

  • Henley Wing

    Hey Jill, what do you find wrong with that statement?

  • Daniel Freedman

    With reference to the statement in red on the screen shot of the BBC site, is the author actually suggesting that content that is unlikely to be adding any traffic is “poor value?”

    If so, the author fundamentally misunderstands the BBC’s mission and values and sees the world with SEO blinders. A page that attracts even a handful of views long after it was published can be immensely valuable to its users.

    Pursued to its logical extreme, this mistaken line of reasoning would have the BBC purge dated or unpopular content.

    But, of course, this is the LAST thing the BBC should do.

    The BBC is not selling widgets.

  • http://uk.queryclick.com/ Chris Liversidge

    It is a ‘massive generalisation’ Michael, as I stated in the next line. I would stand by it though – I’d be interested to know what you think is incorrect about it. If you were to try to generalise SEO in a single sentence from your experience, how would you go about it?

  • http://www.seo-theory.com/ Michael Martinez

    There is really nothing in the sentence that is supportable. How I would describe SEO isn’t relevant to the fact that you wrote a simply incredible statement of fact that I find to be completely unbelievable.

  • http://uk.queryclick.com/ Chris Liversidge

    Hi Daniel, I agree the BBC isn’t selling widgets, but I think they would be better served migrating their *ranking* ability for terms related to ‘Labour budget’ or ‘Budget cuts’ to a page that still directly relates to those terms but is not a) out of date (content refers to 2001), and b) off-brand.

    Even for the BBC, branding and, in particular, usability and accessibility are important.

    There are also legal questions to consider, such as the fact these pages do not trigger the cookie usage alert that’s now obligatory for UK sites.
    By only following the canonical redirections, this content can then still be surfaced using internal search tools or navigation, preserving it for use by those genuinely researching 2001 election policy reporting.

  • Daniel Freedman

    Hi, Chris. Thanks for the reply. I understand what you’re saying. But we’ll have to agree to disagree. I can assure you most BBC journalists, viewers and listeners would either cringe at your suggestions, or think that following them would be “off brand.”

  • http://uk.queryclick.com/ Chris Liversidge

    Well the only impact to those people would be any general initial Google search landing them on a more recently updated BBC page (and more frequently on the BBC in general for general search terms) – they’d still be able to access the content via the BBC site itself.

    I don’t think that’s a bad searcher experience or damaging to the BBC’s brand.

 
