Google’s Additional Discovery Method: RSS and Atom Feeds

For years, Google’s discovery of web pages was based solely on links. If a page had no links to it, Googlebot had no way of knowing about it and therefore would never index it. Along the way, Google provided an option for submitting individual pages, but that wasn’t really a viable option for owners of large sites. In 2005, Google launched XML Sitemaps, a much more scalable way for site owners to let Google know about pages on their site that Googlebot might not otherwise discover through links. Today, a Google Webmaster Central blog post discusses another way Googlebot may discover pages: feeds. Google says that using RSS and Atom feeds to discover pages helps it learn about new content quickly.

New content is key for Google since freshness is a vital component of relevance for some queries. Conventional wisdom is that it’s not all that useful to ensure Google knows about pages of your site if they don’t have links to them, because without links, Google won’t see them as valuable. But current ranking is much more complicated than the original PageRank formula describes, and new content with no links may very well trump content with an abundance of links if it makes sense for the query.

Of course, site owners have always been able to submit RSS and Atom feeds as Sitemaps, but this post describes using these feeds even if the site owner hasn’t submitted them via the Sitemap system. Instead, Google is scanning other feed submission systems, such as Google Reader and ping services, for the feeds.

It’s unclear from the post whether the feeds are being used solely for discovery or whether the content from the feeds is being used in place of crawling as well. The title of the post references “discovery,” but the post itself notes that Google is able to “get these new pages into our index more quickly than traditional crawling methods” and to directly crawl feeds. If Google is using the feeds in place of crawling, this would be another argument in favor of full rather than partial feeds: you’d get more of a page’s content indexed more quickly. Google Blogsearch initially crawled feed content rather than the actual pages, which led to partial indexing in Blogsearch, but this changed late last year.
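To make the discovery-versus-content distinction concrete, here is a minimal sketch of how any crawler might read an Atom feed: it pulls each entry’s URL (discovery) and notes whether the entry carries a full `content` element or only a `summary` (a partial feed). The sample feed, the `example.com` URLs, and the function name `discover_entries` are assumptions made up for illustration; nothing here describes how Googlebot actually processes feeds.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 feeds.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed with one full entry and one partial entry.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <entry>
    <title>Full post</title>
    <link rel="alternate" href="http://example.com/full-post"/>
    <content type="html">&lt;p&gt;Entire article body.&lt;/p&gt;</content>
  </entry>
  <entry>
    <title>Partial post</title>
    <link rel="alternate" href="http://example.com/partial-post"/>
    <summary>First sentence only...</summary>
  </entry>
</feed>"""


def discover_entries(feed_xml):
    """Return (url, has_full_content) for each entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.findall("atom:entry", ATOM_NS):
        url = None
        for link in entry.findall("atom:link", ATOM_NS):
            # A missing rel attribute defaults to "alternate" in Atom.
            if link.get("rel", "alternate") == "alternate":
                url = link.get("href")
                break
        has_full = entry.find("atom:content", ATOM_NS) is not None
        results.append((url, has_full))
    return results


if __name__ == "__main__":
    for url, has_full in discover_entries(SAMPLE_FEED):
        print(url, "full" if has_full else "partial")
```

With a partial feed, a reader of the feed alone learns the URL but sees only the summary text, which is why the full-versus-partial question matters if feed content is indexed directly.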

The post notes that in order for Google to use a feed as a discovery method, the feed must not be blocked by robots.txt.
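Site owners who want to confirm that a feed isn’t blocked can test their robots.txt rules locally. The sketch below uses Python’s standard `urllib.robotparser` module; the robots.txt contents, the feed URL, and the helper name `feed_is_crawlable` are hypothetical, and the standard-library parser may not match Googlebot’s own rule matching in every edge case.

```python
from urllib.robotparser import RobotFileParser


def feed_is_crawlable(robots_txt, feed_url, user_agent="Googlebot"):
    """Check whether robots.txt rules allow user_agent to fetch feed_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, feed_url)


# Hypothetical rules: the first set blocks the feed directory.
BLOCKING_RULES = """User-agent: *
Disallow: /feed/
"""

# Hypothetical rules: the feed directory is not mentioned, so it is allowed.
ALLOWING_RULES = """User-agent: *
Disallow: /private/
"""

FEED_URL = "http://example.com/feed/atom.xml"

if __name__ == "__main__":
    print(feed_is_crawlable(BLOCKING_RULES, FEED_URL))
    print(feed_is_crawlable(ALLOWING_RULES, FEED_URL))
```

To check a live site instead of a pasted string, `RobotFileParser` also offers `set_url()` and `read()` to fetch the file over the network.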

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

About The Author: Vanessa Fox is a Contributing Editor at Search Engine Land. She built Google Webmaster Central and went on to found software and consulting company Nine By Blue and create Blueprint Search Analytics, which she later sold. Her book, Marketing in the Age of Google (updated edition, May 2012), provides a foundation for incorporating search strategy into organizations of all levels. Follow her on Twitter at @vanessafox.