Google’s Additional Discovery Method: RSS and Atom Feeds

For years, Google discovered web pages solely through links. If a page had no links pointing to it, Googlebot had no way of knowing about it and therefore would never index it. Along the way, Google provided an option for submitting individual pages, but that wasn’t a viable option for owners of large sites. In 2005, Google launched XML Sitemaps, a much more scalable way for site owners to let Google know about pages on their site that Googlebot might not otherwise discover through links. Today, a Google Webmaster Central blog post discusses another way Googlebot may discover pages: feeds. According to the post, using RSS and Atom feeds to discover pages helps Google learn about new content quickly.

New content is key for Google since freshness is a vital component of relevance for some queries. Conventional wisdom holds that it’s not all that useful to ensure Google knows about pages of your site if they don’t have links to them, because without links, Google won’t see them as valuable. But current ranking is much more complicated than the original PageRank formula describes, and new content with no links may very well trump content with an abundance of links if it makes sense for the query.

Of course, site owners have always been able to submit RSS and Atom feeds as Sitemaps, but this post describes using these feeds even if the site owner hasn’t submitted them via the Sitemap system. Instead, Google finds the feeds by scanning other feed submission systems, such as Google Reader and ping services.

It’s unclear from the post whether the feeds are being used solely for discovery or whether the content from the feeds is being used in place of crawling as well. The title of the post references “discovery,” but the post itself notes that Google is able to “get these new pages into our index more quickly than traditional crawling methods” and to directly crawl feeds. If Google is using the feeds in place of crawling, this would be another argument in favor of full rather than partial feeds: you’d get more of a page’s content indexed more quickly. Google Blogsearch initially crawled feed content rather than the actual pages, which led to partial indexing in Blogsearch, but this changed late last year.
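To make the distinction between discovery and content concrete, here is a minimal Python sketch of what a feed exposes to any crawler. It is an illustration only, not a description of how Google processes feeds. The sample feed, the example.com URLs, and the extract_items helper are all hypothetical. The "url" values are what a crawler could use for discovery, and the "content" values are what it could index without fetching the page, which is where full versus partial feeds would matter.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used only for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/first-post</link>
      <description>Full text of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second-post</link>
      <description>Summary only...</description>
    </item>
  </channel>
</rss>"""


def extract_items(rss_text):
    """Return one dict per <item>, holding its link and description text."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # The link is enough for discovery: it names a page to crawl.
            "url": item.findtext("link", default="").strip(),
            # The description is all the content the feed itself carries.
            "content": item.findtext("description", default="").strip(),
        })
    return items


if __name__ == "__main__":
    for entry in extract_items(SAMPLE_RSS):
        print(entry["url"], "|", entry["content"])
```

In this sketch, a partial feed would leave the crawler with only a summary in "content", so the page itself would still need to be fetched to index the full text.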

The post notes that in order for Google to use a feed as a discovery method, the feed must not be blocked by robots.txt.
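Site owners who want to confirm that their feed is not blocked can test the feed URL against their robots.txt rules. The following is a minimal sketch using Python’s standard urllib.robotparser module. The robots.txt contents, the example.com feed URL, and the feed_is_crawlable helper are assumptions made for illustration, and Google’s own robots.txt handling may differ in details from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt files used only for illustration.
ROBOTS_BLOCKING = """User-agent: *
Disallow: /feed/
"""

ROBOTS_OPEN = """User-agent: *
Disallow: /private/
"""


def feed_is_crawlable(robots_txt, feed_url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows user_agent to fetch feed_url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, feed_url)


if __name__ == "__main__":
    feed = "http://example.com/feed/rss.xml"
    print("blocking rules:", feed_is_crawlable(ROBOTS_BLOCKING, feed))
    print("open rules:", feed_is_crawlable(ROBOTS_OPEN, feed))
```

To check a live site instead, the same parser can load the file with set_url() followed by read() before calling can_fetch().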



About The Author: Vanessa Fox is a Contributing Editor at Search Engine Land. She founded ninebyblue.com and Blueprint Search Analytics. Her book, Marketing in the Age of Google (updated edition, May 2012), provides a blueprint for incorporating search strategy into organizations of all levels. Follow her on Twitter at @vanessafox.

