Google’s Additional Discovery Method: RSS and Atom Feeds


For years, Google’s discovery of web pages was based solely on links. If a page had no links to it, Googlebot had no way of knowing about it and therefore would never index it. Along the way, Google provided an option for submitting individual pages, but that wasn’t a viable option for owners of large sites. In 2005, Google launched XML Sitemaps, a much more scalable way for site owners to let Google know about pages of their site that Googlebot might not otherwise discover through links. Today, a Google Webmaster Central blog post discusses another way Googlebot may discover pages: feeds. The post says that using RSS and Atom feeds to discover pages helps Google learn about new content quickly.

New content is key for Google since freshness is a vital component of relevance for some queries. Conventional wisdom is that it’s not all that useful to ensure Google knows about pages of your site if they don’t have links to them, because without links, Google won’t see them as valuable. But current ranking is much more complicated than the original PageRank formula describes, and new content with no links may very well trump content with an abundance of links if it makes sense for the query.

Of course, site owners have always been able to submit RSS and Atom feeds as Sitemaps, but this post describes using these feeds even if the site owner hasn’t submitted them via the Sitemap system. Instead, Google is scanning other feed submission systems, such as Google Reader and ping services, for the feeds.
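To make the idea of feed-based discovery concrete, here is a minimal sketch of how any crawler could pull page URLs out of an RSS 2.0 or Atom feed. It is not Google’s implementation, which the post does not describe. The function name `extract_page_urls` and the sample feeds are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 elements.
ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feeds, for illustration only.
RSS_SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>A new page</title>
      <link>http://example.com/new-page</link>
    </item>
  </channel>
</rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <entry>
    <title>Another new page</title>
    <link rel="alternate" href="http://example.com/another-page"/>
  </entry>
</feed>"""


def extract_page_urls(feed_xml):
    """Return the page URLs listed in an RSS 2.0 or Atom feed document."""
    root = ET.fromstring(feed_xml)
    urls = []

    # RSS 2.0 puts the page URL in the text of each <item><link>.
    for item in root.iter("item"):
        link = item.findtext("link")
        if link:
            urls.append(link.strip())

    # Atom puts the page URL in the href attribute of <entry><link>.
    # A missing rel attribute means rel="alternate".
    for entry in root.iter(ATOM_NS + "entry"):
        for link in entry.findall(ATOM_NS + "link"):
            href = link.get("href")
            if href and link.get("rel", "alternate") == "alternate":
                urls.append(href)

    return urls


print(extract_page_urls(RSS_SAMPLE))
print(extract_page_urls(ATOM_SAMPLE))
```

Each URL found this way is a page the crawler can queue even if nothing else links to it yet, which is the sense in which a feed acts as a discovery source.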

It’s unclear from the post whether the feeds are being used solely for discovery or whether the content from the feeds is being used in place of crawling as well. The title of the post references “discovery,” but the post itself notes that Google is able to “get these new pages into our index more quickly than traditional crawling methods” and to directly crawl feeds. If Google is using the feeds in place of crawling, this would be another argument in favor of full rather than partial feeds: you’d get more of a page’s content indexed more quickly. Google Blogsearch initially crawled feed content rather than the actual pages, which led to partial indexing in Blogsearch, but this changed late last year.

The post notes that in order for Google to use a feed as a discovery method, the feed must not be blocked by robots.txt.
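Site owners who want to confirm that their feed isn’t blocked can test the feed URL against their robots.txt rules. The sketch below uses Python’s standard `urllib.robotparser` module. The helper name `feed_is_crawlable`, the sample rules, and the example.com URLs are assumptions for illustration, and Python’s parser may not match every detail of how Googlebot itself interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
BLOCKING_RULES = """User-agent: *
Disallow: /feed/
"""

OPEN_RULES = """User-agent: *
Disallow: /private/
"""


def feed_is_crawlable(robots_txt, feed_url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows user_agent to fetch feed_url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, feed_url)


print(feed_is_crawlable(BLOCKING_RULES, "http://example.com/feed/"))
print(feed_is_crawlable(OPEN_RULES, "http://example.com/feed/"))
```

With the first set of rules the feed is disallowed, so it could not serve as a discovery source. With the second set it is allowed.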



Vanessa Fox is a Contributing Editor at Search Engine Land. Called a “cyberspace visionary” by Seattle Business Monthly, she is an expert in understanding customer acquisition from organic search. She shares her perspective on how this impacts marketing and user experience at ninebyblue.com and provides authoritative search-friendly design patterns for developers at janeandrobot.com.



