Snowden Petition Blocked From Google? Like All Petitions, It Won’t Be When It Gets Enough Signatures

Search for “edward snowden petition” on Google to find the petition filed through the White House petitions site, and you’ll see something odd. The petition has no description, because the White House won’t let Google crawl the page. But it’s not a move against Snowden, as some might think. It’s part of how the petitions site has worked with search engines for some time.

Here’s how the listing looks:

[Screenshot of the Google listing: petition-pardon-edward-snowden]

Notice the description: “A description for this result is not available because of the site’s robots.txt – learn more.”

iAcquire noted the oddity this week: the page is listed, but with this odd description. The description explains that the site has blocked Google and other search engines, such as Bing, from crawling the page.

How can a page that’s blocked still be listed? This is what’s known as a “link only” listing, where Google guesses what the page is about from other pages linking to it and uses that to form a title. But because the page is blocked, Google can’t access its content, so it can’t generate a description or gather any other information from the page itself.
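To make the idea of a “link only” listing concrete, here is a toy sketch in Python. It is not how Google actually builds listings; the function name `link_only_listing`, the example URL, and the anchor texts are all made up for illustration. It only shows the principle: the title comes from what other pages say when they link to the blocked page, and the description stays empty because the page itself is never fetched.

```python
from collections import Counter
from typing import Dict, List, Optional


def link_only_listing(url: str, inbound_anchor_texts: List[str]) -> Dict[str, Optional[str]]:
    """Build a toy listing for a blocked URL from other pages' anchor text only.

    Hypothetical helper: the blocked page is never requested, so there is
    nothing to build a description from.
    """
    # Count the non-empty anchor texts that other pages use when linking here.
    counts = Counter(text.strip() for text in inbound_anchor_texts if text.strip())
    # Use the most common anchor text as the guessed title; fall back to the URL.
    title = counts.most_common(1)[0][0] if counts else url
    return {"url": url, "title": title, "description": None}


# Made-up anchor texts from pages that link to the blocked petition.
anchors = ["Pardon Edward Snowden", "Pardon Edward Snowden", "the Snowden petition"]
listing = link_only_listing(
    "https://petitions.whitehouse.gov/petition/example-petition",  # hypothetical URL
    anchors,
)
print(listing["title"])
print(listing["description"])
```

With no description available, a search engine has to show something else in that slot, which is where the notice about robots.txt comes in.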

In fact, all new petitions on the White House site are blocked like this, and have been since 2011, as shown by this copy of the robots.txt file via the Wayback Machine.

Why would this happen? The White House is blocking petitions that are below a certain threshold. Pages that gain enough signatures get an official response, and that also means they get a new page in an area of the site (the responses area) that isn’t blocked.
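A rough sketch of how this kind of split looks to a crawler, using Python’s standard `urllib.robotparser`. The rules below are an assumption written for illustration, not the actual contents of the White House robots.txt file, and the two example URLs are hypothetical as well.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, NOT the real White House robots.txt:
# block individual petition pages and leave everything else,
# including a responses area, open to crawlers.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /petition/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(HYPOTHETICAL_ROBOTS_TXT)

# Hypothetical URLs: one petition page, one response page.
petition_allowed = parser.can_fetch(
    "Googlebot", "https://petitions.whitehouse.gov/petition/example-petition"
)
response_allowed = parser.can_fetch(
    "Googlebot", "https://petitions.whitehouse.gov/responses/example-response"
)

print(petition_allowed)   # False: the path matches the Disallow rule
print(response_allowed)   # True: no rule blocks this path
```

A crawler that honors robots.txt would skip the first URL and fetch the second, which matches the behavior described above: the petition page gets no description, while a response page can be crawled normally.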

The White House has a page explaining the threshold needed, though it doesn’t explain the search engine blocking. However, our understanding is that this is how things work — pages below a threshold of signatures don’t get indexed, mainly to help prevent people who might try to use the White House site to generate spam.

Get enough signatures, and you’re guaranteed a response, and the petition is also deemed Google-worthy. Snowden’s petition actually has more than the required number of signatures, so it should get an official response in the near future, and one that will be fully indexed by Google.

About The Author: Barry Schwartz is Search Engine Land's News Editor and owns RustyBrick, a NY-based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry's personal blog is named Cartoon Barry.

  • http://www.acsius.com/ Arun Singh

    I am wondering why they bother blocking the summary of a page we can all read…

  • http://www.archiewatt.com/ Archie Watt

    Don’t know if I’m missing something, but it looks to me like all petitions are blocked like that regardless of how many signatures they have – look at https://petitions.whitehouse.gov/petitions/popular/0/2/0 – it’s the same with all of those, and several are well over the signature threshold. I can’t see a single petition that *isn’t* blocked by robots.txt.

  • Graham Ginsberg

    If the page or pages are blocking Google, a good move indeed, what gives Google the right to index the page at all and even show it in search results?

    Seems like Google is breaking in and entering websites without permission. Isn’t that a crime in America?

  • http://www.ninebyblue.com Vanessa Fox

    The petitions are blocked. Once a petition reaches a certain number of signatures, a response page for that petition is created that lives under the /responses folder (that contains info on the petition and the response and links back to the original petition). That folder is not blocked.

  • http://www.harisbacic.com/ Haris

    Umm no.

    1. Using a robots.txt file does not legally prohibit search engine bots from crawling your website. It is up to each search engine bot to respect robots.txt rules, but they are absolutely not legally required to do so.

    2. Did you read this part: “Google can guess at what the page is about from other pages linking to it to form a title”? Google did not index the page by visiting it. They did so by creating a guess of what the page is about from other websites linking to it.

  • StevenTMahlberg

    Maybe it’s just because Google is a tool of the Government and they don’t want to deal with stupid petitions.
