• TmWe

    “Google has said it may disregard the noindex command if other signals indicate strongly enough that the page should be indexed”? You sure about that?

  • http://www.archology.com/ Jenny Halasz

    Yes, unfortunately. I have seen it in action. It’s similar to Google’s policy of breaking canonicals if they don’t make sense. In an extreme example, let’s say you had a link at the footer of every page to your service areas, and then your service areas page had a noindex tag on it, or a canonical tag to another page of the site. If Google thinks you made a mistake, they may choose to disregard your command. Maile Ohye from Google has confirmed that this is a possibility at several of the SMX shows where she’s spoken.

  • http://www.reginaldchan.net/ Reginald Chan Xin Yon

    Hi Jenny,

    Well written. First of all, I love both the quotes you’ve written. Okay, that’s huge :) Took me 3 reads to really get the meaning. Haha!

    I didn’t know that Google won’t crawl every indexed page and vice versa. So, that’s new to me.

    Thanks for that!

    Reginald

  • TmWe

    Not buying it, sorry. Lots of people link sitewide to pages that they don’t want indexed, and surely there are no current live examples of Google ignoring that. If you have seen a page indexed with a noindex tag, then more than likely Google hasn’t recrawled the page to see the noindex tag since it was added.

  • tedives

    Great explanations Jenny.

    My thinking is, Google disregards nofollow and flows PageRank anyway through those links. Nofollow mostly exists (IMO) to create a FUD factor to encourage people not to spam Wikipedia, blogs, and so on – and for people to mark paid links and avoid penalties from them, the only other real use of it by Google.

    I guess I am a nofollow heretic in the way you appear to be a 302 one (that was a new one for me, I had not heard it ;-) ). Google made conflicting statements on this 3-4 years ago, going back and forth on whether they “respect” nofollow, and in 2011 SEOMoz’s correlation study showed no difference between followed and nofollowed links from a ranking correlation perspective. It looks like they didn’t check this in the 2013 one, unfortunately.

    Also, regarding crawling – check this Google paper out “Sitemaps: Above and Beyond the Crawl of Duty” – was not widely covered at the time but is a very interesting read:
    http://www2009.org/proceedings/pdf/p991.pdf

  • http://www.archology.com/ Jenny Halasz

    So what we might be disagreeing about here is an issue of nomenclature. I’m using “indexed” in the article to indicate that they’ve assigned some of the words in the document to that page in the database. I think you’re using it to mean that the page will show in the SERPs.

    According to Google’s John Mueller, they still include the page in their system somewhere with the noindex command attributed to it. They also may “index” the page slightly if there are other signals (such as inbound links) pointing to it. They will not show it in the SERPs as long as it has a noindex tag on it.

    You can test it yourself if you have a page that has other signals (such as external links pointing to it) that has a unique snippet of text on it. Do a site: search for the domain with the unique snippet. You will most likely see the page come up.

    Again, note that this is different than the regular SERPs. This page isn’t going to rank for a regular old keyword search. But it’s important to mention because if you have someone savvy enough to search with the site: command, they can find this content. Usually it’s a non-issue since the noindexed content is often publicly available. But if you’re using “noindex” to protect something behind a login barrier and not using SSL too, your information can be found.

    Just an important distinction that could be an article all its own – which is why I linked to the other article I wrote on the topic.

    Hope that helps clear up any confusion.

  • http://www.archology.com/ Jenny Halasz

    I just realized you are absolutely right. I don’t know if I made the typo or if it got changed during the editing process, but the line above should read: “However, I have seen cases where Google has included a noindexed page in their publicly available records” I am asking the team at SEL to fix this.

  • lowlevel

    Hi Jenny, thanks for the citation. :-)

    About the willingness to ignore the commands, I remember that Maile Ohye was talking exclusively about the rel=canonical directive, not about the noindex directive.

    The only situation when a noindex directive can be ignored is when the page with the noindex also contains a +1 button.

    In all the other cases, the noindex is considered a very strict directive and I don’t remember any googler stating that it is treated like a suggestion.

  • http://www.archology.com/ Jenny Halasz

    Thanks for the kind words! This was a complicated but fun article to write. I have heard from many others that they think Google disregards nofollow, and it’s an interesting theory. Since that particular command is almost exclusive to Google (although Bing may use it too – they don’t put emphasis on it), I generally figure it doesn’t really matter how they use it, just that at some point, they use it as a measurement of something. Whether they use it exactly as intended or not would be an interesting future post – that I’m not really qualified to write!

    The 302 thing was new for me too – I really thought that was almost equivalent to 404 in terms of link value – but I was schooled on Google’s official response to that one too during the writing of this article. I’ve fallen on my sword several times in the last week… I figure it’s good for me. ;-)

    I am looking forward to reading that paper. Have you ever seen the paper on canonicals that Maile wrote? It’s also fascinating: http://tools.ietf.org/html/draft-ohye-canonical-link-relation-04

  • tedives

    An RFC thing? Had not seen it. Usually those are painful to read, but that one is actually interesting, thanks. I like conference papers and things like that RFC because Bill Slawski has pretty much got a monopoly on knowledge from patents; any patent stuff I run across, I invariably find out he’s already written on it. Once in a while you can come across something he missed because it was published in an obscure paper instead. Thanks!

  • http://www.seoagencysydney.com.au/ virginia

    Love the pic that shows Google doesn’t index my page A, but it goes on to page B. Guess that makes sense.

  • TmWe

    “The only situation when a noindex directive can be ignored is when the page with the noindex contains also a +1 button”
    Are you saying that a page with a +1 button will be indexed regardless of noindex directives ?

    I am aware that Google can crawl a page regardless of robots.txt if a +1 button is activated, though robots.txt is not noindex.

  • http://www.archology.com/ Jenny Halasz

    Unless you have “nofollow” on page A as well. :)

  • http://www.archology.com/ Jenny Halasz

    Just to clarify… Google can still “index” the page (whatever they deem that to be) but they won’t include it in their search results, right? Because I have seen instances where a specific string in the site: command will return a result, but it doesn’t show up in the general SERPs. What I was trying to explain to people is that noindex is not an absolute – a savvy searcher could still find your page if they knew what to look for.

    And apologies about the mis-attribution to Maile. I had written down from my notes from SMX East 2012 that she had said they may collect information from a page that is noindexed if there are enough signals pointing them to that page… again, just that they would not show it in General SERPs.

  • Hammad

    Are you sure that a 302 transfers PR? Have you done an experiment to verify it?

  • David Viniker

    Positioning of a webpage on Google results pages for a keyword depends on Relevance and Reputation (Authority/Popularity).

    “PageRank, despite what many may say, is a measure of the quantity and quality of links. It has no connection to the words on a page.” The fact that PageRank is Google’s non-keyword specific indication of webpage reputation has been confirmed by Google’s head of anti-spam unit, Matt Cutts – http://www.mattcutts.com/blog/seo-for-bloggers/

    “Many SEOs believe that there are two elements of PageRank: a domain-level and a page-level PageRank…. While I believe that Google likely uses some element of domain authority, this has never been confirmed by Google.” Surely the fact that pages on high domain reputation websites such as Wikipedia or CNN (both have HomePage PageRanks of 9) invariably rise to the top of Google SERPs for targeted keywords is proof enough. HomePage PageRank is the only indication available of Website or Domain Reputation as reported by Google.

    Google does not reveal all the backlinks to websites in its index. This means that any metric of Domain Authority that is not PageRank based cannot accurately reflect Google’s assessment of the importance of a website. As Google performs 90% of searches globally, it is the search engine that matters.

  • http://www.archology.com/ Jenny Halasz

    I always thought it didn’t, but John Mueller from Google said they will pass PR from 302 redirects as long as the destination page is available (i.e. not 404). Note, I’m only reporting what Google said, I’m not generally an algoholic.

  • http://www.archology.com/ Jenny Halasz

    Toolbar PageRank is for “entertainment purposes only” and has not been updated in over 6 months: http://www.seroundtable.com/google-pagerank-no-update-17276.html. For readers interested in market share data (Google is well below 90% globally – depending on who you ask): http://searchengineland.com/google-worlds-most-popular-search-engine-148089.

  • David Viniker

    Hi Jenny, and thank you.

    I believe that anyone who has watched the GoogleWebmasterHelp video on Advertorials by Matt Cutts, published May 29th 2013, would appreciate the continuing importance of PageRank in the Google algorithm.

    I respectfully cannot agree that the Google Toolbar is for “entertainment purposes only”. The toolbar is usually updated at three-month intervals, but we are now past seven months; there was a nine-month gap in 2010. Why this current delay? There have been major changes to Panda and the arrival of Penguin 2.0 in the interim. It has been suggested that the delay is to prevent analysts and optimizers from gleaning insight into how these have affected their websites’ and webpages’ PageRanks.

    In a comment to the Post “Is PageRank Finally Dead? It Seems To Be, At Least In The Google Toolbar”, the hypothesis was put forward that the Total Reputation of a webpage is the sum of the PageRank of the Page + the PageRank of the HomePage of the website + a boost if the competing webpage is the HomePage. Averaging the Total Page Reputations of the webpages on the top Google results page for a keyword provides a good indication of keyword difficulty.
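
    The hypothesis above can be written out as a short Python sketch. The function names are illustrative, and the size of the HomePage boost is an assumption (the comment does not state it), so it is left as a parameter:

```python
from typing import Iterable, Tuple


def total_page_reputation(page_pr: float, homepage_pr: float,
                          is_homepage: bool,
                          homepage_boost: float = 1.0) -> float:
    """Page PR + HomePage PR + a boost when the page is the HomePage.

    homepage_boost is a hypothetical value; the source does not give one.
    """
    boost = homepage_boost if is_homepage else 0.0
    return page_pr + homepage_pr + boost


def keyword_difficulty(results: Iterable[Tuple[float, float, bool]]) -> float:
    """Average the Total Page Reputation over the top results for a keyword.

    Each result is (page_pr, homepage_pr, is_homepage).
    """
    scores = [total_page_reputation(*result) for result in results]
    return sum(scores) / len(scores)
```

    For instance, an inner page with PR 3 on a site whose HomePage has PR 6 would score 9 under these assumptions.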

    Has the reliability of the above as a keyword difficulty technique been affected by the delay in PageRank update? An ongoing study of 5,000 keywords commencing in 2010 and first reported in EzineArticles suggests that, despite the delay, the indicator has been remarkably stable. http://www.pagerank-explained.com/2nd-August-2013-5K-Keyword-Study-TPR-6-analyses-2010-2013.pdf

    With regard to the 90% global share of the market, my source is http://karmasnack.com/about/search-engine-market-share/ and the figure was measured from reports from a number of sources including Nielsen-Net, Alexa and seoMoz.

  • Brandon Zienowicz

    Jenny, do you have the source of the John Mueller quote to share? I think the interesting thing to think about is just how much PR is passed. Is it equivalent to a 301 or not? Here is an interesting article (http://bit.ly/1a1Wzg0) by Geoff Kenyon from Distilled that covers an experiment he conducted to test this. His results show that less link equity is passed through a 302 than a 301.

  • http://www.archology.com/ Jenny Halasz
  • Hammad

    Can you send me the reference? This is something new for me. I talked to someone from Moz too, and their belief is that a 302 does not pass PageRank. During a recent audit of one of the sites, I found that the PR of the site was lost [I had the historic readings]. The only thing I could find was wrong use of redirects, mostly 302s.
    I will appreciate your help on it.

  • Brandon Zienowicz

    Thanks Jenny. For anyone interested in hearing it for yourself, it occurs at the 46:22 mark, here: http://youtu.be/6yGyKG85_e0?t=46m22s

  • http://remkovanderzwaag.nl/ Remko van der Zwaag

    This 302-thing is interesting but also confusing. I think it makes no sense (from a search engine’s point of view) to pass all pagerank to a page which is meant to be *temporary* by definition.

    Maybe John meant that a 302 passes *some* pagerank?

    But let’s say a 302 does pass all pagerank: I still think it’s important to use 301 redirects when it’s appropriate, since 302 redirects can leave a nasty trace in the SERPs where titles/meta descriptions get mixed up.

    Also, from a developer’s point of view, it’s simply not correct to use 302 redirects for permanent changes; I’m sure you’ll get in trouble sooner or later.

  • http://www.archology.com/ Jenny Halasz

    Thanks for the sources, I will be sure to check them out. I think everything you’re saying has a lot of validity, but I’m only reporting what Google has actually said. While there’s no doubt that PR is still a major part of the algorithm, the toolbar PR has been panned by Google as a data source. And my statement “for entertainment purposes only” comes directly from Google: http://searchenginewatch.com/article/2063353/Google-Toolbar-PageRank-Display-Just-For-Entertainment. Shortly after this article was written, they updated the language on the toolbar “about” page as well, although I don’t know if it’s still there. I haven’t used it in years.

  • lowlevel

    Hi TmWe. Yes, a page with a +1 button could be indexed regardless of a noindex directive.

    Specifically, they say:

    “+1 is a public action, so don’t add the button to any page you don’t want visible on the web. When you add the +1 button to a page, Google assumes that you want that page to be publicly available and visible in Google Search results. As a result, we may fetch and show that page even if it is disallowed in robots.txt or includes a meta noindex tag.”

    Source:
    https://support.google.com/webmasters/answer/1634172?hl=en

  • lowlevel

    Hi Jenny,

    > Because I have seen instances where a specific string in the site: command will return a result, but it doesn’t show up in the general SERPs.

    If you see a resource using the “site:” operator, then it means that the webmaster has not (correctly) sent a noindex directive to Googlebot. If Googlebot perceives a noindex directive, then that resource will not appear in any kind of SERP, including a “site:” SERP.

    Other search engines behave in the same way; a noindex directive is quite a strict one. The only exception is the one cited in my previous comment: Google can ignore it if the page has a +1 button.

    > What I was trying to explain to people is that noindex is not an absolute – a savvy searcher could still find your page if they knew what to look for.

    I’ve seen several webmasters pointing out that there were noindex pages in some SERP, but every time I’ve analyzed the scenario, it turned out that the noindex directive wasn’t perceivable by the spider (commonly, the webmaster asks the spider not to request the resource, preventing the spider from seeing the noindex directive written inside).
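
    The failure mode described above can be checked offline with a small Python sketch. The function name, the example URLs, and the regex-based meta tag detection are assumptions for illustration only, and this is not how Googlebot itself evaluates a page:

```python
import re
from urllib.robotparser import RobotFileParser


def crawler_can_see_noindex(robots_txt: str, url: str, html: str,
                            user_agent: str = "Googlebot") -> bool:
    """True only if the crawler may fetch the URL AND the page has a
    robots meta tag containing noindex."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # A Disallow rule stops the crawler from requesting the page at all,
    # so any noindex written inside the page is never read.
    if not parser.can_fetch(user_agent, url):
        return False
    # Simplified detection of <meta name="robots" ...>; a real check
    # would use an HTML parser and also inspect the X-Robots-Tag header.
    meta = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*>',
                     html, re.IGNORECASE)
    return bool(meta and "noindex" in meta.group(0).lower())


robots = "User-agent: *\nDisallow: /private/\n"
page = '<html><head><meta name="robots" content="noindex"></head></html>'
blocked = crawler_can_see_noindex(
    robots, "https://example.com/private/page.html", page)
visible = crawler_can_see_noindex(
    robots, "https://example.com/public/page.html", page)
```

    Here `blocked` is False because robots.txt hides the tag from the crawler, while `visible` is True because the page can be fetched and the noindex can be read.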

  • http://www.archology.com/ Jenny Halasz

    Thank you for the explanation. I could swear I’ve seen it when implemented properly, but what you’ve said makes a lot more sense.

  • TmWe

    Hi lowlevel, thanks for the link, just read that. Now, either that page is incorrect or Google’s “Controlling Crawling and Indexing*” documentation is incorrect. (* current usage of the robots.txt web-crawler control directives as well as indexing directives as they are used at Google. )

    https://developers.google.com/webmasters/control-crawl-index/

    It would seem strange that Google would publish their ‘standards’ for robots.txt and noindex but hide caveats on unrelated pages. Ever seen any examples of sites indexed with a noindex tag due to a +plus one button ?

  • lowlevel

    > It would seem strange that Google would publish their ‘standards’ for robots.txt and noindex but hide caveats on unrelated pages.

    I wouldn’t consider it extremely strange, because in the past I’ve seen discrepancies between related FAQ pages. My guess is that in the section “Controlling Crawling and Indexing” they just forgot to add the exception related to the +1 button.

    > Ever seen any examples of sites indexed with a noindex tag due to a +plus one button ?

    I’ve never checked when or how frequently a noindex/disallow directive is ignored as a consequence of a +1 button. Google says “we may”, so it seems that sometimes a directive could be ignored and sometimes it will not be ignored.

  • lowlevel

    If in the future you stumble upon a properly implemented noindex resource that can be visualized in a SERP, please let me know! It would be an interesting case and I would love to study it. :-)

  • http://www.cygnet-infotech.com/ Boni Satani

    Awesome, thanks for the insights

  • http://www.archology.com/ Jenny Halasz

    definitely!

  • TmWe

    Strange that it isn’t mentioned here either:

    https://support.google.com/webmasters/answer/1140194

    “Does +1 affect how Google crawls my site?

    When you add the +1 button to a page, Google assumes that you want that page to be publicly available and visible in Google Search results. As a result, we may fetch and show that page even if it is disallowed in robots.txt.”

    And yes, I too have seen discrepancies in Google’s documentation and so I find it is always best to seek corroborating evidence in times of doubt. You know – seeing is believing :) If you do happen across any supporting examples, I would be most interested.

  • mattcoffy

    Great work, Jenny! Allow me to share this to my circle. Your insights are fantastic and everything’s still well-written despite the complexity of the subject. Thanks!

  • digitalencore

    I have some confusion on one point. If I do a 302 from an old domain to a new one, will the penalty also be passed to it? I think so. Let me know your thoughts.

    skype id:encore.digital