The Open Directory’s Home Page Goes Missing In Google
The Open Directory’s home page appears to have gone missing from Google’s search results. For example, a search on dmoz (the Open Directory’s nickname) does not return the home page in the search results. Here is a screen capture:
Similarly, searches for open directory or open directory project also don’t list the site at the usual dmoz.org address. Yes, the screenshot shows a page at search.dmoz.org — but normally, the home page would be listed at www.dmoz.org or just dmoz.org (as you can see at Yahoo, Microsoft and Ask, for example).
Google still does have pages from dmoz.org in the index. A search for site:www.dmoz.org clearly returns results but does not appear to return the Open Directory’s home page in those results.
It is weird that internal pages such as this one come up for normal Google searches but the home page is nowhere to be found. This is even stranger in light of The Great Google Directory Ban Of Sept. 2007. Is the Open Directory somehow being included in an algorithm change that hit smaller directories?
Postscript: Matt Cutts of Google replied to the Sphinn thread explaining that http://www.dmoz.org/ was 301 redirecting (a permanent redirect) back to http://www.dmoz.org/. In effect, the server was stuck in a loop saying that the home page had permanently been changed to the home page. That obviously confused Googlebot, so after a few days of trying to find the new URL and only being given the old URL, Googlebot gave up. Here is his reply:
Hey all, I dug into this a little bit with the help of a couple crawl folks. It looks like when Googlebot tried to fetch http://www.dmoz.org/, we got a 301 redirect back to http://www.dmoz.org/ . It looks like that self-loop has been going on for several days. We were last able to fetch the root page successfully on Sept. 10th, but from that point on DMOZ was returning these 301-to-itself pages, and after a few days Googlebot gave up on trying to fetch the url. It looks like the rest of the site is fine, so I suspect that if DMOZ gets 301/redirects for their root page sorted out on their webserver, we’ll recrawl and index the page pretty quickly.
So in short, it is an easy fix for the Open Directory Project, but we learned something new. Never 301 redirect a URL to the same URL.
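For site owners who want to check for this kind of self-loop, here is a minimal Python sketch. It is only an illustration, not how Googlebot or the Open Directory's server actually works: the function names `normalize` and `is_self_redirect` are made up for this example, and the comparison rules (ignoring host case, the default port and an empty path) are assumptions about what should count as "the same URL."

```python
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def normalize(url):
    """Reduce a URL to a comparable tuple (hypothetical helper).

    Assumption: host case, the default port and an empty path do not
    make two URLs different. Fragments are ignored.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    is_default_port = (scheme == "http" and port == 80) or (
        scheme == "https" and port == 443
    )
    if port is not None and not is_default_port:
        host = f"{host}:{port}"
    path = parts.path or "/"
    return (scheme, host, path, parts.query)


def is_self_redirect(url, status, location):
    """Return True if a response redirects `url` back to itself."""
    if status not in REDIRECT_CODES or not location:
        return False
    # The Location header may be relative, so resolve it against the request URL.
    target = urljoin(url, location)
    return normalize(target) == normalize(url)


if __name__ == "__main__":
    # The case described in the postscript: a 301 from the home page to itself.
    print(is_self_redirect("http://www.dmoz.org/", 301, "http://www.dmoz.org/"))
    # A redirect from the bare domain to the www host is not a self-loop.
    print(is_self_redirect("http://dmoz.org/", 301, "http://www.dmoz.org/"))
```

The function takes the status code and Location header as plain arguments, so it can be fed from any HTTP client that is set up not to follow redirects automatically.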