SEOmoz discovered that ending your URLs with a .0 will prevent your pages from being found in the Google index. That means we need to add the .0 to the list of file extensions you should avoid using.
Stephan Spencer’s News.com article said, “According to Matt, the file extension in your URL won’t affect your rankings.” However, “the one extension you should avoid for your Web documents? .exe.” So now we have two known file extensions you should not end your URLs with: .0 and .exe.
I wonder if there is a document somewhere listing the file extensions or URL parameters that one should seriously avoid for search engines. If not, it would make for a great chart.
Postscript: Rand Fishkin of SEOmoz emailed me to tell me to add .tgz as another extension Google won’t include in the index.
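Until someone makes that chart, here is a rough sketch of how you might flag URLs on your own site that end in one of the extensions mentioned so far. The list only covers the three extensions named in this post (.0, .exe and .tgz); it is not an official or complete list from Google, and the function name and example URLs are my own.

```python
from urllib.parse import urlparse
import posixpath

# Extensions named in this post only; not an official or complete Google list.
EXTENSIONS_TO_AVOID = {".0", ".exe", ".tgz"}


def has_risky_extension(url):
    """Return True if the URL path ends in one of the extensions above."""
    # Look only at the path, so query strings and fragments are ignored.
    path = urlparse(url).path
    _, ext = posixpath.splitext(path)
    # Compare case-insensitively, so .EXE is treated the same as .exe.
    return ext.lower() in EXTENSIONS_TO_AVOID


if __name__ == "__main__":
    for url in (
        "http://example.com/blog/web-2.0",
        "http://example.com/downloads/setup.EXE",
        "http://example.com/about.html",
    ):
        print(url, has_risky_extension(url))
```

A check like this only tells you that a URL matches the list; whether Google actually skips the page is something you would still need to confirm in the index.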
Postscript #2: Matt Cutts of Google addressed this issue. Let me quote his takeaways:
- Why Google doesn’t crawl some filetype extensions (when we’ve seen good evidence that the extensions are mostly binary or otherwise not-very-indexable files).
- An easy way to use the filetype: operator, so that you can decide whether to avoid a particular filename extension yourself.
- Google is willing to revisit old decisions and test them again, which is what we’re doing with the “.0” filetype extension.
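If you want to try the filetype: check from the second takeaway on a handful of extensions, a small helper can build the search URLs for you. This is only a sketch: the function name and the optional site filter are my own additions, and it simply constructs a normal Google search URL with the filetype: operator in the query.

```python
from urllib.parse import quote_plus


def filetype_query_url(extension, site=None):
    """Build a Google search URL that uses the filetype: operator."""
    # The operator takes the extension without its leading dot.
    query = "filetype:" + extension.lstrip(".")
    if site:
        # Optionally narrow the search to one site (hypothetical usage).
        query = f"site:{site} {query}"
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for ext in (".0", ".exe", ".tgz"):
        print(filetype_query_url(ext))
```

Open each URL in a browser and see whether any results come back; few or no results for an extension is a hint that you may want to avoid it.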