Today, Google updated Webmaster Tools with new diagnostic features that alert site owners to problems Google may have extracting content from pages. They’ve also added Video Sitemaps, and they’ve added Hungarian and Czech to the 20 languages already supported. Below is more about these features and how best to use them.
Identifying Content Extraction Issues
A new section called Content Analysis lists issues that Google has had extracting content from the pages of the site, with links to the specific pages. In the help file, they say that the goal is to help identify ways to improve the site for visitors, but in many cases, these issues cause problems in the search result display as well. Fix these things, and your page might be a more compelling result for searchers to click on. The help stresses that pages that have these issues won’t experience ranking problems and that fixing them won’t improve ranking (although that’s not entirely true, as pages that Google can’t extract content from are unlikely to rank well).
Title Tag Issues
Each page on the site should have a unique title tag that includes one or two main keywords for that page and succinctly describes what the page is about. You don’t want to target every page at the same query, since Google will return at most two pages from your site on the first page of results. It’s better to give each page a particular focus so that your site can be represented in the largest possible set of queries.
- Duplicate Title Tags: Pages with identical title tags are not only at risk of being filtered by Google as duplicates, but are also missing an opportunity to be optimized for distinct keywords. Click this error to see each title that’s replicated and a list of URLs that use that title tag.
- Missing Title Tags: As mentioned above, each page should have a unique title. A missing title tag may cause Google to show the URL or anchor text to the site for the title in the search result, which may not create a compelling display for the searcher.
- Long/Short Title Tags: If Google detects that the title is too short or long to be useful, it’ll flag the page here. Google doesn’t specify the number of characters that makes a title too short or too long (although they do use an example of a two-character, one word title as being too short), but I imagine webmasters will experiment with titles to see the lengths that cause these flags to be triggered (which webmasters can already do with how titles display in the search results). A title that’s too short is strictly a usability issue. Visitors coming to the site and bookmarking the page and searchers viewing the page in the results may not have enough information about what the site’s about. However, a title that’s too long may get truncated in the search result, which may create a title that doesn’t contain the most important information.
- Non-informative Title Tags: Google flags titles that are potentially placeholders. My guess is this would be things like “Untitled” and “Title.”
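The title checks above can be approximated locally before Google flags anything. The following is a minimal sketch in Python, not Google's actual logic: the length thresholds (`MIN_LEN`, `MAX_LEN`) and the `PLACEHOLDERS` list are assumptions chosen for illustration, since Google doesn't publish its cutoffs, and `audit_titles` is a hypothetical helper that takes a mapping of URL to HTML source.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Assumed values for illustration only; Google does not publish its thresholds.
MIN_LEN = 5
MAX_LEN = 65
PLACEHOLDERS = {"untitled", "title", "untitled document", "new page"}


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True
            self.title = ""

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True


def extract_title(html):
    """Return the stripped title text, or None if there is no <title>."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip() if parser.title is not None else None


def audit_titles(pages):
    """pages maps URL -> HTML. Returns issue name -> sorted list of URLs."""
    issues = defaultdict(list)
    by_title = defaultdict(list)
    for url, html in pages.items():
        title = extract_title(html)
        if not title:  # no <title> at all, or an empty one
            issues["missing"].append(url)
            continue
        by_title[title.lower()].append(url)
        if title.lower() in PLACEHOLDERS:
            issues["non_informative"].append(url)
        elif len(title) < MIN_LEN:
            issues["short"].append(url)
        elif len(title) > MAX_LEN:
            issues["long"].append(url)
    # Any title shared by more than one URL counts as a duplicate.
    for urls in by_title.values():
        if len(urls) > 1:
            issues["duplicate"].extend(urls)
    return {name: sorted(urls) for name, urls in issues.items()}
```

Feeding it the HTML for a handful of pages returns a dictionary keyed by issue type, which roughly mirrors how the Content Analysis report groups URLs under each error.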
Meta Description Issues
Google often (although not always) uses the text from the meta description tag as the snippet shown in the search result. See The Anatomy of a Google Search Result for more information about how Google determines the text to show in a snippet and how to improve what’s shown for your pages.
- Short/Long Meta Descriptions: When Google uses the meta description text for the snippet and that text is too long, Google cuts it off, which may cause the snippet to end abruptly or may keep important information from being displayed. To be on the safe side, put the most relevant and valuable information at the beginning of the text to increase the odds it will be shown. Google doesn’t provide guidance about a meta description that’s too short, but the text should be long and descriptive enough to give the searcher a good idea of what the page is about.
- Duplicate Meta Descriptions: Just as each page should have a unique title, each one should have a unique meta description as well. Danny previously reported that searchengineland.com was filtered as duplicate when the meta description on each page of the site was the same.
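The same kind of local check works for meta descriptions. This sketch, again in Python with hypothetical helper names, groups pages that share a description and previews what a truncated snippet might look like. The 155-character `limit` is an assumption for illustration; Google doesn't state where it cuts descriptions off.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Captures the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "description":
            self.description = (attr.get("content") or "").strip()


def find_duplicate_descriptions(pages):
    """pages maps URL -> HTML. Returns description -> URLs that share it."""
    groups = defaultdict(list)
    for url, html in pages.items():
        parser = MetaDescriptionParser()
        parser.feed(html)
        parser.close()
        if parser.description:
            groups[parser.description].append(url)
    return {desc: sorted(urls) for desc, urls in groups.items() if len(urls) > 1}


def truncate_snippet(description, limit=155):
    """Preview a cut-off snippet; the limit is an assumed value, not Google's."""
    if len(description) <= limit:
        return description
    # Cut at the last space before the limit so a word is not split.
    return description[:limit].rsplit(" ", 1)[0] + " ..."
```

Running `truncate_snippet` over your descriptions is a quick way to see whether the most important information survives when the end of the text is dropped.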
Non-Indexable Content Issues
My assumption is that this error is shown when Google can’t extract any text from the page. For instance, a page done entirely in Flash would have no text for Google to process. Any page without indexable text will have a tough time getting ranked. First, check whether it’s a page you want indexed. I would guess that URLs to images may show up here (such as http://www.example.com/image.jpg), and you won’t be able to do anything about those types of pages. But if you find that the page is something you’d like to rank, figure out why Google can’t extract any content from it. For instance, if the page is in Flash, you could move some of the text into HTML and use Flash to augment that text.
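One rough way to spot pages like this yourself is to strip the markup and see whether any text is left. The sketch below is a simplified stand-in, not how Google extracts content: the hypothetical `extractable_text` helper counts body text only, and the set of skipped tags is an assumption.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Gathers text outside of script, style, noscript, and title elements."""

    # Assumed skip list; title is left out so only body text is counted.
    SKIP = {"script", "style", "noscript", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def extractable_text(html):
    """Return the page's plain text; an empty string suggests nothing to index."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

A page whose body is a single Flash `<object>` comes back as an empty string, while the same page with a heading and a paragraph of HTML text alongside the Flash does not.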
Sitemap Details
A little over a year ago, Google Webmaster Tools added a count of how many URLs were listed in a submitted Sitemap. Today, they’re adding the count of how many of those URLs are indexed. If your Sitemap is a fairly accurate representation of the URLs on your site, this new stat should help you easily know what kind of coverage you have in the index.
Video Sitemaps
The Sitemaps protocol has been extended to support video. You can now include URLs to videos, as well as metadata about those URLs. Google uses Video Sitemaps for their Google Video index, although with universal search, those videos may appear in web search results as well.
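To give a sense of what a Video Sitemap contains, here is a minimal sketch that builds one with Python's standard library. The namespace URI and the element names (`thumbnail_loc`, `title`, `description`, `content_loc`) reflect Google's video extension as I understand it, so check them against Google's current documentation before relying on them. The `build_video_sitemap` function and the keys of its input dictionaries are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumed namespace for the video extension; verify against Google's docs.
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def build_video_sitemap(entries):
    """entries: list of dicts with page_url plus the video fields below."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("video", VIDEO_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # The page on which the video is embedded.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = entry["page_url"]
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        # Metadata about the video itself.
        for tag in ("thumbnail_loc", "title", "description", "content_loc"):
            ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = entry[tag]
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_video_sitemap([
    {
        "page_url": "http://www.example.com/videos/widget-demo.html",
        "thumbnail_loc": "http://www.example.com/thumbs/widget-demo.jpg",
        "title": "Widget demo",
        "description": "A short demonstration of the blue widget.",
        "content_loc": "http://www.example.com/media/widget-demo.flv",
    }
])
```

Each `url` entry pairs the page that hosts the video with a `video:video` block describing it, which is the metadata the extended protocol lets you supply.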
Two New Languages
Google has added Hungarian and Czech to the 20 languages already supported. These languages now have discussion forums as well, although no word on whether Googlers are available to monitor them. A few weeks ago, a Google Webmaster Central blog post talked about the 12 language-specific forums that Googlers (primarily out of the Dublin office) now monitor.