Live Search Webmaster Center Gains Crawl Error And Backlinks Reports
Today, the Microsoft Live Search Webmaster Center is launching its first major update since its initial debut last fall. The Live Search Webmaster Center is Microsoft’s hub for their communication with site owners, and the tools portion of that center now has new crawl issues reports, backlinks data, and download functionality. The Webmaster Center was first announced at Searchification in September and launched in November 2007 with a blog, discussion forum, and set of tools that give site owners insight into Microsoft’s crawl and indexing of their sites. Below, more details about what’s launched today and how site owners can use the new data.
The evolving relationship between search engines and site owners
Until a few years ago, the relationship between search engines and site owners was fairly limited. Site owners could block pages with robots.txt and otherwise hoped for the best. In 2005 Google and Yahoo! launched separate feed submission programs, as well as tools that gave webmasters a glimpse at some of what the search engines knew about their sites. In late 2006, Google, Yahoo!, and Microsoft joined together for their support of sitemaps.org, a standard XML Sitemaps protocol that enables site owners to submit a list of all of their site’s pages to the search engines.
At the time of that announcement, Yahoo!’s Site Explorer product supported Sitemaps submission and provided comprehensive backlinks data, and Google’s tools had evolved into Webmaster Central, but Microsoft had no corresponding product. In August 2007, Microsoft announced that they had formed a small team “to build the next-generation set of tools, content and resources for SEO professionals and webmasters (and get ‘link:’ back in your hands).” In November 2007, Microsoft launched the Webmaster Center, with a blog, discussion forum, help center, and set of tools (and Sitemaps support). They said they knew they had “a lot of work still to do” and that this launch was just the first step.
Today, they’ve made good on that promise to get “link:” back in the hands of webmasters by adding a feature that provides filterable and downloadable backlinks data. They’ve also added crawl issues reports and download functionality.
These features join the existing set, which includes:
- Indexing details (is the site indexed? what are the top pages?)
- Penalty information (has the site been flagged as spam?)
- robots.txt validator (is the site’s robots.txt file configured properly?)
- Outbound linking data
- Sitemap submission
Crawl issues reports
The crawl issues reports are similar to the ones offered by Google’s Webmaster Central and provide details on URLs that MSNbot tried to crawl but couldn’t because of one of four issues. Since a page can’t be indexed if a search engine bot has trouble crawling it, these reports are one of the first places a webmaster should check when diagnosing indexing issues.
Four reports are available:
- File not found – lists all pages that MSNbot tried to crawl and received an HTTP response code of 404. Generally, URLs listed here are from typos in links from other sites. You often can’t fix the link, but you can 301 redirect the typo to the correct page (for both a better user experience and reclaimed backlinks).
- Blocked by REP – lists all pages that MSNbot tried to crawl but didn’t because they were blocked by the site’s robots.txt file or robots meta tag. You should review this list and make sure you aren’t accidentally blocking access to pages you want indexed.
- Long dynamic URLs – lists all pages that have been flagged as having “exceptionally long query strings.” Microsoft says these URLs could lead MSNbot into an infinite loop as it tries to crawl all variations of potential parameter combinations and recommends webmasters find ways to shorten these dynamic URLs.
- Unsupported content types – lists all pages that are classified with content types that Live Search doesn’t index.
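As a rough illustration of the long dynamic URLs report, the sketch below flags URLs from your own logs or Sitemap whose query strings look unusually long. Microsoft does not say what counts as “exceptionally long,” so the thresholds (MAX_QUERY_LENGTH, MAX_PARAMS) and the function name is_long_dynamic_url are assumptions chosen for the example, not Live Search’s actual rules.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical thresholds: the article does not say where Live Search draws the line.
MAX_QUERY_LENGTH = 100
MAX_PARAMS = 5


def is_long_dynamic_url(url, max_query_length=MAX_QUERY_LENGTH, max_params=MAX_PARAMS):
    """Return True if the URL's query string is long enough to be worth reviewing."""
    query = urlsplit(url).query
    params = parse_qsl(query, keep_blank_values=True)
    return len(query) > max_query_length or len(params) > max_params


urls = [
    "http://example.com/products?id=42",
    "http://example.com/search?" + "&".join(f"p{i}=v{i}" for i in range(12)),
]

# Keep only the URLs that exceed either threshold.
flagged = [u for u in urls if is_long_dynamic_url(u)]
for u in flagged:
    print(u)
```

URLs flagged this way are candidates for shortening, for example by dropping parameters that don’t change the page content.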
Will you find these reports useful if you already use Google’s reports? Additional data points are always helpful, and in some cases these reports provide different information than Google’s, so they can help you get a more comprehensive view of your site. For instance, you may find from the blocked by REP report that you’ve inadvertently blocked MSNbot. In this instance, Google’s reports would show no issues. Likewise, Google doesn’t list URLs with lengthy parameters that are difficult to crawl. This may be because Googlebot has less trouble with them, but even Googlebot would likely have trouble with an “infinite loop,” so these URLs are worth digging into for the sake of all search engines.
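The case of a robots.txt file that blocks MSNbot but not Googlebot can also be checked locally. The following is a minimal sketch using Python’s standard urllib.robotparser module; the robots.txt content, the example.com URLs, and the helper name blocked_agents are made up for illustration, and the standard-library parser may not match every detail of how each engine interprets the file.

```python
from urllib.robotparser import RobotFileParser


def blocked_agents(robots_txt, url, agents=("msnbot", "Googlebot")):
    """Return the user agents from `agents` that may not fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Illustrative robots.txt that shuts out MSNbot entirely by accident.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow: /private/
"""

print(blocked_agents(ROBOTS_TXT, "http://example.com/articles/1"))
print(blocked_agents(ROBOTS_TXT, "http://example.com/private/report"))
```

With this sample file, the article URL is blocked only for msnbot, while the /private/ URL is blocked for both crawlers.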
The new backlinks feature shows the total count of backlinks to a site. You can view a list of the top URLs in the tool or can download up to 1,000.
The “page score” column will probably catch some notice, since it indicates how important Microsoft considers particular pages. Said Nathan Buggia, Lead Program Manager for the Webmaster Center:
Page Score is a loose measure of how authoritative Live Search determines each URL or domain to be. A high score (more green boxes) indicates a page to be more authoritative. There are many factors that determine the order in which URLs are listed in the search results, among those are page score, freshness and several other factors.
Both Google and Yahoo! provide backlinks information, so how is Microsoft’s data useful in addition to what the other engines provide? While all the engines provide download functionality, which enables you to filter and sort in Excel, Microsoft provides some filtering ability within the tool, which makes it easy to do quick checks. For instance, by filtering, I can see that nytimes.com links to searchengineland.com 849 times.
This filtering ability enables you to download far more than the 1,000-result limit would otherwise allow, one filtered set at a time. Buggia told me, “While the download limits you to 1,000 results, our advanced filtering functionality gives webmasters the ability to ‘zoom in’ to the specific section of their website, and download the specific 1,000 results they are looking for. We are continuing to explore additional ways to provide more data to webmasters and make that data as actionable as possible.”
You can now download all data available within the webmaster center as a CSV file. You can open a CSV file in Excel and sort and filter the data in a number of ways.
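If you’d rather script the analysis than use Excel, a downloaded CSV can be summarized in a few lines of Python. The sketch below counts how many backlinks come from each linking host. The column name “Source URL” and the sample rows are hypothetical, since the article doesn’t describe the layout of the exported file; adjust them to match the actual download.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit


def count_linking_hosts(csv_text, url_column="Source URL"):
    """Count backlinks per linking host. `url_column` is an assumed header name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hosts = Counter()
    for row in reader:
        host = urlsplit(row[url_column]).hostname
        if host:  # skip rows whose URL has no recognizable host
            hosts[host] += 1
    return hosts


# Made-up sample data standing in for a downloaded backlinks report.
SAMPLE = """\
Source URL,Target URL
http://www.nytimes.com/a,http://example.com/
http://www.nytimes.com/b,http://example.com/page
http://blog.example.org/post,http://example.com/
"""

counts = count_linking_hosts(SAMPLE)
for host, total in counts.most_common():
    print(host, total)
```

For a real export, read the file with open(path, newline="") and pass its contents to the same function.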
The future of search engine/webmaster relationships
When the sitemaps.org alliance was first announced, not everyone was sure that Microsoft was truly on board to build a relationship between their search engine and site owners. However, since then, they’ve begun processing Sitemaps, launched a webmaster portal, and now added several substantial features. In contrast, Yahoo!’s interest in this relationship seems to have waned as they’ve turned their attention to (arguably equally important) support for developers with SearchMonkey and BOSS. (They haven’t launched material new features for Site Explorer in a year.) Google still leads the way, with a fairly robust toolset, regular blogging, responsiveness in their discussion forums, and conference appearances, but Microsoft’s latest launch shows that they’re interested in strengthening their relationship with site owners as well.
Relationships between search engines and site owners will continue to grow in importance, particularly as content ownership and privacy issues evolve. All three major search engines recently joined forces again to show consolidated support for the Robots Exclusion Protocol, possibly in part as a preemptive measure in response to concerns of publishers involved with ACAP.
If search engines have solid support mechanisms in place for site owners, those owners may feel more comfortable with how search engines use their content and may be less likely to make moves to limit access to it. This is important in particular for Microsoft with their declining market share. If site owners feel Microsoft is being intrusive and unresponsive to their needs, they could block MSNbot. This would be easier for them to do with MSNbot than with Googlebot, since it would have a far more limited impact on their overall search traffic. By showing they are committed to a relationship with webmasters, Microsoft can work to prevent those types of actions. They are showing this commitment not only with new features, but also by going out of beta. Nathan Buggia told me:
After 9 months in beta, we are taking the Webmaster Tools to version 1.0. This release marks a significant step forward in terms of the amount of information we’re providing webmasters, and the ability to make that information actionable. Web publishers are a vital component of our search ecosystem. You will continue to see us release additional functionality in the coming months, providing additional transparency to publishers about how live search indexes their content, helping them improve their results in Live Search.