How Microsoft Removes “Junk” From Bing Search Results

Dr. Richard Qian from Bing’s core search team wrote a post on the Bing Search blog titled Bing Search Quality Insights: Reducing Junk. It is part of Bing’s ongoing series of search quality insights explaining how Bing works.

In the post, Bing explains how it removes bad links from its search results and how it handles junky or empty snippets.

Junk links include:

  • Dead Links
  • Soft 404
  • Parked Domains

Junk or Empty Snippets include:

  • Junky Snippets
  • Empty Snippets

Dead links are pages that return a 4xx or 5xx error code in response to an HTTP request. Sometimes a dead link remains in Bing’s results because Bing has not re-crawled the page since it last returned a proper result. But Bing’s crawler crawls often and is able to detect dead links fairly quickly. When Bing does detect a dead link, depending on its algorithms it may “boost its re-crawl priority and frequency” to see whether the error was temporary and whether the page should return to the search results.
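The dead link handling described above can be illustrated with a minimal Python sketch. Bing has not published its implementation, so everything here is an assumption for illustration only: the `CrawlRecord` type, the `priority` field, and the `boost` factor are hypothetical names, and the only detail taken from the post is that 4xx and 5xx responses count as dead and may raise re-crawl priority.

```python
from dataclasses import dataclass


@dataclass
class CrawlRecord:
    """Hypothetical record of the most recent fetch of a URL."""
    url: str
    status: int             # HTTP status code from the last fetch
    priority: float = 1.0   # assumed re-crawl priority; higher means sooner


def is_dead_status(status: int) -> bool:
    """Treat any 4xx or 5xx response as a dead link."""
    return 400 <= status <= 599


def update_after_fetch(record: CrawlRecord, boost: float = 2.0) -> CrawlRecord:
    """Raise the re-crawl priority of a dead link so a temporary error
    is noticed quickly; reset the priority once the page is healthy."""
    if is_dead_status(record.status):
        record.priority *= boost
    else:
        record.priority = 1.0
    return record
```

A page that answers with a 503 would be re-checked sooner than one that answers with a 200, which matches the idea of confirming whether an error was only temporary.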

A soft 404 is a page that behaves like a hard 404 but does not return a 404 header status. Bing said its “high precision classifiers in this area use page content such as key phrases in the page’s title, body and URL to determine if the page is a soft 404 and whether to remove it from the search results.”
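As a rough illustration of the idea, the Python sketch below flags a page as a likely soft 404 when it returns a 200 status but error phrases appear in its URL, title, or body. This is a simple keyword heuristic, not Bing's classifier: the phrase list, the scoring, and the threshold are all assumptions made up for this example.

```python
# Hypothetical phrase list; a real classifier would learn such signals.
ERROR_PHRASES = (
    "page not found",
    "no longer available",
    "does not exist",
    "error 404",
)


def soft_404_score(status: int, url: str, title: str, body: str) -> int:
    """Count how many of the three fields (URL, title, body) contain an
    error phrase. Pages with a non-200 status are not soft 404s."""
    if status != 200:
        return 0
    fields = (url.lower(), title.lower(), body.lower())
    return sum(1 for field in fields if any(p in field for p in ERROR_PHRASES))


def looks_like_soft_404(status: int, url: str, title: str, body: str,
                        threshold: int = 2) -> bool:
    """Flag the page when at least `threshold` fields carry error phrases."""
    return soft_404_score(status, url, title, body) >= threshold
```

Requiring a match in more than one field is one way to favor precision, since a single stray phrase in a long page body would not be enough to remove the page.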

Bing doesn’t want parked domains to show up in the search results, so it uses signatures to identify parked domains and remove them.
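The post does not say what these signatures look like. One plausible reading, sketched below in Python, is a set of text patterns that commonly appear on parking pages. The signature strings and function names are hypothetical and chosen only to show the filtering step.

```python
# Hypothetical signatures; the real ones are not described in the post.
PARKED_SIGNATURES = (
    "this domain is for sale",
    "this domain is parked",
    "buy this domain",
)


def matches_parked_signature(html: str) -> bool:
    """Return True if the page text contains any known parking signature."""
    text = " ".join(html.lower().split())  # lowercase and collapse whitespace
    return any(sig in text for sig in PARKED_SIGNATURES)


def filter_results(results: list[tuple[str, str]]) -> list[str]:
    """Keep only the URLs whose page content does not look parked.
    `results` is a list of (url, html) pairs."""
    return [url for url, html in results if not matches_parked_signature(html)]
```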

Bing also says that improvements to its encoding classifier, document converter, garbage detector, and HTML parser have reduced the occurrence of junky snippets.

For snippets that are empty, Bing uses dynamic crawlers and document processors, plus a number of classifiers to determine the appropriate snippet for the search result.

For more details, see the Bing blog.


About The Author: Barry is Search Engine Land's News Editor and owns RustyBrick, a NY based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry's personal blog is named Cartoon Barry.

  • http://profile.yahoo.com/WIM5IUAZYN6KJY646Q6QVUUIQ4 M

    If
    then flag.Junk
    else bing.Fail

  • http://www.cloudkeyseo.com/ Eric

    We are an SEO company in Orange County and one of our biggest challenges with new clients is making sure we overcome dead link issues when optimizing websites. Good article.

  • Tatyana Serbit

    When Wikipedia and others were on SOPA protest, Google recommended that they use a 503 (Service Unavailable) header status so they would not be re-crawled with the new “content” and lose positions.
    What would Bing do?
    >Dead link examples are pages that return a 4xx or 5xx error code is returned from an HTTP request for a page.
    >When Bing does detect a dead link, depending on their algorithms they may “boost its re-crawl priority and frequency”

    The key point is that they “may”. Also they may not. Depending on their algorithms. They may just kill the page from SERP and index. Also they may not. ))
