How Microsoft Removes “Junk” From Bing Search Results
Dr. Richard Qian from Bing’s core search team wrote a blog post on the Bing Search blog named Bing Search Quality Insights: Reducing Junk. This is part of Bing’s ongoing effort to provide search quality insights on how Bing works.
Here, Bing explains how it removes bad links from its search results and how it handles junky or empty snippets.
Junk links include:
- Dead Links
- Soft 404
- Parked Domains
Junk or Empty Snippets include:
- Junky Snippets
- Empty Snippets
Dead links are pages for which an HTTP request returns a 4xx or 5xx error code. Sometimes a dead link remains in Bing because Bing has not re-crawled the page since it last returned a proper result. But Bing’s crawler crawls often and can detect dead links fairly quickly. When Bing does detect a dead link, depending on its algorithms it may “boost its re-crawl priority and frequency” to determine whether the error was temporary and whether the page should return to the search results.
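To make the dead-link rule concrete, here is a minimal Python sketch. The 4xx/5xx test comes from the description above; the function names, the 24-hour base interval, and the boost factor are hypothetical values chosen for illustration, not anything Bing has published.

```python
def is_dead_link(status_code: int) -> bool:
    """Treat any 4xx (client error) or 5xx (server error) response as dead."""
    return 400 <= status_code <= 599


def next_recrawl_delay_hours(status_code: int,
                             base_delay_hours: float = 24.0,
                             boost_factor: float = 4.0) -> float:
    """Return how long to wait before crawling the page again.

    Assumption: a dead link is re-crawled sooner (delay divided by
    boost_factor) so a temporary error can be told apart from a page
    that is gone for good. Both defaults are made-up numbers.
    """
    if is_dead_link(status_code):
        return base_delay_hours / boost_factor
    return base_delay_hours


if __name__ == "__main__":
    for code in (200, 301, 404, 503):
        print(code, is_dead_link(code), next_recrawl_delay_hours(code))
```

A real crawler would likely also look at how many consecutive errors a URL has returned before dropping it, but the post does not go into that level of detail.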
A soft 404 is like a hard 404, except the server does not return a 404 status code (it typically returns 200 OK instead). Bing said its “high precision classifiers in this area use page content such as key phrases in the page’s title, body and URL to determine if the page is a soft 404 and whether to remove it from the search results.”
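The idea of looking for key phrases in the title, body and URL can be sketched roughly as follows. This is a simple keyword heuristic standing in for Bing’s classifiers, which are not public; the phrase list, the function name, and the two-field threshold are all assumptions made for this example.

```python
# Hypothetical phrases that often appear on "not found" pages.
SOFT_404_PHRASES = (
    "page not found",
    "404",
    "no longer available",
    "does not exist",
)


def looks_like_soft_404(status_code: int, url: str, title: str, body: str) -> bool:
    """Flag a page that returns 200 OK but reads like an error page.

    Assumption: the page is flagged when at least two of the three
    fields (URL, title, body) contain one of the key phrases.
    """
    if status_code != 200:
        # A real error status is a hard error, not a soft 404.
        return False
    fields = (url, title, body)
    hits = sum(
        any(phrase in field.lower() for phrase in SOFT_404_PHRASES)
        for field in fields
    )
    return hits >= 2


if __name__ == "__main__":
    print(looks_like_soft_404(
        200,
        "http://example.com/error/404",
        "Page Not Found",
        "Sorry, we could not find that page.",
    ))
```

Requiring two matching fields is one way to favor precision over recall, in the spirit of the “high precision” wording in the quote; a production classifier would use far richer signals.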
Bing doesn’t want parked domains to show up in the search results, so it uses signatures to identify parked domains and remove them.
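Bing does not say what its signatures look like. As a rough illustration of signature matching, the sketch below checks page text against a handful of regular expressions; the patterns and the function name are invented for this example.

```python
import re

# Hypothetical signatures: phrases commonly seen on parked-domain pages.
PARKED_SIGNATURES = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"this domain is for sale",
        r"domain (is )?parked",
        r"buy this domain",
    )
]


def is_parked_domain(page_text: str) -> bool:
    """Return True if any known signature appears in the page text."""
    return any(sig.search(page_text) for sig in PARKED_SIGNATURES)


if __name__ == "__main__":
    print(is_parked_domain("<h1>This domain is for sale!</h1>"))
    print(is_parked_domain("<h1>Welcome to my cooking blog</h1>"))
```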
Bing also says that various improvements to its encoding classifier, document converter, garbage detector, and HTML parser have reduced the occurrence of junky snippets.
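The post gives no detail on how the garbage detector works. One simple way to think about it is to measure how much of a snippet is made up of characters that usually signal a decoding failure. The sketch below is only that: the function names and the 10% threshold are assumptions, and it catches just one kind of junk (replacement characters and stray control characters).

```python
import unicodedata

# Whitespace control characters that are normal in text.
ALLOWED_CONTROLS = "\n\r\t"


def junk_ratio(snippet: str) -> float:
    """Fraction of characters that look like decoding debris."""
    if not snippet:
        return 0.0
    junk = sum(
        1
        for ch in snippet
        if ch == "\ufffd"  # Unicode replacement character
        or (unicodedata.category(ch) == "Cc" and ch not in ALLOWED_CONTROLS)
    )
    return junk / len(snippet)


def is_junky_snippet(snippet: str, threshold: float = 0.1) -> bool:
    """Assumed rule: junky if more than `threshold` of the text is debris."""
    return junk_ratio(snippet) > threshold


if __name__ == "__main__":
    print(is_junky_snippet("\ufffd\ufffd\ufffd abc"))
    print(is_junky_snippet("Bing removes dead links from its results."))
```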
For snippets that are empty, Bing uses dynamic crawlers and document processors, plus a number of classifiers to determine the appropriate snippet for the search result.
For more details, see the Bing blog.