30% Of Results For Some Competitive Searches Found To Be Spam

The NY Times article Researchers Track Down a Plague of Fake Web Pages reports on a recent Microsoft Research paper titled “Spam Double-Funnel: Connecting Web Spammers with Advertisers” [PDF download] (Gary also linked to it Friday).

The report breaks down search results spam by industry category, showing that some search categories have a spam rate of 30% or more. Here is a chart covering various categories:

Microsoft Spam - Spammer Targeted Categories


Read this from section 4.0:

In late September 2006, we submitted the 1,000 keywords to the Search Ranger system, which retrieved the top-50 results from all three major search engines. In total, we collected 101,585 unique URLs from 1,000x50x3=150,000 search results. With a set of approximately 500 known-spammer redirection domains and AdSense IDs at that time, the system identified 12,635 unique spam URLs, which accounted for 11.6% of all the top-50 appearances. (The actual redirection-spam density should be higher because some of the doorway pages had been deactivated, which were no longer causing URL redirections when we scanned them.)

The NY Times summarizes the paper saying they “discovered that the average spam density — a measure of the percentage of Web pages that contain only advertisements — was 11 percent for 1,000 keywords they used in their research.”
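To make the arithmetic in the quoted passage concrete, here is a minimal Python sketch of how a spam density over search-result appearances could be computed. It is not the Search Ranger system: the function name, the toy URLs, and the toy spam domain are all hypothetical, and the sketch assumes each result's final redirection domain has already been resolved somewhere else.

```python
def spam_stats(appearances, known_spam_domains):
    """Return (unique_spam_urls, spam_density) for a list of appearances.

    Each appearance is a (result_url, redirect_domain) pair. The
    redirect_domain is assumed to be resolved already; this sketch does
    not follow redirects itself.
    """
    spam_hits = [url for url, domain in appearances
                 if domain in known_spam_domains]
    unique_spam_urls = len(set(spam_hits))
    # Density is measured over appearances, so a URL that shows up for
    # several keywords or engines counts once per appearance.
    density = len(spam_hits) / len(appearances) if appearances else 0.0
    return unique_spam_urls, density


# Total appearances in the study: 1,000 keywords x top-50 x 3 engines.
total_appearances = 1_000 * 50 * 3

# Hypothetical toy data, not taken from the paper.
toy_appearances = [
    ("http://example.com/page", "example.com"),
    ("http://doorway.example.net/a", "spam-redirect.example"),
    ("http://example.org/", "example.org"),
    ("http://doorway.example.net/a", "spam-redirect.example"),
]
toy_spam_domains = {"spam-redirect.example"}

unique_spam, density = spam_stats(toy_appearances, toy_spam_domains)
print(total_appearances, unique_spam, density)
```

In the toy data the same doorway URL appears twice, so it counts as one unique spam URL but two spam appearances. That is the same distinction the paper draws between its count of unique spam URLs and its percentage of top-50 appearances.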

Here are some other references for you:

Strider URL Tracer with Typo-Patrol from Microsoft Research
Strider Typo-Patrol from Microsoft Research
Typo Domain Spotting Tool & Domain Registration Stats from SEW Blog
Google AdSense For Domains Program Overdue For Reform — And Yahoo & Microsoft Should Also Take Note from SEW Blog
MS Research: Typo-Squatters Are Gaming Google from eWeek


About the author

Barry Schwartz
