• http://www.bluesapphirecreations.com ankurchaudhary

    Great article, David! A question though – what if Google also identified the top offenders (for want of a better word) for certain search categories and reduced the domain or authority impact for pages coming from them? I am sure the algorithmic changes somehow try to take care of this, but wouldn’t having a small (hence manageable) index of sites like eHow and Mahalo help the search engine provide better results? That said, the question of web neutrality is again daunting.

    On second thoughts, isn’t almost every objective parameter grounded in some heuristic subjectivity?

  • dstiehr

    Getting rid of the “top 20 spam sites” doesn’t fix the problems by a long shot. Search “dishwasher repair” on Blekko, and the very first result that comes up is http://www.dishasher-repair.org – a link farm!

    Try it for just “dishawasher” and you get http://www.dishwasherpete.org – a parked domain! Other results include a Wikipedia listing (in case I’m confused on what a dishwasher is, I guess), a howstuffworks article (shouldn’t this be lumped in with eHow?), food recipes, a recall article, something from a video game site, and, well I’m going to stop typing because you can run the query yourself.

    I took the dishwasher theme from the famous Paul Kedrosky article about how poor Google’s results were when he needed a new dishwasher, but I’d imagine this could be replicated across many different search themes. What is manually-curated search doing to help someone on these two very common queries?

    How are those results any better than eHow and the other spam sites that have been removed? And if Blekko wants to say these results are a work in progress, then they should stop the PR attacks until they have their own house in order.

  • http://www.pbm.com/~lindahl/ Greg Lindahl

    @dstiehr Thanks for pointing out those spam sites. Did you try the suggested slashtag /diy? What do you think of those results?

  • http://seotrainingdojo.com David Harry

    @ankurchaudhary – I think about how one decides who the ‘top offenders’ are. In the web spam world these are generally the more blatant offenders: cloaking, dodgy link building, automation etc. When it comes to content, however weak, it’s more about search quality. There are lots of sites out there with weak content, but we’ve never called that spam. Just a bad result. To weaken an entire (legitimate?) domain’s authority because of a subjective judgement that ALL their content is weak seems troublesome. And yes, lol, as Rich pointed out, there is going to be some subjectivity regardless.

    @dstiehr – once more, this is what I feel is the never-ending problem. Instead of dealing with relevancy issues, we keep seeing band-aid solutions to the problem. Furthering this problem is the difficulty in getting users to interact or give explicit signals in general.

    I spent some time looking around some of these ‘content farms’ and not all the information is useless. It’s just apparent that the collective domain authority is boosting the scores of pages that likely don’t warrant their ranking from a relevance perspective.

    @Greg – nice of U to drop in. I’d love to talk sometime about implicit/explicit signals. I love what you guys are trying to do, but it seems a shift in searching behaviour might be a problem (large scale). Hopefully everyone (search engines) can work towards better relevance.

  • http://www.bluesapphirecreations.com ankurchaudhary

    @David: I completely agree with your assertion, but wouldn’t the threat of reduced impact/authority in the search results for weak content push websites like eHow and Mahalo to work harder on quality management? :-)

    Maybe just a bluff by Google would clean the web in unprecedented ways ;-)

  • http://www.pbm.com/~lindahl/ Greg Lindahl

    @David: Sure, I’d love to talk about implicit/explicit signals – a very important topic in search these days!