Lessons Learned at SMX West: Google’s Panda Update, White Hat Cloaking & Link Building
Google’s Own Words About the Farmer/Panda Update
Google’s Matt Cutts said that while the change isn’t 100% perfect, searcher feedback has been overwhelmingly positive. He noted that the change is completely algorithmic with no manual exceptions.
Blocking “Low Quality” Content
Matt reiterated that enough low quality content on a site could reduce rankings for that site as a whole. Improving the quality of the pages or removing the pages altogether are typically good ways to fix that problem, but a few scenarios need a different solution.
For instance, a business review site might want to include a listing for each business so that visitors can leave reviews, but those pages typically have only business description information that’s duplicated across the web until visitors have reviewed it. A question/answer site will have questions without answers… until visitors answer them.
In cases like this, Google’s Maile Ohye recommended using a <meta name=robots content=noindex> on the pages until they have unique and high-quality content on them. She recommends this over blocking via robots.txt so that search engines can know the pages exist and start building history for them. That way, once the pages are no longer blocked, they can more quickly be ranked appropriately. I noted in the panel where we discussed this that a very large site might be an exception: there, robots.txt would ensure that search engine bots spend the available crawl time on the pages with high-quality content.
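As a rough illustration of the noindex approach described above, here is a minimal Python sketch of a template helper that emits the tag only while a listing page lacks unique content. The function name, the review-count signal, and the threshold are all hypothetical; a real site would pick its own measure of "unique and high-quality."

```python
# Hypothetical threshold: how many visitor reviews a listing needs
# before we consider it to have unique content worth indexing.
MIN_REVIEWS = 1


def robots_meta_tag(unique_review_count: int, min_reviews: int = MIN_REVIEWS) -> str:
    """Return the robots meta tag to place in a listing page's <head>.

    Pages below the threshold get noindex; pages at or above it get no
    tag at all, since pages are indexable by default.
    """
    if unique_review_count >= min_reviews:
        return ""
    return '<meta name="robots" content="noindex">'


if __name__ == "__main__":
    print(repr(robots_meta_tag(0)))  # thin page: noindex tag
    print(repr(robots_meta_tag(3)))  # reviewed page: empty string
```

Because the page stays crawlable, the tag disappears for search engines the next time they fetch the page after it crosses the threshold.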
Matt said that having advertising on your site does not inherently reduce its quality. However, it is possible to overdo it. I had noted in my earlier articles about this change that two kinds of pages in particular often provide a poor user experience: pages with no content and only ads above the fold, and pages with so many ads that it’s difficult to find the non-advertising content.
Matt also noted that if Google determines a site isn’t as useful to users, they may not crawl it as frequently. My suggestion based on this is to take a look at your server logs to determine what pages Googlebot is crawling and how often those pages are crawled. This can give you a sense of how long it might take before quality changes to your site take effect. If Google only crawls a page every 30 days, then you can’t expect quality improvements to change your rankings in 10 days, for instance.
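One way to run that log check is sketched below in Python. It assumes your server writes the common Apache/nginx "combined" log format and it identifies Googlebot by user-agent string only. Both are assumptions: your log format may differ, and user agents can be spoofed, so treat the numbers as an estimate. The function name is hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed "combined" log format:
# ip - user [timestamp] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_crawl_gaps(lines):
    """Map each crawled path to the mean number of days between
    Googlebot requests, or None if the path was seen only once."""
    hits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # skip unparseable lines and non-Googlebot traffic
        stamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        hits[match.group("path")].append(stamp)

    gaps = {}
    for path, stamps in hits.items():
        stamps.sort()
        if len(stamps) < 2:
            gaps[path] = None
        else:
            span_days = (stamps[-1] - stamps[0]).total_seconds() / 86400
            gaps[path] = span_days / (len(stamps) - 1)
    return gaps


if __name__ == "__main__":
    with open("access.log") as log:  # hypothetical log file path
        for path, days in sorted(googlebot_crawl_gaps(log).items()):
            print(path, days)
```

A path that shows a mean gap of around 30 days is one where you should not expect improvements to be reflected much sooner than that.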
International Roll Out
Matt confirmed that the algorithm change is still U.S.-only at this point, but is being tested internationally and would be rolling out to additional countries “within weeks”. He said that the type of low quality content targeted by the changes is more prevalent in the United States than in other countries, so the impact won’t be as strong outside the U.S.
Continued Algorithm Changes
Matt said that many more changes are queued up for the year. He said the focus this year is on making low quality content and content farms less visible in search results as well as helping original creators of content be more visible. Google is working to help original content rank better and may, for instance, experiment with swapping the position of the original source and the syndicated source when the syndicated version would ordinarily rank highest based on value signals to the page. And they are continuing to work on identifying scraped content.
Does this mean that SEO will have to continue to change? Not if your philosophy is to build value for users rather than build to search engine signals. As Matt said, “What I said five years ago is still true: don’t chase algorithm, try to make sites users love.”
Gil Reich noted my frustration at the way some people seemed to interpret this advice.
“The Ask the SEOs sessions used to be battles between Black and White. Now, with the same participants, it’s between “focus on users” and “focus on creating the footprint that sites that focus on users have.” It seemed that Vanessa found this debate far more frustrating than when her fellow panelists were simply black hats.”
It’s true. For instance, at one point, Bruce Clay said that they’d done analysis of the sites that dropped and those that didn’t and they found that those that dropped tended to have almost an identical word count across all articles. So, he said it was important to mix up article length. I told the audience that I hoped that what they got out of the session was not that they had to go back to their sites and make sure all the lengths were different but that they actually made sure each page of content was useful.
You know the best way to ensure your site has a “footprint that sites that focus on users have”? Focus on users!
“White Hat” Cloaking?
We hear a lot about “white hat” cloaking. This tends to mean any time a site is configured to specifically show search engines different content than visitors for non-spam reasons. For instance, a search engine might see HTML content while visitors see Flash. The site might not serve ads to search engines or present only the canonical versions of URLs.
Some SEOs say that Google condones this type of cloaking. However, Matt was definitive that this is not the case. He said:
“White hat cloaking is a contradiction in terms at Google. We’ve never had to make an exception for “white hat” cloaking. If someone tells you that — that’s dangerous.”
Matt said anytime a site includes code that special cases for Googlebot by user agent or IP address, Google considers that cloaking and may take action against the site.
First-Click Free Program and Geotargeting
Danny Sullivan asserted that the “first-click free” program is white hat cloaking, but Matt disagreed, saying that the site was showing exactly the same thing to Googlebot and visitors coming to the site from a Google search. Matt also noted that showing content based on geolocation is not cloaking as it’s not special casing for the Googlebot IPs (but rather by the geographic location).
Cloaking For Search-Friendly URLs
Note that several Search Engine Land articles (including this and this) assert that cloaking to show search-engine friendly URLs is OK with the search engines, but Google in particular has been definitive that this implementation is not something they advocate.
The basis for some thinking this scenario is OK by Google seems to be a statement from a Google representative at SES Chicago in 2005. But back in 2008, Matt Cutts clarified to me that “cloaking by rewriting URLs based on user-agent is not okay, at least for Google.” (My understanding is that was the first speaking engagement for the Google representative and he wasn’t fully up on the intricacies of cloaking.)
In any case, there are now lots of workarounds for URL issues. If developers are unable to fix URLs directly or implement redirects, they can simply use the rel=canonical link attribute or the Google Webmaster Tools parameter handling feature (Bing Webmaster Tools has a similar feature).
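For the rel=canonical route, the idea is that every URL variant of a page declares one preferred URL, and all visitors and bots see the same tag. The Python sketch below builds such a tag by dropping query parameters that don't change the content. The parameter names in IGNORED_PARAMS and the function name are hypothetical, and sorting the remaining parameters is my own assumption about what "one preferred form" should look like for a given site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change page content on this site.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}


def canonical_link_tag(url: str, ignored=IGNORED_PARAMS) -> str:
    """Build a <link rel="canonical"> tag for the given request URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    kept.sort()  # stable parameter order so variants collapse to one URL
    canonical = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://Example.com/shoes?sort=price&color=red&sessionid=abc123"
    ))
```

The tag goes in the page's head for every requester, so nothing here depends on who is asking.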
Hiding Text for Accessibility
What about using a -999px CSS text indent for image replacement? Maile had previously done a post on her personal blog noting that hiding text in this way can be seen as violating the Google webmaster guidelines even if it’s done for accessibility, not spammy reasons. Generally, you can use a different implementation (such as ALT attributes or CSS sprites). On stage at SMX, Maile also recommended using font-face. This can be tricky to implement, but at least for the font files themselves, you can use Google Web Fonts rather than building them yourself.
Matt seconded this in a later session: “hidden text? Not a good idea. I think Maile covered that in an earlier session.”
Showing Different Content to New Visitors Vs. Returning Visitors
Someone asked about showing different content to new vs. returning users. Both Matt and Duane Forrester of Bing commented that it was best to be careful with this type of technique. Generally, this type of scenario is implemented via a cookie. Neither new visitors nor Googlebot will have a cookie for the site, while a returning visitor will have one. Matt noted that if you treat Googlebot the same as a new user, this generally is fine.
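To make the distinction concrete, here is a minimal Python sketch of cookie-only variant selection. The cookie name and function name are hypothetical. The point of the sketch is what it leaves out: the user agent is accepted as an argument but never consulted, so a crawler with no cookie falls into the same branch as any first-time visitor.

```python
# Hypothetical cookie set on a visitor's first page view.
RETURNING_COOKIE = "seen_before"


def choose_variant(cookies: dict, user_agent: str = "") -> str:
    """Pick the page variant from the cookie alone.

    user_agent is deliberately ignored: branching on it (or on IP
    address) to single out Googlebot is what Matt described as cloaking.
    """
    if cookies.get(RETURNING_COOKIE) == "1":
        return "returning"
    return "new"


if __name__ == "__main__":
    print(choose_variant({}, "Mozilla/5.0 (compatible; Googlebot/2.1)"))
    print(choose_variant({}, "Mozilla/5.0 Firefox/4.0"))
    print(choose_variant({"seen_before": "1"}, "Mozilla/5.0 Firefox/4.0"))
```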
Links continue to be important, but how can sites acquire them using search engine-approved methods? Matt said to ask yourself “how can I make a good site that people will love?” Make an impression and build a great site with a loyal audience (not necessarily through search) that brings brand value, and links will come.
Creating Value Beyond Content
Someone asked how to get links to internal pages of an ecommerce site. Product pages just aren’t that interesting and don’t have a ton of editorial content. Matt recommended looking at different ways of slicing the data and becoming an authority in the space.
How Valuable is Article Marketing?
Not very. Both Duane and Matt said that articles syndicated hundreds of times across the web just don’t provide valuable links and in any case, they aren’t editorially given. Duane made things simple: “don’t do the article marketing stuff.”
He suggested contacting an authority site in your space to see if they would publish a guest article that you write particularly for them. If the authority site finds your content valuable enough to publish, that’s a completely different situation from article hubs that allow anyone to publish anything.
What About Links in Press Releases?
Someone noted that while paid links violate the search engine guidelines, you can pay a press release service to distribute your release to places such as Google News, so don’t those links count? Matt clarified that the links in the press releases themselves don’t count for PageRank value, but if a journalist reads the release and then writes about the site, any links in that news article will then count.
Are Retweets More Valuable Than Links?
Someone asked about the recent SEOmoz post that concluded that retweets alone could boost rankings. Matt said he had asked Amit Singhal, who heads Google’s core ranking team, if this was possible. He said that Amit confirmed links in tweets are not currently part of Google’s rankings, so the conclusions drawn by the post were not correct. Rather, other indirect factors were likely at play, such as some people who saw the tweet later linking to the page. (Purely speculating on my part, those tweets could have been embedded in other sites that in turn were seen as links.)
Matt mentioned that signals such as retweets might help in real-time search results and then talked about a recent change that causes searchers to see pages that have been tweeted.
Some mistakenly took this to mean that the Google algorithm would give a rankings boost to pages that have been tweeted vs. those that haven’t, but Matt was talking about the change a few weeks ago that personalizes search results based on a searcher’s social network connections. As Matt McGee explained in his Search Engine Land article about it:
In some cases, Google will simply be annotating results with a social search indicator, says Google’s Mike Cassidy, Product Management Director for Search. Google’s traditional ranking algorithms will determine where a listing should appear, but the listing may be enhanced to reflect any social element to it.
In other cases, the social search element will change a page’s ranking — making it appear higher than “normal.” This, I should add, is a personalized feature based on an individual’s relationships. The ranking impact will be different based on how strong your connections are, and different people will see different results.
Can Competitors Buy Links To Your Site and Hurt Your Site’s Rankings?
This is an age-old question, but as several high profile sites have had rankings demotions lately for external link profiles that violate the search engine guidelines, it was top of mind for some people. Matt reiterated that competitors generally can’t do anything to hurt another site. The algorithms are built to simply not value links that violate search engine guidelines. Demotions generally only occur when a larger pattern of violations is found.
What About Exact Match Domains and Incoming Anchor Text?
Several people have commented on spammy sites with exact match domains and lots of spammy incoming links with exact match anchor text ranking quite well in Google and Bing. Matt said they are looking into this.
How Google Handles Spam and Reconsideration Requests
Matt reiterated some basics about how this works:
- Google’s approach to spam: Engineers write algorithms to address spam issues at scale. In parallel, manual teams are both proactive and reactive in looking for spam and both removing it from the index and providing it to the engineering team to help them modify their algorithms to not only find that specific spam instance, but all similar instances across the web.
- How Google handles spam reports: Spam reports have four times the weight of other spam found in terms of manual action as it’s clearly spam that a searcher has seen in results.
- Notifying site owners of guidelines violations: Google is looking at providing additional transparency around guidelines violations found on sites. (They already provide some details in the Google Webmaster Tools message center.)
- How Google handles reconsideration requests: Only reconsideration requests from sites that have a manual penalty are routed to Googlers for evaluation. Generally, the reconsideration process takes less than a week. Algorithmic penalties can’t be manually removed. Rather, as Google recrawls the pages, the algorithm adjusts the rankings accordingly.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.