During the last 12 months (November 2011 to November 2012), SEO professionals have witnessed stunning changes from Google that impact SEO.
Reputation & Trust
Two words I frequently saw or heard this year were reputation and trust. I am writing about the credibility of a website’s content, design, and external links, not reputation management as an SEO service.
Since day one, Google and its fellow search engines promoted consistent quality while criticizing trickery. What changed is that Google can police bad behavior more effectively and broadly than ever before. Google got teeth.
Google Gets Vocal
In the past, Google shied away from notifying domains about their black hat webspam via Webmaster Tools. This changed in April when the search engine expanded the types of messages and warnings it sends.
Google introduced Penguin on April 24th. Penguin penalizes websites that exhibit signs of artificial external links. This is what Penguin looks like in analytics.
When it comes to recovery, Google seems adamant that websites must make a thorough effort to remove all artificial and low quality links no matter how old. The search engine did provide a link disavow tool last month, but it uses submissions as strong suggestions, not a hard and fast off-switch.
Even with the disavow tool, Google makes no rush to restore a site’s good status. Google waits until it recrawls and reindexes the URLs you disavow before it takes action. It can be weeks or months between spider visits to deep or low quality pages. Finally, I have seen no change in Google’s statement that some domains are beyond rehabilitation.
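For readers preparing a submission, here is a minimal Python sketch of assembling a disavow file. It assumes the plain-text format Google documents for the tool: one entry per line, a `domain:` prefix to disavow a whole domain, and `#` for comment lines. The function name `build_disavow_file` and the `.example` hosts are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit


def build_disavow_file(bad_urls, bad_domains, note=""):
    """Return disavow-file text: an optional comment, domain entries, then single URLs."""
    lines = []
    if note:
        lines.append(f"# {note}")

    # Whole-domain entries come first, de-duplicated and lowercased.
    covered = {domain.lower() for domain in bad_domains}
    for domain in sorted(covered):
        lines.append(f"domain:{domain}")

    # Skip individual URLs whose host is already covered by a domain entry.
    for url in sorted(set(bad_urls)):
        host = urlsplit(url).netloc.lower()
        if host not in covered:
            lines.append(url)

    return "\n".join(lines) + "\n"


disavow_text = build_disavow_file(
    bad_urls=["http://spam.example/page1.html", "http://linkfarm.example/a"],
    bad_domains=["linkfarm.example"],
    note="Owner did not respond to removal requests",
)
print(disavow_text)
```

Building the file is the easy part. As noted above, Google treats it as a strong suggestion and acts only after it recrawls the disavowed URLs.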
It is worth noting here that Google ignores links it cannot trust. It seems websites can have plenty of unreliable links before they pass some statistical threshold and Penguin takes hold. Penguin does not replace manual reviews, either. Google may still take manual action against a site because of untrustworthy links, even if Penguin has engaged.
Google loves Panda, the algorithm that penalizes websites for too much low-quality content. Since November 18, 2011, Google has updated Panda 13 times. Panda acts like a ratio-based punishment. The sites I have seen recover remove inferior content and replace it with well-written, useful pieces. They also combine or better differentiate duplicate or near-duplicate content. A typical example of near-duplicate content is the company that puts up a separate page for each office location, with the text on every page identical except for the city, state, and address.
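To find near-duplicate pages like these on your own site, one common approach is to compare pages by their overlapping word sequences (shingles) and compute a Jaccard similarity. The Python sketch below is a self-audit aid, not a description of how Panda works, which Google has not published. The template text, the function names, and any similarity cutoff you apply are assumptions made for illustration.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts' shingle sets: 0.0 is disjoint, 1.0 is identical."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical location-page template in which only the city changes.
TEMPLATE = (
    "Visit our {city} office for friendly service. Our team handles tax "
    "preparation, bookkeeping, and payroll for small businesses. Call today "
    "to schedule a free consultation."
)
austin = TEMPLATE.format(city="Austin")
dallas = TEMPLATE.format(city="Dallas")

# A page rewritten with details specific to one location.
dallas_unique = (
    "The Dallas office opened in 1998 and specializes in oil and gas "
    "accounting for energy companies across North Texas."
)

print(f"template vs template:  {jaccard(austin, dallas):.2f}")
print(f"template vs rewritten: {jaccard(austin, dallas_unique):.2f}")
```

Pairs that score high are candidates to combine or to rewrite with information that is unique to each location.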
It often feels like Google spends all its time looking for low quality, so it was nice to see changes to identify and reward high quality in the search engine’s June-July weather update.
Back in April, Matt Cutts announced this news:
In the next few days, we’re launching an important algorithm change targeted at webspam. The change will decrease rankings for sites that we believe are violating Google’s existing quality guidelines.
The update was forecast to affect 3.1% of queries, but Matt was vague about how the algorithm works. One example showed blatant keyword stuffing. The second example showed links in spun content. Because Matt wrote:
Sites affected by this change might not be easily recognizable as spamming without deep analysis or expertise, but the common thread is that these sites are doing much more than white hat SEO; we believe they are engaging in webspam tactics to manipulate search engine rankings.
I suspect the update includes some form of language analysis.
Last March, Matt Cutts announced an upcoming over-optimization update.
We are trying to make GoogleBot smarter, make our relevance better, and we are also looking for those who abuse it, like too many keywords on a page, or exchange way too many links or go well beyond what you normally expect.
What is in the over-optimization penalty? We do not know, but the SEO community has ideas. When he made the announcement, Matt mentioned too many keywords on a page, something he has described before.
This month, Matt spoke about site-wide backlinks and compared how Google counts these to how Google counts keywords. I suspect site-wide links are part of the over-optimization algorithm.
In the image below, I illustrate a generic example. One instance is good, two is better, three or four is great, and then each additional mention becomes less and less important until you over-optimize. At some point, your optimization becomes suspect.
To be clear, I picked the golden ratio and the number of instances arbitrarily in order to convey this concept. The real Google formulas for things like keyword frequency and repetitive links are unknown. The number of instances will vary, too. The point is, do not try to outguess Google. Be natural.
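The diminishing-returns idea can also be sketched in a few lines of Python. Everything here is as arbitrary as the illustration described above: the golden-ratio decay, the `suspect_after` threshold, and the `penalty` amount are hypothetical values chosen only to show the shape of the curve, not anything Google has disclosed.

```python
PHI = (1 + 5 ** 0.5) / 2  # golden ratio, used purely as an illustrative decay factor


def marginal_value(n: int) -> float:
    """Value added by the n-th instance (1-based); each is worth 1/PHI of the previous one."""
    return PHI ** -(n - 1)


def cumulative_value(count: int, suspect_after: int = 8, penalty: float = 0.5) -> float:
    """Total value of `count` instances, minus a made-up penalty for each instance past the threshold."""
    value = sum(marginal_value(n) for n in range(1, count + 1))
    excess = max(0, count - suspect_after)
    return value - penalty * excess


for count in (1, 2, 4, 8, 12, 20):
    print(f"{count:>2} instances -> {cumulative_value(count):6.2f}")
```

In this toy model the total climbs quickly for the first few instances, flattens, and then falls once the count passes the threshold, which is the pattern to avoid.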
What I find especially interesting is that Google created a safety net for some things. If you have legitimate site-wide links, like a blog roll or links to subsidiary companies, Google will not penalize your domain. Also, and I am reading deeply between the lines on this one, it sounds like some things may not trigger an algorithmic penalty, but could be disastrous during a manual review.
Exact Match Domains
In September, Google announced they would crack down on low quality exact match domains. While this is not related to Panda and Penguin, it does target exact match domains that rank well because of their domain name and not their content or external links.
Too Many Ads Above The Fold
Sites that have too many static advertisements above the fold and force readers to scroll down the page to see content risk incurring a penalty. This does not affect many sites (Google says less than 1%), so it clearly targets outliers.
Infographic & Guest Blogging Links
We do not know of any actual update, but in July, Matt Cutts warned that infographic links are getting abused and may become a target of the webspam team in the same way widgets were discounted and penalized.
In October, Matt Cutts offered a similar warning to blogs and guest bloggers. White hat guest blogging can be a terrific win-win, but shady guest blogging may have consequences.
I will finish my examples of reputation and trust with the Pirate Update. It is a penalty against domains that receive too many DMCA “takedown” requests. This one appears to be a straightforward tie-in between Google’s webspam algorithms and its DMCA request database. There are some important exceptions, so check out the link.
Google on Caffeine
It has been a couple years since the Caffeine infrastructure rolled out. Last year, we got a deep-roasted taste of Caffeine, thanks to Panda. This year, the Penguin, site-wide links, and ads above the fold algorithms appear to take advantage of Caffeine, too, probably in collaboration with increased crawling, data storage, and processing capacity.
A year ago, Google launched its freshness update, affecting 35% of search results. This much improved ‘query deserves freshness’ algorithm identifies recent or recurring events, hot topics, and queries for which the best information changes frequently.
Indexing iFrame Content
Michael Martinez designed a test demonstrating how a link on an iFramed page passed a unique anchor expression to another page. This did not work on Bing, and I am definitely not endorsing iFrames. iFrames were once widely used to hide content from search engines, so it is a worthwhile demonstration of Google’s expanding powers.
Automatic URL Canonicalization
Maile Ohye spoke about this at SMX Advanced, and it caught my ear then. However, I did not think much about this until I saw Matt Cutts’s latest video. Duplicate content and canonicalization have always been cornerstones of SEO.
Now Google says they will detect and group duplicate content URLs then combine their authority. While it is still important to do URL canonicalization via rel= tags or in Webmaster Tools, dynamic de-duping and combining authority from multiple pages is a noteworthy innovation.
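To make the idea of grouping duplicate URLs and combining their authority concrete, here is a rough Python sketch. It groups URLs by a normalized form and sums a per-URL score. Google has not published how its own grouping works, and it presumably compares content rather than just URL strings, so the normalization rules, the `TRACKING_PARAMS` list, and the score values are all assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters that do not change the page content.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize(url: str) -> str:
    """Reduce a URL to an assumed canonical form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # assumes www and non-www serve the same content
    path = parts.path.rstrip("/") or "/"
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS)
    # Passing an empty final component drops the fragment.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


def combine_authority(scores: dict[str, float]) -> dict[str, float]:
    """Sum the scores of all URLs that normalize to the same canonical URL."""
    combined: dict[str, float] = defaultdict(float)
    for url, score in scores.items():
        combined[normalize(url)] += score
    return dict(combined)


link_scores = {
    "http://www.example.com/shoes/": 3.0,
    "http://example.com/shoes?utm_source=news": 2.0,
    "http://EXAMPLE.com/shoes#reviews": 1.0,
    "http://example.com/hats": 4.0,
}
print(combine_authority(link_scores))
```

Running the same kind of normalization over your own crawl data is a quick way to spot URL variants that deserve an explicit canonical declaration rather than being left for the search engine to work out.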
Parked Domains and Scraper Sites
Last December, Google added a parked domain classifier to keep parked sites out of the results. They also improved the ability to detect duplicate content and show the originating document in search results. Removing parked domains may not seem like a big tech leap, but it does demonstrate Google’s capacity growth. It’s the same with scraper sites. All that data has to get stored and cross-referenced.
Domain Diversity In Results
In September, Google released an update to increase the number of domains that appear in search results. As Danny wrote, “Google’s search results can sometimes be dominated by pages that all come from the same domain.” This update is supposed to help alleviate this. That adds another layer of processing to the rankings selections.
Tags & SERP Real Estate
Before I finish, I want to discuss two more things: tags and real estate. Both continue to evolve and both are increasingly controversial.
Google is pushing tags, and you need to keep up on them. Tags come in two broad varieties: machine-readable markup and HTML elements or attributes. You must absolutely understand how to use HTML tags like rel=canonical and rel=author. Learn how to use these and incorporate them. Push your CMS developers until they support them.
Whether or not you use machine-readable markup like Schema.org is another matter. Yes, these tags make it easier for search engines to discover, classify, and display information.
They also make it possible for search engines to pull information from your website and display it in the search results, possibly costing you visitors and traffic. Whether or not this really does divert traffic is being hotly debated. What is important is to become informed and make the right decision for your business.
SERP Real Estate
In the last year, Google made more changes to how they display search engine results than ever before. The latest examples are killing the left sidebar and eliminating non-paid product search. More queries trigger local search results, a benefit to some businesses and a detriment to others. Some queries only deliver seven organic results, no longer ten. Site links have increased, and there are more above-the-fold advertisements. Who knows what is to come?
I think Google wants to simplify the search results for the average user while maximizing its income opportunities. The typical user is not going to miss the power tools on the left side or look for the new dropdown menus. Google wants to personalize the results, too, with more emphasis on local and friends.
Search for coffee. In my own Google results, I see seven local results, three with personalized recommendations. Below that are three more personalized results in the organic listings.
This is where Google SEO is headed: personalized social and local results. More on that in my column next month.
Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.