<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Search Engine Land &#187; Vanessa Fox</title>
	<atom:link href="http://searchengineland.com/author/vanessa-fox/feed" rel="self" type="application/rss+xml" />
	<link>http://searchengineland.com</link>
	<description>Search Engine Land: News On Search Engines, Search Engine Optimization (SEO) &#38; Search Engine Marketing (SEM)</description>
	<lastBuildDate>Fri, 25 May 2012 16:18:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Peeking Into the World Of Google&#8217;s Algorithm Changes With Google Search Quality Head Amit Singhal</title>
		<link>http://searchengineland.com/peeking-into-the-world-of-googles-algorithm-changes-with-google-search-quality-head-amit-singhal-121528</link>
		<comments>http://searchengineland.com/peeking-into-the-world-of-googles-algorithm-changes-with-google-search-quality-head-amit-singhal-121528#comments</comments>
		<pubDate>Thu, 17 May 2012 10:57:50 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Featured]]></category>
		<category><![CDATA[Features: Analysis]]></category>
		<category><![CDATA[Google: Algorithm Updates]]></category>
		<category><![CDATA[Google: Search Plus Your World]]></category>
		<category><![CDATA[Google: Web Search]]></category>
		<category><![CDATA[Top News]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=121528</guid>
		<description><![CDATA[Earlier this week, Google Fellow Amit Singhal gave the opening keynote at SMX London. Although Matt Cutts has always been the public face of all parts of Google&#8217;s unpaid search, his realm is primarily web spam. Singhal has been speaking publicly more often (notably when Panda launched) and oversees search quality. Or, as he described [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/01/bio_singhal_full.jpg"><img class="alignright size-medium wp-image-109532" title="Amit Singhal" src="http://searchengineland.com/figz/wp-content/seloads/2012/01/bio_singhal_full-300x199.jpg" alt="Amit Singhal" width="300" height="199" /></a>Earlier this week, Google Fellow <a href="http://searchengineland.com/interview-with-amit-singhal-google-fellow-121342">Amit Singhal gave the opening keynote at SMX London</a>. Although Matt Cutts has always been the public face of all parts of Google&#8217;s unpaid search, his realm is primarily web spam. Singhal has been <a href="http://searchengineland.com/interesting-quotes-from-googles-search-lead-amit-singhal-110721">speaking publicly more often</a> (notably <a href="http://searchengineland.com/google-speaks-more-about-the-farmer-update-aka-panda-update-66801">when Panda launched</a>) and oversees search quality. Or, as he described in his talk, when he came to Google in 2000, he took a look at Sergey Brin&#8217;s code and entirely rewrote Google&#8217;s ranking algorithms.</p>
<p>Near the end of the talk, someone asked whether how much money Google will make is factored into decisions about changes to Google&#8217;s (unpaid search) algorithms. Singhal was adamant: &#8220;no revenue measurement is included in our evaluation of a rankings change.&#8221; Listening to him explain how excited he gets about search improvements and how changes are evaluated, you realize there&#8217;s no spin here. He&#8217;s absolutely telling the truth. And he would know. Chris Sherman asked if anyone at Google really understands how the whole thing works and he replied that while no one knows how <em>everything</em> works (all of unpaid search, AdWords, Android, etc.), he has a pretty good idea of how all of unpaid search works. Not many can make that claim.</p>
<p>Core to Singhal&#8217;s talk was a focus on what Google <em>does</em> look at when improving unpaid search algorithms. The key is always relevance.</p>
<p>Singhal talked about the evolution of Google&#8217;s unpaid search algorithms. In 2003, they worked on stemming and synonyms. This meant that those searching for [watch buffy the vampire slayer], [watching buffy the vampire slayer], and [view buffy the vampire slayer] would likely all see the same results. In 2007 came universal search, which was a big step forward in understanding searcher intent. (Searchers typing in [i have a dream] not only are looking for Martin Luther King Jr.&#8217;s speech, but would like to see a video of it.)</p>
<h2>Understanding Intent</h2>
<p>Ten years ago, search results were keyword-based, but Google is now moving towards understanding the intent behind the words. Singhal talked about Google&#8217;s acquisition of Metaweb, the company behind Freebase, which has done substantial work on understanding phrases as entities rather than strings. &#8220;Mount Everest&#8221; isn&#8217;t just two words; it&#8217;s also a mountain, with a height, in a location, and so on. (Shortly after the talk, <a href="http://searchengineland.com/google-launches-knowledge-graph-121585">Google launched their Knowledge Graph</a>, which is the next step in this understanding.) Combine intent with speech recognition and mobile devices and you almost end up with what Singhal first glimpsed years ago on Star Trek. We do, indeed, live in the future (almost).</p>
<h2>Personalization</h2>
<p>In 2012, Google took a big step (whether or not that step was forward is up for debate) towards greater personalization with <a href="http://searchengineland.com/googles-results-get-more-personal-with-search-plus-your-world-107285">Search Plus Your World</a>, which began incorporating Google+ into search results for those logged in. Singhal explained that Google+ integration was not the point; it was just a proof of concept. The point was a foundation for a wider world of (more secure) searching over everything: both what&#8217;s public in the world and what&#8217;s private to each searcher. Perhaps one day Google will in fact be able to find your car keys.</p>
<p>Singhal said that searcher click behavior shows that searchers are happy with this integration. But he acknowledged there&#8217;s work to be done. When asked when it would launch in Europe, he said that based on feedback, it&#8217;s undergoing improvements first.</p>
<h2>Relevance and Data: How Changes Are Evaluated</h2>
<p>Search Plus Your World is built and evaluated the way all ranking algorithm changes are: build, evaluate, launch, learn, improve, repeat. Relevance is key to every measurement. Singhal stepped through the process:</p>
<ol>
<li>An engineer at Google has an idea of a signal (one of over 200) that might be introduced or tweaked to improve overall relevance.</li>
<li>That algorithm change is run on a test set of data and if all looks good, human raters look at before and after results for a wide set of queries (a kind of manual A/B test). The human raters don&#8217;t know which is the before and which is the after. The raters report what percentage of queries got better (more relevant) and what percentage got worse (less relevant).</li>
<li>This process gets looped several times as the algorithm is tweaked to better serve results for the queries in the &#8220;worse&#8221; set.</li>
<li>Once the overall manual ratings show that the algorithm tweak makes results better overall, it&#8217;s all tested again. This time, a data center (one of many that contains Google&#8217;s index and serves results to searchers) is loaded with the new algorithm and a very small slice of searchers (typically 1%) see the modified result set. Are those searchers happier than the ones seeing the version of results without the tweak? Singhal says they compare where searchers click. Clicks on higher ranked pages mean results at the top are likely more relevant, and searchers are happier. (He didn&#8217;t say so, but they may look at other data, such as click and back behavior.)</li>
<li>An independent analyst compiles the results and provides a statistical analysis, which is presented at a <a href="http://searchengineland.com/an-unprecedented-video-glimpse-into-how-google-crafts-its-search-results-114682">search quality meeting</a>, where engineers look at the data and debate the change. If they decide this tweak improves the quality of search results overall (and is good for the web and doesn&#8217;t overly tax internal systems), the change goes out.</li>
</ol>
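<p>The blind before-and-after rating in step 2 can be sketched roughly as follows. This is only an illustration of the idea as described in the talk, not the tooling Google uses: the function names, the left/right presentation, and the "same" rating are all assumptions made for the example.</p>

```python
import random
from collections import Counter


def make_blind_pair(before, after, rng):
    """Place the new ("after") results on a random side so raters cannot tell which is which."""
    after_side = rng.choice(["left", "right"])
    if after_side == "left":
        return after, before, after_side
    return before, after, after_side


def decode(preferred_side, after_side):
    """Translate a rater's side preference back into better / worse / same."""
    if preferred_side == "same":
        return "same"
    return "better" if preferred_side == after_side else "worse"


def summarize(outcomes):
    """Return the percentage of queries rated better, worse, or unchanged."""
    counts = Counter(outcomes)
    total = len(outcomes)
    if total == 0:
        return {"better": 0.0, "worse": 0.0, "same": 0.0}
    return {label: 100.0 * counts[label] / total
            for label in ("better", "worse", "same")}
```

<p>The queries that land in the "worse" bucket are the ones that would feed the next round of tweaking described in step 3.</p>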
<p>This process is happening all of the time with lots of different proposed tweaks and tests. 525 algorithm changes were launched in 2011. That may seem like a lot, but earlier this year Singhal noted that many <a href="http://www.thisislondon.co.uk/news/techandgadgets/the-human-search-engine-7315344.html">more changes were tested</a>.</p>
<blockquote>&#8220;Concurrently we have approximately 100 ideas floating around that people are testing &#8211; we test thousands in a year. Last year we ran around 20,000 experiments. Clearly they don&#8217;t all make it out there but we run the process very scientifically.&#8221;</blockquote>
<p>Aggregated data from millions of searchers typing millions of queries provides clear patterns. Singhal said that not only do those who get better results more quickly click higher in the search results, but they also search more. (We&#8217;ve heard this before from Google. Marissa Mayer, for instance, has noted that a <a href="http://glinden.blogspot.fr/2006/11/marissa-mayer-at-web-20.html">half-second delay in rendering search results resulted in 20% fewer searches</a>.)</p>
<p>Singhal noted that the kind of personalization platform envisioned with Search Plus Your World is harder to test. Human evaluation looks at relevance, but personal relevance is unique for each searcher. All Google really has to go on is click behavior. <a href="http://searchengineland.com/two-weeks-in-google-search-plus-your-world-109527">Singhal talked with Danny Sullivan</a> about this dilemma a few weeks after Search Plus Your World launched:</p>
<blockquote>&#8220;Every time a real user is getting those results, they really are delighted. Given how personal this product is, you can only judge it based on personal experiences or by aggregate numbers you can observe through click-through.&#8221;</blockquote>
<p>All of this gets complicated by varied screen size. The user interface becomes more important as increased use of mobile devices and tablets shrinks screen real estate.</p>
<p>If these changes are all about increased relevance, why is only Google+ represented in Search Plus Your World? Why not Facebook and Twitter? Singhal explained that most personally useful Facebook data is locked behind a login, and Twitter produces content at a rate that is too massive for Google to crawl quickly and comprehensively. Or, they could, but it would probably take down the Twitter servers. Twitter has also had some <a href="http://searchengineland.com/how-twitters-technical-infrastructure-issues-are-impacting-google-search-results-86229">technical issues that have made crawling difficult</a>, although these are being fixed.</p>
<h2>What About Panda and Penguin?</h2>
<p>Singhal said that Google&#8217;s algorithms aren&#8217;t perfect (hence the 20,000 experiments a year). He looks at bad queries every day (and encouraged the audience to let him know about them! So, add them in the comments on this post and we&#8217;ll forward them along). But when asked specifically about <a href="http://searchengineland.com/library/google/google-panda-update/panda-update-news">Panda</a> and <a href="http://searchengineland.com/google-launches-update-targeting-webspam-in-search-results-119295">Penguin</a>, two of the latest high profile algorithm changes, he said that data has shown they significantly improved the number of high quality sites being returned in results. They are not only refining what signals they use in ranking, but are improving how they gather and tune the signals themselves (so signal quality is higher). They are constantly looking for aberrations in signals.</p>
<p>At the end of the day, he said, site owners need to take a hard look at what value their sites are providing. What is the additional value the visitor gets from that site beyond just a skeleton answer? Ultimately, it&#8217;s those sites that provide that something extra that Google wants to showcase on the first page of search results.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/peeking-into-the-world-of-googles-algorithm-changes-with-google-search-quality-head-amit-singhal-121528/feed</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>Google Webmaster Tools Expands Query Data to 90 Days</title>
		<link>http://searchengineland.com/google-webmaster-tools-expands-query-data-to-90-days-119602</link>
		<comments>http://searchengineland.com/google-webmaster-tools-expands-query-data-to-90-days-119602#comments</comments>
		<pubDate>Thu, 26 Apr 2012 21:19:00 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Features: Analysis]]></category>
		<category><![CDATA[Google: Webmaster Central]]></category>
		<category><![CDATA[Top News]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=119602</guid>
		<description><![CDATA[Today, Google has expanded the historical search query data to 90 days. The number of queries reported has increased as well: the report will now list the top 2,000 for each day of the selected date range (vs. the previous top 1,000). This is great news, as this is data not available anywhere else and [...]]]></description>
			<content:encoded><![CDATA[<p>Today, Google has <a href="http://googlewebmastercentral.blogspot.com/2012/04/even-more-top-search-queries-data.html">expanded</a> the historical search query data to 90 days. The number of queries reported has increased as well: the report will now list the top 2,000 for each day of the selected date range (vs. the previous top 1,000). This is great news, as this is data not available anywhere else and when looking at trends, the more information, the better. Google has made a few other minor adjustments to this data recently. So if you use Google webmaster tools query data, see below for all the details of how these reports work.</p>
<h2>What&#8217;s In the Top Search Queries Report</h2>
<p>First, a refresher on what this data is all about. The top search queries report (available in Google webmaster tools for sites you&#8217;ve verified ownership of by selecting <strong>Your site on the web &gt; Search queries</strong>) lists the top queries that brought traffic to your site from Google organic search (from all countries and properties).</p>
<h3>Summary Data</h3>
<p>For the selected date range, the report shows the total number of queries that brought traffic to the site, the total impressions and clicks the site received, and the number of impressions and clicks for the reported top queries.</p>
<h3>Query-Specific Data</h3>
<p>For each query, the report notes:</p>
<ul>
<li><strong>Number of impressions</strong> - how many searchers saw the site in search results for that query</li>
<li><strong>Number of clicks</strong> - how many searchers clicked on the search result for that query</li>
<li><strong>Click through rate</strong> - the percentage of searchers who saw the site in search results for that query and clicked on it</li>
<li><strong>Average position</strong> - the average position the highest ranked URL for that site appeared in search results for the query across all searchers</li>
<li>In addition, you can find out the change for each of these data points from the previous period. However, the change percentages aren&#8217;t available for time periods longer than 30 days. The change details used to be visible by default, but they&#8217;re now off by default. You&#8217;ll need to click the <strong>With Change</strong> button to see them in the report (although they&#8217;ll be included automatically with the CSV download). If you have the change percentage displayed, you&#8217;ll need to turn that off in order to expand the date range beyond 30 days.</li>
</ul>
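<p>To make the relationship between these data points concrete, here is a minimal sketch of how click through rate and a period-over-period change could be computed from the impressions and clicks in a downloaded report. The function names are illustrative, and the change formula is an assumption; Google does not document exactly how it calculates the change column.</p>

```python
def click_through_rate(clicks, impressions):
    """Percentage of impressions that resulted in a click."""
    if impressions == 0:
        return 0.0
    return 100.0 * clicks / impressions


def percent_change(current, previous):
    """Relative change from the previous period, or None when there is no baseline."""
    if previous == 0:
        return None
    return 100.0 * (current - previous) / previous
```

<p>For example, 50 clicks on 1,000 impressions is a 5% click through rate, and a rise from 100 to 120 impressions is a 20% change.</p>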
<p>You can click into any query to get more specific data, including the pages that ranked for the query, and the impressions, clicks, and click-through rate at each position the site ranked.</p>
<h3>Country and Property-Specific Data</h3>
<p>Use the filters to drill into what queries brought traffic from Google properties (web, video, images, mobile web, and smartphones) and from specific countries.</p>
<h2>How the Numbers Are Aggregated</h2>
<p>As I <a href="http://searchengineland.com/google-webmaster-tools-adds-useful-download-options-108684">explained in a previous post</a>, the numbers can be tricky and it&#8217;s important to understand what data you&#8217;re really looking at. These reports now list the top 2,000 queries that brought traffic to the site for each day of the selected time period. That means that if a query wasn&#8217;t one of the top 2,000 on a given day in the selected range, data won&#8217;t be reported for it on that day. In the example below, the time period is 30 days, but only 6 of those days have data reported for the query (as illustrated by the dots in the graph).</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/04/queries.png"><img class="alignnone size-large wp-image-119616" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google webmaster tools top queries graph" src="http://searchengineland.com/figz/wp-content/seloads/2012/04/queries-600x127.png" alt="Google webmaster tools top queries graph" width="600" height="127" /></a></p>
<h2>Generating Data</h2>
<p>When generating or downloading the data, keep in mind the following:</p>
<ul>
<li>The default ending date in the user interface display is today&#8217;s date, but reporting is typically 2-3 days behind, so check the last date reported in the graph by hovering over the last dot. The default starting date of the range is 30 days before the end date. Because this end date is generally three days ahead of what&#8217;s actually reported, the date range actually shown by default is generally 27 days. Make sure you adjust the dates before analyzing data or comparing it to other time periods.</li>
<li>Because the default shown date range is 27 days, the download available from the Python script is also 27 days.</li>
</ul>
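<p>The date arithmetic behind that 27-day figure can be checked with a short sketch. The 30-day window and three-day lag come from the description above; the function name and the idea of passing the lag as a parameter are assumptions for illustration, since the real lag varies.</p>

```python
from datetime import date, timedelta


def default_reported_range(today, window_days=30, lag_days=3):
    """Return (start, last_reported, span_in_days) for the default report view.

    The UI ends the range on today's date, but data stops lag_days earlier.
    """
    end = today
    start = end - timedelta(days=window_days)
    last_reported = end - timedelta(days=lag_days)
    return start, last_reported, (last_reported - start).days
```

<p>Calling it with the date this post was published, <code>default_reported_range(date(2012, 4, 26))</code>, gives a range running from March 27 to April 23, a span of 27 days.</p>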
<div>It&#8217;s great that Google is offering more data (both number of queries and length of time). Just be sure that as you use this data, you understand exactly what you&#8217;re looking at and comparing (both in terms of date range and data reported per day).</div>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/google-webmaster-tools-expands-query-data-to-90-days-119602/feed</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>No, Bing Doesn&#8217;t Support Pagination Attributes to Consolidate Pages In A Series</title>
		<link>http://searchengineland.com/no-bing-doesnt-support-pagination-attributes-to-consolidate-pages-in-a-series-118694</link>
		<comments>http://searchengineland.com/no-bing-doesnt-support-pagination-attributes-to-consolidate-pages-in-a-series-118694#comments</comments>
		<pubDate>Thu, 19 Apr 2012 18:47:44 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Microsoft: Bing SEO]]></category>
		<category><![CDATA[SEO: Duplicate Content]]></category>
		<category><![CDATA[Top News]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=118694</guid>
		<description><![CDATA[Last week, the Bing Webmaster blog published a post about how Bing handles rel=&#8221;next&#8221; and rel=&#8221;prev&#8221; attributes. On the surface, it seemed as though Bing was announcing that it now supported these tags in the same way Google does. Last September, Google announced support of the rel=&#8221;prev&#8221; and rel=&#8221;next&#8221; attributes to designate paginated content, which enables [...]]]></description>
			<content:encoded><![CDATA[<p><img class="aligncenter size-full wp-image-93066" title="bing-search-featured" src="http://searchengineland.com/figz/wp-content/seloads/2011/09/bing-search-featured.jpg" alt="" width="570" height="270" /></p>
<p>Last week, the Bing Webmaster blog published a post about how <a href="http://www.bing.com/community/site_blogs/b/webmaster/archive/2012/04/13/implementing-markup-for-paginated-and-sequenced-content.aspx">Bing handles rel=&#8221;next&#8221; and rel=&#8221;prev&#8221; attributes</a>. On the surface, it seemed as though Bing was announcing that it now supported these tags in the same way Google does. Last September, <a href="http://searchengineland.com/google-provides-new-options-for-paginated-content-92906">Google announced support of the rel=&#8221;prev&#8221; and rel=&#8221;next&#8221; attributes</a> to designate paginated content, which enables site owners to cluster multiple pages of content into single entities so that indexing and other values can be consolidated.</p>
<p>However, the conclusion of the post included the following, which didn&#8217;t align with Google&#8217;s treatment of these tags:</p>
<blockquote>&#8220;Implementing these rel=&#8221;next&#8221; and rel=&#8221;prev&#8221; link elements doesn&#8217;t trigger a new visual treatment for your pages on our search result pages. It does, however, allow us to more comprehensively understand and index your content&#8221;.</blockquote>
<p>I talked to Bing&#8217;s Duane Forrester to clarify how Bing treats these tags. He confirmed that Bing does <em>not</em> use these tags to consolidate pages of a series into a single entity. Rather, Bing may use this markup in two ways: to aid discovery of pages and to enhance the search results display (possibly with links to the next and previous page, for instance), although it wasn&#8217;t clear whether either was happening yet. He said:</p>
<blockquote>&#8220;We may use our newly gained knowledge on your site&#8217;s structure to provide easy access to other sections of the paginated or sequenced content from our results pages in the future. In addition, webmasters implementing these elements may benefit from more comprehensive indexing over time as we apply our newly gained knowledge of a site’s structure to our indexing heuristics.&#8221;</blockquote>
<p>So what&#8217;s a site owner with paginated content to do? The best bet is to <a href="http://searchengineland.com/implementing-pagination-attributes-correctly-for-google-114970">implement the tags for Google&#8217;s benefit</a> and as a bonus, get potential additional crawling or display improvements from Bing. If pagination of your content is a significant problem in Bing (which may be the case particularly for some large e-commerce sites), consider blocking excessive pages with a Bing-specific robots.txt directive or noindex meta tag.</p>
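<p>As a rough illustration of what implementing the tags involves, the sketch below generates the link elements for one page of a series: the first page gets only rel="next", the last page only rel="prev", and pages in between get both. The function name and example URLs are hypothetical, and the output belongs in each page's head section.</p>

```python
from html import escape


def pagination_links(page_urls, index):
    """Return the <link> elements for the page at `index` in an ordered series of URLs."""
    links = []
    if index > 0:
        # Every page except the first points back to its predecessor.
        links.append('<link rel="prev" href="%s">'
                     % escape(page_urls[index - 1], quote=True))
    if index < len(page_urls) - 1:
        # Every page except the last points forward to its successor.
        links.append('<link rel="next" href="%s">'
                     % escape(page_urls[index + 1], quote=True))
    return links
```

<p>For a three-page series, the middle page would carry both elements, while the first and last pages would carry one each.</p>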
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/no-bing-doesnt-support-pagination-attributes-to-consolidate-pages-in-a-series-118694/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Google Webmaster Tools Crawl Errors: How To Get Detailed Data From the API</title>
		<link>http://searchengineland.com/google-webmaster-tools-crawl-errors-how-to-get-detailed-data-from-the-api-115153</link>
		<comments>http://searchengineland.com/google-webmaster-tools-crawl-errors-how-to-get-detailed-data-from-the-api-115153#comments</comments>
		<pubDate>Sat, 17 Mar 2012 18:38:49 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Features: Analysis]]></category>
		<category><![CDATA[Google: SEO]]></category>
		<category><![CDATA[Google: Webmaster Central]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=115153</guid>
		<description><![CDATA[Earlier this week, I wrote about my disappointment that granular data (the number of URLs reported, the specifics of the errors&#8230;) was removed from Google webmaster tools. However, as I&#8217;ve been talking with Google, I&#8217;ve discovered that much of this detail is still available via the GData API. That this detail was available through the [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier this week, I wrote about my disappointment that granular data (the number of URLs reported, the specifics of the errors&#8230;) was <a href="http://searchengineland.com/google-webmaster-tools-revamps-crawl-errors-but-is-it-for-the-better-114892">removed from Google webmaster tools</a>. However, as I&#8217;ve been talking with Google, I&#8217;ve discovered that much of this detail is still available via the <a href="http://code.google.com/apis/webmastertools/docs/2.0/developers_guide.html">GData API</a>. That this detail was available through the API wasn&#8217;t at all obvious to me from reading their <a href="http://googlewebmastercentral.blogspot.com/2012/03/crawl-errors-next-generation.html">blog post about the changes</a>. The post included the following:</p>
<blockquote>&#8220;For those who worry that 1000 error details plus a total aggregate count will not be enough, we’re considering adding programmatic access (an API) to allow you to download every last error you have, so please give us feedback if you need more.&#8221;</blockquote>
<p>That led me to believe that the current API would only provide access to the same data available from the downloads from the UI. But in any case, up to 100,000 URLs for each error and the details of most of what has gone missing are in fact available <a href="http://code.google.com/apis/webmastertools/">through the API now</a>, so rejoice!</p>
<p>The data is a little tricky to get to and the specifics of what&#8217;s available vary based on how you retrieve it. Two different types of files are available that provide detail about crawl errors:</p>
<ul>
<li>A download of eight CSV files, one of which is a list of all crawl errors</li>
<li>A crawl errors feed, which enables you to programmatically fetch 25 errors at a time</li>
</ul>
<p>(Thanks to <a href="https://twitter.com/#!/RyanJones/status/180005866550996992">Ryan Jones</a> and <a href="http://hackingsearch.com/">Ryan Smith</a> for help in tracking these details down.)</p>
<p>What this means is that different slices of data are available in four ways:</p>
<ul>
<li>User interface display</li>
<li>User interface-based CSV download</li>
<li>API-based download</li>
<li>API-based feed</li>
</ul>
<div>What you&#8217;re able to see about each error is different based on how you access it.</div>
<h2>CSV Download</h2>
<p>Eight CSV files are available through the API (you can download them all for a single site or for all sites in your account at once as well as just a specific CSV and a specific date range), but this support is not built into most of the available client libraries. You&#8217;ll need to build it in yourself or use the <a href="http://code.google.com/p/php-webmaster-tools-downloads/wiki/Running#Introduction">PHP client library</a> (which seems to be the only one that has support built in). The CSV files are:</p>
<ul>
<li>Top Pages</li>
<li>Top Queries</li>
<li>Crawl Errors</li>
<li>Content Errors</li>
<li>Content Keywords</li>
<li>Internal Links</li>
<li>External Links</li>
<li>Social Activity</li>
</ul>
<p>For the topic at hand, let&#8217;s dive into the crawl errors CSV. It contains the following data:</p>
<ul>
<li>Up to 100,000 URLs for each type of error (rather than the 1,000 maximum available through the download link in the UI)</li>
<li>The full list of URLs blocked by robots.txt (which is no longer available at all in the UI)</li>
<li>Specifics of &#8220;not followed&#8221; errors (the UI reports only the status code returned by the URL, while the CSV includes what the actual problem was, such as &#8220;too many redirects&#8221;)</li>
<li>Specifics of site-wide server errors (the UI no longer lists the specific URLs that returned the error or the specific error)</li>
<li>Specifics about &#8220;soft 404s&#8221; (the UI doesn&#8217;t include the detail of the type of soft 404)</li>
</ul>
<p>This file does not include details on crawl error sources (but that is available through the crawl errors feed, described below).</p>
<h2>Crawl Errors Feed</h2>
<p>It appears that the <a href="http://code.google.com/apis/webmastertools/docs/2.0/reference.html#Feeds_Crawl">crawl errors feed</a> request code is built into the <a href="http://code.google.com/p/gdata-java-client/">Java</a> and <a href="http://code.google.com/p/gdata-objectivec-client/">Objective-C</a> client libraries, but you&#8217;ll have to write your own code to request this if you&#8217;re using a different library. You can fetch 25 errors at a time and programmatically loop through them all. The information returned is in the following format:</p>
<pre>&lt;atom:entry&gt;
  &lt;atom:id&gt;id&lt;/atom:id&gt;
  &lt;wt:crawl-type&gt;web-crawl&lt;/wt:crawl-type&gt;
  &lt;wt:issue-type&gt;http-error&lt;/wt:issue-type&gt;
  &lt;wt:url&gt;http://example.com/dir/&lt;/wt:url&gt;
  &lt;wt:detail&gt;4xx Error&lt;/wt:detail&gt;
  &lt;wt:linked-from&gt;http://example.com&lt;/wt:linked-from&gt;
  &lt;wt:date-detected&gt;2008-11-17T01:06:10.000&lt;/wt:date-detected&gt;
&lt;/atom:entry&gt;</pre>
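<p>If you are writing your own code against the feed, entries in that format can be read with a standard XML parser. The sketch below uses Python's built-in ElementTree. The enclosing <code>atom:feed</code> element and the <code>wt</code> namespace URI are assumptions on my part (check them against a real response), and the sample data simply mirrors the format shown above.</p>

```python
import xml.etree.ElementTree as ET

# Namespace URIs: the Atom one is standard; the wt one is an assumption.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "wt": "http://schemas.google.com/webmasters/tools/2007",
}

SAMPLE = """<atom:feed xmlns:atom="http://www.w3.org/2005/Atom"
    xmlns:wt="http://schemas.google.com/webmasters/tools/2007">
  <atom:entry>
    <atom:id>id</atom:id>
    <wt:crawl-type>web-crawl</wt:crawl-type>
    <wt:issue-type>http-error</wt:issue-type>
    <wt:url>http://example.com/dir/</wt:url>
    <wt:detail>4xx Error</wt:detail>
    <wt:linked-from>http://example.com</wt:linked-from>
    <wt:date-detected>2008-11-17T01:06:10.000</wt:date-detected>
  </atom:entry>
</atom:feed>"""


def parse_crawl_issues(feed_xml):
    """Turn each atom:entry into a plain dict of the crawl issue fields."""
    root = ET.fromstring(feed_xml)
    issues = []
    for entry in root.findall("atom:entry", NS):
        issues.append({
            "url": entry.findtext("wt:url", default="", namespaces=NS).strip(),
            "issue_type": entry.findtext("wt:issue-type", default="", namespaces=NS).strip(),
            "detail": entry.findtext("wt:detail", default="", namespaces=NS).strip(),
            "detected": entry.findtext("wt:date-detected", default="", namespaces=NS).strip(),
            # An entry may list more than one source page.
            "linked_from": [e.text.strip()
                            for e in entry.findall("wt:linked-from", NS) if e.text],
        })
    return issues
```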
<h2>How the Data Differs</h2>
<p>Let&#8217;s take a look at some real examples of how what you get from the API differs from the UI and UI-based download.</p>
<h3>Number of Errors Shown</h3>
<p>The UI shows the total count over time, but only lists up to 1,000 URLs for each error. The API-based CSV contains up to 100,000 URLs for each type of error. However, it gives you only the current snapshot of errors and doesn&#8217;t provide a total count of each error (if more than 100,000 exist for any given one). This is the same information you can get from the API-based crawl errors feed. The UI-based CSV also only shows you the current snapshot of errors and lists only up to 1,000 URLs (the same as the UI).</p>
<p>For searchengineland.com&#8217;s &#8220;not found&#8221; errors:</p>
<ul>
<li>The UI shows that Google encountered 4,981 URLs (down from around 8,000 in December) and displays 1,000 of them, along with the corresponding response code (and the list of sources via a popup):
<a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi1.png"><img class="alignnone size-large wp-image-115575" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google webmaster tools crawl errors" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi1-600x332.png" alt="Google webmaster tools crawl errors" width="600" height="332" />
</a></li>
<li>The UI-based CSV lists 1,238 of them, along with the corresponding response code:
<a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi2.png"><img class="alignnone size-large wp-image-115576" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Crawl Errors Download" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi2-600x52.png" alt="Crawl Errors Download" width="600" height="52" /></a></li>
<li>The API-based CSV lists 2,867 of them, along with the corresponding response code and the number of incoming links (but not the sources):
<a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi3.png"><img class="alignnone size-large wp-image-115577" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Crawl Errors API CSV" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi3-600x66.png" alt="Crawl Errors API CSV" width="600" height="66" /></a></li>
<li>The API-based feed presumably would provide all 4,981 URLs (in 25 URL increments), the response code, and the list of sources.</li>
</ul>
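<p>To get a feel for what paging through the feed in 25-URL increments involves, here is a rough sketch of the arithmetic in Python. It assumes 1-based start offsets, which is an assumption on my part rather than something confirmed here, and the function name feed_page_starts is made up for illustration.</p>

```python
import math

PAGE_SIZE = 25  # the feed returns URLs in 25-URL increments, per the text above


def feed_page_starts(total_urls, page_size=PAGE_SIZE):
    """Return the assumed 1-based start offset of every page needed."""
    pages = math.ceil(total_urls / page_size)
    return [1 + i * page_size for i in range(pages)]


starts = feed_page_starts(4981)
print(len(starts), "requests; first offset", starts[0], "last offset", starts[-1])
```

<p>Under those assumptions, retrieving all 4,981 URLs would take 200 separate requests.</p>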
<p>I checked a site that the UI indicated had more than 100,000 of a particular error (257,065) and found that the corresponding API-based CSV file listed 100,000 of those. The UI-based CSV listed 1,000.</p>
<h3>Error Sources</h3>
<p>The UI shows error sources, but only at the individual URL level. Click a URL, then click the Linked From tab to see the list. You can&#8217;t download this list from the UI in aggregate or for an individual URL.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi4.png"><img class="alignnone size-large wp-image-115588" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Crawl Error Sources" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi4-600x403.png" alt="Crawl Error Sources" width="600" height="403" /></a></p>
<p>The API-based CSV indicates the number of sources linking to each URL (which can be useful as you can tackle issues with URLs that have a lot of links first), but doesn&#8217;t list what those sources are.</p>
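<p>As a minimal sketch of that prioritization, the Python below sorts rows by link count so the most-linked URLs come first. The sample data and the column headers (URL, Issue, Detail, Links) are hypothetical; check the headers in your own download before relying on them.</p>

```python
import csv
import io

# Hypothetical sample; the real column headers in the API-based CSV may differ.
SAMPLE = """URL,Issue,Detail,Links
http://example.com/a,Not found,404 (Not found),3
http://example.com/b,Not found,404 (Not found),27
http://example.com/c,Not found,404 (Not found),0
"""


def by_link_count(csv_text, links_column="Links"):
    """Return rows sorted so the URLs with the most incoming links come first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda row: int(row[links_column]), reverse=True)


for row in by_link_count(SAMPLE):
    print(row["Links"], row["URL"])
```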
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi5.png"><img class="alignnone size-large wp-image-115595" title="Crawl Error Sources Count" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi5-600x12.png" alt="Crawl Error Sources Count" width="600" height="12" /></a></p>
<p>The API-based feed provides details on the source for each URL. Below is an example from searchengineland.com:</p>
<pre>&lt;atom:entry&gt;
&lt;atom:id&gt;https://www.google.com/webmasters/tools/feeds/
http%3A%2F%2Fsearchengineland.com%2F/crawlissues/27&lt;/atom:id&gt;
&lt;atom:updated&gt;2012-03-19T17:18:18.907Z&lt;/atom:updated&gt;
&lt;atom:category scheme='http://schemas.google.com/g/2005#kind'
 term='http://schemas.google.com/webmasters/tools/
2007#crawl_issue_entry'/&gt;
&lt;atom:title type='text'&gt;Crawl Issue&lt;/atom:title&gt;
&lt;atom:link rel='self' type='application/atom+xml'
href='https://www.google.com/webmasters/tools/feeds/
http%3A%2F%2Fsearchengineland.com%2F/crawlissues/27'/&gt;
&lt;wt:crawl-type xmlns:wt='http://schemas.google.com/
webmasters/tools/2007'&gt;web-crawl&lt;/wt:crawl-type&gt;
&lt;wt:issue-type xmlns:wt='http://schemas.google.com/webmasters/
tools/2007'&gt;not-found&lt;/wt:issue-type&gt;
&lt;wt:url xmlns:wt='http://schemas.google.com/webmasters/
tools/2007'&gt;http://searchengineland.com/10-optimization-
secrets-to-drive-more-mobile-traffic-from-facebook-114316/
www.linkedin.com/in/brianklais&lt;/wt:url&gt;
&lt;wt:date-detected xmlns:wt='http://schemas.google.com/
webmasters/tools/2007'&gt;2012-03-17T05:58:35.000&lt;/wt:date-detected&gt;
&lt;wt:detail xmlns:wt='http://schemas.google.com/webmasters/tools/2007'&gt;
404 (Not found)&lt;/wt:detail&gt;
<strong>&lt;wt:linked-from xmlns:wt='http://schemas.google.com/webmasters/tools/2007'&gt;http://searchengineland.com/10-optimization-secrets-to-drive-more-mobile-traffic-from-facebook-114316/comment-page-1&lt;/wt:linked-from&gt;</strong>
&lt;/atom:entry&gt;</pre>
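<p>If you process entries like this programmatically, something along these lines may be a reasonable starting point. It is a sketch using Python&#8217;s standard library, run against a trimmed-down entry with placeholder example.com URLs; the namespace declarations on the entry element are added here only so the fragment parses on its own.</p>

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "wt": "http://schemas.google.com/webmasters/tools/2007",
}

# Trimmed-down entry modeled on the example above (placeholder URLs).
ENTRY = """<atom:entry xmlns:atom="http://www.w3.org/2005/Atom"
    xmlns:wt="http://schemas.google.com/webmasters/tools/2007">
  <wt:issue-type>not-found</wt:issue-type>
  <wt:url>http://example.com/missing-page</wt:url>
  <wt:detail>404 (Not found)</wt:detail>
  <wt:linked-from>http://example.com/page-with-bad-link</wt:linked-from>
</atom:entry>"""


def parse_crawl_issue(entry_xml):
    """Extract the URL, issue type, detail, and link sources from one entry."""
    entry = ET.fromstring(entry_xml)
    return {
        "url": entry.findtext("wt:url", namespaces=NS),
        "issue_type": entry.findtext("wt:issue-type", namespaces=NS),
        "detail": entry.findtext("wt:detail", namespaces=NS),
        "linked_from": [el.text.strip() for el in entry.findall("wt:linked-from", NS)],
    }


issue = parse_crawl_issue(ENTRY)
print(issue["url"], "is linked from", issue["linked_from"])
```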
<h3>Date of Latest Crawl</h3>
<p>As far as I can tell, this detail is available only in the UI (once you click on a URL):</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/firstcrawled.png"><img class="alignnone size-large wp-image-115728" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="First Crawled Date" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/firstcrawled-600x228.png" alt="First Crawled Date" width="600" height="228" /></a></p>
<h3>In Sitemaps</h3>
<p>This level of detail also seems to be only available in the UI (once you click on an individual URL and then the In Sitemaps tab). This information is also available in aggregate from the Sitemaps section of the UI (but not as a download &#8212; either from the UI or from the API).</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/sitemap-errors.png"><img class="alignnone size-large wp-image-115729" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Sitemap URL Errors" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/sitemap-errors-600x134.png" alt="Sitemap URL Errors" width="600" height="134" /></a></p>
<h3>Not Followed Errors</h3>
<p>The UI shows the URL&#8217;s response code (such as 301), which, as I noted in my earlier article, is somewhat misleading and not that useful for investigating the error. The report can be misread to mean that 301 response codes <em>are</em> errors. What this report actually provides is a list of URLs that returned either a 301 or 302 response code that Googlebot couldn&#8217;t follow due to a problem with the redirect.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi6.png"><img class="alignnone size-large wp-image-115599" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google webmaster tools not followed errors" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi6-600x140.png" alt="Google webmaster tools not followed errors" width="600" height="140" /></a></p>
<p>The UI-based CSV provides similar data (although not the additional details available in the UI when you click the URL).</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi7.png"><img class="alignnone size-large wp-image-115602" title="Not Followed Download" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi7-600x102.png" alt="Not Followed Download" width="600" height="102" /></a></p>
<p>The API-based CSV lists &#8220;redirect error&#8221; for these specific URLs.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi8.png"><img class="alignnone size-large wp-image-115612" title="Crawl Errors" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwtapi8-600x102.png" alt="Crawl Errors" width="600" height="102" /></a></p>
<p>That particular error isn&#8217;t any more helpful than the response code, but in some cases, this file lists a more specific issue (this detail is one of the pieces of data that used to display in the UI). <a href="http://support.google.com/webmasters/bin/answer.py?hl=en&amp;answer=35156">Possible values include</a>:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/redirects.png"><img class="alignnone size-full wp-image-115725" title="Redirects" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/redirects.png" alt="Redirects" width="399" height="118" /></a></p>
<p>For instance, here&#8217;s a (slightly obscured) example of a URL with a &#8220;redirect URL too long&#8221; error:</p>
<p>http://www.example.org/A%20Category%20Page%0A/Www.Sample.Net
%0A/Topic%20Topic1%0A/A%20Category%20Page%20Topic%20Topic1
%0A/Topic%20Topic3%0A/Excercise%0A/Smoothies%0A/Topic2ix%0A/
Topic%20Trainer%0A/Topic3%20Programs%20%0A/Topic%20Assessments
%20%0A/Individual%20Or%20Group%20Topic3%20Topic1%20%0A/Topic4
%20Guidance%20%0A/Category2%20Loss%20Supervision%20%0A/Category2
%20Loss%0A/Support%20And%20Motivation%20%0A/Topic9%20Specific
%20Category4%20%0A/Cardiovascular%20Category4%20%0A/Topic9%20
Specific%20Category4%20%20%20%0A/Home%20Gym%20Selection%0A/
Home%20Gym%20Use%0A/Topic12%0A/Topic10%20Topic15%20Topic12%0A/
Workout%20Routines%0A/Category2%20Topic1%0A/Topic3%0A/Topic9%0A/
Topic11%20Topic12%0A/Lean%0A/Lean%20Category3%0A/Nutrition%0A/
Gift%20Topic13s%0A/Topic3%20And%20Wellness%0A/Category5%0A/Topic3
%20Programs%20For%20All%20Ages%20%0A/Topic%20Assessments%20%0A/
Individual%20Or%20Group%20Topic1%20%0A/Topic4%20Guidance%20%0A/
Category2%20Loss%20Supervision%20%0A/Support%20And%20Motivation
%20%20%0A/Cardio%20%0A/Topic9%20Specific%20Category4%0A/Category2
%20Topic1%0A/Category2%20Topic14%20Routines%0A/Premier%20Topic%20
Trainer%0A/Category2%20Loss%20Help%0A/Safe%20Category2%20Loss%0A/
Wellness%0A/Nutrition%0A/Category3%20Topic3%0A/Exercise%20Health%0A/
Category2%20Loss%20Product%0A/Category6ing%0A/Topic3%20Equipment%0A/
Category2%20Loss%20Supplement%0A/Loose%20Category2%0A/Health%0A/
Healthy%20Category6%0A/Fat%20Loss%0A/Category6%20Plan%0A/Online%20
Topic3%20Program%0A/Topic3%20Center%0A/Gym%20Exercise%0A/Fast%20
Category2%20Loss%0A/Lose%20Category2%20Fast%0A/6%20Topic15%20
Topic12%0A/Home%20Gym%20Equipment%0A/Gym%0A/Topic3%20Course%0A/
Topic3%20Exercise%0A/Gym%20Equipment%0A/Topic3%20Topic1%20Program%0A/
Gym%0A/Exercise%20Program%0A/Category2%20Loss%20Category6%0A/Topic%20
Topic3%0A/Exercise%0A/Sport%20And%20Topic3%0A/Topic3%20Class%0A/Topic
%20Topic3%20Trainer%0A/Topic10%20Topic15%20Exercise%0A/Health%20And%20
Topic3%0A/Topic3%20Australia%0A/Category2%20Loss%20Program%0A/Topic3%20
Trainer%0A/Topic%20Topic1%20Topic16%0A/Topic3%0A/Kick%20Topic17%0A/
6%20Topic15%0A/Category2%20Topic1%0A/Topic%20Trainers%0A/Category2%20
Loss%0A/Topic%20Trainer%0A/Home%20Gym%0A/Topic%20Topic1%0A/Category6
%0A/Quick%20Category2%20Loss%20%0A/Category2%20Loss%20Plan%0A/
Category7%20Category2%20Loss%0A/Easy%20Category2%20Loss%0A/Category2
%20Loss%20Tip%0A/Healthy%20Category2%20Loss%0A/Rapid%20Category2%20
Loss%0A/Topic8%20135%0A/Category2%201</p>
<p>An &#8220;invalid&#8221; redirect brought me to this page (which was a 302 to a 404):</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/invalid-redirect.png"><img class="alignnone size-full wp-image-115724" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Invalid Redirect" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/invalid-redirect.png" alt="Invalid Redirect" width="582" height="429" /></a></p>
<p>Empty redirects are those with no location information.</p>
<p>The API-based feed also provides these specifics as part of the feed.</p>
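<p>To make these categories concrete, here is a rough Python sketch that walks a redirect chain and names the first problem it finds. It is not how Googlebot works internally: the hop limit, the length limit, and the label strings are assumptions chosen for illustration, and a dictionary stands in for real HTTP fetches.</p>

```python
MAX_HOPS = 5            # assumed limit; Googlebot's actual limit isn't stated here
MAX_URL_LENGTH = 2000   # assumed limit, chosen only for illustration


def classify_redirect(start_url, responses):
    """Walk a redirect chain and name the first problem found, if any.

    responses maps URL -> (status_code, location_or_None) and stands in
    for real HTTP fetches.
    """
    seen = set()
    url = start_url
    for _ in range(MAX_HOPS + 1):
        if url in seen:
            return "redirect loop"
        seen.add(url)
        status, location = responses[url]
        if status not in (301, 302):
            return "ok"  # the chain ended on a non-redirect response
        if not location:
            return "empty redirect"
        if len(location) > MAX_URL_LENGTH:
            return "redirect URL too long"
        url = location
    return "too many redirects"


print(classify_redirect("a", {"a": (301, "b"), "b": (302, "a")}))
```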
<h3>Soft 404s</h3>
<p>Gone missing from both the UI and UI-based CSV are the specifics of <a href="http://support.google.com/webmasters/bin/answer.py?hl=en&amp;answer=181708&amp;topic=1724951">soft 404s</a>.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwt9.png"><img class="alignnone size-large wp-image-115622" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Soft 404s" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwt9-600x380.png" alt="Soft 404s" width="600" height="380" /></a></p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwt10.png"><img class="alignnone size-large wp-image-115623" title="Soft 404s" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwt10-600x219.png" alt="Soft 404s" width="600" height="219" /></a></p>
<p>The API-based CSV still lists these details (as does the crawl errors feed), such as:</p>
<ul>
<li>404-like content (the page returns a 200 response code, but seems to contain content from an error page)</li>
<li>Redirect to an error page</li>
</ul>
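<p>If you want to spot-check your own pages for these two patterns, a crude heuristic might look like the Python sketch below. To be clear, this is not Google&#8217;s detection method; the phrase list and the error_page_urls parameter are hypothetical stand-ins you would tune for your own site.</p>

```python
ERROR_PHRASES = ("page not found", "no longer exists")  # illustrative only


def soft_404_type(status_code, body, redirect_target=None, error_page_urls=()):
    """Return a soft 404 label for a response, or None if it doesn't look like one."""
    if status_code in (404, 410):
        return None  # a real "not found" response, not a soft 404
    if status_code in (301, 302) and redirect_target in error_page_urls:
        return "redirect to an error page"
    if status_code == 200 and any(p in body.lower() for p in ERROR_PHRASES):
        return "404-like content"
    return None


print(soft_404_type(200, "<h1>Page Not Found</h1>"))
```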
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwt11.png"><img class="alignnone size-full wp-image-115625" title="Types of Soft 404s" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwt11.png" alt="Types of Soft 404s" width="394" height="237" /></a></p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/redirecterrorpage.png"><img class="alignnone size-full wp-image-115707" title="Redirect to Error Page" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/redirecterrorpage.png" alt="Redirect to Error Page" width="388" height="146" /></a></p>
<h3>Site-Wide (Server) Errors</h3>
<p>With the UI overhaul, Google shows the number of site-based (vs. URL-based) errors encountered over time, but not the specific URLs that triggered the errors, and has simplified the error messaging into three types: DNS, server connectivity, and robots.txt fetch. There is no corresponding download. (Google says when they encounter these types of errors, it typically means they&#8217;ll receive them for any URL on the site, since the problem is at a lower level.)</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwt12.png"><img class="alignnone size-large wp-image-115634" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="DNS Error" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/gwt12-600x99.png" alt="DNS Error" width="600" height="99" /></a></p>
<p>There&#8217;s some interesting data to be gleaned here that&#8217;s not apparent. For instance, if you hover over the dots, you see the total number of Googlebot fetch requests per day. (Unfortunately, you can only see this graph if Google encounters at least one site error during the reported time period and there&#8217;s no way to see this number other than hovering.) For one site I looked at, the number of URLs crawled averaged around 250,000 per day. Then one day, 2% of requests returned a DNS error (2,866 of 152,528 requests). The following day, Googlebot made only 128 requests (all crawled successfully) and only 71 the day after that. This doesn&#8217;t match up exactly with what the crawl stats report shows:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/dns3.png"><img class="alignnone size-large wp-image-115687" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Crawl Stats" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/dns3-600x152.png" alt="Crawl Stats" width="600" height="152" /></a></p>
<p>Another site went from an average of 100,000 a day to 362 a day after 1% of requests returned a DNS error.</p>
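<p>Since the hover text gives only raw counts, the percentage has to be worked out by hand. A quick Python check of the first site&#8217;s numbers (the function name is mine, purely for illustration):</p>

```python
def error_rate(errors, requests):
    """Return the share of requests that failed, as a percentage."""
    return 100.0 * errors / requests


rate = error_rate(2866, 152528)
print(f"{rate:.1f}% of requests returned a DNS error")
```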
<p>With the API-based CSV file and API-based crawl errors feed, you get:</p>
<ul>
<li>The URL that triggered the error</li>
<li>The specifics of the error</li>
</ul>
<h4>DNS Error</h4>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/dns-sample1.png"><img class="alignnone size-full wp-image-115720" title="Google DNS Errors" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/dns-sample1.png" alt="Google DNS Errors" width="397" height="43" /></a></p>
<h4>Server Connectivity</h4>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/serverconnectivity.png"><img class="alignnone size-full wp-image-115721" title="Server Connectivity" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/serverconnectivity.png" alt="Server Connectivity" width="395" height="83" /></a></p>
<h4>Robots.txt Fetch</h4>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/robotsunreachable1.png"><img class="alignnone size-full wp-image-115723" title="Robots Unreachable" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/robotsunreachable1.png" alt="Robots Unreachable" width="389" height="24" /></a></p>
<h3>Pages Blocked with Robots.txt</h3>
<p>This report is gone entirely from the UI but it&#8217;s still available (up to 100,000 URLs) from the API (both the CSV and the feed).
<a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/robots.png"><img class="alignnone size-large wp-image-115688" title="Pages Blocked By Robots.txt" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/robots-600x54.png" alt="Pages Blocked By Robots.txt" width="600" height="54" /></a></p>
<p>All in all, I&#8217;m happy to learn that most of this information is still available. It&#8217;s somewhat cumbersome and confusing that what&#8217;s available isn&#8217;t consistent across delivery mechanisms and figuring out how to access the API-based data isn&#8217;t all that straightforward. But for those power users (like me) who process this data programmatically, that Google is continuing to provide this information is great news.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/google-webmaster-tools-crawl-errors-how-to-get-detailed-data-from-the-api-115153/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Implementing Pagination Attributes Correctly For Google</title>
		<link>http://searchengineland.com/implementing-pagination-attributes-correctly-for-google-114970</link>
		<comments>http://searchengineland.com/implementing-pagination-attributes-correctly-for-google-114970#comments</comments>
		<pubDate>Tue, 13 Mar 2012 16:41:44 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Features: Analysis]]></category>
		<category><![CDATA[Google: SEO]]></category>
		<category><![CDATA[SEO: Duplicate Content]]></category>
		<category><![CDATA[SEO: General]]></category>
		<category><![CDATA[Top News]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=114970</guid>
		<description><![CDATA[Google&#8217;s latest blog post provides details and a video from Maile Ohye about how they handle the pagination attributes within a page&#8217;s source code. You can use these attributes to indicate pages in a series (such as a multi-page article or set of product listings), which enables Google to cluster the pages into a single entity [...]]]></description>
			<content:encoded><![CDATA[<p>Google&#8217;s latest blog post <a href="http://googlewebmastercentral.blogspot.com/2012/03/video-about-pagination-with-relnext-and.html">provides details and a video</a> from Maile Ohye about how they handle the pagination attributes within a page&#8217;s source code. You can use these attributes to indicate pages in a series (such as a multi-page article or set of product listings), which enables Google to cluster the pages into a single entity and combine their indexing and other properties (such as incoming link value). Using these attributes is trickier than it may seem at first glance, so below are a few tips from the blog post, video, and the recent <a href="http://searchmarketingexpo.com/west/2012/full_agenda2#611">SMX West session I moderated</a>, which featured Maile. (Keep in mind that currently, only Google supports these attributes.)</p>
<h2>How the Pagination Attributes Work</h2>
<p>The pagination attributes can be used for any set of content that spans multiple pages. Typical scenarios include multi-page articles, product listings, and forum discussions.  Simply use the rel=next and rel=prev attributes to link all pages in a series together. For the following set of pages:</p>
<ul>
<li>www.site.com/products?page=1</li>
<li>www.site.com/products?page=2</li>
<li>www.site.com/products?page=3</li>
</ul>
<p>The pagination attributes would be as follows:</p>
<p><strong>Page 1:</strong></p>
<p>&lt;link rel=&quot;next&quot; href=&quot;http://www.site.com/products?page=2&quot;&gt;</p>
<p><strong>Page 2:</strong></p>
<p>&lt;link rel=&quot;prev&quot; href=&quot;http://www.site.com/products?page=1&quot;&gt;</p>
<p>&lt;link rel=&quot;next&quot; href=&quot;http://www.site.com/products?page=3&quot;&gt;</p>
<p><strong>Page 3:</strong></p>
<p>&lt;link rel=&quot;prev&quot; href=&quot;http://www.site.com/products?page=2&quot;&gt;</p>
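<p>In a template, the pattern above could be generated rather than hand-written. The Python below is a minimal sketch, assuming a series numbered from 1 and a page query parameter as in the example; the function name pagination_links is hypothetical.</p>

```python
from html import escape


def pagination_links(base_url, page, last_page):
    """Return the <link> tags for one page of a series numbered 1..last_page."""
    tags = []
    if page > 1:  # the first page has no rel="prev"
        tags.append(f'<link rel="prev" href="{escape(base_url)}?page={page - 1}">')
    if page < last_page:  # the last page has no rel="next"
        tags.append(f'<link rel="next" href="{escape(base_url)}?page={page + 1}">')
    return tags


for tag in pagination_links("http://www.site.com/products", 2, 3):
    print(tag)
```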
<h3>When To Use Pagination Attributes Instead of Canonical Attributes</h3>
<p>Some sites are set up to use the <a href="http://searchengineland.com/canonical-tag-16537">canonical attributes</a> to point all pages in a series to page one. As Maile points out in the video, this isn&#8217;t the correct use of the canonicalization tag (in part because Google only indexes the content on the canonical page, so any content from the rest of the pages in the series would be ignored).</p>
<p>If the paginated content is a subset of the canonical page (such as when you have a view all version or a filtered result set) or is identical (such as when the sort order changes the display but not the content), then use the canonical attribute instead of the pagination attributes.</p>
<h2>General Best Practices</h2>
<h3>Use Absolute URLs</h3>
<p>The href values can be absolute or relative (the original version of this article said they had to be absolute, but that version was incorrect). But using absolute URLs is a best practice, both to combat scrapers and in case URLs are duplicated accidentally across directories or subdomains.</p>
<h3>The Chain Can&#8217;t Be Broken</h3>
<p>The rel=&#8221;next&#8221; and rel=&#8221;prev&#8221; values must match. If they don’t, the chain is broken. For instance, for the following pages:</p>
<ul>
<li>www.site.com/products?page=1</li>
<li>www.site.com/products?page=2</li>
</ul>
<p>The rel=&#8221;next&#8221; attribute for page=1 must point to page=2 and the rel=&#8221;prev&#8221; attribute for page=2 must point to page=1.</p>
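<p>One way to audit this on your own site is to collect each page&#8217;s prev and next values and check that every link is reciprocated. A sketch in Python, assuming you have already extracted the values into a dictionary (the data structure and function name are my own, not anything Google provides):</p>

```python
def broken_links(pages):
    """Find pagination links that are not reciprocated.

    pages maps URL -> {"prev": url_or_None, "next": url_or_None}.
    Returns (page, target) pairs where a rel="next" is not answered by a
    matching rel="prev" on the target, or the reverse.
    """
    problems = []
    for url, rels in pages.items():
        nxt = rels.get("next")
        if nxt and pages.get(nxt, {}).get("prev") != url:
            problems.append((url, nxt))
        prv = rels.get("prev")
        if prv and pages.get(prv, {}).get("next") != url:
            problems.append((url, prv))
    return problems


series = {
    "/products?page=1": {"next": "/products?page=2"},
    "/products?page=2": {"prev": "/products?page=1"},
}
print(broken_links(series))
```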
<p>The pagination attributes can only link together URLs with matching parameters. For instance, the following URLs aren’t considered part of the same series, as the second URL would break the chain:</p>
<ul>
<li>www.site.com/products?page=1</li>
<li>www.site.com/products?page=2&amp;referrer=twitter</li>
<li>www.site.com/products?page=3</li>
</ul>
<p>This means that ideally, you should dynamically insert the pagination values based on the fetched URL. In the case of the above example, when Googlebot fetches the page as:</p>
<p>www.site.com/products?page=2&amp;referrer=twitter</p>
<p>The pagination values should be (dynamically inserted as):</p>
<p>&lt;link rel=&quot;prev&quot; href=&quot;http://www.site.com/products?page=1&amp;referrer=twitter&quot;&gt;</p>
<p>&lt;link rel=&quot;next&quot; href=&quot;http://www.site.com/products?page=3&amp;referrer=twitter&quot;&gt;</p>
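<p>The dynamic insertion could be done by rewriting only the page number in whatever URL was fetched. A minimal Python sketch, assuming the page number lives in a query parameter named page (the function name neighbor_url is hypothetical):</p>

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def neighbor_url(fetched_url, offset, page_param="page"):
    """Return fetched_url with its page number shifted by offset.

    All other query parameters (such as referrer) are kept as fetched.
    """
    parts = urlsplit(fetched_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    shifted = [
        (key, str(int(value) + offset)) if key == page_param else (key, value)
        for key, value in query
    ]
    return urlunsplit(parts._replace(query=urlencode(shifted)))


fetched = "http://www.site.com/products?page=2&referrer=twitter"
print("prev:", neighbor_url(fetched, -1))
print("next:", neighbor_url(fetched, +1))
```

<p>A real implementation would also need to omit prev on the first page and next on the last.</p>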
<h3>Each Page Can Be in Only One Pagination Chain</h3>
<ul>
<li>A page can’t contain multiple rel=&#8221;next&#8221; attributes.</li>
<li>Multiple pages can’t have the same rel=&#8221;prev&#8221;.</li>
<li>A page that contains a rel=&#8221;canonical&#8221; attribute to another page can’t be part of the canonical URL&#8217;s paginated series. It must be paginated with the URLs that match it (and then Google will use the canonical attributes to consolidate each page accordingly). See more on how this works at the end of this article.</li>
</ul>
<h2>Advanced Techniques</h2>
<p>Product listings in particular often have additional complexity, such as sort orders and filtered navigation. It’s best to start with the simplest paginated series and then make canonicalization and pagination decisions for each level of complexity. You can&#8217;t specify one view/filter as canonical and point all other versions of the paginated series to that default set.</p>
<h3>Viewing and Sorting Options</h3>
<p>If a set of product listings has multiple view options and those listings span multiple pages, you have to create a pagination set for each view option separately (since the pages aren&#8217;t subsets of a default version). For instance, if you provide options to view 20 products at a time or 100 products at a time and to sort by newest, price, and ratings, you would need to implement a separate paginated series for each view option. As you might imagine, this could cause you to end up with a large number of paginated series. Shown below are just two of the many possible:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/pagination-sel.png"><img class="alignnone size-large wp-image-114980" title="Pagination and Sorting" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/pagination-sel-600x426.png" alt="Pagination and Sorting" width="600" height="426" /></a></p>
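<p>To see how quickly the number of series grows, here is a small Python sketch that enumerates the first page of each series, assuming every combination of view size and sort order forms its own series. The view and sort parameter names are placeholders, not a required URL scheme.</p>

```python
from itertools import product

VIEW_SIZES = (20, 100)
SORT_ORDERS = ("newest", "price", "ratings")


def series_roots(base_url):
    """Return the first-page URL of each separately paginated series."""
    return [
        f"{base_url}?view={view}&sort={sort}&page=1"
        for view, sort in product(VIEW_SIZES, SORT_ORDERS)
    ]


roots = series_roots("http://www.site.com/products")
print(len(roots), "separate series")
for url in roots:
    print(url)
```

<p>With just two view sizes and three sort orders, that is already six series to maintain.</p>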
<h3>Filtering (AKA Faceted Navigation)</h3>
<p>Things get even more complex when you introduce filters.</p>
<ul>
<li>If the filtered view is a subset of a single non-filtered page (perhaps the view=100 option), you can use the canonical attribute to point the filtered page to the non-filtered one. However, if the filtered view results in paginated content, this may not be viable (as each page may not be a subset of what you would like to point to as canonical).</li>
<li>If you want the filtered view to rank separately from the default view, you would create a paginated series for each filtered category. You would also need to paginate all of the various sort and view options separately. Take REI.com as an example. The site has a <a href="http://www.rei.com/category/4500304">Snowboards section</a> that would likely be paginated and (hopefully) rank for [snowboards] queries. But a filtered view of just <a href="http://www.rei.com/search?cat=4500304&amp;jxGender=Women%27s&amp;hist=cat%2C4500304%3ASnowboards%5EjxGender%2CWomen%27s">Women&#8217;s Snowboards</a>, while a subset of the snowboards content, would likely be paginated separately in order to rank for [women's snowboards] queries. (On the other hand, REI probably doesn&#8217;t want the filter-by-size version to rank separately, so that variation could be canonicalized.)</li>
</ul>
<h2>Using the Canonical Attribute in Conjunction with the Pagination Attributes</h2>
<p>Each URL that&#8217;s paginated separately should also contain a canonical attribute. In the case of the earlier example with the optional referrer parameter, for instance, Google will first consolidate the default paginated series separately, and then the paginated series that contained the referrer=twitter parameter separately, but then use the canonical attribute of the pages to further consolidate the pages to the default version. That means that the URL of:</p>
<p>www.site.com/products?page=2&amp;referrer=twitter</p>
<p>Would end up with the following markup:</p>
<p>&lt;link rel=&quot;canonical&quot; href=&quot;http://www.site.com/products?page=2&quot;&gt;</p>
<p>&lt;link rel=&quot;prev&quot; href=&quot;http://www.site.com/products?page=1&amp;referrer=twitter&quot;&gt;</p>
<p>&lt;link rel=&quot;next&quot; href=&quot;http://www.site.com/products?page=3&amp;referrer=twitter&quot;&gt;</p>
<h2>Too Confusing To Implement?</h2>
<p>I admit, all of this sounds confusing. But it&#8217;s not so bad if you take things step by step:</p>
<ol>
<li>What is the canonical version of each URL? Add the canonical attribute to the pages.</li>
<li>What is the default view for each paginated series? Add the pagination attributes to these pages.</li>
<li>What views/filters are subsets of broader views? Add the canonical attribute to these pages to point to those broader views.</li>
<li>What views/filters are not subsets of broader views, or are ones you want to rank separately? Add separate pagination attributes to these pages to make each a separate series.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/implementing-pagination-attributes-correctly-for-google-114970/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Google Webmaster Tools Revamps Crawl Errors, But Is It For The Better?</title>
		<link>http://searchengineland.com/google-webmaster-tools-revamps-crawl-errors-but-is-it-for-the-better-114892</link>
		<comments>http://searchengineland.com/google-webmaster-tools-revamps-crawl-errors-but-is-it-for-the-better-114892#comments</comments>
		<pubDate>Tue, 13 Mar 2012 02:04:01 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Features: Analysis]]></category>
		<category><![CDATA[Google: Webmaster Central]]></category>
		<category><![CDATA[Top News]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=114892</guid>
		<description><![CDATA[Google has just revamped the crawl errors data available in webmaster tools. Crawl errors are issues Googlebot encountered while crawling your site, so useful stuff! I originally started this article by writing that in most cases, these changes are for the better and in only a few (really maddening) cases, useful functionality has been removed. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://googlewebmastercentral.blogspot.com/2012/03/crawl-errors-next-generation.html">Google has just revamped the crawl errors data</a> available in webmaster tools. Crawl errors are issues Googlebot encountered while crawling your site, so useful stuff! I originally started this article by writing that in most cases, these changes are for the better and in only a few (really maddening) cases, useful functionality has been removed. But now that I&#8217;ve gone through the changes, I unfortunately need to revise my summary. This update is mostly about removing super useful data, masked by a few user interface changes. (And I hate to write that, because webmaster tools is near and dear to my heart.)</p>
<p><em><strong>Update 3/17/12:</strong> After talking with Google, I&#8217;ve learned that most of the detail I was disappointed to find removed, and that I feel is useful for power users, is in fact still available through the API! I&#8217;ve <a href="http://searchengineland.com/google-webmaster-tools-crawl-errors-how-to-get-detailed-data-from-the-api-115153">dug into the details and have written up my findings</a>. I&#8217;ve also updated this story with additional details from Google:</em></p>
<ul>
<li><em>Access denied errors include 401, 403, and 407. That some of these were showing up as &#8220;other&#8221; was a bug that has since been fixed.</em></li>
<li><em>Not followed errors are indeed URLs that returned either a 301 or 302 but whose redirect Googlebot had trouble crawling.</em></li>
</ul>
<p>So what&#8217;s changed?</p>
<h2>Site vs. URL Errors</h2>
<p>Crawl errors have been organized into two categories: site errors and URL errors. Site errors are those which are likely site-wide, as opposed to URL-specific. <a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/Site-Errors-Bad.png"><img class="alignnone size-large wp-image-114903" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google site errors" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/Site-Errors-Bad-600x229.png" alt="Google site errors" width="600" height="229" /></a></p>
<p>Site errors are categorized as:</p>
<ul>
<li><strong>DNS</strong> &#8211; These errors include things like DNS lookup timeout, domain name not found, and DNS error. (Although these specifics are no longer listed, as described more below.)</li>
<li><strong>Server Connectivity</strong> &#8211; These errors include things like network unreachable, no response, connection refused, and connection reset. (These specifics are also no longer listed.)</li>
<li><strong>Robots.txt Fetch</strong> &#8211; These errors are specific to the robots.txt file. If Googlebot receives a server error when trying to access this file, Google has no way of knowing if a robots.txt file exists, and if so, what pages it blocks, so it stops the crawl until it no longer gets an error when attempting to fetch the file.</li>
</ul>
<div>URL errors are page-specific.</div>
<div><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/url-errors.png"><img class="alignnone size-large wp-image-114907" title="Google page-level errors" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/url-errors-600x113.png" alt="Google page-level errors" width="600" height="113" /></a></div>
<div>URL errors are categorized as:</div>
<div>
<ul>
<li><strong>Server error</strong> &#8211; These are 5xx errors (such as 503 for server maintenance)</li>
<li><strong>Soft 404</strong> &#8211; These are URLs that are detected as returning an error page but don&#8217;t return a 404 response code (they typically have a response code of 200 or 301/302). Error pages that don&#8217;t return a 404 can hurt crawl efficiency as Googlebot can end up crawling these pages instead of valid pages you want indexed. In addition, these pages can end up in search results, which is not an ideal searcher experience.</li>
<li><strong>Access denied</strong> &#8211; These are URLs that returned a 401, 403, or 407 response code. Often this simply means that the URLs prompt for a login, which is likely not an error. You may, however, want to block these URLs from crawling to improve crawl efficiency.</li>
<li><strong>Not found</strong> &#8211; Typically, these are URLs that return a 404 or 410.</li>
<li><strong>Not followed</strong> &#8211; (updated) These are URLs that triggered redirects that Googlebot had trouble crawling (for instance, because of a redirect loop). The UI lists whether the URL initially returned a 301 or 302, but doesn&#8217;t provide the details of the redirect error.</li>
<li><strong>Other</strong> &#8211; This is a catch-all that includes all other errors.</li>
</ul>
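<p>As a rough illustration of how these categories map to response codes, here is a small Python sketch. The function name, the two flags, and the order of the checks are assumptions made for the example. The status code alone can't reveal a soft 404 or a failed redirect, so the caller passes those in. This is not how Google's report is actually built.</p>

```python
ACCESS_DENIED = {401, 403, 407}
NOT_FOUND = {404, 410}


def classify_url_error(status, looks_like_error_page=False, redirect_failed=False):
    """Map a response to one of the URL error categories, or None if healthy."""
    if status in range(500, 600):
        return "Server error"
    if status in ACCESS_DENIED:
        return "Access denied"
    if status in NOT_FOUND:
        return "Not found"
    if status in (301, 302) and redirect_failed:
        # The redirect itself could not be crawled (a loop, for instance).
        return "Not followed"
    if looks_like_error_page and status in (200, 301, 302):
        # An error page that does not return a 404 response code.
        return "Soft 404"
    if status in range(200, 400):
        return None
    return "Other"
```

<p>For example, <code>classify_url_error(200, looks_like_error_page=True)</code> returns "Soft 404", while a plain <code>classify_url_error(200)</code> returns <code>None</code>.</p>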
<h2>Trends Over Time</h2>
<p>Google now shows trends over the last 90 days for each error type. The daily count seems to be the aggregate count of how many URLs with that error type Google knows about, not the number crawled that particular day. As Google recrawls a URL and no longer gets the error, it&#8217;s removed from the list (and the count). In addition, Google still lists the date Googlebot first encountered the error, but now when you click the URL to see the details, you can see the last time Googlebot tried to access the URL as well.</p>
<h2>Priorities and Fixed Status</h2>
<p>Google says they are now listing URLs in priority order, based on a &#8220;multitude&#8221; of factors, including whether or not you can fix the problem, if the URL is listed in your Sitemap, if it gets a lot of traffic, and how many links it has. You can mark a URL as fixed and remove it from the list. However, once Google recrawls that page, if the error still exists, it will return to the list.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/03/fixed.png"><img class="alignnone size-full wp-image-114920" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="fixed" src="http://searchengineland.com/figz/wp-content/seloads/2012/03/fixed.png" alt="" width="485" height="163" /></a></p>
<p>Google suggests using the Fetch as Googlebot feature to test your fix (and in fact now has a button right on the details page to do so), but since you are allowed only 500 fetches per account (not per site) each week (which I believe has increased from the previous limit), you should use this functionality judiciously.</p>
<h2>What&#8217;s Gone Missing?</h2>
<p>Unfortunately, several pieces of important functionality have been lost with this change.</p>
<ul>
<li><strong>Ability to download all crawl error sources.</strong> Previously, you could download a CSV file that listed URLs that returned an error along with the pages that linked to those URLs. You could then sort that CSV by linking source to find broken links within your site and had an easy list of sites to contact to fix links to important pages of your site. Now, the only way to access this information is to click on an individual URL to view its details, then click the Linked From tab. There seems to be no way to download this data, even at the individual URL level. <em>(<strong>Update 3/17/12:</strong> This detail is still available from the API-based crawl errors feed.)</em></li>
<li><strong>100K URLs of each type.</strong> Previously, you could download up to 100,000 URLs with each type of error. Now, both the display and download are limited to 1,000. Google says &#8220;less is more&#8221; and &#8220;there was no realistic way to view all 100,000 errors—no way to sort, search, or mark your progress.&#8221; Google is wrong. There were absolutely realistic ways to view, sort, search, and mark your progress. The CSV download made all of this easy using Excel. And more data is always better to see patterns, especially for large scale sites with multiple servers, content management systems, and page templates. A lot has been lost here.  <em>(<strong>Update 3/17/12:</strong> 100k URLs for each error is still available from the API-based crawl errors feed and API-based CSV download.)</em></li>
<li><strong>Redirect errors</strong> &#8211; Inexplicably, the &#8220;not followed&#8221; errors no longer seem to list errors like redirect loop and too many redirects. Instead, the report simply lists the response code returned (301 or 302). This seems weird to me (not to mention far less useful) as 301s are followed just fine and typically aren&#8217;t an error at all (and 302s are only sometimes problematic), but all the redirect errors that used to be listed are critical to know about and fix. Listing URLs that return a 301 status code as &#8220;not followed&#8221; is misleading and alarming for no reason. And if this list of URLs is actually those with redirect errors, then omitting what that error is (such as too many redirects) makes this data nearly useless. <em>(<strong>Update 3/17/12:</strong> Confirmed with Google that this is a list of URLs that return either a 301 or 302 that Googlebot is subsequently unable to crawl. The specific issue is still available from the API-based crawl errors feed and API-based CSV download.)</em></li>
<li><strong>Specifics about soft 404s.</strong> The soft 404 report used to specify whether the URLs listed returned a 200 status code or redirected to an error page. But the status code column appears to be empty now.  <em>(<strong>Update 3/17/12:</strong> This detail is still available from the API-based crawl errors feed and API-based CSV download.)</em></li>
<li><strong>URLs blocked by robots.txt.</strong> Google says they removed this report because &#8220;while these can sometimes be useful for diagnosing a problem with your robots.txt file, they are frequently pages you <em>intentionally</em> blocked&#8221;. They say that similar information will soon be available in the crawler access section of webmaster tools. Why remove data you&#8217;re planning to replace before replacing it? Couldn&#8217;t they have just moved this report to the crawler access section? I get the feeling that they won&#8217;t be replacing this report as is, but providing less granular data in its place. While it&#8217;s true that this report didn&#8217;t list errors necessarily, it was very useful. You could skim the CSV to see if any sections of pages you expected to be indexed were blocked. And it was critical for diagnosis. Why aren&#8217;t certain pages indexed? You could check this report before spending extensive time debugging the issue. But now you can&#8217;t do either of those things. <em>(<strong>Update 3/17/12:</strong> This report is still available from the API-based crawl errors feed and API-based CSV download.)</em></li>
<li><strong>Specifics about site level errors. </strong>The previous version of these reports listed the specific problem (such as DNS lookup timeout or domain name not found). That was very helpful in digging into what was going on. Now, you only get the count for the general category, not the specifics of what kind of error it was within that category. <em>(<strong>Update 3/17/12:</strong> This detail is still available from the API-based crawl errors feed and API-based CSV download.)</em></li>
<li><strong>Specific URLs with &#8220;site&#8221; level errors.</strong> Google says you don&#8217;t need to know the URL if the issue was at the site level. Mostly, this is likely true. But I&#8217;ve definitely encountered cases, particularly with DNS errors, where the error only happened with specific URLs, not the entire site. Knowing the URL that triggered the error would help track down issues in these cases. <em>(<strong>Update 3/17/12:</strong> This detail is still available from the API-based crawl errors feed and API-based CSV download.)</em></li>
</ul>
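<p>The sort-by-linking-source workflow described in the list above is easy to reproduce once you have the data as a CSV. Here is a minimal Python sketch. The column names <code>URL</code> and <code>Linked From</code> are hypothetical stand-ins, so adjust them to whatever headers your export actually uses. It groups the broken URLs by the page that links to them and puts the sources with the most broken links first.</p>

```python
import csv
import io
from collections import defaultdict


def broken_links_by_source(csv_text):
    """Group error URLs by linking page, busiest linking pages first."""
    by_source = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # "Linked From" and "URL" are assumed column names, not a documented format.
        source = row["Linked From"]
        by_source[source].append(row["URL"])
    # Sort so the sources with the most broken links come first.
    ordered = sorted(by_source.items(), key=lambda item: len(item[1]), reverse=True)
    return dict(ordered)
```

<p>Sources on your own domain point to internal links you can fix yourself. The rest is your list of sites to contact.</p>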
<p><em>(<strong>Update 3/17/12:</strong> I got lots of additional detail from Google and, as noted above, am happy to report that I was at least partially wrong &#8212; most of this data is still available through the API. Power users who want this level of detail are likely to prefer the API anyway, so my disappointment has lessened.)</em></p>
<p>As for my comment in the earlier version of this story where I said that &#8220;I get the sense that many of these recent changes are designed to make the data easier for small site owners to use, and don&#8217;t really have the large enterprise-level site (or agency) in mind. For these latter organizations, more data is better, as we have systems to parse and crunch the data&#8221;, Google has told me:</p>
<blockquote>&#8220;Our strategy for Webmaster Tools is to improve the web interface and provide important, actionable, and useful information. Our changes are designed to improve the experience for all of our users, including power users. For example, we made changes to have crawl errors going back 90 days and to show the full aggregate count of URL errors instead of just the previous 100,000 cap. Power users can still access the firehose of data through our original GData API. One of the improvements we made is to now display the full count of URL errors, and that should help give more accurate data to larger sites. For example, previously if one site has over 35 million Not Found errors, that number would have been capped and shown as 100,000 errors. Now, that site can see the new number and even see where the increase happened in the historical data. We think that&#8217;s a big improvement.&#8221;</blockquote>
<p>The point about the total number of errors shown is certainly a good one. Very large sites are likely to have more than 100k errors, and knowing the significance of the problem is helpful in prioritizing.</p>
<p>Of course, in part, I&#8217;m sad to see features that I worked hard on launching when I was product manager for webmaster central be dismantled and made less useful. But mostly, as a frequent user of the product, I don&#8217;t want to lose useful functionality. <em><strong>Update 3/17/12:</strong> As noted above, I&#8217;m happy that a lot of this functionality is still available through the API. Read on to my <a href="http://searchengineland.com/google-webmaster-tools-crawl-errors-how-to-get-detailed-data-from-the-api-115153">dive into how to access these details through the API</a>.</em></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/google-webmaster-tools-revamps-crawl-errors-but-is-it-for-the-better-114892/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Is SEO Killing America?</title>
		<link>http://searchengineland.com/is-seo-killing-america-112237</link>
		<comments>http://searchengineland.com/is-seo-killing-america-112237#comments</comments>
		<pubDate>Thu, 23 Feb 2012 21:55:06 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Features: Analysis]]></category>
		<category><![CDATA[Search & Society: General]]></category>
		<category><![CDATA[Top News]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=112237</guid>
		<description><![CDATA[Last week at the Tools of Change conference, Clay Johnson, author of the new book The Information Diet, gave a keynote talk titled &#8220;Is SEO Killing America&#8221;. Sigh. If you&#8217;ve been involved in search for any length of time, your first reaction may be, this again? Haven&#8217;t we done this before? Once or twice? Clay&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>Last week at the Tools of Change conference, Clay Johnson, author of the new book <a href="http://www.amazon.com/gp/product/1449304680">The Information Diet</a>, gave a keynote talk titled &#8220;<a href="http://www.informationdiet.com/blog/read/the-information-diet-stump-speech">Is SEO Killing America</a>&#8221;. Sigh. If you&#8217;ve been involved in search for any length of time, your first reaction may be, <a href="http://www.ninebyblue.com/seo-is-the-worst-thing-ever-invented/">this again</a>? Haven&#8217;t we <a href="http://searchengineland.com/the-promise-reality-of-mixing-the-social-graph-with-search-engines-12032">done this before</a>? <a href="http://searchengineland.com/thoughts-on-web-developers-seo-reputation-problems-28047">Once</a> or <a href="http://searchengineland.com/dilbert-hiring-a-weasel-to-do-seo-corrupt-the-industry-112056">twice</a>?</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/toc1.png"><img class="alignnone size-full wp-image-112784" title="Clay Johnson" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/toc1.png" alt="Clay Johnson" width="563" height="283" /></a></p>
<p>Clay&#8217;s a friend of mine and I&#8217;ve read his book (it&#8217;s quite good, by the way), so I knew both that he doesn&#8217;t really think that SEO is killing America and that he&#8217;s unaware just how much we&#8217;re all over this particular linkbait-y title.</p>
<p>And indeed his talk was not about how SEO is killing America. Instead, it was about two things:</p>
<ul>
<li>As a culture, we want to be entertained and told that we are right. It&#8217;s much easier for news organizations to sell news that reaffirms our opinions than news that educates and challenges us.</li>
<li>News organizations need page views, so policies such as the &#8220;<a href="http://www.businessinsider.com/the-aol-way">AOL Way</a>&#8221; may sacrifice investigative journalism at the altar of popular search queries.</li>
</ul>
<p>Clay&#8217;s talk (and his book) are mostly about the former, but my interest is in the latter. In his talk, Clay noted that we broadcast what we want by way of our searches and clicks. In turn, others see the content we&#8217;ve made popular in &#8220;most read&#8221; modules on news sites and content creators write more articles on popular topics based on search volume. The danger is that we don&#8217;t always seek out stuff that&#8217;s good for us and the more we look for what&#8217;s more fun to consume, the more that&#8217;s all that&#8217;s available.</p>
<p>He cited the &#8220;AOL Way&#8221; and the practice of using search data to determine traffic potential of topics and to decide what to write more about as an example of how the media&#8217;s focus on SEO may be an obstacle to the best possible news coverage.</p>
<p>For me, this argument is another variation of the &#8220;<a href="http://searchengineland.com/google-says-seo-is-not-spam-98266">SEO is spam</a>&#8221; argument. Spam is spam and lumping it in with solid search engine optimization processes doesn&#8217;t make it SEO. Creating content simply based on popular search terms isn&#8217;t SEO either. In my book <em><a href="http://www.amazon.com/gp/product/0470537191?ie=UTF8&amp;tag=nibybl-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0470537191">Marketing in the Age of Google</a></em>, I addressed this issue at length and wrote about how tactics of spammers were mislabeled as tactics of SEO, but that it may be too late to reclaim the term. There, I wrote:</p>
<p><em>Integrating a search acquisition strategy into a more comprehensive business strategy includes:</em></p>
<ul>
<li><em>Using search data to build a comprehensive and effective product and content strategy.</em></li>
<li><em>Understanding searcher behavior and building searcher personas that maximize customer satisfaction and conversion.</em></li>
<li><em>Realizing the customer acquisition funnel often begins with the search box, not your web site.</em></li>
<li><em>Integrating organic search with other marketing efforts.</em></li>
<li><em>Ensuring technical architecture of the site can be properly crawled and indexed by search engines so that it can be visible to searchers.</em></li>
</ul>
<p>I have explored the search data issue in depth as it relates to journalism during my <a href="http://press.org/events/vanessa-fox-search-engine-optimization">National Press Club workshops</a>. At least three components are involved:</p>
<ul>
<li>Search data is valuable for learning what your audience is interested in to help ensure you meet their needs.</li>
<li>It&#8217;s important for content creators (including journalists) to understand how to connect with searchers in order to gain maximum visibility.</li>
<li>Investigative journalism is vital, and search may not be the best initial channel for reaching readers.</li>
</ul>
<h2>Using Search Data</h2>
<p>As with nearly everything else in life, you can use search data for good or for evil. Take the Super Bowl start time, for instance. In 2011, the Huffington Post famously spammed the hell out of Google by creating an article that <a href="http://searchengineland.com/what-time-does-the-super-bowl-start-a-continuing-lesson-in-search-visibility-63633">basically just repeated every variation of related search query</a>. Not only did this article contain little useful information, but one wonders if Super Bowl viewers are really a key target audience for a supposed news site or if the point was more about page views than keeping the public informed on the issues of the day.</p>
<p>But with the latest Super Bowl in 2012, the <a href="http://searchengineland.com/when-is-the-super-bowl-start-time-the-nfl-finally-gets-it-right-110176">NFL created a page</a> specifically for those seeking out information about the game schedule. Although they were using the same search data, not only was the page useful, but it addressed the NFL&#8217;s target audience. The point was obviously not simply about page views but to engage with viewers and get them to interact with additional content on the site. I talked to John Cole, who recently joined NFL.com to head up search and social media and is responsible for this new tactic at NFL.com. He told me that user testing found that their target users found the information they were looking for regarding the game schedule much more quickly with the details they added to the pages. To me, this is a perfect use of search data: find out what your audience is looking for and answer their questions (making them happy and keeping them engaged with your brand).</p>
<p>The attempt to simply maximize page views by creating pages about popular topics is not caused by the availability of search data. This type of reporting has existed since the beginning of time and the online medium simply provides new opportunities for creativity. For instance, when reading an article a few days ago, I came across the following set of headlines:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/mia.png"><img class="alignnone size-full wp-image-112719" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="MIA Super Bowl" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/mia.png" alt="MIA Super Bowl" width="516" height="262" /></a></p>
<p>Why indeed did M.I.A flip the bird during the Super Bowl? When I clicked through to the page, I first encountered this:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/mia2.png"><img class="alignnone size-full wp-image-112720" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="MIA Super Bowl" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/mia2.png" alt="MIA Super Bowl" width="520" height="144" /></a></p>
<p>Entertainment Weekly certainly is taking a page from HuffPo&#8217;s playbook by filing this story under as many keywords as possible. But what about the story itself? Do we find out why she did it?</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/mia3.png"><img class="alignnone size-full wp-image-112721" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="MIA Super Bowl" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/mia3.png" alt="MIA Super Bowl" width="513" height="96" /></a></p>
<p>Not exactly.</p>
<h2>Being Visible To Your Target Audience</h2>
<p>In the olden days of yore, the printed newspaper arrived at one&#8217;s door, and one flipped through the pages and skimmed through the headlines while drinking one&#8217;s morning coffee. Wearing a corset (or top hat depending on one&#8217;s fashion leanings). But things have changed. Now, when we want news, we either go to an online source such as Google News or we search for exactly what we want to know. You can see this by <a href="http://www.google.com/insights/search">checking search volume</a> for just about any news item. See, for instance, search volume for [healthcare reform] queries:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/healthcare.png"><img class="alignnone size-large wp-image-112725" title="Google Insights for Search" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/healthcare-600x147.png" alt="Google Insights for Search" width="600" height="147" /></a></p>
<p>Not only should journalists use search data to make sure they&#8217;re answering all of the questions their readers have about a particular topic, but they should make sure they&#8217;re using the language of their readers so that when those readers seek out content, the news stories appear. (You can see how a simple spelling change can make all the difference in the world with <a href="http://searchengineland.com/kadafi-gaddafi-qaddafi-in-the-age-of-search-69170">different spelling guidelines for &#8220;Gaddafi&#8221;</a>.) It&#8217;s not spamming or killing America to make sure that your headline contains descriptive words that match how readers are searching for stories.</p>
<p>This not only helps news stories appear for the right searches but also improves click-through on those headlines on news sites and aggregators.</p>
<h2>What About Investigative Journalism?</h2>
<p>Investigative journalism is trickier. No one is searching for information about the topic at hand until the story breaks, so how do you get the news out there in the first place? Certainly, this type of journalism is tougher to disseminate. It was easier in yonder days of yore with the printed paper and the doorstep and the corsets and the like. It can seem like a lot less trouble to just write stories that you already know people are searching for information about. I asked Clay how he recommended journalists go about getting readers for stories no one was searching for and he told me:</p>
<blockquote>&#8220;I think the best asset an investigative journalist can have is a strong social network. But let&#8217;s not also forget that journalists usually come with a distribution point baked in. People still do read the paper. People go to nytimes.com.&#8221;</blockquote>
<p>And he pointed out that these stories can drive search interest. What the media chooses to cover and the words they use to describe events have direct impact on what people search for and how they search. Once a breaking story hits, people do in fact begin searching for more information about it.</p>
<p>So perhaps, in fact, SEO isn&#8217;t killing America but can instead keep America informed.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/is-seo-killing-america-112237/feed</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Did Super Bowl Advertisers Take Advantage of Search Interest?</title>
		<link>http://searchengineland.com/did-super-bowl-advertisers-take-advantage-of-search-interest-110444</link>
		<comments>http://searchengineland.com/did-super-bowl-advertisers-take-advantage-of-search-interest-110444#comments</comments>
		<pubDate>Tue, 07 Feb 2012 20:07:58 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Features: Analysis]]></category>
		<category><![CDATA[Search & Society: General]]></category>
		<category><![CDATA[Search Ads: General]]></category>
		<category><![CDATA[SEO - Search Engine Optimization]]></category>
		<category><![CDATA[Top News]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=110444</guid>
		<description><![CDATA[Over the past couple of days, numerous stats and figures have been published about how Super Bowl advertisers took advantage (or not) of social media this year. But commercials also drive people to search engines, which in turn (when things go right) can lead potential customers to advertiser web sites where rather than talk about [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-110705" style="margin-left: 10px; margin-bottom: 10px;" title="seen-on-tv" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/seen-on-tv.jpg" alt="" width="200" height="170" />Over the past couple of days, numerous stats and figures have been published about <a href="http://marketingland.com/the-social-bowl-grading-super-bowl-xlvi-ads-by-social-comments-engagement-5451">how Super Bowl advertisers took advantage (or not) of social media this year</a>. But commercials also drive people to search engines, which in turn (when things go right) can lead potential customers to advertiser web sites where, rather than talk about a brand as they can on social media sites, they can watch the commercials again, cementing brand messaging, and take a closer look at the products being sold. (Which is presumably why a company would spend $3.5 million on a thirty-second spot in the first place.)</p>
<h2>Commercials Drive Searches</h2>
<p>Since the 2009 Super Bowl, I&#8217;ve monitored how the ads influence search interest, and every year, the trend has been the same. As people watch the Super Bowl, they search for everything they&#8217;re watching: teams, players, performers, and of course, commercials. The trend continues the day after the game as people talk about the commercials and turn to Google (and Bing) to watch them again. Take a look at the spiking searches for February 6th, the day after the game, according to Google Trends:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/trendsfrom6th.png"><img class="alignnone size-large wp-image-110454" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google Super Bowl Trends - Monday" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/trendsfrom6th-600x135.png" alt="Google Super Bowl Trends - Monday" width="600" height="135" /></a></p>
<p>Nearly every search is Super Bowl related, and searchers are clearly seeking out the ads. As you can see from search #8, commercials often cause people to search for the brands directly. Google Insights for Search shows that brands that advertised saw significant search spikes on Sunday. See for instance, the search volume for [bud light platinum].</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/bud-light-platinum-insights.png"><img class="alignnone size-large wp-image-110456" title="bud-light-platinum-insights" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/bud-light-platinum-insights-600x419.png" alt="Bud Light Platinum Google Insights" width="600" height="419" /></a></p>
<p>They seemed to have really liked those ads in Iowa.</p>
<p><a href="http://googleblog.blogspot.com/2012/02/super-bowl-xlvi-mobile-manning-and.html">Google reported</a> that searches for [super bowl ads] were 122 times higher this week and that the big search winners were Acura, GoDaddy, and M&amp;Ms.</p>
<h2>Where Are Advertisers Sending Potential Customers?</h2>
<p>As I do every year, I took note of what advertisers included in the commercial. Did they include a web site URL? A Facebook page? Did they seem to even be aware of this crazy new thing called the internet? And then I looked at the advertisers&#8217; search visibility. I was looking for the following flow:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/search-flow.png"><img class="alignnone size-large wp-image-110458" title="Commercial to Search Flow" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/search-flow-600x93.png" alt="" width="600" height="93" /></a></p>
<p>Last year, many only paid attention to a flow like this:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/facebookflow1.png"><img class="alignnone size-medium wp-image-110544" title="Facebook Flow" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/facebookflow1-300x85.png" alt="Facebook Flow" width="300" height="85" /></a></p>
<p>I understand that Super Bowl commercials are about branding, not necessarily instant purchases, and I realize other positive outcomes exist (discussions on social media and the like). I&#8217;m just saying that if someone is searching for you, you may as well show up. And if you&#8217;ve gotten potential customers to view your commercial, you may as well make it easy for them to view more information about your products.</p>
<p>This year, many advertisers simply included their domain name in the ad (33 of the 53 advertisers I tracked did this). This approach can help cut out the search step, although as the response to the <a href="http://searchengineland.com/scoring-super-bowl-2010-advertising-hows-the-search-visibility-35588">Dockers ad during the 2010 Super Bowl showed</a>, advertising a URL causes people to, well, search for the URL. So you can&#8217;t always cut out the search step, no matter how hard you try.</p>
<p>Last year&#8217;s Super Bowl ads were <a href="http://searchengineland.com/scoring-the-2011-super-bowl-commercials-for-search-visibility-and-visitor-engagement-63672">all about Facebook fan pages</a> (that often were impossible to find; don&#8217;t say &#8220;find us on Facebook&#8221; unless that&#8217;s an achievable task). This year, only four of the ads included a nod to Facebook and all used actual URLs. Pepsi Max even went with an easy to remember redirect to Facebook: pepsimax.com/facebook.</p>
<p>Four commercials advertised Twitter hashtags (last year was the first year for this, and then it was mostly only for movie trailers). I was astonished to find that when a hashtag was included in a commercial, people instantly started using it to tweet about the commercial and the hashtag began trending. (As you can see, even the bands with songs in the commercials started trending.)</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/twitter-hashtag-trend.png"><img class="alignnone size-full wp-image-110473" title="Twitter Hashtag Trend" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/twitter-hashtag-trend.png" alt="Twitter Hashtag Trend" width="328" height="325" /></a></p>
<p>Of course, there&#8217;s a risk in this strategy. Things may go really well, as Audi found with #SoLongVampires, or very awry as Bud Light found with #MAKEITPLATINUM. (Did people really even use the same capitalization in the hashtag as was used in the commercial? Amazing.)</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/twitter-trends.png"><img class="alignnone size-large wp-image-110479" title="Twitter Trends" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/twitter-trends-600x392.png" alt="Twitter Trends" width="600" height="392" /></a></p>
<p>What began trending on Twitter also tended to show search spikes. For instance, take a look at searches for [echo and the bunnymen]:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/echoandthebunnymen.png"><img class="alignnone size-large wp-image-110481" title="Echo and the Bunnymen Search Trends" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/echoandthebunnymen-600x187.png" alt="Echo and the Bunnymen Search Trends" width="600" height="187" /></a></p>
<p>So what we talk about, we also search for.</p>
<h2>The Future is&#8230; QR Codes?</h2>
<p>It may have seemed like GoDaddy used the same tired formula as always in their ads (although, apparently <a href="http://www.ninebyblue.com/godaddy-superbowl-ad-sex-still-sells-and-influences-searches/">sex does sell</a>, so I can&#8217;t knock sticking with something that works), but in fact, they tried something new this year: including a QR code in the ad.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/godaddy-cloud.png"><img class="alignnone size-large wp-image-110507" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="GoDaddy QR Code" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/godaddy-cloud-600x308.png" alt="GoDaddy QR Code" width="600" height="308" /></a></p>
<p>The online version of the commercial includes the QR code during the entire length of the ad, but when aired during the Super Bowl, it appeared only briefly at the end, so I&#8217;m not sure if anyone managed to pull up the QR code reader on their mobile phone, rush to the TV, and scan it before it disappeared from the screen. Including it in the online version seems even more nonsensical though, as the idea seems to be that you&#8217;re watching the ad on your computer, see the QR code, scan it with your phone, and are brought to the godaddy.com site on your phone. I would guess that including a link to the web site in the commercial so that you can simply click and access the web site on your computer would make entering your credit card information for all those domain names quite a bit easier.</p>
<h2>Scoring Search Visibility</h2>
<p>So how did advertisers do in search? It&#8217;s difficult to come up with exact search coverage percentages. For instance, if a brand advertised multiple products and ranked well in search results for one product but not the other, does the tick mark for that brand go in the yes or no column for search visibility? What if the product showed up for its name but not for its tagline?</p>
<p>For the purposes of the stats below, I used the following guidelines:</p>
<ul>
<li>I counted each brand once, even if they aired ads for multiple products</li>
<li>If they ranked organically for at least one of brand, product, or tagline queries, I put a yes in the organic search column</li>
<li>If they had a paid search ad for at least one of brand, product, or tagline queries, I put a yes in the paid search column</li>
</ul>
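<p>As a rough sketch, the guidelines above can be written as a small scoring rule. The <code>BrandResult</code> structure, its field names, and the example brand are hypothetical and exist only for illustration; they are not the spreadsheet behind the stats.</p>

```python
from dataclasses import dataclass

QUERY_TYPES = ("brand", "product", "tagline")


@dataclass
class BrandResult:
    """Search visibility for one brand; the field names are hypothetical."""
    name: str
    organic: dict  # query type -> True if the brand site ranked organically
    paid: dict     # query type -> True if a paid search ad appeared


def score(brand):
    """Apply the guidelines: one row per brand, 'yes' if any query type hits."""
    ranked = any(brand.organic.get(q, False) for q in QUERY_TYPES)
    advertised = any(brand.paid.get(q, False) for q in QUERY_TYPES)
    return {
        "brand": brand.name,
        "organic": "yes" if ranked else "no",
        "paid": "yes" if advertised else "no",
    }


# Made-up example: ranks for its name but not its tagline, and bought no ads.
example = BrandResult(
    name="Example Brand",
    organic={"brand": True, "product": False, "tagline": False},
    paid={},
)
print(score(example))
```

<p>Under this rule, a brand that ranks for its name but not its tagline still gets a yes in the organic column, which is why the per-brand numbers look better than the per-tagline reality.</p>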
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/ad-percentages.png"><img class="alignnone size-large wp-image-110552" title="Super Bowl Commercials" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/ad-percentages-600x440.png" alt="Super Bowl Commercials" width="600" height="440" /></a></p>
<p>In a follow-up column, I&#8217;ll point out some interesting choices, but for now, let&#8217;s just look at how well advertisers thought out web sites, search, and social media.</p>
<p>Of the 53 brands I tracked:</p>
<ul>
<li>33 ended the ad with a URL to the brand site, 4 went with a Twitter hashtag, and 4 sent viewers to Facebook.</li>
<li>44 bought a paid search ad</li>
<li>51 ranked organically for the brand name (although far fewer ranked for the promoted taglines or hashtags)</li>
</ul>
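<p>To put those counts in proportion, here is a minimal sketch that turns them into percentages of the 53 brands tracked. The counts come from the list above; the labels and the <code>share</code> helper are just illustrative names.</p>

```python
TOTAL_BRANDS = 53

# Counts taken from the list above.
counts = {
    "URL at end of ad": 33,
    "Twitter hashtag": 4,
    "Facebook": 4,
    "Paid search ad": 44,
    "Ranked organically for brand name": 51,
}


def share(count, total=TOTAL_BRANDS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


for label, count in counts.items():
    print(f"{label}: {count} of {TOTAL_BRANDS} ({share(count)}%)")
```

<p>That works out to roughly 62% ending with a URL, 83% buying a paid search ad, and 96% ranking organically for the brand name.</p>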
<h2>Chrysler and YouTube</h2>
<p>Last year, Chrysler&#8217;s Eminem ad was one of the most popular commercials of the game. I found it odd at the time that although they designed their site&#8217;s home page to tie in quite well to the vibe of that ad, they bought search ads pointing to the commercial on YouTube. I felt they lost an opportunity to further interact with potential customers and lost some control of the experience (related videos could easily be from competitors, for instance). Their flow looked like this:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/youtubeflow.png"><img class="alignnone size-full wp-image-110521" title="YouTube Flow" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/youtubeflow.png" alt="YouTube Flow" width="533" height="157" /></a></p>
<p>That&#8217;s not a bad outcome, but I thought that if they had used paid search to drive visitors to the commercial on their site, they might have been able to better leverage the opportunity. This year, Chrysler once again had a much-talked-about ad, and they decided to mix things up a bit.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/chrysler.png"><img class="alignnone size-large wp-image-110535" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Chrysler Demand" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/chrysler-600x129.png" alt="Chrysler Demand" width="600" height="129" /></a></p>
<p>For [chrysler]-related searches, the paid search ad points at their home page, which is a great tie in to the commercial. But for other searches, they&#8217;ve once again chosen to promote YouTube.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/chrysler-paid-search.png"><img class="alignnone size-large wp-image-110524" title="Chrysler Paid Search" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/chrysler-paid-search-600x180.png" alt="" width="600" height="180" /></a></p>
<p>This time, the YouTube link makes a lot more sense as it&#8217;s to the channel, so there are no competitor links and the entire page is focused on getting votes for the YouTube AdBlitz, engaging socially, and even includes an ad for the car featured in the commercial. All in all, I fully support this approach. They keep the branded searches pointing at their home page (after all, not everyone searching for the brand is searching for the commercial), which is tightly integrated with the campaign, and they send those looking for the commercial to a page designed specifically to engage with them. What a difference a year makes.</p>
<p><strong>2012 Paid Search Ad to YouTube:</strong></p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/chrysler-youtube.png"><img class="alignnone size-large wp-image-110526" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Chrysler YouTube" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/chrysler-youtube-600x413.png" alt="Chrysler YouTube" width="600" height="413" /></a></p>
<p><strong>2011 Paid Search Ad to YouTube:</strong></p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/2011-chrysler.png"><img class="alignnone size-large wp-image-110527" title="2011 Chrysler YouTube" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/2011-chrysler-600x373.png" alt="2011 Chrysler YouTube" width="600" height="373" /></a></p>
<h2>Acura NSX vs. Bud Light Platinum</h2>
<p>We&#8217;ve already seen that the #makeitplatinum hashtag strategy both worked and didn&#8217;t work for Bud Light (they definitely got it trending, but for perhaps the wrong reasons). What about organic search visibility? Sadly, the brand web site doesn&#8217;t appear at all in Google for searches for [bud light platinum] (although they have bought a paid search ad to the YouTube page).</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/bud-light-platinum.png"><img class="alignnone size-full wp-image-110536" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Bud Light Platinum" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/bud-light-platinum.png" alt="Bud Light Platinum" width="592" height="322" /></a></p>
<p>Acura NSX, on the other hand (which was a spiking search on Monday), does an excellent job with organic search, taking the top spot with a page devoted to it. (Although including the commercial on the page would have been a good idea.)</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/acura-nsx.png"><img class="alignnone size-full wp-image-110537" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Acura NSX" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/acura-nsx.png" alt="Acura NSX" width="536" height="320" /></a></p>
<p>Overall, I felt brands did a much better job of keeping things simple and driving viewers to interesting, relevant pages that engaged them. Watch for my next post in the coming days for some specifics on what went right and spectacularly wrong.</p>
<h6>(Stock image via <a href="http://www.shutterstock.com/">Shutterstock.com</a>. Used under license.)</h6>
<p>Related:</p>
<ul>
<li><a href="http://searchengineland.com/when-is-the-super-bowl-start-time-the-nfl-finally-gets-it-right-110176">Super Bowl 2012: What Time Does It Start?</a></li>
<li><a href="http://searchengineland.com/scoring-the-2011-super-bowl-commercials-for-search-visibility-and-visitor-engagement-63672">Super Bowl 2011: Commercials and Search Visibility</a></li>
<li><a href="http://searchengineland.com/scoring-super-bowl-2010-advertising-hows-the-search-visibility-35588">Super Bowl 2010: Commercials and Search Visibility</a></li>
<li><a href="http://searchengineland.com/scoring-the-superbowl-ads-do-broadcast-marketers-get-online-acquisition-16398">Super Bowl 2009: Commercials and Search Visibility</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/did-super-bowl-advertisers-take-advantage-of-search-interest-110444/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>When Is the Super Bowl Start Time? The NFL Finally Gets It Right</title>
		<link>http://searchengineland.com/when-is-the-super-bowl-start-time-the-nfl-finally-gets-it-right-110176</link>
		<comments>http://searchengineland.com/when-is-the-super-bowl-start-time-the-nfl-finally-gets-it-right-110176#comments</comments>
		<pubDate>Sun, 05 Feb 2012 18:21:00 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Features: Analysis]]></category>
		<category><![CDATA[Search & Society]]></category>
		<category><![CDATA[Search Marketing: Search Term Research]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=110176</guid>
		<description><![CDATA[Super Bowl 46 kicks off on February 5, 2012 at 6:30pm EST on NBC. Amazingly enough, I found this information by searching on Google and clicking on the second result: nfl.com. Amazing because every year, football fans flock to search engines searching for the start time, and until now, organizations like the NFL, the playing [...]]]></description>
			<content:encoded><![CDATA[<p>Super Bowl 46 kicks off on February 5, 2012 at 6:30pm EST on NBC. Amazingly enough, I found this information by searching on Google and clicking on the second result: nfl.com.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/super-bowl-time-serp.png"><img class="alignnone size-full wp-image-110177" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="super-bowl-time-serp" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/super-bowl-time-serp.png" alt="" width="524" height="328" /></a></p>
<p>Amazing because every year, football fans flock to search engines searching for the start time, and until now, organizations like the NFL, the playing teams, and the broadcasting station didn&#8217;t show up at all in search results because none of their sites answered the question. Seem crazy?</p>
<ul>
<li><a href="http://searchengineland.com/can-searchers-find-the-superbowl-16396">2009 Results</a>: In 2009, start-time related searches were among the most popular the morning of the game, but neither the NFL nor NBC was anywhere to be found.</li>
<li><a href="http://searchengineland.com/searching-for-the-superbowl-start-time-how-are-the-engines-the-nfl-and-cbs-doing-35451">2010 Results</a>: In 2010, both nfl.com and cbs.com had significant technical infrastructure issues that kept search engines from crawling and indexing the content. Again, the search results were sad and this time, full of spammers trying to capitalize on the search volume.</li>
<li><a href="http://searchengineland.com/what-time-does-the-super-bowl-start-a-continuing-lesson-in-search-visibility-63633">2011 Results</a>: In 2011, problems continued. But news organizations jumped in, and the Huffington Post in particular ranked well for its article that simply listed all of the various ways people were searching for the Super Bowl start time. (That article was later &#8220;edited for clarity.&#8221;)</li>
</ul>
<p>This year, <a href="http://deadspin.com/5881720/what-time-does-the-super-bowl-start-he-wrote-as-a-headline-to-game-the-google-results">things are finally getting better</a>. Even the Huffington Post, while still getting every variation of spelling and tagging in the article for maximum search coverage (&#8220;For starters, it&#8217;s two words, not one. &#8220;Superbowl&#8221; is an incorrect spelling.&#8221;), has filled out their article a bit with actual information.</p>
<div><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/huffpo.png"><img class="alignnone size-large wp-image-110179" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Super Bowl Huffington Post" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/huffpo-600x566.png" alt="" width="600" height="566" /></a></div>
<p>The results could still be better. While [Super Bowl start time] has overall higher search volume than [Super Bowl kick off time], the latter is the top search this morning, and NFL.com only ranks for the former (HuffPo does quite well with the latter). Superbowl.com also ranks, but as I mentioned in earlier years, this domain uses a 302 (temporary) redirect to nfl.com. A 301 (permanent) redirect would instead consolidate the domains (including value signals such as links), which might cause the target URL to do better overall in relevant searches. But still, compared to earlier years, I&#8217;d call these results a win for the NFL.</p>
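<p>If you want to check which kind of redirect a domain uses, here is a minimal sketch using Python&#8217;s standard <code>http.client</code>, which does not follow redirects on its own. The function names are my own, and the host you pass in is whichever domain you want to audit. A 301 is a permanent redirect; a 302 is a temporary one.</p>

```python
import http.client


def fetch_redirect(host, path="/"):
    """Send one HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe_redirect(status):
    """Classify a status code as a permanent or temporary redirect."""
    if status in (301, 308):
        return "permanent"
    if status in (302, 303, 307):
        return "temporary"
    return "not a redirect"


# Usage (requires network access):
# status, location = fetch_redirect("example.com")
# print(status, describe_redirect(status), location)
```

<p>A result of "temporary" for a domain that is meant to be a permanent alias is the signal to ask for a 301 instead.</p>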
<div><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/trends-9am.png"><img class="alignnone size-full wp-image-110182" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Super Bowl Trends" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/trends-9am.png" alt="" width="201" height="305" /></a></div>
<p>Sadly, NBC, the Giants, the Patriots, and TV Guide all fail to appear in results once again, even though both Google Insights for Search and my articles over the years should have prepared them for this year&#8217;s search interest.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/insights.png"><img class="alignnone size-large wp-image-110184" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Start Time Insights" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/insights-600x257.png" alt="" width="600" height="257" /></a></p>
<p>Why should these sites care about showing up for these searches? They&#8217;ve invested substantially in site content and those seeking out the game start time are a perfect audience for that content. Searchers would click for the start time and stay for the fan jam videos and view the ads.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/02/nfl-events-page.png"><img class="alignnone size-large wp-image-110191" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="NFL Events" src="http://searchengineland.com/figz/wp-content/seloads/2012/02/nfl-events-page-600x523.png" alt="" width="600" height="523" /></a></p>
<p>Of course, Super Bowl viewers will see lots of ads anyway today. But that&#8217;s a topic for the next article.</p>
<p><strong>Related:</strong></p>
<ul>
<li><a href="http://searchengineland.com/can-searchers-find-the-superbowl-16396">2009 Super Bowl Start Time</a></li>
<li><a href="http://searchengineland.com/searching-for-the-superbowl-start-time-how-are-the-engines-the-nfl-and-cbs-doing-35451">2010 Super Bowl Start Time</a></li>
<li><a href="http://searchengineland.com/what-time-does-the-super-bowl-start-a-continuing-lesson-in-search-visibility-63633">2011 Super Bowl Start Time</a></li>
<li><a href="http://searchengineland.com/scoring-the-superbowl-ads-do-broadcast-marketers-get-online-acquisition-16398">2009 Super Bowl Commercials</a></li>
<li><a href="http://searchengineland.com/scoring-super-bowl-2010-advertising-hows-the-search-visibility-35588">2010 Super Bowl Commercials</a></li>
<li><a href="http://searchengineland.com/scoring-the-2011-super-bowl-commercials-for-search-visibility-and-visitor-engagement-63672">2011 Super Bowl Commercials</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/when-is-the-super-bowl-start-time-the-nfl-finally-gets-it-right-110176/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Google Webmaster Tools Adds Useful Download Options</title>
		<link>http://searchengineland.com/google-webmaster-tools-adds-useful-download-options-108684</link>
		<comments>http://searchengineland.com/google-webmaster-tools-adds-useful-download-options-108684#comments</comments>
		<pubDate>Fri, 20 Jan 2012 19:48:47 +0000</pubDate>
		<dc:creator>Vanessa Fox</dc:creator>
				<category><![CDATA[Google: Webmaster Central]]></category>

		<guid isPermaLink="false">http://searchengineland.com/?p=108684</guid>
		<description><![CDATA[Google webmaster tools data is helpful stuff, but has been somewhat tough to download and use. Last month, Google made things a bit easier by providing a Python script for downloading search query data (as this report isn&#8217;t available as of yet through the API). Now, they&#8217;ve added new download options that significantly add to [...]]]></description>
			<content:encoded><![CDATA[<p>Google webmaster tools data is helpful stuff, but has been somewhat tough to download and use. Last month, Google made things a bit easier by providing a <a href="http://googlewebmastercentral.blogspot.com/2011/12/download-search-queries-data-using.html">Python script for downloading search query data</a> (as this report isn&#8217;t available as of yet through the API). Now, they&#8217;ve added new download options that significantly add to the usefulness of the data. Below is a look at why these new CSV files are so important.</p>
<h2>Search Queries: Download Chart Data</h2>
<p>Search query chart data downloads provide access to information not easily available before. When you view search query data, the chart in the user interface shows impression and click data per day, but in the past, this detail has only been available if you hover over a dot in the chart.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/01/gwt-chart.png"><img class="size-full wp-image-108695 alignnone" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google webmaster tools search query charts" src="http://searchengineland.com/figz/wp-content/seloads/2012/01/gwt-chart.png" alt="Google webmaster tools search query charts" width="262" height="106" /></a></p>
<p>Now, you can access this data by clicking the Download Chart Data button below the Top Queries and Top URLs reports. (You can use filtering options to drill into specific data or date ranges.)</p>
<p>The data downloads to a CSV file as shown below:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/01/gwt-csv.png"><img class="alignnone size-full wp-image-108698" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Query Data CSV" src="http://searchengineland.com/figz/wp-content/seloads/2012/01/gwt-csv.png" alt="Query Data CSV" width="340" height="202" /></a></p>
<p>When you click on a specific query in the Top Queries report page, you see a similar chart, but its data is not available for download. In the example below, clicking on the [track santa claus] query in the Top Queries report brings up the query details for that single query, and you can download the page and position information, but not the chart data. (Note that a display bug seems to be preventing the buttons on this page from displaying in Internet Explorer.)</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/01/query-details.png"><img class="alignnone size-large wp-image-108806" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="query-details" src="http://searchengineland.com/figz/wp-content/seloads/2012/01/query-details-600x497.png" alt="Google webmaster tools query details" width="600" height="497" /></a></p>
<p>You can filter the Top Queries report and download the impression and click data from there (although you may not be able to filter to a single query). In the example below, a filtered report for [track santa claus] resulted in 16 queries.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/01/tracksanta2.png"><img class="alignnone size-large wp-image-108715" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google search query details" src="http://searchengineland.com/figz/wp-content/seloads/2012/01/tracksanta2-600x295.png" alt="Google search query details" width="600" height="295" /></a></p>
<p>The chart data in this case can be particularly useful due to how Google reports trended data. Watch for a separate post detailing how this works, but in short, Google reports on the top 1,000 queries per day. For reports that span multiple days, such as the example above, impression and click data is included only for the days on which the query was in the top 1,000. This can result in misleading totals, since the impressions and clicks aren&#8217;t totals for the entire date range, but only totals from the days in that range that the query was in the top 1,000. Confusing? Totally understandable.</p>
<p>Let&#8217;s look at the example above. I&#8217;ve filtered the Top Queries report to only show queries that include [track santa claus] for a 31-day period and the report shows:</p>
<ul>
<li>16 queries</li>
<li>170,000 impressions</li>
<li>3,000 clicks</li>
</ul>
<p>Does this mean the site got traffic for only 16 queries that contain the words [track santa claus]? No, it means that only 16 queries made it into the top 1,000 on at least one day during the date range. Based on that foundation, you might assume that those 16 queries resulted in 170,000 impressions for the site and brought in 3,000 clicks for the 31-day period. But that assumption would be wrong. (This is where things can really start to get confusing.)</p>
<p>The Search Queries report only reports data (at all) for those days that a query is in the top 1,000. And the totals are the sum of the counts on those days only &#8212; not the entire date range. This level of detail has always been available (sort of) by way of the dots on the charts. Looking more closely at our [track santa claus] example, notice that several days in the chart don&#8217;t have dots.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/01/tracksanta3.png"><img class="alignnone size-large wp-image-108736" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google query detail no data" src="http://searchengineland.com/figz/wp-content/seloads/2012/01/tracksanta3-600x98.png" alt="Google query detail no data" width="600" height="98" /></a></p>
<p>The days without dots are the days that the query (or set of queries) didn&#8217;t make the top 1,000 and therefore have no data reported. Until now, the only way to make some sense of the data was to count the dots to better understand the summary data. Now, when you download the chart data, you can much more easily understand what the visualization represents:</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/01/tracksanta4.png"><img class="alignnone size-full wp-image-108742" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Query Detail CSV" src="http://searchengineland.com/figz/wp-content/seloads/2012/01/tracksanta4.png" alt="Query Detail CSV" width="265" height="248" /></a></p>
<p>With the CSV, it&#8217;s much more apparent that although the specified date range is 12/18/11 &#8211; 1/17/12, the totals are for 11 days, not 31. These queries were in the top 1,000 on 12/27, but not on the 28th or 29th. Does this mean that on the 28th, the impressions and clicks were lower than on the 27th? Maybe. Or it could mean that impressions and clicks stayed the same but another query spiked in volume and bumped it from the top 1,000.</p>
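<p>Counting the days with data is easy to automate once you have the CSV. The sketch below assumes a layout with <code>Date</code>, <code>Impressions</code>, and <code>Clicks</code> columns and ISO-formatted dates; the real export&#8217;s column names and date format may differ, and the sample rows are made up purely for illustration.</p>

```python
import csv
import io
from datetime import date, timedelta

# Made-up sample in an assumed layout (Date, Impressions, Clicks);
# the real export's column names and date format may differ.
SAMPLE_CSV = """Date,Impressions,Clicks
2011-12-24,900,20
2011-12-25,1500,40
2011-12-27,300,5
"""


def summarize(csv_text, start, end):
    """Compare the days that have data against the full requested range."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    reported = {date.fromisoformat(r["Date"]) for r in rows}
    span = (end - start).days + 1
    all_days = [start + timedelta(days=i) for i in range(span)]
    return {
        "days_in_range": span,
        "days_with_data": len(reported),
        "missing_days": [d.isoformat() for d in all_days if d not in reported],
        "impressions": sum(int(r["Impressions"]) for r in rows),
        "clicks": sum(int(r["Clicks"]) for r in rows),
    }


summary = summarize(SAMPLE_CSV, date(2011, 12, 24), date(2011, 12, 28))
print(summary)
```

<p>In this made-up sample, the totals cover three of the five days in the range. The missing days are the ones to treat with caution: they tell you the queries fell out of the top 1,000, not that traffic dropped to zero.</p>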
<h2>User Interface Improvements</h2>
<p>One nice change is the ability to view more data on a page. Most pages, including the dashboard page listing all sites in the account, now include an option for the number of rows to show.</p>
<p><a href="http://searchengineland.com/figz/wp-content/seloads/2012/01/show.png"><img class="alignnone size-full wp-image-108757" style="border-image: initial; border-width: 1px; border-color: black; border-style: solid;" title="Google webmaster tools show rows" src="http://searchengineland.com/figz/wp-content/seloads/2012/01/show.png" alt="Google webmaster tools show rows" width="335" height="54" /></a></p>
<p>You can now view up to 500 rows at once, which makes overall account management and data viewing much easier.</p>
<p>I had initially thought that more link data might be available as well, but after talking with Google, I realize that the user interface improvements have simply made the link reports more easily visible.</p>
<p>In particular, the detailed chart data available for download provides a significant improvement as it brings a much more accurate understanding of what the data actually is. Watch for my more detailed post on the search query data shortly.</p>
]]></content:encoded>
			<wfw:commentRss>http://searchengineland.com/google-webmaster-tools-adds-useful-download-options-108684/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

