The new search engine Blekko just launched, and it’s making SEO-related data that it has found during its crawling and indexing of the web available for all sites. Just what does it include and how useful is it?
Accessing SEO Data
You can access SEO data for any domain or URL by clicking the SEO link below any page in search results. You can also click link for a list of external links.
The SEO tools include:
- a list of pages that link to the site
- link distribution and anchor text data
- crawl data
- pages indexed
- a site comparison tool
- a duplicate content report
Blekko has also created scores, such as Host Rank and URL Rank. Other than assuming that a higher number is better, I’m not sure what goes into the scoring or what the scale is.
Of most interest to many site owners is likely competitive link data. Google webmaster tools only provides external link information to verified site owners. Yahoo Site Explorer lists competitive link data but has a questionable future now that Bing powers Yahoo search results. Several third-party tools have stepped up to fill that potential void, but another source of data is always welcome.
The keys to value in third-party link data are always:
- How closely does the third-party index resemble the indices of the major search engines? Is substantial link data missing?
- How often is the reporting updated and how complete is each report? In other words, are reported changes in the number of links accurate or does each report simply reflect a different section of the site’s link graph because it’s from a different part of the web?
- How well does the third party canonicalize duplicate URLs and eliminate spam?
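One rough way to test the second question is to compare the URLs in two reports pulled at different times. The sketch below is my own illustration, not anything Blekko provides: the function name and the sample URLs are hypothetical, and it assumes you have already exported each report as a plain list of linking URLs.

```python
def snapshot_overlap(earlier, later):
    """Return the share of linking URLs that two reports have in common.

    Computed as (URLs in both reports) / (URLs in either report).
    A value near 1.0 suggests both reports cover the same part of the
    link graph; a low value suggests each report sampled a different part.
    """
    earlier, later = set(earlier), set(later)
    union = earlier | later
    if not union:
        return 1.0  # two empty reports are trivially identical
    return len(earlier & later) / len(union)


# Hypothetical exports from two reporting dates.
report_october = ["http://a.example/1", "http://b.example/2", "http://c.example/3"]
report_november = ["http://b.example/2", "http://c.example/3", "http://d.example/4"]

print(snapshot_overlap(report_october, report_november))
```

If the overlap between consecutive reports is low, a change in the total link count probably says more about which pages were crawled than about links gained or lost.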
The first thing that confuses me about Blekko’s data is that the count provided from clicking link doesn’t match the count shown on the inbound links tab.
That said, clicking link under a URL in Blekko search results seems to provide a list of links to that URL (see first image above), sorted by relevance or date. The date sort seems like it could be useful for learning about new links to a site (although I find that analytics referral data is most useful for learning about new links to your own site).
Clicking the rank stats icon beside the sort options generates a list in table view with two scores beside each URL. I have no idea what these mean. The table includes a legend, but I don’t see the legend notations used anywhere in the table. Clicking more details brings up an even more confusing table. I can’t find help documentation for any of this on the Blekko site and while a few columns seem fairly obvious (lang, notporn), others are perplexing (people?).
The Show Terms button appears to simply include the data from the first view in the second view.
To get to the rest of the link data, you have to go back to the search results and click the SEO link. From there you can get both domain and page level data. Let’s look at page-level data to compare to the tables above.
Regional link distribution is interesting, particularly when looking at why certain pages rank in particular countries or for queries where location is relevant.
The second image shows what appears when clicking the see all 2500 link. I’m assuming that 2500 is the maximum number of URLs shown in this report, not that the link count has changed to 2500. As you can see, the first table includes the rank (although I don’t know what this signifies) and the second includes the date (again, I’m not sure if this date is when Blekko first discovered the page or last crawled it or something else), although both tables list both columns.
Clicking show graphs brings up, well, graphs.
The historical data could be useful. The inbound links by domain includes internal links (as does some of the table data), which may be less useful.
Clicking the Visualize icon next to a URL brings up yet another set of graphs, and at this point, while I’m all for more data, I’m getting a bit fatigued by little bits of data spread across multiple pages. Here, you enter up to four URLs to “visualize”, and this seems to provide comparison bar graphs.
These graphs include data that is on yet more pages. Head back to the overview page and scroll down for anchor text data.
The table is a little smooshed together, but I think the columns are number (of links with that anchor text), percentage (of that anchor text), good, exclude, and zapsite. This is some of the data I alluded to earlier that’s included on the comparison link graphs. I have no idea what any of it means. I imagine that “good” are the links that Blekko has deemed valuable, but after that, your guess is as good as mine.
But the big question is how the link data stacks up against what else is out there. Is it useful, for instance, for understanding what links Google sees to a site? This won’t tell us how consistent the data will be from crawl to crawl, but at least we can see a current comparison snapshot. Below are the link counts into the searchengineland.com domain (note that Yahoo Site Explorer data may not yet be restored). The first thing you’ll notice is that Blekko shows significantly fewer links than Google webmaster tools. Is this because Google is counting every link and Blekko is only counting each domain once, no matter how many times pages on that domain link in, or is it because Blekko hasn’t crawled as much of the web? Well, Blekko shows 145,281 inbound links to the URL searchengineland.com (as opposed to the domain), whereas Google webmaster tools shows 1,891,942 unique links and 4,993 total domains. Blekko’s count is far higher than Google’s domain count, so my guess at this point is that Blekko is counting unique URLs, but just knows about fewer of them.
|Google Webmaster Tools||Yahoo Site Explorer||Blekko||SEOmoz Linkscape||Majestic SEO|
Well. Those numbers are all over the place. Google shows significantly more, but of course, you can’t see competitive data, so the other tools are useful, even if they are incomplete. The only question is whether the index used to generate the data changes from report to report (which makes the data less useful to chart changes over time).
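To put the Google and Blekko figures quoted earlier side by side, here is a quick back-of-the-envelope calculation. It uses only the three numbers cited for searchengineland.com; the variable names are mine, and the comparison is rough because the Blekko figure is for the URL while the Google figures are for the whole site.

```python
# Figures quoted in this article for searchengineland.com.
google_links = 1_891_942    # unique links reported by Google webmaster tools
google_domains = 4_993      # linking domains reported by Google webmaster tools
blekko_links = 145_281      # inbound links Blekko reports for the URL

# Average number of links Google reports per linking domain.
links_per_domain = google_links / google_domains

# Blekko's count as a fraction of Google's link count.
blekko_share = blekko_links / google_links

# If Blekko counted each domain only once, its total could not
# plausibly exceed the number of linking domains Google knows about.
blekko_exceeds_domains = blekko_links > google_domains

print(f"Google links per linking domain: {links_per_domain:.0f}")
print(f"Blekko count as a share of Google's: {blekko_share:.1%}")
print(f"Blekko count exceeds Google's domain count: {blekko_exceeds_domains}")
```

This is why the one-link-per-domain explanation looks unlikely: Blekko's count is many times the number of linking domains, yet well under a tenth of Google's link count.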
The more data the better, but I don’t know how useful it is to know how many pages Blekko has crawled, unless you’re trying to optimize for Blekko. As you can see below, searchengineland.com (I assume the home page) was crawled 19 seconds ago, but the robots.txt file was crawled 30 days ago. I’m not sure if the average page length is actionable (there’s not really an ideal length of a page) and I don’t know if page latency is server side or client side (and I guess it’s in seconds?). I also am not sure what one does with a site’s Analytics or Adsense ID.
Site Pages (I think) lists all the pages indexed in Blekko. In this case, the total crawled matches the total indexed, which I find suspect. Either Blekko is doing no canonicalization (searchengineland.com has duplicate URLs with all kinds of tracking codes, for instance) or the numbers are coming from the same source. I can’t get the visualization details (accessible via the icon) to load.
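To illustrate what I mean by canonicalization, here is a minimal sketch of how duplicate URLs that differ only by tracking codes might be collapsed before counting. This is my own assumption about one reasonable approach, not a description of what Blekko or any other engine does. The `utm_` prefix list, the function name, and the sample URLs are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters starting with these prefixes are tracking codes.
TRACKING_PREFIXES = ("utm_",)


def canonicalize(url):
    """Lowercase the scheme and host, drop tracking parameters and fragments."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


# Hypothetical crawl list containing tracking-code duplicates.
crawled = [
    "http://example.com/story",
    "http://example.com/story?utm_source=feed",
    "http://Example.com/story?utm_medium=email#comments",
    "http://example.com/other?id=3",
]

unique_pages = {canonicalize(url) for url in crawled}
print(len(crawled), "crawled,", len(unique_pages), "after canonicalization")
```

If an engine did something along these lines, I would expect the indexed count to come in lower than the crawled count for a site with lots of tracking-code URLs.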
The Compare tab has the link graph data shown earlier, as well as additional slices of the link data, as shown below. I’m not sure if this is the best way to compare lists of links, although I guess the idea is that the links are in order of importance (at least, according to Blekko).
This is an awesome concept. I don’t know how awesome it is in execution. The help section says:
The duplicated content tab (/duptext) displays URLs that have content that is the same as that of the website you are looking at. Click on a URL in the “urls with duplicated content” section to see a cached version of that URL with the duplicated content highlighted. You can also see seo, sections and whois information for each URL with duplicated content.
When I look at ninebyblue.com, I see that 161 sites have content from ninebyblue.com, but I can’t seem to get to a list of what the content is or where it’s located on other sites.
This functionality seems to be available only at the URL level.
At the URL level, you can see other information, such as page source (although you can also get this from the source of the page itself). When you click sections, you can see the source code by section.
The usefulness of this data depends on how closely you think it reflects Google (assuming site owners care more about traffic from Google than Blekko). More data is always better, but just keep in mind that this data may not provide the complete picture. It’s also the case that while you can see this data for all sites, not just your own, having competitive information is more useful for some categories than others. Blekko also provides a toolbar, so you can view this data for sites as you browse them.
The value of data is always in what you do with it. You can waste your time entirely gathering and analyzing data that’s not actionable or you can find key insights that help you move your business forward. I’m always on the side of more data, but data alone isn’t enough. As for Blekko data specifically, it’s great that we now have more sources, although I would love some additional explanatory help information. We’ll have to watch the data over time to see how actionable the historical trends are.
Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.