How To Manage Big Data With Pivot Tables

Big data is all the rage right now. If you search Google News for the term, there are 53,500 thousand results. And that’s just the news.

big dataMost of us in Internet marketing aren’t truly working with big data, in the truest sense of the word.

However, when we’re investigating things like the competitive landscape for a particular keyword we want to target, we’re definitely working with bigger data sets than your everyday, garden-variety spreadsheet.

When you’re trying to find correlations and causal relationships, anything short of a pivot table is going to fall short.

Pivot tables are like the Swiss army knife of data analysis in Excel. And trust me … They look scary at first, but they’re so easy to use once you learn the underpinning logic that powers them.

What I’ll be using for my “big data” is a Keyword Difficulty Report from SEOmoz and a rank report from Raven.

(Disclaimer: I have no affiliation with either site, and you need to be a subscriber to use both of these tools.)

I’ll start with the Keyword Difficulty Report because it’s one of my faves from SEOmoz’s host of reports. Why? Because I used to spend hours collecting this data manually, and now I can get it in a few clicks. Better yet, it’s delivered to my inbox the next day, so I can just set it and forget it.

The way the report works is you enter a keyword you’re trying to target, and it does a search on the search engine of your choice. (I ran a report for gucci handbags and chose Google US for my report.) Then it scrapes the following metrics and delivers them sometime the next day to your Inbox.

Here are the metrics included in the report for the top 10 results:

  • Page Authority
  • mozRank
  • mozTrust
  • mozTrust/mozRank
  • Total Links
  • Internal Links
  • External Links
  • Followed Links
  • Nofollowed Links
  • Linking Root Domains
  • On-Page Analysis Grade (A-F)
  • Broad Keyword Usage In Title
  • Broad Keyword Usage In Document
  • Keyword Used In URL
  • Keyword In Domain
  • Keyword Exact Match
  • Exact Anchor Text Links
  • % Links With Exact Anchor Text
  • Linking Root Domains With Exact Anchor Text
  • % Linking Root Domains With Exact Anchor Text
  • Partial Anchor Text Links
  • % Links With Partial Anchor Text
  • Partial Anchor Text Root Domains
  • % Linking Root Domain With Partial Anchor Text
  • Domain Authority
  • Domain mozRank
  • Domain mozTrust
  • DmT/DmR
  • External Links To This Domain
  • Linking Root Domains To This Domain
  • Linking C-Blocks Domains To This Domain
  • Tweets
  • Facebook Shares
  • Google +1 Shares

Crazy, right? That’s some pretty big data. But it’s too much — for a regular table at least. You need to be able to slice and dice the data and isolate portions of it to effectively evaluate it. The nice thing about pivot tables is that you can literally do exactly that by simply dragging and dropping your data around at will.

If you want to follow along, you can download the Excel table I worked from. It includes a tab for the raw [hideous] data, one for the formatted tables, and finally one for the pivot tables I reference.

So let’s get started …

Preliminary Setup

As I mentioned in my post on how to format tables in Excel, before I do anything with raw data, I format it as a table. To do that, add a row above your headings by right-clicking on the row number and choosing Insert from the flyout menu.

If you make sure the columns and rows surrounding your data are empty, Excel will be able to auto-detect the perimeter of your data. Now just click any cell inside the data and choose Home > Styles > Format as Table > the table style of your choosing (Mac: Tables > Table Styles).

formatted table in Excel

Click for larger image.

There are also a few rules you should follow that will reduce your troubleshooting down the road significantly:

1.  Set your table up in tabular format.

Formatting your data as a table, as we did above, will take care of this requirement. But basically you need to follow the structure of a database, where your data only includes column headings. You can’t have row headings, like this table:

wrong table formatting for a pivot table

This means that sometimes you will have columns that have a lot of duplication. For example, my audit checklist is set up as a table, so I can easily sort and filter it when I’m writing up my findings. And one of the columns is called Category.

These categories include things like Architecture, Duplicate Content, Crawl, Keywords, etc. So you’ll see the category repeated for each “record” in the database, like so:

database setup for pivot tables

2.  Organize your columns by buckets, not individual values.

This is probably the hardest concept to grasp when you’re first starting with pivot tables. So, for example, if you wanted to see visits to a site broken down by month and medium, you might be tempted to create a column for each medium (organic, PPC, social, email, etc.) and each month. This would be incorrect and will cause you all manner of aggravation when you go to pivot the data.

Your column headings would need to be the broader buckets: Visits, Medium, and Month.

Here’s where it gets a bit tricky.

Let’s say you have visit data for two years, and you want to compare year-over-year data — in other words, visits for 2012 compared to the same months in 2011. Then, you would need to add a year column as well. You won’t need a Year column anytime you have data that spans more than one year.

I’ll teach you a trick later in this post for how to group by year. It’s just specifically if you want to compare the values from one year (or quarter or month) to another.

3.  Avoid blank columns, rows, and cells.

To ensure you don’t have any blank rows and columns, make sure the bottom-right corner of your table is flushed up against the data (see screenshot below).

Also, blank cells can wreak havoc in your data and even cause Excel to think numbers are text, which will limit your filtering options down the road.

table perimeter

To hunt down these blanks in your data, click inside your table, press Ctrl-A (Mac: Command-A) to select just the data inside your table, then Ctrl-G (Mac: Control-G) to pull up the Go To dialog.

Click the Special button, choose Blanks, and click OK. This will select all the blank cells in your data. Enter a 0, then without clicking anything press the number 0 and Ctrl-Enter (Mac: Command-Return). This will enter the 0 inside all of the blank cells, though they make show up as hyphens.

Pivot Table Options

Now we’re ready to jump in head first.

The first thing you’ll want to do is click anywhere inside your formatted table and choose Insert > Tables > Pivot Table (Mac: Tables > Tools > Summarize with Pivot Table).

When prompted, choose to place it on a new worksheet (this is the default).

This is what your starting canvas will look like on a PC:

starting pivot table on a PC

Click for larger image.

And like this on a Mac:

starting pivot table for Mac

Click for larger image.

That dialog box you see hugging the right side of the Excel window is where all the magic happens. Any time you click inside your pivot table, it will appear. Click outside and it disappears again. But you can also turn it off by either clicking the X in the top-right corner of the dialog (upper-left corner on a Mac) or deselecting Field List under Pivot Table Tools > Options > Show (Mac: Pivot Table > View > Builder).

Note: Since the functionality of pivot tables is nearly identical between the PC and Mac, from this point forward, I’ll just be showing PC screenshots.

Setting Row Labels

Let’s choose our row label from the Pivot Table Field List. The most logical choice here would be URL. This is usually the value I keep constant as I play musical chairs with the column and report filters.

To set the Row Label to URL, simply click-and-drag URL down to the Row Labels field.

Setting row label in pivot table

Click for larger image.

Setting Values

Next, we need to decide which values we want to view. For now, I’m going to just look at External Links, Tweets, and Facebook Shares to see if any patterns emerge that could provide clues as to how Google decided the pecking order for this query.

To do this, click-and-drag each of those items down to the Values field.

adding values to a pivot table

Click for larger image.

Summary Options

By default, if you drag a field(s) that contains a numerical value into the Values field, the default summary option will be Sum. If you drag text into this Values field, the default summary value will be Count.

It makes sense: If there’s something to add, Excel wants to find the sum; if there’s not, it wants to count instances (e.g., instances of URLs that contain the search term).

But Excel provides other summary options as well — namely Average, Max, Min, Product, Count Numbers, Std Dev, Var, and Varp. To get to these options, right-click any cell in the column you want to modify and choose Summarize Values By > your option (Mac: Field Settings > your option > OK).

To learn more about these summary options, I couldn’t find a good resource on the Microsoft site, but this post contains a good overview.

Relabeling Columns

Personally, I don’t like how Excel adds “Sum of” (or Count , Average, Max, etc) to each of your column headings in the pivot table. But if you try to delete them, Excel will give you a nasty message forbidding you from doing so.

It’s because it has this annoying rule that you can’t use the same column heading in your pivot table that you use in your static table. So dumb.

Anyway, to get around this, I used to purposely make my static table column labels lamer to preserve the intuitive ones for the pivot table, which was time-intensive and made me irritable. Then I recently discovered that I could just put a space after the label, and Excel is none the wiser.

There are several ways you can change these labels — one of which is to right-click any label and choose Value Field Settings (just Field Settings on a Mac). But the easiest option sometimes is to just click inside the column heading of your pivot table and start typing.

Column labels for pivot tables

 

Pro Tip: If you want to easily change the order of your columns, a little-known trick is to just start typing over one column name with another already in the table.

For example, if we wanted to Ext Links to the end, just click on the FB Shares label and type “Ext Links ” (don’t forget the space at the end). When you hit Enter (Mac: Return), your other columns will shift to the left.

Setting Total Options

The Grand Total row has limited usefulness in this case, so we’re going to ax it. With any cell in your pivot table selected, go to Pivot Table Tools > Design > Layout > Grand Totals > Off for Rows and Columns (Mac: Pivot Table > Design > Totals > Deselect Show for Rows).

I also took this opportunity to clean up the spreadsheet a bit: hid the gridlines outside my table, added thousandths separators to the numbers greater than 1000, and added a title and general citation info at the top of my sheet.

Pivot table totals

Click for larger image.

Adding More Data To Rows

Now, let’s say we want to be able see page titles alongside the URLs. All you have to do is drag Title down under URL in the Row Labels field. And if you want to swap them — and have the titles appear on top — just drag URL under Title in the Row Labels field.

pivot table rows

Click for larger image.

Examples of where this option could be useful would be to list all of the organic (or paid) keywords a landing page is getting traffic for from an analytics report or all of the URLs linking to a site from each unique domain from a backlinks report (Majestic, Open Site Explorer, ahrefs, ReverseInternet.com, etc.).

Setting Column Labels

Up to this point we’ve been dragging all values we want to see in the columns to the Values field. When you drag a metric to the Values field, you see the word “Values” appear in the Column Labels field.

I have to admit, this issue of how to populate the columns of my pivot table confused me early on in my wanderings because it seemed like you should just drag row labels down to that field and column labels down to the columns field. And for a while, I just rolled the dice and just picked one; if it didn’t work, I moved it.  Let me (hopefully … no promises) spare you the confusion.

The data source we’re working with isn’t really conducive to using the column labels option. It would be if we combined tweets, Facebook shares, and Google +1′s into one column and called it something like Social Shares. But, instead of reformatting the original table (which I’ve had to do many times over the years), I’m going to switch to a new data set to show a couple other options as well.

In the downloadable Excel file, you’ll see a rank report for the term online university. I pulled Rank down into the Values field. But I have rankings for both Google and Bing, so I pulled Engine into the Column Labels field. This splits the rank values into two columns: Google and Bing.

columns label

Click for larger image.

 

Report Filters

Another powerful option you have with pivot tables is the report filter. Dragging a value into this field will generate a drop-down list above your pivot table that will allow you to filter the entire table by the values you choose from this drop-down.

With our current pivot table, we’ll add Visits to the Values field and Engine to the Report filter, then look at the rankings and organic visits for each search engine individually by only choosing one at a time.

report filter in pivot table

Click for larger image.

 

Grouping

One very cool, lesser-known feature of pivot tables is the ability to group your data. You may decide you want to see your rank and visit data grouped by month instead of week.

What most people would do is create another column in the original data set, label it Month, fill in the values, and refresh the data in the static table (right-click > Refresh) to add the new data to the pivot table.

The easier way is to right-click on any cell in the column you want to group ( in this case Date) and choose Group from the fly-out menu. You can even group by more than one value, such as Months and Years or Months and Quarters. I’ll just choose Months from the Grouping dialog and click OK.

The only problem is now the Rank is summarized by sum, and we want to now switch that to Average.

To do this, right-click on any cell in the Rank column and choose Summarize Values By > Average (Mac: Field Settings > Average >OK). I also got rid of the decimals by choosing Home > Number > Decrease Decimal and changed the heading to Avg Rank.

Here’s the result:

pivot table summarized by month

But you’re not limited to the preset grouping values. You can create a custom group by selecting the items you want to group in a column. Press-and-hold down the Ctrl key (Command key on the Mac) to select multiple, non-contiguous cells. Then right-click and choose Group. Name your groups in the new cells that appear above each group.

Here, I went back to the Gucci pivot table and grouped by Gucci and non-Gucci sites showing up on page 1 of Google for the query.

custom groupings in pivot tables

Click for larger image.

There are so many more options available to you with pivot tables. This post just scrapes the surface and lays the foundation for more advanced techniques I’ll be demonstrating in future posts.

But you now have the skills to be able to take large data sets and rearrange, sort, and filter like a pro in your search for correlations that can give you the competitive advantage you need.

To learn more about how to use pivot tables, take advantage of these resources from the Microsoft site:

If you run into any problems while venturing into these deep Excel waters, you can find me on Twitter (@AnnieCushing) or ask in the comments below.

Download File

You can download the Excel file I used here.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

Related Topics: Channel: Analytics | Search & Analytics

Sponsored


About The Author: is an SEO and analytics consultant. Her areas of expertise are analytics, technical SEO, and everything to do with data — collection, analysis, and beautification. She’s on a mission to rid the world of ugly data, one spreadsheet at a time. If you just can’t get enough data visualization tips, you can check out her blog, Annielytics.com.

Connect with the author via: Email | Twitter | Google+ | LinkedIn



SearchCap:

Get all the top search stories emailed daily!  

Share

Other ways to share:

Read before commenting! We welcome constructive comments and allow any that meet our common sense criteria. This means being respectful and polite to others. It means providing helpful information that contributes to a story or discussion. It means leaving links only that substantially add further to a discussion. Comments using foul language, being disrespectful to others or otherwise violating what we believe are common sense standards of discussion will be deleted. Comments may also be removed if they are posted from anonymous accounts. You can read more about our comments policy here.
  • http://www.JoeRega.com/ Joseph Rega

    This was extremely helpful, thank you Annie!

  • http://twitter.com/Anthony_Mac85 Anthony Mcloughlin

    Hi Annie – This looks like an amazing way to manage and analyze big sets of data. However I am a little confused. How do you get the keyword difficulty tool report mailed to you in excel format?

  • http://twitter.com/asholstein Andrew Holstein

    Can’t even describe how awesome pivot tables are. Now I look for excuses to use them, even if I don’t really need them.

  • http://twitter.com/AnnieCushing Annie Cushing

    You’ll get an email saying that it’s ready if you’re an SEOmoz member. And, unlike most tools, you actually can get a fully formatted report. But I still get the csv and format it myself. 

  • http://twitter.com/AnnieCushing Annie Cushing

    You bet! :)

  • http://twitter.com/AnnieCushing Annie Cushing

    They’re more addicting than pistachios. :)

  • Lyndsy Simon

    I have to disagree with your recommendation to replace blanks with zeroes – a blank is actually “null”, and nulls are different than zeroes.

    Imagine you have a column with SERP rank for Google for a list of keywords. That column is populated by a script that goes and and looks at the first 100 results of a Google search for that keyword. If the page under analysis isn’t found, then the field contains a null value; otherwise, it gets the numeric rank.

    Now let’s pretend there are 3 rows. The SERP positions are 2, 4, and “not in top 100″, which is represented by a null. If you take the average of those three values, you’ll see your average rank is 3: (2+4)/2.

    If instead, your column contained 2, 4, and 0, the average rank would be 2: (2+4+0)/3.

    This is only one of the problem you run into, but underlines the point - it’s never okay to simply replace nulls with zeroes, as they are very different animals.

  • http://twitter.com/mitchmonsen Mitch Monsen

    Great work here, madam Annie. :)

    The day I discovered pivot tables is second only to the day I discovered importxml in my analytic life.

  • http://twitter.com/AnnieCushing Annie Cushing

    I would disagree that it’s “never” okay to replace nulls w/ 0′s. You overstate your point. That said, not all data sets are friendly to pivot tables. But in the example you state, you could easily replace the nulls w/ a value greater than 100 and then filter those rows out or just note that any value greater than 100 = null. Problem solved.

  • http://twitter.com/AnnieCushing Annie Cushing

    Right??? I’d include the day I learned how to use vlookups to that short list. :)

  • http://www.bigbags.tumblr.com Bags

     Mmmmm. Pistachios.

  • http://pulse.yahoo.com/_J6AB7B7E2MLHHWGL4SWKY5UEFM bizhacks.com

    Very nice article, very useful information and file, thank you!

    Go Yahoo!

  • http://watchthedarkknightrisex.wordpress.com/ Ravi Shankar

    Superlative stuff… Thanks for such a great Post. 

  • http://twitter.com/Anthony_Mac85 Anthony Mcloughlin

    Ahh – I’ve cracked it. You have to search for the keyword phrase first and then once the data loads showing you the top ten, you have to click on “Run full report”. I’ll look forward to having a play with the pivot tables tomorrow :) 

  • http://twitter.com/malcolm_gibb Malcolm Gibb

    Great post! I’ve always struggled building pivot tables but know how valuable they are in visualising different data sets. This post really helps!

  • Rajesh Magar

    Oh God that was nice and lengthy!

    yeah that’s the same situation (In picture ) I come to when it’s getting overflow with all such data. But post is so much helpful. Thanks Annie.

  • http://twitter.com/BarryLoughran Barry Loughran

    One new great feature is the ‘slicer’ function.  It’s unbelievable for creating dynamic charts and graphs.  I’ve set up some amazing charts that can be sliced by so many metrics.  It really did blow my mind when I found it out, as did grouping too :)

  • http://www.liquid-silver-marketing.co.uk/ Farky Rafiq

    I’ve been looking for this post everywhere, fortunately I
    posted it to my g+! (its like finding a bar of chocolate that I saved for later…
    Sweet!)

  • http://trafficdigital.com/ Mark Hughes

    Unfortunately I’m using Excel 2004 on a Mac and don’t get the same options in the pivot table –  I end up with this: http://imgur.com/qz7V9

    Looks like I’ll be going through the typically cumbersome Microsoft training sessions instead… disappointed!

  • http://twitter.com/AnnieCushing Annie Cushing

    Glad to help, Malcolm. :)

  • http://twitter.com/AnnieCushing Annie Cushing

    Ha! I know … I write monster posts. Glad it helped though. :)

  • http://twitter.com/AnnieCushing Annie Cushing

    Oh, I can hardly wait to cover pivot charts and slicers!

  • http://twitter.com/AnnieCushing Annie Cushing

    Yay, data = chocolate! Love it. :)

  • http://twitter.com/AnnieCushing Annie Cushing

    Yeah, just covering Mac and PC is very time-consuming. Going back two versions of each in a single post would be impossible. But I’ve been one of Microsoft’s critics of its subpar Mac version. Even 2011 is way inferior to 2010. You still can’t even create pivot charts in 2011, though PCs have had that functionality since 2000.

  • http://www.rachel-levy.com Rachel Levy

    Great post!  Are you able to share your “audit checklist” you referenced above? I’d love to see it!  

 

Get Our News, Everywhere!

Daily Email:

Follow Search Engine Land on Twitter @sengineland Like Search Engine Land on Facebook Follow Search Engine Land on Google+ Get the Search Engine Land Feed Connect with Search Engine Land on LinkedIn Check out our Tumblr! See us on Pinterest

 
 

Click to watch SMX conference video

Join us at one of our SMX or MarTech events:

United States

Europe

Australia & China

Learn more about: SMX | MarTech


Free Daily Search News Recap!

SearchCap is a once-per-day newsletter update - sign up below and get the news delivered to you!

 


 

Search Engine Land Periodic Table of SEO Success Factors

Get Your Copy
Read The Full SEO Guide