Why Clean Source Tagging Is Worth Your Time

Messy, incomprehensible analytics make my stomach churn. Just knowing that I’m going to spend the next several hours cleaning up sloppy data puts the kibosh on my day.

The problem with Google Analytics, or any analytics package for that matter, is that even if my site is properly tagged and I’ve developed a systematic inbound link tagging notation, it’s almost certain that my Source/Medium/Referral URL reports will still include inconsistencies that need to be resolved to provide accurate reporting:

Software doesn’t know what isn’t logically defined. Google does a decent job of auto-tagging referring traffic and categorizing it into logical Source/Medium buckets.

However, a bunch of stuff slips through the cracks into the catch-all “Referral” bucket. Over time, the bucket grows big enough to constitute a very significant percentage of overall traffic and needs to be addressed, cleaned up, and recategorized in order to provide accurate Source/Medium reporting.

A classic example in the table above is bingiton.com. Google Analytics can’t tell what the site is, so it tags the traffic as a referral.

However, a visit to the site makes it clear that it compares Google and Bing organic results side by side. The source can therefore be either Bing or Google (although the site’s goal is to show that Bing’s organic results are better than Google’s), and the medium should be Organic, since the results clearly come from a search engine, as shown below.

searching [analytics] on bingiton.com

The reality is that Google Analytics will never be perfect at tagging inbound traffic. Requiring partners to adhere to a clean inbound link format prevents part of the problem, but at some point it can and will get out of our control. That said, GA reports are a perfect starting point for your in-house analytics team to work from when producing a truly accurate performance report.

The good news is that there is a solution, and it’s not too difficult to put in place. At my company, we call it Clean Source Tagging. The idea is that there’s a raw source and a clean source. We take what GA gives us (the raw source), manually review it for accuracy, and compile an Excel database that pairs every raw traffic source with a reviewed and approved clean source.

If you don’t recognize a source, visit the site to determine what it actually is. We can even get a little fancy with the output and include primary source, secondary source, and medium to gather all relevant information:

We then take the standard GA Source/Medium output and run a simple VLOOKUP against the clean source table to associate each referral with the correct source and medium.
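For teams that would rather script the lookup than maintain it in Excel, the same idea can be sketched in Python. This is only an illustration under assumptions, not the workflow used at my company: the table contents, the field names (`primary_source`, `secondary_source`, `medium`), and the `clean_row` helper are all hypothetical, and classifying bingiton.com under Bing is just one of the two choices discussed earlier.

```python
# Hypothetical clean source table: raw GA source -> reviewed, approved values.
# Every entry is illustrative; a real table comes out of manual review.
CLEAN_SOURCES = {
    "bingiton.com": {
        "primary_source": "Bing",
        "secondary_source": "bingiton.com",
        "medium": "Organic",
    },
}


def clean_row(raw_source, raw_medium):
    """Return the cleaned classification for one GA Source/Medium row.

    Behaves like an exact-match VLOOKUP: when the raw source is not in
    the table, the raw values pass through and the row is flagged as
    not yet reviewed.
    """
    key = raw_source.strip().lower()  # normalize case and stray spaces
    match = CLEAN_SOURCES.get(key)
    if match is None:
        return {
            "primary_source": raw_source,
            "secondary_source": "",
            "medium": raw_medium,
            "reviewed": False,
        }
    return {**match, "reviewed": True}
```

Calling `clean_row("bingiton.com", "referral")` returns the Bing/Organic entry, while any source missing from the table comes back unchanged with `reviewed` set to `False`, so nothing is silently reclassified.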

If it sounds like a lot of work, that’s because it is. But it’s a one-time project with a small ongoing component to keep the database fresh and up to date. The bulk of referring URLs are static week-over-week.

The initial project may take a while, but once the database exists, the weekly routine is short: run the VLOOKUP and address any new unknown referrals. It’s a quick job even an intern can handle in under an hour per week.
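The weekly check for new referrals amounts to a set difference between this week's raw sources and the sources already reviewed. A minimal sketch, assuming both lists are available as plain strings (the function name `find_unreviewed` is hypothetical):

```python
def find_unreviewed(raw_sources, known_sources):
    """List raw sources that are missing from the clean source table.

    Comparison ignores case and surrounding whitespace. Each unknown
    source is reported once, in the order it first appears.
    """
    known = {s.strip().lower() for s in known_sources}
    seen = set()
    unknown = []
    for source in raw_sources:
        key = source.strip().lower()
        if key not in known and key not in seen:
            seen.add(key)
            unknown.append(key)
    return unknown
```

Whatever this returns is the week's review list: visit each site, decide on its clean source and medium, and add it to the table.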

So, now that we’ve got a clean source table, what’s next? Dashboards. The clean source table gives us the ability to create the simple, intuitive and insightful dashboards executives crave:

With a little extra work up front, we can create a beautiful Excel template, drop new data in, update the clean source table, and deliver source performance reports in no time. What was previously an irrelevant and unusable report in GA, or a mountain of work to get right, becomes a nearly automated solution that is a staple of weekly reporting.
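The roll-up behind such a report is a sum of traffic grouped by clean source and medium. The sketch below assumes the GA export has been reduced to `(raw_source, raw_medium, sessions)` tuples and that the clean source table is a dictionary keyed by lowercase raw source; both shapes, and the name `summarize`, are assumptions made for illustration.

```python
from collections import defaultdict


def summarize(rows, clean_sources):
    """Total sessions per (clean source, clean medium) pair.

    rows: iterable of (raw_source, raw_medium, sessions) tuples.
    clean_sources: dict mapping lowercase raw source -> (source, medium).
    Rows with no entry in the table keep their raw source and medium.
    """
    totals = defaultdict(int)
    for raw_source, raw_medium, sessions in rows:
        key = raw_source.strip().lower()
        source, medium = clean_sources.get(key, (raw_source, raw_medium))
        totals[(source, medium)] += sessions
    return dict(totals)
```

With bingiton.com mapped to a search engine's organic bucket, its sessions are added to that bucket's total instead of sitting in the catch-all referral line.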

Now you can hit snooze one more time on Monday morning. You earned it.

Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.



About The Author: The author is the Vice President of Performance Marketing and Analytics at SellPoints and is based in the San Francisco Bay Area.

  • harryfassett

    Much cleaner data and better visual of what is really going on, i.e. much more detailed reporting. Very good article Benny!

