Google’s Plan To Withhold Search Data & Create New Advertisers

How do you convince a bunch of publishers to buy ads on your service? If you’re Google, how about withholding data from those publishers about how people are finding their sites, as a way to drive them into your ad system? I don’t think Google planned for this to happen. But that’s the end result, in what could be called the “Not Provided” scheme.

“Dark Google” Nearly Two Years Old

In October 2011 — nearly two years ago now — “Dark Google” was born. Google began holding back information it previously gave to publishers for free, “referrer data” that let publishers understand the exact terms people used when they searched on Google and then clicked to publisher sites.

Did your site get found on Google when someone searched for things like “iphone cases,” “reasons why the US might attack Syria” or “help for a bullied child”? In the past, you could tell. Google provided that data using an industry-standard mechanism that analytics tools could easily tap into.
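
To be concrete about what that mechanism was: the referrer is simply the URL of the Google results page the visitor clicked from, and the search terms sat in its “q” parameter, where any analytics package could read them. Here’s a minimal sketch of that parsing step in Python (the helper name is mine, purely for illustration):

```python
from urllib.parse import urlparse, parse_qs

def search_term_from_referrer(referrer):
    """Pull the search query out of a Google referrer URL, if present."""
    parsed = urlparse(referrer)
    if "google." not in parsed.netloc:
        return None  # not a Google referral at all
    # Pre-encryption, clicks arrived with the query in the "q" parameter.
    terms = parse_qs(parsed.query).get("q", [None])[0]
    return terms or None  # encrypted search omits the terms: "not provided"

print(search_term_from_referrer("http://www.google.com/search?q=iphone+cases"))
# prints: iphone cases
print(search_term_from_referrer("https://www.google.com/"))
# prints: None (what analytics tools report as "not provided")
```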

Originally, only a small percentage of terms was being withheld, showing up as “not provided,” the label Google Analytics uses in these cases. But over time, the percentage for many sites has risen, making some wonder if eventually we’ll be living in a 100% not provided world.

Withheld For Privacy, Despite Deliberate Loopholes

Why did Google break a system that had been in existence even before Google itself? Google said it was done to better protect the privacy of searchers. People might search for sensitive information, so by withholding search terms, Google felt it was preventing any eavesdropping or leakage that might happen.

It’s a good reason, one I agree with in general. But it was also a flawed move, because Google still allows sensitive search terms to potentially leak in three different ways. These are:

1) Search terms that get suggested by Google Instant autocomplete

2) Search terms that Google provides to publishers through its Google Webmaster Central service

3) Search terms that Google continues to transmit across the open web to its advertisers

The latter loophole is especially noteworthy. Google expressly went out of its way to ensure that its advertisers could still receive referrer information the traditional way, with no need to log in to some closed Google system.

When Google first shifted to this new system, I wrote that the third loophole was putting a price on privacy. Google appeared willing to protect privacy up to the point where it got too pricey for itself. Having a bunch of irritated, angry advertisers might have proved too expensive.

Historical Data Not Archived

Google’s biggest defense against accusations that this was all done to increase ad revenues has been the second loophole on the list. Publishers can use Google Webmaster Central, log in and see the terms driving traffic to their sites, all for free.

There are caveats, however. You’ll only see the last 90 days’ worth of data, and only for the top 2,000 queries sending traffic to your site on any particular day. The number of queries used to be smaller, but Google expanded it in early 2012. Because the exact terms can change each day, publishers can potentially see many thousands of different queries.

Personally, I think the “depth” of queries is great. For many sites, even the top 100 or 200 queries sending them traffic might encompass a huge chunk of their visitors. But the historical data for many sites has been lost, and continues to be, because of that 90-day window.

Want to know how your top terms compare today to a year ago, or how traffic from those terms has changed? You can’t do that in Google Webmaster Central, because Google won’t store the data any longer than that.

I’ve repeatedly asked Google why it doesn’t expand the period of time that this data is retained. After all, it was more than happy to store that data when it was transmitted the old way, for anyone who wanted to capture it via Google Analytics.

Here’s my most recent discussion, with Matt Cutts, the head of Google’s web spam team (the video should start at the right spot; if not, it’s 38:15 in):

Google’s usual response is to point to a Python script it created, for those who want to download the data programmatically. That’s a bad solution for many publishers, in my opinion. It’s like telling them they can only use Google Analytics if they set up a routine that automatically forwards their server logs each day. It’s not easy. It’s not how Google usually aims to serve its users, of which publishers are a key constituency.
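
To spell out what that answer really asks of publishers: the script has to be wrapped in a job that runs every single day, because anything not captured inside the 90-day window is gone for good. Here’s a rough sketch of that chore in Python, with the actual download step left as a placeholder (fetch_top_queries is hypothetical, not a real Google call):

```python
import csv
import datetime
import pathlib

ARCHIVE_DIR = pathlib.Path("wmt_archive")

def fetch_top_queries():
    """Placeholder for whatever download step a publisher wires up
    (Google's script, a manual export, etc.). Should return rows of
    (query, clicks) for the current day."""
    raise NotImplementedError("connect this to your own download step")

def archive_todays_queries():
    """Write today's top-query data to a date-stamped CSV so it
    survives past the 90-day window."""
    ARCHIVE_DIR.mkdir(exist_ok=True)
    today = datetime.date.today().isoformat()
    out_path = ARCHIVE_DIR / "top-queries-{}.csv".format(today)
    with out_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["query", "clicks"])
        writer.writerows(fetch_top_queries())

if __name__ == "__main__":
    # Schedule with cron (or similar) to run daily, indefinitely.
    archive_todays_queries()
```

That’s the kind of ongoing plumbing a small publisher is being asked to maintain just to hold onto its own historical data.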

The other answer has repeatedly been the standard Google “we’ll consider it” type of response. Two years in, what else needs to be considered? Clearly, the inaction on expanding the time period shows it’s not a Google priority.

Now: Unlimited, Easy Archiving — For AdWords Accounts

Things changed dramatically at the end of last month. Quietly, Google announced a new “Paid & Organic” report for those with AdWords accounts.

Want to store those search terms Google’s been withholding and dropping out of Google Webmaster Central after 90 days? Just sign up for AdWords. Allow it to link to your Google Webmaster Central account. It’ll start pulling the search term data out of there constantly — no Python script required.

Want to know your top terms, after doing this? Select “All Online Campaigns,” and make an empty campaign if you don’t already have one. Then go to the “Dimensions” tab, change “View” to “Paid & organic,” and there are all your stats. You’ll see your top terms, sortable by clicks, queries and other metrics.

The good news is that you don’t have to be a paying AdWords customer to do this. You just need an AdWords account. The bad news is that it feels wrong for Google to force publishers into its ad interface to get information about their “non-paid” listings. It also suggests an attempt to upsell people on the idea of buying AdWords, if they aren’t already.

Planned Or Not, It’s The Wrong Signal To Send

I don’t believe things were orchestrated this way, with terms being withheld to push AdWords. I really don’t.

I think the search team that wanted to encrypt and strip referrer information had the best intentions, that it really did believe sensitive information could be in referrer data (and it can) and sought to protect this.

I think AdWords continued to transmit it because ultimately, the search team couldn’t veto that department’s decision.

But regardless of the intentions, the end result looks bad. Google instituted a system pitched as if it was protecting user privacy yet which had three major loopholes, including an explicit one for its own advertisers. Now, it has a system that further encourages people to use the AdWords system.

In the end, it makes it seem as if Google — which has a symbiotic relationship with publishers — doesn’t want to keep them fully apprised of how they are being found unless it has a better chance of earning ad revenue from them. That’s a terrible message to send, but that’s the one that’s going out.

There’s one bit of good news. I asked Google for any comment about all this. I was told:

We plan to provide a year’s worth of data in the Top Search Queries report in Webmaster Tools.

There’s no set timing on when this will happen, but I’d expect it to be relatively soon. That will be welcomed. Even better would be if the data could be archived for as long as people want, just as they can now do if they agree to flow it into AdWords.

About The Author: Danny Sullivan is a Founding Editor of Search Engine Land. He’s a widely cited authority on search engines and search marketing issues who has covered the space since 1996. Danny also serves as Chief Content Officer for Third Door Media, which publishes Search Engine Land and produces the SMX: Search Marketing Expo conference series. He has a personal blog called Daggle (and keeps his disclosures page there). He can be found on Facebook and Google+, and microblogs on Twitter as @dannysullivan.

Comments

  • James Davis

    Thanks for this, Danny. We’ve been struggling with this problem and it only seems to be getting worse. For the last 30 days, 65% of all traffic in Analytics for our main site is “not set” or “not provided.” In WMT, Google reports we took 2,500 clicks for the same period, but only reports 400 of those clicks – ALL of which were spread across just 2 branded keywords. This means that the data Google serves us is almost useless. Unfortunately, the problem is that although we have at times spent as much as $30,000 per month in AdWords, the data is not the same and cannot be used the same way. When you develop AdWords ads, the text, keywords, message, landing page and action are all different than what is targeted and set up organically. So while there is valuable data to be mined with a decent spend in AdWords, it doesn’t correspond – at least in our industry – with organic data at all.

  • Tom Roberts

    “I don’t think Google planned for this to happen. But that’s the end result”

    Come on Danny. I don’t believe for one second you’d be this naive, you’re too skart of a cookie. Of course it was deliberate.

  • http://www.seoconsultant.ie SEO Consultant

    The problem is if you use Marketo / Salesforce – to store your search queries. There you still get ‘Not Provided’, since you can not get Marketo or any similar tool to log into AdWords. :(

  • John Broadbent

    Well I guess we need to prepare for everything.

  • joeyoungblood

    Hooray!!!!!!!!!!!!!! I’ve been saying this since they launched NP. It’s a move to make more $$ in the long run and knock out ‘search retargeting’ services.

  • joeyoungblood

    Unfortunately, I agree. Since Larry Page took back CEO control and Marissa Mayer was moved to Maps/Local, the blurring of user experience/privacy with making Google more long-run money has gotten exponentially worse. Google is a public company, and making small changes a little bit at a time to adjust behavior is how it can make more long-run money for another decade to come, unless it starts losing market share.

  • joeyoungblood

    “not set” I believe is PPC traffic being incorrectly assigned, check your tagging. https://support.google.com/analytics/answer/2820717?hl=en

  • Colin Guidi

    yea that’s correct

  • Colin Guidi

    I want to link the two properties, but I’m on the fence. Hypothesizing what Google might do when SEOs open the fire hose from WMTs to AdWords.

    “Google is forcing publishers into its ad interface to get information about their “non-paid” listings”

    I mean, they already have this data, but what will happen when we readily make it available through linking the two? This, I’m not sure about.

  • joeyoungblood

    Google is also taking what is left of keywords with their new “local carousel”. When you click on a business image/name Google runs a new search query for their brand name. This would pass on branded keywords to the site instead of the actual search query.

    I think this helps add to the evidence that Google is not doing this haphazardly or accidentally.

  • jnffarrell1

    Google’s action was inadvertent. It makes sense. For historical purposes, advertisers can journal their own historical data by storing today’s results for comparison next year. Moreover, 90-day data access may be part of negotiations with the EU. Furthermore, data encryption is coming to Google’s cloud, and advertisers will have access to only the last 90 days.

  • Catfish Comstock

    If it isn’t too big of a concern to apply to paid traffic, it shouldn’t be too big to apply to organic. Otherwise it’s exactly what it sounds like it is. If they would at least make a brand / non-brand filter for your Keyword Unavailable traffic that you could see in Webmaster Tools, you could at least know where your traffic was coming from. Right now, it’s impossible for businesses to scale because they don’t know if the uptick or decrease in traffic is due to their brand being more or less popular, or if they actually have improved their footprint across relevant non-brand traffic. This whole thing is really ridiculous, and Google has yet to prove that what it did was really all that necessary (but not necessary enough for paid searches). I mean, if you think about it, it’s a pretty indefensible position that reeks of greed and manipulation (which is how most people feel about the whole thing).

  • http://www.instantatlas.com/ David E Carey

    ‘Duh! Really do you dink sooo?’ Never thought for a moment that this strategy by Google had anything to do with the ‘privacy’ of the searcher. I find myself (as do my competitors) having, to some degree, to introduce PPC into the online marketing mix. Funny how I can see many more keyword phrases reported in my analytics for leads that originate via the PPC channel – mmmmm!

  • Tom

    The lack of keyword referral data also results in a worse user experience for the searcher, by preventing personalisation on the resulting website based on the search term. If Google really believes this is an issue of user privacy, it should only pass referrer data to SSL sites. Within months a large proportion of the web would move to SSL, also helping Google in its newfound battle to stop the NSA tracking everything and everyone ( http://g.re.af/gnsa ).

  • Dave

    For the past few days, I have been redirected to the https version of Google search, irrespective of which country I searched from, even though I am not signed in. This is happening in IE, Firefox and Chrome. This means more “not provided,” if it’s going to be implemented internationally.

  • Scott Davis

    (not set) is direct traffic and did not come through with a keyword by definition.

    I feel your pain though. I’m dealing with 52% (not provided) myself.

  • Colin Guidi

    That’s not true. (not set) can come from several different reasons, but the most common (from what I’ve seen) is when you don’t properly tag your PPC campaigns. For example if you run in Bing and don’t manually tag your URLs.

    Go to joeyoungblood’s link above; it’ll explain it for you.

    And direct traffic is traffic directly driven to your website, therefore users bypass SERPs and do not enter keywords but rather your web address, so of course there wouldn’t be keywords associated with it.

  • Scott Davis

    If you go into the keyword view under direct traffic, you’ll see that all of the keyword information is (not set).

    You’ll have to do this by changing the primary dimension to Traffic Source -> Keywords

  • http://searchengineland.com/ Danny Sullivan

    For various reasons, I don’t think it was planned this way, as I explained. But as I also explained, it doesn’t really matter. Planned or not, the end result is that it might as well have been planned. But I did drop “inadvertent” from the headline to punch that up a bit more, and put the emphasis back on the end result.

  • Tom Roberts

    And I just realised I tried to spell “smart” like “skart”.

    But thanks for the follow up Danny. You’re definitely a smart cookie, not a cookie made from a skart/scart lead.

  • Matt O’Toole

    Having a brand/non-brand filter wouldn’t work perfectly as you’d be entering brand terms blind, without knowing if someone has mis-typed or otherwise used a branded phrase you just hadn’t thought of; you’d have to add in an almost endless list of branded term possibilities.

  • http://www.archology.com/ Jenny Halasz

    I also think it was accidental at first, but that as time goes on, it becomes more and more about increasing the bottom line. Not provided, combined ad/organic data and the recent changes to the keyword tool all point to one thing… increase the number of advertisers and the number of keywords they are bidding on. Sorry to link drop, but we covered both sides of this issue on the Archology blog last week… you might be interested in reading: http://www.archology.com/organic-data-and-not-provided/

  • PeterD

    If privacy was the “main concern”, then why isn’t Adwords referral data blocked, too?

    And are we to believe Google just “stumble about” making these significant changes without realizing the consequences? Really?

    I think they know their business very well indeed.

  • Spreading Wisom

    Nobody in their right mind would believe that this was “accidental”. Not buying that “privacy” BS either… queries in analytics are anonymous. It’s not like there’s an IP address attached to the keyword or a Facebook account.

  • Garrett Rent

    So, when does the paid version of GA (with “enhanced” keyword visibility) release? (Over 10,000,000 sites using GA) x ($19.99/mo.) = serious overnight incremental revenue stream. And, with GA you hook all the SEOs and SMBs that don’t wish to (or can’t afford to) play w/ Adwords each month.

  • http://searchengineland.com/ Danny Sullivan

    Heh. I have some old scart leads in my junk box. HDMI is so much better :)

  • Cathy Dunham

    I first noticed this dramatic drop in one of our client’s organic keyword totals two days ago… so I’ve been searching diligently for some solid, credible answers that explain why this occurred. Thanks, Danny and Matt, for filling us in. However, let me add my two cents:

    Dear Matt Cutts,

    The hardest part to understand is that Google has increasingly promoted “writing quality content that’s relevant to your users”. I totally agree! And, I’ve been using those keyword search terms for years to help me define the more specific details of content my clients’ search visitors want to find. We’re just trying to do our best to provide a richer user experience (in total agreement with what Matt Cutts has been saying). And now you chop off our most valuable resource tool? DUH! And not to mention how totally painful and useless the Keyword Planner Tool is now compared to last year! I’m usually a mild-mannered speaker who rarely swears, but WTF?

    Affectionately Google-friendly Optimization Strategist,
    Cathy Dunham

  • natfinn

    We’ve always told our clients two things when it comes to Google:
    1) follow the money in regards to their product updates &
    2) for real keyword data, trust a CPC campaign’s information. We never thought it would really come to this.

    #DarkGoogle has really squeezed the sales funnel this time.

  • Catfish Comstock

    I didn’t say it would work perfectly, but at least a trend against typical brand terms like the company name and product names would be a good start, as those are the brand terms that get most of the traffic. Eliminating traffic numbers for those terms would at least give you an 80 to 90% view of your brand traffic and would allow you to make year-over-year comparisons that were more meaningful in terms of understanding whether you have improved your SEO performance.

  • Catfish Comstock

    As soon as shareholders figure out that revenue stream you can count on it. Unfortunately Google going public was the end of do no evil.
