Google Search History Expands, Becomes Web History
Google’s Search History feature, which was switched on as a default option for many Google searchers in February, has now been renamed Web History to reflect how it has expanded to track what Google users do as they surf the web. It’s a huge move for Google and raises privacy issues anew. Below, a detailed look at how the system works, how to pause or delete logging if you want, the impact on search results and more.
This is a big story, and not all parts may be of interest to everyone. If you want to skip ahead, use the links to jump to particular sections:
- Web History Depends On Google Toolbar
- Pushing The Google Toolbar
- Using Your Web History
- Browsing Page Visits
- Searching & What Gets Stored
- Pausing Web History
- Deleting Web History
- Permanently Ending Web History
- Toolbar Alternatives
- Web History, Personalized Search & Closing The Loop
- Should You Worry?
Web history is tied to the Google Toolbar. The Google Toolbar — first released back in December 2000 — has long had the ability to track whatever a user views across the web. This only happened if the toolbar’s PageRank meter was enabled. By default, the PageRank meter was NOT switched on. In fact, it’s long been joked that the relatively few people who do turn it on are SEOs who obsess over getting links from pages with high PageRank.
That’s now changing. If you download the Google Toolbar directly from the Google Toolbar page, it will still be the case that the PageRank meter will NOT be switched on. However, Google will now begin prompting users in various ways to download a version where the PageRank tracking feature IS switched on or to get those with the toolbar already installed to switch it over.
Let’s go through the options as Google has explained them to me:
- Virgin Searcher: You come to Google WITHOUT a Google Account, which you get only if you sign-up with an email address for various Google services such as Gmail or Google Analytics. In other words, you just came to search. Google is NOT going to start keeping track of your web history or your searches, beyond the general and pretty anonymous logging any search engine does (see Google Anonymizing Search Records To Protect Privacy for much more about this). You might see messages and prompts encouraging you to try Web History or the Google Toolbar, however (Google’s long had promotions like this). If you select one of these options, you’ll get a version of the Google Toolbar with tracking enabled.
- Google Account Holder: Have a Google Account and log into your main page for some reason? Soon, you should start seeing a prompt to try Web History.
- Google Toolbar User Without PageRank: Come to Google with the Google Toolbar installed, but without the PageRank meter on, and Google says you should start seeing messages to sign-up for Web History. Do that, and the PageRank meter will be enabled.
- Google Toolbar User With PageRank: Come to Google, and you’ll be prompted to sign-up for Web History if you aren’t already enrolled. You won’t need to switch PageRank on because it’s already enabled!
- Search History User Without The Toolbar: Many people have had the Search History feature enabled over the past few weeks, because of the change Google made back in February. If you visit your Search History area, you’ll see it changed to Web History and Google will prompt you to get the Google Toolbar with tracking.
Is this retroactive? That is, if you’ve had the toolbar for years, does all that history flow in? No. Only web surfing history from when you enrolled in Web History will be logged.
So far, in the testing I’ve been doing, I haven’t seen some of the prompts appear. Some might not be live yet. But here are some examples.
A “virgin” searcher who comes to Google with no toolbar and opens an account for the first time will see a Web History link at the top of their search results.
Clicking on the Web History link makes this “Welcome to Web History” page come up with a big button prompting the searcher to “Enable Web History and Install Toolbar.”
Have the Google Toolbar and a Google Account but do NOT have the PageRank meter on? The software disclosure box disappears and the Enable button now says “Enable Web History and PageRank.”
Clicking on the button causes my browser to quickly download a software applet, which flips PageRank on.
What if you come to Google with the Google Toolbar already having PageRank enabled — plus you have a Google account? The “Enable” button asks you simply to “Enable Web History.”
I selected the “Enable Web History” button, and that was it. Pages I viewed then started getting logged.
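To recap the three cases above, here is a minimal Python sketch of which button label I saw for each setup. The function name and its parameters are hypothetical, made up purely for illustration. It summarizes what appeared on screen in my testing and is not how Google actually builds the page.

```python
def enable_button_label(has_toolbar: bool, pagerank_on: bool) -> str:
    """Return the button label seen on the "Welcome to Web History" page.

    Hypothetical helper: it only restates the three cases described in
    this article, not Google's real logic.
    """
    if not has_toolbar:
        # No toolbar yet: the button also installs it.
        return "Enable Web History and Install Toolbar"
    if not pagerank_on:
        # Toolbar present, PageRank meter off: the button flips it on.
        return "Enable Web History and PageRank"
    # Toolbar present and PageRank already on.
    return "Enable Web History"


for toolbar, pagerank in [(False, False), (True, False), (True, True)]:
    print(toolbar, pagerank, "->", enable_button_label(toolbar, pagerank))
```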
In the steps above, I also saw some inconsistencies. Sometimes Google would log me out, then make me log back in, when I selected the Web History link. Other times, that link would say Search History. This is all likely due to the rollout being in progress and me possibly switching between data centers.
Let’s get more into what’s recorded. Let’s also assume in this case that you love the idea.
To reach your web history, use that “Web History” link at the top of any search results page, or you can go to it directly here. As shown above, pages you visit on the web get logged.
Unfortunately, there’s no way to filter to see just the pages I’ve visited, which I think is a huge oversight. A key pitch behind this product is that you can have all the pages you’ve visited at your fingertips. Except you can’t, not unless you don’t mind them being mixed with all your searches.
Some other annoyances. For one, I’d find Google would switch me out of my test account and into my main Google account for no apparent reason. This might be related to having Google Talk using my main account at the same time I had my browser using the test account. Most people probably don’t have several different Google accounts, but still — it’s disturbing to see a switch like that. It could also mean that data you never wanted to go into one account could start flowing there.
Visit several pages from one site? The feature nicely consolidates them. Still, it’s a bit buggy. That “WordPress > Error” title shows up for a page that actually loaded fine. You can see “http://mattcutts.com” has no title shown despite the page actually having a proper title tag.
In addition, visit the same site several times throughout a day and each session will be consolidated separately, rather than all of them into one.
The bottom red highlighted section shows when I went to mattcutts.com at 12:41am my time and viewed a few pages. Those all got consolidated around the time of that visit. Then I went back at 1:00am. All the pages from that second visit were consolidated separately from my previous visit, as the top red highlighted section shows.
“Session” consolidation like this is handy if you want to see your browsing chronologically. But if you want to see all visits to a particular site across, say, an entire day, week or month, this doesn’t seem possible.
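To make the session idea concrete, here is a rough Python sketch of how visits to one site might be grouped by session. Everything in it is an assumption for illustration: Google has not said what separates one session from the next, so the 10-minute gap, the `consolidate` function, the date and the page paths are all made up. Only the 12:41am and 1:00am visit times come from my test.

```python
from datetime import datetime, timedelta
from urllib.parse import urlparse

# Assumed cutoff: the real rule Google uses is not documented.
SESSION_GAP = timedelta(minutes=10)


def consolidate(visits):
    """Group (timestamp, url) visits into per-site sessions.

    A visit joins the latest session for its site when it lands within
    SESSION_GAP of that session's last visit; otherwise it starts a
    new session.
    """
    sessions = []  # each item: {"site", "start", "end", "urls"}
    latest = {}    # site -> most recent session for that site
    for ts, url in sorted(visits):
        site = urlparse(url).netloc
        current = latest.get(site)
        if current and ts - current["end"] <= SESSION_GAP:
            current["urls"].append(url)
            current["end"] = ts
        else:
            current = {"site": site, "start": ts, "end": ts, "urls": [url]}
            sessions.append(current)
            latest[site] = current
    return sessions


# Hypothetical visits modeled on the two trips to mattcutts.com.
VISITS = [
    (datetime(2007, 4, 20, 0, 41), "http://mattcutts.com/"),
    (datetime(2007, 4, 20, 0, 44), "http://mattcutts.com/blog/"),
    (datetime(2007, 4, 20, 1, 0), "http://mattcutts.com/"),
    (datetime(2007, 4, 20, 1, 3), "http://mattcutts.com/blog/"),
]

for session in consolidate(VISITS):
    print(session["site"], session["start"].time(), len(session["urls"]), "pages")
```

Grouping on the site alone, with no time cutoff, would give the per-site view across a day, week or month that Web History currently lacks.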
How about doing a search to make it happen? That kind of works. There’s a search box at the top of your Web History section called “Search History.” This lets you search against items you’ve visited. A search for [http://mattcutts.com] told me I had no items, which surprised me. I expected the URLs to be at least matched. Switching to [mattcutts.com] didn’t help, either. Using the site command, [site:mattcutts.com], worked to give me five matches.
I could even use the links to the far right and top of page to sort the list by date (by default, it’s by relevance).
Of course, most searchers have no idea of the site: command, so this “feature” is largely invisible to typical Web History users. Also, [site:www.flickr.com] failed to find the various visits I made to Flickr, so the command might not always work perfectly (it did work for several other sites I tried).
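For readers unfamiliar with the site: command, this short Python sketch shows the general idea of filtering a history list down to one host. It is not Google's implementation. The `search_history` function, the sample entries, the decision to treat a leading "www." as optional and the newest-first ordering are all assumptions of mine.

```python
from urllib.parse import urlparse


def strip_www(host):
    """Drop a leading "www." so both forms of a host compare equal."""
    return host[4:] if host.startswith("www.") else host


def search_history(history, query):
    """Filter (date, url) entries using a site:-style query.

    Hypothetical matching rule: the entry's host must equal the
    requested host or be a subdomain of it.
    """
    if not query.startswith("site:"):
        raise ValueError("this sketch only handles site: queries")
    wanted = strip_www(query[len("site:"):].lower())
    matches = []
    for date, url in history:
        host = strip_www(urlparse(url).hostname or "")
        if host == wanted or host.endswith("." + wanted):
            matches.append((date, url))
    # Sorted by date, newest first (an assumed ordering).
    return sorted(matches, reverse=True)


# Made-up history entries for illustration.
HISTORY = [
    ("2007-04-19", "http://mattcutts.com/blog/"),
    ("2007-04-20", "http://www.flickr.com/photos/"),
    ("2007-04-20", "http://mattcutts.com/"),
]

for entry in search_history(HISTORY, "site:mattcutts.com"):
    print(entry)
```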
Speaking of searching, what exactly are you searching against? The Google Blog post about the new feature says:
Imagine being able to search over the full text of pages you’ve visited online and finding that one particular quote you remember reading somewhere months ago.
This suggests that when you visit a page, Google is making a copy of that page — exactly when you visited it — and saving the entire text at that time. Gary Price has been testing this and believes that’s the case. I’m not so sure, because I don’t know how he can test a page from 2006 that’s been recorded in Web History when it only started recording pages today.
Let me explain this more. Search History has always recorded any pages you visited after clicking through to them from Google search results. In some testing I’m doing, it also does seem to be the case that when you would click this way, Google would make a copy of that page stored within your own Search History area, as of the time you visited.
Now when you go to pages across the web, without clicking on them from search results, Google seems to be making copies of these pages as well. I’ll check on this, but if so, it leads to another issue. Is Google making copies of pages that site owners have blocked from spidering or caching (for more on blocking, see yesterday’s Google Releases Improved Content Removal Tools article)? If so, should it be doing this?
Let’s talk now about pausing Web History. There may be times when you decide you don’t want the pages you visit to be recorded, even though you like recording in general.
The most straight-forward way is to use the Pause link within your Web History screen. Look to the left-hand side, and you’ll see it near the bottom of the link list.
Click it, and recording of everything, both the searches you do and the pages you visit, stops. It won’t resume until you select the Resume link (which replaces the Pause link you originally clicked on).
Unfortunately, there’s no way to selectively pause items. Perhaps you’re OK with your searches being recorded but want to pause only web page visits from being stored. You can’t do that.
Also unfortunately, there’s no Pause button on the toolbar itself. That would be ideal. If you’re browsing the web, it’s a pain to have to go to Google, push Pause on a web page, then go back to your surfing.
As a workaround, you can sign out of Google using the toolbar. Look for the Settings button, then click on that. In the drop-down, you’ll see the account you’re using to sign in to Google.
Sign out, and that’s it. What you visit is no longer recorded. Just keep in mind that if you go back and use ANY Google service that requires being signed in, you’ll cause recording to resume. If you want a pause that reliably stays in place, you need to use the Pause link.
Been using Web History for a bit and decide there are some searches, or places you’ve been, that you’d rather not have recorded any longer? Go back to those links on the left-hand side of the Web History page and find the “Remove items” option.
It allows you to select all items listed on a page, or your entire search history, if you want. Alternatively, you can tick items individually to wipe them out. If you decide to wipe out everything, this also automatically pauses your Web History going forward. Pretty smart, that: Google assumes that if you want to kill everything, you probably don’t want more stuff recorded.
Be aware that while deleting wipes out material from your Web History, potentially some of the information is still available in two different ways:
- Information on what you searched for might still be associated with your IP address in server log data. My Google Anonymizing Search Records To Protect Privacy article from last month goes into detail about server logs. For most people, this really isn’t something to worry about. The data stored within those server logs is far, far less identified with you than what is stored with your Web History profile. In fact, the server logs will NOT have any of your Google Toolbar tracking data. The Google Web History Privacy FAQ touches on the log issue when it mentions the “separate log system” that’s maintained.
- Web History data is also archived. These archives are not “retrievable in real-time by end users,” Google told me. But the data is ultimately retrievable. If Google itself decided it needed to pull the archives and check something, it could — even though you deleted the data in the “live” system. Similarly, a government agency could potentially legally compel Google to go to its archives and recover information that was deleted off a live system. In addition, while toolbar tracking data won’t be part of a Google server log, that data is being logged in some way — and archives of that data could be recovered. In short, if you really, really don’t want data recorded, don’t think deleting it after the fact is enough.
Back in February, I wrote the long Google Ramps Up Personalized Search article to explain how Search History was effectively the default for anyone new signing-up for any Google service. I cannot underscore enough the importance of this change. It instantly meant many more people than ever before were going to be getting their search data logged with Google — really logged, really profiled beyond the typical web server logging that any web site has happening.
Today’s change is going to push many more people into recording search history — as well as web browsing history. From what I can see so far, Google’s not trying to sneakily include people who already have accounts or the Google Toolbar into Web History. Those pages I mentioned above have pretty big buttons telling you something is going to happen — and it only happens if you decide to push one of them. Even those signing up for a Google account for the first time still get the big button option/warning. So there’s fair warning, but I expect there will be fair take-up as well — unless today’s move backfires on Google as simply too much for a company that continues to get more worrisome for people.
Down the line, some people might rethink having the service at all. You can remove it altogether. Log in to your Google Account (or select the “My Account” link at the top of any Google page, if you are logged in). In the “My Services” list, choose the “Edit” link, then select the “Delete Web History” link. Follow the instructions on the next screen, and that will permanently remove Web History from your account. Too many steps to follow? Click here, and it will jump you right to the delete option.
Sadly, there’s no way to remove Web History but keep only Search History, if you’ve upgraded. I can give you a workaround. You’ll have to delete Web History and lose all your saved data — including any saved searches. Once you’ve done this, go back to the main Google Accounts page, look in the “Try something new” list and select “Web History.” When you get the “Welcome To Web History” screen like I showed above, select the “Limit Web History to Searches” option. That will keep page visits from being logged if you still decide to use the Google Toolbar with the PageRank meter enabled.
Once you’ve done this, you’ll see this message as a reminder your web history is “limited” to searches.
FYI, if you enable Web History and then remove the Google Toolbar, obviously nothing gets logged. Google will keep telling you that Web History is enabled, but so far (I’ve tested), it doesn’t remind you that you need to get the toolbar going for it to fully work.
I suspect that the integration of the Google Toolbar with Web History is going to freak some people out — to the point they decide the toolbar needs to go completely. If that’s you, but you like some of the features of the Google Toolbar, there are some excellent workarounds.
First, get Groowe. For both Firefox and Internet Explorer, it faithfully imitates the key features of the Google Toolbar as well as many other search engines. I’ve used it for years. It’s one of those rare keeper tools that’s survived on my desktop. With a click, I can search against any of the major search engines — plus save to Digg or Delicious.
Second, SEOs should get Search Status for Firefox. Among many other awesome and useful things, it gives you the PageRank meter without the data going to Google, to the best of my knowledge. I’ve not tested what happens if you have Search Status pulling PageRank data while you have Web History enabled but do NOT have the Google Toolbar installed. If I have time, I’ll try this later. But I’m fairly sure you’ll be safe. Test it out yourself, and then someone comment below to let us all know!
Made it through all the how it works stuff? Let’s move into the why. Google is big on personalization. Big, big, big. Everyone else can keep their “wisdom of the crowds” stuff. For Google, getting up close and personal with individuals is seen as a big leap forward on many fronts — and 2007 is the year Google is going all out after it. Consider these stories we’ve covered over the past few weeks that are all related to personalized moves:
- Google Adds RSS Snippets To Personalized Home Page
- Google Ramps Up Personalized Search
- Google Offers ‘Themes’ For Personalized Homepage
- Google My Maps: Mashups For The Masses
- Google Offers “Queryless Search” & Personalized Recommendations
The more Google can know about you, the more it believes it can deliver you a better or more unique experience (not to mention more targeted ads). But in particular, personalization is seen as the next generational step in delivering better search results. My Google Ramps Up Personalized Search article I keep mentioning explains how the personalized search results at Google work in much more detail and the entire next generational idea.
Until now, the key factors to influence personalized rankings have been:
- Sites someone clicks on in search results
- Sites added to the Google Personalized Homepage
- Sites saved with Google Bookmarks
Now add a fourth — sites an individual visits as recorded with Web History will influence rankings, as well.
“In our view, we could do a better job personalizing if we had more data to personalize on. By having web history for most web sites you visit, that helps us understand more about you,” said Marissa Mayer, vice president of search products and user experience at Google. “Ultimately, we think our personalized search results will get much better.”
Tapping into the toolbar to help with rankings has long been expected. My Google: Master Of Closing The Loop? article from last week revisited this:
With all the things I’ve mentioned above — Google Toolbar, Google Analytics, Web Accelerator, AdWords conversion tracking — Google had always made noises or semi-reassurances that the data would somehow be “ring fenced” or not shared with other departments. Especially with Google Toolbar, we were repeatedly told it would not be used to find new web pages nor harnessed for ranking purposes.
So now it comes true: toolbar data will be used for ranking purposes, specifically to alter each individual’s unique search results. And I have no doubt we’ll see it used in aggregate, just as Google is certainly going to continue to “close the loop” by tapping into any data source it can. Some site owners will continue to be able to influence these sources (see 3 Ranking Survival Tips For Google’s New Personalized Results for some advice here). Others will return to that age-old advice that sites with good content and that seek to help visitors should have a better chance to filter to the top.
Trying to understand more about the personalization change, from a site owner’s perspective or just from Google’s view as a business strategy? Here are two more articles from us to check out:
- Google’s Marissa Mayer on Personalized Search
- Google’s Matt Cutts on Personalization and the Future of SEO
With today’s announcement, part of me wants to ring the alarm bell and shout “Uninstall your toolbar! Delete your Google account!” Because let’s face it. Google’s getting big, huge, giant. It’s no longer a joke that the once small, lovable company wants to conquer the world. The Google monster company really is gobbling it up, with no barriers seemingly left. The “we’re a tech company” charade is over from the very top, with CEO Eric Schmidt recently calling Google “an advertising company” in a Wired interview. As for the mission “to organize the world’s information and make it universally accessible and useful,” that’s not even listed in the four ways we’re told by him to think of Google:
- An advertising company
- An end-user system (to me, a combination operating system/super office suite of software)
- A giant supercomputer
- A social phenomenon
I remember when Google was a search engine, with a philosophy that said, “Google does search.” Now it puts ads on TV, in radio, in print — serves as a payment platform, provides web analytics, pitches software “packs” to us and more. Does it really need to have our web surfing histories as well? When’s enough enough?
Indeed, just today we had news that the European Union is likely to send Google a letter that it might be violating data retention laws. I can virtually guarantee you that whatever Google gets dinged on, Yahoo and Microsoft are probably doing the same. But no one focuses on them in terms of search privacy.
Moreover, I’m actually pretty annoyed at some of the privacy advocacy groups. When Google announced it would anonymize server data last month, I still saw some old school concerns that fairly anonymous cookie data and IP addresses were a privacy concern. C’mon — you want to be concerned about something, you get concerned about the fact Google has — and is growing — real honest-to-goodness personally identifiable profiles of individual searchers. And if you want to get concerned about that, also get concerned that Yahoo and Microsoft have similar profiling — just not as visible to the searcher.
Indeed, that visibility was another reason Mayer said Google was making today’s Web History move: “We want to be transparent with users to see what data we have,” she said.
As noted, Google Toolbar users with PageRank enabled (all of whom voluntarily enabled it) have been sending their page visitation histories to Google for years. They simply couldn’t see that history, nor delete it, if they wanted. Now you can. Now if you’re concerned, it’s because the data is more in your face.
Should you be concerned? Of course. Everyone should be concerned about their private data. Everyone should really think about what is being logged and how it is being used. But we also make tradeoffs. We want certain things from companies, and to get them, we have to give up some of our privacy, often trusting it will be protected. Mayer herself says this:
“There is a greater level of association here [of personal data to individuals]. There’s also a greater level of utility, being able to easily and nicely see your history of where you’ve been on the web. There is a trade-off in that there is a record that is associated with your email address, and that users need to consider carefully.”
There is no doubt many people will find having a web history useful. There is no doubt web histories will help improve results for many people. Just using a search engine at all — even if you don’t log in — still involves trust in passing along a query that by the query’s own content could potentially identify you. In the end, those with concerns needn’t enroll in the web history feature. Google will work fine without it.
Interested in what others are saying? Techmeme is tracking the story, so you’ll find related coverage there.
Postscript: Some answers to follow-up questions:
- A Google-hosted copy of a page visited via the toolbar or from clicking on a search result is not made.
- Secure (SSL/HTTPS) pages are not tracked.