Hold On — Issues Remain Over Google & Viacom’s Deal On YouTube Viewing Privacy

Google and Viacom have reached an agreement meant to ease privacy concerns about YouTube records being handed over to Viacom through a court order. However, there remain some questions about how exactly the “anonymizing” of these records will actually work. Until those are answered, I wouldn’t breathe a complete privacy sigh of relief yet.

Google was ordered to hand over YouTube viewing records earlier this month as part of Viacom’s copyright infringement lawsuit against the company. Privacy concerns were immediately raised. Viacom responded by saying it didn’t want all the information the court had granted it rights to, and Google said it was hopeful the data could be anonymized in some way.

An agreement allowing this has now been reached. The legal language (PDF file):

When producing data from the Logging Database pursuant to the Order, Defendants shall substitute values while preserving uniqueness for entries in the following fields: User ID, IP Address and Visitor ID. The parties shall agree as promptly as feasible on a specific protocol to govern this substitution whereby each unique value contained in these fields shall be assigned a correlative unique substituted value, and preexisting interdependencies shall be retained in the version of the data produced.

OK, the idea here is that by replacing the “real” IP address or user ID information, it won’t be possible to know the “real” identity of those who searched for or watched videos. Unfortunately, changing to “fake” information still doesn’t solve the privacy concerns entirely.

In the case of AOL search data that was released in 2006, all the associated user information with those records was also anonymized. But because each individual still had the same unique “fake” address, it was possible to see all the queries done by an “anonymous” user. That activity profile in a few cases made it possible to guess at the real person doing the searches.

Consider this example. Let’s say part of a YouTube log originally looks like this:

  • May 5, 2008 – User: juliefielding – Search: “Battlestar Galactica”
  • May 5, 2008 – User: juliefielding – Video Watched: Battlestar Galactica, Season 2, Episode 5
  • May 5, 2008 – User: juliefielding – Search: “Julie Fielding Homecoming”
  • May 5, 2008 – User: juliefielding – Video Watched: “Julie Fielding’s Homecoming Party”

OK, this is GREATLY simplified (see Google Anonymizing Search Records To Protect Privacy for a more real-life example of how logging works). But you can see that the logs show someone with the user account juliefielding (who probably is a “real” Julie Fielding) running searches and watching particular videos.

Now let’s say we “anonymize” the user name like this:

  • May 5, 2008 – User: dskw92qw4 – Search: “Battlestar Galactica”
  • May 5, 2008 – User: dskw92qw4 – Video Watched: Battlestar Galactica, Season 2, Episode 5
  • May 5, 2008 – User: dskw92qw4 – Search: “Julie Fielding Homecoming”
  • May 5, 2008 – User: dskw92qw4 – Video Watched: “Julie Fielding’s Homecoming Party”

Now “juliefielding” has become the anonymous user “dskw92qw4,” so supposedly we can’t identify her. However, we do know everything this user has watched, and if they’re watching something called “Julie Fielding’s Homecoming Party,” we might assume they’re connected with Julie Fielding. Moreover, if we have a long pattern of viewing (and possibly searching) history, we might be better able to guess at who the person is. Imagine a person who is constantly watching videos that they themselves have uploaded, for example.
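To make the problem concrete, here is a minimal Python sketch of substitution that "preserves uniqueness." It is not the protocol Google and Viacom will use (that hasn't been worked out); the log rows, field layout and function names are all hypothetical, made up to mirror the simplified example above. It shows that when every real user ID maps to one consistent substitute, grouping by the substitute rebuilds the same activity profile.

```python
import secrets
from collections import defaultdict

# Hypothetical, greatly simplified log rows: (date, user, action, detail)
LOG = [
    ("2008-05-05", "juliefielding", "search", "Battlestar Galactica"),
    ("2008-05-05", "juliefielding", "watch", "Battlestar Galactica, Season 2, Episode 5"),
    ("2008-05-05", "juliefielding", "search", "Julie Fielding Homecoming"),
    ("2008-05-05", "juliefielding", "watch", "Julie Fielding's Homecoming Party"),
    ("2008-05-05", "otheruser", "watch", "Some Other Video"),
]


def substitute_users(rows):
    """Swap each distinct user ID for a random token, reusing the
    same token every time the same user appears (uniqueness preserved)."""
    mapping = {}   # real user ID -> substitute token
    used = set()   # tokens already handed out, to avoid collisions
    out = []
    for date, user, action, detail in rows:
        if user not in mapping:
            token = secrets.token_hex(5)
            while token in used:
                token = secrets.token_hex(5)
            used.add(token)
            mapping[user] = token
        out.append((date, mapping[user], action, detail))
    return out


def profiles(rows):
    """Group activity by whatever value sits in the user column."""
    by_user = defaultdict(list)
    for _date, user, action, detail in rows:
        by_user[user].append((action, detail))
    return dict(by_user)


anonymized = substitute_users(LOG)
for token, activity in profiles(anonymized).items():
    # The real name is gone, but the full activity history is intact.
    print(token, activity)
```

The real user names never appear in the output, yet one of the two tokens still carries all four of Julie's searches and views, including the video with her name in the title.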

I asked Google about this. Why wasn’t the agreement worked out to drop ALL user information entirely, especially as that information still doesn’t seem necessary to achieve Viacom’s overall goal of simply estimating how much infringing content is viewed? I covered this in my earlier piece:

If you only want to know the percentage of “infringing videos” that are watched versus “non-infringing” ones, then you only need a record of the videos requested. You don’t need IP addresses of those requesting them. You certainly don’t need the very personally-identifiable information of those who are logged in and watching them.
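As a rough illustration of that point, the share of views going to allegedly infringing videos can be computed from a list of video requests alone. This is only a sketch under stated assumptions: the video IDs, the flagged set and the function name are hypothetical, and the real logs are far more complicated.

```python
from collections import Counter

# Hypothetical view records with every user field dropped:
# just the ID of the video requested, one entry per view.
VIEWS = ["vid_a", "vid_b", "vid_a", "vid_c", "vid_a"]

# Hypothetical set of video IDs alleged to be infringing.
FLAGGED = {"vid_a"}


def infringing_share(views, flagged):
    """Return the fraction of all views that went to flagged videos."""
    counts = Counter(views)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    flagged_views = sum(n for vid, n in counts.items() if vid in flagged)
    return flagged_views / total


print(f"{infringing_share(VIEWS, FLAGGED):.0%}")  # prints 60%
```

Nothing in this calculation touches a user ID, IP address or visitor ID, substituted or otherwise.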

Saul Hansell at the New York Times raises the same issue in his write-up today:

I’m not entirely sure what Viacom will get out of all this. No doubt they will be able to prove that lots of people uploaded clips of material from MTV, Comedy Central and other Viacom properties and that lots of people watched them. You don’t need server logs to show that.

In response, Google said that the exact anonymizing protocol hadn’t been worked out and it highlighted this part of the agreement:

The parties agree that they shall not engage in any efforts to circumvent the encryption utilized pursuant to Paragraph 1 this Stipulation. This Paragraph does not limit in any way any party’s rights under Paragraph 8 below.

Neither response solves the concerns. Yes, how exactly substitution of user information will be done is yet to be worked out, but the underlying principle that each unique user will be given a unique alternative identity is not being challenged — and so individual user profiles can still potentially be worked out.

As for “encryption,” what encryption? I suppose this is meant to say that Viacom won’t try to build profiles to then work out who an anonymous user might really be. But if it won’t do that, then why hand over even anonymous user info? The company doesn’t need this. But handing over the information at all remains a privacy threat, agreement or not. Records get lost. Records get in the wrong hands.

Agreements are all well and good, but the most secure way to protect privacy is not to hand out information that isn’t needed. While building up user profiles as I’ve described is still not a major threat to the vast majority of YouTube users, it’s still a concern. And since Viacom doesn’t even need anonymous data, I’d hope the two parties get together once again and drop any of this entirely.

Give Viacom records of what was watched on YouTube, sure. But no, don’t give the company records about who watched this material, “anonymized” or not.

For more, see discussion on Techmeme.



About The Author: Danny Sullivan is a Founding Editor of Search Engine Land. He’s a widely cited authority on search engines and search marketing issues who has covered the space since 1996. Danny also serves as Chief Content Officer for Third Door Media, which publishes Search Engine Land and produces the SMX: Search Marketing Expo conference series. He has a personal blog called Daggle (and keeps his disclosures page there). He can be found on Facebook and Google+, and microblogs on Twitter as @dannysullivan.
