WTF! US Court Declares You Have No Privacy On YouTube

You have no privacy on YouTube. So effectively declared a US judge yesterday. And now somebody in the US government better stop grandstanding about search and privacy protection and actually get some laws enacted. Yesterday’s move might be the ultimate incentive, as US politicians realize that what they’ve watched on YouTube may now be open season.

I can appreciate Viacom’s concern over copyright infringement on YouTube. But yesterday’s ruling (PDF file) that it should be provided with data about EVERYONE who has used YouTube, including what they watched, is alarming. Viacom itself should do what the court did not and limit the data it takes.

Wired and News.com both have write-ups on the news, and the Electronic Frontier Foundation has a great analysis as well. So first the news about what’s happened, then some analysis of my own.

Viacom Wants Viewing Records

From the ruling, Viacom demanded all logs of all videos watched on YouTube, with the idea being they’ll figure out which were infringing copyright and then get a percentage of how much copyright-infringing viewing is happening via the service. (Set aside the issue that what Viacom might consider copyright infringement might not be seen the same way by others.) From the ruling:

Plaintiffs seek all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website. They need the data to compare the attractiveness of allegedly infringing videos with that of non-infringing videos. A markedly higher proportion of infringing-video watching may bear on plaintiffs’ vicarious liability claim, and defendants’ substantial non-infringing use defense.

Viewing Records Have Personal Data

And what’s in the database? Bearing in mind that courts often don’t define things accurately, here’s what the ruling says is in it:

Defendants’ “Logging” database contains, for each instance a video is watched, the unique “login ID” of the user who watched it, the time when the user started to watch the video, the internet protocol address other devices connected to the internet use to identify the user’s computer (“IP address”), and the identifier for the video. That database (which is stored on live computer hard drives) is the only existing record of how often each video has been viewed during various time periods.

OK, there’s a log of all videos that have been viewed, just as any web site keeps a log of pages that are requested and other material that’s transmitted. That log also contains things like the IP address of those making the requests and, if they’re logged in, their user name. See our Google Anonymizing Search Records To Protect Privacy for a more detailed explanation of all this.

If you only want to know the percentage of “infringing videos” that are watched versus “non-infringing” ones, then you only need a record of the videos requested. You don’t need IP addresses of those requesting them. You certainly don’t need the very personally-identifiable information of those who are logged in and watching them. Remember, this isn’t a case against any particular individuals. It’s against Google in general. Asking for individual viewing data isn’t necessary.

To explain in another way, say you’ve got a library full of legal and illegal books [books that were printed without copyright permission]. You want to know what percentage of all the books being read are illegal ones. All you need is the raw number of illegal books checked out versus total checkouts. You don’t need any information about the patrons who actually checked out the books.
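To make the point concrete, here is a minimal sketch in Python of that calculation. Everything in it is an assumption for illustration: the field names simply mirror the four items the ruling lists (login ID, start time, IP address, video identifier), the sample rows are invented, and the set of "allegedly infringing" video IDs is hypothetical. It is not how YouTube or Viacom actually process anything. The sketch throws away the personal fields first and still gets the percentage.

```python
from collections import Counter

# Hypothetical log rows using the four fields the ruling describes.
# Every value below is invented for illustration.
LOG = [
    {"login_id": "user_a", "ip": "192.0.2.10", "started": "2008-07-01T10:00:00", "video_id": "vid1"},
    {"login_id": "user_b", "ip": "192.0.2.11", "started": "2008-07-01T10:05:00", "video_id": "vid2"},
    {"login_id": "user_a", "ip": "192.0.2.10", "started": "2008-07-01T10:09:00", "video_id": "vid1"},
    {"login_id": None, "ip": "192.0.2.12", "started": "2008-07-01T10:12:00", "video_id": "vid3"},
]

# Hypothetical set of video IDs that a plaintiff alleges are infringing.
ALLEGEDLY_INFRINGING = {"vid1"}


def strip_personal_fields(rows):
    """Keep only the video identifier from each view record."""
    return [row["video_id"] for row in rows]


def infringing_share(video_ids, flagged):
    """Return the fraction of all views that were of flagged videos."""
    counts = Counter(video_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    flagged_views = sum(n for vid, n in counts.items() if vid in flagged)
    return flagged_views / total


# The login IDs and IP addresses are gone before any counting happens.
views = strip_personal_fields(LOG)
print(infringing_share(views, ALLEGEDLY_INFRINGING))
```

The share comes out the same whether or not the login IDs and IP addresses are kept, which is the whole argument: the personal fields add nothing to the number Viacom says it needs.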

Google Pushes The Privacy Angle, Gets Denied

Google did argue against handing over the logs from a privacy standpoint:

Defendants argue that the data should not be disclosed because of the users’ privacy concerns, saying that “Plaintiffs would likely be able to determine the viewing and video uploading habits of YouTube’s users based on the user’s login ID and the user’s IP address.” But defendants cite no authority barring them from disclosing such information in civil discovery proceedings, and their privacy concerns are speculative. Defendants do not refute that the “login ID is an anonymous pseudonym that users create for themselves when they sign up with YouTube,” which without more “cannot identify specific individuals.” Google has elsewhere stated:

We . . . are strong supporters of the idea that data protection laws should apply to any data that could identify you. The reality is though that in most cases, an IP address without additional information cannot.

Google Software Engineer Alma Whitten, Are IP addresses personal?, GOOGLE PUBLIC POLICY BLOG (Feb. 22, 2008)

Therefore, the motion to compel production of all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website is granted.

How Did Judge Fail To Understand Privacy Issues More?

I bolded the statements above that made me drop my jaw. They sparked questions like:

  • How did Google let this judge come away thinking that logged-in data is the same as fairly anonymous IP data?
  • Did Google really tell the judge that a login is an “anonymous pseudonym”?
  • How did Google not point out to the judge that in another civil case, involving the US Department Of Justice, the court agreed that IP addresses could be sensitive enough that it denied a request for Google to hand over search data?
  • Why is there no suggestion that viewing records could be handed over without login or IP information yet still be useful to what Viacom seeks to accomplish?

This simply can’t stand. I’ve got an email out to Google asking when they’ll comply or, more importantly, if they’ll find a way to refuse. The company stood up to the US Department Of Justice when it wanted far less personal information and won.

On a somewhat related note, YouTube users have the ability to post private videos. Viacom wanted access to these. That was fortunately denied.

Other Demands Granted & Denied

The ruling covered other demands. Here’s a rundown of what was given and not:

  • YouTube/Google Source Code: NO. I agree entirely with Michael Arrington at TechCrunch that it wouldn’t seem to have been that useful anyway.
  • Video ID Source Code: NO (this is the system Google developed to identify infringement)
  • Removed Videos: YES, and you’d think it will cost Viacom a billion dollars just to process that data alone. There are millions of these, according to the ruling. But Viacom, according to the ruling, says “they can handle it electronically.”
  • YouTube Viewing Logs: YES
  • Google Advertising Data: NO
  • Google Video Data: YES, and what exactly is in the “schema” being provided is unclear.
  • YouTube Private Videos: NO

Will Video Save The Privacy Star?

Let’s go back now to the privacy issues about YouTube’s logs. Both the ruling and the EFF make reference to the Video Privacy Protection Act that protects disclosure of video tape rentals. The judge didn’t think this applied. The EFF certainly does:

In a footnote, the Court references the VPPA, noting that the federal law “prohibits video tape service providers from disclosing information on the specific video materials subscribers request or obtain.” It is possible that the reference to “video tapes” in the VPPA was confusing. However, the Act is not limited to the technology available at the time of its enactment.

To the contrary, the act refers to “prerecorded video cassette tapes or similar audio visual materials.” A YouTube video may not be a videotape, but certainly qualifies as audio visual material. Thus, YouTube is a “video tape service provider” under the act, because it is “engaged in the business [of] delivery of … audio visual materials.” The VPPA protects “personally identifiable information,” which is defined to include “information which identifies a person as having requested or obtained specific video materials or services.” This is exactly what is in the Logging database.

Hopefully, there will be some further arguments that the act does apply, as well as the common sense that user data is not necessary. But recall the reason why the act was enacted in the first place. As the EFF says:

The VPPA passed after a newspaper disclosed Supreme Court nominee Robert Bork’s video rental records. As Congress recognized, your selection of videos to watch is deeply personal and deserves the strongest protection.

National Internet Privacy Act Needed

Indeed, video viewing is personal, online or offline. So is the searching we do. So is what we do on the web in terms of surfing (data that our ISPs sell). But our web activities have no specific federal protection. As a result, we get issues like the US Department Of Justice having overreached, the Viacom ruling today, or the case in Florida where Google search activity is being requested in a pornography case over community values. We also get new laws letting the US government secretly demand whatever it wants, just as an old law allowing that was overturned.

Last year, Microsoft and Ask called for some industry standards (a call that would have been better if they’d involved Yahoo and Google beforehand). Google also called for a global privacy standard last year, then this year said it would back a national law in response to the latest letter over privacy issues from US representative Joe Barton. Microsoft also pushed for a privacy framework this year.

We need it. And we need it without political grandstanding, without the privacy advocates arguing that it will water down US state laws (as EPIC did in response to Google saying they’d back a national law). We need protection, and we need various groups to diligently work together to make that happen.

I remain amazed that so little has happened to protect us since the AOL data leak in 2006. But when lawmakers start to understand that the porn and other embarrassing material some of them have watched on YouTube is to be handed over to Viacom, maybe they’ll finally wake up. Remember, lawmakers, even those now deleted videos are fair game.

Google’s Arguments & Data Retention Come Back To Haunt It

As for Google, it’s partially due for a round of “I told you so” from privacy advocates. They’ve been pushing for data destruction, under the argument that if you don’t have visitation data recorded, then you have nothing that can be handed over.

For its part, Google has been reducing the time period it keeps cookie and IP-based data. In fact, it was the leader in the industry in this regard, something it gets virtually no credit for. But it still argues that some retention is needed for a period of at least months, with the latest shot being a post about how it helps to fight search spam. It has also argued, as the court noted, that IP data itself is mostly anonymous.

Now Google is finding that some of its own arguments are being used against it. True, by a judge who, however it happened, doesn’t understand the real difference between IP data and logged data. But if Google didn’t have the data, if it hadn’t been pushing for retention — if it had accelerated the previously announced anonymizing plan that might include YouTube [it might not — I’m checking on this] — then there would be less data at issue.

Privacy Advocates, Politicians Stay Focused On Cookie Schmookie Issues

Lest the privacy advocates feel all smug, remember that most of their attention, and that of lawmakers, has been about IP addresses and cookies. Both can sound scary, but to me, they are old school data privacy issues. The real focus should be on the data we store when we’re logged in. This is data we may store for long periods of time, data that we want to store, and which isn’t going to be destroyed unless we request that. How protected is this data? The lack of focus here has driven me crazy. As I wrote about the EU getting worked up over cookies last year:

Yesterday in my Google Bad On Privacy? Maybe It’s Privacy International’s Report That Sucks article, I spent a considerable amount of time being upset with Privacy International for doing what I thought was a slipshod report on privacy. Today, I’m similarly critical about the EU move. It’s not, as I said yesterday, that I’m a Google fanboy who thinks it can do no wrong. In fact, it’s the opposite: I think Google, as well as all the major search engines (and big companies, for that matter), need to have outside privacy groups and governmental bodies keeping them honest. My upset is that both Privacy International and the EU have seemed more concerned with style than substance.

As I pointed out in my article yesterday, Google have a variety of privacy policies that cover a range of services that it offers. These services can have data well beyond what’s in server logs, and it’s difficult for me — someone who regularly writes about Google — to know what happens with my data. Consider the accounts I have:

  • Google AdWords, with associated billing and ad campaign info
  • Google AdSense, with associated payments and information logged from traffic on my sites
  • Gmail, with mail going back for four years
  • Google Web History, with my search data
  • Google Analytics, with site activity data stored
  • Google Calendar, with a list of my activities

That’s just some of my accounts. If I delete my web history, I know that data is destroyed, though what’s kept on offline archives currently is not destroyed, from what I was last told. If I go to the Gmail privacy FAQ (far more useful than the Gmail privacy policy, which fails to link directly to the FAQ), I’m told deleting my mail really deletes it, even off backups, though that might take time. But then again, are these deleted from online backups only? What about offline?

Honestly, I can’t take another round of kill the IPs, kill the cookies, and especially politicians playing catch-up to the 1990s as 2010 approaches. We need a comprehensive internet privacy act, one with substance, not one that plays word games over IP addresses and cookie data.

YouTube Privacy

This started about the privacy of viewing on YouTube. I’m trying to get answers about how those who are concerned might be able to delete their data — if they can at all — before any handover happens.

YouTube has a “clear viewing history” option that works much like deleting search history records. But does this actually delete data or simply wipe out what shows for you, while records are still kept? And how come I see nothing in my YouTube account when I know I’ve viewed things before? And, heh, if everyone clears their viewing history, will the court come after them for destroying evidence?

As a reminder of what’s recorded if you log in, from the YouTube privacy policy:

We may record information about your usage, such as when you use YouTube, the channels, groups, and favorites you subscribe to, the contacts you communicate with, the videos you watch, and the frequency and size of data transfers, as well as information you display or click on in YouTube (including UI elements, settings, and other information). If you are logged in, we may associate that information with your account. We may use clear GIFs (a.k.a. “Web Beacons”) in HTML-based emails sent to our users to track which emails are opened by recipients.

Conclusion

I hope this ruling serves as a flashpoint or wake-up call to a variety of people who can make a national privacy standard a reality. We don’t need more empty talk. We need actual progress.

In the short term, I’m following up with Google to see what it can do to oppose handing over the personal parts of the viewing records, which aren’t needed. I want to see the company stand up with full force and fight against this in every way it can.

As for Viacom, I’d like to see it very quickly come out with a statement saying it will work to take the data without requiring personal information be included. No cookies. No IPs. No logged-in data. Viacom doesn’t need that, and it should say clearly it doesn’t want it. And if it fails to do it, well, I’m considering just how much of Viacom I can start cutting out of my life — and if others would do the same.

Postscript: I called Judge Louis L Stanton’s chambers at 9:15am Eastern time today, on the chance he might be able to give a statement or more information about the case. Not surprisingly, he wasn’t available. The law clerk I spoke with wasn’t certain if he’d be able to comment at all.

I asked if anyone there was aware what a controversy his ruling had sparked. Clearly, they were not. I explained further that because of the ruling, millions of people would shortly be hearing from the mainstream media (when it jumped on the story) that their YouTube viewing habits were about to be handed over to Viacom.

I further offered to send some links to the existing coverage, so that the judge could understand better what was being said. But the court, apparently, can’t receive information by email, I was told. Instead, the clerk asked if I wanted to mail some information for the judge to consider.

Um — yes, I suppose I could mail it, I said, but I explained again that his chambers would probably get calls from places like the New York Times and the Wall Street Journal within hours, so they might like to be a bit more timely on this.

Eventually, the law clerk took down my physical mailing address so they could get in touch, as well as my phone number and, after I insisted, my email address. I also asked if the court had internet access, since if it can’t get email, who knows what it can do. But it does, so I suggested the judge and his staff go to Techmeme, where the issue is currently the top story and will continue to grow. That’s T-E-C-H-M-E-M-E-DOT-COM, I spelled out.

Postscript 2: Google has sent this:

We are pleased the court put some limits on discovery, including refusing to allow Viacom to access users’ private videos and our search technology. We are disappointed the court granted Viacom’s overreaching demand for viewing history. We will ask Viacom to respect users’ privacy and allow us to anonymize the logs before producing them under the court’s order.

OK, trying to get more about whether they’ll fight or refuse if they cannot anonymize the data, along with answers to some other questions.
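Google’s statement doesn’t say how it would anonymize the logs, so what follows is only my own rough sketch in Python of one possible approach, not anything Google has described. The function names, the field names and the secret key are all hypothetical. The idea: drop the IP address entirely and swap the login ID for a keyed hash (HMAC-SHA256), so the same viewer gets the same opaque token across rows but the token can’t be turned back into a login without the key, which the producing party would keep to itself.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a value with a keyed hash that only the key holder can reproduce."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize_row(row: dict, secret_key: bytes) -> dict:
    """Return a copy of a view record with the IP dropped and the login replaced."""
    login = row.get("login_id")
    return {
        # Logged-out views have no login ID, so there is nothing to tokenize.
        "viewer_token": pseudonymize(login, secret_key) if login else None,
        "started": row["started"],
        "video_id": row["video_id"],
    }


# Hypothetical record and key, invented for illustration only.
SECRET_KEY = b"kept-by-the-producing-party"
row = {"login_id": "user_a", "ip": "192.0.2.10", "started": "2008-07-01T10:00:00", "video_id": "vid1"}
print(anonymize_row(row, SECRET_KEY))
```

Bear in mind this is pseudonymizing, not true anonymizing. The AOL leak involved search logs where user names had been replaced with numbers, and people were still identified from the patterns in their queries. A long enough list of videos tied to one token could give someone away in the same fashion, which is why simply leaving the viewer fields out would be safer still.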

Postscript 3: According to the LA Times, Viacom has a statement out saying it didn’t ask for and won’t obtain personal information. So hopefully that means Google is free to excise this despite the judge ordering it released:

In a statement, the entertainment giant said it did not ask for nor would it obtain “any personally identifiable information of any user.”

“Any information that we or our outside advisors obtain — which will not include personally identifiable information — will be used exclusively for the purpose of proving our case against YouTube and Google, will be handled subject to a court protective order and in a highly confidential manner,” the New York-based company said.

Viacom general counsel Michael Fricklas said “unequivocally that this information will not be used” for the purpose of trying to find the identities of people who uploaded copyrighted Viacom clips to YouTube.

News.com also notes that there’s apparently a restriction that’s designed to protect any personal information that might be given to Viacom:

The court’s protective order stipulates that data turned over to Viacom by Google must be used for the sole purpose of proving Viacom’s claim against Google that YouTube is a hotbed of pirated video content, the sources said. Viacom will not have direct access to the YouTube user data, the source said. Access is restricted to outside counsel and experts.

Viacom, therefore, is forbidden from targeting individual users in the manner of the RIAA’s lawsuits against individuals found to be downloading illegal music.

Postscript 4: See the updated post, Google: Expects Viacom Will Take YouTube Data Without User Info.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.


About the author

Danny Sullivan
Contributor
Danny Sullivan was a journalist and analyst who covered the digital and search marketing space from 1996 through 2017. He was also a cofounder of Third Door Media, which publishes Search Engine Land and MarTech, and produces the SMX: Search Marketing Expo and MarTech events. He retired from journalism and Third Door Media in June 2017. You can learn more about him on his personal site & blog. He can also be found on Facebook and Twitter.
