Is Google reading your bMail?

Chris Hoofnagle, adjunct professor of information | May 20, 2013

Neither users of bMail nor the campus itself has ever received a clear answer to a simple question: Is Google subjecting data in Google Apps for Education to data analysis or mining for purposes unnecessary for technical rendition of service?

A recently filed lawsuit suggests that Google is indeed applying analysis to our messages, but masking this behavior by not showing users advertising. In Fread v. Google, students from the University of Hawaii and the University of the Pacific allege:

“…Google does not serve targeted content-based advertising to Google Apps EDU users. Google nonetheless extracts the content and meaning from Plaintiffs’ and Class Members’ Sent and Received e-mail messages and uses that content for various purposes and for profit


From this reading, Google collects, extracts, and/or generates metadata consisting of “PHIL Clusters” (Probabilistic Hierarchical Inferential Learner). “PHIL Clusters” represent the meaning inferred from particular words or phrases in Plaintiffs’ and the Class’ Received and Sent e-mails.

Systems such as PHIL, or similar systems, learn “concepts” by learning an explanatory model of text.

Thus, Google’s use of PHIL’s concepts are designed and supposed to model the actual ideas that occur in Plaintiffs and Class members’ mind in creating e-mail content.”

Why should we believe these plaintiffs? For one, they are represented by attorneys who are litigating another case concerning Gmail, Dunbar v. Google. In the discovery process, these attorneys could have learned about this data analysis. We, the users of bMail, however, will remain in the dark, because Google has sealed much of the record in these cases.

As educational institutions, we are under a duty to supervise and control Google’s maintenance and use of educational records. We cannot do this without having clear answers about Google’s processes. The Fread lawsuit gives us an opportunity to get them: Google’s filings with the court are under oath and scrutinized by the expert plaintiff lawyers in the case. As a System, we could call upon Google to provide us with unredacted versions of these materials.

Related: The good, not so good, and long view on Bmail

Comments to “Is Google reading your bMail?”

  1. Legality aside in both cases, which sort of surveillance/privacy intrusion are you most concerned with: government or corporate?

    The first carries with it a big scary 20th century specter, but has exactly (not approximately, but exactly) zero impact on the daily lives and lived experience of 99%+ Americans.

    The second, on the other hand, affects every single last one of us(!) practically every single last time we use an electronic device or step into any commercial establishment. You’d never realize this listening to the Cold War hysterics from the likes of the Electronic Frontier Foundation.

  2. Two articles in CNET regarding Google’s recent filing. Google states that there should be no expectation of privacy for users of Gmail:

    1) Google filing says Gmail users have no expectation of privacy

    2) Gmail: You weren’t really expecting privacy, were you?

    I do not believe that the terms of service for Google Apps for Education prevent the very data mining that is at issue here. Google’s motion to dismiss indicates that the contracts with UH and UoP expressly allow Google to data-mine students’ email and that it is the responsibility of the institution (UH and UoP in this case) to notify students of Google’s terms.

    Google’s argument against the wiretapping claims is also interesting. For those of us who don’t use Google’s services, we have no basis to claim that our emails are being illegally intercepted even though we have never actually consented to Google’s terms of service. The reason? It’s part of the business of providing email. I suspect people concerned about privacy will move more rapidly toward adopting encryption.

    Chris, I’d be interested in your opinions on Google’s recent filing.

  3. Google is using our data. They capture it and use it for Google Now and other services they are trying to merge into Google Glass and Google Plus. This data mining is out of control. I would love to see a course just on Google and top social sites. Do we have this? I did see that Berkeley was recently named a top school in social media marketing.

  4. @J,

    I agree with you concerning liability–it’s our job at Berkeley to address FERPA. But on your other point, why would aggregate data (if it actually is anonymous…) not be an “educational record”?


  5. Curious, if true, why this has NOT hit national attention? AND why we don’t have the obligation to voice this to our constituents on campus with a broader stroke of the mighty keyboard or pen?

    Is this another “possible UC Pepper Spray” type of protest and incident? Who on campus is responsible to vet this and protect us as students and our other colleagues?

    What are our other options, why were we not given a selection vs. only 1 option?

    Questions, Questions, but who has the answers…….

    Very concerning if true.

  6. To the extent the mining results in aggregated/anonymized data only, FERPA compliance would arguably be intact here. And liability re FERPA isn’t really a direct concern for Google, as the liability and risk are focused primarily on the schools.

    That is not to say that such a restriction isn’t included in the terms of use; I would certainly push for it if I were UC counsel.

  7. Hi Allen, could you point to this section in the terms of service? I think the issue of data mining is an open question, especially w/r/t traffic data and data that is generated by the bMail system itself.

  8. A salient point:

    The terms of service for Google Apps for Education prohibit any data mining by Google or any related party. To do such mining would jeopardize the service’s FERPA certification.
