Tuesday, March 11, 2014

Big Data Privacy Risks

MIT and the White House held a big data privacy workshop on March 3, 2014. Many interesting topics were discussed, including the privacy of medical data.

"Medical data is special, but not because privacy is more important than in other areas. It's special because progress in healthcare is too important and too urgent to wait for privacy to be solved. I'm in favor of privacy but not at the cost of avoidable pain and suffering and death. We need to find ways to make full EMR data sets available to researchers. We'll have to live with some violations of privacy, as we do today.  And as Mike [Stonebraker] said, what we need to focus on is auditing mechanisms, and finding ways to punish those that misbehave." (at 1:46:00) - John Guttag, Professor, MIT, “Clinical Data: Opportunities and Obstacles,” 03/03/2014

The italicized sentence above is a call to action for privacy and data use advocates alike. Decisions made without extensive consideration of the benefits and harms of the many options surrounding privacy and the collection, aggregation and uses of personal data would be imprudent (see https://en.wikipedia.org/wiki/Precautionary_principle). Further, because of the nuance of the issues, pervasive myths, and general lack of familiarity with risks (especially among big data practitioners and even among privacy practitioners), the dialog cannot be both short-lived and legitimate.

In the six-hour video, the arguments for not addressing privacy take the forms exemplified by the quote above: general pleas ("too important and too urgent"), fear-mongering ("pain and suffering and death"), and dubious assurances ("punish those that misbehave"). This is not a solid foundation for making risk decisions. A valid and salient argument for the wholesale collection and analysis of data, healthcare or otherwise, is never constructed, and no one offers specific or measurable benefits.

I would like to see an approach or framework for calculating privacy-related risks and benefits that could be applied in these situations. The framework should not be domain-specific; if rigorously constructed, it could be applied to population health, national defense, consumer marketing, and all other privacy domains with equal effectiveness. Such a framework would allow individual knowledge and experiences to be included in collective discussion and analysis, and allow for grounded debate about the outcomes. At the very least, it would provide a basis for more meaningful dialog and help researchers, the government and the public conduct a more informed and nuanced risk analysis.

Principles which can be used for such a universal framework have already been developed or are being updated by international organizations:
OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data
EU Data Protection Directive
APEC Privacy Framework
The development of a risk analysis should start by incorporating these various privacy principles into a framework. How the framework would then be completed, and which assessment methodologies would be appropriate, remains to be determined, but the discussion should begin in earnest. It's time for less hand-waving and something more substantive.