Wednesday, March 26, 2014

Towards a More Useful Visualization of Risk

I am a photographer, and I consider natural landscapes my most challenging subject. This is because I must capture both what is inside and outside of the frame by including just the right objects and presenting them in a meaningful way. When we examine risk matrices using the analogy of landscape photography, we see that they leave out too many important things (benefits, possibilities and uncertainty), do a poor job of providing understanding of the situation in context (sense making), and leave decision makers over-reliant on visual cues and visual motifs (spurious visuals). A better representation would put risk information in a context of relevant business data, provide visual cues for potential focus areas, and give better signals for decision making. 

Typically a risk matrix provides event, likelihood, and impact. Important information is missing but the typical risk matrix tricks the viewer into thinking it is all there. Sometimes businesses get caught up in visuals. Sometimes it is useful to drop back to the data to see what the visuals really speak to.

Here are some examples:
  • skin scrape or minor cut (medium likelihood, minor impact)
  • paper cut (medium likelihood, inconsequential impact) 
  • death (low likelihood, catastrophic impact)
  • lost limb (low likelihood, severe impact)
  • stolen wallet (low likelihood, moderate impact)

Which risk should get the most attention and resources? How is this determined? What specifically is being asked and what information is available? Is this the same type of information found in a risk matrix used for business purposes?

To demonstrate what is missing, here are narratives for each example above, respectively:
  • Peter is climbing Half Dome, a lifelong goal
  • Quinn files paperwork in an office, earning $1800 weekly
  • Ryan is undergoing a possibly lethal treatment, to cure a debilitating, painful disease
  • Steve works manually, loading a sheet metal press, earning $1200 weekly
  • Tom is spending three days in an area noted for its pick-pockets, during his dream vacation

The narratives reveal what the risk matrix lacks: context. Needless to say, when presented in a risk matrix, death would likely get the most attention, discussion and debate. Any significant deflection of attention would need to be based on information outside the frame - and that's a poor informational and visual model. In essence, the viewer is fighting against the tool that is meant to assist in deciding where to put that attention. When that happens, something is wrong.

A better representation would use a clustered stock chart in a format I call Benefit-Harm Pairing. Compared to the risk matrix, Benefit-Harm Pairing shows us more aspects of risk and better approximates a natural, narrative style of thinking. The pairing also provides action indicators that are more tailored to the risk than the risk matrix variety of "reduce likelihood." The chart below represents: activity, harm (expected, minimum, maximum), and benefit (expected, minimum, maximum).

The sample Benefit-Harm Pairing presents a different picture of risk. Starting on the left with the treatment activity, the expected benefit is greater than the expected harm but there are also problems: high maximum harm and low minimum benefit.

Benefit-Harm Pairing has a strong idealized form and weak idealized form. For each risk the strong idealized form seeks that:
  • all possible benefits are greater than all possible harms

and the weak idealized form seeks that:
  • expected benefit is greater than expected harm
  • maximum harm is close to or less than expected benefit
  • minimum benefit is close to or greater than expected harm
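
The two idealized forms can be read as simple checks on six numbers per activity. The sketch below is one possible reading, not a definitive implementation. The `Pairing` class, the shared unit-less scale, the sample values for the treatment activity, and especially the `tolerance` used to interpret "close to" are all assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pairing:
    """One activity, with harm and benefit on a shared (assumed) scale."""
    activity: str
    harm_min: float
    harm_expected: float
    harm_max: float
    benefit_min: float
    benefit_expected: float
    benefit_max: float


def is_strong_form(p: Pairing) -> bool:
    # "All possible benefits are greater than all possible harms":
    # the smallest benefit must exceed the largest harm.
    return p.benefit_min > p.harm_max


def weak_form_checks(p: Pairing, tolerance: float = 1.0) -> dict:
    # "Close to" is read as "within `tolerance` units" (an assumption).
    return {
        "expected benefit > expected harm":
            p.benefit_expected > p.harm_expected,
        "maximum harm close to or less than expected benefit":
            p.harm_max <= p.benefit_expected + tolerance,
        "minimum benefit close to or greater than expected harm":
            p.benefit_min >= p.harm_expected - tolerance,
    }


def is_weak_form(p: Pairing, tolerance: float = 1.0) -> bool:
    # The weak form holds only when all three conditions hold.
    return all(weak_form_checks(p, tolerance).values())


# Hypothetical numbers shaped like the treatment example: expected benefit
# above expected harm, but a high maximum harm and a low minimum benefit.
treatment = Pairing(
    activity="treatment",
    harm_min=1, harm_expected=4, harm_max=10,
    benefit_min=1, benefit_expected=7, benefit_max=9,
)

for condition, holds in weak_form_checks(treatment).items():
    print(f"{condition}: {holds}")
```

With these assumed values the first condition holds and the other two fail, which matches the two problems noted for the treatment activity.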

In our treatment example, a weak idealized form would lead to the same type of tailored tactics that experience has led to in actual practice:
  • lower the maximum harm - by employing counter-agents for the most likely fatality-inducing aspects of treatment
  • increase minimum benefit - by supplementing the primary treatment with less beneficial but more proven strategies that alleviate rather than cure

Depending on circumstances, we also have the option of assessment over time which offers an additional tactic:
  • stop harm when utility isn't realized - by monitoring treatment to make sure there are signs that it is working as expected, before continuing to expose the patient to lethal treatment

We have a similar Benefit-Harm Pairing shape with the factory worker. One of the indicated general approaches is to bring the minimum benefit up closer to potential harm. In practice, this could mean providing guaranteed lifetime benefits to compensate for work-related injuries.

The next three scenarios are different from the prior two but are similar to each other in basic shape. Picking the vacation example, the general approach indicated is to lower the maximum harm. A sample solution is splitting the contents of the wallet among multiple pockets and, if possible, a hotel safe.
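
The way a pairing's shape points to a general approach can be sketched as a small rule set. This is a rough illustration under stated assumptions, not a prescribed method. The function name `action_indicators`, the `tolerance` margin, and all of the numbers are hypothetical, and the "reconsider the activity" rule is my own reading of the case where expected harm outweighs expected benefit.

```python
def action_indicators(harm_expected, harm_max, benefit_min, benefit_expected,
                      tolerance=1.0):
    """Return the general approaches suggested by a pairing's shape.

    All values share one assumed scale; `tolerance` stands in for "close to".
    """
    indicators = []
    if benefit_expected <= harm_expected:
        # Assumption: expected harm at or above expected benefit raises
        # the question of whether to continue the activity at all.
        indicators.append("reconsider the activity")
    if harm_max > benefit_expected + tolerance:
        indicators.append("lower the maximum harm")
    if benefit_min < harm_expected - tolerance:
        indicators.append("increase the minimum benefit")
    return indicators


# Hypothetical values: the factory worker resembles the treatment shape,
# while the vacation has a comfortable minimum benefit but a high maximum harm.
factory = action_indicators(harm_expected=3, harm_max=9,
                            benefit_min=1, benefit_expected=6)
vacation = action_indicators(harm_expected=1, harm_max=9,
                             benefit_min=5, benefit_expected=7)

print("factory:", factory)
print("vacation:", vacation)
```

With these assumed values the factory worker is flagged for both approaches, and the vacation only for lowering the maximum harm.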

Benefit-Harm Pairing addresses a number of things that the traditional risk matrix does not. It:

  • is based on activities rather than events
  • provides benefit information as a context for harm information
  • represents expected values and possible ranges simultaneously
  • better addresses the goal of surfacing risk appetite and risk tolerance
  • more accurately reflects real-world prioritization and resource allocation
  • is closer to narrative, which is how people naturally think about risk
  • can be used for individual activities, sets of activities, or options for activities
  • addresses black swan events and nuisance events with equal effectiveness
  • allows for any kind of risk to be more easily integrated

I suggest that while a risk matrix does give a view of risk, it isn't a particularly useful view of risk. Risk matrix conversations tend to focus on the correct values of specific likelihoods and impacts, often as a proxy for benefits, desires, and other data outside the frame. People intuitively know important data isn't in the model. Benefit-Harm Pairing drives the conversation closer to the heart of the matter: are the benefits of this action worth the harms? Where could we focus resources and make adjustments to improve the relative benefit-harm outcomes? Should we abandon an activity? While Benefit-Harm Pairing has flaws, it seems to be more useful on the whole. In the words of George E. P. Box: "All models are wrong. Some are useful."

I look forward to your comments and feedback.

Tuesday, March 11, 2014

Big Data Privacy Risks

A big data privacy workshop was held by MIT and the White House on March 3, 2014. Many interesting topics were discussed including the privacy of medical data.

"Medical data is special, but not because privacy is more important than in other areas. It's special because progress in healthcare is too important and too urgent to wait for privacy to be solved. I'm in favor of privacy but not at the cost of avoidable pain and suffering and death. We need to find ways to make full EMR data sets available to researchers. We'll have to live with some violations of privacy, as we do today.  And as Mike [Stonebraker] said, what we need to focus on is auditing mechanisms, and finding ways to punish those that misbehave." (at 1:46:00) - John Guttag, Professor, MIT, “Clinical Data: Opportunities and Obstacles,” 03/03/2014

The italicized sentence above is a call to action for privacy and data use advocates alike. Decisions made without extensive consideration of the benefits and harms of the many options surrounding privacy and the collection, aggregation and uses of personal data would be imprudent. Further, because of the nuance of the issues, pervasive myths, and general lack of familiarity with risks (especially among big data practitioners and even among privacy practitioners), it can't be a dialog that is both short-lived and legitimate.

In the six-hour video, the arguments for not addressing privacy take the forms exemplified by the quote above: general pleas ("too important and too urgent"), fear-mongering ("pain and suffering and death"), and dubious assurances ("punish those that misbehave"). This is not a solid foundation for making risk decisions. A valid and salient argument for the wholesale collection and analysis of data, healthcare or otherwise, is never constructed. No one provided specific or measurable benefits.

I would like to see an approach or framework for calculating privacy-related risks and benefits that could be applied in these situations. The framework should not be domain-specific and, if rigorously constructed, could be applied to population health, national defense, consumer marketing, and all other privacy domains with equal effectiveness. Such a framework would allow individual knowledge and experiences to be included in collective discussion and analysis, and allow for grounded debate about the outcomes. At the very least, the framework would provide a basis for more meaningful dialog. It would accelerate the ability of researchers, the government and the public to conduct a more informed and nuanced risk analysis.

Principles which can be used for such a universal framework have already been developed or are being updated by international organizations:
  • OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data
  • EU Data Protection Directive
  • APEC Privacy Framework

The development of a risk analysis should start by incorporating these various privacy principles into a framework. How it would then be completed and which assessment methodologies would be appropriate would have to be determined, but the discussion should begin in earnest. It's time for less hand-waving and something more substantive.