Humanistic Data Visualization

Yesterday, I participated in “Visualizing Text as Data,” the inaugural discussion series from Northeastern’s NULab for Text, Maps, and Networks. We discussed “Data Visualization in Sociology,” by Kieran Healy and James Moody, and “Humanities Approaches to Graphical Display,” by Johanna Drucker, though most of the conversation focused on the piece by Drucker.

Drucker writes:

…Graphical tools are a kind of intellectual Trojan horse, a vehicle through which assumptions about what constitutes information swarm with potent force. These assumptions are cloaked in a rhetoric taken wholesale from the techniques of the empirical sciences that conceals their epistemological biases under a guise of familiarity. So naturalized are the Google maps and bar charts generated from spread sheets that they pass as unquestioned representations of “what is.”

Data visualizations – just like statistical techniques – are an interpretation of the data, not a realization of the data. In the statistical world, there are known problematic techniques such as p-hacking, where you find something significant only because you tried so many things that something had to come out significant by chance. This is part of the art of data analysis – data fundamentally needs to be interpreted, but we should always be clear on what we’re interpreting, what assumptions we’re making in that interpretation, and what biases go into that interpretation.
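To make the p-hacking point concrete, here is a minimal sketch in Python. It is an illustration under my own assumptions, not anything from Drucker's article: the function names are hypothetical, the cutoff of 0.05 and the choice of 20 tests are arbitrary, and the "test" is a simple z-test on pure noise with known variance. The idea is just that if you run enough tests on data with no real effect, the chance that at least one comes out "significant" is large.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

ALPHA = 0.05  # conventional significance cutoff (an illustrative choice)


def familywise_error_rate(n_tests, alpha=ALPHA):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def null_p_value(rng, n=30):
    """Two-sided z-test p-value for a sample of pure noise (true mean is 0)."""
    sample = [rng.gauss(0, 1) for _ in range(n)]
    z = mean(sample) * sqrt(n)  # standard deviation is known to be 1
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_tests=20, n_studies=2000, seed=0):
    """Fraction of simulated studies with at least one 'significant' result."""
    rng = random.Random(seed)
    hits = sum(
        any(null_p_value(rng) < ALPHA for _ in range(n_tests))
        for _ in range(n_studies)
    )
    return hits / n_studies


if __name__ == "__main__":
    print("analytic: ", round(familywise_error_rate(20), 3))
    print("simulated:", round(simulate(), 3))
```

With 20 tests, the analytic rate is 1 - 0.95^20, or roughly 64%, even though nothing real is going on in any of the simulated data.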

Using a humanist lens, Drucker seems to apply a similar argument to visualizations. We are too accustomed to taking a visual representation of data as a ground Truth of what that data can tell us, and too unaccustomed to thinking of visualization as an interpretation.

That’s not to say that visualization has no purpose, or that the fact that visualizations are interpretation is irreparably problematic.

There’s a great classic example from Francis Anscombe – Anscombe’s quartet, as it’s appropriately called: four data sets which appear nearly identical in their basic statistical properties, but which are obviously different when visualized.
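For anyone who wants to check the quartet for themselves, here is a small Python sketch that computes the summary statistics. The numbers are the values from Anscombe's 1973 data sets; the helper names (`pearson_r`, `summarize`) are just my own, and the choice of which statistics to report is an assumption about what "basic statistical properties" should include.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet: sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Means, sample variances, and correlation for one data set."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four rows agree to about two decimal places, which is exactly why the summary table alone is misleading. Plot the points and you see a noisy line, a curve, a line with one outlier, and a vertical stack with one outlier.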

I don’t think that Drucker wants to throw visualization out altogether. I read her article as a provocation – a reminder that visualizations, too, are interpretations of data.

Arguably, this reminder is even more important when we’re talking about visualizations rather than narrative or statistical descriptions. Those latter modes almost inherently force a user to engage – to think about what they’re reading and what it means – though there’s still plenty of misleading interpretation in the statistical world.

The real concern – and the one Drucker highlights so poignantly – is that we accept visualizations without question; we don’t spend enough time thinking about what boundaries a visualization should push.

In many ways this makes sense – we expect a visualization to be quickly and easily interpretable. But we are at risk of letting our biases run wild if we don’t question this. It may be easy for someone to interpret gender in a visualization if pink indicates women and blue indicates men.

But please, please, don’t use this color scheme to encode gender. It may be interpretable, but it carries with it too much baggage of social norms. Far better to shake things up a bit.

Drucker pushes this argument to the extreme. Changing the gender color scheme is a relatively minor act of subversion; what happens if you take this questioning further and make the user really work to understand the data?

This argument reminds me of the work of Elizabeth Peabody – who created intricate mural charts which could only be understood with a significant amount of time and energy. These visualizations were not “user friendly,” but at a time when women had few rights, they pushed the boundary of who gets to create knowledge.

This also reminds me of the arguments of Bent Flyvbjerg, who argues that social science should stop trying so hard to be computational and should instead focus on phronesis – emphasizing a humanities, rather than computational, approach.

I’m not sure the two approaches are as mutually exclusive as Flyvbjerg fears, but his argument, like Drucker’s, raises a crucial point: it is not enough to ask “what is,” it is not enough to take computation as ground truth and – in terms of visualization – to take what is easy as what is good.

Regardless of field, we should be hesitant to put humanistic concerns aside, to think that facts can stand isolated from values. Values matter. Our assumptions and interpretations matter, and it may not always be most appropriate to try to bury our biases and try to pretend that they don’t exist.

Rather, we should bring them to the fore and examine them critically. Instead of asking “do I have any biases?” perhaps we’d do better to ask ourselves, “do I have Good biases?”

