Text as Data Conference

At the end of last week, I had the pleasure of attending the eighth annual conference on New Directions in Analyzing Text as Data, hosted by Princeton University and organized by Will Lowe, John Londregan, Marc Ratkovic, and Brandon Stewart.

The conference had a truly excellent program, and was packed with great content on a wide variety of text analysis challenges.

There were a number of papers on topic modeling, including work from my colleague Ryan Gallagher on Anchored correlation explanation: Topic modeling with minimal domain knowledge, a really cool, information-theoretic approach to topic modeling.

Luke Miratrix also presented joint work with Angela Fan and Finale Doshi-Velez on Prior matters: simple and general methods for evaluating and improving topic quality in topic modeling, an approach which aims to improve upon standard LDA by using priors to promote informative words.

I also really enjoyed Hanna Wallach’s presentation on A network model for dynamic textual communications with application to government email corpora, which introduces the Interaction-Partition Topic Model (IPTM), a model that combines elements of LDA with ERGMs.

There were also a number of talks reflecting and improving upon the ways in which we approach the methodological challenges of textual data.

Laura Nelson argued for a process of computational grounded theory, in which textual analysis helps guide and direct deep reading, but in which the researcher stays intimately familiar with her corpus.

Justin Grimmer presented the great paper How to make causal inferences using texts, which lays out a conceptual framework for causal inference with text.

For my own research, Will Hobbs might get the prize for method I’d most like to use, with his paper on Latent dimensions of attitudes on the Affordable Care Act: An application of stabilized text scaling to open-ended survey responses. He presents a very clever method for scaling common and uncommon words in order to extract latent dimensions from short text. It’s really cool.

And, of course, Nick Beauchamp presented work done jointly with me and Peter Levine on mapping conceptual networks. In this work, we present and validate a model for measuring the conceptual network an individual uses when reasoning. In these networks, nodes are concepts and edges represent the connections between those concepts. More on this in future posts, I’m sure.

Finally, the session titles were the absolute best. See, for example:

  • How Does This Open-Ended Question Make You Feel?
  • Fake Pews! (a session on religiosity)
  • America’s Next Top(ic) Model
  • Fwd: Fw: RE: You have to see this paper!

Well played, well played.

Many thanks to all the conference organizers for a truly engaging and informative couple of days.

