While last semester I looked at gender representation in comic books by analyzing a network of superheroes, this semester I’m taking my research down a different path.
Through my Ph.D. I ultimately hope to develop quantitative methods for describing and measuring the quality of political and civic deliberation.
To that end, I’ll be looking at data from a popular political blog aimed at providing a space for political conversation. I have scraped this website’s entire corpus: nearly 30,000 posts from 2004 through the present, along with their comments, written by 4,435 unique users.
From this, I plan to build a network of interactions – who comments on whose posts? Who recommends whose posts? Are there sub-communities within this larger online community?
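As a rough sketch of what building that interaction network could look like, here is a minimal Python example that counts directed "who comments on whose posts" edges. The record format (pairs of commenter and post author) and the user names are hypothetical placeholders, not the actual scraped data, and the real analysis may well use a dedicated graph library instead.

```python
from collections import Counter

# Hypothetical records: one (commenter, post_author) pair per comment.
comments = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
    ("dave", "dave"),  # a user commenting on their own post
]


def build_edges(pairs):
    """Count directed commenter -> author interactions, ignoring self-loops."""
    return Counter((src, dst) for src, dst in pairs if src != dst)


edges = build_edges(comments)

# Weighted in-degree: how many comments each author receives from others.
received = Counter()
for (src, dst), weight in edges.items():
    received[dst] += weight

for (src, dst), weight in sorted(edges.items()):
    print(f"{src} -> {dst}: {weight}")
```

The same edge-counting idea would apply to recommendations. Finding sub-communities would need a community detection method on top of these weighted edges, which this sketch does not attempt.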
Additionally, as I build my skill set in Natural Language Processing, I hope to do some basic text analysis on the content of posts and comments. I’ll look for variation in word choice between communities and compare the content of different types of posts. For example, are there keywords that would predict how many comments a post will get?
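One simple way to start on the word-choice question might be to compare relative word frequencies between two groups of posts. The sketch below assumes the text has already been split by community; the sample sentences, the crude tokenizer, and the function names are all made up for illustration, and a real analysis would likely want stop-word handling and a more careful statistic.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_freqs(docs):
    """Map each word to its share of all tokens in a group of documents."""
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def distinctive_words(docs_a, docs_b, top_n=3):
    """Words most over-represented in docs_a relative to docs_b."""
    freq_a, freq_b = relative_freqs(docs_a), relative_freqs(docs_b)
    diffs = {
        word: freq_a.get(word, 0.0) - freq_b.get(word, 0.0)
        for word in set(freq_a) | set(freq_b)
    }
    # Sort by largest difference first, breaking ties alphabetically.
    return sorted(diffs, key=lambda word: (-diffs[word], word))[:top_n]


# Hypothetical posts from two communities.
community_a = ["the budget vote is tomorrow", "budget cuts hurt schools"]
community_b = ["the school board vote is tomorrow", "transit funding hurt riders"]

print(distinctive_words(community_a, community_b))
```

Predicting comment counts from keywords would be a separate modeling step. This only surfaces candidate words worth a closer look.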
No doubt more questions will come up along the way, but as I dive into this data, these are some of the questions I’m thinking about.