I had the opportunity today to hear from Chenhao Tan, a Ph.D. Candidate in Computer Science at Cornell University who is looking at the dynamics of online social interactions.
In particular, Tan has done a great deal of work around predicting retweet rates for Twitter messages. That is, given two tweets by the same author on the same topic, can you predict which one will be retweeted more?
Interestingly, such pairs of tweets occur naturally and frequently on Twitter. For one 2014 study, Tan was able to identify 11,000 pairs of author- and topic-controlled tweets with different retweet rates.
By comparing the words used, along with a number of custom features such as the “informativeness” of a given tweet, Tan was able to build a computational model which could correctly identify which tweet in a pair was more popular.
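To make the idea of a pairwise comparison concrete, here is a minimal sketch in Python. It is not Tan's model: the perceptron update, the single "length" feature standing in for the custom features, and the toy tweet pairs are all assumptions made up for illustration.

```python
from collections import Counter

# Invented toy data: each pair is (more_retweeted, less_retweeted).
TOY_PAIRS = [
    ("please retweet our new paper on language", "new paper"),
    ("please share this study of online communities", "study out now"),
]

def featurize(text):
    """Bag-of-words counts plus one hypothetical custom feature (word count)."""
    words = text.lower().split()
    feats = Counter(words)
    feats["__length__"] = len(words)
    return feats

def diff_features(tweet_a, tweet_b):
    """Features of tweet_a minus features of tweet_b."""
    fa, fb = featurize(tweet_a), featurize(tweet_b)
    return {k: fa.get(k, 0) - fb.get(k, 0) for k in set(fa) | set(fb)}

def score(weights, feats):
    """Weighted sum of the feature differences."""
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())

def train(pairs, epochs=10):
    """Simple perceptron: adjust weights whenever a pair is ranked wrongly."""
    weights = {}
    for _ in range(epochs):
        for winner, loser in pairs:
            feats = diff_features(winner, loser)
            if score(weights, feats) <= 0:  # winner not ranked above loser
                for k, v in feats.items():
                    weights[k] = weights.get(k, 0.0) + v
    return weights

def predict_first_wins(weights, tweet_a, tweet_b):
    """True if the model expects tweet_a to be retweeted more than tweet_b."""
    return score(weights, diff_features(tweet_a, tweet_b)) > 0
```

Calling `train(TOY_PAIRS)` and then `predict_first_wins(weights, a, b)` on two drafts of a tweet gives a yes-or-no guess. Because both tweets in a pair share an author and topic, only the wording differs, which is why the features are taken as a difference between the two texts.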
He even created a fun tool that lets you enter your own tweet text and compare which version is more likely to be retweeted.
From all this Twitter data, Tan was also able to compare the language of “successful” tweets to that of tweets drawn from Twitter as a whole, as well as to see how these tweets fit a given poster’s usual tone.
Interestingly, Tan found that the best strategy is to “be like the community, be like yourself.” That is, the most successful tweets were not notably divergent from Twitter norms and tended to be in line with the personal style of the original poster.
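One way to put a number on "being like" a body of text is to score a tweet under a simple language model built from that text. The sketch below assumes a unigram model with add-one smoothing; the function names and the toy community sample are invented for illustration, and this is not necessarily how Tan measured it.

```python
import math
from collections import Counter

# Invented stand-in for a sample of community (or one author's) tweets.
TOY_COMMUNITY = ["the talk was great", "great talk today"]

def train_unigram(texts):
    """Count word frequencies across a collection of texts."""
    counts = Counter(w for t in texts for w in t.lower().split())
    return counts, sum(counts.values())

def avg_log_prob(model, text):
    """Average per-word log-probability of text, with add-one smoothing."""
    counts, total = model
    vocab = len(counts) + 1  # one extra slot for unseen words
    words = text.lower().split()
    if not words:
        return float("-inf")  # an empty tweet has no words to score
    log_probs = [math.log((counts[w] + 1) / (total + vocab)) for w in words]
    return sum(log_probs) / len(words)
```

A higher score means the tweet's wording looks more like the reference text. Training one model on a broad sample of tweets and another on a single author's history would give two scores per tweet, one for "like the community" and one for "like yourself."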
Tan interpreted this as a positive finding, indicating that a user doesn’t need to do something special in order to “stand out.” But such a result could also point to Twitter as an insular community, unable to amplify messages which don’t fit the dominant norm.
And this leads to one of Tan’s broader research questions. Studies like his work around Twitter look at micro-level data, examining words and exploring how individuals’ minds are changed. But, as Tan pointed out, the work of studying online communities can also be explored from a broader, macro level: what do healthy online environments look like, and how are they maintained?
There is more work to be done on both of these questions, but Tan’s work is an intriguing start.