While social media might seem like a confusing tangle, there are actually discernible structures underlying and ordering online communities like Twitter, according to a new study by professors from Royal Holloway University and Princeton University.
The article, by John Bryden, Sebastian Funk, and Vincent Jansen, is titled “Word usage mirrors community structure in the online social network Twitter” and was published in EPJ Data Science. It describes how Twitter users group themselves (consciously or unconsciously) into “tribes” that are identifiable by shared language patterns, and whose members also tend to share vocations, political opinions, and hobbies.
The researchers tracked 75 million public messages sent on Twitter by around 250,000 users, and used this data to determine which individuals were more likely to talk to each other. They then correlated these relationships with linguistic patterns to identify a wide range of sub-communities, which they term “tribes.”
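To make the first step concrete, here is a minimal Python sketch of how "who talks to whom" could be turned into groups. It is not the authors' algorithm: it assumes messages are available as (sender, recipient) pairs, uses a hypothetical `min_messages` threshold, and treats plain connected components as a crude stand-in for real community detection. The sample data is invented for illustration.

```python
from collections import Counter, defaultdict

def interaction_counts(messages):
    """Count messages exchanged between each unordered pair of users."""
    counts = Counter()
    for sender, recipient in messages:
        if sender != recipient:
            counts[frozenset((sender, recipient))] += 1
    return counts

def communities(messages, min_messages=2):
    """Group users linked by at least `min_messages` exchanges.

    Assumption: connected components stand in for a proper
    community-detection method.
    """
    graph = defaultdict(set)
    for pair, n in interaction_counts(messages).items():
        if n >= min_messages:
            a, b = tuple(pair)
            graph[a].add(b)
            graph[b].add(a)
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk collecting everyone reachable from `start`.
        stack, group = [start], set()
        while stack:
            user = stack.pop()
            if user in group:
                continue
            group.add(user)
            stack.extend(graph[user] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

# Invented (sender, recipient) pairs, for illustration only.
sample = [
    ("ann", "bob"), ("bob", "ann"), ("bob", "cat"), ("cat", "bob"),
    ("dev", "eli"), ("eli", "dev"), ("ann", "dev"),
]
print(communities(sample))
```

In this toy data the single message between "ann" and "dev" falls below the threshold, so the two groups stay separate.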
Unsurprisingly, one such community was interested in (perhaps “obsessed with” would be more accurate) Justin Bieber, and its members displayed their own distinctive verbal tics. Jansen noted: “Interestingly, just as people have varying regional accents, we also found that communities would misspell words in different ways. The Justin Bieber fans have a habit of ending words in ‘ee’, as in ‘pleasee’.”
The Twitter language tribes can be quite tribal, in the sense of being tightly focused on their own parochial interests. According to Funk, another group, which the researchers nicknamed “anipals,” “was interested in hosting parties to raise funds for animal welfare, while another was a fascinating growing community interested in the concept of gratitude.”
The model also has predictive power, according to the authors, who write: “The words used by an individual user, in turn, can be used to predict the community of which that user is a member.” They report roughly 80% accuracy in predicting which community a Twitter user belongs to based on linguistic analysis.
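The idea of predicting a community from word usage can be sketched in a few lines of Python. This is an illustrative stand-in, not the method used in the study: it assumes a simple word-frequency (naive Bayes style) classifier with add-one smoothing and equal prior weight for every community. The function names and the training messages are invented for the example; only the "pleasee" spelling and the "anipals" nickname come from the study as reported above.

```python
import math
from collections import Counter

def train(labelled_messages):
    """Count word frequencies per community from (community, text) pairs."""
    word_counts = {}
    for community, text in labelled_messages:
        word_counts.setdefault(community, Counter()).update(text.lower().split())
    return word_counts

def predict(word_counts, text):
    """Return the community whose word frequencies best match `text`."""
    vocab = {w for counts in word_counts.values() for w in counts}
    best, best_score = None, -math.inf
    for community, counts in sorted(word_counts.items()):
        total = sum(counts.values())
        # Add-one smoothing so unseen words do not zero out the score.
        score = sum(
            math.log((counts[w] + 1) / (total + len(vocab)))
            for w in text.lower().split()
        )
        if score > best_score:
            best, best_score = community, score
    return best

# Invented training messages, for illustration only.
training = [
    ("bieber_fans", "pleasee follow me justin"),
    ("bieber_fans", "love youu justin pleasee"),
    ("anipals", "pawty tonight to raise funds for the shelter"),
    ("anipals", "anipals raise funds for animal welfare"),
]
model = train(training)
print(predict(model, "justin pleasee notice me"))
print(predict(model, "raise funds for the pawty"))
```

With so little data the result says nothing about real accuracy. The point is only that distinctive spellings and vocabulary push a message toward one group's word profile.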