Diffusion of Lexical Change in Social Media
Authors
Abstract
Computer-mediated communication is driving fundamental changes in the nature of written language. We investigate these changes by statistical analysis of a dataset comprising 107 million Twitter messages (authored by 2.7 million unique user accounts). Using a latent vector autoregressive model to aggregate across thousands of words, we identify high-level patterns in diffusion of linguistic change over the United States. Our model is robust to unpredictable changes in Twitter's sampling rate, and provides a probabilistic characterization of the relationship of macro-scale linguistic influence to a set of demographic and geographic predictors. The results of this analysis offer support for prior arguments that focus on geographical proximity and population size. However, demographic similarity - especially with regard to race - plays an even more central role, as cities with similar racial demographics are far more likely to share linguistic influence. Rather than moving towards a single unified "netspeak" dialect, language evolution in computer-mediated communication reproduces existing fault lines in spoken American English.
Similar resources
PLOS ONE: Diffusion of Lexical Change in Social Media
Open-access, peer-reviewed research article. Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, Eric P. Xing
Identifying regional dialects in online social media
Electronic social media offers new opportunities for informal communication in written language, while at the same time, providing new datasets that allow researchers to document dialect variation from records of natural communication among millions of individuals. The unprecedented scale of this data enables the application of quantitative methods to automatically discover the lexical variable...
Mapping the geographical diffusion of new words
Language in social media is rich with linguistic innovations, most strikingly in the new words and spellings that constantly enter the lexicon. Despite assertions about the power of social media to connect people across the world, we find that many of these neologisms are restricted to geographically compact areas. Even for words that become ubiquitous, their growth in popularity is often geog...
A Knowledge Management Approach to Discovering Influential Users in Social Media
A key step in a marketer's success is discovering influential users who diffuse information and whose followers are interested in that information and spread it further on social media. They can reduce the cost of advertising, increase sales, and maximize the diffusion of information. A key problem is how to precisely identify the most influential users on social networks. In this pape...
Construction of the Gmane corpus for examining the diffusion of lexical innovations
Large-scale linguistic corpora, complete with information about speakers’ social networks as well as demographic and temporal information, allow for empirical validation of complex theories about the social interactions and linguistic properties leading to large-scale language change. We present ongoing work on the diffusion of lexical innovations using a corpus we have compiled from the Gmane ...
Competitive dynamics of lexical innovations in multi-layer networks
We study the introduction of lexical innovations into a community of language users. Lexical innovations, i.e., new terms added to people’s vocabulary, play an important role in the process of language evolution. Nowadays, information is spread through a variety of networks, including, among others, online and offline social networks and the World Wide Web. The entire system, comprising network...