Social Media Writing Style Fingerprint

Authors

  • Himank Yadav
  • Juliang Li
Abstract

We present our approach for computer-aided social media text authorship attribution based on recent advances in short text authorship verification. We use various natural language techniques to create word-level and character-level models that act as hidden layers to simulate a simple neural network. The choice of word-level and character-level models in each layer was informed through validation performance. The output layer of our system uses an unweighted majority vote vector to arrive at a conclusion. We also considered writing bias in social media posts while collecting our training dataset to increase system robustness. Our system achieved a precision, recall, and F-measure of 0.82, 0.926, and 0.869, respectively.
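The abstract describes an output layer that takes an unweighted majority vote and reports an F-measure alongside precision and recall. The following is a minimal sketch of those two ideas only, not the authors' implementation: the function names `majority_vote` and `f_measure` are hypothetical, each model is assumed to emit a boolean "same author" decision, ties are assumed to resolve to a negative decision, and the F-measure is assumed to be the balanced F1 score.

```python
from typing import Sequence


def majority_vote(votes: Sequence[bool]) -> bool:
    """Unweighted majority vote over per-model decisions.

    Assumption: each vote is True when a model attributes the text to the
    candidate author. A tie counts as False (an illustrative choice, not
    something the abstract specifies).
    """
    if not votes:
        raise ValueError("need at least one vote")
    return sum(votes) * 2 > len(votes)


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical decisions from three word-level or character-level models.
    print(majority_vote([True, True, False]))
    # Precision and recall as reported in the abstract.
    print(round(f_measure(0.82, 0.926), 3))
```

Under the F1 assumption, the reported precision and recall give a value of roughly 0.87, which agrees with the reported 0.869 up to rounding of the inputs.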

Similar resources

Exploring Stylistic Variation with Age and Income on Twitter

Writing style allows NLP tools to adjust to the traits of an author. In this paper, we explore the relation between stylistic and syntactic features and authors’ age and income. We confirm our hypothesis that for numerous feature types writing style is predictive of income even beyond age. We analyze the predictive power of writing style features in a regression task on two data sets of around ...

Age Prediction in Blogs: A Study of Style, Content, and Online Behavior in Pre- and Post-Social Media Generations

We investigate whether wording, stylistic choices, and online behavior can be used to predict the age category of blog authors. Our hypothesis is that significant changes in writing style distinguish pre-social media bloggers from post-social media bloggers. Through experimentation with a range of years, we found that the birth dates of students in college at the time when social media such as ...

Comparing writing style feature-based classification methods for estimating user reputations in social media

In recent years, the anonymous nature of the Internet has made it difficult to detect manipulated user reputations in social media, as well as to ensure the qualities of users and their posts. To deal with this, this study designs and examines an automatic approach that adopts writing style features to estimate user reputations in social media. Under varying ways of defining Good and Bad classe...

Social Media Writing and Social Class: A Correlational Analysis of Adolescent CMC and Social Background

In a large social media corpus (2.9 million tokens), we analyze Flemish adolescents’ non-standard writing practices and look for correlations with the teenagers’ social class. Three different aspects of adolescents’ social background are included: educational track, parental profession, and home language. Since the data reveal that these parameters are highly correlated, we combine them into on...

On the Identification of Emotions and Authors' Gender in Facebook Comments on the Basis of their Writing Style

In this paper, we propose a method for automatically identifying emotions in written texts on widely used social media such as Facebook. For that task we try to model the way people use language to express themselves, and we also use this model to identify the gender of the authors. We focused on Spanish due to the lack of studies and resources in that language.

Journal:
  • CoRR

Volume: abs/1712.04762

Publication date: 2017