Private or Corporate? Predicting User Types on Twitter
Authors
Abstract
In this paper we present a series of experiments on discriminating between private and corporate accounts on Twitter. We define features based on Twitter metadata, morphosyntactic tags and surface forms, and show that a simple bag-of-words model achieves the best results of any single classifier. These results can, however, be improved by building a weighted soft ensemble of classifiers, one per feature type. Investigating the time and language dependence of each feature type yields rather unexpected results: features based on metadata are neither time- nor language-insensitive, as the way the two user groups use the social network varies heavily through time and space.
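The weighted soft ensemble mentioned in the abstract can be illustrated with a minimal sketch. It assumes each feature-type classifier outputs a probability distribution over the two classes and that the ensemble takes a weighted average of those distributions. The function names, the probabilities and the weights below are hypothetical and chosen only for illustration; the abstract does not state the classifiers, weights or weighting scheme actually used.

```python
from typing import Dict, List

# Class order used throughout this sketch (an assumption, not from the paper).
LABELS = ["private", "corporate"]


def weighted_soft_vote(
    probas: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Return the weighted average of per-feature-type class probabilities."""
    total = sum(weights[name] for name in probas)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(next(iter(probas.values())))
    combined = [0.0] * n_classes
    for name, dist in probas.items():
        if len(dist) != n_classes:
            raise ValueError(f"{name}: expected {n_classes} probabilities")
        for i, p in enumerate(dist):
            # Each classifier contributes in proportion to its normalized weight.
            combined[i] += weights[name] * p / total
    return combined


def predict(probas: Dict[str, List[float]], weights: Dict[str, float]) -> str:
    """Pick the label with the highest combined probability."""
    combined = weighted_soft_vote(probas, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return LABELS[best]


# Hypothetical outputs of three classifiers for one account,
# one classifier per feature type named in the abstract.
example_probas = {
    "metadata": [0.7, 0.3],
    "morphosyntactic_tags": [0.6, 0.4],
    "bag_of_words": [0.3, 0.7],
}
# Hypothetical weights that favour the bag-of-words classifier.
example_weights = {
    "metadata": 1.0,
    "morphosyntactic_tags": 1.0,
    "bag_of_words": 3.0,
}

if __name__ == "__main__":
    print(weighted_soft_vote(example_probas, example_weights))
    print(predict(example_probas, example_weights))
```

With equal weights the two weaker classifiers would outvote the bag-of-words model in this example; giving the bag-of-words model a larger weight flips the decision. In practice such weights would typically be tuned on held-out data.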
Similar resources
Corporate Twitter Channels: The Impact of Engagement and Informedness on Corporate Reputation
This article examines firm communication on a corporate Twitter channel and its effects on corporate reputation. We identify the importance of user engagement and informedness in explaining corporate reputation, and examine three design factors that likely affect user engagement in a corporate Twitter channel. We conduct an exploratory 2 × 2 × 2 experiment among Twitter users to collect data. W...
A High-Performance Model based on Ensembles for Twitter Sentiment Classification
Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks like Twitter intensively. Twitter lets users publish tweets to say what they think about topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...
UQAM-NTL: Named entity recognition in Twitter messages
This paper describes our system used in the 2nd Workshop on Noisy User-generated Text (WNUT) shared task for Named Entity Recognition (NER) in Twitter, held in conjunction with COLING 2016. Our system is based on supervised machine learning, applying Conditional Random Fields (CRF) to train two classifiers for two different evaluations. The first evaluation aims at predicting the 10 fine-grained typ...
Multiview Deep Learning for Predicting Twitter Users' Location
The problem of predicting the location of users on large social networks like Twitter has emerged from real-life applications such as social unrest detection and online marketing. Twitter user geolocation is a difficult and active research topic with a vast literature. Most of the proposed methods follow either a content-based or a network-based approach. The former exploits user-generated cont...
Corporate Credit Risk Analysis Utilizing Textual User Generated Content - A Twitter Based Feasibility Study
Irrecoverable receivables resulting from insolvent debtors endanger a company's own liquidity. Therefore, corporate credit risk analysis should be continuously improved in order to diminish bad debt. We analyse to what extent user generated content (UGC) contains evidence concerning the financial stability of companies and hence can enhance the information base for corporate credit risk analysis. For this...