Sentiment classification of blog posts using topical extracts
Authors
Abstract
Unlike news stories and product reviews, which usually focus strongly on a single topic, blog posts are often unstructured, and the opinions they express do not necessarily correspond to a specific topic. This can lead to unsatisfactory sentiment classification performance. In this paper we report a pilot study on addressing topic drift in blogs. We examine this phenomenon by manual inspection and establish a ground truth. Our annotations show that topic drift is indeed very common: every document sampled shows a considerable degree of drift, averaging over 80%. The topical sentences are extracted from each post to produce an extract data set. We propose to address the topic drift problem by classifying blog posts using the sentence-level polarities of their topical extracts. We propose two models for aggregating the sentence polarities and evaluate them by comparing their performance to that of a popular word-based model. Our preliminary results suggest that topical extracts can provide a concise but more accurate representation of the sentiment polarity of blog posts. More importantly, sentence-level polarities are potentially more reliable evidence than word distributions for predicting document polarity.
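The abstract does not say which two aggregation models the authors use, so the following is only a minimal sketch of the general idea, under stated assumptions: sentence polarities and topical flags are taken as given, and the two aggregators shown (a majority vote and a mean of signed scores) are hypothetical stand-ins, not the paper's models. All function names are illustrative.

```python
from collections import Counter


def topical_extract(items, is_topical):
    """Keep only the items whose sentence was annotated as on-topic."""
    return [item for item, keep in zip(items, is_topical) if keep]


def classify_by_vote(sentence_labels):
    """Hypothetical aggregator 1: majority vote over 'pos'/'neg' labels.

    Ties (including empty input) are reported as 'neutral'.
    """
    counts = Counter(sentence_labels)
    if counts["pos"] > counts["neg"]:
        return "pos"
    if counts["neg"] > counts["pos"]:
        return "neg"
    return "neutral"


def classify_by_score(sentence_scores, threshold=0.0):
    """Hypothetical aggregator 2: mean of signed scores in [-1, 1]."""
    if not sentence_scores:
        return "neutral"
    mean = sum(sentence_scores) / len(sentence_scores)
    if mean > threshold:
        return "pos"
    if mean < -threshold:
        return "neg"
    return "neutral"


# Toy post: five sentences, only the third and fourth are on-topic.
labels = ["neg", "neg", "pos", "pos", "neg"]
topical = [False, False, True, True, False]

whole_post = classify_by_vote(labels)
extract_only = classify_by_vote(topical_extract(labels, topical))
print(whole_post, extract_only)
```

In this toy post the off-topic sentences outvote the on-topic ones, so the whole-post label and the extract-only label disagree, which is the kind of effect topic drift would be expected to produce.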
Similar resources
Positive, Negative, or Mixed? Mining Blogs for Opinions
The rich non-factual information on the blogosphere presents interesting research questions. In this paper, we present a study on analysis of blog posts for their sentiment by using a generic sentiment lexicon. In particular, we applied Support Vector Machine to classify blog posts into three categories of opinions: positive, negative and mixed. We investigated the performance difference betwee...
Stock Market Forecasting Techniques: Literature Survey
The goal of this paper is to study different techniques for predicting stock price movement using sentiment analysis of social media and data mining. In this paper we seek an efficient method that can predict stock movement more accurately. Social media offers a powerful outlet for people’s thoughts and feelings; it is an enormous, ever-growing source of texts ranging from everyday observations...
Sentiment-Based Ranking of Blog Posts Using Rhetorical Structure Theory
Polarity estimation in large-scale and multi-topic domains is a difficult issue. Most state-of-the-art solutions essentially rely on frequencies of sentiment-carrying words (e.g., taken from a lexicon) when analyzing the sentiment conveyed by natural language text. These approaches ignore the structural aspects of a document, which contain valuable information. Rhetorical Structure Theory (RST)...
Leveraging Textual Sentiment Analysis with Social Network Modelling: Sentiment Analysis of Political Blogs in the 2008 U.S. Presidential Election
Automatic computational analysis and categorisation of political texts with respect to the rich array of personal sentiments, opinions, stances, and political orientations expressed in polarised political discourse is an exciting task which opens up many avenues for more accurate and naturalistic large-scale political analysis. The task does however pose major challenges for state-of-the-art Se...
Micro-blogging Sentiment Analysis Using Bayesian Classification Methods
In this project I address the problem of accurately classifying the sentiment in posts from micro-blogs such as Twitter. As Twitter gains popularity, it becomes more useful to analyze trends and sentiment of its users towards various topics. Determining the general attitude of users towards a product or service, for example, can help a business measure overall consumer attitudes and customer sa...