Aspect-Oriented Sentiment Analysis of Customer Reviews Using Distant Supervision Techniques
Abstract
The opinions and experiences of other people are an important source of information in everyday life. For example, we ask our friends which dentist, restaurant, or smartphone they would recommend. Nowadays, online customer reviews have become an invaluable resource for answering such questions. Besides helping consumers make more informed purchase decisions, online reviews are also of great value to vendors, as they represent unsolicited and genuine customer feedback that is conveniently available at virtually no cost. However, popular products often have several thousand reviews, so manual analysis is not an option. In this thesis, we provide a comprehensive study of how to model and automatically analyze the opinion-rich information contained in customer reviews. In particular, we consider the task of aspect-oriented sentiment analysis. Given a collection of review texts, the goal of the task is to detect the individual product aspects that reviewers have commented on and to decide whether the comments are positive or negative. Developing text analysis systems often involves the tedious and costly work of creating appropriate resources, for instance labeling training corpora for machine learning methods or constructing special-purpose knowledge bases. As an overarching topic of the thesis, we examine the utility of distant supervision techniques for reducing the amount of human supervision required. We focus on the two main subtasks of aspect-oriented review mining: (i) identifying relevant product aspects and (ii) determining and classifying expressions of sentiment. We consider both subtasks at two levels of granularity, namely the expression level and the sentence level. For these levels of analysis, we experiment with dictionary-based and supervised approaches and examine several distant supervision techniques. For aspect detection at the expression level, we cast the task as a terminology extraction problem. At the sentence level, we cast the task as a multi-label text categorization problem and exploit section headings in review texts for a distant supervision approach. With regard to sentiment analysis, we present detailed studies of sentiment lexicon acquisition and sentiment polarity classification and show how pros and cons summaries of reviews can be exploited to reduce the manual effort in this context. We evaluate our approaches in detail, including insightful error analyses. For each of the tasks, we find significant improvements in comparison to relevant state-of-the-art methods. In general, we can show that the presented distant supervision methods successfully reduce the required amount of human supervision. Our approaches make it possible to gather very large amounts of labeled data, typically several orders of magnitude more than is possible with traditional annotation. We conclude that customer review mining systems can benefit from the proposed methods.
Keywords: sentiment analysis, customer review mining, opinion mining, aspect-oriented review mining, distant supervision, weakly labeled data, indirect crowdsourcing
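To illustrate the idea of exploiting pros and cons summaries as weak supervision, here is a minimal sketch in Python. It is not the method used in the thesis: the review layout (dictionaries with hypothetical "pros" and "cons" keys), the function names, and the toy data are all assumptions made for illustration. Each phrase listed under the pros is treated as a weakly labeled positive example and each phrase under the cons as a negative one, so no manual annotation is involved.

```python
from collections import Counter


def weak_labels_from_pros_cons(reviews):
    """Turn pros/cons summaries into weakly labeled (text, polarity) pairs.

    `reviews` is assumed to be a list of dicts with optional "pros" and
    "cons" keys, each holding a list of short phrases (hypothetical layout).
    """
    labeled = []
    for review in reviews:
        for phrase in review.get("pros", []):
            labeled.append((phrase.strip().lower(), "positive"))
        for phrase in review.get("cons", []):
            labeled.append((phrase.strip().lower(), "negative"))
    return labeled


def polarity_counts(labeled):
    """Count how often each token occurs under each weak polarity label."""
    counts = {}
    for text, polarity in labeled:
        for token in text.split():
            counts.setdefault(token, Counter())[polarity] += 1
    return counts


# Toy data, invented for this sketch.
reviews = [
    {"pros": ["Great battery life", "sharp display"], "cons": ["Weak speaker"]},
    {"pros": ["great camera"], "cons": ["weak battery life", "slow charging"]},
]

labeled = weak_labels_from_pros_cons(reviews)
counts = polarity_counts(labeled)
```

In this toy data, a word such as "great" occurs only under the positive label, while "battery" occurs under both and therefore carries no clear polarity signal. Counts of this kind could serve as a crude starting point for a sentiment lexicon or as training data for a polarity classifier, although any real system would need to handle the noise in such weak labels.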
Similar resources
Mining Interesting Aspects of a Product using Aspect-based Opinion Mining from Product Reviews (RESEARCH NOTE)
As the internet and its applications grow, e-commerce has become one of its fastest-growing applications. E-commerce customers are given the opportunity to express their opinions about products on the web as text in the form of reviews. In previous studies, merely finding the sentiment of reviews was not sufficient to capture the exact opinion expressed in a review. In this paper, we have used A...
Implicit Aspect Detection in Restaurant Reviews using Cooccurence of Words
For aspect-level sentiment analysis, the important first step is to identify the aspects and their associated entities present in customer reviews. Aspects can be either explicit or implicit, where the identification of the latter is more difficult. For restaurant reviews, this difficulty is escalated due to the vast number of entities and aspects present in reviews. The problem of implicit asp...
An Approach to Perform Aspect level Sentiment Analysis on Customer Reviews using Sentiscore Algorithm and Priority Based Classification
This paper analyses customer reviews in the restaurant domain using sentiment analysis and text mining techniques. The most integral part of our work is to assign sentiment scores to the aspects with respect to the words used. We have devised the Sentiscore algorithm to perform this function. The dataset we have at our disposal is a set of review documents obtained from an authenticated repository....
Sentiment Analysis of movie reviews using SentiWordNet Approach
In this paper, a new kind of domain-specific, feature-based heuristic for aspect-level sentiment analysis of movie reviews is presented. An unsupervised learning technique is used for sentiment classification. A SentiWordNet-based scheme is applied using two different linguistic feature selections, containing adjectives, adverbs and verbs, together with n-gram feature extraction. In aspect orient...
Aspect-Specific Ranking of Product Reviews Using Topic Modeling
We examine the problem of ranking different aspects of a product through examination of its customer reviews. For instance, a restaurant review may contain distinct and possibly differing opinions on the food, decor, service, and price. We present a ranking system that uses Latent Dirichlet Allocation (LDA) and a database of opinion-oriented words to predict the aspect-specific sentiment of ind...