Modifying SO-PMI for Japanese Weblog Opinion Mining by Using a Balancing Factor and Detecting Neutral Expressions

نویسندگان

  • Guangwei Wang
  • Kenji Araki
چکیده

We propose a variation of the SO-PMI algorithm for Japanese, for use in Weblog Opinion Mining. SO-PMI is an unsupervised approach proposed by Turney that has been shown to work well for English. We first used the SO-PMI algorithm on Japanese in a way very similar to Turney’s original idea. The result of this trial leaned heavily toward positive opinions. We then expanded the reference words to be sets of words, tried to introduce a balancing factor and to detect neutral expressions. After these modifications, we achieved a wellbalanced result: both positive and negative accuracy exceeded 70%. This shows that our proposed approach not only adapted the SO-PMI for Japanese, but also modified it to analyze Japanese opinions more effectively.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

OMS-J: An Opinion Mining System for Japanese Weblog Reviews Using a Combination of Supervised and Unsupervised Approaches

We introduce a simple opinion mining system for analyzing Japanese Weblog reviews called OMS-J. OMS-J is designed to provide an intuitive visual GUI of opinion mining graphs for a comparison of different products of the same type to help a user make a quick purchase decision. We first use an opinion mining method using a combination of supervised (a Naive Bayes Classifier) and unsupervised (an ...

متن کامل

Extracting Aspect-Evaluation and Aspect-Of Relations in Opinion Mining

The technology of opinion extraction allows users to retrieve and analyze people’s opinions scattered over Web documents. We define an opinion unit as a quadruple consisting of the opinion holder, the subject being evaluated, the part or the attribute in which the subject is evaluated, and the value of the evaluation that expresses a positive or negative assessment. We use this definition as th...

متن کامل

Building a Twitter opinion lexicon from automatically-annotated tweets

Opinion lexicons, which are lists of terms labelled by sentiment, are widely used resources to support automatic sentiment analysis of textual passages. However, existing resources of this type exhibit some limitations when applied to social media messages such as tweets (posts in Twitter), because they are unable to capture the diversity of informal expressions commonly found in this type of m...

متن کامل

Improving the performance of UPQC under unbalanced and distortional load conditions: A new control method

This paper presents a new control method for a three-phase four-wire Unified Power Quality Conditioner (UPQC) to deal with the problems of power quality under distortional and unbalanced load conditions. The proposed control approach is the combination of instantaneous power theory and Synchronous Reference Frame (SRF) theory which is optimized by using a self-tuning filter (STF) and without us...

متن کامل

Combination of Ensemble Data Mining Methods for Detecting Credit Card Fraud Transactions

As we know, credit cards speed up and make life easier for all citizens and bank customers. They can use it anytime and anyplace according to their personal needs, instantly and quickly and without hassle, without worrying about carrying a lot of cash and more security than having liquidity. Together, these factors make credit cards one of the most popular forms of online banking. This has led ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007