An Empirical Approach to Text Categorization Based on Term Weight Learning
نویسندگان
چکیده
In this paper) we propose a method for text categorizaLion task using term weight learning. In our approach, learning is to learn true keywords from the error of clustering results. Parameters of term weighting are then estimated so as to maximize the true keywords and minimize the other words in the text. The characteristic of our approach is that the degree of context dependency is used in order to judge whether a word in a text is a true keyvv·ord or not. The experiments using Wall Street Journal corpus demonstrate the effectiveness of the method.
منابع مشابه
A Document Weighted Approach for Gender and Age Prediction Based on Term Weight Measure
Author profiling is a text classification technique, which is used to predict the profiles of unknown text by analyzing their writing styles. Author profiles are the characteristics of the authors like gender, age, nativity language, country and educational background. The existing approaches for Author Profiling suffered from problems like high dimensionality of features and fail to capture th...
متن کاملImproving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA
With the explosive growth in amount of information, it is highly required to utilize tools and methods in order to search, filter and manage resources. One of the major problems in text classification relates to the high dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of features space. There are many feature selection methods. However...
متن کاملEfficient Method Based on Combination of Deep Learning Models for Sentiment Analysis of Text
People's opinions about a specific concept are considered as one of the most important textual data that are available on the web. However, finding and monitoring web pages containing these comments and extracting valuable information from them is very difficult. In this regard, developing automatic sentiment analysis systems that can extract opinions and express their intellectual process has ...
متن کاملDoes a New Simple Gaussian Weighting Approach Perform Well in Text Categorization?
A new approach to the Text Categorization problem is here presented. It is called Gaussian Weighting and it is a supervised learning algorithm that, during the training phase, estimates two very simple and easily computable statistics which are: the Presence P, how much a term / is present in a category c\ the Expressiveness E, how much / is present outside c in the rest of the domain. Once the...
متن کاملA Systematic Review of Banking Business Models with an Approach to Sustainable Development
Modern banks have shifted their function as purely administrative, economic and industrial entities into socio-political institutions that must be sensitive to the surrounding environment. This function has always been neglected. This study was conducted based on primary, secondary, and tertiary data and reviews the full text of 75 studies selected from more than 245 studies. The selected elect...
متن کامل