Text Categorization and Relational Learning

Author

  • William W. Cohen
Abstract

We evaluate the first-order learning system FOIL on a series of text categorization problems. It is shown that FOIL usually forms classifiers with lower error rates and higher rates of precision and recall with a relational encoding than with a propositional encoding. We show that FOIL's performance can be improved by relation selection, a first-order analog of feature selection. Relation selection improves FOIL's performance as measured by any of recall, precision, F-measure, or error rate. With an appropriate level of relation selection, FOIL appears to be competitive with or superior to existing propositional techniques.
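The four measures named in the abstract can be illustrated with a minimal sketch for a single binary category. The function name `evaluate`, the `beta` parameter, and the sample labels are hypothetical and chosen only for illustration; the abstract does not say which F-measure weighting the paper uses, so the balanced case (`beta = 1`) is an assumption here.

```python
def evaluate(actual, predicted, beta=1.0):
    """Compute precision, recall, F-measure, and error rate for one category.

    `actual` and `predicted` are equal-length sequences of 0/1 labels.
    `beta` weights recall against precision; beta=1 is an assumed default.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    fn = sum(1 for a, p in pairs if a and not p)      # false negatives

    # Guard against empty denominators (no positive predictions or labels).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0

    denom = beta ** 2 * precision + recall
    f_measure = (1 + beta ** 2) * precision * recall / denom if denom else 0.0

    error_rate = (fp + fn) / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "error_rate": error_rate,
    }


if __name__ == "__main__":
    # Made-up labels for eight documents; not data from the paper.
    actual = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 1, 0, 0, 0, 0]
    print(evaluate(actual, predicted))
```

Error rate counts every misclassified document, while precision and recall look only at the positive class, which is why a classifier for a rare category can have a low error rate and still score poorly on recall.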

Similar resources

Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in the amount of information, tools and methods for searching, filtering, and managing resources are badly needed. One of the major problems in text classification relates to high-dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...

Fuzzy relational thesauri in information retrieval: automatic knowledge base expansion by means of classified textual data

In our ongoing project we develop a tool which provides domain engineers with a facility to create fuzzy relational thesauri (FRT) describing subject domains. The created fuzzy relational thesauri can be used as knowledge base for an intelligent information agent when answering user queries relevant to the described domains, or for textual searching on the web. However, the manual creation of (...

Automatic Categorization of Questions for a Mathematics Education Service

This paper describes a new approach to managing a stream of questions about mathematics by integrating a text categorization framework into a relational database management system. The corpus studied is based on unstructured submissions to an ask-an-expert service in learning mathematics. The classification system has been tested using a Naïve Bayes learner built into the framework. The perform...

Relational Lasso - An Improved Method Using the Relations Among Features -

Relational lasso is a method that incorporates feature relations within machine learning. By using automatically obtained noisy relations among features, relational lasso learns an additional penalty parameter per feature, which is then incorporated in terms of a regularizer within the target optimization function. Relational lasso has been tested on three different tasks: text categorization, ...

New Approach for Data classification using Multi view graph learning Technique

Text classification approaches are gaining importance because of the accessibility of a large number of electronic documents from a variety of resources. Text categorization (also called text classification) is the task of assigning predefined categories to documents. It is the method of finding interesting regularities in large textual data, where interesting means non-trivial, hidden, previously unkno...

Publication date: 1995