A Classification Approach to Word Prediction
نویسندگان
چکیده
The eventual goal of a language model is to accurately predict the value of a missing word given its context. We present an approach to word prediction that is based on learning a representation for each word as a function of words and linguistics predicates in its context. This approach raises a few new questions that we address. First, in order to learn good word representations it is necessary to use an expressive representation of the context. We present a way that uses external knowledge to generate expressive context representations, along with a learning method capable of handling the large number of features generated this way that can, potentially, contribute to each prediction. Second, since the number of words “competing” for each prediction is large, there is a need to “focus the attention” on a smaller subset of these. We exhibit the contribution of a “focus of attention” mechanism to the performance of the word predictor. Finally, we describe a large scale experimental study in which the approach presented is shown to yield significant improvements in word prediction tasks.
منابع مشابه
Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملUsing Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملA Document Weighted Approach for Gender and Age Prediction Based on Term Weight Measure
Author profiling is a text classification technique, which is used to predict the profiles of unknown text by analyzing their writing styles. Author profiles are the characteristics of the authors like gender, age, nativity language, country and educational background. The existing approaches for Author Profiling suffered from problems like high dimensionality of features and fail to capture th...
متن کاملCustomer Retention Based on the Number of Purchase: A Data Mining Approach
Purpose: this study wants to find any relationship between the numbers of purchase and the income the customer brings to the company. The attempt is to find those customers who buy more than one life insurance policy and represent the signs of good payments at the same time by the help of data mining tools. Design/ methodology/ approach: the approach of this research is to use data mining tools...
متن کاملSemi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach using random forest as a base classifier to classify the low-back disorders (LBDs) risk associated with the industrial jobs. Semi-supervised classification approach uses unlabeled data together with the small number of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...
متن کاملContext-Based Word Prediction and Classification
This paper presents a new approach for word prediction problem. Word prediction is a natural language processing problem that tries to predict the correct word in a given context. Word completion utilities, writing aids, and language translation are among the most common applications of word prediction. In this paper, we describe a new method to predict the correct word given its context. A dat...
متن کامل