Using Filtered Second Order Co-occurrence Matrix to Improve the Traditional Co-occurrence Model
Authors
Abstract
Co-occurrence statistics are widely used to measure word similarity and relatedness, with applications in many areas of natural language processing. Our experimental results indicate that two words with zero co-occurrence statistics can still be related. In this paper, we present two algorithms, both evaluated on 80 synonym test questions from the Test of English as a Foreign Language (TOEFL) and 50 synonym test questions from a collection of tests for students of English as a Second Language (ESL). The evaluation results show that the first algorithm significantly improves the performance of co-occurrence-based applications, and that the second, an ensemble algorithm that incorporates the first, achieves the best results on the synonym questions of both tests.
Keywords: Co-occurrence, Word similarity, Word relatedness, Synonym test, PMI
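The abstract does not spell out either algorithm, so the following is only a minimal sketch of the general idea it points to, not the paper's filtered method. It assumes sentence-level co-occurrence, a plain PMI score, and a second-order similarity defined as the cosine between positive-PMI context vectors. The toy corpus and all function names are hypothetical. It illustrates how two words that never co-occur ("car" and "automobile") can still come out as related through shared contexts.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus (an assumption for illustration): "car" and "automobile"
# never appear in the same sentence but share the same contexts.
SENTENCES = [
    "the car needs fuel",
    "the automobile needs fuel",
    "the car has wheels",
    "the automobile has wheels",
    "the cat drinks milk",
]

def cooccurrence_counts(sentences):
    """Count words and unordered word pairs that share a sentence."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = set(sentence.split())
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_counts, pair_counts

def pmi(w1, w2, word_counts, pair_counts, n):
    """First-order PMI in bits; returns 0.0 when the pair never co-occurs."""
    joint = pair_counts[frozenset((w1, w2))]
    if joint == 0:
        return 0.0
    # log2( p(w1, w2) / (p(w1) * p(w2)) ) with probabilities count / n
    return math.log2(joint * n / (word_counts[w1] * word_counts[w2]))

def second_order_similarity(w1, w2, word_counts, pair_counts, n):
    """Cosine between the positive-PMI context vectors of w1 and w2."""
    context = [w for w in word_counts if w not in (w1, w2)]
    v1 = [max(pmi(w1, c, word_counts, pair_counts, n), 0.0) for c in context]
    v2 = [max(pmi(w2, c, word_counts, pair_counts, n), 0.0) for c in context]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norm if norm else 0.0

word_counts, pair_counts = cooccurrence_counts(SENTENCES)
n = len(SENTENCES)
first_order = pmi("car", "automobile", word_counts, pair_counts, n)
second_order = second_order_similarity("car", "automobile", word_counts, pair_counts, n)
print(first_order, second_order)
```

On this corpus the first-order score for the pair is zero because the words never share a sentence, while the second-order score is high because both words co-occur with the same neighbours. A real system would need a large corpus and some filtering of uninformative context words, which is presumably where the paper's filtering step comes in.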
Similar resources
Context Dependent Class Language Model based on Word Co-occurrence Matrix in LSA Framework for Speech Recognition
We address the data sparseness problem in language models (LMs). Using a class LM is one way to avoid this problem. In a class LM, infrequent words are supported by more frequent words in the same class. This paper investigates a class LM based on LSA. A word-document matrix is usually used to represent a corpus in the LSA framework. However, this matrix ignores word order in the sentence. We pr...
Dedicated Hardware for Real-Time Computation of Second-Order Statistical Features for High Resolution Images
We present a novel dedicated hardware system for the extraction of second-order statistical features from high-resolution images. The selected features are based on gray level co-occurrence matrix analysis and are angular second moment, correlation, inverse difference moment and entropy. The proposed system was evaluated using input images with resolutions that range from 512×512 to 2048×2048 p...
Image Steganalysis Based on Co-occurrence Matrix Features
In this paper, two novel steganalysis methods are presented, based on the co-occurrence matrix of an image. It is shown that features extracted from this matrix can differentiate between cover and stego images. These features include energy, entropy, contrast, inverse difference moment, maximum probability and correlation. We use SVM classification for separation of cover and stego imag...
Second-Order Statistical Texture Representation of Asphalt Pavement Distress Images Based on Local Binary Pattern in Spatial and Wavelet Domain
Assessment of pavement distresses is one of the important parts of pavement management systems to adopt the most effective road maintenance strategy. In the last decade, extensive studies have been done to develop automated systems for pavement distress processing based on machine vision techniques. One of the most important structural components of computer vision is the feature extraction met...
Adapting Image Texture Co-occurrence Analysis for Audio Texture Similarity
In this letter, we adapt a well-known technique of image texture analysis (grey-level co-occurrence matrix) to compute similarity between musical audio signals. Grey-level co-occurrence matrices estimate the joint probability of pairs of pixel values separated by a spatial displacement vector. Instead of using pixel grey-level values, we propose to use frame-based audio features, obtained either...
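As a rough illustration of the grey-level co-occurrence matrix described in this abstract, the sketch below estimates the joint probability of pixel-value pairs separated by a displacement (dx, dy) on a small quantized image. It also computes energy and contrast, two of the features named in the steganalysis abstract listed above. The function names, the toy image and the choice of standard textbook feature definitions are assumptions for illustration, not taken from any of the listed papers.

```python
def glcm(image, dx, dy, levels):
    """Normalized grey-level co-occurrence matrix for displacement (dx, dy).

    image  : list of rows, each a list of integer grey levels in [0, levels)
    result : levels x levels matrix whose entry [i][j] estimates the joint
             probability of value i at (r, c) and value j at (r + dy, c + dx)
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[v / total for v in row] for row in counts]

def energy(p):
    """Angular second moment: sum of squared probabilities."""
    return sum(v * v for row in p for v in row)

def contrast(p):
    """Sum of (i - j)^2 weighted by the joint probability p[i][j]."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))

# Toy 3x3 image with two grey levels (hypothetical data).
IMAGE = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
P = glcm(IMAGE, dx=1, dy=0, levels=2)  # horizontal neighbour pairs
print(P, energy(P), contrast(P))
```

For the audio adaptation the abstract describes, the same counting scheme would presumably run over quantized frame-based features along the time axis instead of pixel values, but the abstract is truncated before it gives those details.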