Search results for: Stop words
Number of results: 178858
This paper discusses recent research on methods for estimating configuration parameters for the Matrix Comparator used for linking unstandardized or heterogeneously standardized references. The matrix comparator computes the aggregate similarity between the tokens (words) in a pair of references. The two most critical parameters for the matrix comparator for obtaining the best linking results a...
With the availability of a vast collection of research articles on the internet, textual analysis is an increasingly important technique in scientometric analysis. While the context in which it is used and the specific algorithms implemented may vary, any textual analysis exercise typically involves intensive pre-processing of the input text, which includes removing topically uninteresting terms (stop wor...
Over the past decades systems for automatic management of electronic documents have been one of the main fields of research. Text processing is a wide area that includes many important disciplines. In the processes of organizing unstructured text in order to implement a mining technique, preprocessing has to be applied. One of the most important preprocessing techniques is the removal of functi...
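The stop-word removal step that this abstract names as a preprocessing technique can be sketched as follows. This is a minimal illustration, not the method of any paper listed here: the `STOP_WORDS` set and the `remove_stop_words` function are hypothetical names, and the stop list is deliberately tiny.

```python
import re

# Hypothetical, tiny stop list for illustration only; real lists are much longer.
STOP_WORDS = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in stop_words]


print(remove_stop_words("The removal of function words is a preprocessing step."))
```

A frozen set is used so that each membership check is a constant-time lookup, which matters once the stop list grows to a few hundred entries.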
A recently proposed adaptive strategy for text recognition uses the linguistic fact that over half of the words on a typical English page are among 150 common stop words. The small lexicon permits word-shape based recognition that yields word identities from which character prototypes can be extracted. This paper describes a fast procedure for locating the best candidates for those stop words. Th...
The effectiveness of three stop-word lists for Arabic information retrieval (General Stoplist, Corpus-Based Stoplist, and Combined Stoplist) was investigated in this study. Three popular weighting schemes were examined: the inverse document frequency weight, probabilistic weighting, and statistical language modelling. The idea is to combine the statistical approaches with linguistic approaches ...
Spoken language usually precedes language represented in writing. Children know how to speak and listen years before they learn to read and write. The history of language is estimated to be on the order of hundreds of thousands of years, the history of writing on the order of thousands of years. There are many language communities without writing, but only in the case of dead languages such as ...
This paper addresses the problem of identifying collection dependent stop-words in order to reduce the size of inverted files. We present four methods to automatically recognise stop-words, analyse the tradeoff between efficiency and effectiveness, and compare them with a previous pruning approach. The experiments allow us to conclude that in some situations stop-words pruning is competitive wi...
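One simple way to flag collection-dependent stop words, as the abstract above discusses, is to threshold on document frequency. The sketch below is an assumption for illustration and is not one of the four methods the paper presents; the `frequent_terms` function, the `min_doc_ratio` parameter, and the sample documents are all hypothetical.

```python
from collections import Counter


def frequent_terms(documents, min_doc_ratio=0.8):
    """Return terms that occur in at least min_doc_ratio of the documents."""
    doc_freq = Counter()
    for doc in documents:
        # Count each term once per document (document frequency, not term frequency).
        doc_freq.update(set(doc.lower().split()))
    threshold = min_doc_ratio * len(documents)
    return {term for term, df in doc_freq.items() if df >= threshold}


docs = [
    "the index stores the postings",
    "the query scans the index",
    "the pruning reduces the postings",
]
print(sorted(frequent_terms(docs)))
```

Terms flagged this way could then be left out of the inverted file. Lowering `min_doc_ratio` prunes more terms and shrinks the index further, at the risk of dropping terms that still carry meaning for retrieval.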
Stop words are very important for information retrieval and text analysis. This study aimed to automatically analyze and detect stop words in Uzbek-language texts. Because of the limited availability of methods for automatic stop-word search, we used a newly prepared corpus. The Uzbek language belongs to the family of agglutinative languages. As with all agglutinative languages, stop-word detection is a more complex process than in inflected l...