نتایج جستجو برای: text length
تعداد نتایج: 467834 فیلتر نتایج به سال:
The paper proposes a new method of linear text segmentation based on lexical cohesion of a text. Namely, first a single chain of disambiguated words in a text is established, then the rips of this single chain are considered as boundaries for the segments of the cohesion text structure (Cohesion TextTiling or CTT). The summaries of arbitrarily length are obtained by extraction using three diffe...
We argue that the advent of large volumes (of fulllength text, as opposed to short texts like abstracts and newswire, should be accompanied by corresponding new approaches to information access. Towamd this end, we discuss the merits of imposing structure on fulllength text documents; that is, a partition of t’he text into coherent multi-paragraph units that represent the pattern of subtopics t...
In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...
We prove an inequality for the number of periods in a word [Formula: see text] terms length and its initial critical exponent. Next, we characterize all length-[Formula: prefix characteristic Sturmian lazy Ostrowski representation text], use this result to show that our is tight infinitely many words text]. propose two related measures periodicity infinite words. Finally, also consider special ...
Data used for training statistical machine translation method are usually prepared from three resources: parallel, non-parallel and comparable text corpora. Parallel corpora are an ideal resource for translation but due to lack of these kinds of texts, non-parallel and comparable corpora are used either for parallel text extraction. Most of existing methods for exploiting comparable corpora loo...
We present an index that stores a text of length n such that given a pattern of length m, all the substrings of the text that are within Hamming distance (or edit distance) at most k from the pattern are reported in O(m+ log log n + #matches) time (for constant k). The space complexity of the index is O(n1+ǫ) for any constant ǫ > 0.
We present a filtering based algorithm for the k-mismatch pattern matching problem with don’t cares. Given a text t of length n and a pattern p of length m with don’t care symbols in either p or t (but not both), and a bound k, our algorithm finds all the places that the pattern matches the text with at most k mismatches. The algorithm is deterministic and runs in Θ(nmk logm) time.
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید