Search results for: source text
Number of results: 572362
This paper proposes a new approach for acquiring morpho-lexical probabilities from an untagged corpus. This approach demonstrates a way to extract very useful and nontrivial information from an untagged corpus, which otherwise would require laborious tagging of large corpora. The paper describes the use of these morpho-lexical probabilities as an information source for morphological disambiguat...
We propose a novel approach to Vietnamese word segmentation. Our approach is based on the Single Classification Ripple Down Rules methodology (Compton and Jansen, 1990), where rules are stored in an exception structure and new rules are only added to correct segmentation errors given by existing rules. Experimental results on the benchmark Vietnamese treebank show that our approach outperforms ...
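The exception structure mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Rule` class, the `classify` function, the `"B"`/`"I"` labels (begin-word / inside-word) and the syllable features `prev` and `cur` are all hypothetical names chosen for illustration. The sketch only shows the general Single Classification Ripple Down Rules idea: each rule has an "except" branch followed when it fires and an "else" branch followed when it does not, and the conclusion of the last rule that fired wins.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """One node of a (hypothetical) SCRDR tree."""
    condition: Callable[[dict], bool]
    conclusion: str
    except_rule: Optional["Rule"] = None  # tried next when this rule fires
    else_rule: Optional["Rule"] = None    # tried next when this rule does not fire


def classify(root: Rule, case: dict) -> Optional[str]:
    """Return the conclusion of the last rule that fired along the path."""
    result = None
    node = root
    while node is not None:
        if node.condition(case):
            result = node.conclusion
            node = node.except_rule
        else:
            node = node.else_rule
    return result


# Default rule (assumption): every syllable starts a new word ("B").
root = Rule(condition=lambda c: True, conclusion="B")

# A correction added as an exception to the default rule: the toy pair
# "học" + "sinh" should be joined, so the second syllable is labeled "I".
root.except_rule = Rule(
    condition=lambda c: c["prev"] == "học" and c["cur"] == "sinh",
    conclusion="I",
)

print(classify(root, {"prev": "học", "cur": "sinh"}))
print(classify(root, {"prev": "tôi", "cur": "là"}))
```

Because a new rule is attached only at the point where an existing rule gave a wrong answer, earlier rules are never modified, which is the property the abstract describes.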
We present Poet’s Little Helper (PLH), a tool that implements a methodology to generate poetry using minimal language-dependent information. The user only needs to provide a corpus with a set of sentences, a rhyme checker and a syllable-counter. From these building blocks, PLH produces: (1) an exploratory analysis of the suitability of the given corpus for poetry generation. (2) a novel and non...
This paper describes a new open source architecture for unit-selection based speech synthesis called BOSS (Bonn Open Synthesis System). It is built up modularly, with communications between modules taking place in a fixed format. This makes the addition, deletion and substitution of modules very easy. The strict separation between data and algorithms allows for the simple creation of new speech...
Text summarization compresses a source text into a shorter version while preserving its information content and overall meaning. Manually summarizing large documents is very difficult for human beings. Text summarization plays an important role in natural language processing and text mining. Many approaches use statistics and machine learning techniques to extract sent...
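As a rough illustration of the statistical family of approaches mentioned in this abstract, the sketch below scores each sentence by the average corpus frequency of its words and keeps the top-scoring ones. It is not the method of the paper, whose details are truncated here; the function name `summarize`, the simple regex tokenizer, and the absence of stop-word filtering are all simplifying assumptions.

```python
import re
from collections import Counter


def summarize(text: str, n: int = 1) -> list[str]:
    """Return the n highest-scoring sentences, in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace (assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole text, lowercased.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)[:n]
    return [sentences[i] for i in sorted(ranked)]


print(summarize("Cats sleep. Cats eat fish. Dogs bark.", n=2))
```

A practical system would at least remove stop words and handle abbreviations when splitting sentences; both are omitted to keep the sketch short.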
Text Categorization is the task of automatically sorting a set of documents into categories from a predefined set, and Text Summarization produces a brief and accurate representation of the input text such that the output covers the most important concepts of the source in a condensed manner. Document Summarization is an emerging technique for understanding the main purpose of any kind of documen...
In this paper we present results from the METER (MEasuring TExt Reuse) project whose aim is to explore issues pertaining to text reuse and derivation, especially in the context of newspapers using newswire sources. Although the reuse of text by journalists has been studied in linguistics, we are not aware of any investigation using existing computational methods for this particular task. We inv...