source text

Search results for: source text

Number of results: 572362

Learning Morpho-Lexical Probabilities from an Untagged Corpus with an Application to Hebrew

Journal: Computational Linguistics, 1995

Moshe Levinger, Uzzi Ornan, Alon Itai

This paper proposes a new approach for acquiring morpho-lexical probabilities from an untagged corpus. This approach demonstrates a way to extract very useful and nontrivial information from an untagged corpus, which otherwise would require laborious tagging of large corpora. The paper describes the use of these morpho-lexical probabilities as an information source for morphological disambiguat...

A Fast and Accurate Vietnamese Word Segmenter

Journal: CoRR, 2017

Dat Quoc Nguyen, Dai Quoc Nguyen, Thanh Vu, Mark Dras, Mark Johnson

We propose a novel approach to Vietnamese word segmentation. Our approach is based on the Single Classification Ripple Down Rules methodology (Compton and Jansen, 1990), where rules are stored in an exception structure and new rules are only added to correct segmentation errors given by existing rules. Experimental results on the benchmark Vietnamese treebank show that our approach outperforms ...

Poet's Little Helper: A methodology for computer-based poetry generation. A case study for the Basque language

2017

Aitzol Astigarraga, José María Martínez-Otzeta, Igor Rodriguez Rodriguez, Basilio Sierra, Elena Lazkano

We present Poet’s Little Helper (PLH), a tool that implements a methodology to generate poetry using minimal language-dependent information. The user only needs to provide a corpus with a set of sentences, a rhyme checker and a syllable counter. From these building blocks, PLH produces: (1) an exploratory analysis of the suitability of the given corpus for poetry generation; (2) a novel and non...

Speech synthesis development made easy: the Bonn Open Synthesis System

2001

Esther Klabbers, Karlheinz Stöber, Raymond N. J. Veldhuis, Petra Wagner, Stefan Breuer

This paper describes a new open source architecture for unit-selection based speech synthesis called BOSS (Bonn Open Synthesis System). It is built up modularly, with communications between modules taking place in a fixed format. This makes the addition, deletion and substitution of modules very easy. The strict separation between data and algorithms allows for the simple creation of new speech...

Comparison of Image-Based and Text-Based Source Code Classification Using Deep Learning

Journal: SN Computer Science, 2020

A Linguistic Steganalysis Approach Base on Source Features of Text and Immune Mechanism

Journal: Computer and Information Science, 2017

An Approach for Concept-based Automatic Multi-Document Summarization using Machine Learning

2012

G. PadmaPriya

Text Summarization is compressing the source text into a shorter version preserving its information content and overall meaning. It is very complicated for human beings to manually summarize large documents of text. Text summarization plays an important role in the area of natural language processing and text mining. Many approaches use statistics and machine learning techniques to extract sent...

Test Model for Text Categorization and Text Summarization

Journal: CoRR, 2011

Khushboo Thakkar, Urmila Shrawankar

Text Categorization is the task of automatically sorting a set of documents into categories from a predefined set, and Text Summarization is a brief and accurate representation of input text such that the output covers the most important concepts of the source in a condensed manner. Document Summarization is an emerging technique for understanding the main purpose of any kind of documen...

$\text{C}_{\alpha}-\text{C}$ Bond Cleavage of the Peptide Backbone in MALDI In-Source Decay Using Salicylic Acid Derivative Matrices

Journal: Journal of The American Society for Mass Spectrometry, 2011

METER: MEasuring TExt Reuse

2002

Paul D. Clough, Robert J. Gaizauskas, Scott S. L. Piao, Yorick Wilks

In this paper we present results from the METER (MEasuring TExt Reuse) project whose aim is to explore issues pertaining to text reuse and derivation, especially in the context of newspapers using newswire sources. Although the reuse of text by journalists has been studied in linguistics, we are not aware of any investigation using existing computational methods for this particular task. We inv...
