source text

Search results for: source text

Number of results: 572362

Revealing Disease Similarities by Text Mining

2017

Alberto Calderone Luana Licata Elisa Micarelli Livia Perfetto Gianni Cesareni

Texts written in human language contain structured information that is not easily parsable by computers. Text mining relies on large text corpora to derive rules that can be used to extract such information automatically. Scientific literature represents the main source of information to study any biological phenomenon. While some phenomena are studied to the point that cor...


Towards Optimal Choice Selection for Improved Hybrid Machine Translation

Journal: Prague Bull. Math. Linguistics, 2012

Christian Federmann Maite Melero Pavel Pecina Josef van Genabith

In recent years, machine translation (MT) research has focused on investigating how hybrid MT as well as MT combination systems can be designed so that the resulting translations give an improvement over the individual translations. As a first step towards achieving this objective we have developed a parallel corpus with source data and the output of a number of MT systems, annotated with metadata i...


Translation of Conditional Compilation

1997

Maarit Harsu

This paper describes how to translate the compiler directives for conditional compilation in automated source-to-source translation between high-level programming languages. The directives for conditional compilation of the source language are translated into the corresponding directives of the target language, and the source program text of each branch of conditional compilation is translated ...


A text categorisation tool for open source communities based on semantic analysis

Journal: Behaviour & Information Technology, 2013


The source text for the 2020 Afrikaans translation of the New Testament

Journal: Tydskrif vir Geesteswetenskappe, 2021


MedXN: an open source medication extraction and normalization tool for clinical text

Journal: Journal of the American Medical Informatics Association, 2014


Mining Textual Data in Croatian

2005

Bojana Dalbelo Basic Boris Berecek Ana Cvitas

Business intelligence systems find textual data a very useful source of information. Text processing algorithms and systems in English and other world languages are well developed, which is not the case with the Croatian language. This paper explores the applicability of existing systems and examines optimal parameters for Croatian. The quality of input data strongly influences clustering and class...


Semantic Typology and Parallel Corpora: Something about Indefinite Pronouns

2017

Barend Beekhuizen Julia Watson Suzanne Stevenson

Patterns of crosslinguistic variation in the expression of word meaning are informative about semantic organization, but most methods to study this are labor intensive and obscure the gradient nature of concepts. We propose an automatic method for extracting crosslinguistic co-categorization patterns from parallel texts, and explore the properties of the data as a potential source for automatic...


Combining Content-Based and URL-Based Heuristics to Harvest Aligned Bitexts from Multilingual Sites with Bitextor

Journal: Prague Bull. Math. Linguistics, 2010

Miquel Esplà-Gomis Mikel L. Forcada

Nowadays, many websites on the Internet are multilingual and may be considered sources of parallel corpora. In this paper we will describe the free/open-source tool Bitextor, created to harvest aligned bitexts from these multilingual websites, which may be used to train corpus-based machine translation systems. This tool uses the work developed in previous approaches with modifications and improv...


Multi-source morphosyntactic tagging for spoken Rusyn

2017

Yves Scherrer Achim Rabus

This paper deals with the development of morphosyntactic taggers for spoken varieties of the Slavic minority language Rusyn. As neither annotated corpora nor parallel corpora are electronically available for Rusyn, we propose to combine existing resources from the etymologically close Slavic languages Russian, Ukrainian, Slovak, and Polish and adapt them to Rusyn. Using MarMoT as tagging toolki...

