Search results for: parts of speech tagging

Number of results: 21177608

Journal: :CoRR 1998
Simon Cozens

It has been argued that, when learning a first language, babies use a series of small clues to aid recognition and comprehension, and that one of these clues is word length. In this paper we present a statistical part of speech tagger which trains itself solely on the number of letters in each word in a sentence.

1995
Eric Sanders Paul Taylor

This paper describes a variety of methods for inserting phrase boundaries in text. The methods work by examining the likelihood of a phrase break occurring in a sequence of three part-of-speech tags. The paper explains this basic technique and describes more sophisticated variations using distance probabilities.
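
As a rough illustration of the basic idea in this abstract, the sketch below estimates the relative frequency of a phrase break for each window of three part-of-speech tags. The window layout (previous tag, current tag, next tag, with the candidate break falling after the current tag), the function names, the padding symbols, and the toy data are all assumptions made for illustration; the abstract does not specify them, and the distance-based variations are not covered.

```python
from collections import Counter

def train_break_model(sentences):
    """Count tag windows and how often a break follows the middle tag.

    sentences: list of sentences, each a list of (pos_tag, break_after) pairs.
    The window layout (previous, current, next) is an assumption.
    """
    window_counts = Counter()
    break_counts = Counter()
    for sentence in sentences:
        # Pad so the first and last tokens also get a full three-tag window.
        tags = ["<s>"] + [tag for tag, _ in sentence] + ["</s>"]
        for i, (_, break_after) in enumerate(sentence):
            window = (tags[i], tags[i + 1], tags[i + 2])
            window_counts[window] += 1
            if break_after:
                break_counts[window] += 1
    return window_counts, break_counts

def break_probability(window, window_counts, break_counts):
    """Relative-frequency estimate of a break for this window; 0.0 if unseen."""
    total = window_counts[window]
    return break_counts[window] / total if total else 0.0

# Hypothetical toy data: True marks a phrase break after the token.
toy_sentences = [
    [("DET", False), ("NOUN", True), ("VERB", False), ("DET", False), ("NOUN", True)],
    [("DET", False), ("NOUN", False), ("VERB", False), ("ADV", True)],
]
window_counts, break_counts = train_break_model(toy_sentences)
p_break = break_probability(("DET", "NOUN", "VERB"), window_counts, break_counts)
```

On real data, unseen windows would need smoothing or a back-off to shorter tag sequences rather than the flat 0.0 used here.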

Journal: :Procesamiento del Lenguaje Natural 2006
Jesús González Martí David González Maline José Antonio Troyano Jiménez

This paper proposes an improvement to Brill’s “Transformation-Rule Based” POS-tagger algorithm. Our improvement decreases training times considerably without affecting the accuracy of the algorithm.

2009
Sara Stymne

In this article, compound processing for translation into German in a factored statistical MT system is investigated. Compounds are handled by splitting them prior to training, and merging the parts after translation. I have explored eight merging strategies using different combinations of external knowledge sources, such as word lists, and internal sources that are carried through the translat...

2014
Guillaume Wisniewski Nicolas Pécheux Souhir Gahbiche-Braham François Yvon

When Part-of-Speech annotated data is scarce, e.g. for under-resourced languages, one can turn to cross-lingual transfer and crawled dictionaries to collect partially supervised data. We cast this problem in the framework of ambiguous learning and show how to learn an accurate history-based model. Experiments on ten languages show significant improvements over prior state of the art performance.

Journal: :CoRR 2013
Jyoti Singh Nisheeth Joshi Iti Mathur

In this paper we present a Marathi part of speech tagger. Marathi is a morphologically rich language spoken by the native people of Maharashtra. The general approach used for development of the tagger is statistical, using the Trigram Method. The main concept of Trigram is to explore the most likely POS for a token based on given information of previous two tags by calculating probabilities to determine ...
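
The trigram idea mentioned in this abstract, conditioning a tag on the two preceding tags, can be sketched as a maximum-likelihood estimate over tag counts. This is a generic illustration, not the authors' implementation: the function names, the `<s>` padding symbol, the English toy corpus, and the absence of smoothing are all assumptions. A complete tagger would also need word-given-tag probabilities and a search over tag sequences.

```python
from collections import Counter

def train_trigram_tag_model(tagged_sentences):
    """Count tag trigrams and their two-tag contexts from tagged sentences."""
    trigram_counts = Counter()
    context_counts = Counter()
    for sentence in tagged_sentences:
        # Two start symbols so the first real tag also has a two-tag history.
        tags = ["<s>", "<s>"] + [tag for _, tag in sentence]
        for i in range(2, len(tags)):
            context = (tags[i - 2], tags[i - 1])
            trigram_counts[context + (tags[i],)] += 1
            context_counts[context] += 1
    return trigram_counts, context_counts

def tag_probability(tag, prev2, prev1, trigram_counts, context_counts):
    """Estimate P(tag | prev2, prev1) by relative frequency; 0.0 if context unseen."""
    total = context_counts[(prev2, prev1)]
    if total == 0:
        return 0.0
    return trigram_counts[(prev2, prev1, tag)] / total

# Hypothetical toy corpus of (word, tag) pairs.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("old", "ADJ"), ("dog", "NOUN")],
]
trigram_counts, context_counts = train_trigram_tag_model(toy_corpus)
p_verb = tag_probability("VERB", "DET", "NOUN", trigram_counts, context_counts)
```

Because a morphologically rich language produces many rare word forms and tag contexts, a practical model would likely add smoothing or back-off to bigram and unigram estimates instead of returning 0.0 for unseen contexts.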

2016
Joachim Bingel Maria Barrett Anders Søgaard

Neuro-imaging studies on reading different parts of speech (PoS) report somewhat mixed results, yet some of them indicate different activations with different PoS. This paper addresses the difficulty of using fMRI to discriminate between linguistic tokens in reading of running text because of low temporal resolution. We show that once we solve this problem, fMRI data contains a signal of PoS di...

2015
Robert Moore

Ratnaparkhi (1996) introduced a method of inferring a tag dictionary from annotated data to speed up part-of-speech tagging by limiting the set of possible tags for each word. While Ratnaparkhi’s tag dictionary makes tagging faster but less accurate, an alternative tag dictionary that we recently proposed (Moore, 2014) makes tagging as fast as with Ratnaparkhi’s tag dictionary, but with no decr...
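
The general notion of a tag dictionary, restricting each word to the tags it was seen with in annotated data, can be sketched as follows. This is a generic illustration and not the specific construction of either Ratnaparkhi (1996) or Moore (2014); the function names, the lowercasing step, and the fallback to the full tagset for unknown words are assumptions.

```python
from collections import defaultdict

def build_tag_dictionary(tagged_sentences):
    """Map each word to the set of tags observed for it in the annotated data."""
    tag_dict = defaultdict(set)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            # Lowercasing is an assumption made for this sketch.
            tag_dict[word.lower()].add(tag)
    return dict(tag_dict)

def candidate_tags(word, tag_dict, all_tags):
    """Return the tags a tagger needs to consider for this word.

    Words missing from the dictionary fall back to the full tagset.
    """
    return tag_dict.get(word.lower(), all_tags)

# Hypothetical toy data.
toy_corpus = [
    [("They", "PRON"), ("run", "VERB")],
    [("A", "DET"), ("run", "NOUN")],
]
all_tags = {"PRON", "VERB", "DET", "NOUN"}
tag_dict = build_tag_dictionary(toy_corpus)
run_tags = candidate_tags("run", tag_dict, all_tags)
```

The speed-up comes from scoring only the candidate set for each word instead of the whole tagset. The accuracy risk comes from a known word appearing with a tag that was never observed for it in training.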

Journal: :Procesamiento del Lenguaje Natural 2015
Carla Parra Escartín Héctor Martínez Alonso

In this article, four Part-of-Speech (PoS) taggers for Spanish are compared. The evaluation has been carried out without prior training or tuning of the PoS taggers. To allow for a comparison across PoS taggers, their tagsets have been mapped to the universal PoS tagset (Petrov, Das, and McDonald, 2012). The PoS taggers have also been compared as regards the information they provide and how the...
