Search results for: parts of speech tagging
Number of results: 21177608
It has been argued that, when learning a first language, babies use a series of small clues to aid recognition and comprehension, and that one of these clues is word length. In this paper we present a statistical part of speech tagger which trains itself solely on the number of letters in each word in a sentence.
This paper describes a variety of methods for inserting phrase boundaries in text. The methods work by examining the likelihood of a phrase break occurring in a sequence of three part-of-speech tags. The paper explains this basic technique and describes more sophisticated variations using distance probabilities.
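The basic idea of scoring a phrase break from a three-tag sequence can be sketched as follows. This is a minimal illustration, not the paper's implementation: the window layout (two tags before the juncture, one after), the function names, and the toy data are all assumptions made for the example.

```python
from collections import Counter

def train_break_model(sentences):
    """Count three-tag windows and how often a break follows the middle tag.

    Each sentence is a list of (tag, break_after) pairs (hypothetical format).
    The window (tags[i-1], tags[i], tags[i+1]) describes the juncture
    between position i and position i+1.
    """
    window_counts = Counter()
    break_counts = Counter()
    for sentence in sentences:
        tags = [tag for tag, _ in sentence]
        for i in range(1, len(sentence) - 1):
            window = (tags[i - 1], tags[i], tags[i + 1])
            window_counts[window] += 1
            if sentence[i][1]:
                break_counts[window] += 1
    return window_counts, break_counts

def break_probability(window, window_counts, break_counts):
    """Relative-frequency estimate of P(break | three-tag window)."""
    total = window_counts[window]
    return break_counts[window] / total if total else 0.0

# Toy annotated data (invented for illustration only).
toy_sentences = [
    [("DET", False), ("NOUN", True), ("CONJ", False), ("VERB", False)],
    [("DET", False), ("NOUN", True), ("CONJ", False), ("NOUN", False)],
    [("DET", False), ("NOUN", False), ("CONJ", False), ("NOUN", False)],
]
window_counts, break_counts = train_break_model(toy_sentences)
p_break = break_probability(("DET", "NOUN", "CONJ"), window_counts, break_counts)
```

A break would then be inserted wherever the estimate exceeds a chosen threshold; the distance-based variations mentioned in the abstract are not modelled here.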
This paper proposes an improvement to Brill’s “Transformation-Rule Based” POS tagger algorithm. Our improvement decreases training times considerably without affecting the accuracy of the algorithm.
In this article, compound processing for translation into German in a factored statistical MT system is investigated. Compounds are handled by splitting them prior to training, and merging the parts after translation. I have explored eight merging strategies using different combinations of external knowledge sources, such as word lists, and internal sources that are carried through the translat...
When Part-of-Speech annotated data is scarce, e.g. for under-resourced languages, one can turn to cross-lingual transfer and crawled dictionaries to collect partially supervised data. We cast this problem in the framework of ambiguous learning and show how to learn an accurate history-based model. Experiments on ten languages show significant improvements over prior state of the art performance.
In this paper we present a Marathi part-of-speech tagger. Marathi is a morphologically rich language spoken by the native people of Maharashtra. The general approach used to develop the tagger is statistical, using the trigram method. The main idea of the trigram method is to find the most likely POS for a token, given the previous two tags, by calculating probabilities to determine ...
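The trigram idea described above, choosing a tag from the two preceding tags, can be sketched with plain relative-frequency counts. This is an illustrative sketch only: the abstract is truncated, so smoothing, word-emission probabilities, and decoding are left out, and the function names, the `<s>` padding symbol, and the English toy corpus are assumptions.

```python
from collections import Counter

def train_trigram_model(tagged_sentences):
    """Count tag trigrams and their two-tag histories."""
    trigram_counts = Counter()
    history_counts = Counter()
    for sentence in tagged_sentences:
        # Pad with two start symbols so the first real tag has a history.
        tags = ["<s>", "<s>"] + [tag for _, tag in sentence]
        for i in range(2, len(tags)):
            trigram_counts[(tags[i - 2], tags[i - 1], tags[i])] += 1
            history_counts[(tags[i - 2], tags[i - 1])] += 1
    return trigram_counts, history_counts

def trigram_probability(tag, prev2, prev1, trigram_counts, history_counts):
    """Maximum-likelihood estimate of P(tag | prev2, prev1)."""
    history = history_counts[(prev2, prev1)]
    if history == 0:
        return 0.0
    return trigram_counts[(prev2, prev1, tag)] / history

def most_likely_tag(prev2, prev1, candidates, trigram_counts, history_counts):
    """Pick the candidate tag with the highest trigram probability."""
    return max(
        candidates,
        key=lambda tag: trigram_probability(
            tag, prev2, prev1, trigram_counts, history_counts
        ),
    )

# Toy corpus (invented for illustration; not Marathi data).
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("big", "ADJ"), ("dog", "NOUN")],
]
trigram_counts, history_counts = train_trigram_model(toy_corpus)
p_noun = trigram_probability("NOUN", "<s>", "DET", trigram_counts, history_counts)
best = most_likely_tag("<s>", "DET", ["NOUN", "ADJ", "VERB"],
                       trigram_counts, history_counts)
```

In this toy corpus a determiner at the start of a sentence is followed by a noun in two of three cases, so `NOUN` is the preferred next tag.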
Neuro-imaging studies on reading different parts of speech (PoS) report somewhat mixed results, yet some of them indicate different activations with different PoS. This paper addresses the difficulty of using fMRI to discriminate between linguistic tokens in reading of running text because of low temporal resolution. We show that once we solve this problem, fMRI data contains a signal of PoS di...
Ratnaparkhi (1996) introduced a method of inferring a tag dictionary from annotated data to speed up part-of-speech tagging by limiting the set of possible tags for each word. While Ratnaparkhi’s tag dictionary makes tagging faster but less accurate, an alternative tag dictionary that we recently proposed (Moore, 2014) makes tagging as fast as with Ratnaparkhi’s tag dictionary, but with no decr...
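The general notion of a tag dictionary inferred from annotated data can be sketched as below: each word seen in training is limited to the tags it was observed with, and unseen words fall back to the full tagset. This is a simplified assumption-laden sketch (lowercasing and the names `build_tag_dictionary` and `candidate_tags` are choices made here); it does not reproduce the details of either Ratnaparkhi's dictionary or the alternative from Moore (2014).

```python
from collections import defaultdict

def build_tag_dictionary(tagged_sentences):
    """Map each word form to the set of tags it carries in the training data."""
    tag_dict = defaultdict(set)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            tag_dict[word.lower()].add(tag)
    return dict(tag_dict)

def candidate_tags(word, tag_dict, all_tags):
    """Restrict the search to observed tags; unseen words keep every tag."""
    return tag_dict.get(word.lower(), all_tags)

# Toy training data (invented for illustration only).
toy_training = [
    [("the", "DET"), ("run", "NOUN")],
    [("they", "PRON"), ("run", "VERB")],
]
all_tags = {"DET", "NOUN", "PRON", "VERB"}
tag_dict = build_tag_dictionary(toy_training)
run_tags = candidate_tags("run", tag_dict, all_tags)
jog_tags = candidate_tags("jog", tag_dict, all_tags)
```

The speed-up comes from scoring fewer tags per word; the accuracy risk arises when the correct tag for a known word never appeared with it in training.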
In this article, four Part-of-Speech (PoS) taggers for Spanish are compared. The evaluation has been carried out without prior training or tuning of the PoS taggers. To allow for a comparison across PoS taggers, their tagsets have been mapped to the universal PoS tagset (Petrov, Das, and McDonald, 2012). The PoS taggers have also been compared as regards the information they provide and how the...
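Mapping tagger-specific tags onto a shared coarse tagset, as described above, is typically a table lookup. The sketch below uses a handful of Penn Treebank tags purely as a familiar illustration; the Spanish taggers in the article have their own tagsets, and the table, the function name `map_to_universal`, and the fallback to `X` for unlisted tags are assumptions of this example.

```python
# A small excerpt of a fine-grained-to-universal mapping (illustrative only).
PENN_TO_UNIVERSAL = {
    "NN": "NOUN",
    "NNS": "NOUN",
    "VB": "VERB",
    "VBD": "VERB",
    "JJ": "ADJ",
    "RB": "ADV",
    "DT": "DET",
    "IN": "ADP",
    "CD": "NUM",
    "CC": "CONJ",
}

def map_to_universal(tagged_tokens, mapping=PENN_TO_UNIVERSAL):
    """Replace each fine-grained tag with its universal tag; unlisted tags become X."""
    return [(word, mapping.get(tag, "X")) for word, tag in tagged_tokens]

mapped = map_to_universal([("the", "DT"), ("dogs", "NNS"), ("barked", "VBD")])
```

Once every tagger's output is mapped this way, accuracy can be compared token by token on the same coarse labels.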