Search results for: n grams
Number of results: 982486
In this paper we present the contents of the University of Amsterdam submission in the CLEF Cross Language Speech Retrieval 2007 English task. We describe the effects of using character n-grams and field combinations on both monolingual English retrieval, and crosslingual Dutch to English retrieval.
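As a rough illustration of what "character n-grams" means in the abstract above, the following minimal sketch extracts overlapping character sequences from a string. The function names `char_ngrams` and `ngram_profile`, and the choice of n = 3 and 4, are assumptions made for illustration only; the abstract does not describe the submission's actual tokenization or indexing setup.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> list[str]:
    """Return all overlapping character n-grams of `text`, in order.

    Hypothetical helper for illustration; not taken from the cited paper.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # A string of length L yields L - n + 1 n-grams (none if L < n).
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_profile(text: str, n_values=(3, 4)) -> Counter:
    """Count the lowercased character n-grams of `text` for each n in `n_values`.

    The default sizes (3 and 4) are an assumption, not a value from the source.
    """
    counts = Counter()
    for n in n_values:
        counts.update(char_ngrams(text.lower(), n))
    return counts


if __name__ == "__main__":
    print(char_ngrams("retrieval", 3))
    print(ngram_profile("speech retrieval").most_common(5))
```

Indexing n-grams rather than whole words is one common way to make matching tolerant of spelling variation and transcription errors, which is why the technique often appears in speech and cross-language retrieval.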
We use web-scale N-grams in a base NP parser that correctly analyzes 95.4% of the base NPs in natural text. Web-scale data improves performance. That is, there is no data like more data. Performance scales log-linearly with the number of parameters in the model (the number of unique N-grams). The web-scale N-grams are particularly helpful in harder cases, such as NPs that contain conjunctions.
Character n-grams have been identified as the most successful feature in both single-domain and cross-domain Authorship Attribution (AA), but the reasons for their discriminative value were not fully understood. We identify subgroups of character n-grams that correspond to linguistic aspects commonly claimed to be covered by these features: morphosyntax, thematic content and style. We evaluate t...
This paper presents a Constraint Grammar-inspired machine learner and parser, LingPars, that assigns dependencies to morphologically annotated treebanks in a function-centred way. The system not only bases attachment probabilities for PoS, case, mood, lemma on those features' function probabilities, but also uses topological features like function/PoS n-grams, barrier tags and daughter-se...
In this paper, a new language model, the Multi-Class Composite N-gram, is proposed to avoid the data sparseness problem that arises with small amounts of training data. The Multi-Class Composite N-gram maintains an accurate word prediction capability and reliability for sparse data with a compact model size based on multiple word clusters, so-called Multi-Classes. In the Multi-Class, the statistical connectivity...
In this paper, we propose a system which improves text catchphrases on the web using onomatopoeia and the Japanese Google N-grams. Onomatopoeia is regarded as a fundamental tool in daily communication for people. The proposed system inserts an onomatopoetic word into plain text catchphrases. Being based on a large catchphrase encyclopedia, the proposed system evaluates each catchphrase’s candid...
In this article, we investigate the properties of phoneme N-grams across half of the world's languages. We investigate if the sizes of three different N-gram distributions of the world's language families obey a power law. Further, the N-gram distributions of language families parallel the sizes of the families, which seem to obey a power law distribution. The correlation between N-gram distrib...
In this paper, we propose a method of automatically generating multiple-choice fill-in-the-blank exercises from existing text passages that challenge a reader’s comprehension skills and contextual awareness. We use a unique application of word co-occurrence likelihoods and the Google n-grams corpus to select words with strong contextual links to their surrounding text, and to generate distractor...
In this paper, we extend current state-of-the-art research on unsupervised acquisition of scripts, that is, stereotypical and frequently observed sequences of events. We design, evaluate and compare different methods for constructing models for script event prediction: given a partial chain of events in a script, predict other events that are likely to belong to the script. Our work aims to answ...