Search results for: n grams

Number of results: 982486

2007
Bouke Huurnink

In this paper we present the contents of the University of Amsterdam submission to the CLEF Cross Language Speech Retrieval 2007 English task. We describe the effects of using character n-grams and field combinations on both monolingual English retrieval and crosslingual Dutch-to-English retrieval.
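
For readers unfamiliar with the term, character n-grams are the overlapping substrings of length n taken from a text. The following is a minimal Python sketch of that idea only; it is not the system described in the abstract above, and the function names `char_ngrams` and `ngram_profile`, as well as the default range of 3 to 5, are assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`, in order."""
    # A text shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_profile(text, n_min=3, n_max=5):
    """Count character n-grams for every n in [n_min, n_max].

    The default range is an illustrative choice, not a value
    taken from any of the listed papers.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        counts.update(char_ngrams(text, n))
    return counts


if __name__ == "__main__":
    print(char_ngrams("speech", 3))
    print(ngram_profile("retrieval").most_common(3))
```

A retrieval system would typically index such n-grams in place of, or alongside, whole words, which can make matching more tolerant of spelling variation and transcription errors.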

2010
Emily Pitler Shane Bergsma Dekang Lin Kenneth Ward Church

We use web-scale N-grams in a base NP parser that correctly analyzes 95.4% of the base NPs in natural text. Web-scale data improves performance. That is, there is no data like more data. Performance scales log-linearly with the number of parameters in the model (the number of unique N-grams). The web-scale N-grams are particularly helpful in harder cases, such as NPs that contain conjunctions.

2015
Upendra Sapkota Steven Bethard Manuel Montes-y-Gómez Thamar Solorio

Character n-grams have been identified as the most successful feature in both single-domain and cross-domain Authorship Attribution (AA), but the reasons for their discriminative value were not fully understood. We identify subgroups of character n-grams that correspond to linguistic aspects commonly claimed to be covered by these features: morphosyntax, thematic content and style. We evaluate t...

2006
Eckhard Bick

This paper presents a Constraint Grammar-inspired machine learner and parser, LingPars, that assigns dependencies to morphologically annotated treebanks in a function-centred way. The system not only bases attachment probabilities for PoS, case, mood, lemma on those features' function probabilities, but also uses topological features like function/PoS n-grams, barrier tags and daughter-se...

Journal: IOP Conference Series: Materials Science and Engineering 2021

2001
Shuntaro Isogai Katsuhiko Shirai Hirofumi Yamamoto Yoshinori Sagisaka

In this paper, a new language model, the Multi-Class Composite N-gram, is proposed to avoid the data sparseness problem that arises with a small amount of training data. The Multi-Class Composite N-gram maintains an accurate word prediction capability and reliability for sparse data with a compact model size based on multiple word clusters, so-called Multi-Classes. In the Multi-Class, the statistical connectivity...

Journal: Int. J. Fuzzy Logic and Intelligent Systems 2012
Hiroaki Yamane Masafumi Hagiwara

In this paper, we propose a system which improves text catchphrases on the web using onomatopoeia and the Japanese Google N-grams. Onomatopoeia is regarded as a fundamental tool in daily communication for people. The proposed system inserts an onomatopoetic word into plain text catchphrases. Being based on a large catchphrase encyclopedia, the proposed system evaluates each catchphrase’s candid...

Journal: CoRR 2014
Taraka Rama Lars Borin

In this article, we investigate the properties of phoneme N-grams across half of the world's languages. We investigate if the sizes of three different N-gram distributions of the world's language families obey a power law. Further, the N-gram distributions of language families parallel the sizes of the families, which seem to obey a power law distribution. The correlation between N-gram distrib...

2016
Jennifer Hill Rahul Simha

In this paper, we propose a method of automatically generating multiple-choice fill-in-the-blank exercises from existing text passages that challenge a reader’s comprehension skills and contextual awareness. We use a unique application of word co-occurrence likelihoods and the Google n-grams corpus to select words with strong contextual links to their surrounding text, and to generate distractor...

2012
Bram Jans Steven Bethard Ivan Vulic Marie-Francine Moens

In this paper, we extend current state-of-the-art research on unsupervised acquisition of scripts, that is, stereotypical and frequently observed sequences of events. We design, evaluate and compare different methods for constructing models for script event prediction: given a partial chain of events in a script, predict other events that are likely to belong to the script. Our work aims to answ...
