Search results for: n grams

Number of results: 982486

2008
Howard Lei David Van Leeuwen

The three ICSI systems involved in the evaluations are the keyword HMM supervector system [1], the GMM supervector system, and the keyword phone lattice N-grams system [2], which we enhanced by including prosodic N-grams. Descriptions of the keyword HMM supervector and keyword phone lattice N-grams + prosodic N-grams systems will be discussed in sections 3 and 4. A description of the GMM superve...

2016
Yunita Sari Mark Stevenson

We presented our system for the PAN 2016 Author Clustering task. Our software used simple character n-grams to represent the document collection. We then ran K-Means clustering optimized using the Silhouette Coefficient. Our system yields competitive results and requires only a short runtime. Character n-grams can capture a wide range of information, making them effective for authorship attribution...

2008
Jan Pomikálek Pavel Rychlý

We have analyzed the SPEX algorithm by Bernstein and Zobel [1] for detecting co-derivative documents using duplicate n-grams. Though we totally agree with the claim that not using unique n-grams can greatly increase efficiency and scalability of the process of detecting co-derivative documents, we have found serious bottlenecks in the way SPEX finds the duplicate n-grams. We propose a solution ...

2015
Donato Hernández Fusilier Manuel Montes-y-Gómez Paolo Rosso Rafael Guzmán-Cabrera

In this paper we consider the detection of opinion spam as a stylistic classification task because, given a particular domain, the deceptive and truthful opinions are similar in content but differ in the way opinions are written (style). Particularly, we propose using character n-grams as features since they have been shown to capture lexical content as well as stylistic information. We evaluated our...

2008
Paul McNamee

For CLEF 2008 JHU conducted monolingual and bilingual experiments in the ad hoc TEL and Persian tasks. The TEL task focused on searching electronic card catalog records in English, French, and German using data from the British Library, the Bibliotheque Nationale de France, and the Österreichische Nationalbibliothek (Austrian National Library). The approach we adopted for TEL was to st...

Journal: Procedia - Social and Behavioral Sciences 2015

2012
Ariya Rastrow Sanjeev Khudanpur Mark Dredze

Statistical language models used in deployed systems for speech recognition, machine translation and other human language technologies are almost exclusively n-gram models. They are regarded as linguistically naïve, but estimating them from any amount of text, large or small, is straightforward. Furthermore, they have doggedly matched or outperformed numerous competing proposals for syntactical...

2017
Melanie Andresen Heike Zinsmeister

Automatic dependency annotations have been used in all kinds of language applications. However, there has been much less exploitation of dependency annotations for the linguistic description of language varieties. This paper presents an attempt to employ dependency annotations for describing style. We argue that for this purpose, linear n-grams (that follow the text’s surface) alone do not appr...
