n grams

Search results for: n grams

Number of results: 982486

ICSI System Description for SRE2008 Submission

2008

Howard Lei, David Van Leeuwen

The three ICSI systems involved in the evaluations are the keyword HMM supervector system [1], the GMM supervector system, and the keyword phone lattice N-grams system [2], which we enhanced by including prosodic N-grams. The keyword HMM supervector and keyword phone lattice N-grams + prosodic N-grams systems are described in Sections 3 and 4. A description of the GMM superve...

Exploring Word Embeddings and Character N-Grams for Author Clustering

2016

Yunita Sari, Mark Stevenson

We presented our system for the PAN 2016 Author Clustering task. Our software used simple character n-grams to represent the document collection. We then ran K-Means clustering optimized using the Silhouette Coefficient. Our system yields competitive results and requires only a short runtime. Character n-grams can capture a wide range of information, making them effective for authorship attribution...
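
As a rough illustration of the representation this abstract mentions, the sketch below counts overlapping character n-grams in a string. The function name `char_ngrams` and the choice of n = 3 are assumptions made for the example; the abstract does not state which n the authors used, and the clustering step is not reproduced here.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in `text` (hypothetical helper)."""
    # A string shorter than n yields an empty range and therefore an empty profile.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Spaces are kept, so n-grams can span word boundaries.
profile = char_ngrams("the theme", n=3)
```

Here `profile` maps each trigram to its frequency; such count vectors are one common way to feed documents to a clustering algorithm.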

Detection of algorithmically generated malicious domain names using masked N-grams

Journal: Expert Systems with Applications, 2019

Combining n-grams and deep convolutional features for language variety classification

Journal: Natural Language Engineering, 2019

Detecting Co-Derivative Documents in Large Text Collections

2008

Jan Pomikálek, Pavel Rychlý

We have analyzed the SPEX algorithm by Bernstein and Zobel [1] for detecting co-derivative documents using duplicate n-grams. Though we totally agree with the claim that not using unique n-grams can greatly increase efficiency and scalability of the process of detecting co-derivative documents, we have found serious bottlenecks in the way SPEX finds the duplicate n-grams. We propose a solution ...
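
To make the notion of duplicate n-grams concrete, here is a minimal sketch that finds word n-grams occurring in more than one document. It is a naive in-memory approach and not the SPEX algorithm or the authors' proposed solution; the names `word_ngrams` and `duplicate_ngrams`, the whitespace tokenization, and n = 3 are all assumptions for illustration.

```python
from collections import defaultdict


def word_ngrams(tokens, n):
    """Return the list of word n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def duplicate_ngrams(docs, n=3):
    """Return n-grams that occur in at least two different documents."""
    seen_in = defaultdict(set)
    for doc_id, text in docs.items():
        # set() so repeats inside one document do not count as duplicates
        for gram in set(word_ngrams(text.lower().split(), n)):
            seen_in[gram].add(doc_id)
    return {gram: ids for gram, ids in seen_in.items() if len(ids) > 1}


docs = {
    "a": "the quick brown fox jumps",
    "b": "a quick brown fox sleeps",
    "c": "nothing shared here at all",
}
shared = duplicate_ngrams(docs, n=3)
```

Documents that share many such n-grams are candidates for being co-derivative. Keeping every n-gram in memory, as this sketch does, would not scale to large collections.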

Detection of Opinion Spam with Character n-grams

2015

Donato Hernández Fusilier, Manuel Montes-y-Gómez, Paolo Rosso, Rafael Guzmán-Cabrera

In this paper we consider the detection of opinion spam as a stylistic classification task because, given a particular domain, the deceptive and truthful opinions are similar in content but differ in the way opinions are written (style). In particular, we propose using character n-grams as features since they have been shown to capture lexical content as well as stylistic information. We evaluated our...

JHU Ad Hoc Experiments at CLEF 2008

2008

Paul McNamee

For CLEF 2008 JHU conducted monolingual and bilingual experiments in the ad hoc TEL and Persian tasks. The TEL task focused on searching electronic card catalog records in English, French, and German using data from the British Library, the Bibliothèque Nationale de France, and the Österreichische Nationalbibliothek (Austrian National Library). The approach we adopted for TEL was to st...

Automatic Genre Classification via N-grams of Part-of-Speech Tags

Journal: Procedia - Social and Behavioral Sciences, 2015

Revisiting the Case for Explicit Syntactic Information in Language Models

2012

Ariya Rastrow, Sanjeev Khudanpur, Mark Dredze

Statistical language models used in deployed systems for speech recognition, machine translation and other human language technologies are almost exclusively n-gram models. They are regarded as linguistically naïve, but estimating them from any amount of text, large or small, is straightforward. Furthermore, they have doggedly matched or outperformed numerous competing proposals for syntactical...
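
The claim that n-gram models are straightforward to estimate can be illustrated with a minimal bigram sketch that uses unsmoothed maximum-likelihood counts. This is not the authors' model: the function names `train_bigram` and `prob`, the `<s>` and `</s>` boundary markers, and the toy corpus are assumptions, and a deployed system would normally add smoothing for unseen n-grams.

```python
from collections import Counter


def train_bigram(sentences):
    """Return (context counts, bigram counts) over tokenized sentences."""
    context, bigram = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]  # sentence boundary markers
        for prev, word in zip(padded, padded[1:]):
            context[prev] += 1
            bigram[(prev, word)] += 1
    return context, bigram


def prob(word, prev, context, bigram):
    """Unsmoothed maximum-likelihood estimate of P(word | prev)."""
    # Unseen contexts get probability 0.0 rather than raising an error.
    return bigram[(prev, word)] / context[prev] if context[prev] else 0.0


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
context, bigram = train_bigram(corpus)
p = prob("cat", "the", context, bigram)
```

Estimation is a single counting pass over the text, which is why such models can be built from any amount of data.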

The Benefit of Syntactic vs. Linear N-grams for Linguistic Description

2017

Melanie Andresen, Heike Zinsmeister

Automatic dependency annotations have been used in all kinds of language applications. However, there has been much less exploitation of dependency annotations for the linguistic description of language varieties. This paper presents an attempt to employ dependency annotations for describing style. We argue that for this purpose, linear n-grams (that follow the text’s surface) alone do not appr...
