Search results for: n grams

Number of results: 982486

2017
David R. W. Sears Andreas Arzt Harald Frostel Reinhard Sonnleitner Gerhard Widmer

String-based (or viewpoint) models of tonal harmony often struggle with data sparsity in pattern discovery and prediction tasks, particularly when modeling composite events like triads and seventh chords, since the number of distinct n-note combinations in polyphonic textures is potentially enormous. To address this problem, this study examines the efficacy of skip-grams in music research, an a...
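To make the skip-gram idea in this abstract concrete, here is a minimal sketch. It is not the authors' implementation. The function name `skipgrams`, the `max_skip` parameter, and the chord labels are hypothetical. It also assumes one particular definition: each pair of adjacent members of a gram may be separated by at most `max_skip` skipped events.

```python
def skipgrams(seq, n, max_skip):
    """Return all n-item subsequences of seq in which consecutive members
    are separated by at most max_skip skipped items (assumed definition)."""
    results = []

    def extend(indices):
        # A gram is complete once it holds n positions.
        if len(indices) == n:
            results.append(tuple(seq[i] for i in indices))
            return
        last = indices[-1]
        # The next member may sit 1 .. max_skip + 1 positions after the last one.
        for nxt in range(last + 1, min(last + max_skip + 2, len(seq))):
            extend(indices + [nxt])

    for start in range(len(seq)):
        extend([start])
    return results


# Hypothetical chord sequence, for illustration only.
chords = ["I", "IV", "V", "I"]
contiguous = skipgrams(chords, 2, 0)   # ordinary bigrams
with_skips = skipgrams(chords, 2, 1)   # bigrams allowing one skipped chord
```

With `max_skip=0` the function reduces to ordinary contiguous n-grams. Raising `max_skip` yields more distinct patterns from the same sequence, which is the property that makes skip-grams a candidate response to sparse data.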

Journal: The Programming Historian en español, 2017

Journal: International Journal of Corpus Linguistics, 2017

2012
Camille Besse Alireza Bakhtiari Luc Lamontagne

Analyzing international political behavior based on similar precedent circumstances is one of the basic techniques that policymakers use to monitor and assess current situations. Our goal is to investigate how to analyze geopolitical conflicts as sequences of events and to determine what probabilistic models are suitable to perform these analyses. In this paper, we evaluate the performance of N...

2000
Paul McNamee James Mayfield Christine D. Piatko

The Hopkins Automated Information Retriever for Combing Unstructured Text (HAIRCUT) is a research IR system developed at the Johns Hopkins University Applied Physics Laboratory (JHU/APL). HAIRCUT benefits from a basic design decision to support flexibility throughout the system. One specific example of this is the way we represent documents and queries; words, stemmed words, character n-grams, ...
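As an illustration of the character n-gram representation mentioned in this abstract, the following sketch splits a string into overlapping character n-grams. It is a generic example, not HAIRCUT's tokenizer. The function name `char_ngrams` and the sample string are hypothetical.

```python
def char_ngrams(text, n):
    """Return the overlapping character n-grams of text, left to right.
    Strings shorter than n yield an empty list."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Hypothetical indexing term, for illustration only.
grams = char_ngrams("haircut", 3)
```

Because these terms come from a sliding window rather than a word list, such a representation needs no stemmer or language-specific tokenizer.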

2001
Hirofumi Yamamoto Shuntaro Isogai Yoshinori Sagisaka

In this paper, a new language model, the Multi-Class Composite N-gram, is proposed to avoid a data sparseness problem for spoken language in that it is difficult to collect training data. The Multi-Class Composite N-gram maintains an accurate word prediction capability and reliability for sparse data with a compact model size based on multiple word clusters, called MultiClasses. In the Multi-Cl...

2008
Hans van Halteren

This paper shows that it is very often possible to identify the source language of medium-length speeches in the EUROPARL corpus on the basis of frequency counts of word n-grams (87.2% to 96.7% accuracy depending on classification method). The paper also examines in detail which positive markers are most powerful and identifies a number of linguistic aspects as well as culture- and domain-related ones.

2013
Gabriela Ramírez-de-la-Rosa Thamar Solorio Manuel Montes-y-Gómez Yang Liu Lisa Bedore Elizabeth Peña Aquiles Iglesias

We present a set of new measures designed to reveal latent information of language use in children at the lexico-syntactic level. Our analysis of spontaneous narratives from children identified with language impairment and children developing typically shows that these metrics are a promising approach that could aid in the task of language assessment.

2017
Ben Whitham

While advanced defenders have successfully used honeyfiles to detect unauthorized intruders and insider threats for more than 30 years, the complexity associated with adaptively devising enticing content has limited their diffusion. This paper presents four new designs for automating the construction of honeyfile content. The new designs select a document from the target directory as a template...
