grams

Search results for: grams

Number of results: 8929

Multiword expressions in spontaneous speech: do we really speak like that?

2005

Helmer Strik, Diana Binnenpoorte, Catia Cucchiarini

In this study, we examined the pronunciation characteristics of multiword expressions (MWEs). We first drew up an inventory of frequently occurring N-grams extracted from orthographic transcriptions of spontaneous speech contained in a large corpus of spoken Dutch. For about 10% of these N-grams, phonetic transcriptions were available, and these were examined. Our results show that the pronunciation ...

Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors

Journal: Algorithms, 2009

Raphael André Bauer, Kristian Rother, Peter Moor, Knut Reinert, Thomas Steinke, Janusz M. Bujnicki, Robert Preissner

This work presents a generalized approach for the fast structural alignment of thousands of macromolecular structures. The method uses string representations of a macromolecular structure and a hash table that stores n-grams of a certain size for searching. To this end, macromolecular structure-to-string translators were implemented for protein and RNA structures. A query against the index is p...

Classifying disease outbreak reports using n-grams and semantic features

Journal: International Journal of Medical Informatics, 2009

Diacritics restoration based on word n-grams for Slovak texts

Journal: Open Computer Science, 2021

Abstract: Despite the modern boom in technology, we are still faced with the fact that people write texts without diacritics. There are two main reasons for this. The first, historical reason stems from the past, when the use of diacritics was troublesome and people would write text without them. The second one is speed: typing without diacritics is usually faster. Text without diacritics is easy for people to understand, but for some types of documents, missing diacritics can cause a problem. This a...

Learning Chinese Word Embeddings With Words and Subcharacter N-Grams

Journal: IEEE Access, 2019

Storing the Web in Memory: Space Efficient Language Models with Constant Time Retrieval

2010

David Guthrie, Mark Hepple

We present three novel methods of compactly storing very large n-gram language models. These methods use substantially less space than all known approaches and allow n-gram probabilities or counts to be retrieved in constant time, at speeds comparable to modern language modeling toolkits. Our basic approach generates an explicit minimal perfect hash function, that maps all n-grams in a model to...

Three Approaches to Automatic Assignment of ICD-9-CM Codes to Radiology Reports

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium, 2007

Ira Goldstein, Anna Arzumtsyan, Özlem Uzuner

We describe and evaluate three systems for automatically predicting the ICD-9-CM codes of radiology reports from short excerpts of text. The first system benefits from an open source search engine, Lucene, and takes advantage of the relevance of reports to one another based on individual words. The second uses BoosTexter, a boosting algorithm based on n-grams (sequences of consecutive words) an...

The latent words language model

Journal: Computer Speech & Language, 2012

Koen Deschacht, Jan De Belder, Marie-Francine Moens

Statistical language models have found many applications in information retrieval since their introduction almost three decades ago. Currently the most popular models are n-gram models, which are known to suffer from serious sparseness issues, which is a result of the large vocabulary size |V| of any given corpus and of the exponential nature of n-grams, where potentially |V|^n n-grams can occu...

Cross-Domain Sentiment Classification Using Grams Derived from Syntax Trees and an Adapted Naive Bayes Approach

2012

Srilaxmi Cheeti

There is an increasing amount of user-generated information in online documents, including user opinions on various topics and products such as movies, DVDs, kitchen appliances, etc. To make use of such opinions, it is useful to identify the polarity of the opinion, in other words, to perform sentiment classification. The goal of sentiment classification is to classify a given text/document as ...

User simulation for spoken dialogue systems: learning and evaluation

2006

Kallirroi Georgila, James Henderson, Oliver Lemon

We propose the “advanced” n-grams as a new technique for simulating user behaviour in spoken dialogue systems, and we compare it with two methods used in our prior work, i.e. linear feature combination and “normal” n-grams. All methods operate on the intention level and can incorporate speech recognition and understanding errors. In the linear feature combination model user actions (lists of 〈 ...
