grams

Search results for: grams

Number of results: 8929

Protein classification using modified n-grams and skip-grams.

Journal: Bioinformatics, 2017

S M Ashiqul Islam Benjamin J Heil Christopher Michel Kearney Erich J Baker

Motivation: Classification by supervised machine learning greatly facilitates the annotation of protein characteristics from their primary sequence. However, the feature generation step in this process requires detailed knowledge of attributes used to classify the proteins. Lack of this knowledge risks the selection of irrelevant features, resulting in a faulty model. In this study, we introduce...


s-grams: Defining generalized n-grams for information retrieval

Journal: Inf. Process. Manage., 2007

Anni Järvelin Antti Järvelin Kalervo Järvelin

n-grams have been used widely and successfully for approximate string matching in many areas. s-grams have been introduced recently as an n-gram based matching technique, where di-grams are formed of both adjacent and non-adjacent characters. s-grams have proved successful in approximate string matching across language boundaries in Information Retrieval (IR). s-grams however lack precise defin...
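The abstract is cut off before the precise definition, so the following is only a minimal sketch of one common reading of the idea: di-grams built from two characters separated by a fixed number of skipped characters. The function name `s_grams` and the `skip` parameter are illustrative assumptions, not the authors' notation.

```python
def s_grams(word, skip):
    """Return character di-grams whose two characters have `skip`
    characters between them. skip=0 gives ordinary adjacent di-grams.
    (Hypothetical helper for illustration only.)"""
    return [word[i] + word[i + skip + 1]
            for i in range(len(word) - skip - 1)]


print(s_grams("gram", 0))  # ['gr', 'ra', 'am']
print(s_grams("gram", 1))  # ['ga', 'rm']
```

Combining the lists for several skip values yields a richer character profile of a word, which is plausibly what makes this style of matching tolerant of spelling variation across languages.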


N-gramas sintácticos no-continuos

Journal: Polibits, 2013

Grigori Sidorov

In this paper, we present the concept of noncontinuous syntactic n-grams. In our previous works we introduced the general concept of syntactic n-grams, i.e., n-grams that are constructed by following paths in syntactic trees. Their great advantage is that they allow the introduction of purely linguistic (syntactic) information into machine learning methods. A certain disadvantage is that previous ...


Unordered N-gram Representation Based on Zero-suppressed BDDs for Text Mining and Classification

2007

Ryutaro Kurai Shin-ichi Minato Thomas Zeugmann

In this paper, we present a new method to analyze unordered n-grams by using ZBDDs (Zero-suppressed BDDs). n-grams have been used not only for text analysis but also for text indexing in some search engines. We newly use a variation of n-grams called unordered n-grams. Unordered n-grams abstract from the position of the characters in each n-gram, i.e., they just deal with the range of ordinary ...
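As a rough illustration of what "abstracting from the position of the characters" could mean, the sketch below sorts the characters inside each n-gram so that, for example, "ab" and "ba" are counted as the same item. It uses a plain `Counter` rather than ZBDDs, and the helper name `unordered_ngrams` is an assumption; the paper's actual representation and definition may differ.

```python
from collections import Counter


def unordered_ngrams(text, n):
    """Count character n-grams while ignoring the order of characters
    inside each n-gram (each window is reduced to its sorted form).
    Illustrative only; not the ZBDD-based method from the paper."""
    windows = (text[i:i + n] for i in range(len(text) - n + 1))
    return Counter("".join(sorted(w)) for w in windows)


counts = unordered_ngrams("abba", 2)
print(counts)  # "ab" and "ba" are merged under the key "ab"
```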


CIC-FBK Approach to Native Language Identification

2017

Ilia Markov Lingzhen Chen Carlo Strapparava Grigori Sidorov

We present the CIC-FBK system, which took part in the Native Language Identification (NLI) Shared Task 2017. Our approach combines features commonly used in previous NLI research, i.e., word n-grams, lemma n-grams, part-of-speech n-grams, and function words, with recently introduced character n-grams from misspelled words, and features that are novel in this task, such as typed character n-gram...


Syntactic Dependency-Based N-grams: More Evidence of Usefulness in Classification

2013

Grigori Sidorov Francisco Velasquez Efstathios Stamatatos Alexander F. Gelbukh Liliana Chanona-Hernández

The paper introduces and discusses a concept of syntactic n-grams (sn-grams) that can be applied instead of traditional n-grams in many NLP tasks. Sn-grams are constructed by following paths in syntactic trees, so sn-grams allow bringing syntactic knowledge into machine learning methods. Still, previous parsing is necessary for their construction. We applied sn-grams in the task of authorship at...
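To make "following paths in syntactic trees" concrete, here is a small sketch that enumerates head-to-dependent paths of a given length in a hand-written dependency tree. The tree, the dictionary layout (head mapped to its dependents), and the function `sn_grams` are all assumptions made for illustration; a real pipeline would obtain the tree from a parser, and the words are used as node identifiers only because they are unique in this toy sentence.

```python
def sn_grams(tree, n):
    """Return all downward paths of exactly n words in a dependency
    tree given as {head: [dependents]}. Hypothetical helper."""
    def paths_from(node, length):
        if length == 1:
            return [[node]]
        return [[node] + rest
                for child in tree.get(node, [])
                for rest in paths_from(child, length - 1)]

    nodes = set(tree) | {c for children in tree.values() for c in children}
    return sorted(tuple(p) for node in nodes for p in paths_from(node, n))


# Assumed parse of "the dog chased a cat"
tree = {"chased": ["dog", "cat"], "dog": ["the"], "cat": ["a"]}

print(sn_grams(tree, 2))
# [('cat', 'a'), ('chased', 'cat'), ('chased', 'dog'), ('dog', 'the')]
print(sn_grams(tree, 3))
# [('chased', 'cat', 'a'), ('chased', 'dog', 'the')]
```

Note that ('chased', 'cat') appears here although the two words are not adjacent in the sentence, which is the kind of syntactic information a surface n-gram would miss.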


Fast parameterized matching with q-grams

Journal: Journal of Discrete Algorithms, 2008


Experiments in Farsi Text Retrieval

2001

Farhad Oroumchian Naghmeh Karimi Mina Zolfy

A series of experiments is being conducted on the Farsi language in the domain of laws at the University of Tehran. One of the goals of these experiments is to establish the performance of different weighting schemes and retrieval models. Because of the lack of a Farsi stemmer and some characteristics of the language, it was decided to experiment with N-grams. With un-stemmed words and 2-grams, 3-gr...


The subjective frequency of word n-grams

Journal: Psihologija, 2013


Comparing word, character, and phoneme n-grams for subjective utterance recognition

2008

Theresa Wilson Stephan Raaijmakers

In this paper, we compare the performance of classifiers trained using word n-grams, character n-grams, and phoneme n-grams for recognizing subjective utterances in multiparty conversation. We show that there is value in using very shallow linguistic representations, such as character n-grams, for recognizing subjective utterances, in particular, gains in the recall of subjective utterances.
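The difference between word and character n-grams mentioned above can be sketched with one generic helper applied to two different tokenizations of the same utterance. The helper `ngrams` and the example sentence are illustrative assumptions; phoneme n-grams are omitted because they would require a phonemic transcription that is not available here.

```python
def ngrams(items, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


utterance = "that is great"

# Word n-grams: the sequence is the list of whitespace-separated tokens.
word_bigrams = ngrams(utterance.split(), 2)

# Character n-grams: the sequence is the raw string, spaces included.
char_trigrams = ["".join(g) for g in ngrams(utterance, 3)]

print(word_bigrams)       # [('that', 'is'), ('is', 'great')]
print(char_trigrams[:3])  # ['tha', 'hat', 'at ']
```

Character n-grams need no tokenizer or lexicon, which is one sense in which they are a very shallow representation.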

