grams

Key Lexical Chunks in Applied Linguistics Article Abstracts

Journal: Journal of Teaching Language Skills 2015

Hadi Farjami

In any discourse domain, certain chunks are particularly frequent and deserve attention by the novice to be initiated and by the expert to maintain a sense of community. To make a relevant contribution to the awareness about applied linguistics texts and discourse, this study attempted to develop lists of lexical chunks frequently used in the abstracts of applied linguistics journals. The abstr...


Authorship Attribution in Portuguese Using Character N-grams

2017

Ilia Markov Jorge Baptista Obdulia Pichardo-Lagunas

For the Authorship Attribution (AA) task, character n-grams are considered among the best predictive features. In the English language, it has also been shown that some types of character n-grams perform better than others. This paper tackles the AA task in Portuguese by examining the performance of different types of character n-grams, and various combinations of them. The paper also experimen...
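As a rough illustration of the feature type this abstract refers to, the sketch below extracts overlapping character n-grams and counts them. It is a minimal example only: the function names `char_ngrams` and `ngram_profile` are hypothetical, and the specific n-gram types and combinations evaluated in the paper are not reproduced here.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_profile(text, n=3):
    """Count how often each character n-gram occurs in `text`.

    A profile like this could serve as a feature vector for a
    classifier; that use is an assumption, not the paper's exact setup.
    """
    return Counter(char_ngrams(text, n))


profile = ngram_profile("banana", n=3)
print(profile.most_common())
```

Character n-grams need no tokenizer, which is one reason they carry over easily from English to other languages such as Portuguese.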


Unsupervised Approach for Automatic Keyword Extraction from Arabic Documents

2014

Arafat Atwi Awajan

In this paper, we present an unsupervised two-phase approach to extract keywords from Arabic documents that combines statistical analysis and linguistic information. The first phase detects all the N-grams that may be considered keywords. In the second phase, the N-grams are analyzed using a morphological analyzer to replace the words of the N-grams with their base forms that are the roots for ...


Approximate String Joins in a Database (Almost) for Free Erratum

2003

Luis Gravano Panagiotis G. Ipeirotis H. V. Jagadish Nick Koudas S. Muthukrishnan Divesh Srivastava

In [GIJ01a, GIJ01b] we described how to use q-grams in an RDBMS to perform approximate string joins. We also showed how to implement the approximate join using plain SQL queries. Specifically, we described three filters (the count filter, the position filter, and the length filter) that can be used to execute the approximate join efficiently. The intuition behind the count filter was that strings that are...


Growth Performance and Profitability of Broilers with Vermi Meal on Fermented Ration Under Two Management Systems

Journal: International Journal of Advanced Biological and Biomedical Research 2019

Marcos Bollido

This study was conducted to evaluate the effects of different levels of vermi (Eisenia fetida) meal on fermented ration on broiler chicken growth and profitability under two management systems. For this purpose, 120 day-old broiler chickens (Cobb Vantress) were tested in a Completely Randomized Design with four (4) dietary treatments: first, the commercial feeds (control), second, 2% vermi ...


Detection of New Malicious Code Using N-grams Signatures

2004

Tony Abou-Assaleh Nick Cercone Vlado Keselj Ray Sweidan

Signature-based malicious code detection is the standard technique in all commercial anti-virus software. This method can detect a virus only after the virus has appeared and caused damage. Signature-based detection performs poorly when attempting to identify new viruses. Motivated by the standard signature-based technique for detecting viruses, and a recent successful text classification metho...


CCG parsing with one syntactic structure per n-gram

2009

Tim Dawborn James R. Curran

There is an inherent redundancy in natural languages whereby certain common phrases (or n-grams) appear frequently in general sentences, each time with the same syntactic analysis. We explore the idea of exploiting this redundancy by pre-constructing the parse structures for these frequent n-grams. When parsing sentences in the future, the parser does not have to re-derive the parse structure f...


Rel-grams: A Probabilistic Model of Relations in Text

2012

Niranjan Balasubramanian Stephen Soderland Mausam Oren Etzioni

We introduce the Rel-grams language model, which is analogous to an n-grams model, but is computed over relations rather than over words. The model encodes the conditional probability of observing a relational tuple R, given that R′ was observed in a window of prior relational tuples. We build a database of Rel-grams co-occurrence statistics from ReVerb extractions over 1.8M news wire documents ...
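The conditional probability described in this abstract can be approximated from windowed co-occurrence counts. The sketch below is an assumption-laden illustration, not the authors' estimator: the function names, the plain relative-frequency estimate, the window size, and the example tuples are all hypothetical.

```python
from collections import Counter


def relgram_counts(tuples, window=2):
    """Count ordered co-occurrences of relational tuples.

    For each tuple R, every tuple R' seen in the `window` positions
    before it contributes one (R', R) co-occurrence.
    """
    pair_counts = Counter()
    prior_counts = Counter()
    for i, r in enumerate(tuples):
        for r_prime in tuples[max(0, i - window):i]:
            pair_counts[(r_prime, r)] += 1
            prior_counts[r_prime] += 1
    return pair_counts, prior_counts


def cond_prob(r, r_prime, pair_counts, prior_counts):
    """Relative-frequency estimate of P(R | R' seen in the prior window)."""
    if prior_counts[r_prime] == 0:
        return 0.0
    return pair_counts[(r_prime, r)] / prior_counts[r_prime]


# Hypothetical (arg1, relation, arg2) tuples, in document order.
doc = [
    ("police", "arrest", "suspect"),
    ("suspect", "be charged with", "crime"),
    ("court", "convict", "suspect"),
]
pairs, priors = relgram_counts(doc, window=2)
print(cond_prob(doc[1], doc[0], pairs, priors))
```

Because each prior tuple's count equals the sum of its pair counts, the estimates for a fixed R′ sum to one over all following tuples R.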


Syllable-Based Burrows-Wheeler Transform

2007

Jan Lansky Katsiaryna Chernik Zuzana Vlckova

The Burrows-Wheeler Transform (BWT) is a compression method that reorders an input string into a form better suited to a subsequent compression step. Usually the Move-To-Front transform and then Huffman coding are applied to the permuted string. The original method [3] from 1994 was designed for an alphabet compression. In 2001, versions working with word and n-gram alphabets were presented. The newe...
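To make the pipeline in this abstract concrete, here is a minimal character-level sketch of the BWT followed by Move-To-Front. It uses the naive sorted-rotations construction and a `"\0"` end marker, both of which are illustrative assumptions; it does not implement the syllable-based variant the paper proposes, and the Huffman stage is omitted.

```python
def bwt(s, end="\0"):
    """Naive BWT: sort all rotations of s + end, keep the last column."""
    s = s + end  # assumes `end` does not occur in s
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(t, end="\0"):
    """Invert the transform by repeatedly prepending and sorting."""
    table = [""] * len(t)
    for _ in range(len(t)):
        table = sorted(t[i] + table[i] for i in range(len(t)))
    original = next(row for row in table if row.endswith(end))
    return original[:-1]


def move_to_front(s):
    """Encode each symbol as its index in a self-adjusting symbol list."""
    symbols = sorted(set(s))
    codes = []
    for ch in s:
        idx = symbols.index(ch)
        codes.append(idx)
        symbols.insert(0, symbols.pop(idx))  # move the symbol to the front
    return codes


transformed = bwt("banana")
codes = move_to_front(transformed)
print(repr(transformed), codes)
```

The BWT tends to group identical symbols together, so Move-To-Front emits many small indices, which an entropy coder such as Huffman coding can then represent compactly.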


ASOBEK at SemEval-2016 Task 1: Sentence Representation with Character N-gram Embeddings for Semantic Textual Similarity

2016

Asli Eyecioglu Bill Keller

A growing body of research has recently been conducted on semantic textual similarity using a variety of neural network models. While recent research focuses on word-based representation for phrases, sentences and even paragraphs, this study considers an alternative approach based on character n-grams. We generate embeddings for character n-grams using a continuous-bag-of-n-grams neural network...
