نتایج جستجو برای: n grams
تعداد نتایج: 982486 فیلتر نتایج به سال:
The COVID-19 pandemic announced by the World Health Organization has disrupted human lives at different scales, including economy, public health, and people's emotions. Social media databases record huge accumulated information concern this pandemic. Twitter platform is considered one of most active social that enable users to tweet in conversations they are concerned about. problem arises when...
Understanding an enterprise’s workforce and skill-set can be seen as the key to understanding an organization’s capabilities. In today’s large organizations it has become increasingly difficult to find people that have specific skills or expertise or to explore and understand the overall picture of an organization’s portfolio of topic expertise. This article presents a case study of analyzing a...
Short Message Service (SMS) is an important component of modern mobile services. Given unique characteristics of Chinese language, it is imperative to conduct study to understand characteristic of language usage patterns in Chinese SMS so that important facts like why and how people in China use SMS can be discovered. In this paper, we report an analysis of Chinese SMS logs from three different...
A new extension to ELAN offers expanded n-gram analysis tools including improved search capabilities and an extensive library of statistical measures of association for n-grams. This paper presents an overview of the new tools and a case study in American Sign Language synthesis that exploits these capabilities for computing more natural timing in generated sentences. The new extension provides...
CNG Method for Authorship Attribution. The Common N-Grams (CNG) classification method for authorship attribution (AATT) was described in [2]. The method is based on extracting the most frequent byte n-grams of size n from the training data. The n-grams are sorted by their normalized frequency, and the first L most-frequent n-grams define an author profile. Given a test document, the test profil...
Plagiarism and copyright infringement are major problems in academic and corporate environments. Existing solutions for detecting infringements in structured text such as source code are restricted to textual similarity comparisons of two pieces of work. In this paper, we examine authorship attribution as a means for tackling plagiarism detection. Given several samples of work from several auth...
This paper deals with automatic supervised classification of documents. The approach suggested is based on a vector representation of the documents centred not on the words but on the n-grams of characters for varying n. The effects of this method are examined in several experiments using the multivariate chi-square to reduce the dimensionality, the cosine and Kullback&Liebler distances, and tw...
A cluster keyboard partitions the letters of the alphabet onto subset keys. On such keyboards most words are typed with no more key presses than on the standard keyboard, but a key sequence may stand for two or more words. In current practice, this ambiguity problem is addressed by hypothesizing words according to their unigram (occurrence) frequency. When the hypothesized word is not the inten...
This paper presents an approach for author profiling of an unknown users from their texts produced in social media. In particular, we address the identification of two profile dimensions: gender and language variety, of Arabic twitter users based on their tweets. Our approach focused on applying metaclassification technique on features extracted from tweets body. We explored two main sets of fe...
-A series of experiments is being conducted on the Farsi language in the domain of laws in the university of Tehran. One of the goals of these experiments is to establish the performance of different weighting schemes and retrieval models. For the lack of a Farsi stemmer and some characterisitics of the language, it was decided to experiment with N-grams. With un-stemmed words and 2-grams, 3-gr...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید