Search results for: grams

Number of results: 8929

Journal: :International seminars in surgical oncology : ISSO 2005
M Salhab W Al Sarakbi K Mokbel

Results Overall the breast weight was higher in patients with ER+ disease. The weight was found to be significantly higher in women aged 50 years or older with ER+ tumours (669 vs. 220 grams, p = 0.015). There was no significant difference in breast weight between ER+ and ER- tumours in women aged less than 50 years (median weight: 440 vs. 408 grams, p = 0.379). We observed a non-significant assoc...

2004
David Holmes Samsum Kashfi Syed Uzair Aqeel

We address name search for transliterated Arabic given names. In previous work, we addressed similar problems with English and Arabic surnames. In each previous case, we used a variant of Soundex and n-grams to improve precision and recall of name matching compared against well known approaches such as the Russell Soundex algorithm. Unlike prior work, the proposed approach does not rely upon So...

2017
Shervin Malmasi Marcos Zampieri

In this paper we examine methods to detect hate speech in social media, while distinguishing this from general profanity. We aim to establish lexical baselines for this task by applying supervised classification methods using a recently released dataset annotated for this purpose. As features, our system uses character n-grams, word n-grams and word skip-grams. We obtain results of 78% accuracy...
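The three feature types named in this abstract can be illustrated with a minimal sketch. The function names and the skip-gram definition used here (word pairs with at most `k` skipped tokens between them) are assumptions for illustration, not the authors' implementation.

```python
def char_ngrams(text, n):
    """All contiguous character substrings of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(tokens, n):
    """All contiguous runs of n tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_bigrams(tokens, k):
    """Ordered word pairs with at most k tokens skipped between them.

    Assumed definition: with k = 0 this reduces to ordinary word bigrams.
    """
    pairs = []
    for i in range(len(tokens)):
        # j may run up to k + 1 positions past i, bounded by the sequence end.
        for j in range(i + 1, min(i + k + 2, len(tokens))):
            pairs.append((tokens[i], tokens[j]))
    return pairs
```

In a supervised classifier, the lists returned by such functions would typically be counted and turned into a sparse feature vector per document.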

Journal: :Chemica : Jurnal Teknik Kimia (e-journal) 2021

Bioplastics are plastics that are easily degraded in nature, which can minimize their potential as pollutants. They are mainly made from agro-polymers such as cellulose and starch. Starch sugar palm (Arenga pinnata) has the gelatinization principle requires bioplastic raw materials by adding glycerol and chitosan. This research aims to determine the effect of the addition of chitosan composition variation on mechanical...

Journal: :Information Sciences 2023

A text written using symbols from a given alphabet can be compressed using the Huffman code, which minimizes the length of the encoded text. It is necessary, however, to employ a text-specific codebook, i.e. a symbol-codeword dictionary, to decode the original text. Thus, compression performance should be evaluated by the full code length: the encoded text plus the codebook. We studied several alphabets for compressing texts – letters, n-grams, syllable...
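The idea of scoring a Huffman code by the encoded text plus the codebook can be sketched as follows. This is a rough illustration only: the codebook cost model (8 bits per character of each symbol plus the codeword itself) is a hypothetical choice, not the measure used in the paper, and all function names are assumptions.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Map each symbol to its Huffman codeword length in bits."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(f, i, {sym: 0}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in d1.items()}
        merged.update({s: depth + 1 for s, depth in d2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


def encoded_bits(symbols):
    """Length in bits of the Huffman-encoded symbol sequence."""
    freqs = Counter(symbols)
    lengths = huffman_code_lengths(freqs)
    return sum(freqs[s] * lengths[s] for s in freqs)


def full_length_bits(symbols):
    """Encoded text plus a crude, hypothetical codebook cost."""
    lengths = huffman_code_lengths(Counter(symbols))
    # Assumed cost model: 8 bits per character of the symbol, plus its codeword.
    codebook = sum(8 * len(s) + bits for s, bits in lengths.items())
    return encoded_bits(symbols) + codebook
```

Passing a string treats single letters as the alphabet; passing a list of n-grams or syllables uses those as symbols, which usually shortens the encoded text but enlarges the codebook.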

1996
Xiang Tong David A. Evans

This paper describes an automatic, context-sensitive, word-error correction system based on statistical language modeling (SLM) as applied to optical character recognition (OCR) postprocessing. The system exploits information from multiple sources, including letter n-grams, character confusion probabilities, and word-bigram probabilities. Letter n-grams are used to index the words in the lexico...

2007
Chen Li Bin Wang Xiaochun Yang

Many applications need to solve the following problem of approximate string matching: from a collection of strings, how to find those similar to a given string, or the strings in another (possibly the same) collection of strings? Many algorithms are developed using fixed-length grams, which are substrings of a string used as signatures to identify similar strings. In this paper we develop a nov...
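The fixed-length gram baseline mentioned in this abstract can be sketched as below. It uses Jaccard similarity over sets of q-grams as the matching criterion, which is one common choice and an assumption here; the function names, the threshold, and the linear scan are illustrative and do not reproduce the paper's own technique.

```python
def qgrams(s, q=2):
    """Set of all substrings of s with fixed length q."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def jaccard(a, b, q=2):
    """Jaccard similarity of the q-gram sets of two strings."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    if not ga and not gb:
        return 1.0  # both strings too short to produce any gram
    return len(ga & gb) / len(ga | gb)


def similar_strings(query, collection, threshold=0.5, q=2):
    """Strings in the collection whose gram overlap with query meets the threshold."""
    return [s for s in collection if jaccard(query, s, q) >= threshold]
```

A practical system would index the grams (for example with inverted lists) rather than compare the query against every string.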

2015
Bogdan Marchis Alexandru Tifrea Mihai Volmer Traian Rebedea

This paper presents a new approach for finding the best n-grams that efficiently summarize a large set of reviews. The proposed unsupervised method uses a readability score and a representativeness score to select those n-grams that best convey the main opinions contained in the processed reviews. In order to further refine the selected n-grams, we use sentiment analysis and part of speech (POS)...

Journal: :Computación y Sistemas 2014
Hiram Calvo Andrea Segura-Olivares Alejandro García

Paraphrase recognition consists in detecting if an expression restated as another expression contains the same information. Traditionally, for solving this problem, several lexical, syntactic and semantic based techniques are used. For measuring word overlapping, most of the works use n-grams; however syntactic n-grams have been scantily explored. We propose using syntactic dependency and...

Journal: :Journal of the American Medical Informatics Association : JAMIA 2014
Rao Muhammad Adeel Nawab Mark Stevenson Paul D. Clough

OBJECTIVE We aim to identify duplicate pairs of Medline citations, particularly when the documents are not identical but contain similar information. MATERIALS AND METHODS Duplicate pairs of citations are identified by comparing word n-grams in pairs of documents. N-grams are modified using two approaches which take account of the fact that the document may have been altered. These are: (1) d...
