Common phrases and minimum-space text storage

Similar resources

Disambiguating Cue Phrases in Text and Speech

Cue phrases are linguistic expressions such as 'now' and 'well' that may explicitly mark the structure of a discourse. For example, while the cue phrase 'incidentally' may be used SENTENTIALLY as an adverbial, the DISCOURSE use initiates a digression. In [8], we noted the ambiguity of cue phrases with respect to discourse and sentential usage and proposed an intonational model for their disamb...

Extraction of Significant Phrases from Text

Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This pape...

Classifying Cue Phrases in Text a

Cue phrases may be used in a discourse sense to explicitly signal discourse structure, but also in a sentential sense to convey semantic rather than structural information. This paper explores the use of machine learning for classifying cue phrases as discourse or sentential. Two machine learning programs (CGRENDEL and C4.5) are used to induce classification rules from sets of pre-classified cue...

Statistical Phrases in Automated Text Categorization

In this work we investigate the usefulness of n-grams for document indexing in text categorization (TC). We call n-gram a set t_k of n word stems, and we say that t_k occurs in a document d_j when a sequence of words appears in d_j that, after stop word removal and stemming, consists exactly of the n stems in t_k, in some order. Previous research has investigated the use of n-grams (or some varia...

Detecting multiword phrases in mathematical text corpora

We present an approach for detecting multiword phrases in mathematical text corpora. The method used is based on characteristic features of mathematical terminology. It makes use of a software tool named Lingo, which allows words to be identified by means of previously defined dictionaries for specific word classes such as adjectives, personal names or nouns. The detection of multiword groups is done algo...

Journal

Journal title: Communications of the ACM

Year: 1973

ISSN: 0001-0782,1557-7317

DOI: 10.1145/361972.361982