Writing Style Recognition and Sentence Extraction

Author

  • Hans van Halteren
Abstract

This paper examines whether feature sets which have been developed for authorship attribution can also be used for the sentence extraction task. Experiments show that the feature sets distinguish significantly better between extract and non-extract sentences than a random baseline classifier, but that a careful combination with other features is necessary in order to outperform a positional baseline classifier. Furthermore, it is vital that the training material reflects the intended task.

Similar resources

Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination

Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...

New Feature Sets for Summarization by Sentence Extraction

they don’t necessarily provide a coherent account— be used as a basis for further processing. Ideally, the document would be thoroughly analyzed using linguistic and world knowledge to determine which sentences are appropriate for the extract. In practice, the necessary analysis is still too immature or too computationally intensive to yield sufficient results. Many existing systems extract sen...

Persian Character Recognition Using New Hybridization of Independent Orthogonal Moments

Character recognition is a new research field in the domain of pattern recognition which deals with the style of writing. Some of the challenging problems in character identification are changes in the style of writing, the font, and the turns of words. In this paper, the goal is Persian character identification using independent orthogonal moment as the feature extraction technique. The propo...

The study and recognition of artistic dyes in the Islamic period of Iran in writing and painting (Based on poetry of Khorasanid style poets)

A main feature of Iranian painting in the post-Islamic centuries is its association with Persian literature. Persian literature and Persian art have intrinsic links, since the artist and the poet create from a unified vision rooted in a shared culture and intellectual space. The result of the poet's creation is a literary work, and this work can have all the features of a work of art....

FOR SENTENCE UNIT SEGMENTATION FROM SPEECH Sébastien Cuendet

The sentence segmentation task is a classification task that aims at inserting sentence boundaries in a sequence of words. One of the applications of sentence segmentation is to detect the sentence boundaries in the sequence of words that is output by an automatic speech recognition system (ASR). The purpose of correctly finding the sentence boundaries in ASR transcriptions is to make it possib...

Publication date: 2002