Robustness of sentence length measures in written texts

نویسندگان
چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Word-length entropies and correlations of natural language written texts

We study the frequency distributions and correlations of the word lengths of ten European languages. Our findings indicate that a) the word-length distribution of short words quantified by the mean value and the entropy distinguishes the Uralic (Finnish) corpus from the others, b) the tails at long words, manifested in the high-order moments of the distributions, differentiate the Germanic lang...

متن کامل

The Effect of Length of Pre-task Planning Time on Discourse-analytic Measures and Analytic Ratings in L2 Written Narratives

The favorable gains gleaned from the provision of pre-task planning time (PTP) have struck a chord with SLA researchers as they try to manipulate task features to promote language production and development. In a similar vein, the present study is a two-fold attempt to first compare the effect of the length of pre-task planning time on discourse-analytic measures in narrative written production...

متن کامل

Automatic Structuring of Written Texts

This paper deals with automatic structuring and sentence boundary labelling in natural language texts. We describe the implemented structure tagging algorithm and heuristic rules that are used for automatic or semiautomatic labelling. Inside the detected sentence the algorithm performs a decomposition to clauses and then marks the parts of text which do not form a sentence, i.e. headings, signa...

متن کامل

Discourse Segmentation of German Written Texts

Discourse segmentation is the division of a text into minimal discourse segments, which form the leaves in the trees that are used to represent discourse structures. A definition of elementary discourse segments in German is provided by adapting widely used segmentation principles for English minimal units, while considering punctuation, morphology, sytax, and aspects of the logical document st...

متن کامل

Detecting Sentence Boundaries in Sanskrit Texts

The paper applies a deep recurrent neural network to the task of sentence boundary detection in Sanskrit, an important, yet underresourced ancient Indian language. The deep learning approach improves the F scores set by a metrical baseline and by a Conditional Random Field classifier by more than 10%.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Physica A: Statistical Mechanics and its Applications

سال: 2018

ISSN: 0378-4371

DOI: 10.1016/j.physa.2018.04.104