Computational Constancy Measures of Texts—Yule's K and Rényi's Entropy

Authors

  • Kumiko Tanaka-Ishii
  • Shunsuke Aihara
Abstract

This article presents a mathematical and empirical verification of computational constancy measures for natural language text. A constancy measure characterizes a given text by taking an invariant value for any text size above a certain threshold. The study of such measures has a 70-year history dating back to Yule’s K, whose original intended application was author identification. We examine the various measures proposed since Yule and reconsider the reports made so far, thus giving an overview of the study of constancy measures. We then explain how K is essentially equivalent to an approximation of the second-order Rényi entropy, thus indicating its significance within language science. We then empirically examine constancy measure candidates within this new, broader context. The approximated higher-order entropy exhibits stable convergence across different languages and kinds of text. We also show, however, that it cannot identify authors, contrary to Yule’s intention. Lastly, we apply K to two unknown scripts, the Voynich manuscript and Rongorongo, and show how the results support previous hypotheses about these scripts.
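The relation between K and the second-order Rényi entropy mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions rather than the article's exact formulation: Yule's K as C · (Σ f² − N) / N² with C = 10⁴, where f is each word type's frequency and N the number of tokens, and a plug-in estimate of the second-order Rényi entropy, H₂ = −log Σ (f/N)². The function names `yule_k` and `renyi_h2_plugin` are hypothetical, chosen only for this illustration.

```python
import math
from collections import Counter


def yule_k(tokens, c=1e4):
    """Yule's K under the standard definition (an assumption here):
    K = c * (sum of squared type frequencies - N) / N^2."""
    n = len(tokens)
    freqs = Counter(tokens)
    sum_sq = sum(f * f for f in freqs.values())
    return c * (sum_sq - n) / (n * n)


def renyi_h2_plugin(tokens):
    """Plug-in estimate of the second-order Renyi entropy (natural log):
    H2 = -log(sum of squared relative frequencies)."""
    n = len(tokens)
    freqs = Counter(tokens)
    return -math.log(sum((f / n) ** 2 for f in freqs.values()))


if __name__ == "__main__":
    # Toy input for illustration only; real use needs a long text.
    tokens = "a b a c a b".split()
    k = yule_k(tokens)
    h2 = renyi_h2_plugin(tokens)
    # Under these definitions, exp(-H2) = K / c + 1 / N exactly,
    # so K and H2 carry the same information once N is large.
    print(k, h2, math.exp(-h2), k / 1e4 + 1 / len(tokens))
```

Under these assumed definitions, exp(−H₂) equals K/C + 1/N, so the 1/N term vanishes as the text grows and K becomes a monotone function of the estimated H₂. Whether this matches the estimator used in the article is not confirmed by the abstract.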





Publication date: 2015