Word Complexity And Repetitions In Words

Authors

  • Lucian Ilie
  • Sheng Yu
  • Kaizhong Zhang
Abstract

With ideas from data compression and combinatorics on words, we introduce a complexity measure for words, called repetition complexity, which quantifies the amount of repetition in a word. The repetition complexity of w, r(w), is defined as the smallest amount of space needed to store w when reduced by repeatedly applying the following procedure: n consecutive occurrences uu...u of the same subword u of w are stored as (u, n). The repetition complexity has interesting relations with well-known complexity measures, such as subword complexity, sub, and Lempel-Ziv complexity, lz. We always have r(w) ≥ lz(w), and it can even happen that the former is linear while the latter is only logarithmic; e.g., this happens for prefixes of certain infinite words obtained by iterated morphisms. An infinite word α being ultimately periodic is equivalent to: (i) sub(pref_n(α)) = O(n), (ii) lz(pref_n(α)) = O(1), and (iii) r(pref_n(α)) = lg n + O(1). De Bruijn words, well known for their high subword complexity, are shown to have almost the highest repetition complexity; the precise complexity remains open. r(w) can be computed in time O(n^3 (log n)^2), and it is open, and probably very difficult, to find fast algorithms.
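The reduction procedure described in the abstract can be illustrated with a small dynamic program. The following is only a sketch under an assumed cost model, not the authors' algorithm: each stored letter costs 1, and a stored power (u, n) costs the cost of the reduced u plus the number of decimal digits of n. The function name `repetition_complexity` and this exact cost model are assumptions made for illustration; the paper's formal definition of "space" may differ.

```python
from functools import lru_cache


def repetition_complexity(w: str) -> int:
    """Minimum storage cost of w under an ASSUMED cost model:
    a letter costs 1; a power (u, k) costs cost(u) + decimal digits of k."""

    @lru_cache(maxsize=None)
    def cost(i: int, j: int) -> int:
        # Cost of the best reduced form of the subword w[i:j].
        length = j - i
        if length <= 1:
            return length
        # Option 1: concatenate the best forms of two shorter parts.
        best = min(cost(i, m) + cost(m, j) for m in range(i + 1, j))
        # Option 2: store w[i:j] as (u, k) when it equals u repeated k >= 2 times.
        for p in range(1, length // 2 + 1):
            if length % p == 0:
                k = length // p
                if w[i:j] == w[i:i + p] * k:
                    best = min(best, cost(i, i + p) + len(str(k)))
        return best

    return cost(0, len(w))
```

For instance, under this cost model `repetition_complexity("abababab")` evaluates to 3, corresponding to storing the word as (ab, 4). The sketch is meant for short words only; it makes no attempt to match the running time quoted in the abstract.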

Similar resources

Abelian Properties of Words

We say that two finite words u and v are abelian equivalent if and only if they have the same number of occurrences of each letter, or equivalently if they define the same Parikh vector. In this paper we investigate various abelian properties of words including abelian complexity, and abelian powers. We study the abelian complexity of the Thue-Morse word and the Tribonacci word, and answer an o...

The Sum of Exponents of Maximal Repetitions in Standard Sturmian Words

A maximal repetition is a non-extendable (with the same period) periodic segment in a string, in which the period repeats at least twice. In this paper we study problems related to the structure of maximal repetitions in standard Sturmian words and present the formulas for the sum of their exponents. Moreover, we show how to compute the sum of exponents of maximal repetitions in any standard St...

On primary and secondary repetitions in words

Combinatorial properties of maximal repetitions (runs) in formal words are studied. We classify all maximal repetitions in a word as primary and secondary, where the set of all primary repetitions determines all the other repetitions in the word. Essential combinatorial properties of primary repetitions are established.

Characteristics of final part-word repetitions

In an earlier paper, we have described final part-word repetitions in the conversational speech of two school-age boys of normal intelligence with no known neurological lesions. In this paper we explore in more detail the phonetic and linguistic characteristics of the speech of the boys. The repeated word fragments were more likely to be preceded by a pause than followed by one. The word immedi...

Comparison of emotional and non-emotional word repetitions in patients with aphasia

BACKGROUND: Aphasia is a language disorder caused by left hemisphere damage. Some therapeutic approaches to aphasia use right hemisphere (RH) abilities, such as emotional perception, to stimulate language processing in the left hemisphere. The aim of this study is to investigate emotional word repetition in aphasia after a stroke, in Persian language patient...

EFL Textbook Evaluation: An Analysis of Readability and Vocabulary Profiler of Four Corners Book Series

This study aimed to investigate whether there is any significant relationship between the readability and the vocabulary profile, including the most frequent words (K1 words) and the academic word list (AWL), of the reading passages in the Four Corners series of EFL textbooks. To determine the readability of the texts, the Flesch–Kincaid (1975) readability test was used, while the texts' academic word li...

Journal:
  • Int. J. Found. Comput. Sci.

Volume 15

Publication date: 2004