Counting occurrences for a finite set of words: an inclusion-exclusion approach

Authors

  • Pierre Nicodème
  • Frédérique Bassino
  • Julien Clément
  • Julien Fayolle
Abstract

In this paper, we give the multivariate generating function that counts texts according to their length and to the number of occurrences of words from a finite set. The result is derived by applying the inclusion-exclusion principle to word counting, following Goulden and Jackson (1979, 1983). Unlike some other techniques, which assume that the set of words is reduced (i.e., no word is a factor of another), the finite set can be chosen arbitrarily. Noonan and Zeilberger (1999) already provided a Maple package treating the non-reduced case, without giving an expression for the generating function or a detailed proof. We give a complete proof validating the use of the inclusion-exclusion principle and compare the complexity of the proposed method with that of the automaton-based approach to the same problem.
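As a rough illustration of the quantity the abstract describes, the following Python sketch tabulates, for a fixed text length, how many texts contain exactly k occurrences of words from a finite set. It is not the inclusion-exclusion construction of the paper: it is a brute-force enumeration plus a simple suffix-tracking recurrence (closer in spirit to the automaton approach), useful only for checking coefficients on small cases. The function names `count_occurrences`, `brute_force` and `suffix_dp` are hypothetical, and the sketch assumes the words are distinct and non-empty and that overlapping occurrences are all counted.

```python
from collections import Counter
from itertools import product


def count_occurrences(text, words):
    """Total number of (possibly overlapping) occurrences of the words in text."""
    return sum(
        1
        for w in words
        for i in range(len(text) - len(w) + 1)
        if text.startswith(w, i)
    )


def brute_force(alphabet, words, n):
    """Map k -> number of length-n texts with exactly k occurrences (exhaustive)."""
    dist = Counter()
    for letters in product(alphabet, repeat=n):
        dist[count_occurrences("".join(letters), words)] += 1
    return dict(dist)


def suffix_dp(alphabet, words, n):
    """Same distribution, remembering only the last (longest word length - 1) letters."""
    keep = max(len(w) for w in words) - 1
    # State: (recent suffix of the text, occurrences seen so far) -> number of texts.
    states = Counter({("", 0): 1})
    for _ in range(n):
        nxt = Counter()
        for (suffix, k), c in states.items():
            for a in alphabet:
                t = suffix + a
                # Every word ending at the new letter is a suffix of t.
                hits = sum(1 for w in words if t.endswith(w))
                nxt[(t[-keep:] if keep else "", k + hits)] += c
        states = nxt
    dist = Counter()
    for (_, k), c in states.items():
        dist[k] += c
    return dict(dist)


if __name__ == "__main__":
    # A non-reduced set: "ab" is a factor of "bab".
    print(suffix_dp("ab", ["ab", "bab"], 6))
```

For the alphabet {a, b}, the single word aa and length 3, both functions report five texts with no occurrence, two texts with one occurrence, and one text (aaa) with two occurrences. The suffix recurrence does not require the set to be reduced, so it can serve as a sanity check on non-reduced sets such as {ab, bab}.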


Similar resources

Constructions for Clumps Statistics

We consider a component of word statistics known as the clump; starting from a finite set of words, clumps are maximal overlapping sets of occurrences of these words. This object was first studied by Schbath [22] with the aim of counting the number of occurrences of words in random texts. Later work with a similar probabilistic approach used the Chen-Stein approximation for a compound Poisson distrib...


The Inclusion-Exclusion Principle for IF-States

Applying two definitions of the union of IF-events, P. Grzegorzewski gave two generalizations of the inclusion-exclusion principle for IF-events. In this paper we prove an inclusion-exclusion principle for IF-states based on a method which can also be used to prove Grzegorzewski's inclusion-exclusion principle for probabilities on IF-events. Finally, we give some applications of this principle by...


Identification of Organizational Culture Components Based on Islamic – Iranian Values: A Field Literature Review with Synthesizing Approach

Organizational culture is defined as the prominent values and the set of key characteristics that govern an organization. Paying attention to the importance of organizational culture increases staff productivity and job satisfaction. Therefore, the aim of this study was the identification, counting and classification of organizational culture components based on Islamic-Iranian values by synthesizing appro...


Counting the Occurrences of Generalized Patterns in Words Generated by a Morphism

We count the number of occurrences of certain patterns in given words. We choose these words to be the set of all finite approximations of a sequence generated by a morphism with certain restrictions. The patterns in our considerations are either classical patterns 1-2, 2-1, 1-1-⋯-1, or arbitrary generalized patterns without internal dashes, in which repetitions of letters are allowed. In p...


The Peano curve and counting occurrences of some patterns

We introduce Peano words, which are words corresponding to finite approximations of the Peano space filling curve. We then find the number of occurrences of certain patterns in these words.




Publication date: 2007