We present a method which displays all palindromes of a given length from De Bruijn words of a certain order, and also a recursive one which constructs all palindromes of length \(n+1\) from the set of palindromes of length \(n\). We show that the palindrome complexity function, which counts the number of palindromes of each length contained in a given word, has a different shape compared with the usual (subword) complexity function. We give upper bounds for the average number of palindromes contained in all words of length n, and obtain exact formulae for the number of palindromes of length 1 and 2 contained in all words of length \(n\).
Authors
Mira-Cristiana Anisiu
Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy, Cluj-Napoca
Valeriu Anisiu
Department of Mathematics Faculty of Mathematics and Computer Science Babeş-Bolyai University of Cluj-Napoca
Zoltán Kása
Department of Computer Science Faculty of Mathematics and Computer Science Babeş-Bolyai University of Cluj-Napoca
Keywords
?
Paper coordinates
M.-C. Anisiu, V. Anisiu, Z. Kása, Properties of palindromes in finite words, Pure Math. Appl., 17 (2006) nos. 3-5, pp. 183-195.
[1] J.-P. Allouche, M. Baake, J. Cassaigne and D. Damanik, Palindrome complexity, Theoret. Comput. Sci., 292 (2003), 9–31.
[2] M.-C. Anisiu and J. Cassaigne, Properties of the complexity function for finite words, Rev. Anal. Num. Théor. Approx., 33 (2004), 123–139.
[3] J.-P. Borel and C. Reutenauer, Palindromic factors of billiard words, Theoret. Comput. Sci., 340 (2005), 334–348.
[4] N.G. De Bruijn, A combinatorial problem, Nederl. Akad. Wetensch. Proc., 49 (1946), 758–764 = Indag. Math., 8 (1946), 461–467.
[5] A. Ehrenfeucht, K.P. Lee and G. Rozenberg, Subword complexities of various classes of deterministic developmental languages without interactions, Theoret. Comput. Sci., 1 (1975), 59–75.
[6] C. Flye Sainte-Marie, Solution to question nr. 48, l’Intermédiaire des Mathématiciens, 1 (1894), 107–110
[7] H. Fredricksen, A survey of full length nonlinear shift register cycle algorithms, SIAM Review, 24 (1982), 195–221.
[8] R.A. Games, A generalized recursive construction for De Bruijn sequences, IEEE Trans. Inform. Theory, 29 (1983), 843–850.
[9] M. Giel-Pietraszuk, M. Hoffmann, S. Dolecka, J. Rychlewski and J. Barciszewski, Palindromes in proteins, J. Protein Chem., 22 (2003), 109–113.
[10] I.J. Good, Normal recurring decimals, J. London Math. Soc., 21 (1946),
167–169.
[11] R.L. Graham, D.E. Knuth and O. Patashnik, Concrete Mathematics: A Foundation for Computer Science, 2nd edition (Reading, Massachusetts: Addison-Wesley), 1994.
[12] M. Heinz, Zur Teilwortkomplexität für Wörter und Folgen über einem endlichen Alphabet, EIK, 13 (1977), 27–38.
[13] A. de Luca, On the combinatorics of finite words, Theoret. Comput. Sci., 218 (1999), 13–39.
[14] A. de Luca and Al. de Luca, Combinatorial properties of Sturmian palindromes, Int. J. Found. Comput. Sci., 17 (2006), 557–573.
[15] F. Levé and P. Séébold, Proof of a conjecture on word complexity, Bull. Belg. Math. Soc. Simon Stevin, 8 (2001), 277–291.
[16] M.H. Martin, A problem in arrangements, Bull. American Math. Soc., 40 (1934), 859–864.
[17] M. Morse and G.A. Hedlund, Symbolic dynamics, Amer. J. Math., 60 (1938), 815–866.
[18] A. Ralston, A new memoryless algorithm for De Bruijn sequences, J. Algorithms, 2 (1981), 50–62
2006-Anisiu-A-K-Properties
Properties of palindromes in finite words
Mira-Cristiana AnisiuTiberiu Popoviciu Institute of Numerical AnalysisRomanian Academy, Cluj-Napocae-mail: mira@math.ubbcluj.roandValeriu AnisiuDepartment of MathematicsFaculty of Mathematics and Computer ScienceBabeş-Bolyai University of Cluj-Napocae-mail: anisiu@math.ubbcluj.roandZoltán KÁsaDepartment of Computer ScienceFaculty of Mathematics and Computer ScienceBabeş-Bolyai University of Cluj-Napocae-mail: kasa@cs.ubbcluj.ro
(Received: July 12-15, 2006)
Abstract
We present a method which displays all palindromes of a given length from De Bruijn words of a certain order, and also a recursive one which constructs all palindromes of length n+1n+1 from the set of palindromes of length nn. We show that the palindrome complexity function, which counts the number of palindromes of each length contained in a given word, has a different shape compared with the usual (subword) complexity function. We give upper bounds for the average number of palindromes contained in all words of length nn, and obtain exact formulae for the number of palindromes of length 1 and 2 contained in all words of length nn.
Mathematics Subject Classifications (2000). 68R15
1 Introduction
The palindrome complexity of infinite words has been studied by several authors (see [1], [3], [14] and the references therein). Similar problems related to the number of palindromes are important for finite words too. One of the reasons is that palindromes occur in DNA sequences (over 4 letters) as well as in protein description (over 20 letters), and their role is under research ([9]).
Let an alphabet AA with card(A)=q >= 1\operatorname{card}(A)=q \geq 1 be given. The set of the words of length nn over AA will be denoted by A^(n)A^{n}.
Given a word w=w_(1)w_(2)dotsw_(n)w=w_{1} w_{2} \ldots w_{n}, the reversed of ww is widetilde(w)=w_(n)dotsw_(2)w_(1)\widetilde{w}=w_{n} \ldots w_{2} w_{1}. Denoting by epsi\varepsilon the empty word, we put by convention widetilde(epsi)=epsi\widetilde{\varepsilon}=\varepsilon. The word ww is a palindrome if widetilde(w)=w\widetilde{w}=w. We denote by a^(k)a^{k} the word ubrace(a dots aubrace)_(k" times ")\underbrace{a \ldots a}_{k \text { times }}. The set of the subwords of a word ww which are nonempty palindromes will be denoted by PAL ( ww ). The (infinite) set of all palindromes over the alphabet AA is denoted by PAL(A)\mathrm{PAL}(A), while PAL_(n)(A)=PAL(A)nnA^(n)\operatorname{PAL}_{n}(A)=\operatorname{PAL}(A) \cap A^{n}.
2 Storing and generating palindromes
An old problem asks if, given an alphabet AA with card(A)=q\operatorname{card}(A)=q, there exists a shortest word of length q^(k)+k-1q^{k}+k-1 containing all the q^(k)q^{k} words of length kk. The answer is affirmative and was given in [6], [10], [4]. For each k inNk \in \mathbb{N}, these words are called De Bruijn words of order kk. This property can be proved by means of the Eulerian cycles in the De Bruijn graph B_(k-1)B_{k-1}. If a window of length kk is moved along a De Bruijn word, at each step a different word is seen, all the q^(k)q^{k} words being displayed.
We ask if it is possible to arrange all palindromes of length kk in a similar way. The answer is in general no, excepting the case of the two palindromes aba dots aa b a \ldots a and bab dots bb a b \ldots b of odd length.
Proposition 1 Given a word w inA^(n)w \in A^{n} and k >= 2k \geq 2, the following statements are equivalent:
(1) all the subwords of length kk are palindromes;
(2) nn is even, k=n-1k=n-1 and there exists a,b in A,a!=ba, b \in A, a \neq b so that w=(ab)^(n//2)w=(a b)^{n / 2}.
Furthermore, in this case the only palindromes of ww are (ab)^(n//2-2)a(a b)^{n / 2-2} a and (ba)^(n//2-2)b(b a)^{n / 2-2} b.
Proof. Let us consider the first two palindromes a_(1)a_(2)dotsa_(k)a_{1} a_{2} \ldots a_{k} and b_(1)b_(2)dotsb_(k)b_{1} b_{2} \ldots b_{k} such that a_(2)a_(2)dotsa_(k)=b_(1)b_(2)dotsb_(k-1)a_{2} a_{2} \ldots a_{k}=b_{1} b_{2} \ldots b_{k-1}, hence
a_(k-i+1)=a_(i)=b_(i-1)=b_(k-i+2),quad i=2,dots,ka_{k-i+1}=a_{i}=b_{i-1}=b_{k-i+2}, \quad i=2, \ldots, k
If k=2l,(l >= 1)k=2 l,(l \geq 1) we have b_(2)=a_(1)=a_(3)=dots=a_(k-1)b_{2}=a_{1}=a_{3}=\ldots=a_{k-1} and b_(3)=a_(2)=dots=a_(k)b_{3}=a_{2}=\ldots=a_{k} and a_(1)a_(2)dotsa_(k)a_{1} a_{2} \ldots a_{k} is a palindrome if and only if a_(1)=a_(2)=dots=a_(k)a_{1}=a_{2}=\ldots=a_{k}, hence a_(1)a_(2)dotsa_(k)=a^(k)a_{1} a_{2} \ldots a_{k}=a^{k}; it follows that b_(1)b_(2)dotsb_(k)=a^(k)b_{1} b_{2} \ldots b_{k}=a^{k} too, and the two palindromes are equal.
If k=2l+1k=2 l+1, we have b_(2)=a_(1)=a_(3)=dots=a_(k)b_{2}=a_{1}=a_{3}=\ldots=a_{k} and b_(3)=a_(2)=dots=a_(k-1)b_{3}=a_{2}=\ldots=a_{k-1}, hence a_(1)a_(2)dotsa_(k)=abab dots a(a!=b)a_{1} a_{2} \ldots a_{k}=a b a b \ldots a(a \neq b) and b_(1)b_(2)dotsb_(k)=bab dots bb_{1} b_{2} \ldots b_{k}=b a b \ldots b. If another palindrome will follow, it must be again (ab)^(n//2)(a b)^{n / 2} (equal with the first one).
Remark 1 For k=1k=1, the maximum length of a word containing all distinct palindromes of length 1 (i.e. letters) exactly once is n=qn=q.
It is obvious that for k >= 2k \geq 2 it is not possible to arrange all palindromes of length kk in the most compact way. But each palindrome is determined by the
parity of its length and its first |~k//2~|\lceil k / 2\rceil letters, where |~*~|\lceil\cdot\rceil denotes the ceil function (which returns the smallest integer that is greater than or equal to a specified number).
Proposition 2 All palindromes of length kk can be obtained from a De Bruijn word of length q^(|~k//2~|)+|~k//2~|-1q^{\lceil k / 2\rceil}+\lceil k / 2\rceil-1.
Proof. The De Bruijn word contains all different words of length |~k//2~|\lceil k / 2\rceil. Each such word a_(1)dotsa_(|~k//2~|)a_{1} \ldots a_{\lceil k / 2\rceil} can be extended to a palindrome by symmetry, for kk even, and by taking a_(|~k//2~|+1)=a_(|~k//2~|-1),dots,a_(k)=a_(1)a_{\lceil k / 2\rceil+1}=a_{\lceil k / 2\rceil-1}, \ldots, a_{k}=a_{1}, for kk odd.
Example 1 Let k=3,q=3k=3, q=3 and the De Bruijn word of order |~k//2~|=2w_(1)=0221201100\lceil k / 2\rceil=2 w_{1}=0221201100. From each word of length 2 which appears in the given De Bruijn word, we obtain the corresponding palindrome of length k=3k=3 :
Let k=4,q=2k=4, q=2 and the De Bruijn word of order |~k//2~|=2w_(2)=01100\lceil k / 2\rceil=2 w_{2}=01100. From each word of length 2 contained in 01100 we obtain by symmetry the corresponding palindrome of length k=4k=4 :
There are several algorithms which construct De Bruijn words, for example, in [16], [18], [7] and [8].
We can generate recursively all palindromes of length n,n inNn, n \in \mathbb{N}, using the difference representation. This is based on the following proposition.
Proposition 3 If w_(1),w_(2),dots,w_(p)w_{1}, w_{2}, \ldots, w_{p} are all binary ( A={0,1}A=\{0,1\} ) palindromes of length nn, where p=2^(|~(n)/(2)~|),n >= 1p=2^{\left\lceil\frac{n}{2}\right\rceil}, n \geq 1, then
Proof. If ww is a binary palindrome of length nn, then 0w00 w 0 and 1w11 w 1 will be palindromes too, and the only palindromes of length n+2n+2 which contains ww as a subword, which proves the proposition.
In order to generate all binary palindromes of a given length let us begin with an example considering all binary palindromes of length 3 and 4 and their decimal representation:
The sequence of palindromes in increasing order based on their decimal value for a given length can be represented by their differences. The difference representation of the sequence 0,2,5,70,2,5,7 is 2,3,2(2-0=2,5-2=3,7-5=2)2,3,2(2-0=2,5-2=3,7-5=2), and the difference representation of the sequence 0,6,9,150,6,9,15 is 6,3,66,3,6. A difference representation is always a symmetric sequence and the corresponding sequence of palindromes in decimal can be obtained by successive addition beginning with 0:0+6=6,6+3=9,9+6=150: \mathbf{0}+6=\mathbf{6}, \mathbf{6}+3=\mathbf{9}, \mathbf{9}+6=\mathbf{1 5}. By direct computation we obtain the following difference representation of palindromes for length n <= 8n \leq 8.
This representation can be generalized for q >= 2q \geq 2. The number of palindromes in this case is q^(|~(n)/(2)~|)q^{\left\lceil\frac{n}{2}\right\rceil}.
For n=2kn=2 k we have the difference representation:
from which the difference representation for 2k+12 k+1 is:
ubrace(q^(k),dots,q^(k)ubrace)_(q-1" times "),a_(1),ubrace(q^(k),dots,q^(k)ubrace)_(q-1" times "),a_(2),ubrace(q^(k),dots,q^(k)ubrace)_(q-1" times "),dots,ubrace(q^(k),dots,q^(k)ubrace)_(q-1" times "),a_(q^(k)-1),ubrace(q^(k),dots,q^(k)ubrace)_(q-1" times ").\underbrace{q^{k}, \ldots, q^{k}}_{q-1 \text { times }}, a_{1}, \underbrace{q^{k}, \ldots, q^{k}}_{q-1 \text { times }}, a_{2}, \underbrace{q^{k}, \ldots, q^{k}}_{q-1 \text { times }}, \ldots, \underbrace{q^{k}, \ldots, q^{k}}_{q-1 \text { times }}, a_{q^{k}-1}, \underbrace{q^{k}, \ldots, q^{k}}_{q-1 \text { times }} .
For n=2k+1n=2 k+1 we have the difference representation:
ubrace(q^(k),dots,q^(k)ubrace)_(q-1" times "),a_(1),ubrace(q^(k),dots,q^(k)ubrace)_(q-1" times "),a_(2),dots,a_(q^(k)-1),ubrace(q^(k),dots,q^(k)ubrace)_(q-1" times "),\underbrace{q^{k}, \ldots, q^{k}}_{q-1 \text { times }}, a_{1}, \underbrace{q^{k}, \ldots, q^{k}}_{q-1 \text { times }}, a_{2}, \ldots, a_{q^{k}-1}, \underbrace{q^{k}, \ldots, q^{k}}_{q-1 \text { times }},
from which the difference representation for 2k+22 k+2 is:
{:[ubrace((q+1)q^(k),dots,(q+1)q^(k)ubrace)_(q-1" times ")","a_(1)","ubrace((q+1)q^(k),dots,(q+1)q^(k)ubrace)_(q-1" times ")","a_(2)","],[ dots","ubrace((q+1)q^(k),dots,(q+1)q^(k)ubrace)_(q-1" times ")","a_(q^(k)-1)","ubrace((q+1)q^(k),dots,(q+1)q^(k)ubrace)_(q-1" times ").]:}\begin{aligned}
& \underbrace{(q+1) q^{k}, \ldots,(q+1) q^{k}}_{q-1 \text { times }}, a_{1}, \underbrace{(q+1) q^{k}, \ldots,(q+1) q^{k}}_{q-1 \text { times }}, a_{2}, \\
& \ldots, \underbrace{(q+1) q^{k}, \ldots,(q+1) q^{k}}_{q-1 \text { times }}, a_{q^{k}-1}, \underbrace{(q+1) q^{k}, \ldots,(q+1) q^{k}}_{q-1 \text { times }} .
\end{aligned}
3 The shape of the palindrome complexity functions
For an infinite sequence UU, the (subword) complexity function p_(U):NlongrightarrowNp_{U}: \mathbb{N} \longrightarrow \mathbb{N} (defined in [17] as the block growth, then named subword complexity in [5]) is given by p_(U)(n)=card(F(U)nnA^(n))p_{U}(n)=\operatorname{card}\left(F(U) \cap A^{n}\right) for n inNn \in \mathbb{N}, where F(U)F(U) is the set of all finite subwords (factors) of UU. Therefore the complexity function maps each nonnegative number nn to the number of subwords of length nn of UU; it verifies the iterative equation
s(j,n)s(j, n) being the cardinal of the set of the subwords in UU having the length nn and the right valence jj. A subword u in Uu \in U has the right valence jj if there are jj and only jj distinct letters x_(i)x_{i} such that ux_(i)in F(U),1 <= i <= ju x_{i} \in F(U), 1 \leq i \leq j.
For a finite word ww of length nn, the complexity function p_(w):NlongrightarrowNp_{w}: \mathbb{N} \longrightarrow \mathbb{N} given by p_(w)(k)=card(F(w)nnA^(k)),k inNp_{w}(k)=\operatorname{card}\left(F(w) \cap A^{k}\right), k \in \mathbb{N}, has the property that p_(w)(k)=0p_{w}(k)=0 for k > nk>n. The corresponding iterative equation is
where s_(0)(k)=s(0,k)in{0,1}s_{0}(k)=s(0, k) \in\{0,1\} stands for the cardinal of the set of subwords vv (suffixes of ww of length kk ) which cannot be continued as vx in F(w),x in Av x \in F(w), x \in A. We can write (2) in a condensed form
The above relations have their correspondents in terms of left extensions of the subwords.
For an infinite sequence UU, the complexity function p_(U)p_{U} is nondecreasing; more than that, if there exists m inNm \in \mathbb{N} such that p_(U)(m+1)=p_(U)(m)p_{U}(m+1)=p_{U}(m), then p_(U)p_{U} is constant for n >= mn \geq m.
The complexity function for a finite word ww of length nn has a different behaviour, because of p_(w)(n)=1p_{w}(n)=1 (there is a unique subword of length nn, namely w)w). It was proved ([12], [13], [15], [2]) that the shape of the complexity function is trapezoidal:
Theorem 1 Given a finite word ww of length nn, there are three intervals of monotonicity for p_(w):[0,J],[J,M]p_{w}:[0, J],[J, M] and [M,n][M, n]; the function increases at first, is constant and then decreases with the slope -1 .
The palindrome complexity function of a finite or infinite word ww is given by pal_(w):NlongrightarrowN,pal_(w)(k)=card(PAL(w)nnA^(k)),k inN\operatorname{pal}_{w}: \mathbb{N} \longrightarrow \mathbb{N}, \operatorname{pal}_{w}(k)=\operatorname{card}\left(\operatorname{PAL}(w) \cap A^{k}\right), k \in \mathbb{N}. Obviously,
{:(5)pal_(w)(k) <= min{q^(|~k//2~|),n-k+1}","k in{0","dots","n}:}\begin{equation*}
\operatorname{pal}_{w}(k) \leq \min \left\{q^{\lceil k / 2\rceil}, n-k+1\right\}, k \in\{0, \ldots, n\} \tag{5}
\end{equation*}
The palindrome u inPAL(w)u \in \mathrm{PAL}(w) has the palindrome valence jj if there are jj and only jj distinct letters x_(i)x_{i} such that x_(i)ux_(i)inPAL(w),1 <= i <= jx_{i} u x_{i} \in \mathrm{PAL}(w), 1 \leq i \leq j. We denote by
{:(6)s_(p)(j","k)=card{u in(PAL(w)nnA^(k)):u" has the palindrome valence "j}",":}\begin{equation*}
s_{p}(j, k)=\operatorname{card}\left\{u \in\left(\operatorname{PAL}(w) \cap A^{k}\right): u \text { has the palindrome valence } j\right\}, \tag{6}
\end{equation*}
and by s_(p)(0,k)s_{p}(0, k) the cardinal of the set of subwords v inPAL(w)nnA^(k)v \in \mathrm{PAL}(w) \cap A^{k} (not necessarily suffixes or prefixes of ww ) which cannot be continued as xvx in PAL(w)x v x \in \operatorname{PAL}(w), x in Ax \in A.
The palindrome complexity function of finite or infinite words satisfies the iterative equation
Due to the fact that the number of even palindromes is not directly related to that of odd ones, we do not expect that pal_(w)\mathrm{pal}_{w} is of trapezoidal shape, as it was the case for the subword complexity function p_(w)p_{w}.
Figure 1: Odd and even palindrome complexity function
For this reason we define the odd, respectively even palindrome complexity function as the restrictions of pal_(w)\mathrm{pal}_{w} to odd, respectively even integers: pal_(w)^(o)\mathrm{pal}_{w}^{o} : 2N+1rarrN,pal_(w)^(o)(k)=pal_(w)(k);pal_(w)^(e):2NrarrN,pal_(w)^(e)(k)=pal_(w)(k)2 \mathbb{N}+1 \rightarrow \mathbb{N}, \operatorname{pal}_{w}^{o}(k)=\operatorname{pal}_{w}(k) ; \operatorname{pal}_{w}^{e}: 2 \mathbb{N} \rightarrow \mathbb{N}, \operatorname{pal}_{w}^{e}(k)=\operatorname{pal}_{w}(k).
These functions have a trapezoidal form for short words; nevertheless, this is not true in general, as the following examples show.
Example 2 The word w_(1)=1010^(5)1^(2)0^(7)10w_{1}=1010^{5} 1^{2} 0^{7} 10 with |w_(1)|=19\left|w_{1}\right|=19 has pal_(w_(1))^(o)(1)=2\operatorname{pal}_{w_{1}}^{o}(1)=2, pal_(w_(1))^(O)(3)=3,pal_(w_(1))^(O)(5)=1,pal_(w_(1))^(O)(7)=2,pal_(w_(1))^(O)(9)=1.quad\operatorname{pal}_{w_{1}}^{O}(3)=3, \operatorname{pal}_{w_{1}}^{O}(5)=1, \operatorname{pal}_{w_{1}}^{O}(7)=2, \operatorname{pal}_{w_{1}}^{O}(9)=1 . \quad (see Fig. 1.)
Example 3 The word w_(2)=1^(4)0^(6)10^(8)1^(2)0w_{2}=1^{4} 0^{6} 10^{8} 1^{2} 0 with |w_(2)|=22\left|w_{2}\right|=22 has pal_(w_(2))^(e)(2)=2\mathrm{pal}_{w_{2}}^{e}(2)=2, pal_(w_(2))^(e)(4)=3,pal_(w_(2))^(e)(6)=1,pal_(w_(2))^(e)(8)=2,pal_(w_(2))^(e)(10)=1.quad\operatorname{pal}_{w_{2}}^{e}(4)=3, \operatorname{pal}_{w_{2}}^{e}(6)=1, \operatorname{pal}_{w_{2}}^{e}(8)=2, \operatorname{pal}_{w_{2}}^{e}(10)=1 . \quad (see Fig. 1.)
REMARK 2 The palindrome complexity for infinite words is not nondecreasing, as the usual complexity function is. Indeed, we can continue the word in Example 2 with 11001100 dots11001100 \ldots, and its odd palindrome complexity function will be as that for w_(1)w_{1}, and then equal to 0 for k >= 11k \geq 11. Similarly, we can continue w_(2)w_{2} in Example 3 with 1010 dots1010 \ldots to obtain an infinite word with the even palindrome complexity of w_(2)w_{2} till k=10k=10 and equal to 0 for k >= 12k \geq 12.
4 Average number of palindromes
We consider an alphabet AA with q >= 2q \geq 2 letters.
Definition 1 We define the total palindrome complexity PP by
where ww is a word of length |w||w|, and pal_(w)(n)\operatorname{pal}_{w}(n) denotes the number of distinct palindromes of length nn which are nonempty subwords of ww.
Because the set of the nonempty palindromes in ww is denoted by PAL ( ww ), we can write also P(w)=card(PAL(w))P(w)=\operatorname{card}(\operatorname{PAL}(w)).
Definition 2 The average number of palindromes M_(q)(n)M_{q}(n) contained in all words of length nn is defined by
We can give the following upper estimate for M_(q)(n)M_{q}(n).
Theorem 2 For n inNn \in \mathbb{N}, the average number of palindromes contained in the words of length nn satisfies the inequalities
{:[M_(q)(n) <= (q^(-(n-1)//2)(q+3)+2n(q-1)+q^(3)-2q^(2)-2q-1)/((q-1)^(2))","" for "n" odd "","],[(10)M_(q)(n) <= (q^(-n//2)(3q+1)+2n(q-1)+q^(3)-2q^(2)-2q-1)/((q-1)^(2))","" for "n" even ".]:}\begin{align*}
& M_{q}(n) \leq \frac{q^{-(n-1) / 2}(q+3)+2 n(q-1)+q^{3}-2 q^{2}-2 q-1}{(q-1)^{2}}, \text { for } n \text { odd }, \\
& M_{q}(n) \leq \frac{q^{-n / 2}(3 q+1)+2 n(q-1)+q^{3}-2 q^{2}-2 q-1}{(q-1)^{2}}, \text { for } n \text { even } . \tag{10}
\end{align*}
For a fixed palindrome pi\pi, with |pi|=k|\pi|=k, the number of the words of length nn in which it appears as a subword at position i(1 <= i <= n-k+1)i(1 \leq i \leq n-k+1) is q^(n-k)q^{n-k}. But the position ii is arbitrary, so that there are at most (n-k+1)q^(n-k)(n-k+1) q^{n-k} words in which pi\pi is a subword, these words being not necessarily distinct. It follows that
and M_(q)(n) <= q+sum_(k=2)^(n)(n-k+1)q^(-k+|~k//2~|)M_{q}(n) \leq q+\sum_{k=2}^{n}(n-k+1) q^{-k+\lceil k / 2\rceil}.
We split the sum according to k=2j,j=1,dots,|__ n//2__|k=2 j, j=1, \ldots,\lfloor n / 2\rfloor, respectively k=2j+1k=2 j+1, j=1,dots,|__(n-1)//2__|j=1, \ldots,\lfloor(n-1) / 2\rfloor, and obtain
Making use of sum_(j=1)^(s)q^(-j)=(1-q^(-s))//(q-1)\sum_{j=1}^{s} q^{-j}=\left(1-q^{-s}\right) /(q-1) and sum_(j=1)^(s)jq^(-j)=(q-q^(1-s)(s+1)+:}{:sq^(-s))//(q-1)^(2)\sum_{j=1}^{s} j q^{-j}=\left(q-q^{1-s}(s+1)+\right. \left.s q^{-s}\right) /(q-1)^{2}, it follows that M_(q)(n)M_{q}(n) satisfies the inequalities in (10).
Corollary 1 The following inequality holds
{:(12)l i m s u p_(n rarr oo)(M_(q)(n))/(n) <= (2)/(q-1).:}\begin{equation*}
\limsup _{n \rightarrow \infty} \frac{M_{q}(n)}{n} \leq \frac{2}{q-1} . \tag{12}
\end{equation*}
Proof.
{:[l i m s u p_(n rarr oo)(M_(q)(n))/(n)=max{l i m s u p_(n rarr oo)(M_(q)(2n+1))/(2n+1),l i m s u p_(n rarr oo)(M_(q)(2n))/(2n)}],[ <= max{lim_(n rarr oo)((q^(-n)(q+3)+2(2n+1)(q-1)+q^(3)-2q^(2)-2q-1)/((q-1)^(2)))(1)/(2n+1):}],[{:lim_(n rarr oo)((q^(-n)(3q+1)+4n(q-1)+q^(3)-2q^(2)-2q-1)/((q-1)^(2)))(1)/(2n)}=(2)/(q-1)]:}\begin{aligned}
\limsup _{n \rightarrow \infty} & \frac{M_{q}(n)}{n}=\max \left\{\limsup _{n \rightarrow \infty} \frac{M_{q}(2 n+1)}{2 n+1}, \limsup _{n \rightarrow \infty} \frac{M_{q}(2 n)}{2 n}\right\} \\
\leq & \max \left\{\lim _{n \rightarrow \infty}\left(\frac{q^{-n}(q+3)+2(2 n+1)(q-1)+q^{3}-2 q^{2}-2 q-1}{(q-1)^{2}}\right) \frac{1}{2 n+1}\right. \\
& \left.\lim _{n \rightarrow \infty}\left(\frac{q^{-n}(3 q+1)+4 n(q-1)+q^{3}-2 q^{2}-2 q-1}{(q-1)^{2}}\right) \frac{1}{2 n}\right\}=\frac{2}{q-1}
\end{aligned}
We are interested in finding how large is the average number of palindromes contained in the words of length nn compared to the length nn. The numerical estimations done for small values of nn show that M_(q)(n)M_{q}(n) is comparable to nn, but Corollary 1 allows us to show that for q >= 4q \geq 4 this does not hold.
Corollary 2 For an alphabet with q >= 4q \geq 4 letters,
{:(13)l i m s u p_(n rarr oo)(M_(q)(n))/(n) < 1.:}\begin{equation*}
\limsup _{n \rightarrow \infty} \frac{M_{q}(n)}{n}<1 . \tag{13}
\end{equation*}
In the proof of Theorem 2 we have used the rough inequality (11), which was sufficient to prove the result. In fact, it is not difficult to calculate exactly
This result has intrinsic importance.
Theorem 3 The number of occurrences of the palindromes of length 1, respectively 2 , in all words of length nn (counted once if a palindrome appears in a word, and once again if it appears in another one) is given by
[alpha]={[1","" if "alpha" is true "],[0","" if "alpha" is false "]:}[\alpha]=\left\{\begin{array}{l}
1, \text { if } \alpha \text { is true } \\
0, \text { if } \alpha \text { is false }
\end{array}\right.
and obtain
S_(n,1)=sum_(w inA^(n))sum_(a in A)[a" in "w]=qsum_(w inA^(n))[a_(1)" in "w]S_{n, 1}=\sum_{w \in A^{n}} \sum_{a \in A}[a \text { in } w]=q \sum_{w \in A^{n}}\left[a_{1} \text { in } w\right]
where a_(1)a_{1} is a fixed letter of the alphabet AA. Then
S_(n,1)=qsum_(w inA^(n))[a_(1)" in "w]=q(q^(n)-sum_(w inA^(n))[a_(1)not" in "w])=q^(n+1)-q(q-1)^(n)S_{n, 1}=q \sum_{w \in A^{n}}\left[a_{1} \text { in } w\right]=q\left(q^{n}-\sum_{w \in A^{n}}\left[a_{1} \operatorname{not} \text { in } w\right]\right)=q^{n+1}-q(q-1)^{n}
We proceed similarly to calculate S_(n,2)=sum_(w inA^(n))sum_(pi in PAL(w)nnA^(2))1S_{n, 2}=\sum_{w \in A^{n}} \sum_{\pi \in \operatorname{PAL}(w) \cap A^{2}} 1 and obtain
S_(n,2)=sum_(w inA^(n))sum_(a in A)[aa" in "w]=qsum_(w inA^(n))[a_(1)a_(1)" in "w]S_{n, 2}=\sum_{w \in A^{n}} \sum_{a \in A}[a a \text { in } w]=q \sum_{w \in A^{n}}\left[a_{1} a_{1} \text { in } w\right]
where a_(1)a_{1} is again a fixed letter of the alphabet AA. We denote varphi(n):=sum_(w inA^(n))[a_(1)a_(1):}\varphi(n):=\sum_{w \in A^{n}}\left[a_{1} a_{1}\right. in w]w], for which varphi(2)=1\varphi(2)=1 and varphi(3)=2q-1\varphi(3)=2 q-1. It is easier to establish a recurrence formula for psi(n)=q^(n)-varphi(n)=sum_(w inA^(n))[a_(1)a_(1):}\psi(n)=q^{n}-\varphi(n)=\sum_{w \in A^{n}}\left[a_{1} a_{1}\right. not in {:w]\left.w\right]. The number psi(n)\psi(n) is obtained from:
the number (q-1)psi(n-1)(q-1) \psi(n-1) of words which do not end in a_(1)a_{1} and have not a_(1)a_(1)a_{1} a_{1} in their first n-1n-1 positions;
the number (q-1)psi(n-2)(q-1) \psi(n-2) of words which end in a_(1)a_{1}, have the n-1n-1 position occupied by one of the other q-1q-1 letters and have not a_(1)a_(1)a_{1} a_{1} in the first n-2n-2 positions.
It follows that psi\psi satisfies the recurrence formula
The expression of S_(n,2)S_{n, 2} from (16) allows us to improve Corollary 1.
Corollary 3 The following inequality holds
{:(19)l i m s u p_(n rarr oo)(M_(q)(n))/(n) <= (q+1)/(q(q-1)).:}\begin{equation*}
\limsup _{n \rightarrow \infty} \frac{M_{q}(n)}{n} \leq \frac{q+1}{q(q-1)} . \tag{19}
\end{equation*}
Corollary 4 The inequality (13) holds for q=3q=3 too.
It seems that (13) holds also for q=2q=2. Using a computer program we obtained some values for the terms of the sequence M^(**)(n)=M_(2)(n)//n,n >= 2M^{*}(n)=M_{2}(n) / n, n \geq 2. The first values are: M^(**)(n)=1,n=2,dots,7;M^(**)(8)=0.99750;M^(**)(9)=M^{*}(n)=1, n=2, \ldots, 7 ; M^{*}(8)=0.99750 ; M^{*}(9)= 0.98550 , which were close to 1 . We tried for greater values of nn and get
The last value was obtained in a very long time, so for greater values of nn we generated some random words w_(1),w_(2),dots,w_(ℓ)w_{1}, w_{2}, \ldots, w_{\ell} of length 100 , respectively 200 , 300, 400 and 500 over A={0,1}A=\{0,1\} and get some roughly approximate values M^(**)(n)≃(pal_(w_(1))(n)+cdots+pal_(w_(ℓ))(n))//ℓM^{*}(n) \simeq\left(\operatorname{pal}_{w_{1}}(n)+\cdots+\operatorname{pal}_{w_{\ell}}(n)\right) / \ell. For ℓ=200\ell=200 we obtained
This method allows us to obtain the previous exactly computed values M^(**)(20)M^{*}(20), dots,M^(**)(30)\ldots, M^{*}(30) with two exact digits. These numerical results allow us to formulate the following
Conjecture The sequence M_(q)(n)//nM_{q}(n) / n is strictly decreasing for n >= 7n \geq 7.
Acknowledgements. The first and the third author acknowledge the support of the Romanian Academy (Grant 13 GAR/2006) and the kind hospitality of Rényi Institute in the frame of the cooperation program between the Romanian Academy and the Hungarian Academy of Sciences.
References
[1] J.-P. Allouche, M. Baake, J. Cassaigne and D. Damanik, Palindrome complexity, Theoret. Comput. Sci., 292 (2003), 9-31.
[2] M.-C. Anisiu and J. Cassaigne, Properties of the complexity function for finite words, Rev. Anal. Num. Théor. Approx., 33 (2004), 123-139.
[3] J.-P. Borel and C. Reutenauer, Palindromic factors of billiard words, Theoret. Comput. Sci., 340 (2005), 334-348.
[4] N.G. De Bruijn, A combinatorial problem, Nederl. Akad. Wetensch. Proc., 49 (1946), 758-764 = Indag. Math., 8 (1946), 461-467.
[5] A. Ehrenfeucht, K.P. Lee and G. Rozenberg, Subword complexities of various classes of deterministic developmental languages without interactions, Theoret. Comput. Sci., 1 (1975), 59-75.
[6] C. Flye Sainte-Marie, Solution to question nr. 48, l'Intermédiaire des Mathématiciens, 1 (1894), 107-110.
[7] H. Fredricksen, A survey of full length nonlinear shift register cycle algorithms, SIAM Review, 24 (1982), 195-221.
[8] R.A. Games, A generalized recursive construction for De Bruijn sequences, IEEE Trans. Inform. Theory, 29 (1983), 843-850.
[9] M. Giel-Pietraszuk, M. Hoffmann, S. Dolecka, J. Rychlewski and J. Barciszewski, Palindromes in proteins, J. Protein Chem., 22 (2003), 109-113.
[10] I.J. Good, Normal recurring decimals, J. London Math. Soc., 21 (1946), 167-169.
[11] R.L. Graham, D.E. Knuth and O. Patashnik, Concrete Mathematics: A Foundation for Computer Science, 2nd edition (Reading, Massachusetts: Addison-Wesley), 1994.
[12] M. Heinz, Zur Teilwortkomplexität für Wörter und Folgen über einem endlichen Alphabet, EIK, 13 (1977), 27-38.
[13] A. de Luca, On the combinatorics of finite words, Theoret. Comput. Sci., 218 (1999), 13-39.
[14] A. de Luca and Al. de Luca, Combinatorial properties of Sturmian palindromes, Int. J. Found. Comput. Sci., 17 (2006), 557-573.
[15] F. Levé and P. Séébold, Proof of a conjecture on word complexity, Bull. Belg. Math. Soc. Simon Stevin, 8 (2001), 277-291.
[16] M.H. Martin, A problem in arrangements, Bull. American Math. Soc., 40 (1934), 859-864.
[17] M. Morse and G.A. Hedlund, Symbolic dynamics, Amer. J. Math., 60 (1938), 815-866.
[18] A. Ralston, A new memoryless algorithm for De Bruijn sequences, J. Algorithms, 2 (1981), 50-62.