Mira-Cristiana Anisiu Tiberiu Popoviciu Institue of Numerical Analysis, Romanian Academy, Romania
Keywords
?
Paper coordinates
M.-C. Anisiu, Global complexities for infinite sequences, in Analysis, Functional Equations, Approximation and Convexity, Proceeding of the Conference held in Honor of Professor Elena Popoviciu, Eds. L. Lupşa and M. Ivan, Carpatica, Cluj-Napoca, 1999, 7-11 (pdf filehere)
[1] J.-P. Allouche, Sur la complexite des suites infnies, Bull. Belg. Math. Soc. 1(1994), 133-143
[2] P. Arnoux, Ch. Mauduit, I. Shiokawa, Jun-ichi Tamura, Complexity of sequences defined by billiards in the cube, Bull. Soc. Math. France 122(1)(1994), 1-12
[3] S. Ferenczi, Complexity of sequences and dynamical systems, to appear in Discrete Math. 206(1999)
[4] S. Ferenczi, Z. K·sa, Complexities for finite factors of infinite sequences, to appear in Theor. Comput. Sci. 218(1)(1999), 177-195
[5] A. Ivanyi, On the d-complexity of words, Ann. Univ. Sci. Budapest Sect. Comput. 8(1987), 69-90
[6] J. Shallit, On the maximum number of distinct factors in a binary string, Graphs and Combinatorics 9(1993), 197-200
1999-Anisiu-GlobComp
Global complexities for infinite sequences
Mira-Cristiana Anisiu*"T. Popoviciu" Institute of Numerical AnalysisP. O. Box 68, 3400 Cluj-Napoca
1 Introduction
The language complexity of a finite word or infinite sequence is aimed to give a measure of the number of different factors in the given word or sequence. In fact, the definitions in the finite or infinite case coincide.
Let AA be a finite nonvoid alphabet and U=u_(0)u_(1)dotsU=u_{0} u_{1} \ldots an infinite sequence with u_(i)in A,i inNu_{i} \in A, i \in \mathbb{N}. The complexity of the sequence will be a function p_(U):N^(**)rarrNp_{U}: \mathbb{N}^{*} \rightarrow \mathbb{N} given by
{:(1)p_(U)(n)=♯L_(n)(U)","quad n inN^(**):}\begin{equation*}
p_{U}(n)=\sharp L_{n}(U), \quad n \in \mathbb{N}^{*} \tag{1}
\end{equation*}
♯\sharp denoting the cardinal of the set L_(n)(U)L_{n}(U) of the factors of length nn in UU.
For the finite word ww of length q inN^(**)q \in \mathbb{N}^{*}, the complexity p_(w)p_{w} may be conceptually defined in the same way. But, because ♯L_(n)(w)=0\sharp L_{n}(w)=0 for n > qn>q, we can consider only the restriction on the set {1,dots,q}\{1, \ldots, q\}, so p_(w):{1,dots,q}rarrNp_{w}:\{1, \ldots, q\} \rightarrow \mathbb{N},
{:(2)p_(w)(n)=♯L_(n)(w)","quad n in{1","dots q}.:}\begin{equation*}
p_{w}(n)=\sharp L_{n}(w), \quad n \in\{1, \ldots q\} . \tag{2}
\end{equation*}
The complexity functions p_(U)p_{U}, respectively p_(w)p_{w}, estimate the richness in factors of length nn of a sequence or of a word, for every n inN^(**)n \in \mathbb{N}^{*} or n in{0,dots,q}n \in\{0, \ldots, q\}. It would be of great help to have a global indicator of the complexity.
In the case of finite words there are two known definitions, the first one of total complexity given independently by Iványi [5] and Shallit [6], the second of maximal complexity, suggested by Rauzy.
Definition 1 For the word ww of length q inN^(**)q \in \mathbb{N}^{*}, the total complexity is given by
The aim of this paper is to define similar global complexities (not depending on the specific length nn of the factors) for infinite sequences.
2 The significance of total and maximal complexity for words
Given a word of length qq, the complexity function p_(w)p_{w} may be considered as a vector in the space R^(q)\mathbb{R}^{q}. On this space we have the well-known Minkowski norm ||x||_(1)=sum_(j=1)^(q)|x_(j)|\|x\|_{1}=\sum_{j=1}^{q}\left|x_{j}\right|, Chebyshev norm ||x||_(oo)=max_(j=1)^(q)|x_(j)|\|x\|_{\infty}=\max _{j=1}^{q}\left|x_{j}\right|, or Euclid norm ||x||=(sum_(j=1)^(q)x_(j)^(2))^(1//2)\|x\|=\left(\sum_{j=1}^{q} x_{j}^{2}\right)^{1 / 2}, which are of course equivalent. It is obvious that if we denote
We can also consider a Euclidian complexity given by ||p_(w)||\left\|p_{w}\right\|, which has the disadvantage that its values are not in general integer numbers.
The total and maximal complexities were used by Ferenczi and Kasa [4] to define upper and lower complexities for infinite sequences, in the following way:
the upper and lower total finite-word complexity function by
These two notions, having many interesting properties, are extensively studied in [4]. Being functions of n inN^(**)n \in \mathbb{N}^{*}, the upper and lower total (maximal) finite-word complexity functions are similar to the initial complexity function p_(U)p_{U}. What we intend is to define global complexities for the case of infinite sequences too.
3 Global complexities for infinite sequences
There already exist some notions to estimate the global complexity for infinite sequences, as for example the topological entropy [1]
for every sequence UU.
The definitions we give extend those for finite words to infinite sequences; to this aim it is necessary to "normalize" the values of the complexity function p_(U)(n)p_{U}(n) by dividing them with (♯A)^(n)(\sharp A)^{n}.
Definition 2 For the infinite sequence UU, the total complexity is given by
{:(9)C_(U)=s u p_(n=1)^(oo)(p_(U)(n))/((♯A)^(n)).:}\begin{equation*}
C_{U}=\sup _{n=1}^{\infty} \frac{p_{U}(n)}{(\sharp A)^{n}} . \tag{9}
\end{equation*}
Remark 1 The total complexity in definition 2 is not necessarily finite; for example, for the Champernowne word containing successively all the binary written numbers 011011100101111 dots011011100101111 \ldots we have p_(U)(n)=2^(n)p_{U}(n)=2^{n} and K_(U)=ooK_{U}=\infty, while C_(U)=1C_{U}=1.
For this reason, instead of definition 2 we propose one for normal complexity, which will have values less than 1 .
Definition 3 For the infinite sequence UU, the normal complexity is given by
Remark 2 In the definitions above, there appears the sequence
c_(n)=(p_(U)(n))/((#A)^(n)),quad n inN^(**)c_{n}=\frac{p_{U}(n)}{(\# A)^{n}}, \quad n \in \mathbb{N}^{*}
Because of the inequality
p_(U)(n+1) <= (#A)p_(U)(n),quad n inN^(**)p_{U}(n+1) \leq(\# A) p_{U}(n), \quad n \in \mathbb{N}^{*}
holding for each complexity function, the sequence c_(n)c_{n} is non-increasing, so C_(U)=c_(1)=1C_{U}=c_{1}=1. For the Champernowne sequence we have c_(n)=1,n inN^(**)c_{n}=1, n \in \mathbb{N}^{*}.
4 Global complexities for sequences with known complexity functions
In the following we consider sequences for which the complexity function is known and try to determine (at least approximately) their global complexities. The alphabet has three symbols in example 4.1, a+ba+b in 4.2 and two symbols for Sturmian sequences and those in 4.3 and 4.4. The topological entropy is h_(U)=0h_{U}=0, excepting example 4.4.
4.1 Sequences defined by billiards in the cube
A sequence generated by the structure of billiard trajectories in the cube, associating to a trajectory starting with totally irrational direction the sequence with values in {1,2,3}\{1,2,3\} given by coding 1 (respectively 2,3 ) any time
the particle rebounds on a frontal (lateral, horizontal) side of the cube, was shown in [2] to have the complexity p_(U)(n)=n^(2)+n+1p_{U}(n)=n^{2}+n+1.
Proposition 1 For a sequence defined by the cubic billiard with totally irrational direction, the total, maximal and normal complexity will be
K_(U)=2.75,C_(U)=1" and "K_(U)~~0.3994006256.K_{U}=2.75, C_{U}=1 \text { and } \mathcal{K}_{U} \approx 0.3994006256 .
4.2 Sequences with p_(U)(n)=an+bp_{U}(n)=a n+b
For an important class of sequences the complexity function is linear.
Proposition 2 Let UU be a sequence having the complexity function given by p_(U)(n)=an+b,n inN^(**)(a inN^(**),b inN,a+b >= 2)p_{U}(n)=a n+b, n \in \mathbb{N}^{*}\left(a \in \mathbb{N}^{*}, b \in \mathbb{N}, a+b \geq 2\right). Its total, maximal and normal complexity will be
where _(alpha)F_(beta){ }_{\alpha} F_{\beta} denotes the hypergeometric function.
Remark 3 Well-known sequences of this type are Sturmian sequences (which are not ultimately periodic, but are recurrent), for which a=b=1a=b=1; they have the global complexities given by
K_(U)=3,C_(U)=1" and "K_(U)=7//2-4ln 2~~0.727411278.K_{U}=3, C_{U}=1 \text { and } \mathcal{K}_{U}=7 / 2-4 \ln 2 \approx 0.727411278 .
4.3 The power sequence
Let us consider the words v_(i)=0^(i),w_(i)=1^(i)v_{i}=0^{i}, w_{i}=1^{i} and u_(i)=v_(i)w_(i),i inN^(**),u_(i)u_{i}=v_{i} w_{i}, i \in \mathbb{N}^{*}, u_{i} being obtained by concatenation of v_(i)v_{i} and w_(i)w_{i}. The power sequence UU is given by
This sequence was mentioned in Section 3 and it is obtained by writing successively all the binary written numbers. Its complexity function is p_(U)(n)=2^(n)p_{U}(n)=2^{n} and the topological entropy is h_(U)=1h_{U}=1.
Proposition 4 For the Champernowne sequence, the global complexities are given by
Acknowledgements. The author thanks the Ministry of Science and Technology for supporting partially this work (3004GR/1997).
References
[1] J.-P. Allouche, Sur la complexité des suites infinies, Bull. Belg. Math. Soc. 1(1994), 133-143
[2] P. Arnoux, Ch. Mauduit, I. Shiokawa, Jun-ichi Tamura, Complexity of sequences defined by billiards in the cube, Bull. Soc. Math. France 122(1)(1994), 1-12
[3] S. Ferenczi, Complexity of sequences and dynamical systems, to appear in Discrete Math. 206(1999)
[4] S. Ferenczi, Z. Kása, Complexities for finite factors of infinite sequences, to appear in Theor. Comput. Sci. 218(1)(1999), 177-195
[5] A. Iványi, On the dd-complexity of words, Ann. Univ. Sci. Budapest Sect. Comput. 8(1987), 69-90
[6] J. Shallit, On the maximum number of distinct factors in a binary string, Graphs and Combinatorics 9 (1993), 197-200