Global complexities for infinite sequences

Abstract

Authors

Mira-Cristiana Anisiu
Tiberiu Popoviciu Institue of Numerical Analysis, Romanian Academy, Romania

Keywords

?

Paper coordinates

M.-C. Anisiu, Global complexities for infinite sequences, in Analysis, Functional Equations, Approximation and Convexity, Proceeding of the Conference held in Honor of Professor Elena Popoviciu, Eds. L. Lupşa and M. Ivan, Carpatica, Cluj-Napoca, 1999, 7-11 (pdf file here)

PDF

About this paper

Journal
Publisher Name
DOI
Print ISSN
Online ISSN

google scholar link

[1] J.-P. Allouche, Sur la complexite des suites infnies, Bull. Belg. Math. Soc. 1(1994), 133-143
[2] P. Arnoux, Ch. Mauduit, I. Shiokawa, Jun-ichi Tamura, Complexity of sequences defined by billiards in the cube, Bull. Soc. Math. France 122(1)(1994), 1-12
[3] S. Ferenczi, Complexity of sequences and dynamical systems, to appear in Discrete Math. 206(1999)
[4] S. Ferenczi, Z. K·sa, Complexities for finite factors of infinite sequences, to appear in Theor. Comput. Sci. 218(1)(1999), 177-195
[5] A. Ivanyi, On the d-complexity of words, Ann. Univ. Sci. Budapest Sect. Comput. 8(1987), 69-90
[6] J. Shallit, On the maximum number of distinct factors in a binary string, Graphs and Combinatorics 9(1993), 197-200

1999-Anisiu-GlobComp

Global complexities for infinite sequences

Mira-Cristiana Anisiu*"T. Popoviciu" Institute of Numerical AnalysisP. O. Box 68, 3400 Cluj-Napoca

1 Introduction

The language complexity of a finite word or infinite sequence is aimed to give a measure of the number of different factors in the given word or sequence. In fact, the definitions in the finite or infinite case coincide.
Let A A AAA be a finite nonvoid alphabet and U = u 0 u 1 U = u 0 u 1 U=u_(0)u_(1)dotsU=u_{0} u_{1} \ldotsU=u0u1 an infinite sequence with u i A , i N u i A , i N u_(i)in A,i inNu_{i} \in A, i \in \mathbb{N}uiA,iN. The complexity of the sequence will be a function p U : N N p U : N N p_(U):N^(**)rarrNp_{U}: \mathbb{N}^{*} \rightarrow \mathbb{N}pU:NN given by
(1) p U ( n ) = L n ( U ) , n N (1) p U ( n ) = L n ( U ) , n N {:(1)p_(U)(n)=♯L_(n)(U)","quad n inN^(**):}\begin{equation*} p_{U}(n)=\sharp L_{n}(U), \quad n \in \mathbb{N}^{*} \tag{1} \end{equation*}(1)pU(n)=Ln(U),nN
\sharp denoting the cardinal of the set L n ( U ) L n ( U ) L_(n)(U)L_{n}(U)Ln(U) of the factors of length n n nnn in U U UUU.
For the finite word w w www of length q N q N q inN^(**)q \in \mathbb{N}^{*}qN, the complexity p w p w p_(w)p_{w}pw may be conceptually defined in the same way. But, because L n ( w ) = 0 L n ( w ) = 0 ♯L_(n)(w)=0\sharp L_{n}(w)=0Ln(w)=0 for n > q n > q n > qn>qn>q, we can consider only the restriction on the set { 1 , , q } { 1 , , q } {1,dots,q}\{1, \ldots, q\}{1,,q}, so p w : { 1 , , q } N p w : { 1 , , q } N p_(w):{1,dots,q}rarrNp_{w}:\{1, \ldots, q\} \rightarrow \mathbb{N}pw:{1,,q}N,
(2) p w ( n ) = L n ( w ) , n { 1 , q } . (2) p w ( n ) = L n ( w ) , n { 1 , q } . {:(2)p_(w)(n)=♯L_(n)(w)","quad n in{1","dots q}.:}\begin{equation*} p_{w}(n)=\sharp L_{n}(w), \quad n \in\{1, \ldots q\} . \tag{2} \end{equation*}(2)pw(n)=Ln(w),n{1,q}.
The complexity functions p U p U p_(U)p_{U}pU, respectively p w p w p_(w)p_{w}pw, estimate the richness in factors of length n n nnn of a sequence or of a word, for every n N n N n inN^(**)n \in \mathbb{N}^{*}nN or n { 0 , , q } n { 0 , , q } n in{0,dots,q}n \in\{0, \ldots, q\}n{0,,q}. It would be of great help to have a global indicator of the complexity.
In the case of finite words there are two known definitions, the first one of total complexity given independently by Iványi [5] and Shallit [6], the second of maximal complexity, suggested by Rauzy.
Definition 1 For the word w w www of length q N q N q inN^(**)q \in \mathbb{N}^{*}qN, the total complexity is given by
(3) K w = j = 1 q p w ( j ) , (3) K w = j = 1 q p w ( j ) , {:(3)K_(w)=sum_(j=1)^(q)p_(w)(j)",":}\begin{equation*} K_{w}=\sum_{j=1}^{q} p_{w}(j), \tag{3} \end{equation*}(3)Kw=j=1qpw(j),
and the maximal complexity is
(4) C w = max j = 1 q p w ( j ) (4) C w = max j = 1 q p w ( j ) {:(4)C_(w)=max_(j=1)^(q)p_(w)(j):}\begin{equation*} C_{w}=\max _{j=1}^{q} p_{w}(j) \tag{4} \end{equation*}(4)Cw=maxj=1qpw(j)
The aim of this paper is to define similar global complexities (not depending on the specific length n n nnn of the factors) for infinite sequences.

2 The significance of total and maximal complexity for words

Given a word of length q q qqq, the complexity function p w p w p_(w)p_{w}pw may be considered as a vector in the space R q R q R^(q)\mathbb{R}^{q}Rq. On this space we have the well-known Minkowski norm x 1 = j = 1 q | x j | x 1 = j = 1 q x j ||x||_(1)=sum_(j=1)^(q)|x_(j)|\|x\|_{1}=\sum_{j=1}^{q}\left|x_{j}\right|x1=j=1q|xj|, Chebyshev norm x = max j = 1 q | x j | x = max j = 1 q x j ||x||_(oo)=max_(j=1)^(q)|x_(j)|\|x\|_{\infty}=\max _{j=1}^{q}\left|x_{j}\right|x=maxj=1q|xj|, or Euclid norm x = ( j = 1 q x j 2 ) 1 / 2 x = j = 1 q x j 2 1 / 2 ||x||=(sum_(j=1)^(q)x_(j)^(2))^(1//2)\|x\|=\left(\sum_{j=1}^{q} x_{j}^{2}\right)^{1 / 2}x=(j=1qxj2)1/2, which are of course equivalent. It is obvious that if we denote
p w = ( p w ( 1 ) , , p w ( q ) ) p w = p w ( 1 ) , , p w ( q ) p_(w)=(p_(w)(1),dots,p_(w)(q))p_{w}=\left(p_{w}(1), \ldots, p_{w}(q)\right)pw=(pw(1),,pw(q))
we shall have for the total complexity
K w = p w 1 K w = p w 1 K_(w)=||p_(w)||_(1)K_{w}=\left\|p_{w}\right\|_{1}Kw=pw1
and for the maximal complexity
C w = p w C w = p w C_(w)=||p_(w)||_(oo)C_{w}=\left\|p_{w}\right\|_{\infty}Cw=pw
We can also consider a Euclidian complexity given by p w p w ||p_(w)||\left\|p_{w}\right\|pw, which has the disadvantage that its values are not in general integer numbers.
The total and maximal complexities were used by Ferenczi and Kasa [4] to define upper and lower complexities for infinite sequences, in the following way:
  • the upper and lower total finite-word complexity function by
K U + ( n ) = max i K ( u i u i + 1 u i + n 1 ) , (5) K U ( n ) = min i K ( u i u i + 1 u i + n 1 ) ; K U + ( n ) = max i K u i u i + 1 u i + n 1 , (5) K U ( n ) = min i K u i u i + 1 u i + n 1 ; {:[K_(U)^(+)(n)=max_(i)K(u_(i)u_(i+1)dotsu_(i+n-1))","],[(5)K_(U)^(-)(n)=min_(i)K(u_(i)u_(i+1)dotsu_(i+n-1));]:}\begin{align*} & K_{U}^{+}(n)=\max _{i} K\left(u_{i} u_{i+1} \ldots u_{i+n-1}\right), \\ & K_{U}^{-}(n)=\min _{i} K\left(u_{i} u_{i+1} \ldots u_{i+n-1}\right) ; \tag{5} \end{align*}KU+(n)=maxiK(uiui+1ui+n1),(5)KU(n)=miniK(uiui+1ui+n1);
  • the upper and lower maximal finite-word complexity function by
C U + ( n ) = max i C ( u i u i + 1 u i + n 1 ) , (6) C U ( n ) = min i C ( u i u i + 1 u i + n 1 ) . C U + ( n ) = max i C u i u i + 1 u i + n 1 , (6) C U ( n ) = min i C u i u i + 1 u i + n 1 . {:[C_(U)^(+)(n)=max_(i)C(u_(i)u_(i+1)dotsu_(i+n-1))","],[(6)C_(U)^(-)(n)=min_(i)C(u_(i)u_(i+1)dotsu_(i+n-1)).]:}\begin{align*} & C_{U}^{+}(n)=\max _{i} C\left(u_{i} u_{i+1} \ldots u_{i+n-1}\right), \\ & C_{U}^{-}(n)=\min _{i} C\left(u_{i} u_{i+1} \ldots u_{i+n-1}\right) . \tag{6} \end{align*}CU+(n)=maxiC(uiui+1ui+n1),(6)CU(n)=miniC(uiui+1ui+n1).
These two notions, having many interesting properties, are extensively studied in [4]. Being functions of n N n N n inN^(**)n \in \mathbb{N}^{*}nN, the upper and lower total (maximal) finite-word complexity functions are similar to the initial complexity function p U p U p_(U)p_{U}pU. What we intend is to define global complexities for the case of infinite sequences too.

3 Global complexities for infinite sequences

There already exist some notions to estimate the global complexity for infinite sequences, as for example the topological entropy [1]
(7) h U = lim n log p U ( n ) n log A , (7) h U = lim n log p U ( n ) n log A , {:(7)h_(U)=lim_(n rarr oo)(log p_(U)(n))/(n log♯A)",":}\begin{equation*} h_{U}=\lim _{n \rightarrow \infty} \frac{\log p_{U}(n)}{n \log \sharp A}, \tag{7} \end{equation*}(7)hU=limnlogpU(n)nlogA,
which satisfies
0 h U 1 0 h U 1 0 <= h_(U) <= 10 \leq h_{U} \leq 10hU1
for every sequence U U UUU.
The definitions we give extend those for finite words to infinite sequences; to this aim it is necessary to "normalize" the values of the complexity function p U ( n ) p U ( n ) p_(U)(n)p_{U}(n)pU(n) by dividing them with ( A ) n ( A ) n (♯A)^(n)(\sharp A)^{n}(A)n.
Definition 2 For the infinite sequence U U UUU, the total complexity is given by
(8) K U = n = 1 p U ( n ) ( A ) n , (8) K U = n = 1 p U ( n ) ( A ) n , {:(8)K_(U)=sum_(n=1)^(oo)(p_(U)(n))/((♯A)^(n))",":}\begin{equation*} K_{U}=\sum_{n=1}^{\infty} \frac{p_{U}(n)}{(\sharp A)^{n}}, \tag{8} \end{equation*}(8)KU=n=1pU(n)(A)n,
and the maximal complexity by
(9) C U = sup n = 1 p U ( n ) ( A ) n . (9) C U = sup n = 1 p U ( n ) ( A ) n . {:(9)C_(U)=s u p_(n=1)^(oo)(p_(U)(n))/((♯A)^(n)).:}\begin{equation*} C_{U}=\sup _{n=1}^{\infty} \frac{p_{U}(n)}{(\sharp A)^{n}} . \tag{9} \end{equation*}(9)CU=supn=1pU(n)(A)n.
Remark 1 The total complexity in definition 2 is not necessarily finite; for example, for the Champernowne word containing successively all the binary written numbers 011011100101111 011011100101111 011011100101111 dots011011100101111 \ldots011011100101111 we have p U ( n ) = 2 n p U ( n ) = 2 n p_(U)(n)=2^(n)p_{U}(n)=2^{n}pU(n)=2n and K U = K U = K_(U)=ooK_{U}=\inftyKU=, while C U = 1 C U = 1 C_(U)=1C_{U}=1CU=1.
For this reason, instead of definition 2 we propose one for normal complexity, which will have values less than 1 .
Definition 3 For the infinite sequence U U UUU, the normal complexity is given by
(10) K U = n = 1 1 ( A ) n p U ( n ) 1 + p U ( n ) (10) K U = n = 1 1 ( A ) n p U ( n ) 1 + p U ( n ) {:(10)K_(U)=sum_(n=1)^(oo)(1)/((♯A)^(n))(p_(U)(n))/(1+p_(U)(n)):}\begin{equation*} \mathcal{K}_{U}=\sum_{n=1}^{\infty} \frac{1}{(\sharp A)^{n}} \frac{p_{U}(n)}{1+p_{U}(n)} \tag{10} \end{equation*}(10)KU=n=11(A)npU(n)1+pU(n)
Remark 2 In the definitions above, there appears the sequence
c n = p U ( n ) ( # A ) n , n N c n = p U ( n ) ( # A ) n , n N c_(n)=(p_(U)(n))/((#A)^(n)),quad n inN^(**)c_{n}=\frac{p_{U}(n)}{(\# A)^{n}}, \quad n \in \mathbb{N}^{*}cn=pU(n)(#A)n,nN
Because of the inequality
p U ( n + 1 ) ( # A ) p U ( n ) , n N p U ( n + 1 ) ( # A ) p U ( n ) , n N p_(U)(n+1) <= (#A)p_(U)(n),quad n inN^(**)p_{U}(n+1) \leq(\# A) p_{U}(n), \quad n \in \mathbb{N}^{*}pU(n+1)(#A)pU(n),nN
holding for each complexity function, the sequence c n c n c_(n)c_{n}cn is non-increasing, so C U = c 1 = 1 C U = c 1 = 1 C_(U)=c_(1)=1C_{U}=c_{1}=1CU=c1=1. For the Champernowne sequence we have c n = 1 , n N c n = 1 , n N c_(n)=1,n inN^(**)c_{n}=1, n \in \mathbb{N}^{*}cn=1,nN.

4 Global complexities for sequences with known complexity functions

In the following we consider sequences for which the complexity function is known and try to determine (at least approximately) their global complexities. The alphabet has three symbols in example 4.1, a + b a + b a+ba+ba+b in 4.2 and two symbols for Sturmian sequences and those in 4.3 and 4.4. The topological entropy is h U = 0 h U = 0 h_(U)=0h_{U}=0hU=0, excepting example 4.4.

4.1 Sequences defined by billiards in the cube

A sequence generated by the structure of billiard trajectories in the cube, associating to a trajectory starting with totally irrational direction the sequence with values in { 1 , 2 , 3 } { 1 , 2 , 3 } {1,2,3}\{1,2,3\}{1,2,3} given by coding 1 (respectively 2,3 ) any time
the particle rebounds on a frontal (lateral, horizontal) side of the cube, was shown in [2] to have the complexity p U ( n ) = n 2 + n + 1 p U ( n ) = n 2 + n + 1 p_(U)(n)=n^(2)+n+1p_{U}(n)=n^{2}+n+1pU(n)=n2+n+1.
Proposition 1 For a sequence defined by the cubic billiard with totally irrational direction, the total, maximal and normal complexity will be
K U = 2.75 , C U = 1 and K U 0.3994006256 . K U = 2.75 , C U = 1  and  K U 0.3994006256 . K_(U)=2.75,C_(U)=1" and "K_(U)~~0.3994006256.K_{U}=2.75, C_{U}=1 \text { and } \mathcal{K}_{U} \approx 0.3994006256 .KU=2.75,CU=1 and KU0.3994006256.

4.2 Sequences with p U ( n ) = a n + b p U ( n ) = a n + b p_(U)(n)=an+bp_{U}(n)=a n+bpU(n)=an+b

For an important class of sequences the complexity function is linear.
Proposition 2 Let U U UUU be a sequence having the complexity function given by p U ( n ) = a n + b , n N ( a N , b N , a + b 2 ) p U ( n ) = a n + b , n N a N , b N , a + b 2 p_(U)(n)=an+b,n inN^(**)(a inN^(**),b inN,a+b >= 2)p_{U}(n)=a n+b, n \in \mathbb{N}^{*}\left(a \in \mathbb{N}^{*}, b \in \mathbb{N}, a+b \geq 2\right)pU(n)=an+b,nN(aN,bN,a+b2). Its total, maximal and normal complexity will be
K U = 2 F 1 ( 1 , 2 a + b a ; a + b a ; 1 a + b ) , C U = 1 K U = 1 a + b + 1 3 F 2 ( 1 , 2 a + b a , a + b + 1 a ; a + b a , 2 a + b + 1 a ; 1 a + b ) K U = 2 F 1 1 , 2 a + b a ; a + b a ; 1 a + b , C U = 1 K U = 1 a + b + 1 3 F 2 1 , 2 a + b a , a + b + 1 a ; a + b a , 2 a + b + 1 a ; 1 a + b {:[K_(U)=_(2)F_(1)(1,(2a+b)/(a);(a+b)/(a);(1)/(a+b))","C_(U)=1],[K_(U)=(1)/(a+b+1)_(3)F_(2)(1,(2a+b)/(a),(a+b+1)/(a);(a+b)/(a),(2a+b+1)/(a);(1)/(a+b))]:}\begin{aligned} & K_{U}={ }_{2} F_{1}\left(1, \frac{2 a+b}{a} ; \frac{a+b}{a} ; \frac{1}{a+b}\right), C_{U}=1 \\ & \mathcal{K}_{U}=\frac{1}{a+b+1}{ }_{3} F_{2}\left(1, \frac{2 a+b}{a}, \frac{a+b+1}{a} ; \frac{a+b}{a}, \frac{2 a+b+1}{a} ; \frac{1}{a+b}\right) \end{aligned}KU=2F1(1,2a+ba;a+ba;1a+b),CU=1KU=1a+b+13F2(1,2a+ba,a+b+1a;a+ba,2a+b+1a;1a+b)
where α F β α F β _(alpha)F_(beta){ }_{\alpha} F_{\beta}αFβ denotes the hypergeometric function.
Remark 3 Well-known sequences of this type are Sturmian sequences (which are not ultimately periodic, but are recurrent), for which a = b = 1 a = b = 1 a=b=1a=b=1a=b=1; they have the global complexities given by
K U = 3 , C U = 1 and K U = 7 / 2 4 ln 2 0.727411278 . K U = 3 , C U = 1  and  K U = 7 / 2 4 ln 2 0.727411278 . K_(U)=3,C_(U)=1" and "K_(U)=7//2-4ln 2~~0.727411278.K_{U}=3, C_{U}=1 \text { and } \mathcal{K}_{U}=7 / 2-4 \ln 2 \approx 0.727411278 .KU=3,CU=1 and KU=7/24ln20.727411278.

4.3 The power sequence

Let us consider the words v i = 0 i , w i = 1 i v i = 0 i , w i = 1 i v_(i)=0^(i),w_(i)=1^(i)v_{i}=0^{i}, w_{i}=1^{i}vi=0i,wi=1i and u i = v i w i , i N , u i u i = v i w i , i N , u i u_(i)=v_(i)w_(i),i inN^(**),u_(i)u_{i}=v_{i} w_{i}, i \in \mathbb{N}^{*}, u_{i}ui=viwi,iN,ui being obtained by concatenation of v i v i v_(i)v_{i}vi and w i w i w_(i)w_{i}wi. The power sequence U U UUU is given by
(11) U = u 1 u 2 u 3 = 010011000111 (11) U = u 1 u 2 u 3 = 010011000111 {:(11)U=u_(1)u_(2)u_(3)dots=010011000111 dots:}\begin{equation*} U=u_{1} u_{2} u_{3} \ldots=010011000111 \ldots \tag{11} \end{equation*}(11)U=u1u2u3=010011000111
and has the complexity p U = n ( n + 1 ) / 2 + 1 p U = n ( n + 1 ) / 2 + 1 p_(U)=n(n+1)//2+1p_{U}=n(n+1) / 2+1pU=n(n+1)/2+1.
Proposition 3 For the power sequence (11), the global complexities are given by
K U = 5 , C U = 1 , K U 0.7595479501 K U = 5 , C U = 1 , K U 0.7595479501 K_(U)=5,C_(U)=1,K_(U)~~0.7595479501K_{U}=5, C_{U}=1, \mathcal{K}_{U} \approx 0.7595479501KU=5,CU=1,KU0.7595479501

4.4 Champernowne sequence

This sequence was mentioned in Section 3 and it is obtained by writing successively all the binary written numbers. Its complexity function is p U ( n ) = 2 n p U ( n ) = 2 n p_(U)(n)=2^(n)p_{U}(n)=2^{n}pU(n)=2n and the topological entropy is h U = 1 h U = 1 h_(U)=1h_{U}=1hU=1.
Proposition 4 For the Champernowne sequence, the global complexities are given by
K U = , C U = 1 , K U 0.7644997803 . K U = , C U = 1 , K U 0.7644997803 . K_(U)=oo,C_(U)=1,K_(U)~~0.7644997803.K_{U}=\infty, C_{U}=1, \mathcal{K}_{U} \approx 0.7644997803 .KU=,CU=1,KU0.7644997803.
Acknowledgements. The author thanks the Ministry of Science and Technology for supporting partially this work (3004GR/1997).

References

[1] J.-P. Allouche, Sur la complexité des suites infinies, Bull. Belg. Math. Soc. 1(1994), 133-143
[2] P. Arnoux, Ch. Mauduit, I. Shiokawa, Jun-ichi Tamura, Complexity of sequences defined by billiards in the cube, Bull. Soc. Math. France 122(1)(1994), 1-12
[3] S. Ferenczi, Complexity of sequences and dynamical systems, to appear in Discrete Math. 206(1999)
[4] S. Ferenczi, Z. Kása, Complexities for finite factors of infinite sequences, to appear in Theor. Comput. Sci. 218(1)(1999), 177-195
[5] A. Iványi, On the d d ddd-complexity of words, Ann. Univ. Sci. Budapest Sect. Comput. 8(1987), 69-90
[6] J. Shallit, On the maximum number of distinct factors in a binary string, Graphs and Combinatorics 9 (1993), 197-200

1999

Related Posts