PROPERTIES OF THE COMPLEXITY FUNCTION FOR FINITE WORDS

Abstract. The subword complexity function $p_w$ of a finite word $w$ over a finite alphabet $A$ with $\mathrm{card}\, A = q \ge 1$ is defined by $p_w(n) = \mathrm{card}(F(w) \cap A^n)$ for $n \in \mathbb{N}$, where $F(w)$ is the set of all subwords (factors) of $w$. The shape of the complexity function, especially its piecewise monotonicity, is studied in detail. The function $h$ defined by $h(n) = \min\{q^n, N - n + 1\}$ for $n \in \{0, 1, \dots, N\}$ bounds the complexity function from above: $p_w(n) \le h(n)$ for all $n \in \{0, 1, \dots, N\}$ and any $w \in A^N$. As a first result regarding $h$, it is proved that for each $N \in \mathbb{N}$ there exist words of length $N$ for which the maximum of the complexity function equals the maximum of $h$; a way to construct such words is described. This raises a further question: for a given $N$, is there a word of length $N$ whose complexity function coincides with $h$ at every $n \in \{0, 1, \dots, N\}$? The question is answered in the affirmative, with different constructive proofs for binary alphabets ($q = 2$) and for alphabets with $q > 2$. Such words are constructed using de Bruijn graphs.

Let an alphabet $A$ with $\mathrm{card}\, A = q \ge 1$ be given. A factor (subword) $u$ of an infinite sequence or finite word $w$ has right valence $j$ if there are exactly $j$ distinct letters $x_i$ such that the words $u x_i$, $1 \le i \le j$, are also in $F(w)$ (the set of all subwords, or factors, of $w$); that is, a factor of right valence $j$ can be extended on the right in exactly $j$ ways. The left valence is defined similarly. A factor with right (left) valence $\ge 2$ is called right (left) special; a factor which is both right and left special is called bispecial. The length of a word $w$ is denoted by $|w|$.
For an infinite sequence $U$, any factor $u$ can always be extended on the right to a factor of $U$. For a finite word $w$ there are subwords which cannot be extended on the right. Such subwords have to be suffixes of $w$. Let us denote by $w_0$ the suffix of $w$ of minimal length which cannot be extended on the right, and by $K$ the length of $w_0$. Then any other subword $\lambda w_0$ (necessarily a suffix of $w$) cannot be extended on the right either. Considering the prefix of $w$ of minimal length which cannot be extended on the left, we denote its length by $H$. The constants $K$ and $H$ were defined by de Luca [15].
Let us denote by $S_0(w)$ the set of all suffixes of $w$ which cannot be extended on the right in $F(w)$, i.e., whose right valence is 0. If the length of $w$ is $N$, then we set, for any $0 \le n \le N$, $s_0(n) = \mathrm{card}(S_0(w) \cap A^n)$. For all $0 \le n \le N$ one has $s_0(n) \le 1$. Moreover, $K$ being the length of $w_0$ (the suffix of $w$ of minimal length which cannot be extended on the right), $s_0$ is given by
$$s_0(n) = \begin{cases} 0, & 0 \le n < K, \\ 1, & K \le n \le N. \end{cases}$$
It follows that the number of subwords which cannot be extended on the right is $\mathrm{card}(S_0(w)) = N - K + 1$.
For an infinite sequence $U$, the (subword) complexity function $p_U : \mathbb{N} \to \mathbb{N}$ (defined in [17] as the block growth, then named subword complexity in [6]) is given by $p_U(n) = \mathrm{card}(F(U) \cap A^n)$ for $n \in \mathbb{N}$, so it maps each nonnegative integer $n$ to the number of factors of length $n$ of $U$; it satisfies the recurrence
(1) $p_U(n+1) = p_U(n) + \sum_{j=2}^{q} (j-1)\, s(j, n)$,
$s(j, n)$ being the number of factors of $U$ of length $n$ and right valence $j$. For a finite word $w$ of length $N$, the complexity function $p_w : \mathbb{N} \to \mathbb{N}$ given by $p_w(n) = \mathrm{card}(F(w) \cap A^n)$, $n \in \mathbb{N}$, has the property that $p_w(n) = 0$ for $n > N$. The corresponding recurrence is
(2) $p_w(n+1) = p_w(n) + \sum_{j=2}^{q} (j-1)\, s(j, n) - s_0(n)$.
The above relations have their correspondents in terms of left extensions of the subwords.
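Relation (2) can be checked mechanically on small words. The following is a minimal brute-force sketch, not taken from the paper: the function names `factors`, `right_valences`, `satisfies_recurrence` and `minimal_unextendable_suffix_length` are hypothetical, and words are assumed to be given as Python strings.

```python
def factors(w, n):
    """Set of factors of length n of the finite word w (empty set if n > len(w))."""
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def right_valences(w, n):
    """Map every factor u of length n to its right valence in F(w)."""
    valence = {u: 0 for u in factors(w, n)}
    for v in factors(w, n + 1):
        valence[v[:-1]] += 1  # v = u x extends u on the right by the letter x
    return valence


def satisfies_recurrence(w):
    """Check p_w(n+1) = p_w(n) + sum_{j>=2} (j-1) s(j,n) - s_0(n) for 0 <= n <= N."""
    for n in range(len(w) + 1):
        valence = right_valences(w, n)
        s0 = sum(1 for j in valence.values() if j == 0)
        growth = sum(j - 1 for j in valence.values() if j >= 2)
        if len(factors(w, n + 1)) != len(valence) + growth - s0:
            return False
    return True


def minimal_unextendable_suffix_length(w):
    """The constant K: least n such that some factor of length n has right valence 0."""
    return min(n for n in range(len(w) + 1) if 0 in right_valences(w, n).values())
```

For the word `abbaa` this gives K = 2, and the unextendable suffixes are `aa`, `baa`, `bbaa` and `abbaa`, in agreement with card(S_0(w)) = N − K + 1 = 4.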
For a finite word $w$ of length $N$ over the alphabet $A$ with $\mathrm{card}\, A = q$, the subword complexity $p_w(n)$ is less than or equal to the number $q^n$ of all possible words of length $n$ over the $q$-letter alphabet, and also less than or equal to the number $N - n + 1$ of all occurrences of subwords of length $n$ in $w$. The map $h : \{0, 1, \dots, N\} \to \mathbb{N}$ defined in [15] by
(4) $h(n) = \min\{q^n, N - n + 1\}$
therefore has values greater than or equal to those of any complexity function $p_w$ for $w \in A^N$; [...] was stated in [11], while $p_w(n) \le h(n)$ appeared in [20].
We recall that for infinite sequences $U$ one has $p_U(n) \le q^n$ for all $n \in \mathbb{N}$, and that there exist sequences, called complete, for which the complexity is precisely $q^n$ for all $n \in \mathbb{N}$. An example is the Champernowne sequence 0.1.10.11.100.101.110.111.1000.... containing successively all the nonnegative integers written in base 2 and, more generally, in base $q$ (it was used in [4] to construct a normal number in base ten).
For $n \ge 1$, $q \ge 2$ and $N \le q$, we have $q^n \ge N \ge N - n + 1$, hence $h(n) = N - n + 1$. For each word $w_2$ of length $N \le q$ consisting of $N$ distinct elements of $A$, the complexity function is $p_{w_2}(n) = N - n + 1$ for all $n \in \{1, \dots, N\}$, and $p_{w_2}(0) = h(0) = 1$; hence in this case it also coincides with $h$.
In what follows we shall consider $q \ge 2$ and $N > q$. The values of the function $h$ are given by the minimum of an increasing exponential and a descending line, so $h$ first follows the exponential and then the descending line. The following result is presented without proof in [15]: if $e_N$ denotes the first point where $(e_N, h(e_N))$ is on the descending line, the maximum $h_{\max}$ of the function $h$ is attained at $e_N$, and $e_N$ is given by $[\log_q N]$ or $[\log_q N] + 1$ (for a real $x$, $[x]$ denotes the largest integer less than or equal to $x$).
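We shall determine precisely the point $(e_N, h(e_N))$ where the maximum of the function $h$ is taken.
Proposition 2. Let $k$ be such that $q^k + k \le N < q^{k+1} + k + 1$. If $q^k + k < N$, then $e_N = k + 1$ and this is the unique point where $h$ attains its maximum $N - k$; for $N = q^k + k$, we have $e_N = k + 1$ and the function $h$ attains its maximum $N - k$ at both $e_N$ and $e_N - 1$.
In fact, if $q^k + k \le N < q^{k+1} + k + 1$, the function $h$ is given by
(5) $h(n) = \begin{cases} q^n, & 0 \le n \le k, \\ N - n + 1, & k + 1 \le n \le N, \end{cases}$
for $N = q^k + k$ the maximum being attained also at the point $k$.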
Proof. Let $N \in \mathbb{N}$ be given so that $q^k + k \le N < q^{k+1} + k + 1$. The function $h$ being increasing on $\{0, 1, \dots, e_N - 1\}$ and decreasing on $\{e_N, \dots, N\}$, we only have to compare its values at $e_N - 1$ and $e_N$. From the definition of $h$ and of $e_N$ we have
(6) $q^{e_N - 1} < N - (e_N - 1) + 1$
and
(7) $q^{e_N} \ge N - e_N + 1$,
which means that
(8) $q^{e_N - 1} + e_N - 1 \le N < q^{e_N} + e_N$.
But $h(e_N - 1) = q^{e_N - 1}$ and $h(e_N) = N - e_N + 1$, hence from (6) it follows that the maximum of $h$ is taken at $e_N$. The function $f(x) = q^x + x$ being increasing, from (8) one obtains $e_N = k + 1$ and $h(e_N) = N - k$.
If q k + k < N < q k+1 + k + 1 we have h(k + 1) > h(k), so e N = k + 1 is the unique point where h attains its maximum which is equal to N − k; if N = q k + k, we have h(k + 1) = h(k), so the maximum N − k of h is taken at two points e N = k + 1 and e N − 1.
The description of $h$ given in (5) was established in [12]. The value of $e_N$ being related to the integer part of $\log_q N$, we can give a more precise result than that in Proposition 1 above.
Remark 2. For q k + k ≤ N < q k+1 + k + 1, we have k = [log q N ] for q k + k ≤ N < q k+1 , and k + 1 = [log q N ] for q k+1 ≤ N < q k+1 + k + 1; it follows that in the first case e N = [log q N ] + 1, and in the second case e N = [log q N ].
Given the number N, the maximum of the function h can be easily determined.
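As a quick illustration of Proposition 2, the sketch below computes $h$, the point $e_N$ and the index $k$ directly from their definitions, so that $e_N = k + 1$ and $h_{\max} = N - k$ can be compared numerically. The function names `h`, `e_N` and `k_of` are hypothetical helpers introduced here, and the code assumes $q \ge 2$ and $N > q$ as in the text.

```python
def h(n, N, q):
    """Upper bound h(n) = min(q**n, N - n + 1) for words of length N over q letters."""
    return min(q ** n, N - n + 1)


def e_N(N, q):
    """First n at which (n, h(n)) lies on the descending line, i.e. q**n >= N - n + 1."""
    n = 0
    while q ** n < N - n + 1:
        n += 1
    return n


def k_of(N, q):
    """The unique k with q**k + k <= N < q**(k+1) + k + 1."""
    k = 0
    while q ** (k + 1) + k + 1 <= N:
        k += 1
    return k


if __name__ == "__main__":
    for N in (6, 7):
        k = k_of(N, 2)
        values = [h(n, N, 2) for n in range(N + 1)]
        print(N, k, e_N(N, 2), max(values), values)
```

For $q = 2$ the loop covers the two lengths $N = 6 = 2^2 + 2$ (maximum attained at both $k$ and $k + 1$) and $N = 7$ (maximum attained only at $k + 1$).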

PROPERTIES OF THE COMPLEXITY FUNCTION p w
For an infinite sequence $U$, the complexity function $p_U$ is nondecreasing; if there exists $m \in \mathbb{N}$ such that $p_U(m+1) = p_U(m)$, then $p_U(n)$ is constant for $n \ge m$. The complexity function of a finite word $w$ of length $N$ obviously behaves differently, because $p_w(N) = 1$ (there is a unique factor of length $N$, namely $w$). The shape of $p_w$ was studied by Heinz [11] and then by de Luca [15]; the results in [15] are briefly presented in what follows. Then the piecewise monotonicity of $p_w$ is established in Theorems 5 and 6.
Let us consider, for $n \in \{0, \dots, N\}$, the number $R_w(n)$ of all right special factors of length $n$. Any suffix of a right special factor is still a right special factor. Since $R_w(N-1) = R_w(N) = 0$, one can define an integer $R$ by
$$R = \min\{n \in \{0, \dots, N\} : R_w(n) = 0\}.$$
One has $0 \le R \le N - 1$; thus $R - 1$ represents the maximal length of a right special factor of $w$ (except for the word $a^N$, which has no special factor and for which $R = 0$). If $R = 1$, there are no right special factors of length $n \ge 1$ in $w$; an example is $w = (ab)^k$, $k \ge 1$. Similarly, there exists a number $0 \le L \le N - 1$ such that $L - 1$ represents the maximal length of a left special factor of $w$ (except if $w = a^N$). Recall the number $K$ ($H$) representing the minimal length of a suffix (prefix) of $w$ which cannot be extended on the right (left). The numbers $K$ and $R$ (or their duals $H$ and $L$) play an important role in the description of the shape of $p_w$.
Let us denote
$$r(n) = \sum_{j=2}^{q} (j-1)\, s(j, n).$$
The function $r$ has the property that $r(n) > 0$ for $n \in [0, R-1]$ and $r(n) = 0$ for $n \in [R, N]$. The recurrence relation (2) can be written as
$$p_w(n+1) = p_w(n) + r(n) - s_0(n).$$
It follows
Proposition 3. [15]. The subword complexity $p_w$ takes its maximum at $R$ and, moreover,
$$p_w(R) = N - \max\{R, K\} + 1.$$
Proof. In both cases analyzed above ($R < K$ and $R \ge K$), $p_w$ has its maximum at $R$.
In a similar way, one can prove
Proposition 4. [15]. The subword complexity $p_w$ takes its maximum at $L$ and, moreover,
$$p_w(L) = N - \max\{L, H\} + 1,$$
hence $\max\{R, K\} = \max\{L, H\}$.
From the analysis before Proposition 3, we have the following information on the shape of the function p w [15]: For R < K, it is strictly increasing (starting from p w (0) = 1 and p w (1) = q = card A), then constant, and then strictly decreasing (with p w (n + 1) = p w (n) − 1 on the last interval).
For $R \ge K$, $p_w$ is at first strictly increasing, then non-decreasing, and at last strictly decreasing, also with $p_w(n+1) = p_w(n) - 1$ on the last interval. So in both cases there is an interval on which $p_w$ is increasing and one on which $p_w$ is strictly decreasing. The only problem is that in the second case it could happen that, after becoming constant, $p_w$ increases again. We show that this is not the case.
Let us consider $n \in [K, R-1]$, so $s_0(n) = 1$, $r(n) > 0$ and $p_w(n+1) \ge p_w(n)$. Suppose that there exists $n$ so that
(10) $p_w(n+1) = p_w(n)$
and
(11) $p_w(n+2) > p_w(n+1)$.
From (10) one obtains that $s(2, n) = 1$ and $s(j, n) = 0$ for $j \ge 3$, i.e., there exists a unique right special factor of length $n$, and its valence is 2; let it be denoted $v_n$. From (11) it follows that $r(n+1) \ge 2$, which is possible in two situations:
I. there are at least two right special factors of length $n + 1$;
II. there is a right special factor of length $n + 1$ with valence at least 3.
If II. is true, there is a right special factor of length $n + 1$ having valence at least 3, and then the factor obtained by removing its first letter has length $n$ and valence at least 3, contradicting the uniqueness of $v_n$.
If I. is true, there exist two different right special factors of length $n + 1$ and valence at least 2. They can differ only by their first letter, otherwise there would exist two different factors of length $n$ and valence 2. So they have the form $a v_n$, $b v_n$ with $a \ne b$, i.e., $v_n$ is bispecial, and the word $w$ also contains the factors
(12) $a v_n c$, $a v_n d$, $b v_n c$, $b v_n d$, with $c \ne d$.
The subword $v_n$ cannot be a suffix of $w$, since $v_n$ is extendable to the right and there is no extendable suffix of length greater than or equal to $K$. Let us consider the last occurrence of $v_n$ and suppose it is followed by $c$. Then
(13) $w = z_1 v_n c z_2$
and, $v_n c$ being left special, $v_n c$ has another occurrence in $w$, so $w = z_1' v_n c z_2'$ with $|z_2'| > |z_2|$. Let $u$ be the longest common prefix of $v_n c z_2$ and $v_n c z_2'$, which satisfies $|u| \ge n + 1$. Since the subword $u$ is a proper prefix of $v_n c z_2'$, $u$ is right extendable; then it cannot be a suffix of $w$, hence it is also a proper prefix of $v_n c z_2$, and thus right special. The suffix of length $n$ of $u$ is then right special, in contradiction with the fact that the last occurrence of $v_n$, the unique right special factor of length $n$, was chosen so that $w = z_1 v_n c z_2$.
It follows that in the case $K < R$, if $p_w(n) = p_w(n+1)$ for a value $n \ge K$, then $p_w$ remains constant until it begins (at $R$) to decrease to 1 (it cannot start increasing again). We mention that Heinz [11] proved that from $p_w(n) = p_w(n+1)$ and ...
Let us denote by $J$ the smallest number greater than or equal to $K$ for which $w$ has precisely one right special factor of that length, with valence 2 (if there is no such number, take $J = R$). We have established the following
Theorem 5. For a finite word of length $N$, the complexity function is at first strictly increasing, then constant, and at last decreasing with slope $-1$.
One can easily avoid analyzing two cases by simply considering, instead of a word $w \in A^N$, a word $W \in (A \cup \{*, \#\})^{N+2}$ obtained by adding two different symbols which are not in $A$ at the beginning and at the end of $w$, i.e., $W = {*}w\#$. The complexity functions for $w$ and $W$ are related by $p_W(n) = p_w(n) + 2$ for $n \in \{1, \dots, N + 1\}$ (and obviously $p_W(N + 2) = 1$). So the graph of $p_W$ is the graph of $p_w$ shifted by two units parallel to the y-axis, and the two functions have the same monotonicity. For $W$ we have $K_W = 1$, $R_W \ge R_w$ and, similarly, $H_W = 1$, $L_W \ge L_w$, hence in this case we always have $R_W \ge K_W$; from Proposition 4 it also follows that $R_W = L_W$. The advantage of considering the word $W$ is that instead of the four parameters $K$, $H$, $R$, $L$ we are left with only one, namely the common value $M$ of $R_W = L_W$. Denoting by $J$ the smallest positive number for which $W$ has precisely one right special factor of that length, with valence 2 (if there is no such factor, $J = M$), we obtain
Theorem 6. For a finite word $w$ of length $N$, the intervals of monotonicity of $p_w$ are $[0, J]$, $[J, M]$ and $[M, N]$, the function being at first increasing, then constant, and then decreasing with slope $-1$; the maximum of $p_w$ is $p_w(M) = N - M + 1$. The numbers $J$ and $M$ are those defined above for the word $W = {*}w\#$.
We mention that the refinement of de Luca's result has been proved independently by Levé and Séébold [14] while studying k-reachable integers.
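The shape described in Theorems 5 and 6 and the relation $p_W(n) = p_w(n) + 2$ can be observed experimentally. The following sketch is only a brute-force illustration under the assumption that words are Python strings not containing the characters `*` and `#`; the helper names `profile`, `has_three_phase_shape` and `bordered` are hypothetical.

```python
from itertools import product


def profile(w):
    """List [p_w(0), ..., p_w(N)] of the numbers of distinct factors of each length."""
    N = len(w)
    return [len({w[i:i + n] for i in range(N - n + 1)}) for n in range(N + 1)]


def has_three_phase_shape(p):
    """True if p is strictly increasing, then constant, then decreasing with slope -1."""
    steps = [b - a for a, b in zip(p, p[1:])]
    i = 0
    while i < len(steps) and steps[i] > 0:   # strictly increasing part
        i += 1
    while i < len(steps) and steps[i] == 0:  # constant part
        i += 1
    return all(s == -1 for s in steps[i:])   # the rest must drop by exactly one


def bordered(w):
    """The word W = *w# obtained by adding two fresh symbols around w."""
    return "*" + w + "#"


if __name__ == "__main__":
    for letters in product("ab", repeat=6):
        w = "".join(letters)
        assert has_three_phase_shape(profile(w))
    print(profile("abbaa"), profile(bordered("abbaa")))
```

The exhaustive loop only covers binary words of length 6; it is a sanity check of the statement, not a substitute for the proof.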

THE FUNCTION h AND RELATED WORDS
In Section 2 we found the point where the function $h$ takes its maximum. A problem to be considered is the following: are there any words $w$ of length $N$ such that $h(e_N) = \max\{p_w(n) : n \in \{0, \dots, N\}\}$? If such words exist, the maximum of their complexity function cannot be exceeded by the maximum of the complexity function of any other word of length $N$.
The answer to this problem is affirmative, and it relies on the following result, which was stated by Good in [10] for $q = 2$. The enumeration of the words whose existence is proved was given by de Bruijn [2], who later [3] acknowledged the priority of C. Flye Sainte-Marie [7].
Lemma 7. Given an alphabet $A$ with $\mathrm{card}\, A = q$, for each $k \in \mathbb{N}$ the shortest word containing all the $q^k$ words of length $k$ has $q^k + k - 1$ letters.
Proof. The existence of such a word (usually called a de Bruijn word of order $k$) is proved by considering the de Bruijn graph $B_{k-1}$ (which is strongly connected) with $q^{k-1}$ vertices labelled with the elements of $A^{k-1}$, and $q^k$ arcs (an arc from $u$ to $v$ exists if and only if there exist two letters $x, y \in A$ such that $ux = yv \in A^k$). Each vertex has the same number $q$ of inward and outward arcs; therefore there exists an Eulerian cycle, and each path starting from any vertex and following the cycle until coming back to that vertex provides a word (obviously the shortest) of length $q^k + k - 1$ which contains exactly one occurrence of each of the $q^k$ words of length $k$. The word of length $q^k + k - 1$ is often identified with the cycle formed by its first $q^k$ letters.
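The proof is constructive once an Eulerian cycle in $B_{k-1}$ is available. The sketch below finds one with Hierholzer's algorithm, which is a standard choice made here for illustration and is not named in the text; the function name `de_bruijn_word` is hypothetical, and the alphabet is assumed to be a string of distinct characters with $k \ge 1$.

```python
from itertools import product


def de_bruijn_word(alphabet, k):
    """Word of length q**k + k - 1 containing each word of length k exactly once."""
    # Vertices of B_{k-1} are the words of length k-1; the arc labelled x leaves
    # the vertex u and enters (u + x)[1:], spelling the word u + x of length k.
    unused = {"".join(v): list(alphabet) for v in product(alphabet, repeat=k - 1)}
    start = alphabet[0] * (k - 1)
    stack = [(start, "")]      # (vertex, label of the arc used to enter it)
    labels = []
    while stack:
        vertex, entered_by = stack[-1]
        if unused[vertex]:
            x = unused[vertex].pop()
            stack.append(((vertex + x)[1:], x))
        else:
            stack.pop()
            labels.append(entered_by)  # arcs are emitted in reverse order
    labels.reverse()
    return start + "".join(labels)


if __name__ == "__main__":
    print(de_bruijn_word("ab", 3))
```

Reading the arc labels along the circuit, after the $k - 1$ letters of the starting vertex, yields the de Bruijn word; which particular word is obtained depends on the order in which arcs are tried.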
Remark 4. For the de Bruijn word of order $k$, whose existence was proved in Lemma 7, we have $R = K = J = k$, and the maximum of its complexity function is attained at $k$ and equals $q^k$. Such a word can be represented in the form $x_1 \dots x_{q^k} \dots x_{q^k+k-1}$ (with $x_{q^k+1} \dots x_{q^k+k-1} = x_1 \dots x_{k-1}$), or as a cycle $(x_1 \dots x_{q^k})$, or as an infinite periodic sequence with period $q^k$. The first algorithm which constructs such a word was given by Martin [16]. Considering the alphabet $A = \{i_1, \dots, i_q\}$, the algorithm consists of the following three rules.
I. Each of the first $k - 1$ symbols is chosen equal to $i_1$.
II. The symbol $a_m$ to be added to the sequence $a_1 a_2 \dots a_k \dots a_{m-k+1} \dots a_{m-1}$, where $a_1 = \dots = a_{k-1} = i_1$, $m \ge k$ and the $a$'s stand for the $i$'s in a certain order, is the $i_j$ with the greatest subscript consistent with the requirement that the section $a_{m-k+1} \dots a_{m-1} a_m$ duplicate no previously occurring section of $k$ symbols in the above sequence.
III. Rule II. is first applied for $m = k$ (in which case $a_m = a_k = i_q$) and is then applied repeatedly until a further application is impossible.
This algorithm needs a very large memory (for all the subwords of length k which have already been obtained), but there exist also some memoryless algorithms exposed, for example, in [8], [9] and [18].
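Rules I to III translate almost literally into code. The sketch below is one possible reading of them, with the alphabet given as a string whose characters are listed in the order $i_1, \dots, i_q$; the name `martin_word` is hypothetical. The set `seen` is exactly the memory of already used sections of $k$ symbols mentioned above.

```python
def martin_word(alphabet, k):
    """Greedy construction following rules I-III (alphabet ordered i_1, ..., i_q, k >= 1)."""
    word = alphabet[0] * (k - 1)          # rule I: k-1 copies of i_1
    seen = set()                          # sections of k symbols used so far
    while True:
        for x in reversed(alphabet):      # rule II: try the greatest subscript first
            section = word[len(word) - (k - 1):] + x
            if section not in seen:
                seen.add(section)
                word += x
                break
        else:                             # rule III: no symbol can be added any more
            return word


if __name__ == "__main__":
    print(martin_word("ab", 2))   # abbaa
    print(martin_word("ab", 3))   # aabbbabaaa
```

The two printed words are the Martin words of order 2 and 3 over $\{a, b\}$ that appear in Example 4.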
We can prove now Theorem 8.For each N ∈ N, there exists a word of length N over an alphabet A with card A = q for which the maximum of the complexity function is equal to the maximum h max of h; the maximum is taken at the same points for both functions.Such words can be easily constructed using the de Bruijn words.
Proof. Keeping in mind the considerations in Remark 1, which mean precisely that the theorem is true for $q = 1$ and for $q \ge 2$, $N \le q$, we shall consider $q \ge 2$ and $N > q$. Let $k$ be the unique natural number such that $q^k + k \le N < q^{k+1} + k + 1$. If $N = q^k + k$, we apply Lemma 7 for $k$, obtaining a word of length $N - 1$ containing as factors all the $q^k$ words of $k$ letters, and $q^k - 1$ distinct words of length $k + 1$. The word $v$ obtained by adding a letter from $A$ at its end contains $q^k$ words of $k$ letters and $q^k$ distinct words of length $k + 1$, hence $p_v(k) = p_v(k + 1) = N - k$. This is the maximum of the function $p_v$; it is equal to $h_{\max}$ and it is attained at the same points as the maximum of $h$ given in Proposition 2. Actually, in this case we have $p_v = h$.
Let us now consider the case N = q k+1 + k − m, m ∈ {0, 1, ..., q k+1 − q k − 1}.Applying Lemma 7 for the number k + 1, we obtain a shortest word w containing all the q k+1 words of length k + 1, having q k+1 + k letters.So for each m ∈ {0, 1, ..., q k+1 − q k − 1}, the prefix w m of w obtained by deleting m final letters will satisfy p wm (k + 1) = q k+1 − m > q k ≥ p wm (k), this being the maximum of the complexity function for the considered word.The maximum is attained only for k + 1.
Applying Proposition 2 for N = q k+1 + k − m we obtain h max = h(k + 1) = q k+1 +k−m−k = q k+1 −m, which means that the maximum of the complexity function for w m is equal to the maximum of the complexities of all possible words of length q k+1 + k − m.The maximum of the function h for N = 6 and N = 7 was calculated in Example 1 and it coincides with that of p v , respectively p w 3 and is taken at the same points.
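The two cases of the proof can be turned into a short procedure. The sketch below is an illustration only: it uses Martin's greedy rule to obtain the de Bruijn words (any other de Bruijn word would do), assumes $q \ge 2$ and $N > q$, and the names `martin_word`, `profile` and `word_with_maximal_peak` are hypothetical.

```python
def martin_word(alphabet, k):
    """De Bruijn word of order k built greedily (largest unused letter first)."""
    word, seen = alphabet[0] * (k - 1), set()
    while True:
        for x in reversed(alphabet):
            section = word[len(word) - (k - 1):] + x
            if section not in seen:
                seen.add(section)
                word += x
                break
        else:
            return word


def profile(w):
    """List [p_w(0), ..., p_w(N)]."""
    N = len(w)
    return [len({w[i:i + n] for i in range(N - n + 1)}) for n in range(N + 1)]


def word_with_maximal_peak(alphabet, N):
    """Word of length N whose maximal complexity equals h_max (assumes q >= 2, N > q)."""
    q = len(alphabet)
    k = 0
    while q ** (k + 1) + k + 1 <= N:      # k with q**k + k <= N < q**(k+1) + k + 1
        k += 1
    if N == q ** k + k:
        # first case: de Bruijn word of order k plus one extra letter
        return martin_word(alphabet, k) + alphabet[0]
    # second case: prefix of a de Bruijn word of order k + 1
    return martin_word(alphabet, k + 1)[:N]


if __name__ == "__main__":
    for N in (6, 7):
        w = word_with_maximal_peak("ab", N)
        print(N, w, max(profile(w)))
```

For $N = 6$ and $N = 7$ over $\{a, b\}$ this reproduces the words $v = abbaaa$ and $w_3 = aabbbab$ of Example 4.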

THE REPRESENTATION OF h AS A COMPLEXITY FUNCTION
An interesting problem is: Let q ≥ 1 and N ∈ N be given and the function h : {0, 1, ..., N } −→ N defined as in (4).Is there a word w of length N over the q-letter alphabet A such that (14) h(n) = p w (n) for all n ∈ {0, 1, ..., N }, i.e., h is the complexity function for that word?If such a word does exist, how can it be constructed?
The question has an affirmative answer for the trivial cases $q = 1$ and $q \ge 2$, $N \le q$, mentioned in Remark 1, so it has to be studied for $q \ge 2$, $N > q$. In the proof of Theorem 8 it was shown that, given the number $N = q^k + k$, $k \ge 1$, there exists a word $v$ of length $N$ containing $q^k$ distinct words of length $k + 1$, and also $q^k$ words of length $k$. This means that $h$ and $p_v$ coincide at $k$ and $k + 1$. On the one hand, $p_v(k) = h(k) = q^k$ means that $v$ contains all possible words of length $k$ as factors, and this implies that it also contains all possible words of shorter lengths, hence $h(n) = p_v(n) = q^n$ for $n \in \{0, 1, \dots, k\}$. On the other hand, $p_v(k + 1) = h(k + 1) = N - k$ means that each of the $N - k$ factors of length $k + 1$ of $v$ occurs exactly once, as there are precisely $N - k$ available positions for a factor of this length, and this implies that longer factors occur only once too, hence $h(n) = p_v(n) = N - n + 1$ for $n \in \{k + 1, \dots, N\}$. We have shown that $h(n) = p_v(n)$ for all $n \in \{0, 1, \dots, N\}$, and the question is positively answered for $N = q^k + k$.
If we consider now $N = q^{k+1} + k$, $k \ge 1$, the case which corresponds to the choice $m = 0$ in the proof of Theorem 8, we obtain the existence of a word $w = w_0$ of length $N$ containing all $q^{k+1}$ words of length $k + 1$. The point $(k + 1, p_w(k + 1))$ being on both the curves $(n, q^n)$ and $(n, N - n + 1)$, the same argument gives $p_w = h$.
We mention at first a sufficient condition for the existence, for $q \ge 2$ and $N > q$, of a word $w$ of length $N$ whose complexity function is equal to $h$.
Lemma 9. Given an alphabet with $\mathrm{card}\, A = q \ge 2$, if for each $k \ge 1$ there exists a de Bruijn word $v$ of order $k + 1$ from which it is possible to obtain successively words shorter by one symbol, so that at each step the number of subwords of length $k + 1$ decreases by one but the number of subwords of length $k$ remains $q^k$, until we are left with a word of length $q^k + k$, then for each $N \in \{q^k + k, \dots, q^{k+1} + k\}$ there exists a word $v_N$ with $p_{v_N} = h$.
Proof. Let $v_N$ be the word of length $N \in \{q^k + k, \dots, q^{k+1} + k\}$ obtained from $v$ after having removed $q^{k+1} + k - N$ letters, at each step the number of subwords of length $k + 1$ being diminished by 1, while the number of subwords of length $k$ remains constant. Then $p_{v_N}(k) = q^k$ and $p_{v_N}(k + 1) = q^{k+1} - (q^{k+1} + k - N) = N - k$, so, arguing as for $N = q^k + k$ above, $p_{v_N}(n) = h(n)$ for all $n \in \{0, 1, \dots, N\}$.
Remark 5. The condition in Lemma 9 is fulfilled if there exists a de Bruijn word of order $k + 1$ whose prefix is a de Bruijn word of order $k$. In this case we can simply delete in turn one letter from the end of the word of order $k + 1$.
The existence of words which satisfy the conditions in Lemma 9 (in fact those in Remark 5) was proved for q ≥ 3 by Vörös [20].It follows also as a consequence of a stronger result obtained by Cummings and Wiedemann in Proposition 2 from [5].In fact the overlap of the two de Bruijn sequences in [5] is even longer than it is needed in Remark 5. We remind that the de Bruijn graph B k has as vertices the elements in A k and an arc from any vertex x 1 ...x k to x 2 ...x k x k+1 , where x i ∈ A for i ∈ {1, ..., k + 1} .The graph B k+1 has as vertices the arcs of B k , and the arcs in this graph are obtained by joining two consecutive arcs in B k .An Eulerian circuit in B k corresponds to a Hamiltonian one in B k+1 and conversely.The result of Cummings and Wiedemann follows from the fact that if one removes from the Eulerian circuit in B k , which corresponds to a de Bruijn sequence of order k + 1, the circuit corresponding to a de Bruijn sequence of order k, the remaining graph is still Eulerian and connected (it is essential that q ≥ 3).Lemma 10. [5].If q ≥ 3 and k ≥ 1 each de Bruijn sequence of order k can be strongly embedded in a de Bruijn sequence of order k + t with t ≥ 1, i.e., the two sequences have the same symbols on the first q k + k + t − 1 positions.
It follows that, for q ≥ 3, there exist infinite sequences whose prefixes of length N satisfy ( 14) for each N ∈ N.Such sequences were called in [13] and [20] supercomplex; similarly, a word of length M was called supercomplex if all its prefixes of length N ≤ M satisfied (14).In [13] and [20] it was shown that supercomplex sequences do not exist for binary alphabets, more precisely it was verified that a binary supercomplex word has the length at most 9.This means that no de Bruijn sequence of order 2 can be embedded in a de Bruijn sequence of order 3.In [5] a general negative result is given for binary alphabets: in this case no de Bruijn sequence of order k ≥ 2 ever embeds in a de Bruijn sequence of order k + 1 (even if we ask the coincidence to take place only for the first 2 k + k − 1 positions, hence a weak embedding).It follows that for a binary alphabet we cannot obtain a word as that in the sufficient condition in Remark 5, unless k = 1.Nevertheless we can construct in this case a de Bruijn word of order k + 1 from which the sequences in Lemma 9 can be obtained, even if this word has not as a prefix a de Bruijn word of order k.Lemma 11.A finite number of cycles can be appended to any de Bruijn cycle of order k over a binary alphabet in order to make it a de Bruijn cycle of order k + 1.
Proof.Let w = (x 1 ...x 2 k ) be a de Bruijn cycle of order k.It will be also a Hamiltonian circuit in the de Bruijn graph B k .The graph G, formed by all the vertices in A k and the arcs in B k which are not in the Hamiltonian circuit determined by w, has each vertex of degree 2 (one outward and one inward arc).It follows that G will be a union of vertex disjoint cycles, and w and G are arc disjoint.Each of these cycles will have common vertices with the Hamiltonian circuit determined by w, hence they can be appended one by one to it to form finally an Eulerian circuit in B k , that is a de Bruijn cycle of order k + 1.
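The first step of this proof, the decomposition of $G$ into vertex disjoint cycles, is easy to compute. The sketch below assumes the binary alphabet $\{a, b\}$ and a de Bruijn cycle of order $k$ given as a string of length $2^k$; the function name `complementary_cycles` is hypothetical, and the code only lists the cycles of $G$ without performing the appending step.

```python
def complementary_cycles(cycle, k):
    """Vertex disjoint cycles formed by the arcs of B_k not used by the given cycle."""
    n = len(cycle)
    extended = cycle + cycle[:k]          # unroll the cycle to read all its arcs
    # letter labelling the arc that the Hamiltonian circuit uses to leave each vertex
    used = {extended[i:i + k]: extended[i + k] for i in range(n)}
    other = {"a": "b", "b": "a"}
    # the remaining arc out of every vertex carries the other letter
    successor = {v: v[1:] + other[x] for v, x in used.items()}
    visited, cycles = set(), []
    for v in sorted(successor):
        if v in visited:
            continue
        current = []
        while v not in visited:           # follow the unused arcs until the cycle closes
            visited.add(v)
            current.append(v)
            v = successor[v]
        cycles.append(current)
    return cycles


if __name__ == "__main__":
    print(complementary_cycles("abba", 2))
```

For the Martin cycle $(abba)$ of order 2 the result consists of the loops at $aa$ and $bb$ and the cycle $ab \to ba \to ab$.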

Now we can state
Theorem 12.For each alphabet A with card A = q ≥ 1 and for each N ∈ N there exists a word of length N whose complexity function coincides with the function h.
Proof.For q = 1, or q ≥ 2 and N ≤ q, the result was already proved in Remark 1.
Let q = 2 and k ≥ 1 so that N ∈ 2 k + k, ..., 2 k+1 + k .Consider a de Bruijn cycle of order k (constructed for example using Martin's algorithm) and extend it as in Lemma 11, by adding vertex disjoint cycles, to a de Bruijn cycle of order k + 1. Write it as a de Bruijn word such that it ends with the letters of one of the appended cycles.When we remove one by one all the symbols in that cycle, the number of subwords of length k + 1 will decrease at each step by one, but the number of subwords of length k will remain the same (all these subwords are included in the initial de Bruijn cycle).Write again the obtained cycle as a word which ends with another appended cycle and delete in turn the last symbol until the cycle disappears.Finally we are left with a word of length 2 k + k obtained from the initial de Bruijn cycle of order k, which contains 2 k words of length k + 1 and 2 k words of length k.
If we are not interested to obtain all the words of length N ∈ {q k + k, ..., q k+1 + k}, but only a specific one, we can apply a result of Shallit [19]: For each i ∈ 1, ..., 2 k the graph B k contains a cycle of length i that can be used to construct a closed chain of length 2 k + i which visits every vertex at least once.
Finally, let q ≥ 3 and k ≥ 1 so that N ∈ q k + k, ..., q k+1 + k .Applying Lemma 10 for t = 1, we obtain the existence of a de Bruijn word of order k + 1 which has as prefix a de Bruijn word of order k, hence it satisfies the conditions in Lemma 9.It follows that for each N ∈ {q k + k, ..., q k+1 + k} there exists a word of length N (obtained by successively deleting a symbol from the end of the de Bruijn word of order k + 1) whose complexity function is the function h corresponding to that N.
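For small $N$ the statement of Theorem 12 can also be confirmed by exhaustive search, independently of the de Bruijn constructions used in the proof. The following brute-force sketch is meant only as a check on small cases; the names `profile`, `h_profile` and `words_with_complexity_h` are hypothetical, and the running time grows as $q^N$.

```python
from itertools import product


def profile(w):
    """List [p_w(0), ..., p_w(N)]."""
    N = len(w)
    return [len({w[i:i + n] for i in range(N - n + 1)}) for n in range(N + 1)]


def h_profile(N, q):
    """List [h(0), ..., h(N)] with h(n) = min(q**n, N - n + 1)."""
    return [min(q ** n, N - n + 1) for n in range(N + 1)]


def words_with_complexity_h(alphabet, N):
    """All words of length N over the alphabet whose complexity function equals h."""
    target = h_profile(N, len(alphabet))
    candidates = ("".join(t) for t in product(alphabet, repeat=N))
    return [w for w in candidates if profile(w) == target]


if __name__ == "__main__":
    for N in range(1, 11):
        found = words_with_complexity_h("ab", N)
        print(N, len(found), found[0])
```

The words $u_5 = abbaa$ and $u_{10} = abbbaaabab$ of Example 5 and the word $w_4 = abca$ over three letters are among the words found.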
Example 5. Let us first consider the case of a binary alphabet $A = \{a, b\}$. We shall construct, as in the proof of Theorem 12, words $u_N$ with $N \in \{1, 2, \dots, 10\} \cup \{37, \dots, 69\}$ for which $p_{u_N} = h$. We can obviously consider $u_1 = a$ and $u_2 = ab$. We have a weak embedding, marked by a gap, of the de Bruijn word of order 1, $ab$, in the de Bruijn word of order 2, $ab\ baa$ (a situation which is no longer possible for words of order $k$, respectively $k + 1$, for $k \ge 2$). We obtain in turn $u_5 = abbaa$, $u_4 = abba$ and $u_3 = abb$. Let us now consider for $k = 2$ the Martin cycle $w = (abba)$, which corresponds to the word $u_5 = abbaa$. The graph $G$ obtained from $B_2$ by removing all the arcs of $w$ is the union of the cycles $(a)$, $(b)$ and $(ab)$, i.e., $aa \to aa$, $bb \to bb$, respectively $ab \to ba \to ab$. Appending these to the cycle $w$ we obtain for instance the de Bruijn cycle of order 3 $u = (aababbba)$, where we underlined the appended cycles; we write it as $u_{10} = abbbaaabab$; deleting the symbols of $(ab)$ from the end, and then those in the loops $(a)$ and $(b)$ (which can be deleted without shifting the cycle), we obtain in turn the words $u_9, \dots, u_6$. We have $p_{u_N}(5) = 32$ and $p_{u_N}(6) = N - 5$ for $N \in \{37, \dots, 69\}$.
Let now a 3-letter alphabet A = {a, b, c} be given (the situation is similar for any q > 3).We have obviously w 1 = a, w 2 = ab, w 3 = abc.From a de Bruijn word of order 2 which contains a strongly embedded de Bruijn word of order 1 abca accbba, the gap marking the end of the overlapping, we can obtain the words w 10 = abcaaccbba ... w 4 = abca.
Similarly, from the de Bruijn word of order 3 which contains a strongly embedded de Bruijn word of order 2 ...
Remark 6. It is clear now that Theorem 8 is a consequence of the stronger result from Theorem 12. However, if one is interested only in obtaining words $w$ with $\max\{p_w(n) : n \in \{1, \dots, N\}\} = h_{\max}$, the constructive methods in the proof of Theorem 8 are simpler and faster.

OTHER COMPLEXITY MEASURES FOR FINITE WORDS
The complexity function $p_w$ which was used throughout the paper has the advantage that it can be defined in the same way both for infinite sequences and for finite words, as was stated in the introduction. As far as finite words are concerned, the first measure of subword complexity seems to have been introduced by Heinz [11] as the total number of factors of $w$,
$$K(w) = \mathrm{card}\, F(w) = \sum_{n=0}^{N} p_w(n).$$
The problem of studying the maximum of $K(w)$ over all the words of length $N$ over a finite alphabet with $q$ elements was a central one. It is easy to see that the maximum of $K(w)$ over the words in $A^N$ is attained at $w_0$ if for each $n \in \{0, \dots, N\}$ the maximum of $p_w(n)$ is attained at $p_{w_0}(n)$. One then has the bound (obtained in [20]) ...

Example 4. Let us consider for the 2-letter alphabet $A = \{a, b\}$ the values $N = 6$ and $N = 7$. For $N = 6 = 2^2 + 2$ we have $k = 2$ and, by adding a letter (for example $a$) to the Martin word of order 2, $abbaa$, we obtain $v = abbaaa$; the maximum of $p_v$ is $p_v(2) = p_v(3) = 4$. For $N = 7 = 2^2 + 3$ we can consider the Martin word of order 3, $aabbbabaaa$, and delete three symbols from the end. The word $w_3 = aabbbab$ has the maximum of its complexity function given by $p_{w_3}(3) = 5$.