Unavoidable pattern
In mathematics and theoretical computer science, a pattern(term) is a sequence of characters over some alphabet . A word avoids pattern if does not match . If avoids , is also called -free.
A word matches pattern if there exists a factor(subword) of which is also an instance of .
A word is an instance of pattern , if there exists a non-erasing morphism(substitution) such that . Non-erasing means .
A pattern is unavoidable on a finite alphabet if there exists , word matches for any . Otherwise, is avoidable on , which implies there exist infinitely many words over alphabet that avoid . This is equivalent to saying that is avoidable on if there exists an infinite word that avoids .
A pattern is k-avoidable if is avoidable on an alphabet of size . Otherwise, is k-unavoidable, which means is unavoidable on every alphabet of size .[1][2]
A pattern is an unavoidable pattern(blocking term) if is unavoidable on any finite alphabet.
Let A be an alphabet of letters and E a disjoint alphabet of pattern symbols or "variables". Elements of E+ are patterns. For a pattern p, the pattern language is that subset of A∗ containing all words h(p) where h is a non-erasing semigroup morphism from the free monoid E∗ to A∗. A word w in A∗ matches or meets p if it contains some word in the pattern language as a factor, otherwise w avoids p.[3][4]
There is a word W(k) over an alphabet of size 4k which avoids every avoidable pattern with less than 2k variables.[5]
Zimin words
Given alphabet , Zimin words are defined recursively for and base case .
A word is an unavoidable pattern if and only if it's a factor of a Zimin word.[6]
It is unavoidable that any string, containing two unique characters, that is five or more characters long will contain a pattern of the form ABA (the second Zimin word). Using three unique characters any string containing 29 or more characters will contain a pattern of the form ABACABA[7]
Pattern reduction
Given a pattern over some alphabet , we say is free for if there exist subsets of such that the following hold:
- is a factor of and ↔ is a factor of and
e.g. let , then is free for since there exist satisfied conditions above.
A pattern can reduce to pattern if there exists a character such that is free for , and can be obtained by removing all occurrence of from .
e.g. let , then is reducible to since is free for .
Given patterns , if can reduce to and can reduce to , then can reduce to .
A pattern is unavoidable if and only if can reduce to a word of length 1.[6][8]
Examples
- The Thue–Morse sequence avoids the patterns xxx and xyxyx.[1][2]
- The patterns x and xyx are unavoidable on any alphabet.[4][9]
- The power pattern xx is 3-avoidable;[4][1] words avoiding this pattern are square-free.[2][10]
- The power patterns xn for n ≥ 3 are 2-avoidable: the Thue–Morse sequence is an example for n=3.[1]
- Sesquipowers are unavoidable.[9]
Avoidability index
The avoidability index of a pattern p is the smallest k such that p is k-avoidable, or ∞ if p is unavoidable.[11] For binary patterns (two variables x and y) we have:[12]
- The Zimin words x, xyx, xyxzxyx, xyxzxyxwxyxzxyx, etc. are unavoidable, as well as any word that can be written as a subword of a Zimin word via a homomorphism. All other words are avoidable.
- Many patterns have avoidability index 2.
- xx,xxy,xyy,xxyx,xxyy,xyxx,xyxy,xyyx,xxyxx,xxyxy,xyxyy, as well as many doubled words, have avoidability index 3, though this list is not complete.
- abwbaxbcyaczca has avoidability index 4, as well as other locked words. (Baker, McNulty, Taylor 1989)
- abvbawbcxacycdazdcd has avoidability index 5. (Clark 2004)
- no words with index greater than 5 have been found.
Square-free words
A square-free word is one avoiding the pattern xx. An example is the word over the alphabet {0,±1} obtained by taking the first difference of the Thue–Morse sequence.[13][14]
See also
References
- ^ a b c d Lothaire (2011) p. 113
- ^ a b c Berstel et al (2009) p.127
- ^ Lothaire (2011) p. 112
- ^ a b c Allouche & Shallit (2003) p.24
- ^ Lothaire (2011) p. 122
- ^ a b Zimin, A.I. (1982). BLOCKING SETS OF TERMS.
- ^ Joshua, Cooper; Rorabaugh, Danny (2013). Avoidable Patterns in Strings of Symbols (PDF). arXiv:1409.3080. Bibcode:2014arXiv1409.3080C.
- ^ BEAN, DWIGHT RICHARD; EHRENFEUCHT, ANDRZEJ; MCNULTY, GEORGE FRANK (1979). Avoidable Patterns in Strings of Symbols (PDF).
- ^ a b Lothaire (2011) p.115
- ^ Lothaire (2011) p. 114
- ^ Lothaire (2011) p.124
- ^ Lothaire (2011) p.126
- ^ Pytheas Fogg (2002) p.104
- ^ Berstel et al (2009) p.97
- Allouche, Jean-Paul; Shallit, Jeffrey (2003). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. ISBN 978-0-521-82332-6. Zbl 1086.11015.
- Berstel, Jean; Lauve, Aaron; Reutenauer, Christophe; Saliola, Franco V. (2009). Combinatorics on words. Christoffel words and repetitions in words. CRM Monograph Series. Vol. 27. Providence, RI: American Mathematical Society. ISBN 978-0-8218-4480-9. Zbl 1161.68043.
- Lothaire, M. (2011). Algebraic combinatorics on words. Encyclopedia of Mathematics and Its Applications. Vol. 90. With preface by Jean Berstel and Dominique Perrin (Reprint of the 2002 hardback ed.). Cambridge University Press. ISBN 978-0-521-18071-9. Zbl 1221.68183.
- Pytheas Fogg, N. (2002). Berthé, Valérie; Ferenczi, Sébastien; Mauduit, Christian; Siegel, A. (eds.). Substitutions in dynamics, arithmetics and combinatorics. Lecture Notes in Mathematics. Vol. 1794. Berlin: Springer-Verlag. ISBN 3-540-44141-7. Zbl 1014.11015.