Article

On Concatenations of Regular Circular Word Languages

by Bilal Abdallah 1,2,* and Benedek Nagy 1,3,*
1 Department of Mathematics, Faculty of Arts and Sciences, Eastern Mediterranean University, 99450 Famagusta, North Cyprus, Mersin-10, Turkey
2 Department of Mathematics and Statistics, American University of the Middle East, Egaila 54200, Kuwait
3 Department of Computer Science, Institute of Mathematics and Informatics, Eszterházy Károly Catholic University, 3300 Eger, Hungary
* Authors to whom correspondence should be addressed.
Mathematics 2025, 13(5), 763; https://doi.org/10.3390/math13050763
Submission received: 30 December 2024 / Revised: 12 February 2025 / Accepted: 20 February 2025 / Published: 26 February 2025
(This article belongs to the Special Issue Formal Methods in Computer Science: Theory and Applications)

Abstract

In this paper, one-wheel and two-wheel concatenations of circular words and their languages are investigated. One-wheel concatenation is an operation that is commutative but not associative, while two-wheel concatenation is associative but not commutative. Moreover, two-wheel concatenation may produce languages that are not languages of circular words. We define two classes of regular languages of circular words based on finite automata: in a weakly accepted circular word language, at least one conjugate of each word is accepted by the automaton; in contrast, a strongly accepted language consists of words for which all conjugates are accepted. Weakly accepted circular word languages, $REG_w$, are in fact regular languages that are the same as their cyclic permutations. Strongly accepted circular word languages, $REG_s$, having words with the property that all their conjugates are also in the language, are also regular. We prove that $REG_w$ and $REG_s$ coincide. We also provide regular-like expressions for these languages. Closure properties of this class are also investigated.

1. Introduction

On the one hand, a relatively new area of computer science called combinatorics on words focuses on the structure of finite and infinite sequences of symbols or words. It is closely related to many different branches of mathematics, including algebra, number theory, game theory and others. The first book on the topic was produced by a number of authors using the pen name M. Lothaire [1], but the majority of researchers believe that A. Thue’s writings are among the earliest works to be published in this field [2,3]. Since then, numerous computer science applications have been found, including those for string matching, data compression, bioinformatics, etc., such as the De Bruijn sequences [4], just to name an important example.
On the other hand, circular words differ from linear words and open up some intriguing new perspectives. Circular words are often referred to as necklaces or cyclic words [5]. Similar sequences can be seen in nature; for instance, some bacteria’s DNA sequences resemble a necklace. Circular words, in a sense, can be seen as strongly periodic discrete functions [6,7]. From a mathematical point of view, in many places, the conjugate class of a word is considered as its cyclic or circular version [8] by highlighting its main difference from usual words, i.e., that there are no well-defined starting and ending points of the sequence. Thus, they can be used to model various objects and phenomena from formal, mathematical and computing points of view.
While the majority of research focuses on “normal” (linear) words, there are also some studies about circular words. The reader can refer to [9] for information on Weinbaum factorizations. D. Nowotka’s dissertation [10] addresses unbordered conjugates of words in Chapter 4. Moreover, there are several applications to integer sequences in [11,12]. Mathematicians are particularly interested in cyclic (or circular) codes [13], which are connected to the concept of circular words. This demonstrates that circular words play a significant role in various places in computer science and mathematics.
Finite automata and regular expressions are well-known and widely applicable tools in various areas of science and engineering. They are important from both theoretical (mathematical and computer science) and practical points of view. One of our aims is to connect these concepts to circular words so that these tools can be used for circular words and their languages. Another important aim is to study the concatenation operation and to find a variant that fits circular words better than the usual operation, which is well suited to traditional, linear words.
In Section 2, we recall the fundamental concepts and notions of formal languages, combinatorics on words (both linear and circular) and finite automata. In Section 3, we provide the definitions of our new concepts, the regular languages of weakly accepted and strongly accepted circular words, $REG_w$ and $REG_s$. We prove the equality of these two classes, and their closure properties under various operations are also studied. Moreover, we give a regular-expression-style description of these languages. Conclusions and open questions are provided at the end of the paper.

2. Preliminaries

In this section, we recall some basic concepts and present our notation. All concepts that are not detailed here are from standard textbooks, e.g., [5,14,15]. In this paper, the set of positive integers is denoted by $\mathbb{N}$, and $\mathbb{N}_0$ denotes the set of non-negative integers. A nonempty set of symbols (or letters) is called an alphabet and denoted by $\Sigma$. A (linear) word or string is a finite sequence of elements of $\Sigma$ (we use the word linear when we contrast these words with the topic of this paper, i.e., cyclic or circular words, which are defined later in this section). We denote by $\Sigma^*$ the set of all words over $\Sigma$. The length of a word $w$ is the number of symbols in $w$ and is denoted by $|w|$. The empty word, denoted by $\lambda$, is the unique word of length zero. The concatenation of two words $u$ and $v$ is $u \cdot v$ (we may omit the dot between $u$ and $v$). A word $v \in \Sigma^*$ is a factor of $w \in \Sigma^*$ if there exist two words $x, y \in \Sigma^*$ such that $w = xvy$. If $x = \lambda$ (resp. $y = \lambda$), then $v$ is a prefix (resp. suffix) of $w$. The $i$th letter of $w \in \Sigma^*$ is denoted by $a_i$ for $i \in \{1, \dots, |w|\}$. The reversal of a word $w = a_1 a_2 \cdots a_n$ of length $n$ is denoted by $w^R$, and it is the word written backwards, that is, $w^R = a_n a_{n-1} \cdots a_1$ (note that $\lambda^R = \lambda$). The powers of a word $w$ are defined as $w^0 = \lambda$ and $w^{k+1} = w w^k$ for $k \geq 0$. A period of a word $w = a_1 a_2 \cdots a_n$ is a positive integer $p$ such that $a_i = a_{i+p}$ for all $i = 1, \dots, |w| - p$. Two words $x$ and $y$ are conjugates if there exist words $u, v \in \Sigma^*$ such that $x = uv$ and $y = vu$.
A language $L$ over an alphabet $\Sigma$ is a finite or infinite set of words, i.e., $L \subseteq \Sigma^*$. Set theoretical operations and concatenation are applied to languages. We can also use the Kleene star and Kleene plus closure operations to define the languages $L^* = \bigcup_{k=0}^{\infty} L^k$ and $L^+ = \bigcup_{k=1}^{\infty} L^k$, where $L^0 = \{\lambda\}$, $L^1 = L$, and $L^k = L^{k-1} \cdot L$. We also use $\|L\|$ to denote the cardinality of $L$, which can also be infinite.
A circular word $w_\circ$, from a graphical point of view, is created by connecting the first letter of a linear word $w$ to the last. Obviously, a circular word has neither a beginning nor an end, as shown in Figure 1. Thus, the circular word of $w$ can be seen as, and will be represented by, the set of all conjugates of $w$ or, equivalently, all the cyclic shifts (cyclic permutations) of $w$; we may call this representation the linearization of $w_\circ$.
$$w_\circ = \{ v \mid v \text{ is a conjugate of } w \}$$
As a special case, the empty word can also be seen as the empty circular word (to emphasize its usage in this manner, we may also write it as $\lambda_\circ$). Notice that even if a circular word is a set, the length of each of its words is the same; thus, the notation $|w_\circ|$ is well defined for their common length. On the other hand, $w_\circ$ as a set has cardinality $\|w_\circ\|$, which is at most $|w|$ if $w$ is not the empty word. For the empty word, $\|\lambda_\circ\| = 1$ and $|\lambda_\circ| = 0$. In [6,7], the periodic properties of circular words were investigated. A period is called a strong period if it is a period of each element of the conjugate class $w_\circ$ of the word. On the other hand, weak periods are periods that belong to at least one of the conjugates. In this paper, we also use the adjectives weak and strong, but for acceptance. Thus, let us briefly recall the concept of finite automata.
A finite automaton is a five-tuple $A = (Q, \Sigma, q_0, \delta, F)$, where $Q$ is the finite set of states, $\Sigma$ is the input alphabet, $q_0 \in Q$ is the initial state, $F \subseteq Q$ is the set of final (or accepting) states and $\delta : Q \times (\Sigma \cup \{\lambda\}) \to 2^Q$ is the transition function for non-deterministic finite automata (NFA) with allowed $\lambda$-transitions, while $\delta : Q \times \Sigma \to Q$ is the transition function for deterministic finite automata (DFA). The accepted language of $A$ is denoted by $L(A)$. A language is called regular if it can be recognized by a non-deterministic or deterministic finite automaton. Regular languages can also be described by regular expressions with the three regular operations: union $+$, concatenation $\cdot$ and Kleene star $^*$.
In this paper, we work with the linearization of languages of circular words, i.e., with languages with the property that $w \in L$ implies $w_\circ \subseteq L$. If this property holds, then we say that $L$ is a language of circular words.
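The set-of-conjugates view of a circular word, and the closure property that defines a language of circular words, can be sketched in a few lines of Python. This is only an illustrative sketch for finite languages; the helper names `conjugates` and `is_circular_language` are hypothetical and do not come from the paper.

```python
def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def is_circular_language(language: set[str]) -> bool:
    """True if a finite language contains every conjugate of each of its words."""
    return all(conjugates(w) <= language for w in language)


print(sorted(conjugates("abc")))            # ['abc', 'bca', 'cab']
print(len(conjugates("abab")))              # 2, fewer than the length 4
print(is_circular_language({"01", "10"}))   # True
print(is_circular_language({"01"}))         # False
```

The second call shows that the cardinality of a circular word can be strictly smaller than its length, which happens exactly when the word is a proper power.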

3. Regular Languages of Circular Words

We start this section with the definitions of the regular classes of circular word languages, which are based on finite automata. Thus, let us assume that a finite automaton $A$ is given. Now, for a circular word $w_\circ$, if at least one of its elements is accepted, then we say that $w_\circ$ is weakly accepted by $A$. On the other hand, if all elements of $w_\circ$ are accepted, then it is strongly accepted. Based on these, we define our language classes in a formal way.
Definition 1.
Let $A$ be a finite automaton. $A$ weakly accepts the language of circular words that contains every circular word for which at least one of its conjugates belongs to the language $L(A)$. It is denoted by $L_w(A) = \{ u \in z_\circ \mid z \in L(A) \}$.
Definition 2.
Let $A$ be a finite automaton. $A$ strongly accepts the language of circular words that contains exactly the circular words for which all conjugates are in $L(A)$. It is denoted by $L_s(A) = \{ u \mid z \in L(A) \text{ for all } z \in u_\circ \}$.
Example 1.
Let $A$ accept the language $L(A) = \{ w \mid w \text{ ends with a } 1 \}$ over the alphabet $\Sigma = \{0, 1\}$. The word $w = 110$ is not in $L(A)$. Thus, $w \notin L_s(A)$. But $w \in L_w(A)$ since one of its conjugates is $101$, which is in $L(A)$. In this example, $L_s(A) = 1^+$ (the language of nonempty words containing only 1's), while $L_w(A) = (0+1)^* 1 (0+1)^*$ is the set of words having at least one 1.
Example 2.
Let automaton $A$ accept $L(A) = \{ w \mid w \in 0^* + 1^* \}$ over the alphabet $\Sigma = \{0, 1\}$. Here, $L_s(A) = L_w(A) = L(A)$.
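The two acceptance modes can be illustrated with a small Python sketch in which the automaton is replaced by a membership predicate. The names `weakly_accepted` and `strongly_accepted` are hypothetical helpers, and the predicates simply encode the languages of Examples 1 and 2.

```python
from typing import Callable


def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def weakly_accepted(w: str, accepts: Callable[[str], bool]) -> bool:
    """At least one conjugate of w is accepted."""
    return any(accepts(z) for z in conjugates(w))


def strongly_accepted(w: str, accepts: Callable[[str], bool]) -> bool:
    """Every conjugate of w is accepted."""
    return all(accepts(z) for z in conjugates(w))


def ends_with_one(w: str) -> bool:       # language of Example 1
    return w.endswith("1")


def only_one_letter(w: str) -> bool:     # language 0* + 1* of Example 2
    return len(set(w)) <= 1


print(weakly_accepted("110", ends_with_one))     # True  (101 ends with 1)
print(strongly_accepted("110", ends_with_one))   # False (110 does not)
print(strongly_accepted("111", ends_with_one))   # True
print(weakly_accepted("0011", only_one_letter))  # False
```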
Proposition 1.
For any given finite automaton $A$, the relation $L_s(A) \subseteq L_w(A)$ holds.
Proof. 
For every word $u$, $u \in L_s(A)$ means that $z \in L(A)$ for all $z \in u_\circ$, i.e., all the conjugates of $u$ are in $L(A)$. In particular, $u \in z_\circ$ for some $z \in L(A)$, and thus $u \in L_w(A)$. □
Theorem 1.
Let $A$ be a finite automaton. Then, $L_w(A)$ is a regular language.
Proof. 
As the class of regular languages is closed under cyclic shift (see, e.g., [15,16,17]), one can create a finite automaton $A'$ such that $L(A') = L_w(A)$ by, e.g., the construction due to Maslov [18]. □
As the cyclic closure operation for languages plays a crucial role here, we define the following special notation:
Definition 3.
Let $L_\circ$ denote the result of the unary operation taking the cyclic closure of the language $L$ in the argument, i.e., $L_\circ = \{ uv \mid vu \in L \text{ with some } v, u \in \Sigma^* \}$.
Theorem 2.
For any regular language $L$, $L_s$ is also a regular language.
Proof. 
Let a regular language $L$ be given. Then, $L_\circ$ denotes the closure of $L$ under cyclic shift. Further, applying this closure operator to the complement $\overline{L}$ of the language $L$, we obtain $(\overline{L})_\circ$, the language that contains the cyclic closures of all words that are not in $L$. In fact, $(\overline{L})_\circ$ contains each circular word with the property that at least one of its conjugates is not in $L$. Now, the complement of this language is $\overline{(\overline{L})_\circ}$, the circular word language containing all circular words for which all of their conjugates are in $L$. Hence, by the definition of $L_s$, we have $L_s = \overline{(\overline{L})_\circ}$. Finally, as the class of regular languages is closed under complement and also under cyclic shift, the resulting language is still regular. Thus, for each regular language $L$, $L_s$ is also regular. □
In fact, we have a (De Morgan-style) duality relation between the languages $L_w$ and $L_s$ for each language $L$. The classes of regular languages of weakly accepted circular words and strongly accepted circular words are denoted by $REG_w$ and $REG_s$, respectively. Thus, the previous results can be written as $REG_w \subseteq REG$ and $REG_s \subseteq REG$. As is obvious from the definitions, from our point of view, circular word languages are, in fact, languages that are closed under the cyclic shift operation, i.e., the language $L$ is the same as its cyclic closure.
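The duality used in the proof of Theorem 2 (the strongly accepted language is the complement of the cyclic closure of the complement) can be checked by brute force on a bounded universe of words. The following is a hedged sketch: the universe is restricted to binary words of length at most 4, the sample language is "words ending with 1", and all helper names are hypothetical.

```python
from itertools import product


def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def cyclic_closure(language: set[str]) -> set[str]:
    """Union of the conjugate classes of all words in the language."""
    return {z for w in language for z in conjugates(w)}


# Bounded universe: all binary words of length 0..4 (closed under cyclic shift).
universe = {"".join(p) for n in range(5) for p in product("01", repeat=n)}
L = {w for w in universe if w.endswith("1")}

strong = {u for u in universe if conjugates(u) <= L}   # all conjugates in L
dual = universe - cyclic_closure(universe - L)          # complement-closure-complement

print(strong == dual)    # True
print(sorted(strong))    # ['1', '11', '111', '1111']
```

Because the universe is itself closed under cyclic shift, taking complements inside it does not distort the comparison.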

4. Concatenations of Circular Words and Circular Word Languages

In this section we investigate various concatenation operations for circular words and for their languages. As circular words can be seen as sets (of words), we can use the usual concatenation of these sets; however, the result may not be a set (or a language) of circular words but a kind of sequence of them.
Definition 4.
Let $x_\circ$ and $y_\circ$ be two circular words; their two-wheel concatenation is the sequence of the circular words obtained by:
$$x_\circ y_\circ = x_\circ \cdot y_\circ = \{ z \mid z = uv \text{ where } u \in x_\circ \text{ and } v \in y_\circ \}$$
Indeed, the output in this case is usually not a circular word. We refer to this operation as a “two-wheel concatenation” (see Figure 2), while the sequence of circular words, in general, may be obtained by some subsequent applications of the operation, i.e., with n applications, one can obtain a sequence of n + 1 circular words. We shortly refer to such objects as sequences of circular words based on the picture of how one can imagine them: in Figure 2, a sequence of two circular words is shown, and such a sequence can be generalized to longer sequences, e.g., containing n circular words for any natural number n . Thus, actually, we may use this type of concatenation in general for sequences of circular words, when the sequences with one circular word are actually the circular words. Moreover, the empty word can be seen as a sequence of zero circular words or, alternatively, as a sequence of any number of empty circular words (i.e., by applying the operation on itself, the result is also the empty circular word, but it can also be seen as a longer sequence made by this word).
As we are interested in circular word languages, we also define a concatenation to obtain circular words.
Definition 5.
For any two circular words $x_\circ$ and $y_\circ$, their one-wheel concatenation is:
$$x_\circ \odot y_\circ = \{ w \in z_\circ \mid z = uv \text{ where } u \in x_\circ \text{ and } v \in y_\circ \}$$
See also Figure 3. Note here that Figure 2 and Figure 3 show only particular cases. The result of the operation usually gives a set with all the possibilities, as we will also explain in Example 3 below. However, first, let us show a very important property of the one-wheel concatenation.
Proposition 2.
The one-wheel concatenation of two circular words is a circular word language.
Proof. 
As a circular word is the set of a word and all its conjugates, for any two circular words $x_\circ$ and $y_\circ$, their one-wheel concatenation is the set of words formed by concatenating each conjugate of $x$ with each conjugate of $y$, along with the conjugates of these concatenations. □
Example 3.
Let $x = abc$ and $y = 01$ be two linear words over the alphabet $\Sigma = \{a, b, c, 0, 1\}$. Then, $x_\circ \cdot y_\circ = \{abc01, abc10, bca01, bca10, cab01, cab10\}$ and
$$x_\circ \odot y_\circ = \{abc01, bc01a, c01ab, 01abc, 1abc0, abc10, bc10a, c10ab, 10abc, 0abc1, bca01, ca01b, a01bc, 01bca, 1bca0, bca10, ca10b, a10bc, 10bca, 0bca1, cab01, ab01c, b01ca, 01cab, 1cab0, cab10, ab10c, b10ca, 10cab, 0cab1\}$$
As the previous example shows, the one-wheel concatenation of two circular words may not be a circular word, but it is always a circular word language, as we have proven. We investigate later what are the necessary and sufficient conditions to have the same result for the two types of concatenations (in this case, two-wheel concatenation of the circular words also results in a circular word language).
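Both concatenations of circular words are easy to compute for concrete words. The sketch below reproduces the set sizes of Example 3; the function names `two_wheel` and `one_wheel` are hypothetical, and words are represented by any one of their linear representatives.

```python
def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def two_wheel(x: str, y: str) -> set[str]:
    """Every conjugate of x followed by every conjugate of y."""
    return {u + v for u in conjugates(x) for v in conjugates(y)}


def one_wheel(x: str, y: str) -> set[str]:
    """All conjugates of the words in the two-wheel concatenation."""
    return {w for z in two_wheel(x, y) for w in conjugates(z)}


print(len(two_wheel("abc", "01")))                       # 6
print(len(one_wheel("abc", "01")))                       # 30
print(two_wheel("abc", "01") <= one_wheel("abc", "01"))  # True
print(one_wheel("abc", "01") == one_wheel("01", "abc"))  # True
```

The last line illustrates commutativity on this particular pair; it is a check of one instance, not a proof.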
Proposition 3.
The one-wheel concatenation of circular words is commutative but not associative.
Proof. 
In terms of commutativity, for any two circular words $x_\circ$ and $y_\circ$,
$$x_\circ \odot y_\circ = \{ w \in z_\circ \mid z = uv \text{ where } u \in x_\circ \text{ and } v \in y_\circ \} = \{ w \in z_\circ = z'_\circ \mid z' = vu,\ u \in x_\circ \text{ and } v \in y_\circ \} = y_\circ \odot x_\circ$$
Consider the following example to prove the non-associativity: Let $x = ab$, $y = 1$ and $z = c$ be three words over the alphabet $\Sigma = \{a, b, c, 1\}$. We have:
$$(x_\circ \odot y_\circ) \odot z_\circ = \{ab1c, b1ca, 1cab, cab1\} \cup \{b1ac, 1acb, acb1, cb1a\} \cup \{1abc, abc1, bc1a, c1ab\} \cup \{ba1c, a1cb, 1cba, cba1\} \cup \{a1bc, 1bca, bca1, ca1b\} \cup \{1bac, bac1, ac1b, c1ba\}$$
and
$$x_\circ \odot (y_\circ \odot z_\circ) = \{ab1c, b1ca, 1cab, cab1\} \cup \{abc1, bc1a, c1ab, 1abc\} \cup \{ba1c, a1cb, 1cba, cba1\} \cup \{bac1, ac1b, c1ba, 1bac\}$$
Therefore, $(x_\circ \odot y_\circ) \odot z_\circ \neq x_\circ \odot (y_\circ \odot z_\circ)$. □
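The non-associativity counterexample can be replayed mechanically. In this sketch the one-wheel concatenation is applied to sets of words that are already closed under cyclic shift, so that an intermediate result can be fed back into the operation; `one_wheel_lang` is a hypothetical helper name.

```python
def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def one_wheel_lang(l1: set[str], l2: set[str]) -> set[str]:
    """One-wheel concatenation of two sets that are closed under cyclic shift."""
    return {w for x in l1 for y in l2 for w in conjugates(x + y)}


X, Y, Z = conjugates("ab"), conjugates("1"), conjugates("c")
left = one_wheel_lang(one_wheel_lang(X, Y), Z)    # (x . y) . z
right = one_wheel_lang(X, one_wheel_lang(Y, Z))   # x . (y . z)

print(len(left), len(right))   # 24 16
print(right < left)            # True: a proper subset, so the results differ
print(sorted(left - right))    # the eight words missing from the right-hand side
```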
Proposition 4.
The unit element for the one-wheel concatenation of circular words is the empty circular word $\lambda_\circ$.
Proof. 
Obviously, for any circular word $x_\circ$, $\lambda_\circ \odot x_\circ = x_\circ = x_\circ \odot \lambda_\circ$. □
One can also see that these two types of concatenations are not independent. Thus, let us investigate their relation first on circular words.
Lemma 1.
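For any two circular words $x_\circ$ and $y_\circ$, $x_\circ \cdot y_\circ \subseteq x_\circ \odot y_\circ$ holds.
Proof. 
Let $w \in x_\circ \cdot y_\circ$. Then, there exist two words $u$ and $v$ such that $w = uv$, where $u \in x_\circ$ and $v \in y_\circ$. Consequently, $w \in (uv)_\circ$ and, thus, $w \in x_\circ \odot y_\circ$. □
Lemma 2.
For any two nonempty circular words $x_\circ$ and $y_\circ$, the following holds:
$$\|x_\circ \cdot y_\circ\| \leq \|x_\circ \odot y_\circ\| \leq \|x_\circ \cdot y_\circ\| \cdot |xy|$$
Proof. 
By Lemma 1, since $x_\circ \cdot y_\circ \subseteq x_\circ \odot y_\circ$, the first part is obvious. On the other hand, by the definitions of the concatenations, $x_\circ \odot y_\circ$ consists of the words of $x_\circ \cdot y_\circ$ and their conjugates. As the number of conjugates of a word is at most its length, the second part follows. □
Theorem 3.
For any two circular words $x_\circ$ and $y_\circ$, let $n = \max\{|x_\circ|, 1\}$, $m = \max\{|y_\circ|, 1\}$ and $p = \max\{|xy|, 1\}$. Then, $\|x_\circ \cdot y_\circ\| \leq nm$, and $\|x_\circ \odot y_\circ\| \leq nmp$.
Proof. 
The proof goes by cases.
  • If both circular words $x_\circ$ and $y_\circ$ are the empty word, then $\|x_\circ \cdot y_\circ\| = \|x_\circ \odot y_\circ\| = 1$, as each contains only the empty word.
  • If only one of the two circular words $x_\circ$ and $y_\circ$ is the empty word, then $\|x_\circ \cdot y_\circ\| = \|x_\circ \odot y_\circ\| \leq \max\{|x_\circ|, |y_\circ|\}$.
  • If $x_\circ, y_\circ \subseteq \{a\}^*$ for an $a \in \Sigma$, then $\|x_\circ \cdot y_\circ\| = \|x_\circ \odot y_\circ\| = 1$.
  • If $x \neq \lambda$ and $y \neq \lambda$, then $1 \leq \|x_\circ \cdot y_\circ\| \leq |x_\circ| \cdot |y_\circ| = nm$, and by Lemma 2, $1 \leq \|x_\circ \odot y_\circ\| \leq \|x_\circ \cdot y_\circ\| \cdot |xy| \leq nmp$. □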
Theorem 4.
For any two circular words $x_\circ$ and $y_\circ$, $x_\circ \cdot y_\circ = x_\circ \odot y_\circ$ if and only if one of the following conditions holds:
  • $x_\circ, y_\circ \subseteq \{a_i\}^*$ for some $a_i \in \Sigma$, or
  • one of the circular words is $\lambda_\circ$, and the other can be any circular word.
Proof 
(Sufficient condition). For any $a_i \in \Sigma$ and $p, q \in \mathbb{N}_0$, if $x_\circ = \{a_i^p\}$ and $y_\circ = \{a_i^q\}$, then $x_\circ \cdot y_\circ = x_\circ \odot y_\circ = \{a_i^{p+q}\}$ (this can be $\lambda_\circ$ for $p = q = 0$). On the other hand, $x_\circ \cdot \lambda_\circ = x_\circ \odot \lambda_\circ = x_\circ$ and $\lambda_\circ \cdot y_\circ = \lambda_\circ \odot y_\circ = y_\circ$.
(Necessary condition). Assume that $x_\circ \cdot y_\circ = x_\circ \odot y_\circ$, i.e., $x_\circ \cdot y_\circ$ is a circular word language, as the one-wheel concatenation is commutative. Thus, the cases are as follows:
  • $\|x_\circ \cdot y_\circ\| = \|x_\circ \odot y_\circ\| = 1$ implies $x_\circ, y_\circ \subseteq \{a_i\}^*$ (either or both of $x$ and $y$ can be $\lambda$).
  • If $\|x_\circ \cdot y_\circ\| = \|x_\circ \odot y_\circ\| \geq 2$, then there are at least two different letters $a_i, a_j \in \Sigma$ occurring in $xy$. Let us assume, to the contrary, that neither $x$ nor $y$ is $\lambda$. Then, there are the following three subcases:
    If there exists a letter $a_i \in \Sigma$ that is in $x$ but not in $y$, then $a_i$ cannot be a suffix of any word in $x_\circ \cdot y_\circ$. Thus, $x_\circ \cdot y_\circ$ is not a circular word language, which contradicts the hypothesis.
    If there exists a letter $a_i \in \Sigma$ that is in $y$ but not in $x$, then $a_i$ cannot be a prefix of any word in $x_\circ \cdot y_\circ$, another contradiction.
    Finally, if there exist at least two different letters $a_i, a_j \in \Sigma$ in both $x$ and $y$, then let $n$ be the largest number such that $a_i^n$ is a factor of an element of $x_\circ$, and similarly, let $m$ be the largest number such that $a_i^m$ is a factor of an element of $y_\circ$. In this case, both $n$ and $m$ are positive. Then, the longest prefix of the form $a_i^*$ in $x_\circ \cdot y_\circ$ has length $n$, but in $x_\circ \odot y_\circ$, it has length $n + m$, contradicting the equality of these two sets. □
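The characterization in Theorem 4 can be sanity-checked exhaustively on short words. The sketch below, with hypothetical helper names, compares the two concatenations for every pair of words of length at most 3 over the alphabet {a, b} and records any pair on which the predicted condition and the observed equality disagree. It is a finite check only and does not replace the proof.

```python
from itertools import product


def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def two_wheel(x: str, y: str) -> set[str]:
    return {u + v for u in conjugates(x) for v in conjugates(y)}


def one_wheel(x: str, y: str) -> set[str]:
    return {w for z in two_wheel(x, y) for w in conjugates(z)}


def predicted_equal(x: str, y: str) -> bool:
    """Condition of Theorem 4: one word is empty, or both use one common letter."""
    return x == "" or y == "" or len(set(x + y)) == 1


words = ["".join(p) for n in range(4) for p in product("ab", repeat=n)]
mismatches = [(x, y) for x in words for y in words
              if (two_wheel(x, y) == one_wheel(x, y)) != predicted_equal(x, y)]

print(len(words) ** 2, "pairs checked; mismatches:", mismatches)
```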
Based on the defined concatenation operations on circular words (and on their sequences), we may also define the extensions of these operations to (circular word) languages. Obviously, the two-wheel concatenation of circular word languages is the same as the usual concatenation of them; however, the result may not be a circular word language. Further, we have the following:
Definition 6.
Let $L_1$ and $L_2$ be two languages of circular words. Their one-wheel concatenation, denoted by $L_1 \odot L_2$, is the union of all one-wheel concatenations of circular words stemming from the two languages:
$$L_1 \odot L_2 = \bigcup_{x \in L_1,\ y \in L_2} x_\circ \odot y_\circ$$
For the sake of completeness, we also formally define the two-wheel concatenation of two languages of circular words.
Definition 7.
Let $L_1$ and $L_2$ be two languages of circular words. Their two-wheel concatenation, denoted by $L_1 \cdot L_2$, is the union of all two-wheel concatenations of circular words stemming from the two languages:
$$L_1 \cdot L_2 = \bigcup_{x \in L_1,\ y \in L_2} x_\circ \cdot y_\circ$$
Example 4.
Let $L_a = \{001, 011, 101, 111\}$ be the language of words of length 3 ending with 1, and let $L_b$ be the language of all words ending with 1, both over the alphabet $\Sigma = \{0, 1\}$. Then, the circular word languages associated with these languages, i.e., for automata $A$ and $B$ accepting $L_a$ and $L_b$, respectively, are $L_w(A) = \{001, 010, 100, 011, 110, 101, 111\}$, which is the language of words of length 3 having at least one 1, and $L_w(B)$, which is the language of all words having at least one 1.
The word $u = 000011 \in L_w(A) \odot L_w(B)$, since one of the conjugates of $u$ is $u' = 001100$, which is the concatenation of two words $x = 001$ and $y = 100$, where $x \in L_w(A)$ and $y \in L_w(B)$. However, $u \notin L_w(A) \cdot L_w(B)$, since $u$ can only be written in the form $pq$ with $p \in x_\circ = (000)_\circ$ and $q \in y_\circ = (011)_\circ$, and here $x \notin L_w(A)$.
Consider another example, the word $v = 010110$. Here, $v \in L_w(A) \odot L_w(B)$, since $v$ is the concatenation of two words $x = 010$ and $y = 110$, where $x \in L_w(A)$ and $y \in L_w(B)$. Also, $v \in L_w(A) \cdot L_w(B)$, as it is a result of a two-wheel concatenation: $v \in x_\circ \cdot y_\circ$, where $x_\circ = (010)_\circ$ and $y_\circ = (110)_\circ$.
On the other hand, the word $z = 000001$ is neither in $L_w(A) \cdot L_w(B)$ nor in $L_w(A) \odot L_w(B)$, since none of its conjugates can be written as a concatenation of two words $x$ and $y$, where $x \in L_w(A)$ and $y \in L_w(B)$ simultaneously.
We can notice from these examples that there are some words in $L_w(A) \odot L_w(B)$ that are not in $L_w(A) \cdot L_w(B)$. Thus, the equality of these two concatenations does not hold in general.
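The membership claims of Example 4 can be verified with predicates instead of explicit (infinite) sets. This is a minimal sketch with hypothetical function names: a word is in the two-wheel concatenation if it splits into a prefix from the first language and a suffix from the second, and it is in the one-wheel concatenation if some conjugate of it splits that way.

```python
def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def in_lw_a(w: str) -> bool:
    """Words of length 3 with at least one 1 (cyclic closure of the first language)."""
    return len(w) == 3 and "1" in w


def in_lw_b(w: str) -> bool:
    """Words with at least one 1 (cyclic closure of the second language)."""
    return "1" in w


def in_two_wheel(w: str) -> bool:
    return any(in_lw_a(w[:i]) and in_lw_b(w[i:]) for i in range(len(w) + 1))


def in_one_wheel(w: str) -> bool:
    return any(in_two_wheel(z) for z in conjugates(w))


for word in ("000011", "010110", "000001"):
    print(word, in_one_wheel(word), in_two_wheel(word))
# 000011 True False
# 010110 True True
# 000001 False False
```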
Corollary 1.
The one-wheel concatenation of circular word languages is commutative, but the two-wheel concatenation is generally not.
Proof. 
For one-wheel concatenation, it comes directly from Proposition 3, i.e., the fact that this operation is commutative on circular words. On the other hand, let $L_1 = \{0, 1\}^*$ and $L_2 = \{a, b\}^*$. Indeed, these languages are circular word languages, and both are in $REG_w$. Let $x \in L_1$ and $y \in L_2$ such that neither of them is the empty word; then, $xy \in L_1 \cdot L_2$, but $xy \notin L_2 \cdot L_1$. Therefore, the two-wheel concatenation of these circular word languages is not commutative. □
Proposition 5.
The one-wheel concatenation of languages of circular words is distributive over the union and intersection.
Proof. 
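Let $L_1$, $L_2$ and $L_3$ be languages of circular words.
$$\begin{aligned} (L_1 \cup L_2) \odot L_3 &= \bigcup \{ x_\circ \odot y_\circ \mid x \in L_1 \cup L_2 \text{ and } y \in L_3 \} \\ &= \bigcup \{ x_\circ \odot y_\circ \mid (x \in L_1 \text{ or } x \in L_2) \text{ and } y \in L_3 \} \\ &= \bigcup \{ x_\circ \odot y_\circ \mid (x \in L_1 \text{ and } y \in L_3) \text{ or } (x \in L_2 \text{ and } y \in L_3) \} \\ &= \bigcup \{ x_\circ \odot y_\circ \mid x \in L_1 \text{ and } y \in L_3 \} \cup \bigcup \{ x_\circ \odot y_\circ \mid x \in L_2 \text{ and } y \in L_3 \} \\ &= (L_1 \odot L_3) \cup (L_2 \odot L_3) \end{aligned}$$
On the other hand,
$$\begin{aligned} (L_1 \cap L_2) \odot L_3 &= \bigcup \{ x_\circ \odot y_\circ \mid x \in L_1 \cap L_2 \text{ and } y \in L_3 \} \\ &= \bigcup \{ x_\circ \odot y_\circ \mid (x \in L_1 \text{ and } x \in L_2) \text{ and } y \in L_3 \} \\ &= \bigcup \{ x_\circ \odot y_\circ \mid (x \in L_1 \text{ and } y \in L_3) \text{ and } (x \in L_2 \text{ and } y \in L_3) \} \\ &= \bigcup \{ x_\circ \odot y_\circ \mid x \in L_1 \text{ and } y \in L_3 \} \cap \bigcup \{ x_\circ \odot y_\circ \mid x \in L_2 \text{ and } y \in L_3 \} \\ &= (L_1 \odot L_3) \cap (L_2 \odot L_3) \end{aligned}$$
These two properties are known as the left-distributive property of one-wheel concatenation over the union and intersection, respectively. As the one-wheel concatenation is commutative (Corollary 1), the right-distributive properties follow immediately:
$$L_1 \odot (L_2 \cup L_3) = (L_1 \odot L_2) \cup (L_1 \odot L_3) \quad \text{and} \quad L_1 \odot (L_2 \cap L_3) = (L_1 \odot L_2) \cap (L_1 \odot L_3). \ \square$$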
Corollary 2.
Let $L_1$ and $L_2$ be circular word languages; then, $L_1 \cdot L_2 \subseteq L_1 \odot L_2$.
Proof. 
This is a direct consequence of Lemma 1. □
Proposition 6.
Let $L_1$ and $L_2$ be two languages of circular words such that $x_\circ \cdot y_\circ = x_\circ \odot y_\circ$ for all $x \in L_1$ and $y \in L_2$. Then, $L_1 \cdot L_2 = L_1 \odot L_2$.
Proof. 
Let $u \in L_1 \odot L_2$; then, there exists a word $z$ such that $u \in z_\circ \subseteq x_\circ \odot y_\circ$, where $x \in L_1$ and $y \in L_2$. If $x_\circ \cdot y_\circ = x_\circ \odot y_\circ$, then $u \in x_\circ \cdot y_\circ$. Therefore, $u \in L_1 \cdot L_2$ and $L_1 \odot L_2 \subseteq L_1 \cdot L_2$. The other direction comes from Corollary 2. Thus, the equality holds. □
In fact, the previous result can be applied for unary languages (languages over a one-letter alphabet) or languages where one of the languages could contain only the empty word.
Proposition 7.
The two-wheel concatenation of circular word languages is associative, but the one-wheel concatenation is generally not.
Proof. 
Let $L_1$, $L_2$ and $L_3$ be three circular word languages. On the one hand, trivially, we have $(L_1 \cdot L_2) \cdot L_3 = L_1 \cdot (L_2 \cdot L_3)$. On the other hand, taking, e.g., the three circular words from the proof of Proposition 3 as three circular word languages, the non-associativity result is inherited by languages: $(L_1 \odot L_2) \odot L_3 \neq L_1 \odot (L_2 \odot L_3)$. □
We recall here that in [17], a third type of concatenation was defined for circular words and circular word languages; the shuffle ◊ of two circular words $x_\circ$ and $y_\circ$, denoted by $x_\circ \diamond y_\circ$, is:
$$x_\circ \diamond y_\circ = \bigcup_{u \in x_\circ,\ v \in y_\circ} u \sqcup\!\sqcup v$$
based on the shuffle operator of words. This operation was shown to be both commutative and associative on circular words. However, this would lead to more restricted classes of languages than the classes we intend to work with here.
Proposition 8.
The shuffle of circular words is the same as their one-wheel concatenation if one of the words has a length of at most 1.
Proof. 
If one of the words is the empty circular word, then trivially both operations result in the other circular word. Now, on the other hand, if one of the words has length 1, then both operations allow it to be inserted anywhere in the other circular word (i.e., into anywhere in any of its conjugates). □
The previous proposition may allow us to build circular word languages by one-wheel concatenation in such a way that the shuffle operation can also be simulated. However, in this paper, we use another approach.

5. Closure Properties and o -Regular Expressions

While some of the closure properties seem to be trivial, in many cases, one needs some extra care. For this reason, we start the section with some examples.
Example 5.
Consider the regular languages $L = \{01, 100, 101\}$, $L' = \{10, 11\}$ and $L'' = \{11, 100, 110\}$ over the alphabet $\Sigma = \{0, 1\}$.
The corresponding circular word languages that are weakly accepted by the automata recognizing $L$, $L'$ and $L''$, respectively, are $L_w = \{01, 10, 100, 001, 010, 101, 011, 110\}$, $L'_w = \{10, 01, 11\}$ and $L''_w = \{11, 100, 010, 001, 110, 101, 011\}$. Notice that the intersection of $L$ and $L'$ is the empty set. However, $L_w \cap L'_w = \{01, 10\}$. Similarly, $L \cap L'' = \{100\}$ and $L' \cap L'' = \{11\}$, but $L_w \cap L''_w = \{100, 010, 001, 110, 101, 011\}$ and $L'_w \cap L''_w = \{11\}$; thus, $L'_w \cap L''_w = (L' \cap L'')_w$ holds in the latter case, but such an equality does not hold in general. Furthermore, the word 001 is in the complement of $L$ while it is not in the complement of $L_w$; thus, the complements of $L$ and $L_w$ differ, and moreover, the complement of $L_w$ is not the cyclic closure of the complement of $L$.
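The computations of Example 5 can be reproduced directly for these finite languages. In the sketch below, the names `L1`, `L2`, `L3` stand for the three languages of the example in the order given, and `cyclic_closure` is a hypothetical helper.

```python
def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def cyclic_closure(language: set[str]) -> set[str]:
    return {z for w in language for z in conjugates(w)}


L1, L2, L3 = {"01", "100", "101"}, {"10", "11"}, {"11", "100", "110"}
W1, W2, W3 = cyclic_closure(L1), cyclic_closure(L2), cyclic_closure(L3)

print(L1 & L2, sorted(W1 & W2))          # set() ['01', '10']
print(W1 & W3 == cyclic_closure(L1 & L3))  # False: closure and intersection do not commute
print(W2 & W3 == cyclic_closure(L2 & L3))  # True in this particular case
print("001" in L1, "001" in W1)          # False True
```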
Theorem 5.
$REG_w$ is closed under union, intersection, complementation and reversal.
Proof. 
First of all, it is obvious that for any languages of circular words, their union, intersection, complement and reversal are also circular word languages: if they contain a word, they must contain all of its conjugates. The rest of the proof is constructive for each operation. Let $A$ and $B$ be two finite automata that weakly accept $L_w(A)$ and $L_w(B)$, respectively. The union case is trivial, as we can construct an NFA $A'$ in the same way as is generally done for the union of two regular languages, such that $A'$ weakly accepts the regular language of circular words $L_w(A) \cup L_w(B)$.
For the intersection and complement, however, based on Theorem 1, we first need to construct $A_1 = (Q_A, \Sigma, q_{0A}, \delta_A, F_A)$ and $B_1 = (Q_B, \Sigma, q_{0B}, \delta_B, F_B)$, two completely defined deterministic finite automata that accept the languages $L_w(A)$ and $L_w(B)$, respectively, in the traditional manner. Now, for the intersection, we can construct $A' = (Q_A \times Q_B, \Sigma, (q_{0A}, q_{0B}), \delta_{A'}, F_A \times F_B)$, which accepts the language $L_w(A) \cap L_w(B)$ in the traditional manner, and so, in a weak manner as well, where, for $a \in \Sigma$ and $q = (q_A, q_B) \in Q_A \times Q_B$, $\delta_{A'}(q, a) = (\delta_A(q_A, a), \delta_B(q_B, a))$. Thus, $A'$ accepts exactly those words and, thus, all circular words that are accepted by both automata $A_1$ and $B_1$ while simulating the work of these two automata in parallel.
For the complement of the language $L_w(A)$, we again use $A_1$; as $L(A_1) = L_w(A)$, their complements also match, i.e., $\overline{L(A_1)} = \overline{L_w(A)}$. The regular language of circular words not accepted by $A_1$ is accepted (and also weakly accepted) by $A' = (Q_A, \Sigma, q_{0A}, \delta_A, Q_A \setminus F_A)$.
For the reversal of the regular language $L_w(B) = L(B_1)$, the same construction works as for the reversal of an arbitrary regular language. E.g., based on $B_1$, let $B' = (Q_B \cup \{s\}, \Sigma, s, \delta_{B'}, \{q_{0B}\})$, which accepts the language $L_w^R(B)$; this is simply achieved by reversing all the transitions of $B_1$, making the start state $q_{0B}$ of $B_1$ the only accepting state of $B'$ and creating a new initial state $s \notin Q_B$ from which the accepting states of $B_1$ can be reached by $\lambda$-transitions. Notice that in this case, a similar construction starting from $B$ also suffices. □
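The product and complement constructions used in the proof are the standard ones for complete DFAs, and a compact sketch may make them concrete. The `DFA` class, the function names and the two sample automata below are illustrative assumptions, not objects taken from the paper; both sample languages are closed under cyclic shift.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: frozenset
    alphabet: frozenset
    start: object
    delta: dict          # total map (state, letter) -> state
    finals: frozenset

    def accepts(self, word: str) -> bool:
        q = self.start
        for letter in word:
            q = self.delta[(q, letter)]
        return q in self.finals


def product_dfa(a: DFA, b: DFA) -> DFA:
    """Run both automata in parallel; accept when both accept (intersection)."""
    states = frozenset((p, q) for p in a.states for q in b.states)
    delta = {((p, q), c): (a.delta[(p, c)], b.delta[(q, c)])
             for (p, q) in states for c in a.alphabet}
    finals = frozenset((p, q) for p in a.finals for q in b.finals)
    return DFA(states, a.alphabet, (a.start, b.start), delta, finals)


def complement_dfa(a: DFA) -> DFA:
    """Swap accepting and non-accepting states of a complete DFA."""
    return DFA(a.states, a.alphabet, a.start, a.delta, a.states - a.finals)


sigma = frozenset("01")
# Words with at least one 1.
has_one = DFA(frozenset({0, 1}), sigma, 0,
              {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 1},
              frozenset({1}))
# Words of even length.
even_len = DFA(frozenset({"e", "o"}), sigma, "e",
               {("e", "0"): "o", ("e", "1"): "o", ("o", "0"): "e", ("o", "1"): "e"},
               frozenset({"e"}))

both = product_dfa(has_one, even_len)
print(both.accepts("0100"), both.accepts("010"))   # True False
print(complement_dfa(has_one).accepts("000"))      # True
```

Completeness of the transition function matters here: the complement construction is only correct when every state has a transition on every letter.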
Theorem 6.
$REG_w$ is closed under one-wheel concatenation.
Proof. 
Based on Theorem 1, one may assume that there are two completely defined deterministic finite automata $A = (Q_A, \Sigma, q_{0A}, \delta_A, F_A)$ and $B = (Q_B, \Sigma, q_{0B}, \delta_B, F_B)$ that accept the languages $L_w(A)$ and $L_w(B)$, respectively, in the traditional manner, i.e., $L_w(A) = L(A)$ and $L_w(B) = L(B)$. Then, by the usual method, connecting the accepting states of the automaton $A$ to the initial state of the automaton $B$ by $\lambda$-transitions and keeping only the accepting states of $B$, an NFA $A' = (Q_A \cup Q_B, \Sigma, q_{0A}, \delta_{A'}, F_B)$ can be obtained that recognizes the language $L(A) \cdot L(B)$, where for $a \in \Sigma \cup \{\lambda\}$ and $q \in Q_A \cup Q_B$,
$$\delta_{A'}(q, a) = \begin{cases} \delta_A(q, a), & \text{if } q \in Q_A \\ q_{0B}, & \text{if } q \in F_A \text{ and } a = \lambda \\ \delta_B(q, a), & \text{if } q \in Q_B \end{cases}$$
Now, $A'$ accepts the two-wheel concatenation of $L_w(A)$ and $L_w(B)$. On the other hand, the cyclic closure of this language, i.e., the circular word language weakly accepted by $A'$, is exactly the one-wheel concatenation of these two languages, i.e., $L_w(A') = L_w(A) \odot L_w(B)$, completing the proof. □
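The key identity at the end of this proof, that the cyclic closure of the ordinary concatenation of two circular word languages equals their one-wheel concatenation, can be checked on small finite languages. The sketch below uses hypothetical helper names and two arbitrarily chosen sample languages; it illustrates the identity on one instance only.

```python
def conjugates(w: str) -> set[str]:
    """All cyclic shifts of w; the empty word is its own only conjugate."""
    return {w[i:] + w[:i] for i in range(len(w))} or {""}


def cyclic_closure(language: set[str]) -> set[str]:
    return {z for w in language for z in conjugates(w)}


def concat(l1: set[str], l2: set[str]) -> set[str]:
    """Ordinary (two-wheel) concatenation of two languages."""
    return {x + y for x in l1 for y in l2}


def one_wheel_lang(l1: set[str], l2: set[str]) -> set[str]:
    """Union of the one-wheel concatenations of all pairs of circular words."""
    return set().union(*(cyclic_closure(concat(conjugates(x), conjugates(y)))
                         for x in l1 for y in l2))


L1 = cyclic_closure({"ab", "b"})    # {'ab', 'ba', 'b'}
L2 = cyclic_closure({"0", "01"})    # {'0', '01', '10'}

print(cyclic_closure(concat(L1, L2)) == one_wheel_lang(L1, L2))   # True
print(concat(L1, L2) == one_wheel_lang(L1, L2))                   # False
```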
Theorem 7.
R E G w is not closed under two-wheel concatenation, Kleene star and Kleene plus.
Proof. 
Each of these proofs goes by a counterexample. Consider $L = \{aaa, bbb\}$ over the binary alphabet. Clearly, this language is in $REG_w$. On the other hand, all the languages $L \cdot L$, $L^*$ and $L^+$ contain the word $aaabbb$, but none of them contains its conjugate $aabbba$; hence, none of the resulting languages is a circular word language. This implies that none of $L \cdot L$, $L^*$ and $L^+$ is in $REG_w$. □
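The counterexample is small enough to verify mechanically. The sketch below is an illustration only; it uses Python's standard `re` module as a stand-in for membership in the Kleene star and Kleene plus of the language, and the variable names are hypothetical.

```python
import re


def conjugates(word):
    """Return all cyclic rotations (conjugates) of a linear word."""
    if not word:
        return {word}
    return {word[i:] + word[:i] for i in range(len(word))}


L = {"aaa", "bbb"}
LL = {u + v for u in L for v in L}        # the concatenation L.L
star = re.compile(r"(?:aaa|bbb)*")         # membership test for L*
plus = re.compile(r"(?:aaa|bbb)+")         # membership test for L+

word, conj = "aaabbb", "aabbba"            # conj is a rotation of word

# For each language: (is word a member?, is its conjugate a member?)
checks = {
    "L.L": (word in LL, conj in LL),
    "L*": (bool(star.fullmatch(word)), bool(star.fullmatch(conj))),
    "L+": (bool(plus.fullmatch(word)), bool(plus.fullmatch(conj))),
}
```

Every entry of `checks` is `(True, False)`: the word belongs to the language, but its conjugate does not.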
Theorem 8.
$REG_w = REG_s$.
Proof. 
For any language $L_w(A)$ based on a finite automaton $A$, we can construct a finite automaton $A'$ (based on Theorem 1) such that $L_w(A') = L_w(A)$, and all the conjugates of any circular word of $L_w(A)$ are in $L(A')$, i.e., $L_w(A) = L(A')$. Thus, $L_w(A) = L_s(A')$, proving that any language in $REG_w$ is also in $REG_s$.
On the other hand, for any language $L_s(A)$ in $REG_s$, as $L_s(A)$ is also regular by Theorem 2, we can also construct a finite automaton $A'$ such that $L_s(A) = L(A')$, where $L_s(A') = L_w(A') = L(A')$. Thus, $L_s(A) = L_w(A')$, proving that any language in $REG_s$ is also in $REG_w$. Therefore, the equality of $REG_w$ and $REG_s$ holds. □
Obviously, the closure properties proven for $REG_w$ in Theorems 5–7 apply to $REG_s$ as well. In the sequel, we still denote this class of languages by $REG_w$, but one should keep in mind that, in fact, this is the same set of languages as $REG_s$.
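Note that the equality concerns the two classes, not the two languages of a fixed automaton. As a minimal sketch, assuming a hypothetical complete DFA for $(ab)^*$ (the transition table and all names below are illustrative and not taken from the paper), the weakly and strongly accepted words up to length 4 can be enumerated as follows.

```python
from itertools import product

# Hypothetical complete DFA for (ab)*:
# state 0 is initial and accepting, state 2 is a non-accepting sink.
DELTA = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 2, (1, "b"): 0,
    (2, "a"): 2, (2, "b"): 2,
}
START, ACCEPTING = 0, {0}


def accepts(word):
    """Run the DFA on a linear word in the traditional manner."""
    state = START
    for letter in word:
        state = DELTA[(state, letter)]
    return state in ACCEPTING


def conjugates(word):
    """Return all cyclic rotations (conjugates) of a linear word."""
    if not word:
        return {word}
    return {word[i:] + word[:i] for i in range(len(word))}


def weakly_accepted(word):
    """At least one conjugate is accepted by the automaton."""
    return any(accepts(c) for c in conjugates(word))


def strongly_accepted(word):
    """All conjugates are accepted by the automaton."""
    return all(accepts(c) for c in conjugates(word))


# All words over {a, b} of length at most 4.
words = ["".join(p) for n in range(5) for p in product("ab", repeat=n)]
weak = {w for w in words if weakly_accepted(w)}
strong = {w for w in words if strongly_accepted(w)}
```

For this automaton, `weak` is `{"", "ab", "ba", "abab", "baba"}` and `strong` is `{""}`. The two sets differ, yet each is closed under conjugation, so each is the linearization of a circular word language.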
Finally, we turn to the last part of our results: describing circular word languages by regular-like expressions. If the shuffle of cyclic words is used for concatenation and we define its closure analogously to the Kleene star for traditional concatenation [17], we may start to build regular-like expressions based on the circular words $\lambda$ and $a$ for $a \in \Sigma$. With shuffle, union and the closure of shuffle, we obtain languages that are closed not only under cyclic permutation but also under commutative closure. In this way, the commutative closures of regular languages are obtained, and thus, they may fail to be regular or even context-free. On the other hand, as these languages are closed under commutative closure, the order of the letters does not play any role; thus, instead of their usual linearizations, we may work with them as multiset languages. The class of these languages is very restricted, as even some of our finite circular word languages are not included there; e.g., $\{a, aa, aabb, abba, bbaa, baab\}$ is not in this class, since the word $abab$ is missing, although any commutative language containing $aabb$ must contain it.
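The finite example above can be checked directly. The following sketch is illustrative only (the helper names are hypothetical): it confirms that the language is closed under conjugation and lists the words that its commutative closure would add.

```python
from itertools import permutations


def conjugates(word):
    """Return all cyclic rotations (conjugates) of a linear word."""
    if not word:
        return {word}
    return {word[i:] + word[:i] for i in range(len(word))}


def commutative_closure(language):
    """Add every rearrangement of the letters of every word."""
    return {"".join(p) for w in language for p in permutations(w)}


L = {"a", "aa", "aabb", "abba", "bbaa", "baab"}

# L is the linearization of a circular word language ...
closed_under_conjugation = all(conjugates(w) <= L for w in L)

# ... but it is not commutative: these rearrangements are absent.
missing = commutative_closure(L) - L
```

Here `closed_under_conjugation` is `True` and `missing` is `{"abab", "baba"}`.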
Remember that for any language $L$ of circular words, the cyclic closure operation leaves the language unchanged, i.e., the cyclic closure of $L$ equals $L$. (Actually, this is an if and only if condition for our circular word languages.)
Now, we may build regular-like expressions; we call them $\circ$-regular expressions (circle-regular expressions), using the traditional regular operations and this cyclic closure operation.
Definition 8.
Let an alphabet $\Sigma$ be given. The symbol $\lambda$ and each letter of $\Sigma$ are $\circ$-regular expressions. They represent the singleton languages, as in the case of regular expressions. Let $g$ and $h$ be two $\circ$-regular expressions; then, $(g + h)$, $(g \cdot h)$, $g^*$ and the cyclic closure of $g$ are also $\circ$-regular expressions. They represent the regular languages obtained by the used regular operation and by the cyclic closure, respectively.
Theorem 9.
If a $\circ$-regular expression has the cyclic closure as the main operation, then the represented language is a language of circular words; moreover, it is in $REG_w$. On the other hand, if $L \in REG_w$, then there is a $\circ$-regular expression with the above property that represents $L$.
Proof. 
By the definition of weakly accepted regular circular word languages, it is clear that for any regular language (accepted by an automaton, let us say, $A$), its cyclic closure is exactly $L_w(A)$. In the other direction, if a language $L \in REG_w$, then based on the automaton $A$ weakly accepting $L$, a regular expression can be given for $L(A)$, and applying the cyclic closure to this expression as the main operation, a $\circ$-regular expression is obtained describing $L$. □
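To make the semantics of such expressions concrete, the following sketch evaluates an expression up to a length bound. It is an illustration under stated assumptions, not the authors' construction: expressions are encoded as nested tuples with the hypothetical tags `"eps"`, `"sym"`, `"union"`, `"cat"`, `"star"` and `"circ"` (the last one standing for cyclic closure), and only words of length at most `n` are produced.

```python
def conjugates(word):
    """Return all cyclic rotations (conjugates) of a linear word."""
    if not word:
        return {word}
    return {word[i:] + word[:i] for i in range(len(word))}


def language(expr, n):
    """Return the words of length <= n described by a tuple-encoded expression."""
    op = expr[0]
    if op == "eps":
        return {""}
    if op == "sym":
        return {expr[1]} if n >= 1 else set()
    if op == "union":
        return language(expr[1], n) | language(expr[2], n)
    if op == "cat":
        left, right = language(expr[1], n), language(expr[2], n)
        return {u + v for u in left for v in right if len(u) + len(v) <= n}
    if op == "star":
        base = language(expr[1], n) - {""}
        result, frontier = {""}, {""}
        while frontier:
            # Extend the newest words by one more factor, within the bound.
            frontier = {u + v for u in frontier for v in base
                        if len(u) + len(v) <= n} - result
            result |= frontier
        return result
    if op == "circ":
        # Rotations preserve length, so the bound n is respected.
        return {c for w in language(expr[1], n) for c in conjugates(w)}
    raise ValueError(f"unknown operation: {op}")


a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")
g = ("cat", ("star", ("cat", a, b)), c)  # the regular expression (ab)*c
expr = ("circ", g)                       # cyclic closure as the main operation
```

With the bound `n = 3`, `language(g, 3)` gives `{"c", "abc"}`, while `language(expr, 3)` gives `{"c", "abc", "bca", "cab"}`, which is closed under conjugation.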
Further, based on the union normal form for regular expressions, see, e.g., [19,20], we have a more specific form of the $\circ$-regular expressions to describe languages $L \in REG_w$.
Theorem 10.
For any nonempty language of $REG_w$, there is a $\circ$-regular expression that, in its tree form, has the cyclic closure operation at the root, followed by the union operation $+$ with some (maybe more than two) children, only concatenation $\cdot$ and Kleene star $^*$ at the lower levels, and the letters of the alphabet as leaves.
Proof. 
Let $L \in REG_w$. Transforming a regular expression $g$ that describes the language $L$ to union normal form and applying the cyclic closure as the main operation, which leaves $L$ unchanged, provides the solution. □
On the other hand, as all languages in $REG_w$ are regular, they can also be described by (ordinary) regular expressions. However, $\circ$-regular expressions may be used to write these descriptions in a more concise way, and thus, one can benefit from $\circ$-regular expressions from a descriptive point of view.

6. Discussion and Concluding Remarks

Circular words play important roles in describing natural circular DNA structures, various mathematical and combinatorial phenomena, etc. In this paper, we have defined two classes of regular languages of circular words with very different approaches. The definitions are based on finite automata and, in particular, on whether at least one or all conjugates of a word are accepted. We have also explored various types of concatenations, namely, one-wheel and two-wheel concatenations of circular words. The regular languages of weakly and strongly accepted circular words were proven to form the same class; moreover, this class contains only regular languages, i.e., the linearizations of these circular word languages are regular in the traditional sense. We have also proven some interesting closure properties of this class. Our work gives two different ways in which finite automata can be used to accept circular word languages, thus providing a well-known tool for researchers working with circular words (or objects that can be naturally represented by circular words). Further, we have defined $\circ$-regular expressions and characterized this class with these expressions. From a complexity point of view, it is an interesting question whether and when $\circ$-regular expressions can give a more compact description of languages in $REG_w$ than traditional regular expressions.

Author Contributions

Conceptualization, B.N. and B.A.; methodology, B.N. and B.A.; validation, B.A. and B.N.; formal analysis, B.A. and B.N.; investigation, B.A. and B.N.; writing—original draft preparation, B.A.; writing—review and editing, B.N.; visualization, B.A.; supervision, B.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Acknowledgments

A preliminary version of the paper was presented at the conference MCU 2024. The authors gratefully acknowledge the comments of the audience, especially the comments and ideas by Sergey Verlan and Jérôme Durand-Lose.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Lothaire, M. Combinatorics on Words, 2nd ed.; Cambridge University Press: Cambridge, UK, 1997.
  2. Thue, A. Über unendliche Zeichenreihen. Kra. Vidensk. Selsk. Skr. I. Mat. Nat. Kl. 1906, 7, 1–22.
  3. Thue, A. Über Die Gegenseitige Lage Gleicher Teile Gewisser Zeichenreihen; Kra. Vidensk. Christiana Videnskabs-Selskabs Skrifter, I. Math.-Naturv. Klasse; Jacob Dybwad: Oslo, Norway, 1912; Volume 46, pp. 1–67.
  4. Sawada, J.; Williams, A.; Wong, D. Generalizing the Classic Greedy and Necklace Constructions of de Bruijn Sequences and Universal Cycles. Electron. J. Comb. 2016, 23, 1–24.
  5. Smyth, B. Computing Patterns in Strings; Pearson Addison-Wesley: Boston, MA, USA, 2003.
  6. Hegedüs, L.; Nagy, B. On Periodic Properties of Circular Words. Discret. Math. 2016, 339, 1189–1197.
  7. Hegedüs, L.; Nagy, B. Periodicity of Circular Words. In WORDS 2013; TUCS Lecture Notes; TUCS: Turku, Finland, 2013; Volume 20, pp. 45–56.
  8. Hegedüs, L.; Nagy, B. Representations of Circular Words. In Proceedings of the AFL 2014: Automata and Formal Languages, Szeged, Hungary, 27–29 May 2014; Volume 151, pp. 261–270.
  9. Diekert, V.; Harju, T.; Nowotka, D. Factorizations of Cyclic Words. In Proceedings of the Workshop on Words and Automata at CSR, Saint Petersburg, Russia, 7 June 2006.
  10. Nowotka, D. Periodicity and Unbordered Factors of Words; TUCS Dissertations; TUCS: Turku, Finland, 2004; p. 50.
  11. Rittaud, B.; Vivier, L. Circular Words and Applications. In Proceedings of the 8th International Conference WORDS 2011, Prague, Czech Republic, 12–16 September 2011; Volume 63, pp. 31–36.
  12. Rittaud, B.; Vivier, L. Circular Words and Three Applications: Factors of the Fibonacci Word, F-Adic Numbers, and the Sequence 1, 5, 16, 45, 121, 320, …. Funct. Approx. Comment. Math. 2012, 47, 207–231.
  13. Van Lint, J.H. Introduction to Coding Theory. In Graduate Texts in Mathematics; Springer: Berlin/Heidelberg, Germany, 1998; Volume 86.
  14. Erickson, J. Algorithms, 1st ed.; University of Illinois at Urbana-Champaign: Champaign, IL, USA, 2019.
  15. Hopcroft, J.E.; Ullman, J.D. Introduction to Automata Theory, Languages and Computation; Addison-Wesley: Boston, MA, USA, 1979.
  16. Jirásková, G.; Okhotin, A. State Complexity of Cyclic Shift. RAIRO Theor. Inform. Appl. 2008, 42, 335–360.
  17. Kudlek, M. On Languages of Cyclic Words. In Aspects of Molecular Computing; Jonoska, N., Păun, G., Rozenberg, G., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2003; Volume 2950, pp. 278–288.
  18. Maslov, A.N. Estimates of the number of states of finite automata. Sov. Math. Dokl. 1970, 11, 1373–1375.
  19. Nagy, B. A Normal Form for Regular Expressions. In Proceedings of the Eighth International Conference on Developments in Language Theory, CDMTCS Research Reports CDMTCS-252, Supplemental Papers for DLT’04, Auckland, New Zealand, 13–17 December 2004.
  20. Nagy, B. Union-Freeness, Deterministic Union-Freeness and Union-Complexity (invited paper). In Proceedings of the DCFS 2019: Descriptional Complexity of Formal Systems; Hospodár, M., Jirásková, G., Konstantinidis, S., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2019; Volume 11612, pp. 46–56.
Figure 1. The circular word w obtained from the linear word w .
Figure 2. A two-wheel concatenation of two circular words obtaining a sequence of circular words.
Figure 3. A one-wheel concatenation of two circular words.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Abdallah, B.; Nagy, B. On Concatenations of Regular Circular Word Languages. Mathematics 2025, 13, 763. https://doi.org/10.3390/math13050763
