Permutation-Based Distances for Groups and Group-Valued Time Series

José M. Amigó; Roberto Dale

doi:10.3390/e27090913

Centro de Investigación Operativa, Universidad Miguel Hernández, 03202 Elche, Spain

* Author to whom correspondence should be addressed.

Entropy 2025, 27(9), 913; https://doi.org/10.3390/e27090913

This article belongs to the Special Issue Ordinal Patterns-Based Tools and Their Applications

Abstract

Permutations on a set, endowed with function composition, form a group called a symmetric group. In addition to their algebraic structure, symmetric groups have two metrics that are of particular interest to us here: the Cayley distance and the Kendall tau distance. The aim of this paper is to introduce the concept of distance in a general finite group based on them. The main tool that we use to this end is Cayley’s theorem, which states that any finite group is isomorphic to a subgroup of a certain symmetric group. We also discuss the advantages and disadvantages of these permutation-based distances compared to the conventional generator-based distances in finite groups. The reason why we are interested in distances on groups is that finite groups appear in symbolic representations of time series, most notably in the so-called ordinal representations, whose symbols are precisely permutations, usually called ordinal patterns in that context. The natural extension from groups to group-valued time series is also discussed, as well as how such metric tools can be applied in time series analysis. Both theory and applications are illustrated with examples and numerical simulations.

Keywords:

finite groups; permutations; ordinal patterns; transcripts; edit distance; Cayley and Kendall distances; Cayley’s theorem; algebraic representations; group-valued time series; time series analysis

1. Introduction

Symbolic representation of real-valued time series is a common and useful tool in data analysis, where numbers are replaced by discrete “symbols” in order to gain more tools and insights []. So to speak, symbolic representations coarse-grain the data in such a way that the information retained is sufficient for the purposes of the analysis. From a mathematical point of view, this technique consists of partitioning the state space, both in statistics and nonlinear methods. Traditional examples include binning and thresholding. More recently, Bandt and Pompe [] proposed to use ordinal patterns, which are the rank vectors of sliding windows along a time series, the size of the windows being the length of the ordinal patterns. Since then, ordinal representations, i.e., symbolic representations with ordinal patterns, have become a popular technique among data analysts. Common applications of ordinal patterns include classification using ordinal pattern-based indices [,,], discrimination of chaotic signals from white noise [,], characterization of dynamics and couplings [,,] and nonparametric tests of serial dependence [,], to mention a few. For general overviews, see [,,].

More importantly for the topic of this paper, ordinal patterns of any given length

L \geq 2

can be interpreted as permutations (i.e., bijections) on any set of L elements, say,

{1, 2, \dots, L}

. In fact, the Shannon entropy of a probability distribution of ordinal patterns is called permutation entropy [], and the same happens with any other entropic functional based on ordinal pattern probability distributions, e.g., divergence, mutual information, or statistical complexity. A potential advantage of viewing ordinal patterns of length L as permutations is that the latter build a group, namely, the symmetric group of degree L, denoted by

Sym (L)

, where the binary operation is function composition. In fact, the algebraic structure of

Sym (L)

provides additional leverage to ordinal representations that can be harnessed in time series analysis. An example of this is the concept of transcript introduced in [].

More generally, symbolic representations whose symbols are elements of a group are called algebraic representations, an ordinal representation being an algebraic representation with alphabet

Sym (L)

. Actually, most results for ordinal representations can be readily generalized to algebraic representations whose alphabets are any other finite group

G

. This is not surprising if no particular property of

Sym (L)

is used in a given proof or application. There may be another, more theoretical reason for this. According to Cayley’s theorem [], any finite group

G

is isomorphic to a subgroup of a symmetric group. This means that permutations are a sort of universal symbol for discretizing time series by means of group elements; a different question is whether such a “canonical” embedding is always the best option in practice.

This being the case, in this paper, we extend two distances in

Sym (L)

, namely, the Cayley distance and the Kendall tau distance (henceforth called Kendall distance), to arbitrary finite groups via Cayley’s theorem. A possible advantage of the distances proposed here compared to others (e.g., the conventional generator-based distances) is their expediency and acceptable computation time for groups of moderate cardinality, as happens in practice. By extension, we also discuss distances for group-valued time series, which include algebraic representations of time series. This issue arises naturally when comparing two time series to measure their “similarity” (think of classification or clustering) or studying coupled systems (think of different types of synchronization). The result is a suite of permutation-based (or ordinal pattern-based) distances for groups and group-valued time series.

In sum, this is a follow-up paper on the quest to exploit the algebraic structure of group-valued time series—a possibility rarely used in the literature. Remarkably, the Cayley and Kendall distances and, hence, their extensions to general groups, are actually norms of transcripts, which shows the potential of our algebraic approach. Since our interest in distances between group elements was motivated by the study of ordinal representations and transcripts, we will speak of both permutations and ordinal patterns.

To address the aforementioned topics, we begin in Section 2 by establishing the mathematical framework, which includes group actions and group representations. In particular, we will prove Cayley’s theorem and implement it in three different ways, one of them using transcripts. There and throughout this paper, our approach is formal, the theoretical concepts being illustrated with simple examples. Section 3 is dedicated to the symmetric group and its two standard metrics: the Cayley and Kendall distances. In Section 4, we transition from the symmetric group to general groups and propose a distance based on Cayley’s Theorem (Section 4.1). This distance is compared to the conventional string metric for finitely generated groups (Section 4.2) in Section 4.3. Possible extensions to distances between group-valued time series are discussed in Section 5 and illustrated with mathematical simulations in Section 6. This paper ends with the conclusions in Section 7.

2. Groups, Group Actions and Cayley’s Theorem

In this section, we set up the mathematical framework of this paper: group actions and group representations [,,].

Definition 1.

A group

(G, *)

is a nonempty set

G

endowed with a binary operation “*”, sometimes called composition law or product, satisfying the following properties.

(G1): Associativity: For all $a, b, c \in G$ , it is true that $(a * b) * c = a * (b * c)$ .
(G2): Identity element: There exists an element $e \in G$ , called the identity (or neutral) element, such that $a * e = e * a = a$ for all $a \in G$ .
(G3): Inverse element: For every $a \in G$ , there exists an element $a^{- 1} \in G$ , called the inverse element of a, such that $a * a^{- 1} = a^{- 1} * a = e$ .

It can be proved that the identity of a group and the inverse of each element are unique. Groups whose product is commutative (i.e.,

a * b = b * a

for all

a, b \in G

) are called commutative or abelian. Examples of abelian groups are the real numbers endowed with addition and the nonzero real numbers endowed with multiplication. The invertible n × n matrices with n ≥ 2 form a nonabelian group under matrix multiplication. If the binary operation is clear from the context, then

(G, *)

is shortened to

G

.

Definition 2.

If

(G, *)

is a group and S a nonempty set, then a left group action of

G

on S is a mapping

F : G \times S \to S

that satisfies the following two axioms:

(L1a): Identity: $F (e, s) = s$ for all $s \in S$ , where e is the identity element of $G$ .
(L2a): Compatibility: $F (a, F (b, s)) = F (a * b, s)$ for all $a, b \in G$ and $s \in S$ .

If F is a left action of

G

on S, we can define the function

F_{a} : = F (a, \cdot) : S \to S

, i.e.,

F_{a} (s) = F (a, s)

(1)

for each

a \in G

. For

F_{a}

, the axioms L1a and L2a read as follows:

(L1b): Identity: $F_{e}$ is the identity mapping $s \mapsto s$ for all $s \in S$ .
(L2b): Compatibility: $F_{a} \circ F_{b} = F_{a * b}$ for all $a, b \in G$ .

Lemma 1.

(i)

F_{a} : S \to S

is a bijection for each

a \in G

.

(ii): The set $\{F_{a} : S \to S : a \in G\}$ endowed with function composition is a group.

Proof.

(i) Since

F_{a}

is a mapping from S into itself, it suffices to prove that it has an inverse mapping. Indeed, for every

s \in S

,

F_{a}^{- 1} (s) = F_{a^{- 1}} (s) \in S

because

F_{a} (F_{a^{- 1}} (s)) = F_{a * a^{- 1}} (s) = F_{e} (s) = s

and, analogously,

F_{a^{- 1}} (F_{a} (s)) = F_{a^{- 1} * a} (s) = F_{e} (s) = s

by axioms L2b and L1b.

(ii): According to Definition 1, we have to prove three properties: (G1) associativity is a general property of the composition of functions; (G2) $F_{e}$ is the identity because of axiom L1b; (G3) for all mappings $F_{a}$ , ${(F_{a})}^{- 1} = F_{a^{- 1}}$ as in (i).

□

Bijections from a finite set S onto itself are called permutations. So, according to Lemma 1(i), the mappings

F_{a}

are permutations. The permutations on S, endowed with function composition, build a group called the symmetric group

Sym (S)

. In this paper, we consider only finite groups

G

and finite sets S, so, if

|S|

is the cardinality of S, then

|Sym (S)| = |S|!

. Since the properties of the permutations on S do not depend on S but only on

|S|

, we choose

S = {1, 2, \dots, |S|}

, unless otherwise stated, and also refer to

Sym (S)

as the symmetric group of degree

|S|

,

Sym (|S|)

. As a historical note, the symmetric group goes back to Évariste Galois (1811–1832) and his work on the resolution of algebraic equations by means of radicals.

Furthermore, Lemma 1(ii) states that the set of permutations

{F_{a} : S \to S : a \in G}

is a subgroup (of cardinality

|G|

) of

Sym (S)

. This result together with axiom L2b, which spells out that the mapping

Φ : a \mapsto F_{a}

preserves the algebraic structure of

G

, are merged in the following theorem.

Theorem 1.

Any (left) group action

F : G \times S \to S

of a group

G

on a finite set S defines a group homomorphism

Φ : a \mapsto F_{a}

from

G

into

Sym (S)

. Therefore, Φ is a representation of the group

G

by means of permutations

F_{a} : = F (a, \cdot) : S \to S

.

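In other words, whenever Φ is injective (i.e., the action is faithful, as is the case for the translations of the group on itself discussed below), every group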

G

is isomorphic to a subgroup

H

of

Sym (S)

, namely

H = Φ (G)

, hence,

|H| = |G|

. In this formulation, Theorem 1 is known as Cayley’s theorem. Therefore, we will call

Φ : G \to

Sym (S)

Cayley’s homomorphism and, abusing notation,

Φ : G \to

H

Cayley’s isomorphism. Below, we will discuss three different implementations of Cayley’s isomorphism.

To apply Theorem 1, label the elements of

G

with the conventional set

{1, 2, \dots, |G|}

. For every

a \in G

, let

F_{a} = (\begin{matrix} 1 & \dots & k & \dots & |G| \\ F_{a} (1) & \dots & F_{a} (k) & \dots & F_{a} (|G|) \end{matrix}) = (\begin{matrix} 1 & \dots & k & \dots & |G| \\ n_{1} & \dots & n_{k} & \dots & n_{|G|} \end{matrix})

(2)

be the matrix (or two-line) form of the permutation

F_{a}

, where

(n_{1}, \dots, n_{k}, \dots, n_{|G|})

is a shuffle of

(1, 2, \dots, |G|)

. Therefore, every element

a \in G

can be identified with the one-line form

(n_{1}, n_{2}, \dots, n_{|G|})

of

F_{a}

. In the numerical examples below, we will juxtapose the components of

(n_{1}, n_{2}, \dots, n_{|G|})

and drop the parentheses for a compact notation.

Remark 1.

In addition to left actions of a group

(G, *)

on a finite set S, there are also right actions

\tilde{F} : S \times G \to S

, defined by (R1a)

\tilde{F} (s, e) = s

for all

s \in S

, and (R2a)

\tilde{F} (\tilde{F} (s, a), b) = \tilde{F} (s, a * b)

, as well as the corresponding group homomorphism

a \to {\tilde{F}}_{a} : = \tilde{F} (\cdot, a)

from

G

to

Sym (S)

, such that (R1b)

{\tilde{F}}_{e}

is the identity map

s \mapsto s

for all

s \in S

, and (R2b)

{\tilde{F}}_{a} \circ {\tilde{F}}_{b} = {\tilde{F}}_{a * b}

for all

a, b \in G

. The difference between left and right actions is that in the function composition

F_{a} \circ F_{b} = F_{a * b}

(L2b),

F_{b}

acts first on

s \in S

and

F_{a}

second (as in the standard convention), whereas in

{\tilde{F}}_{a} \circ {\tilde{F}}_{b} = {\tilde{F}}_{a * b}

(R2b),

{\tilde{F}}_{a}

acts first on

s \in S

and

{\tilde{F}}_{b}

second. Henceforth, we only consider left actions because the binary operation of the symmetric group, the main character of this paper, is precisely function composition and so we can use the standard convention.

There is a particular case of Theorem 1 that is of special interest here, namely,

S = G

, i.e., when the group

G

acts on itself. In this particular case, we are going to highlight three implementations of Cayley’s isomorphism

Φ : G ∋ a \mapsto F_{a} \in Sym (G)

via left actions.

(A): Left translations: The mapping $(a, b) \mapsto Λ (a, b) = a * b$ is a left action of $G$ on itself, so

$Λ_{a} (b) = a * b$

(3)

is a permutation on $G$ for every $a \in G$ , called a left translation by a.
(B): Right translations: The mapping $(a, b) \mapsto R (a, b) = b * a^{- 1}$ is a left action of $G$ on itself, so

$R_{a} (b) = b * a^{- 1}$

(4)

is a permutation on $G$ for every $a \in G$ , called a right translation by a. Let us mention that the operation $R (a, b)$ is also called the transcription from the (source) symbol a to the (target) symbol b in []. Note that $Λ_{a} (b) = R_{b^{- 1}} (a)$ and $R_{a} (b) = Λ_{b} (a^{- 1})$ .
(C): Adjoint actions: The mapping $(a, b) \mapsto Ad (a, b) = a * b * a^{- 1}$ is a left action of $G$ on itself, so

${Ad}_{a} (b) = a * b * a^{- 1}$

(5)

is a permutation on $G$ for every $a \in G$ , called the adjoint action of a. Note that the mapping $a \mapsto {Ad}_{a}$ is injective only if the center of $G$ is trivial.

Comparing Equations (3)–(5), we conclude that the implementation (3) of Cayley’s isomorphism

Φ : a \mapsto F_{a}

is the most convenient in practice, since the (one-line form of the) permutations

Λ_{a} : b \mapsto a * b

can be read immediately row by row in the multiplication table of

G

. Indeed, if

\{a_{1}, a_{2}, \dots, a_{|G|}\}

is an enumeration of the elements of

G

, then

Λ_{a_{i}}

is the i-th row of the multiplication table

{(a_{i} * a_{j})}_{1 \leq i, j \leq |G|}

, i.e.,

Λ_{a_{i}} = (a_{i} * a_{1}, \dots, a_{i} * a_{j}, \dots, a_{i} * a_{|G|}) = (\begin{matrix} a_{1} & \dots & a_{j} & \dots & a_{|G|} \\ a_{i} * a_{1} & \dots & a_{i} * a_{j} & \dots & a_{i} * a_{|G|} \end{matrix}) .

(6)
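As a minimal sketch of Equation (6), the following Python snippet reads the left translations off the rows of a multiplication table and checks the compatibility axiom L2b. The function names `left_translations` and `compose`, the 0-based labels, and the choice of the Klein four-group as input are illustrative assumptions, not part of the original text.

```python
def left_translations(table):
    """Return the rows of a multiplication table as one-line permutations.

    table[i][j] is the label of a_i * a_j, with labels 0, ..., n-1.
    Row i is the one-line form of the left translation by a_i.
    """
    n = len(table)
    rows = [tuple(row) for row in table]
    for row in rows:
        # every row of a group table is a rearrangement of the labels
        assert sorted(row) == list(range(n))
    return rows


def compose(p, q):
    """(p o q)(j) = p(q(j)) for one-line tuples with 0-based entries."""
    return tuple(p[j] for j in q)


# Hypothetical input: the Klein four-group, whose product is bitwise XOR.
n = 4
klein = [[i ^ j for j in range(n)] for i in range(n)]
lam = left_translations(klein)

# Axiom L2b: Lambda_a o Lambda_b = Lambda_{a * b}
for a in range(n):
    for b in range(n):
        assert compose(lam[a], lam[b]) == lam[klein[a][b]]

print(lam)
```

The four rows are pairwise different, so the mapping from group elements to rows is injective for this input.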

Example 1.

Let

G = Sym (3)

. By Equation (3), the isomorphic copies

Λ_{r} \in Sym (G) = Sym (Sym (3)) = Sym (6)

of

r \in {123, 132, 213, 231, 312, 321}

are given by the rows of the “multiplication” table of

Sym (3)

,

(7)

where

r \circ s

stands for the composition of the permutation

r

that labels a row with the permutation

s

that labels a column. Therefore,

(8)

For example,

Λ_{231} : 123 \mapsto 231, 132 \mapsto 213, 213 \mapsto 321, 231 \mapsto 312, 312 \mapsto 123, 321 \mapsto 132,

or, in one-line form,

Λ_{231} = (231, 213, 321, 312, 123, 132)

. From

123^{- 1} = 123, 132^{- 1} = 132, 213^{- 1} = 213, 231^{- 1} = 312, 312^{- 1} = 231, 321^{- 1} = 321,

table (7) and Equation (4), we obtain similarly that the copies

R_{r} \in Sym (Sym (3))

of

r \in Sym (3)

via right translations are given by

(9)
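The left and right translations of Example 1 can be recomputed with a few lines of Python. This is only a sketch: the helper names (`compose`, `inverse`, `short`) and the dictionary layout are assumptions made for illustration, and the composition follows the convention that in r ∘ s the permutation s acts first.

```python
from itertools import permutations


def compose(r, s):
    """(r o s)(i) = r(s(i)) for one-line tuples with 1-based entries."""
    return tuple(r[k - 1] for k in s)


def inverse(r):
    """Inverse permutation of a one-line tuple with 1-based entries."""
    inv = [0] * len(r)
    for i, ri in enumerate(r, start=1):
        inv[ri - 1] = i
    return tuple(inv)


def short(r):
    """Juxtapose the entries, e.g. (2, 3, 1) -> '231'."""
    return "".join(map(str, r))


# Lexicographic enumeration: 123, 132, 213, 231, 312, 321
sym3 = list(permutations((1, 2, 3)))

# Left translations, Equation (3): Lambda_r(s) = r o s
left = {short(r): [short(compose(r, s)) for s in sym3] for r in sym3}

# Right translations, Equation (4): R_r(s) = s o r^{-1}
right = {short(r): [short(compose(s, inverse(r))) for s in sym3] for r in sym3}

for label in left:
    print(label, "Lambda:", left[label], "R:", right[label])
```

Each value of `left` is one row of the multiplication table, with the columns ordered as in `sym3`.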

Example 2.

Let

G = {θ^{0}, θ^{1}, θ^{2}, θ^{3}}

endowed with the product

θ^{i} * θ^{j} = θ^{j} * θ^{i} = θ^{i + j}

where, in this example, the exponents are taken modulo 4. Hence,

θ^{0}

is the identity and

{(θ^{i})}^{- 1} = θ^{4 - i}

. By definition,

G

is a cyclic group generated by the element

θ^{1}

. Alternatively,

G

can be identified with the additive group

{0, 1, 2, 3}

, where the sum is taken modulo 4.

(i) The four permutations

Λ_{θ^{i}} : θ^{j} \mapsto θ^{i} * θ^{j} = θ^{i + j}

, corresponding to Equation (3) under the isomorphism

Φ : θ^{i} \mapsto

Λ_{θ^{i}} \in Sym (G)

, are given in the following table:

(10)

So, for instance, the second row of this table spells out

Λ_{θ^{1}} : θ^{0} \mapsto θ^{1}, θ^{1} \mapsto θ^{2}, θ^{2} \mapsto θ^{3}, θ^{3} \mapsto θ^{0},

or

Λ_{θ^{1}} = (θ^{1}, θ^{2}, θ^{3}, θ^{0})

.

(ii) The four permutations

R_{θ^{i}} : θ^{j} \mapsto θ^{j} * {(θ^{i})}^{- 1} = θ^{j - i}

, corresponding to Equation (4), under the isomorphism

Φ : θ^{i} \mapsto

R_{θ^{i}} \in Sym (G)

, are given in the following table:

(11)

So, if in table (10),

Λ_{θ^{i + 1}}

is obtained from

Λ_{θ^{i}}

by a clockwise (negative) circular shift, in table (11), the circular shift to obtain

R_{θ^{i + 1}}

from

R_{θ^{i}}

is counterclockwise (positive).
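The two families of circular shifts in Example 2 can be sketched as follows, representing each element θ^i by its exponent i. The variable names and the list-of-exponents encoding are assumptions for illustration only.

```python
n = 4  # exponents are taken modulo 4

# One-line forms as lists of exponents: entry j is the exponent of the image of theta^j.
left = {i: [(i + j) % n for j in range(n)] for i in range(n)}   # theta^j -> theta^(i+j)
right = {i: [(j - i) % n for j in range(n)] for i in range(n)}  # theta^j -> theta^(j-i)

for i in range(n):
    print(f"i={i}  Lambda: {left[i]}  R: {right[i]}")

# Increasing i by one shifts the one-line form of Lambda circularly in one direction
# and the one-line form of R in the opposite direction.
for i in range(n - 1):
    assert left[i + 1] == left[i][1:] + left[i][:1]
    assert right[i + 1] == right[i][-1:] + right[i][:-1]

# Since the inverse of theta^i is theta^(4-i), R_{theta^i} equals Lambda_{theta^(4-i)}.
for i in range(n):
    assert right[i] == left[(n - i) % n]
```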

3. Ordinal Patterns and Distances

In the previous section, we focused on group actions and the embedding of a group in a symmetric group. What is still missing are metric tools that can further boost applications in the realm of group-valued time series. Since the motivation and objective of this paper are the applications of such tools to symbolic representations of time series via group elements, we begin this section by briefly explaining how such symbolic representations arise in time series analysis. The choice of ordinal patterns (or permutations) responds to the popularity of these symbols among time series analysts. Then, we introduce the concept of distance in the symmetric group and, in the next section, we do the same for general groups.

3.1. Ordinal Patterns

Symmetric groups have been very popular for symbolic representations since the concept of ordinal pattern was introduced in []. Given a real-valued time series

x = {(x_{t})}_{t \geq 0}

, an ordinal representation of x is a symbolic time series

{(r_{t})}_{t \geq 0}

whose alphabet is

Sym (L)

, the symmetric group of degree

L \geq 2

. How are the permutations

r_{t}

obtained from x? Let

x_{t}^{L} : = x_{t}, x_{t + 1}, \dots, x_{t + L - 1}

be a window (segment, sequence, block, …) of size L. Then,

r_{t} = (r_{1}, r_{2}, \dots, r_{L})

is the rank vector of

x_{t}^{L}

, that is,

(r_{1}, r_{2}, \dots, r_{L})

is the permutation of

{1, 2, \dots, L}

such that

x_{t + r_{1} - 1} < x_{t + r_{2} - 1} < \dots < x_{t + r_{L} - 1} .

(12)

In other words, the rank vector

r_{t}

is viewed as the one-line form of the permutation

1 \mapsto r_{1}

,

2 \mapsto r_{2}

, …,

L \mapsto r_{L}

, i.e.,

r_{t} (k) = r_{k}

for

1 \leq k \leq L

. As a matter of fact, any total ranking can be viewed as a permutation. In case of a tie

x_{i} = x_{j}

, one can apply the convention that

x_{i} < x_{j}

if

i < j

. Another possibility, preferable when ties are frequent, is to add small-amplitude noise to

x_{i}

and

x_{j}

to undo the tie. By way of illustration, if

L = 4

and

x_{t}^{L} = 2.1, 0.3, 1.5, 2.4

, then

r_{t} = (2, 3, 1, 4)

, or

r_{t} = 2314

for short.
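The construction of Equation (12) can be sketched in Python as follows. The function names `ordinal_pattern` and `ordinal_representation` are hypothetical; ties are resolved with the first convention mentioned above (the earlier entry counts as the smaller one), which a stable sort provides automatically, and the time delay is 1.

```python
def ordinal_pattern(window):
    """Return (r_1, ..., r_L), 1-based, such that
    window[r_1 - 1] < window[r_2 - 1] < ... < window[r_L - 1], as in Equation (12).
    """
    # sorted() is stable, so equal values keep their temporal order
    order = sorted(range(len(window)), key=lambda k: window[k])
    return tuple(k + 1 for k in order)


def ordinal_representation(x, L):
    """Ordinal patterns of all sliding windows of size L (time delay 1)."""
    return [ordinal_pattern(x[t:t + L]) for t in range(len(x) - L + 1)]


print(ordinal_pattern([2.1, 0.3, 1.5, 2.4]))
print(ordinal_representation([2.1, 0.3, 1.5, 2.4, 1.0], 3))
```

A series of length N yields N - L + 1 patterns, since the last L - 1 entries cannot start a complete window.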

In [], the permutations

r_{t}

were called order (or ordinal) patterns of length L, which is the usual name of the symbols

r_{t}

in time series analysis. In addition to the length L of the patterns, ordinal representations depend also on a second parameter: a possible time delay in Equation (12). In this paper, the time delay is set equal to 1 throughout.

As a side note, the concept of ordinal pattern has been generalized in several directions. Thus, it has been extended to multivariate time series in [,]. Spatial ordinal patterns were introduced in [] to analyze two-dimensional images and applied in [,] to distinguish textures.

3.2. Distances for Ordinal Patterns

In this section, we introduce the Cayley and Kendall distances for the symmetric group

Sym (L)

; see [] for a survey about distances on permutations. We first recall the concept of distance.

Definition 3.

Given a nonempty set S, a distance is a function

d : S \times S \to \mathbb{R}

that satisfies the following three axioms for all points

x, y, z \in S

.

(D1): Positivity: $d (x, y) \geq 0$ , and $d (x, y) = 0$ if and only if $x = y$ .
(D2): Symmetry: $d (x, y) = d (y, x)$ .
(D3): Triangular inequality: $d (x, z) \leq d (x, y) + d (y, z)$ .

Following the notation in Section 3.1 for ordinal patterns, the permutations of

Sym (L)

will be written in the one-line form

r = (r_{1}, r_{2}, \dots, r_{L})

(possibly shortened to

r_{1} r_{2} \dots r_{L}

in numerical examples), where

r (i) = r_{i}

. If, furthermore,

s = (s_{1}, s_{2}, \dots, s_{L}) \in Sym (L)

, then

r \circ s

is the usual function composition

(r \circ s) (i) = r (s (i))

, i.e.,

r \circ s = (r_{1}, \dots, r_{k}, \dots, r_{L}) \circ (s_{1}, \dots, s_{k}, \dots, s_{L}) = (r_{s_{1}}, \dots, r_{s_{k}}, \dots, r_{s_{L}}),

(13)

as exemplified in Equation (7) for

L = 3

. Due to the positivity and symmetry properties of a distance, the

L! \times L!

distance matrix

(d (r, s) : r, s \in Sym (L))

is symmetric, with 0’s along the diagonal.

If

{i_{1}, i_{2}, \dots, i_{m}} \subset {1, 2, \dots, L}

, then

(i_{1}, i_{2}, \dots, i_{m})

denotes the permutation

i_{1} \mapsto i_{2}, i_{2} \mapsto i_{3}, \dots, i_{m - 1} \mapsto i_{m}, i_{m} \mapsto i_{1},

(14)

called a cycle of length m,

1 \leq m \leq L

, or simply an m-cycle. The notation calls for a warning at this point: do not confuse the permutation

i_{1} i_{2} \dots i_{m} = (i_{1}, i_{2}, \dots, i_{m})

with the cycle

(i_{1}, i_{2}, \dots, i_{m})

. Every permutation can be written as a product of disjoint cycles, which is unique except for the order of the factors. For example, the cycle factorization of the permutation 426135 is

(14) (2) (365)

or

(14) (365)

if 1-cycles (“fixed elements”) are omitted.
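A cycle factorization in the sense of Equation (14) can be computed by following each element until it returns to its starting point. The sketch below assumes permutations stored as one-line tuples with 1-based entries; the function name `cycles` is an illustrative choice.

```python
def cycles(r):
    """Disjoint cycles of the one-line permutation r, 1-cycles included.

    Each cycle (i_1, ..., i_m) satisfies r(i_1) = i_2, ..., r(i_m) = i_1.
    """
    seen = set()
    result = []
    for start in range(1, len(r) + 1):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        nxt = r[start - 1]
        while nxt != start:
            cycle.append(nxt)
            seen.add(nxt)
            nxt = r[nxt - 1]
        result.append(tuple(cycle))
    return result


# The permutation 426135 maps 1 -> 4, 2 -> 2, 3 -> 6, 4 -> 1, 5 -> 3, 6 -> 5.
print(cycles((4, 2, 6, 1, 3, 5)))
```

Starting every cycle at its smallest element, as done here, fixes one representative among the equivalent ways of writing the same cycle.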

Cycles of length 2 are called transpositions. That is, a transposition is a permutation

t_{i j} \in Sym (L)

such that

t_{i j} (i) = j

,

t_{i j} (j) = i

, and

t_{i j} (k) = k

for all

k \neq i, j

. If

r = (r_{1}, \dots, r_{L})

, then

r \circ t_{i j} = (r_{1}, \dots, r_{i - 1}, r_{j}, r_{i + 1}, \dots, r_{j - 1}, r_{i}, r_{j + 1}, \dots, r_{L}) .

(15)

If

|i - j| = 1

, then

t_{i j}

is called an adjacent transposition. Unlike the factorization of permutations into disjoint cycles, the factorization of permutations into adjacent transpositions (and, hence, into transpositions) is not unique, although the minimal number of factors is. For example,

321 = (12) (23) (12) = (23) (12) (23)

.
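Equation (15) and the two factorizations of 321 can be verified numerically. The following sketch assumes one-line tuples with 1-based entries; `transposition` and `compose` are hypothetical helper names.

```python
def compose(r, s):
    """(r o s)(i) = r(s(i)) for one-line tuples with 1-based entries."""
    return tuple(r[k - 1] for k in s)


def transposition(i, j, L):
    """One-line form of t_ij in Sym(L): i -> j, j -> i, k -> k otherwise."""
    t = list(range(1, L + 1))
    t[i - 1], t[j - 1] = j, i
    return tuple(t)


# Equation (15): composing with t_ij on the right exchanges the entries at positions i and j.
r = (4, 6, 2, 5, 3, 1)
swapped = compose(r, transposition(1, 3, 6))
print(swapped)

# Two different minimal factorizations of 321 into adjacent transpositions.
t12, t23 = transposition(1, 2, 3), transposition(2, 3, 3)
first = compose(compose(t12, t23), t12)
second = compose(compose(t23, t12), t23)
print(first, second)
```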

Definition 4

([,]). Let

r, s \in Sym (L)

. (a) The Cayley distance between the two permutations

r

and

s

, denoted by

d_{C} (r, s)

, is defined as the minimum number of transpositions needed to transform

r

into

s

. (b) The Kendall distance (also known as the bubble-sort distance) between

r

and

s

, denoted by

d_{K} (r, s)

, is defined as the minimum number of adjacent transpositions needed to transform

r

into

s

.

The Cayley and Kendall distances are examples of edit distances between two strings of symbols, which measure the minimum cost of a sequence of allowed edit operations that transforms one string into the other. The use of edit distances to measure distance between permutations was proposed in []. Since every adjacent transposition is a transposition,

d_{C} (r, s) \leq d_{K} (r, s)

(16)

for all

r, s \in Sym (L)

.

The proofs of the positivity and symmetry (properties (D1) and (D2) in Definition 3) for

d_{C} (r, s)

and

d_{K} (r, s)

are straightforward. The triangular inequality can be easily proved by graph-based methods since the permutations of

Sym (L)

build a connected undirected graph where the nodes (or vertices) correspond to permutations and the links (or edges) to transpositions. For example, in the case of

d_{K} (r, s)

: (i) every node

r

is connected to exactly

L - 1

nearest neighbors, namely, those permutations that differ from

r

due to transpositions of the adjacent symbols

r_{i}, r_{i + 1}

for

1 \leq i \leq L - 1

, and, hence, (ii) for any two nearest nodes

u

and

v

,

d_{K} (u, v) = d_{K} (v, u) = 1

. Therefore,

d_{K} (r, s)

counts the number of links of the shortest path connecting the nodes

r

and

s

. In other words, each node has degree

L - 1

and all its nearest neighbors (one link apart) are at distance 1. The diameter of the graph, i.e., the largest distance between any two nodes, is attained for

r = (r_{1}, r_{2}, \dots, r_{L})

and the order reversing permutation

s = (r_{L}, r_{L - 1}, \dots, r_{1})

, hence

d_{K, max} (L) = (L - 1) + (L - 2) + \dots + 1 = \frac{L (L - 1)}{2} .

(17)

Such graphs are called adjacency graphs or networks.
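The shortest-path interpretation of the Kendall distance can be checked by a breadth-first search on the adjacency graph. The sketch below is an illustration under the assumption that nodes are one-line tuples and links join permutations differing by a swap of two adjacent entries; the function names are hypothetical.

```python
from collections import deque
from math import factorial


def kendall_neighbors(r):
    """Permutations obtained from r by exchanging two adjacent entries."""
    for i in range(len(r) - 1):
        s = list(r)
        s[i], s[i + 1] = s[i + 1], s[i]
        yield tuple(s)


def bfs_distances(source):
    """Number of links on a shortest path from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        r = queue.popleft()
        for s in kendall_neighbors(r):
            if s not in dist:
                dist[s] = dist[r] + 1
                queue.append(s)
    return dist


for L in (3, 4, 5):
    e = tuple(range(1, L + 1))
    dist = bfs_distances(e)
    assert len(dist) == factorial(L)          # the graph is connected
    farthest = max(dist, key=dist.get)
    print(L, dist[farthest], farthest)        # compare with L (L - 1) / 2
```

By left invariance of the distance, the largest distance from the identity coincides with the diameter of the graph.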

Figure 1 and Figure 2 show the adjacency graphs of the groups

Sym (3)

(a cycle in this case) and

Sym (4)

, respectively. Unlike the adjacency graphs for the Kendall distance, the adjacency graphs for the Cayley distance are in general nonplanar, i.e., they have edge crossings (even for

Sym (3)

), so we will not use them.

Figure 1. Kendall adjacency graph of

Sym (3)

. A link between two nodes means that the corresponding permutations differ by an adjacent transposition, i.e., the Kendall distance between them is 1.

Figure 2. Kendall adjacency graph of

Sym (4)

. A link between two permutations means that the Kendall distance between them is 1.

In the following, whenever convenient for economy of notation, we denote by

d_{C, K}

both the Cayley and Kendall distances.

Proposition 1

(Invariance of

d_{C, K}

under left translations). Given

r, s \in Sym (L)

, then

d_{C, K} (r, s) = d_{C, K} (u \circ r, u \circ s)

(18)

for all

u \in Sym (L)

.

Proof.

Suppose

d_{C, K} (r, s) = k

, i.e., k is the minimum number of transpositions or adjacent transpositions

t_{i_{1} j_{1}}, t_{i_{2} j_{2}}, \dots, t_{i_{k} j_{k}} \in Sym (L)

such that

r = (\dots ((s \circ t_{i_{1} j_{1}}) \circ t_{i_{2} j_{2}}) \circ \dots \circ t_{i_{k - 1} j_{k - 1}}) \circ t_{i_{k} j_{k}},

see Equation (15). Then,

u \circ r = (\dots ((u \circ s \circ t_{i_{1} j_{1}}) \circ t_{i_{2} j_{2}}) \circ \dots \circ t_{i_{k - 1} j_{k - 1}}) \circ t_{i_{k} j_{k}},

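which shows that

d_{C, K} (u \circ r, u \circ s) \leq k .

Applying the same argument to

u \circ r, u \circ s

and

u^{- 1}

gives the reverse inequality, hence equality holds.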

□

Since

d_{C, K} (r, s) = d_{C, K} (s, r)

, then

d_{C, K} (u \circ r, u \circ s) = d_{C, K} (u \circ s, u \circ r) .

Choose

u = r^{- 1}

or

u = s^{- 1}

in Equation (18) to prove:

Corollary 1.

For every

r, s \in Sym (L)

,

d_{C, K} (r, s) = d_{C, K} (e, r^{- 1} \circ s) = d_{C, K} (e, s^{- 1} \circ r),

(19)

where

e

is the identity permutation.

Remark 2.

Owing to Equation (19), all possible values of

d_{C, K} (r, s)

appear on the row

(d_{C, K} (e, u) : u \in Sym (L))

of the distance matrix.

Equation (19) allows us to define in

Sym (L)

an analogue to the concept of norm in a vector space.

Definition 5.

The norm

{∥\cdot∥}_{C, K}

of

r \in Sym (L)

is defined as

{∥r∥}_{C, K} = d_{C, K} (e, r) .

(20)

Then, by Equation (19),

d_{C, K} (r, s) = {∥r^{- 1} \circ s∥}_{C, K} = {∥s^{- 1} \circ r∥}_{C, K} .

(21)

Remark 3.

The right translation of

b \in G

by

a \in G

, or the transcript from (the source) a to (the target) b, was defined in Equation (4) as

R (a, b) = b * a^{- 1}

. In view of Equation (21), we conclude that the distance

d_{C, K} (r, s)

is the norm

{∥\cdot∥}_{C, K}

of the right translations or transcripts

R (s^{- 1}, r^{- 1}) = r^{- 1} \circ s

and

R (r^{- 1}, s^{- 1}) = s^{- 1} \circ r =

R {(s^{- 1}, r^{- 1})}^{- 1}

.

Corollary 1 is instrumental for the computation of the Cayley and Kendall distances [].

Proposition 2.

(a) Let

u = (u_{1}, \dots, u_{L}) \in Sym (L)

and

C (u)

the number of cycles (including 1-cycles) in the cycle factorization of the permutation

u

. Then,

d_{C} (r, s) = L - C (r^{- 1} \circ s) = L - C (s^{- 1} \circ r)

(22)

for all

r, s \in Sym (L)

. (b) Let

I (u)

be the number of inversions in the permutation

u

, i.e., the number of ordered pairs

(u_{i}, u_{j})

,

1 \leq i < j \leq L

, such that

u_{i} > u_{j}

. Then,

d_{K} (r, s) = I (r^{- 1} \circ s) = I (s^{- 1} \circ r)

(23)

for all

r, s \in Sym (L)

.
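Proposition 2 translates directly into code. The following sketch assumes one-line tuples with 1-based entries; the function names are illustrative, and the quadratic-time inversion count is chosen for clarity rather than speed. The final loop is a brute-force check of the left invariance (18) and of inequality (16) on Sym(4).

```python
from itertools import permutations


def compose(r, s):
    """(r o s)(i) = r(s(i)), Equation (13)."""
    return tuple(r[k - 1] for k in s)


def inverse(r):
    inv = [0] * len(r)
    for i, ri in enumerate(r, start=1):
        inv[ri - 1] = i
    return tuple(inv)


def num_cycles(u):
    """C(u): number of cycles of u, 1-cycles included."""
    seen, count = set(), 0
    for start in range(1, len(u) + 1):
        if start not in seen:
            count += 1
            k = start
            while k not in seen:
                seen.add(k)
                k = u[k - 1]
    return count


def num_inversions(u):
    """I(u): number of position pairs i < j with u_i > u_j."""
    return sum(u[i] > u[j] for i in range(len(u)) for j in range(i + 1, len(u)))


def cayley_distance(r, s):
    return len(r) - num_cycles(compose(inverse(r), s))      # Equation (22)


def kendall_distance(r, s):
    return num_inversions(compose(inverse(r), s))           # Equation (23)


sym4 = list(permutations(range(1, 5)))
for r in sym4:
    for s in sym4:
        assert cayley_distance(r, s) <= kendall_distance(r, s)
        for u in sym4:
            ur, us = compose(u, r), compose(u, s)
            assert cayley_distance(ur, us) == cayley_distance(r, s)
            assert kendall_distance(ur, us) == kendall_distance(r, s)
```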

From Equation (22), it follows that

d_{C} (r, s) \in \{0, 1, \dots, d_{C, max} (L)\}, where d_{C, max} (L) = L - 1,

(24)

and, according to Equation (17),

d_{K} (r, s) \in \{0, 1, \dots, d_{K, max} (L)\}, where d_{K, max} (L) = \frac{L (L - 1)}{2} .

(25)

Example 3.

We illustrate Proposition 2 with

L = 6

,

r = 462531

and

s = 236514

. Then,

s^{- 1} = {(\begin{matrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 6 & 5 & 1 & 4 \end{matrix})}^{- 1} = (\begin{matrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 5 & 1 & 2 & 6 & 4 & 3 \end{matrix}),

so that

s^{- 1} \circ r = 512643 \circ 462531 = 631425,

whose cycle factorization is

s^{- 1} \circ r = (16523) (4) .

According to Equation (22),

d_{C} (r, s) = L - C (s^{- 1} \circ r) = 6 - 2 = 4 .

(26)

As for Equation (23), the inversions of

s^{- 1} \circ r

are

\begin{matrix} (6, 3), (6, 1), (6, 4), (6, 2), (6, 5), \\ (3, 1), (3, 2), \\ (4, 2), \end{matrix}

so that,

d_{K} (r, s) = I (s^{- 1} \circ r) = 8 .

(27)

Let us check the results (26) and (27). First, the transpositions needed to transform

r

into

s

are the following:

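\begin{matrix} r = \mathbf{4} 6 \mathbf{2} 531 & \overset{(13)}{\longrightarrow} & 2 \mathbf{6} 45 \mathbf{3} 1 & \overset{(25)}{\longrightarrow} & 23 \mathbf{4} 5 \mathbf{6} 1 \\ & \overset{(35)}{\longrightarrow} & 2365 \mathbf{41} & \overset{(56)}{\longrightarrow} & 236514 = s \end{matrix}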

where the elements being swapped in each transposition have been boldfaced. Therefore,

d_{C} (r, s) = 4

. To check Equation (27), call

δ_{1}

the number of adjacent transpositions needed to move in

r

the symbol 2 (the first or leftmost symbol of the target

s

) to the first position; call

r^{(1)}

the result. Similarly, call

δ_{2}

the number of adjacent transpositions needed to move in

r^{(1)}

the symbol 3 (the second symbol of the target

s

) to the second position. Proceed analogously until

r^{(k)} = s

. The adjacent transpositions needed to transform

r

into

s

in this example are the following:

\begin{matrix} r = 46 \mathbf{2} 531 & \overset{δ_{1} = 2}{\longrightarrow} & r^{(1)} = 2465 \mathbf{3} 1 & \overset{δ_{2} = 3}{\longrightarrow} & r^{(2)} = 234 \mathbf{6} 51 & & \\ & \overset{δ_{3} = 1}{\longrightarrow} & r^{(3)} = 2364 \mathbf{5} 1 & \overset{δ_{4} = 1}{\longrightarrow} & r^{(4)} = 23654 \mathbf{1} & \overset{δ_{5} = 1}{\longrightarrow} & r^{(5)} = 236514 = s \end{matrix}

where the element of

r^{(i)}

(

r^{(0)} : = r

) being moved to the

(i + 1)

-position has been boldfaced. This shows that

d_{K} (r, s) = δ_{1} + \dots + δ_{5} = 8

.
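The step-by-step check of the Kendall distance in Example 3 can be automated. The sketch below moves, for k = 1, 2, ..., the k-th symbol of the target to position k and records how many adjacent swaps each move costs; the function name `adjacent_moves` is hypothetical, and the list operations stand in for the individual adjacent transpositions.

```python
def adjacent_moves(r, s):
    """Transform r into s by moving the k-th symbol of s to position k, for k = 1, 2, ...

    Returns the list of the numbers of adjacent swaps used at each step.
    """
    current = list(r)
    deltas = []
    for k, symbol in enumerate(s):
        pos = current.index(symbol)
        # moving the symbol from index pos down to index k takes pos - k adjacent swaps
        deltas.append(pos - k)
        current.insert(k, current.pop(pos))
    assert current == list(s)
    return deltas


r = (4, 6, 2, 5, 3, 1)
s = (2, 3, 6, 5, 1, 4)
deltas = adjacent_moves(r, s)
print(deltas, sum(deltas))
```

The last entry of `deltas` is always 0, because the final symbol is already in place once all the others are.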

Example 4.

According to [],

G = Sym (3)

is the most common ordinal representation in data analysis. The Cayley and Kendall distance matrices for the group

Sym (3)

, Equation (7), are shown in the tables

(28)

and

(29)

In agreement with Equations (16), (24) and (25),

d_{C} (r, s) \leq d_{K} (r, s)

for all

r, s \in Sym (3)

,

d_{C} (r, s) \in {0, 1, 2}

and

d_{K} (r, s) \in {0, 1, 2, 3}

.

Owing to their large size, the Cayley and Kendall distance matrices for

G = Sym (4)

have been moved to Appendix A. In this case,

d_{C} (r, s) \in {0, 1, 2, 3}

, and

d_{K} (r, s) \in {0, 1, 2, 3, 4, 5, 6}

. Needless to say, the distances

d_{K} (r, s)

in table (29) and Table A2 can be easily checked in the corresponding adjacency graphs, Figure 1 and Figure 2, where each link stands for distance 1.
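The distance matrices of Example 4 can be regenerated from Equations (22) and (23). The sketch below is self-contained and assumes the lexicographic enumeration 123, 132, 213, 231, 312, 321 of Sym(3) for both rows and columns; all function and variable names are illustrative.

```python
from itertools import permutations


def compose(r, s):
    return tuple(r[k - 1] for k in s)


def inverse(r):
    # position of the value v in r, for v = 1, ..., L
    return tuple(sorted(range(1, len(r) + 1), key=lambda i: r[i - 1]))


def cayley_norm(u):
    """L minus the number of cycles of u."""
    seen, n_cycles = set(), 0
    for start in range(1, len(u) + 1):
        if start not in seen:
            n_cycles += 1
            k = start
            while k not in seen:
                seen.add(k)
                k = u[k - 1]
    return len(u) - n_cycles


def kendall_norm(u):
    """Number of inversions of u."""
    return sum(u[i] > u[j] for i in range(len(u)) for j in range(i + 1, len(u)))


sym3 = list(permutations((1, 2, 3)))
labels = ["".join(map(str, r)) for r in sym3]
cayley = [[cayley_norm(compose(inverse(r), s)) for s in sym3] for r in sym3]
kendall = [[kendall_norm(compose(inverse(r), s)) for s in sym3] for r in sym3]

for name, matrix in (("Cayley", cayley), ("Kendall", kendall)):
    print(name)
    print("   ", *labels)
    for label, row in zip(labels, matrix):
        print(label, *(f"{d:>3}" for d in row))
```

Replacing `(1, 2, 3)` by `(1, 2, 3, 4)` produces the corresponding 24 × 24 matrices for Sym(4).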

4. Distances for General Groups

In the first part of this section, we harness Cayley’s theorem to transport the Cayley and Kendall distances in

Sym (L)

(or, for that matter, any distance defined in

Sym (L)

) to any finite group

(G, *)

with

|G| = L

. In the second part, we briefly introduce the distance with respect to a generating system. We also discuss the advantages of the first approach as compared to the second.

4.1. Permutation-Based Distance for Groups

Let

Φ : G \to H

be Cayley’s isomorphism, where

H

is a subgroup of

Sym (G)

(namely,

H =

Φ (G)

) with

|H| = |G|

. This means:

(i): $Φ (e)$ $= (1, 2, \dots, |G|)$ , where e is the identity of $G$ .
(ii): $Φ (a * b)$ $= Φ (a) \circ Φ (b)$ for all $a, b \in G$ . Hence, $Φ (a^{- 1}) = Φ {(a)}^{- 1}$ .

To endow

G

with a distance, we transport the distance

d_{C, K} (r, s)

from the group

Φ (G) \subset Sym (G)

to

G

and promote

Φ

to an isometry.

Definition 6.

Let Φ be the Cayley isomorphism for a finite group

G

. Then,

D_{C, K}^{(Φ)}

is the distance in

G

defined as

D_{C, K}^{(Φ)} (a, b) = d_{C, K} (Φ (a), Φ (b)) .

(30)

Therefore,

D_{C, K}^{(Φ)}

has the same properties as

d_{C, K}

. In particular:

Left invariance: By Equation (18),

$D_{C, K}^{(Φ)} (a, b) = D_{C, K}^{(Φ)} (c * a, c * b)$

(31)

for all $a, b, c \in G$ , hence,

$D_{C, K}^{(Φ)} (a, b) = D_{C, K}^{(Φ)} (e, a^{- 1} * b) = D_{C, K}^{(Φ)} (e, b^{- 1} * a),$

(32)

where e is the identity of $G$ .
Norm-based definition: By Equation (21),

$D_{C, K}^{(Φ)} (a, b) = {∥Φ {(a)}^{- 1} \circ Φ (b)∥}_{C, K} = {∥Φ {(b)}^{- 1} \circ Φ (a)∥}_{C, K},$

(33)

where ${∥\cdot∥}_{C, K}$ is the Cayley/Kendall norm in $Sym (G)$ , i.e.,

${∥r∥}_{C, K} = d_{C, K} (e, r)$

(34)

for all $r \in Sym (G)$ , $e$ being the identity of $Sym (G) .$

From Equations (16) and (30), it follows

D_{C}^{(Φ)} (a, b) \leq D_{K}^{(Φ)} (a, b)

(35)

for all

a, b \in G

, since

Φ (a), Φ (b) \in Sym (G)

. Furthermore, by Equation (24),

D_{C}^{(Φ)} (a, b) \in {0, 1, \dots, D_{C, max}^{(Φ)} (|G|)}, where D_{C, max}^{(Φ)} (|G|) = |G| - 1,

(36)

and, by Equation (25),

D_{K}^{(Φ)} (a, b) \in \{0, 1, \dots, D_{K, max}^{(Φ)} (|G|)\}, where D_{K, max}^{(Φ)} (|G|) = \frac{|G| (|G| - 1)}{2} .

(37)

Remark 4.

In the case

G = Sym (L)

of Section 3.2, the distances

d_{C, K} (r, s)

take on all integer values ranging from 0 to their respective maxima

d_{C, max} = L - 1

(Equation (24)), and

d_{K, max} = L (L - 1) / 2

(Equation (25)); think of the corresponding adjacency graphs. However, this does not happen with

D_{C, K}^{(Φ)} (a, b)

because

Φ (G)

is a subgroup of cardinality

|G|

of the group

Sym (G)

, whose cardinality is

|G|!

, so not all possible distances can be realized (unless

|G| = 2

). We call “forbidden distances for

D_{C, K}^{(Φ)}

” the values in

{0, 1, \dots, D_{C, K, max}^{(Φ)}}

that are missing in the adjacency subgraph of

Φ (G)

; otherwise, they are called allowed or admissible distances. By Equation (32) (or Remark 2), the admissible distances for

D_{C, K}^{(Φ)}

can be read in the row

(D_{C, K}^{(Φ)} (e, c) : c \in G)

of the distance matrix.

In general, the definition (30) depends on the implementation of Cayley’s isomorphism

Φ

, e.g., whether

Φ (a)

is (i) a left translation

Λ_{a}

(Equation (3)), (ii) a right translation

R_{a}

(Equation (4)), or (iii) an adjoint action (Equation (5)). For simplicity, we mainly use the implementation (i), so that

Λ_{a} (b)

can be read row-wise in the multiplication table of

G

(see Equation (6)), in which case we write

D_{C, K}^{(Λ)}

for

D_{C, K}^{(Φ)}

. In case (ii), we will write

D_{C, K}^{(R)}

.

Example 5.

The only non-cyclic group of order 4 is the Klein four-group

K

, defined by the multiplication table

(38)

so that

(39)

According to Equations (36) and (37),

D_{C}^{(Λ)} (r, s) \in {0, 1, 2, 3}

and

D_{K}^{(Λ)} (r, s) \in {0, 1, \dots, 6}

. From (39) it follows

(40)

so the forbidden values of

D_{C}^{(Λ)} (r, s)

are

{1, 3}

and the forbidden values of

D_{K}^{(Λ)} (r, s)

are

{1, 3, 5}

. Note that

K

is abelian (as any group whose cardinality is the square of a prime number) since the multiplication table in Equation (38) is symmetric and every element other than the identity has order 2, i.e., every element is its own inverse. Therefore,

R_{r} (s) = s * r^{- 1} = s * r = r * s = Λ_{r} (s),

i.e., the isomorphic copies

Λ_{r}, R_{r} \in Sym (K)

are the same for all

r \in K

, which implies

D_{C, K}^{(R)} = D_{C, K}^{(Λ)}

. Labeling the elements

e, a, b, c

as

1, 2, 3, 4

, one can locate the four copies

{Λ_{r} : r \in K}

of the group

K

in the Kendall adjacency graph of

Sym (4)

, Figure 2, and read there the distances in the right table of Equation (40). For example,

D_{K}^{(Λ)} (a, b) = d_{K} (Λ_{a}, Λ_{b}) = d_{K} (a e c b, b c e a) = d_{K} (2143, 3412) = 6 .
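The computations of Example 5 can be sketched as follows. The code assumes the standard realization of the Klein four-group as pairs of bits under componentwise XOR, with the hypothetical labeling e = (0, 0), a = (1, 0), b = (0, 1), c = (1, 1); the left translations are then read off as words over the alphabet e, a, b, c, and the admissible distances are collected from the row of the identity.

```python
ELEMS = ("e", "a", "b", "c")
BITS = {"e": (0, 0), "a": (1, 0), "b": (0, 1), "c": (1, 1)}  # assumed labeling
NAME = {bits: name for name, bits in BITS.items()}


def mul(x, y):
    (x1, x2), (y1, y2) = BITS[x], BITS[y]
    return NAME[(x1 ^ y1, x2 ^ y2)]


def left_translation(x):
    """Word (x*e, x*a, x*b, x*c), i.e., the row of x in the multiplication table."""
    return tuple(mul(x, y) for y in ELEMS)


def cayley(r, s):
    nxt = dict(zip(r, s))
    seen, cycles = set(), 0
    for start in nxt:
        if start not in seen:
            cycles += 1
            v = start
            while v not in seen:
                seen.add(v)
                v = nxt[v]
    return len(r) - cycles


def kendall(r, s):
    pos = {v: i for i, v in enumerate(s)}
    seq = [pos[v] for v in r]
    n = len(seq)
    return sum(seq[i] > seq[j] for i in range(n) for j in range(i + 1, n))


def D(dist, x, y):
    """Permutation-based distance of Definition 6 with left translations."""
    return dist(left_translation(x), left_translation(y))


row_C = [D(cayley, "e", g) for g in ELEMS]
row_K = [D(kendall, "e", g) for g in ELEMS]
print(row_C, row_K, D(kendall, "a", "b"))  # [0, 2, 2, 2] [0, 2, 4, 6] 6
```

The values missing from the two rows are exactly the forbidden distances listed in the example.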

As a final remark, note that when

G = Sym (L)

,

D_{C, K}^{(Φ)} (r, s)

does not become

d_{C, K} (r, s)

, as one might think. The reason is that, in that event,

d_{C, K} (r, s)

is defined on

Sym (L) \times Sym (L)

, while

D_{C, K}^{(Φ)} (r, s) = d_{C, K} (Φ (r), Φ (s))

, where

d_{C, K} (Φ (r), Φ (s))

is defined on

Sym (Sym (L)) \times Sym (Sym (L)) = Sym (L!) \times Sym (L!)

. In other terms, the definition domain and the range of Cayley’s isomorphism

Φ :

Sym (L) \to Sym (L!)

are different also in the particular case

G = Sym (L)

, which prevents

Φ

from becoming the identity (unless

L = 2

). However, this does not prevent

d_{C, K} (r, s)

and

D_{C, K}^{(Λ)} (r, s)

from providing the same qualitative and even quantitative information, as shown in Example 6 below and Section 6. This fact supports the consistency of our approach to group metrics based on Cayley’s isomorphism.

Example 6.

Tables (41) and (42) below show the distances

d_{C} (Λ_{r}, Λ_{s}) = : D_{C}^{(Λ)} (r, s)

and

d_{K} (Λ_{r}, Λ_{s}) = : D_{K}^{(Λ)} (r, s)

for

r, s \in Sym (3)

, and

Λ_{r} = Φ (r)

,

Λ_{s} = Φ (s) \in Sym (6)

, see table (8):

(41)

(42)

For instance, if we encode the permutations of

Sym (3)

as

123 = 1, 132 = 2, 213 = 3, 231 = 4, 312 = 5, 321 = 6,

(43)

then

D_{C}^{(Λ)} (213, 321) = d_{C} (Λ_{213}, Λ_{321}) = d_{C} (341265, 654321) = 4,

while

D_{K}^{(Λ)} (213, 321) = d_{K} (Λ_{213}, Λ_{321}) = d_{K} (341265, 654321) = 10 .

Note that if we replace 3 by 1 and 4 by 2 in Equation (41) for

d_{C} (Λ_{r}, Λ_{s})

, then we obtain Equation (28) for

d_{C} (r, s)

. Furthermore, if we divide

d_{K} (Λ_{r}, Λ_{s})

in Equation (42) by 5, then we obtain Equation (29) for

d_{K} (r, s)

, i.e.,

D_{K}^{(Λ)} (r, s) = 5 d_{K} (r, s)

(44)

for all

r, s \in Sym (3)

. We conclude that the results obtained using

d_{C, K} (r, s)

in

G = Sym (3)

and

D_{C, K}^{(Λ)} (r, s)

in

Φ (G) \subset Sym (6)

are equivalent. According to Equations (41) and (42), the allowed distances for

D_{C}^{(Λ)}

are

{0, 3, 4}

out of

{0, 1, \dots, 5}

, while the allowed distances for

D_{K}^{(Λ)}

are

{0, 5, 10, 15} = {5 k : 0 \leq k \leq 3 = d_{K, max} (3)}

out of

{0, 1, \dots, 15}

.
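The numbers in Example 6 can be reproduced with the sketch below. It assumes that (r ∘ s)(i) = r(s(i)) in one-line notation and that the elements of Sym(3) are labeled 1 to 6 in lexicographic order, as in Equation (43); both conventions, and all function names, are assumptions made for illustration. Under them, the left translations of 213 and 321 come out as the words 341265 and 654321 used in the example.

```python
from itertools import permutations

S3 = list(permutations((1, 2, 3)))           # lexicographic order
LABEL = {p: i + 1 for i, p in enumerate(S3)}  # encoding 123 -> 1, ..., 321 -> 6


def compose(r, s):
    """(r o s)(i) = r(s(i)) in one-line notation (assumed convention)."""
    return tuple(r[v - 1] for v in s)


def left_translation(r):
    return tuple(LABEL[compose(r, g)] for g in S3)


def cayley(r, s):
    nxt = dict(zip(r, s))
    seen, cycles = set(), 0
    for start in nxt:
        if start not in seen:
            cycles += 1
            v = start
            while v not in seen:
                seen.add(v)
                v = nxt[v]
    return len(r) - cycles


def kendall(r, s):
    pos = {v: i for i, v in enumerate(s)}
    seq = [pos[v] for v in r]
    n = len(seq)
    return sum(seq[i] > seq[j] for i in range(n) for j in range(i + 1, n))


lam = {p: left_translation(p) for p in S3}
a, b = (2, 1, 3), (3, 2, 1)
print(lam[a], lam[b], cayley(lam[a], lam[b]), kendall(lam[a], lam[b]))
# (3, 4, 1, 2, 6, 5) (6, 5, 4, 3, 2, 1) 4 10
```

With these conventions the Kendall distances between the embedded elements are 5 times the Kendall distances in Sym(3), consistent with the admissible set {0, 5, 10, 15}, and the Cayley distances 0, 1, 2 are mapped to 0, 3, 4.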

4.2. Distances with Respect to a Generating Set

For the time being, let

G

be a finite or infinite group. A finite set

S = {s_{1}, \dots, s_{n}} \subset G

is a generating set (or generator) of

G

if every

a \in G

can be written as a finite product of elements of S and their inverses. In particular, groups generated by a single element are called cyclic. For example,

{θ^{0}, θ^{1}, \dots, θ^{n - 1}}

endowed with

θ^{i} * θ^{j} = θ^{k}

, where

k = i + j

mod n, is a cyclic group of order n with generator

S = {θ^{1}}

. The (edit) distance (or word metric)

d_{S} (a, b)

between the elements a and b of a finitely generated group (in particular of a finite group)

G

is defined as the minimum number of elements from the generating set S needed to transform a into b. That is, if

b = a * s_{1} * \dots * s_{k}

, where

s_{i} \in S

(or

s_{i}^{- 1} \in S

), then

d_{S} (a, b)

is the smallest possible value of k. Therefore, the distance

d_{S}

depends on the generating set S. In particular, if

G =

Sym (L)

, then the Cayley distance

d_{C} (r, s)

of Section 3.2 is the distance

d_{S}

with respect to the generating set of all transpositions, while the Kendall distance

d_{K} (r, s)

is the distance

d_{S}

with respect to the generating set of all adjacent transpositions.
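For a finite group, the word metric can be computed by a breadth-first search on the Cayley graph. The sketch below is one straightforward way to do this, not an algorithm taken from the paper; it assumes permutations in one-line notation, the composition (r ∘ s)(i) = r(s(i)), and right multiplication by generators or their inverses, as in the definition above. Applied to the pair r = 462531, s = 236514 of Example 3, it returns the Cayley distance for the set of all transpositions and the Kendall distance for the set of adjacent transpositions.

```python
from collections import deque


def compose(r, s):
    """(r o s)(i) = r(s(i)) in one-line notation (assumed convention)."""
    return tuple(r[v - 1] for v in s)


def inverse(r):
    out = [0] * len(r)
    for i, v in enumerate(r):
        out[v - 1] = i + 1
    return tuple(out)


def transposition(L, i, j):
    """Permutation of 1..L that swaps i and j."""
    t = list(range(1, L + 1))
    t[i - 1], t[j - 1] = t[j - 1], t[i - 1]
    return tuple(t)


def word_metric(a, b, gens):
    """Smallest k with b = a * s_1 * ... * s_k, each s_i or its inverse in gens."""
    steps = set(gens) | {inverse(g) for g in gens}
    dist, queue = {a: 0}, deque([a])
    while queue:
        x = queue.popleft()
        if x == b:
            return dist[x]
        for g in steps:
            y = compose(x, g)
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    raise ValueError("b is not reachable from a with these generators")


L = 6
adjacent = [transposition(L, i, i + 1) for i in range(1, L)]
all_transpositions = [transposition(L, i, j)
                      for i in range(1, L) for j in range(i + 1, L + 1)]
r, s = (4, 6, 2, 5, 3, 1), (2, 3, 6, 5, 1, 4)
print(word_metric(r, s, all_transpositions), word_metric(r, s, adjacent))  # 4 8
```

The search visits at most |G| elements, so this approach is only practical for small groups.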

Example 7.

For the cyclic group

G = {θ^{0}, θ^{1}, θ^{2}, θ^{3}}

of Example 2, the distances with respect to the generating set

S = {θ^{1}}

are the following:

(45)

As for the distances

D_{K}^{(Λ)} (θ^{i}, θ^{j}) = d_{K} (Λ_{θ^{i}}, Λ_{θ^{j}})

, we find (see Equation (10)):

(46)

For example,

D_{K}^{(Λ)} (θ^{2}, θ^{3}) = d_{K} (Λ_{θ^{2}}, Λ_{θ^{3}}) = d_{K} (θ^{2} θ^{3} θ^{0} θ^{1}, θ^{3} θ^{0} θ^{1} θ^{2}) = 3 .

(47)

If right translations (4) are used instead of left translations (3), then

D_{K}^{(R)} (θ^{i}, θ^{j}) = d_{K} (R_{θ^{i}}, R_{θ^{j}})

happens to be the same as in Equation (46). For example,

D_{K}^{(R)} (θ^{2}, θ^{3}) = d_{K} (R_{θ^{2}}, R_{θ^{3}}) = d_{K} (θ^{2} θ^{3} θ^{0} θ^{1}, θ^{1} θ^{2} θ^{3} θ^{0}) = 3 .

(48)

If the group elements

θ^{0}, θ^{1}, θ^{2}, θ^{3}

are labeled

1, 2, 3, 4

, respectively, then the above distances can be read in the Kendall adjacency graph of

Sym (4)

, Figure 2. For example, distances (47) and (48) read

d_{K} (3412, 4123)

and

d_{K} (3412, 2341)

, respectively.
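The Kendall distances of Example 7 can be recomputed as follows. The sketch assumes that θ^k is represented by the integer k, that the group operation is addition modulo 4, and that the elements are labeled 1, 2, 3, 4 as in the text; the helper names are illustrative.

```python
N = 4  # order of the cyclic group {theta^0, ..., theta^3}


def left_translation(i):
    """Word of theta^k -> theta^i * theta^k, with labels 1..N."""
    return tuple((i + k) % N + 1 for k in range(N))


def right_translation(i):
    """Word of theta^k -> theta^k * theta^(-i), with labels 1..N."""
    return tuple((k - i) % N + 1 for k in range(N))


def kendall(r, s):
    pos = {v: i for i, v in enumerate(s)}
    seq = [pos[v] for v in r]
    n = len(seq)
    return sum(seq[i] > seq[j] for i in range(n) for j in range(i + 1, n))


D_left = [[kendall(left_translation(i), left_translation(j))
           for j in range(N)] for i in range(N)]
D_right = [[kendall(right_translation(i), right_translation(j))
            for j in range(N)] for i in range(N)]
print(D_left[2][3], D_right[2][3], D_left[0])  # 3 3 [0, 3, 4, 3]
```

The words produced for θ^2 and θ^3 are 3412 and 4123 (left translations) and 3412 and 2341 (right translations), as in Equations (47) and (48), and the two distance matrices coincide.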

4.3. Discussion

When comparing the distances

D_{C, K}^{(Φ)} (a, b)

and

d_{S} (a, b)

for finite groups, a possible advantage of the former is its expediency, in the sense that

D_{C, K}^{(Φ)}

dispenses with generating sets and, hence, with the search for minimal descriptions of b as products of the form

a * s_{1} * \dots * s_{k}

. In addition, there are algorithms (such as the bubble-sort algorithm) that compute

D_{C}^{(Φ)}

in time

O (|G|)

and

D_{K}^{(Φ)}

in time

O (|G| log |G|)

[]. Computational issues are briefly discussed in Section 6.

On the other hand, a possible shortcoming of the distances

D_{C, K}^{(Φ)}

in applications is the existence of forbidden values pointed out in Remark 4. For instance, the presence of such gaps in the distances between the algebraic representations of two coupled time series (see Section 5) might be misinterpreted as a dynamical characteristic of the underlying systems, e.g., full or generalized synchronization. So, the forbidden values for

D_{C, K}^{(Φ)}

must be identified in advance, which can be easily done by calculating the row

(D_{C, K}^{(Φ)} (e, c) : c \in G)

of the distance matrix (Remark 4). Alternatively, they can be identified using independent white noises. We come back to this point in Section 6.

In sum, when embedding a group

G

in

Sym (|G|)

via Cayley’s isomorphism

Φ

, we are encoding the

|G|

elements

a_{1}, \dots, a_{|G|} \in G

as the

|G|

permutations

Φ (a) = (b_{1}, b_{2}, \dots, b_{|G|})

, where

(b_{1}, b_{2}, \dots, b_{|G|})

is a shuffle of

(a_{1}, \dots, a_{|G|})

; see Equation (6) for

Φ

being the left translation

a_{i} \mapsto Λ_{a_{i}}

. The penalty for doing so is a more complex representation of the elements of

G

. The pay-off is a general and computationally efficient metric

D_{C, K}^{(Φ)}

. In principle, there may be symmetric groups

Sym (M)

with

M < |G|

in which

G

can be embedded, but finding such symmetric groups, in particular, the minimum-order one, is rather difficult in general [,]. In any case, note that in the practice of symbolic representation of time series, the alphabets used have low cardinality.

5. Distances for Group-Valued Time Series and Algebraic Representations

In this section, we explore possible applications of permutation-based distances to group-valued time series. Examples of group-valued time series include binary and n-ary time series. In the first case,

G = {0, 1}

, endowed with the XOR operation (addition modulo 2); these time series arise in digital communications and cryptography. The second example is a generalization, also used in digital communications:

G = {0, 1, \dots, n - 1}

endowed with addition modulo n.

Perhaps the most familiar example of a group-valued time series is the ordinal representation of real-valued time series, introduced in Section 3.1. A generalization thereof is the concept of algebraic representation.

Definition 7.

We say that a symbolic representation

α = {(a_{t})}_{t \geq 0}

of a time series is an algebraic representation if its elements

a_{t}

belong to a finite group

(G, *)

.

Since here we are interested in practical applications, consider two finite

G

-valued time series

α = {(a_{t})}_{1 \leq t \leq N}

and

β = {(b_{t})}_{1 \leq t \leq N}

of length N. In time series analysis,

α

and

β

could be ordinal representations of two coupled real-valued time series

{(x_{t})}_{1 \leq t \leq N}

and

{(y_{t})}_{1 \leq t \leq N}

, respectively. To carry out a data-driven analysis of the coupled dynamics of the underlying systems (think of various types of synchronization), or to measure the similarity between

α

and

β

, there are a number of metrics that we review in Section 5.1. In Section 5.2, we discuss how to extract information with those metrics.

5.1. String Metrics for Group-Valued Time Series

Below, we mention perhaps the most common metrics. Each of them targets specific situations.

(i): Some of the metrics to quantify the similarity of two symbolic time series such as $α$ and $β$ are based on the probability distributions of their symbols (estimated by their frequencies) []. This category includes the Kullback–Leibler (KL) divergence (usually symmetrized via an arithmetic or harmonic mean) [], the Jensen–Shannon (JS) divergence [], the JS distance (which is the square root of the JS divergence) [], the permutation JS distance [,], the Hellinger distance [], the Wasserstein distance [,], the total variation distance [,] and more. Since in this paper we are interested in harnessing the algebraic structure of the symbolic data (if any), we will dispense with entropic distances.
(ii): One can also exploit the algebraic structure of $G$ and calculate the transcription of $α$ and $β$ [], that is, the time series $τ = {(τ_{t})}_{t \geq 0}$ , where $τ_{t} = b_{t} * a_{t}^{- 1}$ (right translations by $a_{t}$ ) or $τ_{t} = a_{t}^{- 1} * b_{t}$ (left translations by $a_{t}^{- 1}$ ), see Equations (4) and (3). Transcriptions of coupled time series in an ordinal representation have been used to study different aspects of coupled dynamics: complexity [,], synchronization [,], information directionality (or causality) [], features for classification [], etc. Interestingly, if $G = Sym (L)$ , then the distance between the ordinal patterns $a_{t}$ and $b_{t}$ can be written as the norm ${∥\cdot∥}_{C, K}$ of the transcript $a_{t}^{- 1} \circ b_{t}$ , see Equation (21). Otherwise, we embed $G$ into $Sym (G)$ via Cayley’s isomorphism $Φ : G \to Sym (G)$ and, again, the distance between the ordinal patterns $Φ (a_{t})$ and $Φ (b_{t})$ can be written as the norm ${∥\cdot∥}_{C, K}$ of the transcript $Φ {(a_{t})}^{- 1} \circ Φ (b_{t})$ , see Equation (33).
(iii): Since a window $a_{t}^{W} : = a_{t}, a_{t + 1}, \dots, a_{t + W - 1}$ of size W of any $G$ -valued time series $α = {(a_{t})}_{t \geq 0}$ can be viewed as a string of symbols of length W, we can borrow a number of string metrics from information theory, computer science and computational linguistics to compare $a_{t}^{W}$ and $b_{t}^{W} : = b_{t}, b_{t + 1}, \dots, b_{t + W - 1}$ , where (unlike permutations) these strings can have repeated symbols. Thus, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols differ []. The Damerau–Levenshtein distance considers insertions, deletions, substitutions and adjacent transpositions of symbols [,,]. Such metrics are also examples of edit distances. Finally, we also mention the Jaro–Winkler similarity coefficient (not a true distance) which, like the Hamming distance, is based on symbol matching [,].
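The relation between transcripts and distances mentioned in item (ii) can be illustrated with a short sketch. It assumes permutations in one-line notation with the composition (r ∘ s)(i) = r(s(i)); the function names are illustrative. The transcript is computed as the inverse of a composed with b, and its Kendall and Cayley norms are compared with the distances computed directly from the two words.

```python
from itertools import permutations


def compose(r, s):
    """(r o s)(i) = r(s(i)) in one-line notation (assumed convention)."""
    return tuple(r[v - 1] for v in s)


def inverse(r):
    out = [0] * len(r)
    for i, v in enumerate(r):
        out[v - 1] = i + 1
    return tuple(out)


def transcript(a, b):
    """tau = a^{-1} o b, so that a o tau = b."""
    return compose(inverse(a), b)


def kendall_norm(p):
    """Number of inversions of p, i.e., its Kendall distance to the identity."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))


def cayley_norm(p):
    """len(p) minus the number of cycles of p."""
    seen, cycles = set(), 0
    for start in range(1, len(p) + 1):
        if start not in seen:
            cycles += 1
            v = start
            while v not in seen:
                seen.add(v)
                v = p[v - 1]
    return len(p) - cycles


def kendall(r, s):
    """Direct count of symbol pairs ordered differently in r and s."""
    pos = {v: i for i, v in enumerate(s)}
    return kendall_norm([pos[v] for v in r])


a, b = (2, 1, 3), (3, 2, 1)
tau = transcript(a, b)
print(tau, kendall_norm(tau), cayley_norm(tau), kendall(a, b))  # (3, 1, 2) 2 2 2
```

For a general finite group, the same functions can be applied to the words Φ(a_t) and Φ(b_t).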

5.2. Extracting Information with $d_{C, K}$ and $D_{C, K}^{(Φ)}$

Next, we focus on the distances

d_{C, K}

for the group

Sym (L)

(Section 3.2) and

D_{C, K}^{(Φ)}

for other groups (Section 4.1) and their applications to the analysis of

G

-valued time series and algebraic representations. The idea is to measure the distance between (A) simultaneous symbols

a_{t}

and

b_{t}

, or (B) concurrent windows

a_{t}^{W}

and

b_{t}^{W}

, and thereby characterize the similarity or dissimilarity of the symbolic time series

α

and

β

. To this end, we consider sliding windows

a_{t}^{W}

and

b_{t}^{W}

,

1 \leq t \leq N - W + 1

, with the same size

W \geq 1

, where we allow

W = 1

in order to include distances between simultaneous symbols.

CASE A:: $W = 1 .$ To unify the notation, we will write $dist (a_{t}, b_{t})$ for the distance between the elements $a_{t}, b_{t} \in G$ , with the understanding that $dist (a_{t}, b_{t}) = d_{C, K} (a_{t}, b_{t})$ if $G = Sym (L)$ and $dist (a_{t}, b_{t}) = D_{C, K}^{(Φ)} (a_{t}, b_{t})$ otherwise. Therefore,

$dist (a_{t}, b_{t}) \in {0, 1, \dots, {dist}_{max}},$

(49)

where

${dist}_{max} \{\begin{matrix} = L - 1 & if dist = d_{C} (Equation (24)), \\ = L (L - 1) / 2 & if dist = d_{K} (Equation (25)), \\ \leq |G| - 1 & if dist = D_{C}^{(Φ)} (Equation (36)), \\ \leq |G| (|G| - 1) / 2 & if dist = D_{K}^{(Φ)} (Equation (37)) . \end{matrix}$

(50)

Here, the inequalities in Equation (50) allow for the possibility that $D_{C, K, max}^{(Φ)}$ is a forbidden distance (Remark 4). As a result of calculating $dist (a_{t}, b_{t})$ for $1 \leq t \leq N$ , we obtain the integer-valued time series

${(dist (a_{t}, b_{t}))}_{1 \leq t \leq N} .$

(51)

According to Equation (50), $d_{C, max} < d_{K, max}$ and $D_{C, max}^{(Φ)} < D_{K, max}^{(Φ)}$ , except for $L = 2 .$ Therefore, $d_{K}$ and $D_{K}^{(Φ)}$ have greater differentiating power in applications than their Cayley counterparts due to their larger ranges.
CASE B:: $W > 1$ . Consider now the windows $a_{t}^{W} = (a_{t}, a_{t + 1}, \dots, a_{t + W - 1})$ and $b_{t}^{W} = (b_{t}, b_{t + 1}, \dots, b_{t + W - 1})$ as W-dimensional vectors in the corresponding Cartesian product of the metric space $(G, dist)$ . In this case, we have the whole family of $l_{p}$ distances, $p \geq 1$ , at our disposal. Well-known instances include the so-called Manhattan distance,

${dist}_{1} (a_{t}^{W}, b_{t}^{W}) = \sum_{k = 0}^{W - 1} dist (a_{t + k}, b_{t + k}),$

(52)

the Euclidean distance,

${dist}_{2} (a_{t}^{W}, b_{t}^{W}) = {(\sum_{k = 0}^{W - 1} dist {(a_{t + k}, b_{t + k})}^{2})}^{1 / 2},$

(53)

and the Chebychev distance,

${dist}_{\infty} (a_{t}^{W}, b_{t}^{W}) = max \{dist (a_{t + k}, b_{t + k}) : 0 \leq k \leq W - 1\} .$

(54)

As a result, we obtain the time series

${({dist}_{p} (a_{t}^{W}, b_{t}^{W}))}_{1 \leq t \leq N - W + 1}$

(55)

which is integer-valued for $p = 1, \infty$ , and real-valued otherwise.

Once the metric information from the

G

-valued time series

α

and

β

has been collected element-wise (51) and/or window-wise (55), one can proceed in several ways to process the information. We discuss some simple ways in Section 6.
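The window-wise distances of CASE B are straightforward to compute once the element-wise series (51) is available. The following sketch assumes that this series is given as a list of non-negative integers; the function name `window_distances` is illustrative.

```python
import math


def window_distances(d, W, p):
    """l_p distance over sliding windows of size W.

    d : list of element-wise distances dist(a_t, b_t), t = 1..N
    p : 1, 2, ... or math.inf
    Returns a list of length N - W + 1.
    """
    out = []
    for t in range(len(d) - W + 1):
        w = d[t:t + W]
        if p == 1:
            out.append(sum(w))                      # Manhattan, Equation (52)
        elif p == math.inf:
            out.append(max(w))                      # Chebyshev, Equation (54)
        else:
            out.append(sum(x ** p for x in w) ** (1.0 / p))
    return out


d = [0, 3, 1, 2, 2]  # toy series of element-wise distances
print(window_distances(d, 4, 1), window_distances(d, 4, math.inf))
# [6, 8] [3, 3]
```

For p = 2 the same call returns the Euclidean distances of Equation (53), here the square roots of 14 and 18.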

6. Numerical Simulations

In this section, we illustrate the application of permutation-based distances to algebraic representations with numerical simulations. To this end, we revisit a model composed of two unidirectionally coupled, non-identical Henon systems, used in [] to study generalized synchronization. The equations of the driver X are

\{\begin{matrix} x_{t + 1}^{(1)} = 1.4 - {(x_{t}^{(1)})}^{2} + 0.1 x_{t}^{(2)} \\ x_{t + 1}^{(2)} = x_{t}^{(1)} \end{matrix}

(56)

and the equations of the responder Y are

\{\begin{matrix} y_{t + 1}^{(1)} = 1.4 - [C x_{t}^{(1)} y_{t}^{(1)} + (1 - C) {(y_{t}^{(1)})}^{2}] + 0.3 y_{t}^{(2)} \\ y_{t + 1}^{(2)} = y_{t}^{(1)} \end{matrix}

(57)

where

C \geq 0

is the coupling strength. It is shown numerically in [] that this system exhibits generalized synchronization for C in a small interval around

0.55

and for

C ≳ 1

([], [Figure 3]).

For a given coupling strength C, let

x = {(x_{t}^{(1)})}_{1 \leq t \leq 10000}

and

y = {(y_{t}^{(1)})}_{1 \leq t \leq 10000}

be two stationary time series of length

N = 10000

composed of the first components of the states

x_{t} = (x_{t}^{(1)}, x_{t}^{(2)})

of the driver and

y_{t} = (y_{t}^{(1)}, y_{t}^{(2)})

of the responder, respectively, and generated with seeds

x_{0} = (0, 0.9)

and

y_{0} = (0.75, 0)

(after discarding the initial transient). Let

α = {(r_{t})}_{1 \leq t \leq 10000 - L + 1}

and

β = {(s_{t})}_{1 \leq t \leq 10000 - L + 1}

be the algebraic representations of x and y with ordinal patterns of length

3 \leq L \leq 6

. The values chosen for the coupling strength are

C = 0.30, 0.55, 1.10

.

Next we computed different types of distances between

α

and

β

from those presented in Section 5.2. Here, we present only the results with the Kendall distances

d_{K} (r_{t}, s_{t})

and

D_{K}^{(Λ)} (r_{t}, s_{t})

because, as explained there, they have greater differentiating power than

d_{C}

and

D_{C}^{(Λ)}

. As for the distances

{dist}_{p} (r_{t}^{W}, s_{t}^{W})

, we used

p = 1, 2, \infty

(Equations (52)–(54)). Irrational values of

{dist}_{2} (r_{t}^{W}, s_{t}^{W})

were rounded to the integer n if

{dist}_{2} (r_{t}^{W}, s_{t}^{W}) \in (n - 0.5, n + 0.5]

. To facilitate analysis, we transformed the data

{(d_{K} (r_{t}, s_{t}))}_{1 \leq t \leq N - L + 1}

,

{(D_{K}^{(Λ)} (r_{t}, s_{t}))}_{1 \leq t \leq N - L + 1}

and

{({dist}_{p} (r_{t}^{W}, s_{t}^{W}))}_{1 \leq t \leq N - L - W + 2}

into (empirical) probability distributions for the distance values.
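The pipeline just described (simulation, ordinal representation, distances, empirical distribution) can be sketched as follows. The code iterates Equations (56) and (57) as written above. Several choices are assumptions made for illustration only: ordinal patterns are taken to be rank vectors (rank 1 for the smallest entry of the window), the length of the discarded transient is left as a parameter because it is not specified in the text, and all function names are hypothetical.

```python
import math
from collections import Counter


def henon_pair(C, n, x0=(0.0, 0.9), y0=(0.75, 0.0), transient=0):
    """First components of driver (56) and responder (57), n points each."""
    x1, x2 = x0
    y1, y2 = y0
    xs, ys = [], []
    for t in range(transient + n):
        nx1 = 1.4 - x1 * x1 + 0.1 * x2
        ny1 = 1.4 - (C * x1 * y1 + (1.0 - C) * y1 * y1) + 0.3 * y2
        x1, x2, y1, y2 = nx1, x1, ny1, y1
        if t >= transient:
            xs.append(x1)
            ys.append(y1)
    return xs, ys


def ordinal_patterns(x, L):
    """Rank vectors of the sliding windows of length L (assumed convention)."""
    pats = []
    for t in range(len(x) - L + 1):
        w = x[t:t + L]
        order = sorted(range(L), key=lambda i: w[i])
        ranks = [0] * L
        for rank, i in enumerate(order, start=1):
            ranks[i] = rank
        pats.append(tuple(ranks))
    return pats


def kendall(r, s):
    pos = {v: i for i, v in enumerate(s)}
    seq = [pos[v] for v in r]
    n = len(seq)
    return sum(seq[i] > seq[j] for i in range(n) for j in range(i + 1, n))


def distance_distribution(alpha, beta):
    """Empirical probabilities of d_K(r_t, s_t) over simultaneous patterns."""
    counts = Counter(kendall(r, s) for r, s in zip(alpha, beta))
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())}


def round_to_integer(v):
    """Round v to n if v lies in (n - 0.5, n + 0.5]."""
    return math.ceil(v - 0.5)
```

A typical use would call `henon_pair` with N = 10000 and a chosen coupling strength C, then `ordinal_patterns` on both outputs with the same L, and finally `distance_distribution` on the two pattern sequences.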

Figure 3 illustrates CASE A of Section 5.2, i.e.,

W = 1

. Here,

G = Sym (4)

(top row) and

G = Sym (5)

(bottom row). The main conclusions can be summarized as follows.

For $C = 0.30$ (no synchronization, panels (a) and (d)), all possible values ${0, 1, \dots, L (L - 1) / 2}$ of $d_{K}$ are realized.
For $C = 0.55$ (“weak synchronization”, panels (b) and (e)), only the greater values of $d_{K}$ are allowed.
For $C = 1.10$ (“strong synchronization”, panels (c) and (f)), only the smaller values of $d_{K}$ are allowed.
So, $d_{K}$ detects that the generalized synchronizations at $C = 0.55$ and $C = 1.10$ are different: the former forbids the shorter distances between simultaneous ordinal patterns $r_{t}$ and $s_{t}$ , while the latter forbids large distances.
For each C, the results for $L = 4$ and $L = 5$ are consistently similar.

Figure 3. Top row: Probability distributions of the Kendall distances

d_{K} (r_{t}, s_{t})

for the algebraic representation of the time series x and y with the group

G = Sym (4)

(i.e., ordinal patterns of length

L = 4

) and coupling strengths

C = 0.30

(left panel),

0.55

(middle panel) and

1.10

(right panel). Bottom row: Same as top row for the representation group

G = Sym (5)

(i.e., ordinal patterns of length

L = 5

).

We conclude that the distance

d_{K}

is sensitive to dynamical changes in coupled systems and robust with respect to the length of the ordinal patterns.

At this point, we draw on Figure 3 to check, as in Example 6, the consistency of the results obtained with

d_{K}

and

D_{K}^{(Λ)}

, this time using

G = Sym (4)

and

G = Sym (5)

. Figure 4 shows the probability distribution of the allowed distances for

D_{K}^{(Λ)} (r_{t}, s_{t})

with

L = 4

(panel (a)) and

L = 5

(panel (b)). The coupling strength in both panels is

C = 0.30

, so that all

L (L - 1) / 2 + 1

allowed distances are realized. The allowed distances for

D_{K}^{(Λ)} (r_{t}, s_{t})

, listed along the horizontal axes in Figure 4, happen to be

{46 k : 0 \leq k \leq 6 = d_{K, max} (4)}

for

L = 4

and

{714 k : 0 \leq k \leq 10 = d_{K, max} (5)}

for

L = 5

. Comparison of panels (a) and (b) of Figure 4 with panels (a) and (d) of Figure 3, respectively, shows that the probability distributions of

D_{K}^{(Λ)} (r_{t}, s_{t})

and

d_{K} (r_{t}, s_{t})

are exactly the same for

L = 4, 5

and

C = 0.30

, except for the labeling of the distances; notice the change of scales. In fact, and similarly to Equation (44), numerical calculations show that (i)

D_{K}^{(Λ)} (r, s) = 46 d_{K} (r, s)

(58)

for all

r, s \in Sym (4)

, where

46 = min {d_{K} (Λ_{r}, Λ_{s}) > 0 : r, s \in Sym (4)}

, and (ii)

D_{K}^{(Λ)} (r, s) = 714 d_{K} (r, s)

(59)

for all

r, s \in Sym (5)

, where

714 = min {d_{K} (Λ_{r}, Λ_{s}) > 0 : r, s \in Sym (5)}

. For example,

d_{K} (Λ_{1234}, Λ_{1243}) = 46

and

d_{K} (Λ_{12345}, Λ_{12354}) = 714

. The same occurs for

C = 0.55

and

C = 1.10

(not shown).
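The proportionality factors in Equations (44), (58) and (59) can be checked with a short computation. By left invariance (Equation (32)), it suffices to compare the Kendall norm of each left translation with the Kendall norm of the corresponding permutation. The sketch below assumes the composition (r ∘ s)(i) = r(s(i)) and a lexicographic labeling of Sym(L); these conventions and the function names are assumptions, so the check is illustrative rather than a reproduction of the authors' calculations.

```python
from itertools import permutations


def compose(r, s):
    """(r o s)(i) = r(s(i)) in one-line notation (assumed convention)."""
    return tuple(r[v - 1] for v in s)


def inversions(p):
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))


def scale_factors(L):
    """Set of (quotient, remainder) of ||Lambda_c||_K by ||c||_K over c != e."""
    G = list(permutations(range(1, L + 1)))  # lexicographic labeling
    label = {g: i for i, g in enumerate(G)}
    factors = set()
    for c in G[1:]:
        word = [label[compose(c, g)] for g in G]  # left translation as a word
        factors.add(divmod(inversions(word), inversions(c)))
    return factors


for L in (3, 4, 5):
    print(L, scale_factors(L))
```

A single pair (q, 0) in the output means that the embedded Kendall distance equals q times the Kendall distance in Sym(L) for all pairs of permutations.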

Figure 4. Probability distributions of the allowed distances for

D_{K}^{(Λ)} (r_{t}, s_{t})

for the algebraic representation of the time series x and y with the group

G = Sym (4)

(panel (a)) and

G = Sym (5)

(panel (b)).

C = 0.30

in both panels so that all

L (L - 1) / 2 + 1

allowed distances for

D_{K}^{(Λ)} (r_{t}, s_{t})

(listed along the horizontal axes) are actually realized.

Figure 5 illustrates CASE B of Section 5.2, i.e.,

W > 1

. Here,

W = 4

with

G = Sym (3)

,

dist (r_{t}, s_{t}) = d_{K} (r_{t}, s_{t})

, and the distance

{dist}_{p} (r_{t}^{4}, s_{t}^{4})

is (i)

{dist}_{1} (r_{t}^{4}, s_{t}^{4}) \in {0, 1, \dots, 12}

in the top row, (ii)

{dist}_{2} (r_{t}^{4}, s_{t}^{4}) \in [0, 6]

in the middle row and (iii)

{dist}_{\infty} (r_{t}^{4}, s_{t}^{4}) \in {0, 1, 2, 3}

in the bottom row. The main conclusions can be summarized as follows.

Due to the monotonicity property of the p-norms ( ${∥\cdot∥}_{p} \geq {∥\cdot∥}_{p^{'}}$ for $1 \leq p \leq$ $p^{'} \leq \infty$ ), the distances with smaller parameters p ( ${dist}_{1}$ and ${dist}_{2}$ in Figure 5) have greater differentiating power.
The results shown in Figure 5 (obtained with sliding windows of 4 consecutive ordinal patterns of length 3) are qualitatively similar to the results shown in Figure 3 (obtained with simultaneous pairs $(r_{t}, s_{t})$ of ordinal patterns of lengths 4 and 5).

Figure 5. Top row: Probability distributions of the distance

{dist}_{1} (r_{t}^{4}, s_{t}^{4})

for the algebraic representation of the time series x and y with the group

G = Sym (3)

(ordinal patterns of length

L = 3

) and coupling strengths

C = 0.30

(left panel),

0.55

(middle panel) and

1.10

(right panel). Middle row: Same as top row for the distance

{dist}_{2} (r_{t}^{4}, s_{t}^{4})

. Bottom row: Same as top row for the distance

{dist}_{\infty} (r_{t}^{4}, s_{t}^{4})

.

We conclude that the distances

{dist}_{p}

are also sensitive to dynamical changes in coupled systems and robust with respect to the parameter

p \geq 1

.

To wrap up the previous discussion, we are also going to compare the computational times of

D_{C, K}^{(Λ)} (a_{t}, b_{t})

(Section 4.1) and

d_{S} (a_{t}, b_{t})

(Section 4.2), where

{(a_{t})}_{1 \leq t \leq N}

,

{(b_{t})}_{1 \leq t \leq N}

are

G

-valued time series. Rather than using ad hoc groups and coupled time series, we take advantage of the above ordinal representations

α

and

β

, and benchmark the computational cost of computing

D_{C, K}^{(Λ)} (r_{t}, s_{t})

for

G = Sym (L)

,

3 \leq L \leq 6

(the usual ordinal pattern lengths in applications),

N =

10,000 and

C = 0.30

, against the computational cost of calculating

d_{K} (r_{t}, s_{t})

for the same group and settings. We choose

C = 0.30

so that all allowed ordinal patterns are realized (see Figure 3 and Figure 4). Table 1 shows the times in seconds of the corresponding calculations with a laptop (Intel I9 processor, 8 cores, 64 GB of RAM, 8 GB of GPU memory) and a non-parallelized algorithm.

Table 1. Computation time in seconds of

d_{K} (r_{t}, s_{t})

and

D_{C, K}^{(Λ)} (r_{t}, s_{t})

,

1 \leq t \leq

10,000.

Altogether, the above numerical results support the usefulness of distances

d_{C, K}

,

D_{C, K}^{(Φ)}

and

{dist}_{p}

in the analysis of group-valued time series.

7. Conclusions

The results presented in this paper are an outgrowth of the study of transcripts and their applications to time series analysis in algebraic representations (Section 5), which are a generalization of transcripts in ordinal representations []. Indeed, the concept of transcript from a group element

a \in G

to another

b \in G

or, for that matter, the right translation of b by a (Equation (4)) leads directly to the isomorphism

Φ : a \mapsto R (a, \cdot) = : R_{a}

from

G

to a subgroup of the symmetric group

Sym (G)

(Cayley’s theorem). In turn, the elements of

Sym (G)

can be written as numerical or symbolic strings, which allows us to endow

Sym (G)

with any convenient edit distance, e.g., the Cayley distance

d_{C}

or the Kendall distance

d_{K}

of Section 3. This being the case, the isomorphism

Φ

can be used to transport the distance

d_{C, K}

in

Sym (G)

to

G

, as we did in Section 4. The result is the ordinal pattern-based distance for groups

D_{C, K}^{(Φ)} (a, b)

proposed in Definition 6.

Metric properties of finite groups are an unusual tool in time series analysis in algebraic representations. Even in the ordinal representation, distances or similarities between time series are usually measured with functionals of probability distributions such as divergences or functions thereof. There are also distances defined in the groups themselves, based on generating sets, which were the subject of Section 4.2. Actually, the distances

d_{C}

and

d_{K}

in the permutation groups, discussed in Section 3, are examples of distances with respect to generating sets. A possible advantage of the ordinal pattern-based distance proposed in this paper for any group

G

is its simplicity and generality, since it dispenses with generating sets and minimal descriptions of elements via generators. Furthermore, there are general-purpose algorithms to efficiently calculate the distances

d_{C}

and

d_{K}

in

Sym (G)

for the low and moderate group cardinalities used in practice, see Table 1.

In the previous sections we have presented the mathematical underpinnings of our approach, which include group actions, Cayley’s theorem, and group representations, as well as its practical implementation. It is remarkable that Cayley’s theorem gives permutations (or ordinal patterns) a certain universality in algebraic representations of time series, although other choices or isomorphisms can be more convenient in practice. For example, the Klein group (Example 5) is isomorphic to

Z_{2} \times Z_{2}

endowed with XOR addition and the cyclic group

{θ^{0}, θ^{1}, \dots, θ^{n - 1}}

endowed with

θ^{i} * θ^{j} = θ^{k}

, where

k = i + j

mod n, is isomorphic to

{0, 1, \dots, n - 1}

equipped with addition modulo n. Some of these groups were used in the previous sections to illustrate the theory. In contrast to the specificities of each group, the group distance introduced in Definition 6 is completely general, since the only input it needs is the multiplication table of the group, and can be efficiently computed. Possible applications were only touched upon in Section 5 because they are the subject of ongoing research. The numerical simulations in Section 6 show the potential of the metric tools discussed in this paper in the analysis of group-valued time series.

Author Contributions

Conceptualization, J.M.A.; methodology, J.M.A.; software, R.D.; validation, J.M.A. and R.D.; formal analysis, J.M.A.; investigation, J.M.A.; resources, R.D.; data curation, J.M.A. and R.D.; writing—original draft preparation, J.M.A.; writing—review and editing, J.M.A. and R.D.; visualization, J.M.A. and R.D.; supervision, J.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The numerical data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors are very grateful to the Reviewers for their helpful comments.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Cayley and Kendall Distances for the Group Sym (4) (Example 4)

Table A1. Distance $d_{C}$ for $\mathrm{Sym}(4)$.

$d_{C}$	1234	1243	1324	1342	1423	1432	2134	2143	2314	2341	2413	2431	3124	3142	3214	3241	3412	3421	4123	4132	4213	4231	4312	4321
1234	0	1	1	2	2	1	1	2	2	3	3	2	2	3	1	2	2	3	3	2	2	1	3	2
1243	1	0	2	1	1	2	2	1	3	2	2	3	3	2	2	1	3	2	2	3	1	2	2	3
1324	1	2	0	1	1	2	2	3	1	2	2	3	1	2	2	3	3	2	2	3	3	2	2	1
1342	2	1	1	0	2	1	3	2	2	1	3	2	2	1	3	2	2	3	3	2	2	3	1	2
1423	2	1	1	2	0	1	3	2	2	3	1	2	2	3	3	2	2	1	1	2	2	3	3	2
1432	1	2	2	1	1	0	2	3	3	2	2	1	3	2	2	3	1	2	2	1	3	2	2	3
2134	1	2	2	3	3	2	0	1	1	2	2	1	1	2	2	3	3	2	2	1	3	2	2	3
2143	2	1	3	2	2	3	1	0	2	1	1	2	2	1	3	2	2	3	1	2	2	3	3	2
2314	2	3	1	2	2	3	1	2	0	1	1	2	2	3	1	2	2	3	3	2	2	3	1	2
2341	3	2	2	1	3	2	2	1	1	0	2	1	3	2	2	1	3	2	2	3	3	2	2	1
2413	3	2	2	3	1	2	2	1	1	2	0	1	3	2	2	3	1	2	2	3	1	2	2	3
2431	2	3	3	2	2	1	1	2	2	1	1	0	2	3	3	2	2	1	3	2	2	1	3	2
3124	2	3	1	2	2	3	1	2	2	3	3	2	0	1	1	2	2	1	1	2	2	3	3	2
3142	3	2	2	1	3	2	2	1	3	2	2	3	1	0	2	1	1	2	2	1	3	2	2	3
3214	1	2	2	3	3	2	2	3	1	2	2	3	1	2	0	1	1	2	2	3	1	2	2	3
3241	2	1	3	2	2	3	3	2	2	1	3	2	2	1	1	0	2	1	3	2	2	1	3	2
3412	2	3	3	2	2	1	3	2	2	3	1	2	2	1	1	2	0	1	3	2	2	3	1	2
3421	3	2	2	3	1	2	2	3	3	2	2	1	1	2	2	1	1	0	2	3	3	2	2	1
4123	3	2	2	3	1	2	2	1	3	2	2	3	1	2	2	3	3	2	0	1	1	2	2	1
4132	2	3	3	2	2	1	1	2	2	3	3	2	2	1	3	2	2	3	1	0	2	1	1	2
4213	2	1	3	2	2	3	3	2	2	3	1	2	2	3	1	2	2	3	1	2	0	1	1	2
4231	1	2	2	3	3	2	2	3	3	2	2	1	3	2	2	1	3	2	2	1	1	0	2	1
4312	3	2	2	1	3	2	2	3	1	2	2	3	3	2	2	3	1	2	2	1	1	2	0	1
4321	2	3	1	2	2	3	3	2	2	1	3	2	2	3	3	2	2	1	1	2	2	1	1	0
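
The entries of Table A1 can be cross-checked with a few lines of Python. The sketch below assumes that permutations are given as strings in one-line notation, as in the table headers, and that the Cayley distance is the minimum number of transpositions turning one string into the other, computed as $n$ minus the number of cycles of the permutation relating them. The function name is a hypothetical choice.

```python
from itertools import permutations


def cayley_distance(sigma: str, tau: str) -> int:
    """Minimum number of transpositions turning sigma into tau."""
    n = len(sigma)
    # Position of each symbol in tau.
    pos_in_tau = {symbol: i for i, symbol in enumerate(tau)}
    # mapping[i] is the position in tau of the symbol found at position i of sigma.
    mapping = [pos_in_tau[symbol] for symbol in sigma]
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = mapping[j]
    return n - cycles


# Elements of Sym(4) in lexicographic order, matching the column order of Table A1.
sym4 = ["".join(p) for p in permutations("1234")]
first_row = [cayley_distance("1234", tau) for tau in sym4]
print(first_row)
print(cayley_distance("1243", "1324"))  # 2
```

The list `first_row` corresponds to the row labelled 1234 in Table A1.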

Table A2. Distance $d_{K}$ for $\mathrm{Sym}(4)$.

$d_{K}$	1234	1243	1324	1342	1423	1432	2134	2143	2314	2341	2413	2431	3124	3142	3214	3241	3412	3421	4123	4132	4213	4231	4312	4321
1234	0	1	1	2	2	3	1	2	2	3	3	4	2	3	3	4	4	5	3	4	4	5	5	6
1243	1	0	2	3	1	2	2	1	3	4	2	3	3	4	4	5	5	6	2	3	3	4	4	5
1324	1	2	0	1	3	2	2	3	3	4	4	5	1	2	2	3	3	4	4	3	5	6	4	5
1342	2	3	1	0	2	1	3	4	4	5	5	6	2	1	3	4	2	3	3	2	4	5	3	4
1423	2	1	3	2	0	1	3	2	4	5	3	4	4	3	5	6	4	5	1	2	2	3	3	4
1432	3	2	2	1	1	0	4	3	5	6	4	5	3	2	4	5	3	4	2	1	3	4	2	3
2134	1	2	2	3	3	4	0	1	1	2	2	3	3	4	2	3	5	4	4	5	3	4	6	5
2143	2	1	3	4	2	3	1	0	2	3	1	2	4	5	3	4	6	5	3	4	2	3	5	4
2314	2	3	3	4	4	5	1	2	0	1	3	2	2	3	1	2	4	3	5	6	4	3	5	4
2341	3	4	4	5	5	6	2	3	1	0	2	1	3	4	2	1	3	2	4	5	3	2	4	3
2413	3	2	4	5	3	4	2	1	3	2	0	1	5	6	4	3	5	4	2	3	1	2	4	3
2431	4	3	5	6	4	5	3	2	2	1	1	0	4	5	3	2	4	3	3	4	2	1	3	2
3124	2	3	1	2	4	3	3	4	2	3	5	4	0	1	1	2	2	3	5	4	6	5	3	4
3142	3	4	2	1	3	2	4	5	3	4	6	5	1	0	2	3	1	2	4	3	5	4	2	3
3214	3	4	2	3	5	4	2	3	1	2	4	3	1	2	0	1	3	2	6	5	5	4	4	3
3241	4	5	3	4	6	5	3	4	2	1	3	2	2	3	1	0	2	1	5	4	4	3	3	2
3412	4	5	3	2	4	3	5	6	4	3	5	4	2	1	3	2	0	1	3	2	4	3	1	2
3421	5	6	4	3	5	4	4	5	3	2	4	3	3	2	2	1	1	0	4	3	3	2	2	1
4123	3	2	4	3	1	2	4	3	5	4	2	3	5	4	6	5	3	4	0	1	1	2	2	3
4132	4	3	3	2	2	1	5	4	6	5	3	4	4	3	5	4	2	3	1	0	2	3	1	2
4213	4	3	5	4	2	3	3	2	4	3	1	2	6	5	5	4	4	3	1	2	0	1	3	2
4231	5	4	6	5	3	4	4	3	3	2	2	1	5	4	4	3	3	2	2	3	1	0	2	1
4312	5	4	4	3	3	2	6	5	5	4	4	3	3	2	4	3	1	2	2	1	3	2	0	1
4321	6	5	5	4	4	3	5	4	4	3	3	2	4	3	3	2	2	1	3	2	2	1	1	0
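
Table A2 can be cross-checked in the same spirit. The sketch below assumes that the Kendall distance between two permutations in one-line notation is the number of pairs of symbols whose relative order differs between the two strings, which equals the minimum number of swaps of adjacent entries needed to turn one string into the other. This convention is an assumption that agrees with the table entries checked here. The function name is hypothetical.

```python
from itertools import combinations, permutations


def kendall_distance(sigma: str, tau: str) -> int:
    """Number of symbol pairs ordered differently in sigma and tau."""
    pos_sigma = {symbol: i for i, symbol in enumerate(sigma)}
    pos_tau = {symbol: i for i, symbol in enumerate(tau)}
    return sum(
        1
        for a, b in combinations(sigma, 2)
        if (pos_sigma[a] - pos_sigma[b]) * (pos_tau[a] - pos_tau[b]) < 0
    )


# Elements of Sym(4) in lexicographic order, matching the column order of Table A2.
sym4 = ["".join(p) for p in permutations("1234")]
first_row = [kendall_distance("1234", tau) for tau in sym4]
print(first_row)
print(kendall_distance("1243", "1342"))  # 3
```

The list `first_row` corresponds to the row labelled 1234 in Table A2, where each entry is the number of inversions of the column permutation.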


Figure 1. Kendall adjacency graph of $\mathrm{Sym}(3)$. A link between two nodes means that the corresponding permutations differ by an adjacent transposition, i.e., the Kendall distance between them is 1.

Figure 2. Kendall adjacency graph of $\mathrm{Sym}(4)$. A link between two permutations means that the Kendall distance between them is 1.

Figure 4. Probability distributions of the allowed distances for $D_{K}^{(Λ)} (r_{t}, s_{t})$ for the algebraic representation of the time series $x$ and $y$ with the group $G = \mathrm{Sym}(4)$ (panel (a)) and $G = \mathrm{Sym}(5)$ (panel (b)). Here $C = 0.30$ in both panels, so that all $L (L - 1) / 2 + 1$ allowed distances for $D_{K}^{(Λ)} (r_{t}, s_{t})$ (listed along the horizontal axes) are actually realized.

Table 1. Computation time in seconds of $d_{K} (r_{t}, s_{t})$ and $D_{C, K}^{(Λ)} (r_{t}, s_{t})$, $1 \leq t \leq 10{,}000$.

$G$	$Φ (G)$	$d_{K} (r_{t}, s_{t})$	$D_{C, K}^{(Λ)} (r_{t}, s_{t})$
$Sym (3)$	$Sym (6)$	$0.009$ s	$0.022$ s
$Sym (4)$	$Sym (24)$	$0.010$ s	$0.038$ s
$Sym (5)$	$Sym (120)$	$0.011$ s	$0.143$ s
$Sym (6)$	$Sym (720)$	$0.012$ s	$2.837$ s

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
