1. Introduction
When investigating how to measure the distance between two sentences or structures, we found it necessary to first establish a way to measure the distance between tree or graph structures. This was our initial motivation for this paper, which is intended to provide a sound foundation and measurement technique for future applications. For example, given two separate English sentences whose distance we would like to measure, we identify each sentence with a graph. The words of each sentence are associated with the vertices of its graph, and the distances between words are associated with its edges. Having constructed a graph for each sentence, we can then measure the distance between the two graphs. The main purpose of this article is to put forward an approach to defining a metric for graphs on a fixed set of vertices.
Suppose V is a set of fixed vertices and E is a set of directed edges. Then, to each edge (v, w) in E, i.e., an edge from v to w, one can assign a value. Since most mathematical models can be formalized or represented via vertices and edges, studying the properties of the distance between two graphs is an important approach to exploring the intrinsic properties of a mathematical structure or a real mathematical object [1, 2]; such distances have even been used on some fuzzy objects [3, 4]. Some ingenious metrics for handling these fuzzy objects have been explored in depth [5, 6]. In this article, we put forward two metrics, for graphs with labelled vertices and for graphs with unlabelled vertices, respectively. We only consider directed edges in this article; an undirected edge can simply be treated as a pair of two directed edges.
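As a small illustration of the remark on undirected edges, the following sketch stores a weighted directed graph as a Python dictionary that maps ordered vertex pairs to weights, and expands each undirected edge into two directed edges. The dictionary representation, the helper name `expand_undirected`, and the sample weights are assumptions made for illustration only.

```python
def expand_undirected(undirected_edges):
    """Turn each undirected weighted edge {v, w} into the directed edges (v, w) and (w, v)."""
    directed = {}
    for (v, w), weight in undirected_edges.items():
        directed[(v, w)] = weight
        directed[(w, v)] = weight
    return directed


# An undirected path a - b - c with hypothetical weights.
undirected = {("a", "b"): 2.0, ("b", "c"): 5.0}
directed = expand_undirected(undirected)
print(sorted(directed.items()))
```

Both directions receive the same weight here; a model in which the two directions carry different values would simply store them as two independent directed edges.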
2. Definitions and Claims
We use to denote all the positive real numbers. For any real number , we use to denote its absolute value. For any set K, we use and to denote the power set and the size of K, respectively. If both H and K are sets, we use to denote the set of all the functions from H to K. We use to denote (or ). We call or, for brevity, a generalized graph, in which W is a weight function satisfying the following conditions:
For each ;
For all .
Definition 1. Let denote the set of all the generalized graphs whose vertex sets are exactly V.
Let
be arbitrary generalized graphs. For any
and any
, we use
to denote the set of all the endpoints beginning from
a, i.e.,
Furthermore, define the set of all the assigned values of
as follows:
3. Metric for Labelled Graphs
In this section, we assume all the vertices in V are labelled. We define a distance between two generalized graphs on V as follows:
Definition 2. (distance function: labelled vertices, single directed edge) Define by
Example 1. Suppose is a fixed set of vertices and graph , and , where their vertices assigned to the edges , and the values for weights are given as follows: Then, the end vertices originating from via edges in could be depicted as . Others follow: Hence, by Definition 1, one could compute the distance for and as follows: Thus, the distance for and is 20 under metric .
Proof. It follows immediately from the fact that
□
Claim 2. (semi-metric)
- 1.
;
- 2.
;
- 3.
iff.
Proof. By the definition, the first and second statements follow immediately. Here we show the third statement. Suppose
. Then,
On the other hand, if , then and , i.e., . □
From the above inferences, if one allows for some , then might not hold for some particular and . Similarly, if one allows , then for all is a requirement.
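To see why strictly positive weights matter for the third statement of Claim 2, consider the following sketch. It assumes a per-edge rule that is consistent with the surrounding claims but is not spelled out in the text available here: an edge present in both graphs contributes the absolute difference of its two weights, and an edge present in only one graph contributes its weight. The function name `labelled_distance` and the dictionary representation of a graph are hypothetical.

```python
def labelled_distance(g1, g2):
    """Assumed rule: |w1 - w2| on shared edges, the weight itself on unshared edges."""
    total = 0.0
    for edge in set(g1) | set(g2):
        if edge in g1 and edge in g2:
            total += abs(g1[edge] - g2[edge])
        else:
            # Exactly one of the graphs contains this edge.
            total += g1.get(edge, g2.get(edge))
    return total


# Two different graphs: the first has an extra edge (b, c) of weight zero.
g_with_zero_edge = {("a", "b"): 3.0, ("b", "c"): 0.0}
g_without_edge = {("a", "b"): 3.0}
print(labelled_distance(g_with_zero_edge, g_without_edge))  # 0.0
```

Under this assumed rule, the two graphs differ as sets of edges yet are at distance zero, so a zero weight would break the property that distance zero implies equality.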
Theorem 1. is a metric space.
Proof. Since Claim (2) shows that the distance function of Definition 2 is a semi-metric, it suffices to show that it satisfies the triangle inequality:
On the basis of Claim (1), we have the following inferences. Let
be arbitrary. Firstly, if
, then
or
. If
, then it preserves the inequality of Equation (
2). If
, then
, i.e.,
It follows that
i.e., the inequality of Equation (
2) is preserved. Secondly, if
, by the same analogy, the inequality of Equation (
2) is also preserved. Lastly, if
, then [
or
] and [
or
], i.e.,
or
Hence, we have shown
and this completes our proof of Equation (
2). □
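As an informal numerical companion to Theorem 1, the sketch below spot-checks symmetry and the triangle inequality on randomly generated graphs. It is not a substitute for the proof. It assumes the same per-edge rule as a working hypothesis (absolute difference of weights on shared edges, the weight itself on edges present in only one graph), uses positive integer weights, and excludes self-loops; all function names and parameters are hypothetical.

```python
import itertools
import random


def labelled_distance(g1, g2):
    """Assumed rule: |w1 - w2| on shared edges, the weight itself on unshared edges."""
    total = 0
    for edge in set(g1) | set(g2):
        w1, w2 = g1.get(edge), g2.get(edge)
        if w1 is not None and w2 is not None:
            total += abs(w1 - w2)
        else:
            total += w1 if w1 is not None else w2
    return total


def random_graph(vertices, rng):
    """Each ordered pair of distinct vertices gets an edge with probability 1/2 and a weight in 1..9."""
    return {
        (v, w): rng.randint(1, 9)
        for v, w in itertools.permutations(vertices, 2)
        if rng.random() < 0.5
    }


rng = random.Random(0)
vertices = ["a", "b", "c", "d"]
graphs = [random_graph(vertices, rng) for _ in range(30)]

symmetric_ok = all(
    labelled_distance(x, y) == labelled_distance(y, x)
    for x, y in itertools.product(graphs, repeat=2)
)
triangle_ok = all(
    labelled_distance(x, z) <= labelled_distance(x, y) + labelled_distance(y, z)
    for x, y, z in itertools.product(graphs, repeat=3)
)
print(symmetric_ok, triangle_ok)  # True True
```

The check passes for every triple because, under the assumed rule, each edge contributes the absolute difference of two non-negative numbers once an absent edge is read as weight zero.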
4. Metric for Unlabelled Graphs
In this section, we show how to define a distance between graphs with unlabelled vertices. Let
be a set of distinct unlabelled vertices with
. Let
be the set of generalized graphs whose vertex set is
. First of all, we show how to formalize unlabelled graphs. Let
be a set of dummy vertices for
. Then, each
could be modeled via this set of dummy vertices as
. Let
be arbitrary. Let
be a set of names. Now fix the domain
M and assign each dummy vertex a name via a naming function
. Let
denote the set of all the naming functions. Now each unlabelled graph
G could be formalized via naming functions as follows:
where
;
and
denote the named edges and weights via
for
and
, respectively.
and
could be formalized as
Since the modeling of an unlabelled graph is not unique, we define an equivalence relation on .
Definition 3. iff such that
Example 2. Let . Then, (in a corresponding form) consists of Suppose , where Hence consists of the following elements:
Similarly, one could list all the graphs in , in particular, Therefore,
i.e., .
Claim 3. ≡ is an equivalence relation on .
Proof. The result follows immediately from the definition. □
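The equivalence of Definition 3 can be illustrated with a brute-force check. The sketch below treats two named graphs as equivalent when some bijection of the vertex names maps one onto the other, weights included; this reading of the definition, the dictionary representation, and the names `relabel` and `equivalent` are assumptions.

```python
from itertools import permutations


def relabel(graph, mapping):
    """Apply a vertex renaming to every edge, keeping the weights."""
    return {(mapping[v], mapping[w]): weight for (v, w), weight in graph.items()}


def equivalent(g1, g2, vertices):
    """True if some bijection of the vertices turns g1 into g2 (brute force over n! renamings)."""
    for image in permutations(vertices):
        mapping = dict(zip(vertices, image))
        if relabel(g1, mapping) == g2:
            return True
    return False


vertices = [1, 2, 3]
g1 = {(1, 2): 4, (2, 3): 7}
g2 = {(3, 1): 4, (1, 2): 7}  # g1 renamed by 1 -> 3, 2 -> 1, 3 -> 2
g3 = {(1, 2): 4, (3, 2): 7}  # same weights, but both edges end at vertex 2

print(equivalent(g1, g2, vertices), equivalent(g1, g3, vertices))  # True False
```

Reflexivity, symmetry, and transitivity correspond to the identity renaming, the inverse renaming, and the composition of renamings, respectively.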
Definition 4. (distance function: single edge, unlabelled) Define by
It is obvious that if , then . Let us look at a simple example in which is not equivalent to .
Example 3. Let . Suppose , where
Following the same procedures as in Example (2), we can obtain all the elements of and . By measuring the distances of their respective pairs (there are 36 pairs), and by Equation (4), one obtains the minimal one , where and .
Claim 4. is a semi-metric.
Proof. It is obvious that
and
. Suppose
. Then, there exist
such that
and
, i.e.,
. On the other hand, suppose
. Then, there exist
such that
i.e.,
i.e.,
□
Claim 5. for every bijective function .
Proof. It suffices to show
for all
, where
denotes the relabelled edges via
of
and
denotes the weight function over the relabelled edges
.
Let
be arbitrary. Suppose
Hence, one has
where
denotes
and
denotes
. Hence, we have shown
□
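A practical consequence of Claim 5 can be checked numerically. Reading the claim as saying that renaming both graphs by the same bijection leaves the labelled distance unchanged, the minimum over all pairs of namings (36 pairs for three vertices) should coincide with the minimum obtained by renaming only one of the two graphs (6 candidates). The sketch below tests this on a hypothetical pair of graphs, under the assumed per-edge rule of absolute differences on shared edges and plain weights on unshared edges.

```python
from itertools import permutations, product


def labelled_distance(g1, g2):
    """Assumed rule: |w1 - w2| on shared edges, the weight itself on unshared edges."""
    total = 0
    for edge in set(g1) | set(g2):
        w1, w2 = g1.get(edge), g2.get(edge)
        if w1 is not None and w2 is not None:
            total += abs(w1 - w2)
        else:
            total += w1 if w1 is not None else w2
    return total


def all_namings(graph, vertices):
    """Every renamed copy of the graph, one per bijection of the vertex set."""
    copies = []
    for image in permutations(vertices):
        mapping = dict(zip(vertices, image))
        copies.append({(mapping[v], mapping[w]): x for (v, w), x in graph.items()})
    return copies


vertices = [1, 2, 3]
g1 = {(1, 2): 4, (2, 3): 7}
g2 = {(1, 2): 6, (1, 3): 5, (3, 2): 1}

pairs = list(product(all_namings(g1, vertices), all_namings(g2, vertices)))
over_all_pairs = min(labelled_distance(a, b) for a, b in pairs)
over_one_side = min(labelled_distance(g1, b) for b in all_namings(g2, vertices))
print(len(pairs), over_all_pairs == over_one_side)  # 36 True
```

Fixing the names of one graph therefore reduces the search from (n!)^2 pairs to n! candidates without changing the minimum.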
Theorem 2. is a metric space.
Proof. Owing to Claim (4), it suffices to show that the triangle inequality holds.
where
denotes
. Then, by Claim (5), one has
where
is the bijective function satisfying
and where
. □
5. Computations
In this section, we show how to implement the above-mentioned metrics. Suppose . To begin with, we implement . Let denote the edge from node i to node j.
5.1. Labelled Vertices with Single Directed Edge
Consider the two graphs in Figure 1 and their respective adjacency matrices in Table 1. The symbol ∞ denotes that there is no connection between two nodes; in computation, it is represented by a predetermined, sufficiently large real number. In Table 1, a pair denotes the weights of the directed edges and , respectively, where .
One obtains
; moreover, one also obtains
. The representation of these graphs via partial functions could be demonstrated by
Table 2.
By Equation (
1), one has
To simplify the computation, this distance can alternatively be obtained via the following matrix representation of Equation (1).
Definition 6. (distance between edges) Define each element of the distance matrix between and by where and denote the elements in the i-th row and j-th column of and , respectively. On the basis of this definition, one has
Definition 7. For any square matrix , define .
Then, Equation (
1) could be represented and computed via the following matrix operation:
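The matrix computation described in this subsection can be sketched in plain Python. Since the displayed formulas are not reproduced here, the sketch rests on two labelled assumptions: the entry rule of Definition 6 is taken to be 0 when both entries are ∞, the finite entry when exactly one is ∞, and the absolute difference otherwise; and the operation of Definition 7 is taken to be the sum of all matrix entries. The matrices `A` and `B` are hypothetical and are not those of Table 1.

```python
import math

INF = math.inf  # stands for "no edge"; the text uses a sufficiently large number instead


def edge_distance(a, b):
    """Assumed entry rule: 0 if both absent, the present weight if one is absent, else |a - b|."""
    if a == INF and b == INF:
        return 0.0
    if a == INF:
        return b
    if b == INF:
        return a
    return abs(a - b)


def distance_matrix(A, B):
    """Entry-wise distance between two adjacency matrices of the same size."""
    return [[edge_distance(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]


def entry_sum(M):
    """Sum of all entries of a square matrix (assumed meaning of Definition 7)."""
    return sum(sum(row) for row in M)


A = [[INF, 2, INF],
     [INF, INF, 5],
     [1, INF, INF]]
B = [[INF, 4, 3],
     [INF, INF, INF],
     [1, INF, INF]]

print(entry_sum(distance_matrix(A, B)))  # 10.0
```

Here the shared edge from node 1 to node 2 contributes |2 - 4| = 2, the two unshared edges contribute 3 and 5, and the identical edge from node 3 to node 1 contributes 0.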
5.2. Unlabelled Vertices with Single Directed Edge
In this section, we show how to implement
defined in Definition (4). Assume
V is unlabelled.
and
are shown in
Figure 2.
Let
be the arbitrary names of the vertices of both
and
. Let
denote all the permutations of the identity matrix with dimension
n. By Equation (
3), the distance between two unlabelled graphs could be represented and computed via the following matrix operations:
where each
represents the transpose of the permutation matrix
P. By computation, we have
,
and the distances between
and each permutation of
are listed as follows:
Among these, the optimal permutation matrix is
and thus
where
. The corresponding minimal pair of graphs is shown in Figure 3.
This could be interpreted as the complexity of the overlap of these two graphs based on corresponding vertices, i.e., this overlap yields the minimal complexity of the graphs.
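The permutation search of this subsection can be sketched as follows. The sketch assumes that the unlabelled distance is the minimum, over all permutation matrices P, of the labelled matrix distance between the first adjacency matrix and P B Pᵀ, with the same assumed entry rule as before (0 if both entries are ∞, the finite entry if one is ∞, the absolute difference otherwise). A permutation is stored as a tuple `p`, so that the (i, j) entry of P B Pᵀ is `B[p[i]][p[j]]`. The matrices are hypothetical and are not those of Figure 2.

```python
import math
from itertools import permutations

INF = math.inf  # stands for "no edge"


def edge_distance(a, b):
    """Assumed entry rule: 0 if both absent, the present weight if one is absent, else |a - b|."""
    if a == INF and b == INF:
        return 0.0
    if a == INF:
        return b
    if b == INF:
        return a
    return abs(a - b)


def permute(B, p):
    """Reorder rows and columns of B together, i.e., form P B P^T for the permutation p."""
    n = len(B)
    return [[B[p[i]][p[j]] for j in range(n)] for i in range(n)]


def unlabelled_distance(A, B):
    """Return (minimal distance, minimizing permutation) by exhaustive search over n! permutations."""
    n = len(A)
    best = None
    for p in permutations(range(n)):
        Bp = permute(B, p)
        total = sum(edge_distance(A[i][j], Bp[i][j]) for i in range(n) for j in range(n))
        if best is None or total < best[0]:
            best = (total, p)
    return best


# A has edges 0 -> 1 (weight 2) and 1 -> 2 (weight 5).
A = [[INF, 2, INF],
     [INF, INF, 5],
     [INF, INF, INF]]
# B has edges 2 -> 1 (weight 2) and 1 -> 0 (weight 6): the same shape under other names.
B = [[INF, INF, INF],
     [6, INF, INF],
     [INF, 2, INF]]

best = unlabelled_distance(A, B)
print(best)  # (1.0, (2, 1, 0))
```

The exhaustive search visits n! permutations, so this sketch is only practical for small vertex sets.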
6. Conclusions
In this article, we have shown how to define distances between graphs over a set of either labelled or unlabelled vertices via metrics and , respectively. We also give a computational approach to evaluating and via adjacency matrix operations. This implementation gives an efficient computation of the distance between any two such graphs. This type of distance could then be applied to measuring the distance between networks or tree structures.