1. Introduction
Future Sustainability Computing is computational support for sustainable development, i.e., development that meets the needs and aspirations of the present without compromising the ability of future generations to meet their own needs. In particular, it draws on data mining and machine learning [1,2] (e.g., supervised and unsupervised learning), decision and optimization problems [3] (e.g., linear and integer programming, dynamic programming), sequential decision making under uncertainty (e.g., Markov decision processes and recommender systems) [4,5], and networks (e.g., fuzzy graphs and network algorithms) [6,7]. In this work, we design a novel approach to topological structure mining of social networks for future computational sustainability.
Background. Recent studies provide evidence of the importance of graph models for many real-world phenomena, such as social networks, biological networks and finance. A graph is a convenient way of representing information with relationships between objects [8]. In today's complex, ubiquitous information environments, the relationships between objects are becoming increasingly vague. For example, (1) a social network can be established from sensor data, in which communications between individuals may be missed or anonymized; (2) some relationships are vague in nature, such as one person influencing another in a social network; and (3) fuzzy trust relationships often occur in mobile social networks [9]. Therefore, it is natural to represent this kind of information using a Fuzzy Graph model [8].
Research on data mining based on a Fuzzy Graph model has attracted a lot of attention from the data science and fuzzy logic communities. To emphasize the importance of λ-maximal clique mining in fuzzy graphs, the following motivating example is provided.
Motivating Example. Consider the following scenario. A global research institute recently received funding for a large Information and Communications Technology (ICT) project that covers Information Retrieval (IR), Artificial Intelligence (AI), Data Mining (DM), and Computer Vision (CV). The institute will release this project to encourage collaborative research among scholars. Since the project must be completed before a given deadline, the institute expects to build a team of scientists who are strong in the required research areas and have close collaborative relationships.
Suppose Figure 1 shows a scholar social network including five scientists who come from different countries but have some previous academic collaboration. The weights on the edges refer to the number of co-authored publications between two scientists. Intuitively, these weights describe the collaboration strength between them. Hence, this scholar social network can be viewed as a fuzzy graph with the weights indicating the membership values that represent the collaboration strength between scientists. Clearly, to complete the released ICT project, there are five possible research teams that could take on the project.
From reliability and quality points of view, the research institute usually sets a parameter for controlling the quality, and this parameter corresponds to the λ in our problem. Overall, finding a team that can cover more countries and guarantee the product’s quality is becoming a very interesting research issue and a practical problem.
Contributions. The major contributions of this paper are summarized as follows.
A newly defined concept termed λ-maximal cliques is introduced. Furthermore, a novel problem of λ-maximal clique mining from a fuzzy graph is formalized;
To address the proposed problem, an efficient fuzzy formal concept analysis based identification approach for finding the λ-maximal cliques is presented. The key idea is to prove the equivalence relation between λ-maximal clique and maximal fuzzy equiconcept;
Extensive experiments are conducted with a real-world network dataset for evaluating the identification performance of the proposed approach. In addition, we present a promising recommendation service for illustrating the usability and extensibility of the proposed problem.
Roadmap. The rest of this paper is organized as follows. Section 2 mathematically describes the problem and targets of λ-Maximal Clique Mining in a Fuzzy Graph. To better represent and characterize the relationships between vertices of a fuzzy graph, the theory of fuzzy formal concept analysis is provided in Section 3. By incorporating the properties of fuzzy formal concept analysis, a novel solution to the proposed problem based on fuzzy formal concept analysis is presented in Section 4. Section 5 shows the experimental results with further discussions. An overview of related work is presented in Section 6. Section 7 concludes this paper.
2. Problem Definition
Any relation $R \subseteq S \times S$ on a set $S$ can be represented as a graph with node set $S$ and edge set $R$. Similarly, any fuzzy relation can be viewed as a fuzzy graph in which each edge has a weight falling into $[0, 1]$.
A fuzzy graph attaches degrees of membership to a deterministic graph. For simplicity, we consider undirected graphs only, so the edges can be regarded as unordered pairs of vertices. Mathematically, a fuzzy graph is defined as a 4-tuple $\tilde{G} = (V, E, \sigma, \mu)$, where $V$ is a set of vertices, $E \subseteq V \times V$ is a set of edges, $\sigma$ is a fuzzy subset of $V$, and $\mu$ is a fuzzy relation on $\sigma$ that assigns a degree of membership $\mu(e) \in [0, 1]$ to each edge $e \in E$.
First of all, let us recall two standard definitions, clique and maximal clique, which apply to deterministic graphs rather than to fuzzy graphs.
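Definition 1. [10] A clique in $G = (V, E)$ is a subset $C \subseteq V$ such that, for any two vertices $u, v \in C$, there exists an edge $(u, v) \in E$.
Definition 2. A set of vertices $M \subseteq V$ is a maximal clique in a graph $G = (V, E)$, if (1) $M$ is a clique in $G$, and (2) there is no vertex $v \in V \setminus M$ such that $M \cup \{v\}$ is a clique in $G$.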
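Definition 3. In a fuzzy graph $\tilde{G} = (V, E, \sigma, \mu)$, for a set of vertices $C \subseteq V$, the clique's degree of membership of $C$, termed $clq(C, \tilde{G})$, is defined as the degree of membership with which $C$ is a clique in a graph $G$ sampled from $\tilde{G}$. For a given fuzzy cut $\lambda$, $C$ is called a λ-clique if $clq(C, \tilde{G}) \geq \lambda$.
For any set of vertices $C \subseteq V$, let $E_C$ be the set of edges $\{(u, v) \in E \mid u, v \in C\}$, i.e., the set of edges connecting vertices in $C$.
Observation 1. For any set of vertices $C \subseteq V$ in $\tilde{G}$ such that $C$ is a clique in $G$, $clq(C, \tilde{G}) = \min_{e \in E_C} \mu(e)$.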
Proof. Let $G$ be a graph sampled from $\tilde{G}$. The set $C$ is a clique in $G$ iff every edge in $E_C$ is present in $G$. Since the edges are selected independently of one another, $clq(C, \tilde{G})$ is determined by the minimal degree of membership over the set of edges $E_C$. Hence, Observation 1 holds. ☐
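Observation 1 can be illustrated with a minimal Python sketch. The function name `clique_degree`, the dictionary `mu` keyed by unordered vertex pairs, and the toy membership values are our own assumptions for illustration; they do not come from the figures of this paper. A vertex set that is not a clique of the underlying graph is given degree 0, and a set with fewer than two vertices is given degree 1 by convention.

```python
from itertools import combinations

def clique_degree(vertices, mu):
    """Clique membership degree of `vertices`: the minimum edge membership.

    `mu` maps frozenset({u, v}) to a membership degree in (0, 1].
    Returns 0.0 when some pair of vertices is not joined by an edge.
    """
    degrees = []
    for u, v in combinations(list(vertices), 2):
        pair = frozenset((u, v))
        if pair not in mu:
            return 0.0  # a missing edge means the set is not a clique
        degrees.append(mu[pair])
    # Assumed convention: no pair to check means the degree is 1.0.
    return min(degrees) if degrees else 1.0

# Hypothetical toy fuzzy graph (not taken from the paper).
mu = {
    frozenset(("a", "b")): 0.7,
    frozenset(("b", "c")): 0.4,
    frozenset(("a", "c")): 0.9,
}
print(clique_degree({"a", "b", "c"}, mu))  # 0.4
```

With a fuzzy cut of 0.4 the triangle in this toy graph is a λ-clique, whereas with a fuzzy cut of 0.5 it is not.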
Similar to the definition of maximal clique, we define the λ-maximal clique in the fuzzy graph as follows.
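Definition 4. For a fuzzy graph $\tilde{G} = (V, E, \sigma, \mu)$ and a fuzzy cut $\lambda$, a set $M \subseteq V$ is defined as a λ-maximal clique if (1) $M$ is a λ-clique in $\tilde{G}$, and (2) there is no vertex $v \in V \setminus M$ such that $M \cup \{v\}$ is a λ-clique in $\tilde{G}$.
Problem. (λ-Maximal Clique Mining) Given a fuzzy graph $\tilde{G}$ and a fuzzy cut $\lambda$, the λ-Maximal Clique Mining problem is to find all vertex sets $M \subseteq V$ such that $M$ is a λ-maximal clique in $\tilde{G}$.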
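To make the problem statement concrete, the following brute-force Python sketch enumerates λ-maximal cliques directly from the definition. It is only a reference baseline for very small graphs (it examines every vertex subset) and is not the approach proposed in this paper. The function names, the `mu` dictionary keyed by unordered pairs, and the toy membership values are assumptions made for illustration; a vertex set is treated as a λ-clique when every pair is joined by an edge whose membership is at least λ.

```python
from itertools import combinations

def is_lambda_clique(vertices, mu, lam):
    """True if every pair in `vertices` has an edge with membership >= lam."""
    return all(
        frozenset(pair) in mu and mu[frozenset(pair)] >= lam
        for pair in combinations(list(vertices), 2)
    )

def lambda_maximal_cliques_bruteforce(vertices, mu, lam):
    """Enumerate all lambda-maximal cliques by testing every vertex subset."""
    vertices = sorted(vertices)
    result = []
    for r in range(1, len(vertices) + 1):
        for subset in combinations(vertices, r):
            c = frozenset(subset)
            if not is_lambda_clique(c, mu, lam):
                continue
            # Maximality: no outside vertex can be added to the lambda-clique.
            extendable = any(
                is_lambda_clique(c | {v}, mu, lam) for v in vertices if v not in c
            )
            if not extendable:
                result.append(c)
    return result

# Hypothetical toy fuzzy graph (not taken from the paper).
mu = {
    frozenset(("a", "b")): 0.7,
    frozenset(("b", "c")): 0.4,
    frozenset(("a", "c")): 0.9,
    frozenset(("c", "d")): 0.8,
}
for clique in lambda_maximal_cliques_bruteforce("abcd", mu, 0.5):
    print(sorted(clique))
```

Lowering the fuzzy cut from 0.5 to 0.4 keeps the edge between b and c, so the triangle on a, b, c becomes a single λ-maximal clique.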
The extensibility of our proposed problem lies in offering the topological intelligence with the flexible controlling parameter λ. Furthermore, this type of intelligence can be applied to various sustainable applications, such as personalized recommendation systems, optimized team formation with the quality parameter λ, and targeted marketing that guarantees the strength of reciprocal relationships greater than λ.
Based on Observation 1 and the problem statement, the following two important observations are easily derived.
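Observation 2. Suppose $C$ is a λ-clique in $\tilde{G}$. Then, for all edges $e \in E_C$, $\mu(e) \geq \lambda$ holds.
Proof. A λ-clique satisfies the condition that $clq(C, \tilde{G})$ is at least $\lambda$, and by Observation 1 $clq(C, \tilde{G})$ is the minimum of $\mu(e)$ over $e \in E_C$. It follows that $\mu(e) \geq \lambda$ for every $e \in E_C$. ☐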
Observation 3. For any two sets of vertex A, B in , if , then we have =.
Proof. Both and obtain the minimal degrees of membership, denoted with , , from edges and , respectively. Therefore, is equivalent to obtaining the minimal values from , , since , = holds. ☐
In order to understand the above definitions and problem descriptions, an illustrative example is provided in this section as well.
Example 1. Figure 2 presents an illustrative example for demonstrating how the proposed problem works. Clearly, Figure 2a is a fuzzy graph including seven vertices and degrees of membership on each edge. For example, the relationship between individuals C and D is a fuzzy relation with a degree of membership of 0.38. In particular, the given subgraphs and are two cliques since vertices are connected each other. According to Definition 3, we assume the fuzzy cut λ is 0.2, then the subgraph is a λ-clique because =. Here is a natural question. Are there any other cliques that satisfy the property of λ-clique in the fuzzy graph? Note that vertices in subgraph are more than the subgraph and . Therefore, the output of the λ-maximal cliques from this given fuzzy graph is subgraph .
4. λ-Maximal Clique Mining in Fuzzy Graphs
Aiming to discover all λ-maximal cliques from fuzzy graphs, this section firstly provides an overview of the proposed solution for addressing the problem. After that, the associated technical details contained in the solution are elaborated on separately.
4.1. An Overview of Solution
The novelty of the proposed solution is to mine all λ-maximal cliques by using the Fuzzy Formal Concept Analysis (FFCA) theory. Figure 4 shows the working principle of our proposed solution, which is composed of a fuzzy graph as the initial input, the set of all λ-maximal cliques as the final output, and three main technical modules: (1) constructing a fuzzy formal context for a given fuzzy graph according to the fuzzy matrix between vertices; (2) building the corresponding fuzzy concept lattice under the given fuzzy cut λ; and (3) exploring the equivalence relation between the newly defined concept, named fuzzy equiconcept with the maximal cardinality of extents/intents, and λ-maximal cliques.
With these three technical modules, the following three subsections will separately discuss:
how to construct a fuzzy formal context from a given fuzzy graph (see Section 4.2);
how to extract the fuzzy concepts and build the corresponding fuzzy concept lattice (see Section 4.3);
why there exists an equivalence relation between fuzzy equiconcept with the maximal cardinality of extents/intents and λ-maximal cliques (see Section 4.4).
4.2. Fuzzy Formal Context Construction for a Fuzzy Graph
As mentioned in the problem definition, a fuzzy graph can be modeled as a set of vertices, some of which are related to others with certain degrees of membership. To characterize this fuzzy relation $I$ between vertices, we regard the vertices as both objects and attributes and construct a fuzzy formal context of the fuzzy graph, denoted as $K$, by using the Fuzzy Matrix $M$.
Definition 9. (Fuzzy Matrix) A fuzzy matrix $M = (m_{ij})$ is an $n \times n$ matrix, with $n = |V|$, if:
$$m_{ij} = \mu(v_i, v_j),$$
where $\mu(v_i, v_j)$, i.e., $m_{ij}$, is the degree of membership between vertices $v_i$ and $v_j$. Hence, $K$ is equivalent to the Fuzzy Matrix of $\tilde{G}$, i.e., $K = M$.
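The construction of the fuzzy matrix can be sketched in Python as follows. The function name `fuzzy_matrix`, the edge dictionary keyed by vertex pairs, and the toy values are hypothetical. In particular, the treatment of the diagonal is an assumption of this sketch: each vertex is related to itself with degree 1.0, which is a common choice in equiconcept-based clique detection, and it is exposed as a parameter so that it can be changed.

```python
def fuzzy_matrix(vertices, mu, diagonal=1.0):
    """Build the symmetric fuzzy matrix of an undirected fuzzy graph.

    `vertices` fixes the row/column order; `mu` maps (u, v) pairs to
    membership degrees. Missing pairs get 0.0. The diagonal value is an
    assumption of this sketch, not something stated in the text.
    """
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = diagonal
    for (u, v), degree in mu.items():
        i, j = index[u], index[v]
        # Undirected graph: the fuzzy relation is symmetric.
        matrix[i][j] = degree
        matrix[j][i] = degree
    return matrix

# Hypothetical toy fuzzy graph (not taken from the paper).
context = fuzzy_matrix(["a", "b", "c"], {("a", "b"): 0.7, ("b", "c"): 0.4})
for row in context:
    print(row)
```

Rows play the role of objects and columns the role of attributes, so the same vertex order is used on both axes.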
Property 1. According to the properties of , the fuzzy formal context has the following important properties:
Here is an example illustrating how to construct a fuzzy formal context for a given fuzzy graph.
Example 3. Let us continue with Example 1. All the vertices appearing in Figure 2a are viewed as objects and attributes from the perspective of the fuzzy formal context. Then, the weights (membership values) on the edges are the elements of the fuzzy formal context. Eventually, the fuzzy formal context of Figure 2a is constructed as shown in Table 3.
4.3. Fuzzy Concept Lattice Building
A fuzzy concept lattice is built upon a given fuzzy cut parameter λ that is used for removing the relations with the lower membership values. Therefore, before building the fuzzy concept lattice, we should refine the fuzzy formal context with the fuzzy cut λ in terms of different application scenarios. According to Definition 6, the fuzzy concepts can be easily obtained. Furthermore, the fuzzy concept lattice is built by Definition 8.
4.3.1. Fuzzy Formal Context Reconstruction
When λ = 0.6 in Example 3, the membership values that are less than λ are filtered out from Table 3. Consequently, a refined fuzzy formal context is obtained, as shown in Table 4.
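The refinement step amounts to thresholding the fuzzy formal context. A minimal Python sketch is given below; the function name `lambda_cut` and the small matrix are hypothetical and do not reproduce Table 3 or Table 4. Following the description above, only values strictly less than λ are replaced by 0, so an entry exactly equal to λ is retained.

```python
def lambda_cut(matrix, lam):
    """Return a copy of `matrix` with every entry below `lam` set to 0.0."""
    return [[x if x >= lam else 0.0 for x in row] for row in matrix]

# Hypothetical fuzzy formal context (not the one in Table 3).
context = [
    [1.0, 0.7, 0.0],
    [0.7, 1.0, 0.4],
    [0.0, 0.4, 1.0],
]
refined = lambda_cut(context, 0.6)
for row in refined:
    print(row)
```

The input matrix is left unchanged, so the same context can be refined under several fuzzy cuts for different application scenarios.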
4.3.2. Fuzzy Concepts Extraction and Hasse Diagram Representation
According to Definition 8, we can obtain a fuzzy concept lattice, denoted as $L$, as shown in Figure 5.
Clearly, we obtain 10 fuzzy concepts, each of which corresponds to a node of the Hasse diagram in Figure 5.
4.4. The Proposed Mining Approach of λ-Maximal Cliques
This section first defines three new concepts, i.e., Fuzzy Equiconcept, k-Fuzzy Equiconcept, and Maximal Fuzzy Equiconcept, and then presents an efficient mining approach for λ-maximal cliques that finds the Fuzzy Equiconcepts with the maximal cardinality of extent/intent.
Definition 10. (Fuzzy Equiconcept) For a fuzzy formal context $K$, if a pair $(X, Y)$ satisfies $X^{\uparrow} = Y$, $Y^{\downarrow} = X$ and $X = Y$, then the pair $(X, Y)$ is called a Fuzzy Equiconcept, where $X$ and $Y$ indicate the extent and intent of the fuzzy equiconcept, respectively.
Definition 11. (k-Fuzzy Equiconcept) For a fuzzy formal context $K$, if a pair $(X, Y)$ satisfies $X^{\uparrow} = Y$, $Y^{\downarrow} = X$, $X = Y$ and $|X| = |Y| = k$, then the pair $(X, Y)$ is called a $k$-Fuzzy Equiconcept, where $X$ and $Y$ indicate the extent and intent of the fuzzy equiconcept, respectively.
Definition 12. (Maximal Fuzzy Equiconcept) For a fuzzy formal context $K$ of $\tilde{G}$, a pair $(X, Y)$ that satisfies $X^{\uparrow} = Y$, $Y^{\downarrow} = X$ and $X = Y$ is defined as a Maximal Fuzzy Equiconcept if (1) $(X, Y)$ is a Fuzzy Equiconcept in $L$; and (2) there is no vertex $v \in V \setminus X$ such that $(X \cup \{v\}, Y \cup \{v\})$ is a Fuzzy Equiconcept in $L$.
Based on Definitions 11 and 12, the following observation is easily obtained.
Observation 4. Among all $k_i$-Fuzzy Equiconcepts $(X_i, Y_i)$ ($1 \leq i \leq H$, where $H$ is the total number of Fuzzy Equiconcepts), the Maximal Fuzzy Equiconcept $(X_{max}, Y_{max})$ has the following relationship with the Fuzzy Equiconcepts:
$$|X_{max}| = |Y_{max}| = \max_{1 \leq i \leq H} k_i.$$
This observation relates the Maximal Fuzzy Equiconcept to the $k_i$-Fuzzy Equiconcepts ($1 \leq i \leq H$).
Proof. This proof is straightforward: among all Fuzzy Equiconcepts, there exists at least one Fuzzy Equiconcept that has the maximal cardinality of extent or intent. ☐
Example 4. This example illustrates the above definitions and their correlations. As can be seen from Figure 5, there exist four fuzzy equiconcepts, since their extents are the same as their intents. One of them is a three-fuzzy equiconcept, and the other three are two-fuzzy equiconcepts. Note that, among all fuzzy equiconcepts, the three-fuzzy equiconcept has the largest extent; thus, it is regarded as the maximal fuzzy equiconcept.
The following paragraphs investigate an equivalence relation between the detection of λ-maximal cliques and maximal fuzzy equiconcepts. Based on this equivalence relation, an efficient algorithm for mining the λ-maximal cliques from a fuzzy graph is presented.
Observation 5. (Equivalence relation between λ-maximal clique and maximal fuzzy equiconcept) Suppose $P_1$ is the mining problem of λ-maximal cliques and $P_2$ is the extraction problem of maximal fuzzy equiconcepts. Then the following equivalence relation holds:
$$P_1 \Leftrightarrow P_2.$$
Proof. For a given fuzzy graph $\tilde{G}$ with the fuzzy cut $\lambda$, $P_1$ is actually equivalent to finding the maximal cliques in the defuzzified graph obtained by filtering out the edges whose membership values are less than $\lambda$. From this point of view, $P_1$ can be transformed into the fundamental problem of “maximal clique mining in a graph” [12,13] with the constraint that all membership values on the edges of the resulting graph are at least $\lambda$. Clearly, removing these edges from the fuzzy graph is the same as setting the corresponding elements of the fuzzy formal context to “0”. Moreover, extracting the maximal fuzzy equiconcepts from the fuzzy concept lattice is the same as extracting the maximal equiconcepts (defined in [10]) from the graph under the fuzzy cut constraint. In this case, both new problems reduce to existing problems with the corresponding constraints.
Based on previous work [10], it is easy to validate that $P_1 \Rightarrow P_2$ and $P_2 \Rightarrow P_1$ hold. Eventually, $P_1 \Leftrightarrow P_2$. ☐
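The first half of the above argument, i.e., reducing λ-maximal clique mining to maximal clique mining in the defuzzified graph, can be sketched in Python with the classical Bron–Kerbosch enumeration (without pivoting). This is a substituted textbook technique used only to illustrate the reduction; it is not the FFCA-based approach of this paper. The function names, the edge dictionary keyed by vertex pairs, and the toy values are assumptions.

```python
def bron_kerbosch(r, p, x, adj, out):
    """Classical Bron-Kerbosch recursion; appends maximal cliques to `out`."""
    if not p and not x:
        out.append(frozenset(r))
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}  # v has been fully explored
        x = x | {v}  # exclude v from later cliques at this level

def lambda_maximal_cliques(vertices, mu, lam):
    """Defuzzify with the cut `lam`, then list maximal cliques of the result."""
    adj = {v: set() for v in vertices}
    for (u, v), degree in mu.items():
        if degree >= lam:  # edges with membership below lam are filtered out
            adj[u].add(v)
            adj[v].add(u)
    out = []
    bron_kerbosch(set(), set(vertices), set(), adj, out)
    return out

# Hypothetical toy fuzzy graph (not taken from the paper).
mu = {("a", "b"): 0.7, ("b", "c"): 0.4, ("a", "c"): 0.9, ("c", "d"): 0.8}
for clique in lambda_maximal_cliques("abcd", mu, 0.5):
    print(sorted(clique))
```

The enumeration returns every inclusion-maximal clique of the defuzzified graph, including isolated vertices as cliques of size one.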
Inspired by the above equivalence relation between λ-maximal clique and maximal fuzzy equiconcept, a novel detection theorem of λ-maximal cliques based on fuzzy formal concept analysis is derived.
Theorem 1. Given a fuzzy graph and a fuzzy cut λ, the λ-maximal clique mining based on fuzzy formal concept analysis is to extract the maximal fuzzy equiconcepts.
Proof. The proof of this theorem is straightforward. ☐
Lemma 1. If λ = 0, the λ-maximal clique mining problem degenerates into maximal clique mining in a fuzzy graph. The solution to this problem is simply to extract the equiconcepts (note that the extent and intent of an equiconcept are exactly the same) that have the maximum number of extents/intents.
4.5. Algorithm
We devise an algorithm based on the proposed mining theorem. Algorithm 1 depicts how our solution works for mining the λ-maximal cliques from the fuzzy graph.
The working process of Algorithm 1 is described as follows. At the beginning, a fuzzy graph and a fuzzy cut parameter λ are given as the inputs of the algorithm. Then, we initialize the set Γ of λ-maximal cliques (Line 1). After the initialization, the algorithm constructs the fuzzy formal context and generates the fuzzy concepts (Lines 2–6). Lines 7–13 return the index of the fuzzy equiconcepts that have the maximal cardinality of extents. After obtaining this index, the algorithm outputs all λ-maximal cliques (Lines 14–18).
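Since the listing of Algorithm 1 is not reproduced here, the following Python sketch only mirrors the steps described above under several assumptions. It refines the context with the fuzzy cut, treats every vertex as related to itself, finds the vertex sets $X$ whose derivation equals $X$ (the equiconcepts) by naive enumeration of subsets instead of building the concept lattice, and finally keeps those with the largest extent. All names and the toy matrix are hypothetical, and the exponential enumeration is suitable only for very small graphs.

```python
from itertools import combinations

def derive(xs, ctx):
    """Attributes (columns) shared by every object (row) in `xs`."""
    n = len(ctx)
    return frozenset(j for j in range(n) if all(ctx[i][j] for i in xs))

def fuzzy_equiconcepts(matrix, lam):
    """Extents X with derive(X) == X in the lambda-cut, reflexive context."""
    n = len(matrix)
    # Binary context after the cut; the diagonal is forced to 1 (assumption).
    ctx = [
        [1 if i == j or (matrix[i][j] > 0 and matrix[i][j] >= lam) else 0
         for j in range(n)]
        for i in range(n)
    ]
    found = []
    for r in range(1, n + 1):
        for xs in combinations(range(n), r):
            x = frozenset(xs)
            # The context is symmetric, so extent == intent is one check.
            if derive(x, ctx) == x:
                found.append(x)
    return found

def largest_extents(equiconcepts):
    """Keep the equiconcepts whose extent cardinality is maximal."""
    k = max(len(x) for x in equiconcepts)
    return [x for x in equiconcepts if len(x) == k]

# Hypothetical fuzzy matrix over vertices 0..3 (not taken from the paper).
m = [
    [1.0, 0.7, 0.9, 0.0],
    [0.7, 1.0, 0.4, 0.0],
    [0.9, 0.4, 1.0, 0.8],
    [0.0, 0.0, 0.8, 1.0],
]
print(largest_extents(fuzzy_equiconcepts(m, 0.4)))
```

In this sketch every equiconcept of the reflexive context is an inclusion-maximal clique of the defuzzified graph, and `largest_extents` corresponds to the selection by maximal cardinality of extents described for Lines 7–13.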
6. Related Work
There has been a lot of recent work on fuzzy or possibilistic clustering and community detection using fuzzy logic. Reichardt [16] proposed a fast community detection algorithm based on the q-state Potts model. Aiming to identify clusters from heterogeneous data and connect these clusters between the different node types, Blochl et al. [17] developed a fuzzy partitional clustering method based on a non-negative matrix factorization (NMF) model. Nepusz et al. [18] tackled the problem of fuzzy community detection in networks. Their approach allows each vertex to belong to multiple communities with different membership degrees and transforms the problem into a constrained optimization problem. Havens [19] recently presented a newly defined soft modularity function for detecting fuzzy communities in social networks. One previous work [20] is similar to our work, but it only presented some theoretical definitions without any investigation of topological structure mining in fuzzy graphs. Bandyopadhyay [21] first formalized the problem of finding maximum fuzzy cliques in fuzzy graphs. Then, the proposed problem was reduced to an unconstrained quadratic 0-1 programming problem. The maximum fuzzy clique in that paper is defined as a relaxed clique that can allow other vertices with higher membership degrees and merge them into cliques.
In addition, there exist some other works on clique mining from uncertain graphs. Zou et al. [22] presented the problem of finding the top-k cliques with the k highest probabilities of existence in an uncertain graph. Different from [22], a recent research work focuses on mining the maximal cliques from an uncertain graph [23]. Its authors also defined a new concept, named α-maximal clique, in an uncertain graph. The weights on the edges of an uncertain graph are probabilities of existence, whereas the weights on the edges of a fuzzy graph refer to degrees of membership. Inspired by those differences, this paper investigates a novel problem of mining λ-maximal cliques from fuzzy graphs.