A Multi-Objective Crow Search Algorithm for Influence Maximization in Social Networks

Wang, Ping; Zhang, Ruisheng

doi:10.3390/electronics12081790

by Ping Wang ^1,2,* and Ruisheng Zhang

^1 School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China

^2 School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China

^* Author to whom correspondence should be addressed.

Electronics 2023, 12(8), 1790; https://doi.org/10.3390/electronics12081790

Submission received: 27 February 2023 / Revised: 6 April 2023 / Accepted: 7 April 2023 / Published: 10 April 2023

(This article belongs to the Section Computer Science & Engineering)

Abstract:

Influence maximization is a key topic of study in social network analysis. It refers to selecting a set of seed users from a social network and maximizing the number of users expected to be affected. Many related research works on the classical influence maximization problem have concentrated on increasing the influence spread, omitting the cost of seed nodes in the diffusion process. In this work, a multi-objective crow search algorithm (MOCSA) is proposed to optimize the problem with maximum influence spread and minimum cost based on a redefined discrete evolutionary scheme. Specifically, the parameter setting based on the dynamic control strategy and the random walk strategy based on black holes are adopted to improve the convergence efficiency of MOCSA. Six real social networks were selected for experiments and analyzed in comparison with other advanced algorithms. The results of experiments indicate that our proposed MOCSA algorithm performs better than the benchmark algorithm in most cases and improves the total objective function value by more than 20%. In addition, the running time of the MOCSA has also been effectively shortened.

Keywords:

social networks; influence maximization; crow search algorithm; multi-objective

1. Introduction

Research on social networks centers on the importance of social relations, a topic studied as early as 1890. This theory holds that social actors are influenced by the relationships of others in their personal network [1]. As social creatures, human beings have become accustomed to using technology to stay in contact with each other. Every year, more people sign up for and use online social media. There are currently 4.76 billion social media users worldwide, almost 60% of the global population [2]. Most of the top online social platforms, such as Facebook, WhatsApp, Instagram, WeChat, TikTok, and Twitter, have become important tools for companies to promote their products and spread their messages. The relationships between individuals in social networks can help alleviate the problem of asymmetric information, and the interaction between individuals leads to the spread of information or influence in social networks. Effective dissemination of information or influence has proven very useful in practical applications such as the promotion of new products or political views [3]. Therefore, analyzing the structure and behavioral characteristics of social networks provides a theoretical basis for solving many economic and social problems.

Social network analysis has captured a wide range of research from several fields, such as computer science, physics, and epidemiology. As sociologists, anthropologists, physicists, mathematicians, and especially graph theorists, and statisticians are getting deeper into social network analysis, influence maximization (IM) plays a crucial role in social network analysis and has become one of the very critical problems in complex networks. The IM problem refers to selecting a set of users (seed set of nodes) from a social network to maximize the expected number of affected users. Numerous practical application scenarios including social network marketing [4,5,6], social recommendation [7,8,9], opinion monitoring [10,11,12], and community identification [13,14,15,16] have successfully applied the research results related to influence maximization. For example, in social network marketing, we can carry out statistical analysis on the characteristic data of online users, such as age and gender, and use the word-of-mouth effect to help companies better analyze shopping habits and lifestyle of users, identify potential customer groups, optimize marketing strategies, promote screened advertisements accurately on a continuous basis, bring new opportunities for marketing, and provide new profit growth points for companies and social media platforms.

Considering the diversity of benefits and costs, companies inevitably look for the seed set of users that achieves the greatest influence spread at the lowest promotion cost, but this is very difficult to achieve in practice. The reason is that influence spread and promotion cost are like two sides of the same coin and cannot be optimized simultaneously [17]. Business operators must therefore strike a balance between the two objectives of maximizing influence diffusion and minimizing promotion costs, which need to be combined and jointly optimized. In existing related works, the cost paid for spreading the message during influence maximization is rarely considered, so how to select a minimum-cost seed set that maximizes influence remains an important problem to be solved. In this paper, the least cost influence maximization (LCIM) problem is proposed, which is a multi-objective optimization problem that considers propagation cost. The MOCSA is then used to maximize the influence spread under this formulation. The major contributions of this paper are summarized as follows:

Based on the two conflicting objectives of influence and cost, influence maximization is constructed as a multi-objective optimization problem called LCIM;
The MOCSA is proposed to solve the LCIM problem. In the MOCSA, the discrete evolutionary rules of the CSA algorithm are redefined to form the discrete search space for the influence maximization problem;
The parameter setting based on the dynamic control strategy and the random walk strategy based on the black hole are used to improve the balance between the exploration and exploitation of MOCSA;
The experiments are implemented on various datasets with different characteristics. Numerous results show that the proposed MOCSA obtains satisfactory performance.

The remainder of this paper is organized as follows. The related work is given in Section 2. Section 3 describes the influence maximization problem and spread model, and then the classical crow search algorithm is introduced briefly. Section 4 proposes the MOCSA for the LCIM problem. Comparison and result analysis of experiments are given in Section 5. Section 6 concludes this paper.

2. Related Work

The problem of influence maximization is described as finding a set of seed users that maximizes the expected influence on other users. Domingos and Richardson [5] first raised the IM problem in 2001 in their study of viral marketing. Later, Kempe et al. [18] formulated it as a discrete optimization problem. They used a greedy approximation algorithm with a large number of Monte Carlo simulations to obtain an accurate estimate of influence diffusion, which can easily lead to serious performance bottlenecks in large-scale networks. To improve the efficiency of the greedy approximation algorithm, several improved greedy algorithms have been proposed. Goyal et al. [19] proposed the CELF++ algorithm, which effectively reduces the number of Monte Carlo simulations by exploiting the submodularity of the IM problem when calculating the marginal gain of nodes. Although this method optimizes the greedy strategy, the running time of CELF++ is still too long for large-scale networks. Li et al. [20] proposed a Dynamic algorithm based on cohesive Entropy for Influence Maximization (DEIM), which simplifies the seed selection process to find the most influential nodes in social networks. Although experimental results show that DEIM performs better while ensuring the influence spread, it is still slightly insufficient for handling large-scale networks. Considering the community structure in a network, Kumar et al. [21] proposed a novel influence maximization algorithm using node seeding, label propagation, and community detection. The algorithm achieves good performance by using the extended h-index and IM-ELPR label propagation. However, identifying the best communities in a real social network is a challenging task and a key direction of community detection research. Lotf et al. [22] used a dynamic generalized genetic algorithm to obtain a dynamic seed set under the independent cascade model to maximize influence on dynamic social networks, maintaining the optimization process as the network changes. The proposed method shows better performance and scalability in terms of influential node identification. However, as the network size and the number of individuals that need to be identified increase, the cost of computing the centrality values becomes high, and the process is very time-consuming.

However, in the real world, single-objective optimization is not suitable for most problems. For example, during engineering design, the performance, cost, feasibility, reliability, maintenance, and other aspects of products should be considered so as to balance these conflicting objectives. Therefore, it is necessary to optimize multiple objectives simultaneously, as it is in the problem of influence maximization.

Bucur et al. [23] proposed a multi-objective evolutionary method to maximize the influence spread in social networks considering the cost of seed nodes in the process of diffusion. This method is tested on two practical case studies, and the results show that it is superior to the HIGHDEG and SDISC heuristic methods. Konotopska et al. [24] proposed an improved evolutionary algorithm for maximizing influence based on graph-aware schemes such as intelligent initialization, custom mutation, and node filtering. The approximate fitness function is defined to accelerate the proposed algorithm. The proposed algorithm also has some limitations; when a large number of more suitable candidate nodes is found during the search process, it will result in higher computational costs. Gong et al. [25] modeled the influence maximization problem considering fairness (FIMP) as a multi-objective optimization problem and proposed a framework called FIMMOGA to solve FIMP. Experimental results on several networks show that the proposed method can provide better and more diverse solutions. However, the proposed method still has some shortcomings. First, it has not been tested on large-scale networks. Second, it lacks theoretical proof of the relationship between fairness and influence. Wang et al. [26] proposed a multi-objective optimization (IM-CM) model, which expands the influence range of seed nodes and reduces activation costs. In order to achieve a balance between the two optimization objectives, a novel algorithm called INS-MFO is used. Although the INS-MFO algorithm integrates diversity weights and mutation mechanisms to improve the performance of the algorithm, it does not consider the overlapping effect caused by selecting highly central nodes in the seed set, which may affect the efficiency of the algorithm. Olivares et al. [27] proposed a scheme using a particle swarm optimization algorithm to solve influence spread based on the linear threshold model. 
The authors establish a multi-objective optimization model for the two conflicting objectives of influence spread and cost, and solve it with a particle swarm optimization algorithm. Although the algorithm initializes the position and velocity vectors of particles directly with random functions, which promotes diversity, this also makes the selection of seed nodes random, so the accuracy of the solution cannot be well controlled and the running time increases.

As far as we are aware, there is still limited research on solving multi-objective IM problems, and even fewer algorithms based on swarm intelligence for this problem. Therefore, it is necessary for us to design a new swarm intelligence algorithm, taking into account the diversity and local development ability of the algorithm, so as to effectively solve the problem of multi-objective influence maximization.

3. Preliminaries

In this section, we explain the problem of influence maximization within social networks. After that, we introduce two main influence diffusion models, namely the linear threshold model and the independent cascade model. In this work, the linear threshold model is considered. Finally, we introduce the crow search algorithm, which is extended to solve multi-objective optimization problems in Section 4.

3.1. Influence Maximization

A social network can be represented as a weighted graph $G = (V, E, W)$, where V is the set of nodes, E is the set of edges between the nodes in G, which expresses the relationship between two users, and $W(u, v)$ is the weight of the edge $(u, v)$, which indicates the influence of u on v.

Definition 1. Given a social network $G = (V, E, W)$ and a positive integer k, the influence maximization problem aims to find a seed set $S^{*}$ of k nodes from V that maximizes the influence spread $\sigma(S)$ under a given diffusion model.

$S^{*} = \underset{S \subseteq V, |S| = k}{\arg\max} \; \sigma(S)$

(1)

In Equation (1), S is a selected seed set, $S^{*}$ is the optimal set of seed nodes for influence maximization, and $\sigma(S)$ is the expected number of influenced nodes.

3.2. Diffusion Models

The spread models used in the influence maximization problem mainly include the Independent Cascade (IC) model and Linear Threshold (LT) model.

The LT model, a value accumulation model, is used in our scheme. In the LT model, each directed edge $(u, v) \in E$ of social network G is associated with a weight $w(u, v) \in [0, 1]$, which represents the proportion of user u's influence on user v among all its neighbors. In addition, each node v is associated with a threshold $\theta(v) \in [0, 1]$. Once this threshold is determined, it does not change during the spread process: node v is activated only when the sum of the weights from all its activated neighbor nodes is greater than or equal to the threshold $\theta(v)$. At the initial time, only the seed nodes are active, and all other nodes are inactive. After node v is activated, it influences its neighbor nodes at the next time step, and the process repeats. The spread process ends when the combined influence of the active nodes in the network can no longer activate any of their inactive neighbor nodes.

In the LT model, the threshold of node v reflects how readily the node accepts the entity propagating in the current network: the lower the threshold, the more easily v is affected by the entity. This entity may be information, ideas, products, behaviors, etc. A single node is often not enough to activate node v, but the total influence of several nodes can activate it. When a new entity propagates in the social network, users may need a considerable number of relatives and friends to accept the entity before they accept it themselves. Therefore, node v is activated by the joint influence of its incoming neighbors. This joint influence is a group behavior that often occurs in human society when facing relatively complex choices.
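The activation rule described above can be illustrated with a minimal Python sketch. The function name `lt_spread`, the dictionary-based graph representation, and the four-node toy network are assumptions made for illustration; in particular, the thresholds are fixed by hand here, and the sketch is not the authors' implementation.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Node = Hashable
# weights[(u, v)] holds w(u, v), the weight of the directed edge u -> v.
Weights = Dict[Tuple[Node, Node], float]


def lt_spread(weights: Weights,
              thresholds: Dict[Node, float],
              seeds: Iterable[Node]) -> Set[Node]:
    """Return the set of active nodes when LT diffusion from `seeds` stops."""
    # Incoming-neighbour lists: in_nbrs[v] = [(u, w(u, v)), ...].
    in_nbrs: Dict[Node, List[Tuple[Node, float]]] = {v: [] for v in thresholds}
    for (u, v), w in weights.items():
        in_nbrs[v].append((u, w))

    active: Set[Node] = set(seeds)
    while True:
        newly_active = set()
        for v in thresholds:
            if v in active:
                continue
            # Sum the weights coming from already-active in-neighbours.
            total = sum(w for u, w in in_nbrs[v] if u in active)
            if total >= thresholds[v]:
                newly_active.add(v)
        if not newly_active:
            # No inactive node can be activated any more: diffusion ends.
            return active
        active |= newly_active


# Hypothetical toy network: a and b jointly activate c, which then activates d.
W = {("a", "c"): 0.5, ("b", "c"): 0.5, ("c", "d"): 1.0, ("a", "b"): 0.3}
TH = {"a": 0.5, "b": 0.5, "c": 0.8, "d": 0.6}

print(sorted(lt_spread(W, TH, {"a"})))
print(sorted(lt_spread(W, TH, {"a", "b"})))
```

In this toy network, seeding only `a` activates nothing else, because neither `b` nor `c` reaches its threshold, whereas seeding both `a` and `b` pushes `c` over its threshold and then `d`, which mirrors the joint-influence behavior described in the text.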

3.3. Crow Search Algorithm

Crows are among the most intelligent birds in the world, and the story of “The Crow And The Pitcher” in Aesop’s Fables reflects the intelligence of crows to a certain extent. The ratio of the brain-to-body weight of crows is equivalent to that of dolphins or chimpanzees. In 2004, Nathan Emery and Nicola Clayton from the Department of animal behavior and experimental psychology of the University of Cambridge in the United Kingdom jointly published their research in the journal Science. The research shows that crows are smarter than chimpanzees in some research tests [28]. Crows can use or even make tools to collect food, understand causal relationships, have a certain ability of logical reasoning, and can identify the faces that bring them threats and warn their companions. Other studies by Nathan Emery and Nicola Clayton show that crows use their previous relevant social backgrounds to follow each other to obtain better food sources, predict the behavior of the thief through his own experience of stealing other crow food, determine the safest action plan to protect the food hiding place from theft, and retrieve the food when necessary [29,30].

Inspired by these clever thoughts of crows and group interaction patterns, Askarzadeh [31] proposed a new population-based meta-heuristic algorithm in 2016, called crow search algorithm (CSA), which is used to efficiently solve engineering design problems. The design idea of the CSA algorithm is as follows.

Suppose there is a d-dimensional search space containing a flock of N crows, and the position of crow i at iteration $iter$ in the search space is represented by the vector $x^{i, iter}$ ($i = 1, 2, \dots, N$; $iter = 1, 2, \dots, itermax$), where $itermax$ is the maximum number of iterations. The hidden position of food in the memory of crow i is indicated by $m^{i, iter}$, which is the best location that crow has obtained so far. In fact, the best positions experienced by each crow are stored in its memory. Crows move around the environment and look for better food hiding places $m^{i, iter}$. Suppose that in iteration $iter$, crow j wants to access its food hiding place $m^{j, iter}$, and crow i decides to follow crow j to approach its food hiding place. At this point, two conditions may occur:

Crow j does not know that crow i is following it. In this case, crow i flies to the place where crow j hides its food, and the new location of crow i is updated as follows:

$x^{i, iter + 1} = x^{i, iter} + r_{i} \times fl^{i, iter} \times (m^{j, iter} - x^{i, iter})$

(2)

where $r_{i}$ is a random number in [0, 1] and $fl^{i, iter}$ is the flight length of crow i in iteration $iter$.
Crow j knows that crow i is following it. In order to protect the food hiding place from theft, crow j will create an illusion and randomly move to another location in the search space.

To sum up, the location update method of crow i is as follows:

$x^{i, iter + 1} = \begin{cases} x^{i, iter} + r_{i} \times fl^{i, iter} \times (m^{j, iter} - x^{i, iter}), & r_{j} \geq AP^{j, iter} \\ \text{a random position}, & \text{otherwise} \end{cases}$

(3)

where $r_{j}$ is a random number in [0, 1] and $AP^{j, iter}$ is the awareness probability of crow j at iteration $iter$.

The crow updates its memory vector with the new location: if $f(x^{i, iter + 1}) > f(m^{i, iter})$, then $m^{i, iter + 1} = x^{i, iter + 1}$; otherwise, the memory vector of the crow remains unchanged, $m^{i, iter + 1} = m^{i, iter}$, where f(·) denotes the value of the objective function.
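The position update of Equation (3) and the memory update can be sketched together in Python as one iteration of the continuous CSA. The function name `csa_step`, the uniform choice of the followed crow j, the single shared values of `fl` and `ap` for all crows, and the box bounds `lower` and `upper` are assumptions made for illustration; the feasibility check of the original algorithm is omitted.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def csa_step(positions: List[Vector],
             memories: List[Vector],
             fitness: Callable[[Vector], float],
             fl: float,
             ap: float,
             lower: float,
             upper: float,
             rng: random.Random) -> Tuple[List[Vector], List[Vector]]:
    """Run one CSA iteration (maximization) and return new positions and memories."""
    n, d = len(positions), len(positions[0])
    new_positions: List[Vector] = []
    for i in range(n):
        j = rng.randrange(n)  # crow i picks a crow j to follow (assumed uniform)
        if rng.random() >= ap:
            # Crow j is unaware: move toward the hiding place m^j (Equation (3), first case).
            r_i = rng.random()
            x_new = [positions[i][k] + r_i * fl * (memories[j][k] - positions[i][k])
                     for k in range(d)]
        else:
            # Crow j is aware: crow i ends up at a random position (second case).
            x_new = [rng.uniform(lower, upper) for _ in range(d)]
        new_positions.append(x_new)

    # Memory update: keep the new position only if it has strictly better fitness.
    new_memories = [x if fitness(x) > fitness(m) else m
                    for x, m in zip(new_positions, memories)]
    return new_positions, new_memories


if __name__ == "__main__":
    rng = random.Random(0)
    sphere = lambda x: -sum(v * v for v in x)  # toy objective, maximum at the origin
    pos = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(6)]
    mem = [p[:] for p in pos]
    for _ in range(50):
        pos, mem = csa_step(pos, mem, sphere, fl=2.0, ap=0.1,
                            lower=-5.0, upper=5.0, rng=rng)
    print(max(sphere(m) for m in mem))
```

Because a memory is replaced only by a strictly better position, the fitness of each crow's memory never decreases from one iteration to the next.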

4. Proposed Method

4.1. Least Cost Influence Maximization

From a commercial perspective, it is obvious that enterprises will rely on influential media or public figures as a seed set to guide consumers. At the same time, enterprise management needs to consider the cost of using these media or public figures, and companies often set budgets in advance of marketing campaigns to constrain the costs that influencer communications will entail. For example, in viral marketing, companies promote new products by offering budget discounts or coupons to a few influential customers in order to extend the influence of the product to the greatest extent possible. Therefore, a practical issue is how to select influential seed nodes at the lowest possible cost to acquire the greatest influence spread range.

Based on the above considerations, in this section, we model the LCIM problem, which takes into account the propagation cost factor, and we propose the MOCSA to solve this problem. The LCIM problem considers both individual influence and cost, and captures the characteristics of the actual social network more clearly and truly than the classical influence maximization. The definition of the LCIM problem mainly includes two parts:

Maximization sub-objective.
According to the definition of influence maximization in Section 3.1, the mathematical form of this objective is described as:

$\max_{X \subseteq V, |X| \leq K_{max}} |\sigma(X)|$

(4)

In Equation (4), X is the selected seed set, $\sigma(X)$ is formed by the nodes activated by the influence spread of X during the iteration process, and $K_{max}$ is the maximum seed set size.
Minimization sub-objective.
Most of the existing research on influence maximization revolves around how to maximize the influence spread for a given number k of seed nodes, and few works focus on the cost of activating the necessary initial seed nodes. In real social networks, using influential users for information dissemination requires a certain incentive cost, and influence and incentive cost are positively correlated. We aim to find a seed set $X^{*}$ that minimizes the cost of selecting the seed nodes.

$X^{*} = \underset{X \subseteq V, |X| \leq K_{max}}{\arg\min} \; Cost(X)$

(5)

Therefore, the other objective function of LCIM is to minimize the seed cost. Its mathematical form is defined as follows:

$\min_{X \subseteq V, |X| \leq K_{max}} Cost(X) = \min \sum_{i = 1}^{n} x_{i} c_{i}$

(6)

where $X^{*}$ is the set of seed nodes with minimum cost, X is the seed set selected from the node set V, $Cost(X)$ is the total cost of all seed nodes, $x_{i} \in \{0, 1\}$ indicates whether node i is selected, and $c_{i}$ is the cost of selecting node i, $\forall i \in \{1, 2, \dots, n\}$.

According to the above two objectives, expanding the influence range may require increasing the number of seed nodes, and a decrease in seed nodes may also reduce the influence spread. Obviously, the objective function contains these two conflicting sub-objectives. To solve this problem, the negative values of the minimization function can be transformed into the maximization function so that both sub-objective functions become maximization functions. Therefore, the maximization of the total objective function can be expressed in mathematical form as follows:

$\begin{aligned} \max F(X) &= \left\{ \max |\sigma(X)|; \; \min \sum_{i = 1}^{n} x_{i} c_{i} \right\} = \max \left\{ |\sigma(X)|; \; -\sum_{i = 1}^{n} x_{i} c_{i} \right\} \\ \text{s.t.} \quad & \sum_{i = 1}^{n} x_{i} c_{i} \leq B \\ & x_{i} \in \{0, 1\}, \; \forall i \in \{1, 2, \dots, n\} \end{aligned}$

(7)

In Equation (7), B is the advertisement budget considered by the company, which means that the total cost of selecting seed nodes must not exceed budget B.
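To make the two-objective formulation concrete, the following Python sketch evaluates a candidate seed set as the pair (spread, negative cost) under the budget constraint and compares two such pairs by Pareto dominance. The names `lcim_objectives` and `dominates`, the `spread` callable, and the `REACH` table standing in for a real diffusion model are all hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional, Set, Tuple

Node = Hashable
Objectives = Tuple[float, float]  # (|sigma(X)|, -Cost(X)), both to be maximized


def lcim_objectives(seeds: Iterable[Node],
                    cost: Dict[Node, float],
                    budget: float,
                    spread: Callable[[Set[Node]], Set[Node]]) -> Optional[Objectives]:
    """Return (|sigma(X)|, -Cost(X)), or None if the budget constraint is violated."""
    seed_set = set(seeds)
    total_cost = sum(cost[v] for v in seed_set)
    if total_cost > budget:
        return None  # infeasible: Cost(X) exceeds B
    return (len(spread(seed_set)), -total_cost)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is at least as good as b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


# Hypothetical stand-in for a diffusion model: each seed activates a fixed set.
REACH = {"a": {"a", "c", "d"}, "b": {"b"}, "c": {"c", "d"}}


def toy_spread(seeds: Set[Node]) -> Set[Node]:
    activated: Set[Node] = set()
    for s in seeds:
        activated |= REACH[s]
    return activated


UNIT_COST = {"a": 1, "b": 1, "c": 1}

print(lcim_objectives({"a"}, UNIT_COST, 2, toy_spread))       # (3, -1)
print(lcim_objectives({"a", "b"}, UNIT_COST, 2, toy_spread))  # (4, -2)
```

Here `{"a"}` dominates `{"c"}` (same cost, larger spread), while `{"a"}` and `{"a", "b"}` do not dominate each other: the second reaches one more node at one more unit of cost, which is exactly the trade-off the two conflicting sub-objectives describe.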

4.2. Discrete Encoding Scheme

In the classical crow search algorithm, each crow moves in a continuous d-dimensional search space, where each component of a position vector can take any value in a continuous real number field. The influence maximization problem, on the other hand, studies a binary relationship, and the search space used to solve it is discrete. Therefore, a suitable discrete encoding scheme is needed to ensure that the search process is mapped from the continuous search space to the discrete search space without losing the operational integrity of the algorithm. In 2018, De Souza et al. [32] proposed a discrete binary crow search algorithm (BCSA) for solving the feature selection problem. Like the discretization process of the PSO algorithm, the position vector of each crow in the search space is mapped to binary values of 0 or 1, and the crow moves through the search space by flipping a varying number of bits. This mapping process requires transfer functions. Mirjalili et al. [33] evaluated six transfer functions of two types, namely S-shaped functions represented by the sigmoid and V-shaped functions represented by the absolute value of the hyperbolic tangent. Using 25 benchmark optimization functions, they verified that the V-shaped transfer functions significantly improved the performance of the original binary PSO. In the MOCSA algorithm proposed in this paper, the V-shaped transfer function shown in Equation (8) is used to map the position vector.

$T(x)$ is a V-shaped curve that compresses a real-valued input to the range [0, 1], as shown in Figure 1.

$T(x) = \left| \frac{\sqrt{2}}{\pi} \int_{0}^{\frac{\sqrt{\pi}}{2} x} e^{-t^{2}} \, dt \right|$

(8)

$x_{id}^{*}(iter) = \begin{cases} 1, & T(x_{id}^{*}(iter)) > rand() \\ 0, & T(x_{id}^{*}(iter)) \leq rand() \end{cases}$

(9)

In Equation (9), $x_{id}^{*}(iter)$ indicates the position of crow i at iteration $iter$ in the dth dimension, and rand() is a random number drawn from a uniform distribution over the range [0.0, 1.0].
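A minimal Python sketch of the mapping in Equations (8) and (9) follows. It assumes the V-shaped function is the error-function form $|\mathrm{erf}(\frac{\sqrt{\pi}}{2} x)|$ from the family evaluated in [33], whose values span [0, 1); this normalization and the names `v_transfer` and `binarize` are assumptions of the sketch rather than the authors' code.

```python
import math
import random
from typing import List


def v_transfer(x: float) -> float:
    """V-shaped transfer function, assumed here to be |erf(sqrt(pi)/2 * x)|."""
    return abs(math.erf(math.sqrt(math.pi) / 2.0 * x))


def binarize(position: List[float], rng: random.Random) -> List[int]:
    """Map a real-valued position to a 0/1 vector following Equation (9)."""
    # Each dimension becomes 1 when T(x) exceeds a fresh uniform draw, else 0.
    return [1 if v_transfer(x) > rng.random() else 0 for x in position]


if __name__ == "__main__":
    rng = random.Random(42)
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, round(v_transfer(x), 4))
    print(binarize([-3.0, -0.2, 0.0, 0.2, 3.0], rng))
```

The curve is symmetric around zero, so components with a large magnitude in either direction are very likely to be mapped to 1 (the node is selected), while components near zero are almost always mapped to 0.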

4.3. Parameter Setting Based on Dynamic Control Strategy

Diversification means producing diversified solutions to explore space in the overall scope, while intensification means focusing on the exploitation of local areas and finding a better solution in the region. A balance between intensification and diversification must be found when choosing the best solution to improve the convergence speed of the algorithm.

In the basic CSA, the main parameter responsible for diversification and intensification is the awareness probability (AP) of the crow. As the value of AP increases, the CSA tends to explore the space at the global scale, which enhances diversification; as the value of AP decreases, the CSA tends toward local exploitation, which enhances intensification. Therefore, in this section, a dynamic awareness probability ($DAP$) is used to balance the global exploration and local exploitation of the MOCSA algorithm. $DAP$ provides higher exploration opportunities at the beginning of the run and causes the global exploration capability to decrease linearly toward the end of the run, ensuring the convergence of the algorithm. The definition of $DAP$ is given in Equation (10).

$DAP = 1 - \frac{iter}{MaxIt}$

(10)

where $iter$ is the current iteration number and $MaxIt$ is the maximum iteration number.

In addition to the awareness probability AP, another parameter that affects the convergence of the CSA is the flight length $fl$, which is a predefined constant in the basic CSA. When the value of $fl$ decreases, the algorithm tends to search the local region. As the value of $fl$ increases, the algorithm tends toward global exploration. Therefore, a small value of $fl$ may cause the algorithm to become trapped in a local optimum, while a large value of $fl$ leads to a decrease in convergence speed. Tuning $fl$ to balance local search and convergence speed is also very time-consuming and tedious. According to Emery and Clayton, crows have highly accurate spatial memory, and crows living at high altitudes store up to 30,000 pine tree seeds over a wide area and can retrieve the seeds after six months [28]. This also reflects that crows take different flight lengths to hide or retrieve food. Therefore, in order to mimic this intelligent behavior of crows, the flight length should be kept dynamically variable. At the beginning of the run, $fl$ should take a larger value to allow global exploration throughout the search space. As the number of iterations increases, the value of $fl$ should decrease in order to perform local search and improve the exploitation of the current optimal solution, i.e., to increase the probability of finding a better solution near the current solution. Based on the above analysis, a dynamic flight length ($Dfl$) adjustment strategy is proposed in this section. $Dfl$ is defined in Equation (11).

$Dfl = fl_{max} - (fl_{max} - fl_{min}) \frac{iter}{MaxIt}$

(11)

where $fl_{max}$ and $fl_{min}$ are the upper and lower bounds of the flight length. The values of both $DAP$ and $Dfl$ are obtained during each iteration of updating the crow position vector, so the algorithm complexity is not increased.
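The two schedules in Equations (10) and (11) are simple linear functions of the iteration counter, as the short Python sketch below shows. The function names and the example bounds `fl_max = 2.5` and `fl_min = 0.5` are hypothetical values chosen for illustration; the source does not state the bounds in this section.

```python
def dynamic_awareness_probability(it: int, max_it: int) -> float:
    """Equation (10): DAP decreases linearly from 1 to 0 over the run."""
    return 1.0 - it / max_it


def dynamic_flight_length(it: int, max_it: int, fl_max: float, fl_min: float) -> float:
    """Equation (11): Dfl decreases linearly from fl_max to fl_min over the run."""
    return fl_max - (fl_max - fl_min) * it / max_it


if __name__ == "__main__":
    MAX_IT = 100  # hypothetical iteration budget
    for it in (0, 25, 50, 75, 100):
        dap = dynamic_awareness_probability(it, MAX_IT)
        dfl = dynamic_flight_length(it, MAX_IT, fl_max=2.5, fl_min=0.5)
        print(f"iter={it:3d}  DAP={dap:.2f}  Dfl={dfl:.2f}")
```

Both values depend only on the current iteration number, so they can be computed once per iteration in constant time, which is consistent with the remark that the algorithm complexity is not increased.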

To sum up, the location of crow i is updated as follows:

$x^{i, iter + 1} = \begin{cases} x^{i, iter} + r_{i} \times Dfl^{i, iter} \times (m^{j, iter} - x^{i, iter}), & r_{j} \geq DAP^{j, iter} \\ \text{a random position}, & \text{otherwise} \end{cases}$

(12)

4.4. Random Walk Based on Black Hole

In the classical CSA, once crow j knows that it is being followed by crow i, crow j tricks crow i by using a random position in the space. However, the new randomly generated position in the search space may be worse than the original position and will slow down the convergence of the algorithm. To avoid the degradation of local exploitation capacity due to the randomness of the crow search, we introduce a random walk strategy based on the black hole algorithm to create a new position vector for the crow.

The black hole algorithm is a heuristic optimization algorithm based on the phenomenon of black holes in nature [34]. A black hole in the universe has a supergravity, and when the distance between the surrounding matter and the black hole is less than a certain length, i.e., the Schwarzschild radius, all matter, including light entering this radius (the horizon of the black hole) will be absorbed by the black hole. The black hole algorithm is based on this phenomenon.

According to the black hole theory, a set of randomly generated stars in the algorithm space is used to simulate a random distribution of stars in the vast universe, and the number of stars is assumed to be N. During the operation of the algorithm, the fitness values of all individual stars are calculated, then the candidate solution $x_{BH}$ with the best fitness value is selected as the black hole, and all the other stars start to move towards the black hole. Stars centered on the black hole and within the Schwarzschild radius may be captured by the black hole, accelerating the convergence rate. At the same time, in order to ensure the diversity of stars, a star has a random probability of escaping from the black hole after being captured. The calculation of the Schwarzschild radius is shown in Equation (13), where f(·) is the fitness function of stars.

$R = \frac{|f(x_{BH}^{iter})|}{\sum_{i = 1}^{N} |f(x_{i}^{iter})|}$

(13)

The rule by which stars around the black hole move towards it under its gravity is shown in Equation (14).

$x_{i}^{iter + 1} = x_{i}^{iter} + rand \times (x_{BH}^{iter} - x_{i}^{iter})$

(14)

During the operation of the algorithm, if the distance between a star and the black hole is less than the Schwarzschild radius, it will be swallowed by the black hole and disappear. In this case, in order to keep the number of candidate solutions constant, it is assumed that whenever a star is swallowed by the black hole, the algorithm has to randomly generate a new star (candidate solution) in the search space and continue a new iteration of the search until the algorithm reaches the termination condition and exits the loop.
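The black hole mechanics of Equations (13) and (14), together with the swallow-and-regenerate rule, can be sketched in Python as follows. The names `schwarzschild_radius` and `black_hole_step`, the use of Euclidean distance, one scalar `rand` per star, and the box bounds `lower` and `upper` are assumptions of this sketch; the total absolute fitness is also assumed to be non-zero.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def schwarzschild_radius(fitnesses: Sequence[float], bh_index: int) -> float:
    """Equation (13): |f(x_BH)| divided by the sum of |f(x_i)| over all stars."""
    return abs(fitnesses[bh_index]) / sum(abs(f) for f in fitnesses)


def black_hole_step(stars: List[Vector],
                    fitness: Callable[[Vector], float],
                    lower: float,
                    upper: float,
                    rng: random.Random) -> Tuple[List[Vector], float]:
    """Move every star toward the best star and regenerate swallowed stars."""
    fits = [fitness(s) for s in stars]
    bh = max(range(len(stars)), key=fits.__getitem__)  # best fitness is the black hole
    radius = schwarzschild_radius(fits, bh)
    x_bh = stars[bh]

    moved: List[Vector] = []
    for i, star in enumerate(stars):
        if i == bh:
            moved.append(list(star))  # the black hole itself does not move
            continue
        r = rng.random()
        # Equation (14): step toward the black hole by a random fraction.
        new = [a + r * (b - a) for a, b in zip(star, x_bh)]
        if math.dist(new, x_bh) < radius:
            # Swallowed: replace with a new random star to keep N constant.
            new = [rng.uniform(lower, upper) for _ in new]
        moved.append(new)
    return moved, radius


if __name__ == "__main__":
    rng = random.Random(1)
    toy_fitness = lambda s: 1.0 / (1.0 + sum(v * v for v in s))  # peak at origin
    stars = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(5)]
    stars, radius = black_hole_step(stars, toy_fitness, -5.0, 5.0, rng)
    print(radius, stars)
```

The population size stays constant across a step because every swallowed star is replaced by a freshly sampled one, which is what preserves diversity while the remaining stars contract toward the current best solution.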

After introducing the black hole theory into the MOCSA algorithm, crows within the Schwarzschild radius will be captured by the black hole, which will enhance the capability of local exploitation. Crows also have a random probability of escaping from the black hole to form a new position vector, which expands the search area for crows. In this way, the ability of local exploitation of the algorithm is improved without losing the global exploration ability, which improves the convergence accuracy of the algorithm and helps to get the global optimal solution quickly.

We set R as the Schwarzschild radius and p as the probability of escape. When $r_{j} < DAP^{j,iter}$, the black hole-based random walk mechanism is triggered; the specific implementation process is shown in Algorithm 1.

Algorithm 1 RandomWalk $(x^{i,iter}, N, R, p)$

Input: Graph $G = (V, E)$, position vector of crow $x^{i,iter}$, number of crows N.
Output: new position of crow $x^{i,iter+1}$
1: $x^{i,iter} \leftarrow ϕ$
2: for $j = 1$ to N do
3:   $temp$ ← random position in search space
4:   if $T(R) > p$ then
5:     $x^{i,iter+1} = x^{i,iter} + r_{i} * (m^{j,iter} - x^{i,iter})$
6:     Convert $x^{i,iter}$ to discretization ← Equation (9)
7:   else
8:     $x^{i,iter+1}$ ← $temp$
9:   end if
10: end for

4.5. Framework of MOCSA

In this section, the MOCSA is proposed to solve the LCIM problem on the basis of a discrete encoding scheme, dynamic parameter settings, and a black hole-based random walk strategy. The goal is to select seed nodes at the least cost to obtain the maximum influence spread. For the LCIM problem, we assume that each node has a fixed unit cost, so the value of $c_{i}$ is set to 1 in Equation (7). The fitness value of the total objective function $F(X)$ is then calculated from the number of activated nodes and the number of seed nodes, i.e., $F(X) = |σ(X)| - |X|$. The pseudocode of the proposed algorithm is shown in Algorithm 2. The steps of MOCSA are as follows:

Step 1. Define the objective function and related parameters. The objective function and its solution space are defined, and the parameters used in MOCSA are assigned: the number of crows (N), the maximum number of iterations ($MaxIt$), the bounds of the flight length ($fl_{max}$ and $fl_{min}$), and the random number (r).
Step 2. Initialize the population. According to the discrete encoding scheme, the initial position and memory vectors are generated randomly. Each crow is evaluated using the multi-objective function to generate the non-dominated solutions for the first iteration.
Step 3. Global exploration and local exploitation. According to the proposed evolutionary mechanism, the position and memory vectors of the crows are updated, and a balance between exploration and exploitation is achieved using the black hole-based random walk strategy.
Step 4. Evaluate and update new solutions. The objective functions are evaluated for this iteration, and the non-dominated solutions are selected and updated.
Step 5. Output the solution. If the number of iterations has reached the maximum, output the optimal solution; otherwise, return to Step 3.
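The selection of non-dominated solutions in Step 4 can be illustrated with a short Python sketch for the two objectives considered here (maximize influence spread, minimize seed cost). The names `dominates` and `non_dominated` and the sample values are hypothetical; the sketch uses a simple pairwise comparison rather than any particular archive structure from the paper.

```python
from typing import List, Tuple

Objectives = Tuple[float, float]  # (influence spread, seed cost)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated(points: List[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [(32, 9), (32, 12), (30, 8), (25, 10)]
front = non_dominated(candidates)
print(front)
```

In this example, (32, 12) and (25, 10) are both dominated by (32, 9), while (32, 9) and (30, 8) trade spread against cost and therefore both remain on the front.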

Algorithm 2 MOCSA for LCIM.

Input: Graph $G = (V, E)$, number of crows N, maximum number of iterations $MaxIt$, the bounds of flight $fl_{max}$ and $fl_{min}$, random number $r_{i}$, multi-objective functions.
Output: Pareto solution of objective functions
1: Initialize iterator $iter = 0$
2: Randomly initialize position vector x
3: Randomly initialize memory vector m
4: Evaluate fitness of objective functions for each $x^{i,iter}$
5: Generate the first iteration of non-dominated solutions according to the fitness of objective functions
6: while $iter < MaxIt$ do
7:   Calculate the dynamic awareness probability $DAP \leftarrow$ Equation (10)
8:   Calculate the dynamic flight length $Dfl \leftarrow$ Equation (11)
9:   for $i = 1$ to N do
10:     for $d = 1$ to V do
11:       $j = random(N)$
12:       if $r_{j} ⩾ DAP^{j,iter}$ then
13:         Update the position vector $x^{i,iter} \leftarrow$ Equation (12)
14:         Convert $x^{i,iter}$ to discretization ← Equation (9)
15:       else
16:         Update the position vector $x^{i,iter} \leftarrow RandomWalk(x^{i,iter}, N, R, p)$
17:       end if
18:     end for
19:   end for
20:   Evaluate objective functions for the new solutions
21:   Find non-dominated solutions
22: end while

The specific framework of MOCSA is shown in Algorithm 2. First, the position and memory vectors of the crows are randomly initialized by the V-shaped transfer function in the discrete search space (lines 1–3). The fitness value of each crow is then evaluated and the non-dominated solutions are found (lines 4 and 5). The algorithm then begins to iterate. In each iteration, all crows in the solution space are moved to new positions according to the proposed evolutionary rules, and their fitness values are calculated. The global exploration and local exploitation of the algorithm are balanced using the dynamic awareness probability, the dynamic flight length, and the black hole-based random walk scheme (lines 6–19). The fitness functions are calculated for the new solutions, and the non-dominated ones are selected to update the solution set (lines 20 and 21). To update the solution set, the fitness value $F(X)$ in the current iteration is first compared with the globally optimal fitness value $F(X)_{best}$. If the two values are equal, the influence spread is compared and the solutions with the higher spread values are selected. If the values of fitness are equal, the solution with high propagation values will inevitably have fewer seed nodes, which meets the requirements of the objective function. Otherwise, the solution with the higher fitness value is chosen: when $F(X) > F(X)_{best}$, the value of $F(X)_{best}$ is replaced with the value of $F(X)$. When the number of iterations reaches its maximum, the algorithm stops and the Pareto frontiers are generated. The flowchart of MOCSA is provided in Figure 2.
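The comparison rule for updating the best solution can be sketched in Python as follows. The `Solution` type and the `update_best` function are hypothetical names introduced for illustration, and the fitness is taken as $F(X) = |σ(X)| - |X|$ from the unit-cost definition above; this is one reading of the rule in the text, not the authors' code.

```python
from typing import NamedTuple


class Solution(NamedTuple):
    spread: int  # number of activated nodes, |sigma(X)|
    seeds: int   # number of seed nodes, |X|

    @property
    def fitness(self) -> int:
        """Total objective F(X) = |sigma(X)| - |X| under unit seed cost."""
        return self.spread - self.seeds


def update_best(best: Solution, candidate: Solution) -> Solution:
    """Prefer higher fitness; on equal fitness, prefer the higher influence spread."""
    if candidate.fitness > best.fitness:
        return candidate
    if candidate.fitness == best.fitness and candidate.spread > best.spread:
        return candidate
    return best


best = Solution(spread=30, seeds=8)                       # F = 22
best = update_best(best, Solution(spread=32, seeds=9))    # F = 23, replaces the best
best = update_best(best, Solution(spread=31, seeds=9))    # F = 22, rejected
print(best, best.fitness)
```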

5. Experiments and Analysis

To verify the performance of the proposed MOCSA algorithm on the multi-objective influence maximization problem, six real social network data sets are collected and simulation experiments are carried out under the LT model. The experiments are run on a Linux server with an Intel Xeon Gold 2.20 GHz processor and 128 MB of RAM. The parameter $c_{i}$ is the cost of selecting node i; it is set to 1 in all experiments of this work, which means that a fixed unit cost is used when selecting any node. The total cost of selecting nodes is therefore the number of nodes in the seed set.
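To make the evaluation setting concrete, the following Python sketch runs a linear threshold (LT) diffusion on a toy weighted directed graph and computes the total objective with unit seed cost. The graph, the fixed thresholds, and the names `lt_spread` and `total_fitness` are assumptions for illustration only; in practice thresholds are usually drawn at random and the spread is averaged over many simulations, which this sketch does not do.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # directed edge (source, target)


def lt_spread(
    weights: Dict[Edge, float],
    thresholds: Dict[str, float],
    seeds: Iterable[str],
) -> Set[str]:
    """Return the activated node set of one LT diffusion with fixed thresholds."""
    active = set(seeds)
    changed = True
    while changed:  # repeat until no further node can be activated
        changed = False
        for node, theta in thresholds.items():
            if node in active:
                continue
            # Total weight arriving from in-neighbours that are already active.
            incoming = sum(w for (u, v), w in weights.items() if v == node and u in active)
            if incoming >= theta:
                active.add(node)
                changed = True
    return active


def total_fitness(activated: Set[str], seeds: Set[str], unit_cost: int = 1) -> int:
    """F(X) = |sigma(X)| - sum of seed costs, with every c_i equal to unit_cost."""
    return len(activated) - unit_cost * len(seeds)


weights = {("a", "b"): 0.6, ("a", "c"): 0.3, ("b", "c"): 0.4, ("c", "d"): 0.5}
thresholds = {"a": 0.5, "b": 0.5, "c": 0.6, "d": 0.9}
seeds = {"a"}
activated = lt_spread(weights, thresholds, seeds)
fitness = total_fitness(activated, seeds)
print(sorted(activated), fitness)
```

Seeding node a activates b directly and then c through the combined weight of a and b, while d stays inactive, so three nodes are activated at a cost of one seed.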

5.1. Datasets

The topological characteristics of the data sets used in the experiments, all of which are weighted directed networks, are shown in Table 1.

5.2. Comparison Algorithms

Four multi-objective optimization algorithms are used as benchmarks for comparison with our proposed algorithm: MOPSO, MOBA, MOBHA, and MODBO.

MOPSO [43]: the multi-objective particle swarm optimization algorithm extends the standard particle swarm optimization algorithm to the LT model to solve the multi-objective influence spread problem. Standard particle swarm optimization simulates the foraging strategy of bird flocks or fish schools: individuals search through an information-sharing mechanism in order to find the optimal solution.

MOBA [44]: the multi-objective bat algorithm is a heuristic search algorithm that simulates how bats in nature use ultrasonic waves to locate prey and detect obstacles, extended to multi-objective problems. The algorithm generates new local candidate solutions by iterating around the non-dominated solutions, which improves its local search ability.

MOBHA [45]: the multi-objective black hole algorithm is a multi-objective heuristic optimization algorithm inspired by the black hole phenomenon in astrophysics. The algorithm coordinates global exploration and local exploitation through the black hole horizon and the Schwarzschild radius.

MODBO [46]: the dung beetle optimizer (DBO) is a recently proposed method based on swarm intelligence. The DBO algorithm balances global exploration and local exploitation, including the use of a novel search mechanism of the ball-rolling dung beetle, dynamically changing R parameters, and different regional search strategies. We have extended it to form a multi-objective DBO (MODBO) algorithm to solve LCIM optimization problems.

5.3. Parameter Configuration of MOCSA

In this section, we determine the best settings for the main parameters of the MOCSA algorithm experimentally. Each experiment is run independently 30 times. In the MOCSA algorithm, $DAP$ adjusts the awareness probability dynamically according to Equation (10), $Dfl$ dynamically adjusts the flight length according to Equation (11), the flight bounds $fl_{max}$ and $fl_{min}$ are set to 1.9 and 1.0, respectively, $r_{i}$ represents a random number, N represents the number of crows, and $MaxIt$ represents the maximum number of iterations.

Flock size N is also one of the main factors affecting the performance of the MOCSA algorithm in various optimization problems. When N is small, the algorithm runs faster, but the diversity of the population decreases and the algorithm converges prematurely. When N is large, global exploration benefits, but the running time increases, which reduces the optimization efficiency of the algorithm. To determine the flock size N of the MOCSA algorithm, we select the most suitable value through fitness optimization experiments. Figure 3 shows the experimental comparison results for different values of N. $MaxIt$ is set to 100, and the values of N are set to 10, 20, 30, 50, and 100, respectively.

Figure 3a shows that, because the Dutch-College network has few nodes, the performance of the algorithm does not improve significantly as N varies. In Figure 3b, the horizontal axis shows the remaining five networks; the fitness of the objective function increases with N. However, the improvement is limited when N increases from 50 to 100, while the larger flock size increases the running time of the algorithm. Taking Slashdot, the network with the largest number of nodes, as an example, the running time is 5862.954 s when N is set to 100, which is 3.5 times the running time when N is 30, and this difference grows as the number of iterations increases. The figure shows that when N is 30, the objective function fitness also reaches a relatively high value, and the solution quality and execution efficiency of the algorithm are reasonably balanced. Therefore, N is set to 30 in all subsequent experiments.

5.4. Result Analysis

In this section, we conduct influence spread experiments on the six real social networks described above and use the algorithms in Section 5.2 as benchmarks for horizontal comparison to verify the effectiveness of the proposed MOCSA algorithm. All experiments are executed independently 30 times, the number of iterations of each algorithm on the same network is kept consistent, and the population size N is uniformly set to 30. In the MOPSO algorithm, following Coello et al. [43], the inertia weight factor $ω$ is 1, and the learning factors $c_{1}$ and $c_{2}$ are both 1. In the MOBA algorithm, following Yang et al. [44], the two acoustic frequencies $f_{min}$ and $f_{max}$ are set to 0.5 and 1.5, respectively, the loudness is set to 0.9, and the constants $α$ and $γ$ are both set to 0.9. A summary of the results, based on the multi-objective paradigm, is shown in Table 2. The $f_{1}$ and $f_{2}$ values in Table 2 correspond to the ideal points obtained for the two sub-objectives, maximum influence spread and minimum cost, respectively. $F(x)$ is the value of the total objective function. Figure 4, Figure 5 and Figure 6 show the convergence graphs, i.e., how the results of each algorithm evolve with the number of iterations.

5.4.1. The Comparison of the Influence Spread

Figure 4 depicts the influence spread of the compared algorithms on the six real social networks. The horizontal axis is the number of iterations, and the vertical axis is the influence spread. Figure 4a–c show that the influence spread of the MOCSA algorithm is comparable to that of MOPSO and higher than that of the other benchmark algorithms (MOBA, MOBHA, MODBO). Figure 4d–f show that the influence spread of the MOCSA algorithm decreases after 1000 iterations and converges after 4000 iterations, which is the result of the corresponding reduction of propagation cost.

5.4.2. The Comparison of the Seed Nodes Cost during Diffusion

Figure 5 shows how the seed node cost of the compared algorithms varies during influence spread on the six real social networks. The MOCSA, MODBO, and MOPSO algorithms show better cost control. Across the six networks, the seed node cost of the MOBA and MOBHA algorithms hardly decreases.

In Figure 5a–c, the seed node cost of the MOCSA algorithm is close to that of the MOPSO algorithm. For large-scale networks, the seed node cost of the MOCSA algorithm is significantly lower than that of the MOPSO algorithm after 1000 iterations, especially on Higgs-Reply and Slashdot, where the gap to MOPSO is pronounced, as shown in Figure 5e,f.

5.4.3. The Comparison of Fitness Function Optimization Results

In Figure 6a, the fitness values of the MOCSA and MOPSO algorithms converge to 23 after 400 iterations, achieving the same performance. Figure 6b–d show that the MOCSA and MOPSO algorithms achieve similar results; the fitness value of the objective function of these two algorithms is almost twice that of the other algorithms. Within the first 4000 iterations, the growth trend of the fitness value of the MOCSA and MOPSO algorithms is almost the same. After 2000 iterations, the fitness value of MOCSA becomes higher than that of MOPSO, and this advantage is maintained until the end of the iterations.

Figure 6e,f show that for large-scale networks such as Higgs-Reply and Slashdot, the MOCSA and MOPSO algorithms continue to maintain their advantages. Thanks to the dynamic parameter setting and the black hole-based random walk strategy, the MOCSA algorithm converges significantly faster than MOPSO. According to Figure 5, MOCSA is more likely to obtain satisfactory results when the seed node budget is low or the running time is limited, which makes it more attractive to users. Taking the Slashdot network as an example, MOCSA begins to converge after 4000 iterations; the average fitness value at that point is 30,306, which is 10.73% higher than that of MOPSO. Although the fitness value of MOPSO is slightly higher than that of MOCSA after 8000 iterations, this comes at a higher seed node cost and a longer running time. Except on the Higgs-Reply network, the MODBO algorithm ranks in the middle, slightly better than MOBA and MOBHA. In contrast, MOBA and MOBHA perform poorly: their average fitness hardly increases over the initial value during the iterations and finally converges to 19,119 and 19,344, respectively.

The approximated Pareto frontiers obtained using the different algorithms on the six real social networks are shown in Figure 7. Compared to the other algorithms, the solutions of MOBA and MOBHA are relatively dense. In contrast, the solutions of MODBO are sparsely distributed, which may be due to its unique search method. Similar to MOPSO, the proposed MOCSA algorithm provides a wider distribution of solutions in the objective space. It is clear that the objective of maximizing influence spread conflicts with the objective of minimizing seed cost. In most cases, the proposed MOCSA algorithm outperforms the benchmark algorithms. Although MOCSA may not produce the best results in every situation, it is competitive in almost all cases.

5.4.4. The Comparison of Running Time

In this section, to evaluate the efficiency of the MOCSA algorithm for solving the LCIM problem, we consider the running time of the different algorithms as another important performance measure. Figure 8 shows the running time of each algorithm on the six networks when the number of iterations is 1000. As can be seen from Figure 8, on Dutch-College, a small network with on the order of $10^{1}$ nodes, the time required by MOCSA is half that of MOPSO. As the network size increases, the MOCSA algorithm maintains its speed advantage. On the Adolescent-Health, Bitcoin-Alpha, and Advogato networks, with on the order of $10^{3}$ nodes, the running time of MOCSA is equivalent to that of MOBA and only 40% of that of MOPSO. In larger networks with on the order of $10^{4}$ nodes, MOCSA still performs well. For example, on the Slashdot network, the running time of MOCSA is 11,858.877 s, which is 64% of that of MOPSO and half of that of MOBHA. MODBO performs unsatisfactorily on almost all networks, with a runtime of 29,357.027 s on the Slashdot network; moreover, as the number of iterations increases, its runtime grows far faster than that of the other algorithms. In general, compared with the other algorithms, MOCSA has a lower time cost and is more suitable for extension to large-scale networks.
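The Slashdot figures quoted above can be put side by side with a few lines of Python. Only the two reported runtimes are taken from the text; the MOPSO runtime is not stated directly, so the value derived from the reported 64% ratio is an estimate and is labeled as such in the code.

```python
# Runtimes reported in the text for the Slashdot network (seconds, 1000 iterations).
mocsa_seconds = 11858.877
modbo_seconds = 29357.027

# How many times longer MODBO runs than MOCSA.
modbo_over_mocsa = modbo_seconds / mocsa_seconds

# Assumption: "64% of MOPSO" is read as mocsa = 0.64 * mopso, so this is only an
# implied estimate of the MOPSO runtime, not a reported measurement.
implied_mopso_seconds = mocsa_seconds / 0.64

print(f"MODBO / MOCSA runtime ratio: {modbo_over_mocsa:.2f}")
print(f"Implied MOPSO runtime: {implied_mopso_seconds:.0f} s")
```

Under this reading, MODBO takes roughly two and a half times as long as MOCSA on Slashdot.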

6. Conclusions

This work studies the problem of maximizing influence propagation while minimizing the cost of seed nodes. We extend the single-objective optimization problem of classical IM to a multi-objective optimization problem in the LT model to define the LCIM problem. To solve this problem, a new meta-heuristic algorithm, MOCSA, is proposed by combining dynamic parameter setting with a black hole-based random walk strategy, and its results are compared and analyzed. Experiments are performed on six real networks to evaluate the proposed method and compare it with the MOPSO, MOBA, MOBHA, and MODBO algorithms. The results show that our proposed MOCSA algorithm performs better than the benchmark algorithms in most cases and can improve the fitness value by about 20% or more while maintaining a relatively low cost of seed nodes. Although MOCSA may not produce the best solutions in every situation, it is competitive in almost all cases. In addition, the shorter runtime of MOCSA is another advantage in practical applications. For example, when dealing with sudden public health emergencies, the government can more effectively guide public opinion within a limited time.

In real society, the relationship between users and the cost of individuals may not be fixed, and the problem of influence spread may involve other factors. For example, user behavior, interest, acceptable time, and other factors play an important role in the propagation process, but these factors are not considered in our proposed approach. Therefore, in future work, we will design more effective methods to solve the problem of maximizing influence propagation with the co-existence of multiple factors. In addition, we will also consider the dynamic seed node cost and adjust the seed selection strategy accordingly.

Author Contributions

Conceptualization, P.W. and R.Z.; methodology, P.W.; software, P.W.; validation, P.W. and R.Z.; formal analysis, R.Z.; investigation, P.W.; resources, R.Z.; data curation, P.W.; writing—original draft preparation, P.W.; writing—review and editing, R.Z.; visualization, P.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the “Double-First Class” Major Research Programs, Educational Department of Gansu Province No. GSSYLXM-04.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets generated or analyzed during the study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Marsden, P.V.; Friedkin, N.E. Network studies of social influence. Sociol. Methods Res. 1993, 22, 127–151.
Digital 2023 Global Overview Report. 2023. Available online: https://wearesocial.com/us/blog/2023/01/digital-2023/ (accessed on 6 February 2023).
Li, F.; Du, T.C. The effectiveness of word of mouth in offline and online social networks. Expert Syst. Appl. 2017, 88, 338–351.
Brown, J.J.; Reingen, P.H. Social ties and word-of-mouth referral behavior. J. Consum. Res. 1987, 14, 350–362.
Domingos, P.; Richardson, M. Mining the network value of customers. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 26–29 August 2001; pp. 57–66.
Richardson, M.; Domingos, P. Mining knowledge-sharing sites for viral marketing. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, AB, Canada, 23–26 July 2002; pp. 61–70.
Jamali, M.; Ester, M. A matrix factorization technique with trust propagation for recommendation in social networks. In Proceedings of the 4th ACM Conference on Recommender Systems, Barcelona, Spain, 26–30 September 2010; pp. 135–142.
Ma, H.; Zhou, D.; Liu, C.; Lyu, M.R.; King, I. Recommender systems with social regularization. In Proceedings of the 4th ACM International Conference on Web Search and Data Mining, Hong Kong, China, 9–12 February 2011; pp. 287–296.
Ye, M.; Liu, X.; Lee, W.C. Exploring social influence for recommendation: A generative model approach. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, Portland, OR, USA, 12–16 August 2012; pp. 671–680.
Anstead, N.; O’Loughlin, B. Social media analysis and public opinion: The 2010 UK general election. J. Comput.-Mediat. Commun. 2015, 20, 204–220.
Han, X.; Wang, J.; Zhang, M.; Wang, X. Using social media to mine and analyze public opinion related to COVID-19 in China. Int. J. Environ. Res. Public Health 2020, 17, 2788.
Rim, H.; Lee, Y.; Yoo, S. Polarized public opinion responding to corporate social advocacy: Social network analysis of boycotters and advocators. Public Relations Rev. 2020, 46, 101869.
Fortunato, S. Community detection in graphs. Phys. Rep. 2010, 486, 75–174.
Papadopoulos, S.; Kompatsiaris, Y.; Vakali, A.; Spyridonos, P. Community detection in social media. Data Min. Knowl. Discov. 2012, 24, 515–554.
Fortunato, S.; Hric, D. Community detection in networks: A user guide. Phys. Rep. 2016, 659, 1–44.
Ma, T.; Liu, Q.; Cao, J.; Tian, Y.; Al-Dhelaan, A.; Al-Rodhaan, M. LGIEM: Global and local node influence based community detection. Future Gener. Comput. Syst. 2020, 105, 533–546.
Zhu, Y.; Lu, Z.; Bi, Y.; Wu, W.; Jiang, Y.; Li, D. Influence and profit: Two sides of the coin. In Proceedings of the 13th International Conference on Data Mining, Dallas, TX, USA, 7–10 December 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1301–1306.
Kempe, D.; Kleinberg, J.; Tardos, É. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 24–27 August 2003; pp. 137–146.
Goyal, A.; Lu, W.; Lakshmanan, L.V. CELF++: Optimizing the greedy algorithm for influence maximization in social networks. In Proceedings of the 20th International Conference Companion on World Wide Web, Hyderabad, India, 28 March–1 April 2011; pp. 47–48.
Li, W.; Zhong, K.; Wang, J.; Chen, D. A dynamic algorithm based on cohesive entropy for influence maximization in social networks. Expert Syst. Appl. 2021, 169, 114207.
Kumar, S.; Singhla, L.; Jindal, K.; Grover, K.; Panda, B. IM-ELPR: Influence maximization in social networks using label propagation based community structure. Appl. Intell. 2021, 51, 7647–7665.
Lotf, J.J.; Azgomi, M.A.; Dishabi, M.R.E. An improved influence maximization method for social networks based on genetic algorithm. Phys. A Stat. Mech. Appl. 2022, 586, 126480.
Bucur, D.; Iacca, G.; Marcelli, A.; Squillero, G.; Tonda, A. Multi-objective evolutionary algorithms for influence maximization in social networks. In Applications of Evolutionary Computation, Proceedings of the 20th European Conference, EvoApplications 2017, Amsterdam, The Netherlands, 19–21 April 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 221–233.
Konotopska, K.; Iacca, G. Graph-aware evolutionary algorithms for influence maximization. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, Lille, France, 10–14 July 2021; pp. 1467–1475.
Gong, H.; Guo, C. Influence maximization considering fairness: A multi-objective optimization approach with prior knowledge. Expert Syst. Appl. 2023, 214, 119138.
Wang, C.; Ma, L.; Ma, L.; Lai, J.W.; Zhao, J.; Wang, L.; Cheong, K.H. Identification of influential users with cost minimization via an improved moth flame optimization. J. Comput. Sci. 2023, 67, 101955.
Olivares, R.; Muñoz, F.; Riquelme, F. A multi-objective linear threshold influence spread model solved by swarm intelligence-based methods. Knowl.-Based Syst. 2021, 212, 106623.
Emery, N.J.; Clayton, N.S. The mentality of crows: Convergent evolution of intelligence in corvids and apes. Science 2004, 306, 1903–1907.
Emery, N.; Clayton, N. Erratum: Effects of experience and social context on prospective caching strategies by scrub jays. Nature 2002, 416, 349.
Dally, J.M.; Emery, N.J.; Clayton, N.S. Food-caching western scrub-jays keep track of who was watching when. Science 2006, 312, 1662–1665.
Askarzadeh, A. A novel metaheuristic method for solving constrained engineering optimization problems: Crow search algorithm. Comput. Struct. 2016, 169, 1–12.
De Souza, R.C.T.; dos Santos Coelho, L.; De Macedo, C.A.; Pierezan, J. A V-shaped binary crow search algorithm for feature selection. In Proceedings of the Congress on Evolutionary Computation (CEC), Rio de Janeiro, Brazil, 8–13 July 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8.
Mirjalili, S.; Lewis, A. S-shaped versus V-shaped transfer functions for binary particle swarm optimization. Swarm Evol. Comput. 2013, 9, 1–14.
Hatamlou, A. Black hole: A new heuristic optimization approach for data clustering. Inf. Sci. 2013, 222, 175–184.
Van de Bunt, G.G.; Van Duijn, M.A.; Snijders, T.A. Friendship networks through time: An actor-oriented dynamic statistical network model. Comput. Math. Organ. Theory 1999, 5, 167–192.
Moody, J. Peer influence groups: Identifying dense clusters in large networks. Soc. Netw. 2001, 23, 261–283.
Kumar, S.; Spezzano, F.; Subrahmanian, V.; Faloutsos, C. Edge weight prediction in weighted signed networks. In Proceedings of the 16th International Conference on Data Mining (ICDM), Barcelona, Spain, 12–15 December 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 221–230.
Kumar, S.; Hooi, B.; Makhija, D.; Kumar, M.; Faloutsos, C.; Subrahmanian, V. Rev2: Fraudulent user prediction in rating platforms. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining, Hong Kong, China, 9–12 February 2018; pp. 333–341.
Kunegis, J. KONECT: The Koblenz network collection. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 1343–1350.
Massa, P.; Salvetti, M.; Tomasoni, D. Bowling alone and trust decline in social network sites. In Proceedings of the 8th International Conference on Dependable, Autonomic and Secure Computing, Chengdu, China, 12–14 December 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 658–663.
De Domenico, M.; Lima, A.; Mougel, P.; Musolesi, M. The anatomy of a scientific rumor. Sci. Rep. 2013, 3, 2980.
Kunegis, J.; Lommatzsch, A.; Bauckhage, C. The slashdot zoo: Mining a social network with negative edges. In Proceedings of the 18th International Conference on World Wide Web, Madrid, Spain, 20–24 April 2009; pp. 741–750.
Coello, C.C.; Lechuga, M.S. MOPSO: A proposal for multiple objective particle swarm optimization. In Proceedings of the Congress on Evolutionary Computation CEC’02 (Cat. No. 02TH8600), Honolulu, HI, USA, 12–17 May 2002; IEEE: Piscataway, NJ, USA, 2002; Volume 2, pp. 1051–1056.
Yang, X.S. Bat algorithm for multi-objective optimisation. Int. J. Bio-Inspired Comput. 2011, 3, 267–274.
Ebadifard, F.; Babamir, S.M. Optimizing multi objective based workflow scheduling in cloud computing using black hole algorithm. In Proceedings of the 3rd International Conference on Web Research (ICWR), Tehran, Iran, 19–20 April 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 102–108.
Xue, J.; Shen, B. Dung beetle optimizer: A new meta-heuristic algorithm for global optimization. J. Supercomput. 2022, 79, 7305–7336.

Figure 1. The V-shaped transfer functions.

Figure 2. The flowchart of MOCSA.

Figure 3. The fitness value of the objective function corresponding to different parameters N on the six networks.

Figure 4. Influence spread of different algorithms on six real social networks.

Figure 5. The seed nodes cost of different algorithms on six real social networks.

Figure 6. Comparison on fitness of different algorithms on six real social networks.

Figure 7. Comparison on Pareto Frontiers of different algorithms on six real social networks.

Figure 8. Comparison of running time of different algorithms on six real social networks.

Table 1. Topological structure and properties of experimental network data set.

Networks	|V|	|E|	$d_{\max}$	<k>	C	Reference
Dutch-College	32	3062	290	191.375	0.903676	[35]
Adolescent-Health	2539	12,969	10	10.2158	0.141888	[36]
Bitcoin-Alpha	3783	24,186	490	12.7867	0.0780074	[37,38]
Advogato	6541	51,127	941	18	0.287089	[39,40]
Higgs-Reply	38,918	32,523	1259	65.0679	0.0058	[41]
Slashdot	77,357	516,575	426	13.0282	0.0549	[42]

|V| and |E| represent the number of nodes and edges, respectively; $d_{\max}$ is the maximum degree, <k> is the average degree, C is the average clustering coefficient, and Reference is the source of the data set.

Table 2. Computational results of different algorithms on six real social networks.

		Dutch-College	Adolescent-Health	Bitcoin-Alpha	Advogato	Higgs-Reply	Slashdot
MOCSA	$f_{1}$	32	2574	3741	4669	30,313	48,880
	$f_{2}$	9	347	669	1690	13,579	18,463
	$F (x)$	23	2227	3072	2979	16,734	30,417
MOPSO	$f_{1}$	32	2539	3784	6226	34,263	61,031
	$f_{2}$	9	350	717	3318	16,755	30,113
	$F (x)$	23	2189	3067	2908	17,508	30,918
MOBA	$f_{1}$	32	2392	3504	4656	27,451	57,681
	$f_{2}$	12	1256	1830	3210	19,360	38,562
	$F (x)$	20	1136	1674	1446	8091	19,119
MOBHA	$f_{1}$	32	2387	3524	4702	28,029	57,648
	$f_{2}$	10	1179	1792	3205	19,551	38,304
	$F (x)$	22	1208	1732	1497	8478	19,344
MODBO	$f_{1}$	32	1983	3288	2785	18,786	47,089
	$f_{2}$	10	505	833	490	10,560	23,956
	$F (x)$	22	1478	2455	2295	8226	23,133
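
The rows of Table 2 can be cross-checked with a short script. The sketch below assumes that $f_{1}$ is the influence spread, $f_{2}$ is the seed node cost, and $F(x) = f_{1} - f_{2}$; the last relation is inferred from the printed values and is not restated in this part of the article. The names `TABLE2`, `inconsistent_cells`, and `best_per_network` are illustrative.

```python
NETWORKS = ["Dutch-College", "Adolescent-Health", "Bitcoin-Alpha",
            "Advogato", "Higgs-Reply", "Slashdot"]

# Values transcribed from Table 2, one list entry per network in NETWORKS order.
TABLE2 = {
    "MOCSA": {"f1": [32, 2574, 3741, 4669, 30313, 48880],
              "f2": [9, 347, 669, 1690, 13579, 18463],
              "F": [23, 2227, 3072, 2979, 16734, 30417]},
    "MOPSO": {"f1": [32, 2539, 3784, 6226, 34263, 61031],
              "f2": [9, 350, 717, 3318, 16755, 30113],
              "F": [23, 2189, 3067, 2908, 17508, 30918]},
    "MOBA": {"f1": [32, 2392, 3504, 4656, 27451, 57681],
             "f2": [12, 1256, 1830, 3210, 19360, 38562],
             "F": [20, 1136, 1674, 1446, 8091, 19119]},
    "MOBHA": {"f1": [32, 2387, 3524, 4702, 28029, 57648],
              "f2": [10, 1179, 1792, 3205, 19551, 38304],
              "F": [22, 1208, 1732, 1497, 8478, 19344]},
    "MODBO": {"f1": [32, 1983, 3288, 2785, 18786, 47089],
              "f2": [10, 505, 833, 490, 10560, 23956],
              "F": [22, 1478, 2455, 2295, 8226, 23133]},
}


def inconsistent_cells(table):
    """List (algorithm, network) pairs where F differs from f1 - f2 (assumed relation)."""
    return [(algo, NETWORKS[i])
            for algo, rows in table.items()
            for i in range(len(NETWORKS))
            if rows["f1"][i] - rows["f2"][i] != rows["F"][i]]


def best_per_network(table):
    """Algorithm with the largest F per network; ties go to the first one listed."""
    return {net: max(table, key=lambda algo: table[algo]["F"][i])
            for i, net in enumerate(NETWORKS)}


if __name__ == "__main__":
    print("Cells where F != f1 - f2:", inconsistent_cells(TABLE2))
    for net, algo in best_per_network(TABLE2).items():
        print(f"{net}: {algo}")
```

With the values as printed, every $F(x)$ cell equals $f_{1} - f_{2}$. MOCSA has the highest $F(x)$ on four of the six networks (tied with MOPSO on Dutch-College), while MOPSO leads on Higgs-Reply and Slashdot.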

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, P.; Zhang, R. A Multi-Objective Crow Search Algorithm for Influence Maximization in Social Networks. Electronics 2023, 12, 1790. https://doi.org/10.3390/electronics12081790
