Article

A Landscape-Aware Discrete Particle Swarm Optimization for the Influence Maximization Problem in Social Networks

1
School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
2
School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
*
Author to whom correspondence should be addressed.
Symmetry 2025, 17(3), 435; https://doi.org/10.3390/sym17030435
Submission received: 11 February 2025 / Revised: 2 March 2025 / Accepted: 6 March 2025 / Published: 14 March 2025
(This article belongs to the Section Computer)

Abstract

Influence maximization (IM) is a pivotal challenge in social network analysis, which aims to identify a subset of key nodes that can maximize the information spread across networks. Traditional methods often sacrifice solution accuracy for spreading efficiency, while meta-heuristic approaches face limitations in escaping local optima and balancing exploration and exploitation. To address such challenges, this paper introduces a landscape-aware discrete particle swarm optimization (LA-DPSO) to solve the IM problem. The proposed algorithm employs a population partitioning strategy based on a fitness distance correlation index to enhance population diversity. For the two partitioned subpopulations, a global evolutionary mechanism and a variable neighborhood search mechanism are designed to make a symmetrical balance between the exploration and exploitation. The fitness landscape entropy is introduced to detect the local optima and prevent the population from premature convergence during the evolution. Experiments conducted on six real-world social networks demonstrate that the proposed LA-DPSO achieves an average performance improvement of 16% compared to state-of-the-art methods while exhibiting excellent scalability across diverse network types.

1. Introduction

Nowadays, social networks have emerged as central platforms for information dissemination and daily interaction, characterized by real-time updates, strong user interactions, and globalization. Influence maximization (IM) [1] is a crucial problem in social network analysis, with the objective of identifying a subset of key nodes to maximize the spread of intriguing information, innovative products, and novel ideas. This concept is widely applied in fields including information diffusion, marketing strategies, transportation optimization, resource distribution, political campaigns, and epidemic containment [2,3,4].
Influence maximization was first modeled as a discrete combinatorial optimization problem by Kempe et al. [5], who proved it to be NP-hard. To solve the problem more efficiently, several methods based on the seminal greedy algorithm have been proposed, including CELF and CELF++ [6,7]. Such algorithms rely heavily on Monte Carlo simulations to find approximately optimal solutions and therefore still incur high computational costs, especially on large-scale networks. To overcome this bottleneck, researchers proposed topology-based heuristic methods such as degree centrality, eigenvector centrality, and gravity centrality to solve the IM problem [8,9]. While such methods avoid Monte Carlo simulations and are efficient, they cannot guarantee solution accuracy across diverse network structures. Consequently, researchers turned to meta-heuristic approaches to achieve more precise solutions [10]. However, due to the limited scalability of the heterodox fitness functions and the derivative search strategies, the solution quality often turns out to be unsatisfying. Therefore, developing effective and efficient algorithms to solve the influence maximization problem remains a hot topic in the area.
Several meta-heuristics have been proposed in the last few years to solve the influence maximization problem, such as the discrete crow search algorithm, discretized Harris’ Hawks optimization algorithm, and phased evaluation-enhanced approach [11,12,13]. Compared to traditional methods, such optimizations show the potential to enhance the solution quality through diverse evolutionary and local search strategies. However, these strategies focus solely on the population evolution rather than the characteristics of the IM problem itself, which leads to the population being trapped into local optima easily during the evolutionary process. Such bottlenecks motivate the researchers to probe the problem itself: can the distribution of the solution space for the IM problem, or the characteristics of candidate solutions, be systematically probed in advance to guide the design of effective algorithms? Building on the aforementioned need, fitness landscape analysis (FLA) emerges as a critical methodology to quantify the patterns of solution distribution, gradient dynamics, and local optima structures within the solution space, thereby offering prior knowledge for algorithm design [14]. Existing research indicates that the fitness landscape has been widely used to guide algorithm design and enable the optimization strategies to explore the search space more effectively, thereby improving the performance and convergence speed of the optimizations [15].
Motivated by this vision, we make a first attempt, in the context of IM, to use the FLA technique to reveal the “landscape” characteristics of information propagation pathways (e.g., flat regions, plateaus, or sharp peaks), so that the algorithm can dynamically balance exploration and exploitation. A landscape-aware discrete particle swarm optimization (LA-DPSO) is proposed for the influence maximization problem. Specifically, the algorithm comprises a discrete particle swarm optimization and a landscape guiding operation. The fitness landscape entropy is adopted to guide the evolutionary process of the population effectively. Then, the fitness distance correlation coefficient is redesigned to partition the population into two subpopulations for further evolution. A modified variable neighborhood search (VNS) mechanism is applied to the elite subpopulation to enhance the solution quality, while a global search mechanism is used for the ordinary subpopulation to thoroughly explore the solution space. The main contributions of this paper are summarized as follows:
  • The fitness landscape entropy is introduced for the first time to quantify whether the population is trapped in local optima, thereby enhancing the evolutionary efficiency.
  • A fitness distance correlation coefficient is conceived for the first time to divide the population into an ordinary subpopulation and an elite subpopulation to improve the diversity of the population.
  • The VNS mechanism is designed specifically for the IM problem to strengthen the search capabilities of the elite subpopulation, thereby improving the algorithm’s performance.
The rest of the paper is structured as follows: Section 2 discusses related works on the influence maximization problem. Section 3 presents the necessary preliminaries for the proposed algorithm. Section 4 shows the implementation details of the LA-DPSO. Experimental results and analysis are given in Section 5. Finally, the work is summarized with future research directions.

2. Related Work

2.1. Greedy Algorithms

Greedy algorithms are widely used to solve the IM problem due to their ability to provide reliable approximate solutions. Such algorithms iteratively add nodes with the highest marginal gain to the seed set until k nodes are selected. In the seminal study, Kempe et al. [5] proposed a hill-climbing greedy method with an approximation guarantee of $(1 - 1/e - \epsilon)$ of the optimal solution. However, the algorithm is inefficient because estimating each candidate node requires extensive Monte Carlo simulations. To improve the efficiency of the classical greedy algorithm, Leskovec et al. [6] explored the properties of submodular functions and introduced the Cost-Effective Lazy Forward selection (CELF) algorithm, which achieved a speedup of up to 700 times. To improve the efficiency in large-scale networks, Lu et al. [16] proposed a probability-based recursive method that selects the node with the highest current evaluation as the seed node by using a greedy strategy. Subsequently, Lozano-Osorio et al. [17] developed a fast greedy randomized adaptive search algorithm by integrating a stochastic two-phase greedy construction technique with an intelligent neighborhood search strategy.
Greedy strategy-based methods can achieve high accuracy but are inefficient. Although recent studies have improved the running time, such algorithms still struggle with excessive time consumption and even sacrifice the algorithms’ accuracy when applied to large-scale social networks.

2.2. Heuristic Algorithms

In consideration of the time-consuming nature of the greedy strategies, some researchers turned to topology-based heuristic methods to solve the IM problem. Such methods evaluate the node importance by using topological features of the networks, such as degree centrality, eigenvector centrality, and gravity centrality [1,18]. However, such methods often focus merely on a single attribute of the network but neglect to consider integrating other topological structures. To address this limitation, Yang et al. [19] proposed a local similarity metric that dynamically adjusts according to the network topological features. It can achieve a good balance between the time complexity and solution accuracy. To alleviate clustering phenomena when selecting seed nodes, Li et al. [20] introduced a dynamic algorithm based on cohesive entropy, which adopts the node topological similarity and relative entropy, instead of Euclidean distance, to identify seed nodes. For large-scale networks, Kianian and Rostamnia [21] developed an efficient heuristic independent path algorithm, which leverages node characteristics and pruning techniques to approximate the influence spread. Recently, Zhu et al. [22] proposed a semi-local centrality method that integrates the concepts of local average shortest path and extended neighborhood to enhance efficiency in handling large networks.
Heuristic methods demonstrate significant efficiency in large-scale networks, but their accuracy is always inferior to that of greedy strategies. Furthermore, such methods are less stable due to their dependence on network topology.

2.3. Community-Based Algorithms

Social networks often exhibit community structures, where nodes within a community are densely connected while connections between communities are relatively sparse. This facilitates easier information spread within communities. To identify influential nodes in the network, Rao and Chowdary [23] proposed an efficient community-based influence maximization model, which identifies seed nodes with maximum influence through a two-stage process that filters high-quality communities and candidate nodes. To detect overlapping communities, Bouyer et al. [24] proposed a fast overlapping community-based influence maximization algorithm, which utilizes the global diffusion probability to generate candidate nodes while reducing the time costs by minimizing the search space. Complementary to these approaches, Liu et al. [25] proposed a community-based backward generation network that combines community detection with backward generation networks and employs a graph traversal method to select nodes within each community. In the fair influence maximization problem, Ma et al. [26] developed a community-based evolutionary algorithm. This algorithm identifies potential nodes by using a community node selection strategy that considers community size and node attributes.
The community-based methods perform well in large-scale networks but sometimes fail to achieve accurate community partitioning, leading to the misallocation of key nodes and unsatisfying influence propagation. Moreover, such methods prioritize influence propagation within communities while neglecting cross-community influence, potentially resulting in locally optimal solutions.

2.4. Machine Learning-Based Algorithms

In recent years, with the rapid development of machine learning, new methods based on graph representation learning have emerged to solve the IM problem. Traditional methods often struggle to maintain efficiency while improving the solution quality. To solve this problem, Li et al. [27] developed a framework called PIANO, which combines graph embedding with reinforcement learning. This framework trains a Q-function to predict the marginal influence gain of nodes through their representations and selects the top k nodes based on descending Q-values. Building on these developments, Kumar et al. [28] employed Struc2Vec graph embeddings with graph neural network (GNN) regression to enable scalable influence prediction. Their subsequent work [29] introduced Graph-LSTM (GLSTM), integrating transfer learning mechanisms with centrality-based feature engineering for enhanced seed identification. However, this heavily depends on the accuracy of training labels for solution quality. To further improve the accuracy, Tang et al. [30] proposed a new GNN-based framework that integrates graph convolutional networks with graph transformers for capturing the network information to select seed nodes. Recently, Li et al. [31] introduced an improved K-shell algorithm with heterogeneous cross-comparison to solve the IM problem. The model utilizes encoder and graph convolutional networks to learn users’ potential representations of historical content preferences and topological structure. Meanwhile, it defines the heterogeneous similarity and the heterogeneous information entropy to measure users’ influence ability.
Machine learning-based methods can always promise satisfying solution quality without compromising the efficiency. However, when faced with insufficient or unbalanced training data, the model exhibits a tendency towards overfitting, resulting in suboptimal performance and constrained robustness.

2.5. Meta-Heuristic Algorithms

Meta-heuristic algorithms simulate natural phenomena to find approximate optimal solutions in the search space by using general evolutionary strategies. Gong et al. [10] proposed a local influence evaluation function to approximate the expected influence spread within the two-hop neighborhood of a candidate seed set, and they designed a discrete particle swarm optimization by redefining the particle velocity and position updating rules to solve the IM problem for the first time. However, this algorithm tends to fall into local optima easily. To solve this problem, Tang et al. [32] developed a discrete bat algorithm with probabilistic greedy local search and a candidate pool to enhance algorithm accuracy.
In dealing with the IM problem in large-scale networks, Biswas et al. [33] designed a two-stage differential evolution algorithm, which narrows the search space by selecting candidate nodes based on node scoring, followed by a differential evolution being adopted to select the optimal seed set. To improve seed set accuracy, Zhu et al. [13] introduced a phased evaluation enhancement method that combines an evolutionary algorithm with random range partitioning and a simulated annealing strategy to search for the optimal solution. To address the dynamic and large-scale nature of social networks, Li et al. [34] proposed an adaptive agent-based evolutionary method that dynamically adjusts candidate solutions based on a genetic algorithm. Considering the potential impact of unreliable transmission on social network connections, Wang et al. [35] proposed a discrete moth–flame optimization algorithm, which optimizes the seed set through local crossover and mutation schemes. Subsequently, Khatri et al. [12] introduced a discretized Harris hawks optimization that incorporates a new neighborhood detection strategy and an adaptive random population initialization method, demonstrating enhanced performance in networks with community structures.
The meta-heuristic approaches can achieve remarkable accuracy while maintaining acceptable efficiency. However, as real-world network scales increase rapidly, such algorithms face the challenge of balancing efficiency and effectiveness in dealing with the IM problem. Meanwhile, most researchers design discrete evolutionary mechanisms in an intuitive way, which often neglects the problem characteristics and the evolutionary process of the population. Such oversights can lead to local optima or premature convergence easily. Therefore, in this paper, we make an attempt to employ the fitness landscape techniques to redesign evolutionary mechanisms from the perspective of probing into the potential solution distribution to further balance the time efficiency and performance.

3. Preliminaries

3.1. Influence Maximization Problem

The influence maximization problem seeks to identify a subset of k key users within a social network to maximize the information spread [1]. By selecting an initial set of seed nodes, the goal is to maximize the number of nodes activated through the propagation model. It can be mathematically modeled as Equation (1):
$$S^{*} = \arg\max_{|S| = k} \sigma(S), \tag{1}$$
where $S$ is a candidate seed set, $\sigma(S)$ is the estimated influence spread of $S$, and $S^{*}$ is the optimal seed set.

3.2. Diffusion Models

The independent cascade (IC) model is a widely used probabilistic model for describing information propagation in social networks [5]. In this model, when a node is activated, it attempts to activate each of its inactive neighbors with a certain probability. These attempts are independent and occur only once. If a neighboring node is successfully activated, it attempts to activate its own neighbors in the next round. This iterative process continues until no new nodes are activated.
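The cascade described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the adjacency-dictionary representation, the single uniform probability `p`, and the function names `simulate_ic` and `estimate_spread` are assumptions made for the example.

```python
import random


def simulate_ic(adj, seeds, p, rng):
    """Run one independent cascade; return the set of activated nodes."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in adj.get(u, ()):
                # u gets exactly one attempt on each still-inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        # Only nodes activated in this round spread in the next round.
        frontier = newly_active
    return active


def estimate_spread(adj, seeds, p, runs=1000, seed=0):
    """Average the cascade size over `runs` Monte Carlo simulations."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(adj, seeds, p, rng)) for _ in range(runs))
    return total / runs
```

Averaging many such runs is what makes simulation-based greedy methods expensive, which is the cost the estimating function of the next subsection is meant to avoid.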

3.3. Influence Estimating Function

Expected diffusion value ($EDV$) [36] is a metric used to estimate the influence propagation of given nodes based on the IC model. This function approximates the influence spread by calculating the number of one-hop neighbor nodes that can be influenced by the seed set $S$. The influence estimating function is defined as Equation (2):
$$EDV(S) = k + \sum_{i \in N_S^{(1)} \setminus S} \left(1 - (1 - p)^{\tau(i)}\right), \tag{2}$$
where $k$ represents the number of nodes in the candidate seed set $S$, $N_S^{(1)}$ denotes the direct neighbors (one-hop neighbors) of the seed set, and $p$ is the activation probability. $\tau(i)$ represents the number of connections between node $i$ and the seed nodes in $S$.
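As a rough illustration of Equation (2), the following sketch computes the EDV on an undirected graph stored as an adjacency dictionary. The graph representation and the function name `edv` are assumptions made for this example.

```python
def edv(adj, seeds, p):
    """Expected diffusion value of a seed set under activation probability p."""
    seeds = set(seeds)
    # tau[i]: number of edges between non-seed node i and the seed set.
    tau = {}
    for s in seeds:
        for v in adj.get(s, ()):
            if v not in seeds:
                tau[v] = tau.get(v, 0) + 1
    # Each one-hop neighbor i is activated with probability 1 - (1 - p)^tau(i).
    return len(seeds) + sum(1 - (1 - p) ** t for t in tau.values())
```

For instance, on the graph `{0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}` with seeds `{0, 1}` and `p = 0.5`, node 2 is the only one-hop neighbor and has two links to the seed set, so the value is 2 + (1 - 0.25) = 2.75.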

3.4. Fitness Landscape Metrics

The fitness distance correlation ($FDC$) [37] assesses problem complexity by examining the correlation between individual fitness and distance within the search space. It reflects certain characteristics of the optimization problem and has been incorporated into various optimization algorithms. The formulas for calculating the $FDC$ value $r_{FD}$ are defined in Equations (3) and (4):
$$r_{FD} = \frac{c_{FD}}{S_D S_F}, \tag{3}$$
$$c_{FD} = \frac{1}{n} \sum_{i=1}^{n} \left(f(i) - \bar{f}\right)\left(d(i) - \bar{d}\right), \tag{4}$$
where $n$ is the total number of individuals and $f(i)$ and $d(i)$ are the fitness value and the distance from the best individual of the $i$th individual, respectively. $S_D$ and $S_F$ are the standard deviations of the distance and fitness values, respectively. $\bar{f}$ represents the average fitness, and $\bar{d}$ is the average distance.
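Equations (3) and (4) amount to a Pearson-style correlation between fitness and distance. A minimal sketch follows; it assumes population standard deviations (dividing by $n$), which the text does not specify, and the function name is illustrative.

```python
import math


def fitness_distance_correlation(fitness, distance):
    """Return r_FD for paired fitness values and distances to the best individual."""
    n = len(fitness)
    f_mean = sum(fitness) / n
    d_mean = sum(distance) / n
    # Equation (4): covariance between fitness and distance.
    c_fd = sum((f - f_mean) * (d - d_mean)
               for f, d in zip(fitness, distance)) / n
    # Assumption: population standard deviations (divide by n).
    s_f = math.sqrt(sum((f - f_mean) ** 2 for f in fitness) / n)
    s_d = math.sqrt(sum((d - d_mean) ** 2 for d in distance) / n)
    if s_f == 0 or s_d == 0:
        raise ValueError("r_FD is undefined when fitness or distance is constant")
    # Equation (3).
    return c_fd / (s_d * s_f)
```

With this normalization, a value near -1 means that fitness rises as the distance to the best individual shrinks, and a value near +1 means the opposite.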
The entropy metric [38] generates sample points through random walk sampling and estimates the landscape roughness by using the entropy of the probability distribution of rugged and non-rugged elements in the sequence. By calculating the entropy of the fitness value distribution in the solution space, it aids in designing optimization algorithms and guiding population evolution. The calculation proceeds through three steps:
(1) Perform a random walk on the landscape to generate a time series of fitness values, $\{f_t\}_{t=0}^{n}$.
(2) The resulting time series are converted into a string $S(\epsilon)$ according to a threshold $\epsilon$, calculated by Equation (5). The parameter $\epsilon$ is a real number that determines the accuracy of the string computation.
$$S_i = \Psi_{f_t}(i, \epsilon) = \begin{cases} \bar{1}, & \text{if } f_i - f_{i-1} < -\epsilon \\ 0, & \text{if } |f_i - f_{i-1}| \le \epsilon \\ 1, & \text{if } f_i - f_{i-1} > \epsilon \end{cases} \tag{5}$$
(3) Calculate the entropy metric based on $S(\epsilon)$ according to Equation (6):
$$H(\epsilon) = -\sum_{p \neq q} P_{[pq]} \log_6 P_{[pq]}, \qquad P_{[pq]} = \frac{n_{[pq]}}{n}, \tag{6}$$
where $p$ and $q$ are elements of the set $\{\bar{1}, 0, 1\}$ and $n_{[pq]}$ is the number of sub-blocks $pq$ in the string $S(\epsilon)$ with $p \neq q$.
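The three steps above can be sketched as follows. The sketch takes the fitness series of a walk as input, encodes $\bar{1}$ as `-1`, and assumes that $n$ in Equation (6) is the length of the string $S(\epsilon)$; the text does not fix that convention, and the function names are illustrative.

```python
import math


def symbol_string(fitness, eps):
    """Equation (5): encode consecutive fitness differences as -1, 0, or 1."""
    symbols = []
    for prev, cur in zip(fitness, fitness[1:]):
        diff = cur - prev
        if diff < -eps:
            symbols.append(-1)
        elif diff > eps:
            symbols.append(1)
        else:
            symbols.append(0)
    return symbols


def landscape_entropy(fitness, eps):
    """Equation (6): entropy over sub-blocks pq with p != q."""
    s = symbol_string(fitness, eps)
    n = len(s)  # assumption: n is the length of the string S(eps)
    counts = {}
    for p, q in zip(s, s[1:]):
        if p != q:
            counts[(p, q)] = counts.get((p, q), 0) + 1
    if not counts:
        return 0.0  # no rugged sub-blocks: the walk looks smooth
    # Base 6 because there are six possible sub-blocks pq with p != q.
    return -sum((c / n) * math.log(c / n, 6) for c in counts.values())
```

A strictly monotone series yields zero entropy, while an alternating series produces a positive value, which matches the intuition that entropy grows with ruggedness.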

3.5. Particle Swarm Optimization

Particle swarm optimization (PSO) is a population-based optimization algorithm [39]. It emulates the social behavior of bird flocks to find optimal solutions through cooperation and competition among individuals, known as particles. In PSO, each particle represents a potential solution, with its position and velocity continuously updated in the search space. Particles adjust their trajectories based on individual and swarm experiences to seek the global optimum. The updating formulas are defined in Equation (7):
$$v_i^{t+1} = w \cdot v_i^{t} + c_1 \cdot r_1 \cdot (p_i^{t} - x_i^{t}) + c_2 \cdot r_2 \cdot (g^{t} - x_i^{t}), \qquad x_i^{t+1} = x_i^{t} + v_i^{t+1}, \tag{7}$$
where $v_i^{t}$ represents the velocity of particle $i$ at time $t$, $w$ denotes the inertia weight, and $c_1$ and $c_2$ are the learning rates, while $r_1$ and $r_2$ are random numbers. Additionally, $p_i^{t}$ is the historical best position of particle $i$ until the $t$-th generation, $g^{t}$ is the global best position, and $x_i^{t}$ is the position of particle $i$ at time $t$.
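For reference, one continuous PSO step following Equation (7) might look like the sketch below. The default values of `w`, `c1`, and `c2` are illustrative only, and drawing fresh random numbers per dimension is a common convention assumed here rather than something the text prescribes.

```python
import random


def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """Apply Equation (7) to one particle; return (new_position, new_velocity)."""
    rng = rng or random.Random()
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # Inertia + pull toward the personal best + pull toward the global best.
        vi_next = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vi_next)
        new_x.append(xi + vi_next)
    return new_x, new_v
```

A particle that already sits on both its personal best and the global best with zero velocity does not move, which is one reason discrete variants add dedicated escape mechanisms.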

4. Proposed Algorithm

In this section, the landscape-aware discrete particle swarm optimization proposed for the influence maximization problem is described in detail. Figure 1 shows the LA-DPSO flowchart, which includes the following key operations:
Operation 1: Initialize the velocity vector $V$ and the position vector $X$ for each individual in the population, as well as the local best position vector $Pbest$ and the global best position vector $Gbest$, according to Algorithm 1.
Operation 2: Update the velocity vector $V$ and the position vector $X$ of the individuals in the population according to Equation (8) and Equation (10), respectively.
Operation 3: Compare each particle’s current fitness with its $Pbest$ fitness. If the current fitness is superior, update $Pbest$ with the current position. Then, identify the particle with the best fitness among all $Pbest$ values. If its fitness exceeds that of the current $Gbest$, update $Gbest$ with the particle’s position.
Operation 4: Enhance the global best position vector $Gbest$ according to the refining search strategy (Algorithm 2).
Operation 5: Calculate the entropy metric $H$ of the local best position vector according to Equation (6) to determine whether it is trapped in a local optimum.
Operation 6: If the $H$ value in the current generation is different from that of the previous generation, continue the iteration.
Operation 7: If the $H$ value in the current generation does not change from the previous generation, implement the fitness landscape-aware evolution. Calculate the correlation coefficient $r$ of the population by Equation (11), based on which the population is divided into an ordinary subpopulation and an elite subpopulation.
Operation 8: A new population is generated after applying a global search mechanism (Algorithm 3) to the ordinary subpopulation and a variable neighborhood search (Algorithm 4) to the elite subpopulation. Operations 2–8 are performed iteratively until the stopping criterion, i.e., the maximal iteration, is met.
Algorithm 1 Initialization(G, pop, k)
Input: Graph G = (V, E), the population size pop, and the seed set size k.
Output: The initial velocity vector V, the position vector X, the local best Pbest, and the global best Gbest.
for i = 1 to pop do
    X[i] ← Select k highest degree nodes.
    for j = 1 to k do
        if rand(0, 1) < 0.5 then
            X[i][j] ← replace(X[i][j], V ∖ X[i]);
        Pbest[i][j] ← X[i][j];
        V[i][j] ← 0;
    Evaluate the fitness value based on EDV(Pbest[i]).
Gbest ← Select the individual with the best fitness value from Pbest.
return V, X, Pbest, Gbest
Algorithm 2 Refining_Search_Mechanism(G, S)
Input: Graph G = (V, E) and the candidate seed set S.
Output: Enhanced seed set S.
for each u ∈ S do
    for each v ∈ G.neighbor(u) do
        S′ ← (S ∖ {u}) ∪ {v};
        if EDV(S′) > EDV(S) then
            S ← S′;
return S
Algorithm 3 Global_Search_Mechanism(G, k)
Input: Graph G = (V, E) and the size of seed set k.
Output: Node set S.
S ← ∅;
Sort nodes V by eigenvector centrality in descending order to form the list of nodes nodelist.
for i = 1 to k do
    up_bound ← i + k;
    Select node randomly from nodelist[1 : up_bound];
    S ← S ∪ {node};
return S
Algorithm 4 Variable_Neighbourhood_Search(G, S)
Input: Graph G = (V, E) and the candidate seed set S.
Output: Improved seed set S*.
S* ← S;
for each u ∈ S do
    S1 ← S1(G, u) ∖ S;
    S2 ← S2(G, u) ∖ (S1 ∪ S);
    for each node ∈ S1 do
        S′ ← (S* ∖ {u}) ∪ {node};
        if EDV(S′) > EDV(S*) then
            S* ← S′;
            return S*;
    for each node ∈ S2 do
        S′ ← (S* ∖ {u}) ∪ {node};
        if EDV(S′) > EDV(S*) then
            S* ← S′;
            return S*;
return S*

4.1. Initialization

In the initialization, the velocity vectors of all particles in the population are set to zero. For the position vector of each individual, the $k$ nodes with the highest degree values are selected as the initial position. To enhance diversity, a random value in $[0, 1]$ is drawn for each element of the position vector $X[i]$. When the draw triggers the perturbation (the condition $\mathrm{rand}(0, 1) < 0.5$ in Algorithm 1), $X[i][j]$ is replaced with a random node from $V \setminus X[i]$, which ensures that no duplicate nodes exist in the modified particle. The local best position is initialized with each individual’s initial position, while the global best position is set to the local best with the highest fitness value, as calculated by the $EDV$ function. The specific pseudocode is shown in Algorithm 1.
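A runnable sketch of this initialization is given below. It is one reading of Algorithm 1, not the authors' code: the adjacency-dictionary graph, the `fitness` callable (where the EDV would be plugged in), and the name `initialize` are assumptions for illustration.

```python
import random


def initialize(adj, pop, k, fitness, rng):
    """Build velocities, positions, local bests, and the global best."""
    nodes = list(adj)
    # Every particle starts from the k highest-degree nodes.
    top_k = sorted(nodes, key=lambda u: len(adj[u]), reverse=True)[:k]
    X, V = [], []
    for _ in range(pop):
        x = list(top_k)
        for j in range(len(x)):
            if rng.random() < 0.5:
                # Draw the replacement from nodes outside the particle,
                # so the seed set never contains duplicates.
                outside = [u for u in nodes if u not in x]
                if outside:
                    x[j] = rng.choice(outside)
        X.append(x)
        V.append([0] * len(x))
    # Local bests start as copies of the initial positions.
    pbest = [list(x) for x in X]
    gbest = list(max(pbest, key=fitness))
    return V, X, pbest, gbest
```

Because every particle is perturbed independently, the population spreads around the high-degree core instead of collapsing onto a single seed set.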

4.2. Evolutionary Rules

4.2.1. Updating Mechanism for Velocity

In particle swarm optimization, the velocity vector is crucial for guiding particles toward promising regions. Inspired by the updating mechanism of DPSO [10], the velocity updating rule is defined as shown in Equation (8):
$$V_i \leftarrow F\left(\omega V_i + c_1 r_1 (X_i \cap Pbest_i) + c_2 r_2 (X_i \cap Gbest)\right), \tag{8}$$
where $\omega$ is the inertia weight, which controls the magnitude of velocity updating. The terms $r_1$ and $r_2$ are random numbers between 0 and 1, introducing randomness into the velocity updating. The learning factors $c_1$ and $c_2$ determine the effect of the personal best and global best positions, respectively. Adjusting these parameters allows the algorithm to balance its preference between local exploitation and global exploration.
The operation “∩” is defined similarly to an intersection. This operator helps identify the common nodes between two candidate seed sets, which are likely to have higher influence.
For example, taking $A = \{1, 3, 5, 7\}$ as $X_i$ and $B = \{3, 4, 5, 6\}$ as $Pbest_i$, the intersection $A \cap B$ yields a 0-1 vector $(1, 0, 0, 1)$, where 0 indicates that the corresponding node is a potentially influential node, while 1 indicates that it is a less influential node. The argument to $F(\cdot)$ is a velocity vector. Assuming that the argument is $X_i$, the function $F(X_i)$ is expressed as $F(X_i) = (h_1(x_{i1}), h_2(x_{i2}), \ldots, h_k(x_{ik}))$, where $h_j(x_{ij})$ (for $j = 1, 2, \ldots, k$) is defined as a threshold function, as shown in Equation (9):
$$h_j(x_{ij}) = \begin{cases} 0, & x_{ij} < 2 \\ 1, & \text{otherwise} \end{cases} \tag{9}$$
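The velocity rule can be sketched as below. The function names are illustrative, and sharing one pair of random numbers across all dimensions of a particle is an assumption; the text does not say whether $r_1$ and $r_2$ are drawn per particle or per dimension.

```python
import random


def intersect(x, other):
    """The "∩" operator: 0 where x[j] also appears in `other`, else 1."""
    other = set(other)
    return [0 if node in other else 1 for node in x]


def threshold(value):
    """Equation (9): map a real-valued component to a 0-1 flag."""
    return 0 if value < 2 else 1


def update_velocity(v, x, pbest, gbest, w, c1, c2, rng):
    """Equation (8): discrete velocity update for one particle."""
    r1, r2 = rng.random(), rng.random()  # assumption: shared by all dimensions
    d_p = intersect(x, pbest)
    d_g = intersect(x, gbest)
    return [threshold(w * vj + c1 * r1 * dp + c2 * r2 * dg)
            for vj, dp, dg in zip(v, d_p, d_g)]
```

Calling `intersect([1, 3, 5, 7], [3, 4, 5, 6])` reproduces the vector `[1, 0, 0, 1]` from the example above: nodes 3 and 5 are shared with the personal best and are therefore marked 0.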

4.2.2. Updating Rule for Position

The position vector determines a specific solution within the search space and directly impacts the quality of the optimization results. When solving the influence maximization problem with the PSO, the updating mechanism for the position vector is redefined, as shown in Equation (10):
$$X_i \leftarrow X_i \oplus V_i, \tag{10}$$
where “⊕” indicates that if the element $v_{ij}$ in $V_i$ is 0, the corresponding position $x_{ij}$ in $X_i$ remains unchanged. Otherwise, a node is randomly selected from the set of candidates not present in $X_i$ to replace $x_{ij}$.
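A small sketch of the “⊕” operator follows. It assumes that the candidate pool is simply every node of the graph that is not currently in the particle; the source does not state whether the pool is restricted further.

```python
import random


def update_position(x, v, nodes, rng):
    """Equation (10): keep x[j] where v[j] == 0, otherwise swap in a new node."""
    new_x = list(x)
    for j, flag in enumerate(v):
        if flag == 0:
            continue
        # Assumption: candidates are all nodes not currently in the particle.
        candidates = [u for u in nodes if u not in new_x]
        if candidates:
            new_x[j] = rng.choice(candidates)
    return new_x
```

Rebuilding the candidate list after each replacement keeps the seed set free of duplicates even when several positions change in the same update.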

4.3. Refining Search Mechanism

To further optimize the current global optimal solution, the candidate seed set is refined through local search to generate an improved seed set. The specific pseudocode is shown in Algorithm 2. The process generates a new candidate seed set by iteratively replacing each node $u$ in the set $S$ with one of its neighbors $v$. If the fitness value of the newly generated set is superior, it is preserved for the next iteration. This strategy enhances the expected influence diffusion by gradually increasing the influence of the seed set.
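One way to make this refinement executable is sketched below. It is an interpretation of Algorithm 2: each seed occupies a fixed slot, neighbors of the original node at that slot are tried in turn, and neighbors already in the set are skipped so that the set size stays constant. The `fitness` callable stands in for the EDV, and the function name is illustrative.

```python
def refine(adj, seeds, fitness):
    """Greedy neighbor-swap refinement of a seed set."""
    best = list(seeds)
    best_fit = fitness(best)
    for j, u in enumerate(list(best)):
        for v in adj.get(u, ()):
            if v in best:
                continue  # swapping in an existing seed would shrink the set
            candidate = best[:j] + [v] + best[j + 1:]
            cand_fit = fitness(candidate)
            if cand_fit > best_fit:
                # Keep the improvement and continue from the new set.
                best, best_fit = candidate, cand_fit
    return best
```

Each accepted swap strictly increases the fitness, so the returned set is never worse than the input.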

4.4. Fitness Landscape-Aware Evolution

4.4.1. Population Partition

Inspired by the fitness distance correlation coefficient in the fitness landscape metric, we propose a novel discrete metric for the influence maximization problem, termed the fitness correlation coefficient r F D . The specific calculation formula is shown in Equation (11).
r F D = d i ( f i f G b e s t ) 2 d i = j = 1 k ( x [ j ] G b e s t [ j ] ) ,
where x represents individuals in the population and G b e s t is the global best position vector. The fitness value of an individual x is represented by f i , while the global optimal fitness value is indicated by f G b e s t . The distance between each individual and the global best is denoted by d i .
The correlation coefficient r is used to divide the population, with smaller values indicating closer proximity to the global optimum. This symmetric partitioning strategy ensures that both elite and ordinary subpopulations maintain balanced contributions to the exploration and exploitation, mimicking the symmetry principles observed in natural swarm behaviors. Individuals near the global optimum are classified into the elite subpopulation, while those farther away are assigned to the ordinary subpopulation. This division allows the elite subpopulation to utilize the VNS mechanism for further optimization, whereas the ordinary subpopulation performs the global search strategy, thus enhancing the overall algorithm’s search efficiency and quality.

4.4.2. Global Search Mechanism

To thoroughly explore the solution space, a global exploration mechanism is designed, as detailed in Algorithm 3. Eigenvector centrality assesses a node’s importance by considering the centrality of its neighbors. Nodes are then ranked in descending order based on the eigenvector centrality to create a list. To ensure diversity, a node is randomly selected from the node set to join the set S, and this process is repeated k times. Then, the current ordinary individual is replaced with the newly generated candidate set.
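The mechanism can be sketched with the standard library alone. The power iteration below is a simple stand-in for any eigenvector centrality routine (it iterates on A + I so that it also converges on bipartite graphs), and excluding already chosen nodes from the sampling window is an assumption added so that the result always contains distinct seeds.

```python
import math
import random


def eigenvector_centrality(adj, iters=200):
    """Approximate eigenvector centrality by power iteration on A + I."""
    score = {u: 1.0 for u in adj}
    for _ in range(iters):
        nxt = {u: score[u] + sum(score[v] for v in adj[u]) for u in adj}
        norm = math.sqrt(sum(s * s for s in nxt.values())) or 1.0
        score = {u: s / norm for u, s in nxt.items()}
    return score


def global_search(adj, k, rng):
    """Algorithm 3: sample k seeds from a sliding window of top-ranked nodes."""
    score = eigenvector_centrality(adj)
    nodelist = sorted(adj, key=score.get, reverse=True)
    seeds = []
    for i in range(1, k + 1):
        # Window of the first i + k ranked nodes, minus nodes already chosen.
        window = [u for u in nodelist[: i + k] if u not in seeds]
        if not window:
            break
        seeds.append(rng.choice(window))
    return seeds
```

Widening the window by one rank at every step keeps the sampled seeds among the most central nodes while still leaving room for random variation.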

4.4.3. Variable Neighbourhood Search

The VNS method [40] exhibits strong robustness and stability by dynamically altering the searching neighborhood structures, thereby enhancing search capability and avoiding local optima. To improve the accuracy of the elite subpopulation, a modified VNS algorithm is proposed, as detailed in Algorithm 4. The first-order neighborhood consists of the other nodes within the same clique as the present node in the elite individual, which are utilized to help the population quickly escape the local optima. The second-order neighborhood excludes the first-order neighborhood and is defined within the node’s k-core layer to broaden the search scope. If a superior solution is found in the first-order neighborhood, the search stops. Otherwise, it continues to search in the second-order neighborhood.
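The control flow of this search can be sketched as follows. The clique-based and k-core-based neighborhoods are passed in as callables, `first_order` and `second_order`, which are hypothetical helpers introduced for this example; how the neighborhoods are computed is left to the caller, and `fitness` stands in for the EDV.

```python
def variable_neighbourhood_search(seeds, first_order, second_order, fitness):
    """Return the first improving swap, trying the first-order neighborhood first."""
    best = list(seeds)
    best_fit = fitness(best)
    for j, u in enumerate(best):
        # First-order candidates, excluding nodes already in the seed set.
        n1 = [v for v in first_order(u) if v not in best]
        # Second-order candidates exclude the seed set and the first order.
        n2 = [v for v in second_order(u) if v not in best and v not in n1]
        for neighbourhood in (n1, n2):
            for v in neighbourhood:
                candidate = best[:j] + [v] + best[j + 1:]
                if fitness(candidate) > best_fit:
                    return candidate  # stop at the first improvement
    return best
```

Stopping at the first improvement keeps the cost per elite individual low, and the wider second-order neighborhood is only scanned when the first one offers no gain for the current node.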

4.5. Complexity Analysis

In this section, we analyze the time complexity of LA-DPSO, where $T_{max}$, $pop$, $k$, and $\hat{D}$ represent the maximum number of iterations, population size, seed set size, and the maximum degree of the graph, respectively. The time complexity of the initialization phase is $O(pop \cdot k \cdot \hat{D})$. In the update phase, the velocity update has a time complexity of $O(pop \cdot k \cdot \log k)$, and the position update is $O(pop \cdot k)$. The time complexity for updating the local best is $O(pop \cdot k \cdot \hat{D})$, and updating the global best is also $O(pop \cdot k \cdot \hat{D})$. Thus, the total time complexity of the update phase is $O(pop \cdot k \cdot \hat{D})$. The time complexity of the refine search for the global best is $O(pop \cdot |V|)$. In the population division strategy stage, the worst-case time complexity for the ordinary population is $O(pop^2 \cdot |V| \cdot \log |V|)$, and for the elite population, it is $O(pop^2 \cdot k \cdot \hat{D}^2)$. According to the simplification rules of big O notation, the final time complexity can be expressed as $O(T_{max} \cdot pop \cdot \max(k \cdot \hat{D}, |V|) \cdot \hat{D})$.

5. Experimental Results and Analysis

5.1. Datasets and Baselines

In this study, six real-world networks were selected based on network scale, density, and type. A summary of these datasets is provided in Table 1, where $|V|$ represents the number of nodes, $|E|$ represents the number of edges, $d_{ave}$ represents the average degree, and $c_{ave}$ represents the average clustering coefficient. These datasets can be downloaded from SNAP (http://snap.stanford.edu/data/, accessed on 1 January 2025).
To evaluate the performance of the proposed LA-DPSO, five state-of-the-art algorithms were selected for comparison.
  • CELF [6]: proposed in 2007, this is a greedy algorithm that prioritizes the selection of nodes with the highest marginal gain by conducting tens of thousands of Monte Carlo simulations on each node in the iterative rounds.
  • DPSO [10]: proposed in 2016, this is a swarm intelligence-based meta-heuristic algorithm that simulates the foraging behavior of bird flocks to identify the global best solution.
  • TS-VA-MODE [33]: proposed in 2022, this solves the influence maximization problem by reducing the number of candidate nodes through a multi-criteria decision-making approach. It integrates an improved differential evolution algorithm with multiple search operators to enhance performance.
  • DCGM++ [9]: proposed in 2023, this algorithm measures the importance of a node by calculating the degree of the node and the average degree of its neighboring nodes, and uses this measure to select the most influential nodes in the whole network.
  • ENIMNR [41]: proposed in 2024, this algorithm reduces the search space by combining shell decomposition with node representation, while employing a deep learning-based node embedding technique to detect key nodes.
In the experiments, the parameter settings of the above algorithms follow the configurations from their original papers. All procedures were implemented in Python and executed on an Intel® Xeon® 5218R CPU @ 2.10 GHz with 64 GB of memory on a Windows system.
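As a reference point for the CELF baseline, the lazy-greedy idea can be sketched as below. This is a generic illustration, not the code used in the experiments: the `spread` argument is a hypothetical callable that returns the (estimated) influence of a seed set, and the sketch assumes it behaves submodularly so that stale marginal gains are valid upper bounds.

```python
import heapq


def celf(nodes, k, spread):
    """Lazy-greedy seed selection in the spirit of CELF.

    nodes  -- iterable of candidate nodes (must be mutually comparable for tie-breaking)
    spread -- callable mapping a set of seeds to its (estimated) influence
    """
    seeds, current = set(), 0.0
    # Heap entries: (-marginal gain, node, number of seeds when the gain was computed).
    heap = [(-spread({v}), v, 0) for v in nodes]
    heapq.heapify(heap)
    while len(seeds) < k and heap:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(seeds):
            # The gain is up to date for the current seed set, so v is the best choice.
            seeds.add(v)
            current += -neg_gain
        else:
            # The gain is stale: recompute it against the current seeds and reinsert.
            gain = spread(seeds | {v}) - current
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, current
```

With a Monte Carlo spread estimate, each call to `spread` is expensive, which is why skipping most re-evaluations matters; the estimate is also noisy, so the lazy bound holds only approximately.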

5.2. Parameter Settings

Parameter settings play an important role in how well the algorithm performs on the IM problem. However, tuning one parameter at a time while fixing the others at predefined values is empirical and imprecise. To balance solution quality against algorithm runtime, the orthogonal experiment method was employed to determine the parameter configuration. This approach can effectively analyze the impact of each factor through a reduced set of experimental combinations. The proposed algorithm mainly involves five parameters, each with five levels, making the $L_{25}$ orthogonal array design suitable for the experiment. The seed set size $k$ and activation probability $p$ are set to 30 and 0.05, respectively. Table 2 and Table 3 present the orthogonal experiment results, specifically the mean and standard deviation of the influence spread of the proposed algorithm on the six networks across thirty independent runs for each parameter combination. In the tables, $Number$ denotes the sequence number of each parameter configuration, $T_{max}$ represents the number of iterations, $pop$ indicates the population size, $w$ denotes the inertia weight, and $c_1$ and $c_2$ represent the learning factors.
The orthogonal experimental results in Table 2 show that parameter setting number 19 performs best on the NetScience, Email, and CA-GrQc networks, while number 20 is the best only on the NetHEHT network, and number 18 returns the best value on the Blog and CA-HepTh networks. In the original PSO, the learning factors $c_1$ and $c_2$ are typically both set to 2.0 to balance individual and group learning. However, the original PSO is designed for continuous problems. For combinatorial optimization problems, these parameters must be adjusted according to the specific problem. The experimental results indicate that when $c_1 = 1.8$ and $c_2 = 1.8$, the proposed LA-DPSO obtains the best value in most scenarios. This is because a higher learning rate can effectively balance the solution quality with computational efficiency. Although larger values of the number of iterations ($T_{max}$), population size ($pop$), and inertia weight ($w$) can typically enhance the performance, they also inevitably increase the algorithm's computational cost. Table 3 shows that number 19 has the smallest standard deviation in most cases, which suggests minimal volatility and relative stability. To further balance efficiency and effectiveness, the parameter configuration of number 19 was selected as the parameter setting for the proposed LA-DPSO.
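For readers who want to regenerate the 25 parameter combinations, the sketch below builds a five-factor, five-level orthogonal array from linear forms modulo 5. The level lists are read from Table 2; the coefficient pairs in `COEFFS` are an assumption chosen so that the generated rows match the configurations listed in Table 2, and the source does not state how the array was actually constructed.

```python
from itertools import combinations

# Factor levels as they appear in Table 2.
LEVELS = {
    "T_max": [5, 50, 100, 150, 200],
    "pop": [10, 20, 30, 40, 50],
    "w": [0.2, 0.4, 0.6, 0.8, 1.0],
    "c1": [1.2, 1.4, 1.6, 1.8, 2.0],
    "c2": [1.2, 1.4, 1.6, 1.8, 2.0],
}

# Each column uses the level index (a*i + b*j) mod 5 for row (i, j).
# These coefficients are an assumption made to match Table 2.
COEFFS = [(1, 0), (0, 1), (4, 2), (3, 3), (2, 4)]


def l25_design():
    """Return the 25 parameter combinations as tuples (T_max, pop, w, c1, c2)."""
    rows = []
    for i in range(5):
        for j in range(5):
            idx = [(a * i + b * j) % 5 for a, b in COEFFS]
            rows.append(tuple(LEVELS[name][lv] for name, lv in zip(LEVELS, idx)))
    return rows


def is_orthogonal(rows):
    """Check that every pair of columns contains all 25 level pairs exactly once."""
    for col_a, col_b in combinations(range(len(rows[0])), 2):
        if len({(r[col_a], r[col_b]) for r in rows}) != 25:
            return False
    return True
```

Under these assumptions, `l25_design()[18]` is the configuration numbered 19 in Table 2, namely `(150, 40, 0.8, 1.8, 1.8)`.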

5.3. Comparison Experiments

In this section, we present and discuss in detail the results of the comparison experiments on the typical evaluation metrics. The experiments were conducted by varying the seed set size from 10 to 50 (with an interval of 10) under the activation probability $p = 0.05$ and setting the number of Monte Carlo simulations to 1000. In addition, other key parameters were set as follows: the number of iterations $T_{max} = 150$, population size $pop = 40$, inertia weight $w = 0.8$, and learning factors $c_1 = 1.8$ and $c_2 = 1.8$. To ensure the reliability of the experimental results, each algorithm was independently executed 30 times and the average value was recorded.

5.3.1. Ablation Study

Ablation experiments were conducted to evaluate the contribution of model components to the overall performance. To show the impact of the population partition strategy, a variant without the partition strategy, denoted No_LA-DPSO, was implemented. The results are presented in Figure 2.
The experimental results show that LA-DPSO significantly outperforms No_LA-DPSO in influence propagation across all networks, confirming the effectiveness of the population partition strategy based on the fitness landscape. This strategy can effectively balance global exploration and local exploitation by dynamically partitioning the population based on the r-value and guiding the subpopulations to explore diverse search spaces, thereby avoiding the premature convergence caused by population homogenization in traditional algorithms. As the network size and seed set size increase, LA-DPSO can identify key nodes in a more collective way and enhance the influence spread. LA-DPSO consistently shows advantages in networks such as CA-GrQc, CA-HepTh, and NetHEHT, indicating strong robustness to various network topologies.

5.3.2. Comparison on the Convergence Speed

Since both LA-DPSO and DPSO are extensions of the PSO algorithm, we analyze their convergence speed on the six real networks at $k = 30$. Figure 3 compares the convergence speed of LA-DPSO and DPSO on these networks.
The experimental results indicate that LA-DPSO outperforms DPSO in both exploration and exploitation during the optimization process, enabling it to identify and approach the global optimum more rapidly. In the evolutionary process, LA-DPSO enhances the population diversity by dividing the population into elite and ordinary subpopulations based on the fitness distance correlation coefficient. The figure clearly shows that the iterative process of LA-DPSO exhibits a steady upward trend, attributed to guidance based on fitness landscape entropy, which prevents it from getting trapped in local optima. In contrast, DPSO evolves only intermittently and is prone to premature convergence. This is because DPSO employs a simplistic discretization of the PSO mechanism, which causes the population to fall into local optima easily. Therefore, LA-DPSO makes a trade-off between the solution quality and the convergence speed.

5.3.3. Comparison on Influence Spread

To verify the accuracy of LA-DPSO on different types of networks, we compare the influence propagation of each algorithm under the IC model. The influence propagation is defined as the number of nodes activated by the seed nodes and is estimated by averaging over 1000 independent Monte Carlo simulations of the propagation process. A larger propagation range indicates higher effectiveness in information dissemination. Figure 4 presents the influence propagation results of each algorithm on the six networks.
The experimental results show that CELF consistently outperforms other algorithms across all seed set sizes, primarily due to the approximation guarantee (within a factor of about 1 − 1/e of the optimum) provided by its greedy hill-climbing strategy. LA-DPSO exhibits superior stability, ranking second only to CELF, particularly in complex solution spaces. This performance stems from its integration of a dynamic population division mechanism based on the fitness landscape, which effectively maintains the population diversity. As evidenced in Figure 4c,d, LA-DPSO demonstrates significant advantages over other algorithms (excluding CELF) in high-density, low-average degree networks, where the entropy metric mechanism prevents the algorithm from being trapped in local optima.
In contrast, DPSO relies solely on the discretization mapping of standard PSO and lacks a dynamic parameter adjustment mechanism, which leads to local optima and unstable performance. The centrality-based DCGM++ algorithm exhibits reduced effectiveness in complex real-world networks due to its sensitivity to network structural characteristics. Furthermore, TS-VA-MODE and ENIMNR exhibit clearly inferior performance, primarily due to their heavy dependence on precise candidate node selection.
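The influence spread measure used in this comparison can be estimated with a short Monte Carlo routine under the IC model. The sketch below is a minimal illustration and not the evaluation code used in the experiments: the graph is assumed to be an adjacency dictionary, a single uniform activation probability `p` is used on every edge, and the function name `ic_spread` is hypothetical.

```python
import random


def ic_spread(adj, seeds, p=0.05, runs=1000, rng=None):
    """Estimate the expected number of activated nodes under the IC model.

    adj   -- dict mapping each node to an iterable of its out-neighbours
    seeds -- initial set of active nodes
    p     -- activation probability applied uniformly to every edge
    runs  -- number of independent Monte Carlo simulations
    """
    rng = rng or random.Random()
    total = 0
    for _ in range(runs):
        active = set(seeds)
        frontier = list(active)
        while frontier:
            next_frontier = []
            for u in frontier:
                for v in adj.get(u, ()):
                    # Each newly active node gets one chance to activate each inactive neighbour.
                    if v not in active and rng.random() < p:
                        active.add(v)
                        next_frontier.append(v)
            frontier = next_frontier
        total += len(active)
    return total / runs
```

For example, `ic_spread(adj, {0}, p=0.05, runs=1000)` would mirror the activation probability and simulation count reported for these experiments.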

5.3.4. Comparison on Running Time

To verify the efficiency of the proposed LA-DPSO on different network types, we compare the running time of the six algorithms under identical conditions. Figure 5 presents the running time of each algorithm on the six real networks at k = 30 and k = 50 .
The experimental results illustrate that LA-DPSO achieves high computational efficiency while maintaining high accuracy, particularly when compared to CELF. However, due to the high time complexity of its local search mechanism, LA-DPSO can be slower than TS-VA-MODE when $k = 30$. Nonetheless, when $k = 50$, LA-DPSO is faster than TS-VA-MODE, as the latter requires more iterations to find satisfactory solutions for larger seed set sizes. DCGM++ exhibits the shortest running time across all networks but is less accurate, as it focuses solely on structural characteristics and overlooks other key factors.

5.4. Statistical Tests

To verify the statistical significance of the influence spread outcomes, statistical tests were conducted using SPSS software. In the experiments, the five values of $k$ ($k = 10, \ldots, 50$) were treated as independent optimization problems. A Wilcoxon signed-rank test was performed separately for each $k$ value across the six networks. To ensure reliability, each algorithm was independently executed 30 times, and the best value, mean, and standard deviation for each algorithm across the six datasets were recorded in Table 4 and Table 5. The average values were used for statistical analysis, with the significance level $\alpha$ set to 0.05. Table 6 presents the results, with $Z$ indicating the magnitude of the performance differences and the p-value representing the likelihood that differences at least this large would arise by chance. Typically, a p-value below 0.05 signifies a statistically significant difference between the samples. In the table, $N^{+}$ and $N^{-}$ denote the number of instances where the benchmark algorithm (e.g., LA-DPSO) performs better or worse, respectively, than the other algorithm (e.g., DPSO) in the comparison.
Based on the data in Table 6, the proposed LA-DPSO algorithm shows a clear advantage ($p < 0.05$) over DCGM++ and ENIMNR in most test cases. Although differences between CELF and LA-DPSO are not statistically significant for some seed set sizes, LA-DPSO still achieves superior overall performance. Compared to DPSO, LA-DPSO demonstrates significantly better solution performance. Additionally, LA-DPSO outperforms TS-VA-MODE across most seed set sizes, except at $k = 10$ and $k = 20$, where the differences are not significant. Overall, LA-DPSO exhibits higher stability and robustness, with statistically significant improvements over other algorithms.
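To make the test statistic concrete, the sketch below implements a Wilcoxon signed-rank test using the normal approximation without continuity or tie correction. That SPSS was run with these settings is an assumption; it is made because, with six networks, this form yields the $Z$ and p-values that appear in Table 6 when one algorithm wins on all six (or on five of six) networks. The input values are the mean influence spreads copied from Table 4 and Table 5.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Paired Wilcoxon signed-rank test (normal approximation, two-sided).

    Returns (n_minus, n_plus, z, p), where n_plus counts pairs with x > y.
    Zero differences are dropped; tied absolute differences get average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the normal tail
    n_plus = sum(d > 0 for d in diffs)
    n_minus = sum(d < 0 for d in diffs)
    return n_minus, n_plus, z, p


# Mean influence spread at k = 30 on the six networks (Table 4 and Table 5).
la_dpso_k30 = [47.861, 148.932, 99.836, 165.897, 186.424, 200.148]
dpso_k30 = [46.553, 133.205, 94.703, 153.496, 173.207, 165.761]
print(wilcoxon_signed_rank(la_dpso_k30, dpso_k30))
```

Because only six paired observations are available per test, the smallest attainable two-sided p-value under this approximation is about 0.028, which explains why that value recurs throughout Table 6.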

6. Conclusions

This paper proposes a landscape-aware discrete particle swarm optimization algorithm to solve the influence maximization problem in social networks. The fitness distance correlation and the fitness landscape entropy metric are introduced for the first time to depict the characteristics of the solution, and then a novel population partition strategy is introduced. Three distinct search strategies are designed to balance the exploitation and exploration to avoid local optima and improve the solution quality. Experimental results demonstrate that LA-DPSO outperforms the state-of-the-art algorithms across various networks. However, it is worth noting that the population partition strategy relies on relatively simple measures, without considering a mixed model of multiple indicators. Future research will focus on developing more comprehensive metrics and efficient search strategies with low time complexity to further enhance the algorithm’s performance.

Author Contributions

Conceptualization, J.F. and J.T.; methodology, J.F.; software, J.F.; validation, B.C.; formal analysis, R.Z.; investigation, B.C.; resources, J.T.; data curation, J.F.; writing—original draft preparation, J.F.; writing—review and editing, J.T.; visualization, J.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the National Natural Science Foundation of China under grant number 62162040, the Gansu Provincial University Teachers Innovation Foundation under grant number 2024A-024, the Gansu Provincial Science Fund for Distinguished Young Scholars under grant number 23JRRA766, and the Gansu Provincial Science Fund for Technological Innovation Guidance Plan under grant number 24CXGA046.

Data Availability Statement

The datasets generated during and analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Jaouadi, M.; Romdhane, L.B. A survey on influence maximization models. Expert Syst. Appl. 2024, 248, 123429.
  2. Tavasoli, A.; Shakeri, H.; Ardjmand, E.; Young, W.A., II. Incentive rate determination in viral marketing. Eur. J. Oper. Res. 2021, 289, 1169–1187.
  3. Peng, Y.; Zhao, Y.; Hu, J. On the role of community structure in evolution of opinion formation: A new bounded confidence opinion dynamics. Inf. Sci. 2023, 621, 672–690.
  4. Zhong, X.; Yang, Y.; Deng, F.; Liu, G. Rumor propagation control with anti-rumor mechanism and intermittent control strategies. IEEE Trans. Comput. Soc. Syst. 2023, 11, 2397–2409.
  5. Kempe, D.; Kleinberg, J.; Tardos, É. Maximizing the spread of influence through a social network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 24–27 August 2003; pp. 137–146.
  6. Leskovec, J.; Krause, A.; Guestrin, C.; Faloutsos, C.; VanBriesen, J.; Glance, N. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, USA, 12–15 August 2007; pp. 420–429.
  7. Goyal, A.; Lu, W.; Lakshmanan, L.V. Celf++ optimizing the greedy algorithm for influence maximization in social networks. In Proceedings of the 20th International Conference Companion on World Wide Web, Hyderabad, India, 28 March–1 April 2011; pp. 47–48.
  8. Zhang, K.; Zhou, Y.; Long, H.; Wang, C.; Hong, H.; Armaghan, S.M. Towards identifying influential nodes in complex networks using semi-local centrality metrics. J. King Saud. Univ. Comput. Inf. Sci. 2023, 35, 101798.
  9. Chen, D.; Su, H. Identification of influential nodes in complex networks with degree and average neighbor degree. IEEE J. Emerg. Sel. Top. Circuits Syst. 2023, 13, 734–742.
  10. Gong, M.; Yan, J.; Shen, B.; Ma, L.; Cai, Q. Influence maximization in social networks based on discrete particle swarm optimization. Inf. Sci. 2016, 367, 600–614.
  11. Li, H.; Zhang, R.; Zhao, Z.; Liu, X.; Yuan, Y. Identification of top-k influential nodes based on discrete crow search algorithm optimization for influence maximization. Appl. Intell. 2021, 51, 7749–7765.
  12. Khatri, I.; Choudhry, A.; Rao, A.; Tyagi, A.; Vishwakarma, D.K.; Prasad, M. Influence Maximization in social networks using discretized Harris’ Hawks Optimization algorithm. Appl. Soft Comput. 2023, 149, 111037.
  13. Zhu, E.; Wang, H.; Zhang, Y.; Zhang, K.; Liu, C. PHEE: Identifying influential nodes in social networks with a phased evaluation-enhanced search. Neurocomputing 2024, 572, 127195.
  14. Wright, S. The Roles of Mutation, Inbreeding, Crossbreeding, and Selection in Evolution. 1932. Available online: http://www.esp.org/books/6th-congress/facsimile/contents/6th-cong-p356-wright.pdf (accessed on 5 March 2025).
  15. Zou, F.; Chen, D.; Liu, H.; Cao, S.; Ji, X.; Zhang, Y. A survey of fitness landscape analysis for optimization. Neurocomputing 2022, 503, 129–139.
  16. Lu, W.X.; Zhou, C.; Wu, J. Big social network influence maximization via recursively estimating influence spread. Knowl. Based Syst. 2016, 113, 143–154.
  17. Lozano-Osorio, I.; Sánchez-Oro, J.; Duarte, A.; Cordón, Ó. A quick GRASP-based method for influence maximization in social networks. J. Ambient Intell. Humaniz. Comput. 2023, 14, 3767–3779.
  18. Singh, S.S.; Srivastva, D.; Verma, M.; Singh, J. Influence maximization frameworks, performance, challenges and directions on social network: A theoretical study. J. King. Saud. Univ. Comput. Inf. Sci. 2022, 34, 7570–7603.
  19. Yang, P.L.; Xu, G.Q.; Yu, Q.; Guo, J.W. An adaptive heuristic clustering algorithm for influence maximization in complex networks. Chaos Interdiscip. J. Nonlinear Sci. 2020, 30, 093106.
  20. Li, W.; Zhong, K.; Wang, J.; Chen, D. A dynamic algorithm based on cohesive entropy for influence maximization in social networks. Expert Syst. Appl. 2021, 169, 114207.
  21. Kianian, S.; Rostamnia, M. An efficient path-based approach for influence maximization in social networks. Expert Syst. Appl. 2021, 167, 114168.
  22. Xiao, Y.; Chen, Y.; Zhang, H.; Zhu, X.; Yang, Y.; Zhu, X. A new semi-local centrality for identifying influential nodes based on local average shortest path with extended neighborhood. Artif. Intell. Rev. 2024, 57, 115.
  23. Rao, K.V.; Chowdary, C.R. CBIM: Community-based influence maximization in multilayer networks. Inf. Sci. 2022, 609, 578–594.
  24. Bouyer, A.; Beni, H.A.; Arasteh, B.; Aghaee, Z.; Ghanbarzadeh, R. FIP: A fast overlapping community-based Influence Maximization Algorithm using probability coefficient of global diffusion in social networks. Expert Syst. Appl. 2023, 213, 118869.
  25. Liu, X.; Ye, S.; Fiumara, G.; De Meo, P. Influence nodes identifying method via community-based backward generating network framework. IEEE Trans. Netw. Sci. Eng. 2023, 11, 236–253.
  26. Ma, K.; Xu, X.; Yang, H.; Cao, R.; Zhang, L. Fair Influence Maximization in Social Networks: A Community-Based Evolutionary Algorithm. IEEE Trans. Emerg. Top. Comput. 2024, 13, 262–275.
  27. Li, H.; Xu, M.; Bhowmick, S.S.; Rayhan, J.S.; Sun, C.; Cui, J. PIANO: Influence maximization meets deep reinforcement learning. IEEE Trans. Comput. Soc. Syst. 2022, 10, 1288–1300.
  28. Kumar, S.; Mallik, A.; Khetarpal, A.; Panda, B.S. Influence maximization in social networks using graph embedding and graph neural network. Inf. Sci. 2022, 607, 1617–1636.
  29. Kumar, S.; Mallik, A.; Panda, B. Influence maximization in social networks using transfer learning via graph-based LSTM. Expert Syst. Appl. 2023, 212, 118770.
  30. Tang, J.; Qu, J.; Song, S.; Zhao, Z.; Du, Q. GCNT: Identify influential seed set effectively in social networks by integrating graph convolutional networks with graph transformers. J. King. Saud. Univ. Comput. Inf. Sci. 2024, 36, 102183.
  31. Li, Y.; Lu, T.; Li, W.; Zhang, P. HCCKshell: A heterogeneous cross-comparison improved Kshell algorithm for Influence Maximization. Infor. Process. Manag. 2024, 61, 103681.
  32. Tang, J.; Zhang, R.; Yao, Y.; Zhao, Z.; Wang, P.; Li, H.; Yuan, J. Maximizing the spread of influence via the collective intelligence of discrete bat algorithm. Knowl. Based Syst. 2018, 160, 88–103.
  33. Biswas, T.K.; Abbasi, A.; Chakrabortty, R.K. A two-stage VIKOR assisted multi-operator differential evolution approach for Influence Maximization in social networks. Expert Syst. Appl. 2022, 192, 116342.
  34. Li, W.; Hu, Y.; Jiang, C.; Wu, S.; Bai, Q.; Lai, E. ABEM: An adaptive agent-based evolutionary approach for influence maximization in dynamic social networks. Appl. Soft Comput. 2023, 136, 110062.
  35. Wang, L.; Ma, L.; Wang, C.; Xie, N.G.; Koh, J.M.; Cheong, K.H. Identifying influential spreaders in social networks through discrete moth-flame optimization. IEEE Trans. Evolut. Comput. 2021, 25, 1091–1102.
  36. Cui, L.; Hu, H.; Yu, S.; Yan, Q.; Ming, Z.; Wen, Z.; Lu, N. DDSE: A novel evolutionary algorithm based on degree-descending search strategy for influence maximization in social networks. J. Netw. Comput. Appl. 2018, 103, 119–130.
  37. Fang, J.; Liu, H.L.; Gu, F. A constrained multi-objective evolutionary algorithm based on fitness landscape indicator. Appl. Soft Comput. 2024, 166, 112128.
  38. Malan, K.M.; Engelbrecht, A.P. Quantifying ruggedness of continuous landscapes using entropy. In Proceedings of the 2009 IEEE Congress on Evolutionary Computation, Trondheim, Norway, 18–21 May 2009; pp. 1440–1447.
  39. Zhao, F.; Ji, F.; Xu, T.; Zhu, N. Hierarchical parallel search with automatic parameter configuration for particle swarm optimization. Appl. Soft Comput. 2024, 151, 111126.
  40. Koyuncuoğlu, M.U.; Demir, L. An adaptive hybrid variable-large neighborhood search algorithm for profit maximization problem in designing production lines. Comput. Ind. Eng. 2023, 175, 108871.
  41. Wei, P.; Zhou, J.; Yan, B.; Zeng, Y. ENIMNR: Enhanced node influence maximization through node representation in social networks. Chaos Solitons Fract. 2024, 186, 115192.
Figure 1. Flowchart for the proposed LA-DPSO.
Figure 2. Comparisons of influence spread between LA-DPSO and No_LA-DPSO on the six networks under p = 0.05 .
Figure 3. Comparison on the convergence speed between LA-DPSO and No_LA-DPSO on the six networks under p = 0.05 .
Figure 4. Comparison on the influence propagation of the six algorithms on six networks at different k.
Figure 5. The running time of the different algorithms in the six networks under different seed set sizes.
Table 1. Statistical characteristics of the networks.

| ID | Network | $\lvert V \rvert$ | $\lvert E \rvert$ | $d_{ave}$ | $c_{ave}$ |
|---|---|---|---|---|---|
| 1 | NetScience | 379 | 914 | 4.82 | 0.74 |
| 2 | Email | 1133 | 5451 | 9.62 | 0.22 |
| 3 | Blog | 3982 | 6803 | 3.42 | 0.28 |
| 4 | CA-GrQc | 5242 | 14,496 | 5.53 | 0.53 |
| 5 | CA-HepTh | 9877 | 25,998 | 5.26 | 0.47 |
| 6 | NetHEHT | 15,229 | 31,376 | 4.12 | 0.50 |
Table 2. The mean value of the proposed algorithm under different parameter settings based on the orthogonal experiment on the six datasets.

| Number | $T_{max}$ | $pop$ | $w$ | $c_1$ | $c_2$ | NetScience | Email | Blog | CA-GrQc | CA-HepTh | NetHEHT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 10 | 0.2 | 1.2 | 1.2 | 45.065 | 145.116 | 96.096 | 161.043 | 181.960 | 195.630 |
| 2 | 5 | 20 | 0.6 | 1.8 | 2.0 | 44.518 | 146.696 | 97.600 | 162.499 | 184.042 | 197.906 |
| 3 | 5 | 30 | 1.0 | 1.4 | 1.8 | 45.251 | 147.405 | 97.528 | 163.687 | 185.216 | 198.949 |
| 4 | 5 | 40 | 0.4 | 2.0 | 1.6 | 45.126 | 147.494 | 98.029 | 164.818 | 185.671 | 200.093 |
| 5 | 5 | 50 | 0.8 | 1.6 | 1.4 | 45.458 | 147.500 | 98.075 | 166.025 | 185.913 | 201.024 |
| 6 | 50 | 10 | 1.0 | 1.8 | 1.6 | 45.851 | 147.655 | 98.089 | 165.613 | 186.352 | 200.568 |
| 7 | 50 | 20 | 0.4 | 1.4 | 1.4 | 45.407 | 149.046 | 99.229 | 165.319 | 186.619 | 200.189 |
| 8 | 50 | 30 | 0.8 | 2.0 | 1.2 | 46.475 | 148.351 | 98.341 | 165.334 | 186.535 | 200.263 |
| 9 | 50 | 40 | 0.2 | 1.6 | 2.0 | 46.617 | 148.663 | 98.352 | 165.593 | 186.388 | 200.383 |
| 10 | 50 | 50 | 0.6 | 1.2 | 1.8 | 46.714 | 148.932 | 98.459 | 165.513 | 186.835 | 200.589 |
| 11 | 100 | 10 | 0.8 | 1.4 | 2.0 | 46.896 | 148.787 | 100.344 | 165.476 | 185.443 | 199.950 |
| 12 | 100 | 20 | 0.2 | 2.0 | 1.8 | 47.150 | 148.996 | 100.615 | 166.650 | 186.852 | 200.380 |
| 13 | 100 | 30 | 0.6 | 1.6 | 1.6 | 47.218 | 149.069 | 100.399 | 166.734 | 186.723 | 200.527 |
| 14 | 100 | 40 | 1.0 | 1.2 | 1.4 | 48.593 | 149.191 | 101.028 | 166.975 | 186.843 | 201.167 |
| 15 | 100 | 50 | 0.4 | 1.8 | 1.2 | 48.533 | 149.289 | 101.048 | 166.165 | 187.183 | 201.493 |
| 16 | 150 | 10 | 0.6 | 2.0 | 1.4 | 48.597 | 148.192 | 100.093 | 166.107 | 187.352 | 202.126 |
| 17 | 150 | 20 | 1.0 | 1.6 | 1.2 | 48.949 | 148.327 | 102.046 | 167.382 | 187.711 | 201.660 |
| 18 | 150 | 30 | 0.4 | 1.2 | 2.0 | 49.549 | 150.198 | 103.035 | 167.562 | 187.813 | 202.983 |
| 19 | 150 | 40 | 0.8 | 1.8 | 1.8 | 50.078 | 150.276 | 103.030 | 167.997 | 186.837 | 203.011 |
| 20 | 150 | 50 | 0.2 | 1.4 | 1.6 | 49.744 | 150.198 | 102.999 | 166.959 | 186.801 | 203.035 |
| 21 | 200 | 10 | 0.4 | 1.6 | 1.8 | 47.951 | 149.060 | 99.966 | 165.111 | 186.929 | 200.064 |
| 22 | 200 | 20 | 0.8 | 1.2 | 1.6 | 47.921 | 149.426 | 100.354 | 166.080 | 186.952 | 201.698 |
| 23 | 200 | 30 | 0.2 | 1.8 | 1.4 | 48.445 | 149.513 | 100.410 | 166.184 | 187.045 | 201.833 |
| 24 | 200 | 40 | 0.6 | 1.4 | 1.2 | 47.460 | 148.643 | 100.119 | 165.541 | 185.925 | 201.221 |
| 25 | 200 | 50 | 1.0 | 2.0 | 2.0 | 47.191 | 149.113 | 100.063 | 165.101 | 185.026 | 201.033 |
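Table 3. The standard deviation of the proposed algorithm under different parameter settings based on the orthogonal experiment on the six datasets.

| Number | $T_{max}$ | $pop$ | $w$ | $c_1$ | $c_2$ | NetScience | Email | Blog | CA-GrQc | CA-HepTh | NetHEHT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 10 | 0.2 | 1.2 | 1.2 | 1.037 | 2.315 | 2.999 | 2.817 | 1.537 | 1.146 |
| 2 | 5 | 20 | 0.6 | 1.8 | 2.0 | 1.401 | 2.446 | 2.211 | 3.024 | 1.475 | 2.606 |
| 3 | 5 | 30 | 1.0 | 1.4 | 1.8 | 1.528 | 3.386 | 2.510 | 2.785 | 4.036 | 2.706 |
| 4 | 5 | 40 | 0.4 | 2.0 | 1.6 | 1.103 | 1.692 | 2.782 | 3.181 | 3.109 | 1.294 |
| 5 | 5 | 50 | 0.8 | 1.6 | 1.4 | 1.536 | 2.754 | 2.631 | 3.730 | 3.042 | 2.242 |
| 6 | 50 | 10 | 1.0 | 1.8 | 1.6 | 1.421 | 2.635 | 2.927 | 2.654 | 2.365 | 1.436 |
| 7 | 50 | 20 | 0.4 | 1.4 | 1.4 | 1.123 | 1.574 | 2.851 | 2.183 | 2.821 | 4.090 |
| 8 | 50 | 30 | 0.8 | 2.0 | 1.2 | 0.978 | 1.049 | 1.077 | 1.702 | 2.012 | 3.283 |
| 9 | 50 | 40 | 0.2 | 1.6 | 2.0 | 1.502 | 2.386 | 2.253 | 1.724 | 3.038 | 2.863 |
| 10 | 50 | 50 | 0.6 | 1.2 | 1.8 | 1.366 | 2.832 | 1.587 | 2.085 | 2.707 | 3.755 |
| 11 | 100 | 10 | 0.8 | 1.4 | 2.0 | 1.197 | 1.232 | 2.204 | 1.895 | 2.633 | 2.288 |
| 12 | 100 | 20 | 0.2 | 2.0 | 1.8 | 1.395 | 2.242 | 1.862 | 1.445 | 3.310 | 2.120 |
| 13 | 100 | 30 | 0.6 | 1.6 | 1.6 | 0.320 | 0.741 | 1.466 | 0.719 | 2.381 | 0.448 |
| 14 | 100 | 40 | 1.0 | 1.2 | 1.4 | 0.213 | 0.547 | 1.675 | 0.697 | 2.091 | 1.942 |
| 15 | 100 | 50 | 0.4 | 1.8 | 1.2 | 0.293 | 1.240 | 0.373 | 0.582 | 0.624 | 1.167 |
| 16 | 150 | 10 | 0.6 | 2.0 | 1.4 | 0.214 | 0.577 | 0.646 | 0.438 | 0.493 | 0.432 |
| 17 | 150 | 20 | 1.0 | 1.6 | 1.2 | 0.244 | 0.888 | 0.781 | 0.536 | 0.668 | 0.937 |
| 18 | 150 | 30 | 0.4 | 1.2 | 2.0 | 0.208 | 0.494 | 0.641 | 0.262 | 0.457 | 0.944 |
| 19 | 150 | 40 | 0.8 | 1.8 | 1.8 | 0.210 | 0.389 | 0.335 | 0.337 | 0.251 | 1.216 |
| 20 | 150 | 50 | 0.2 | 1.4 | 1.6 | 0.234 | 0.398 | 0.453 | 0.452 | 0.257 | 1.277 |
| 21 | 200 | 10 | 0.4 | 1.6 | 1.8 | 0.258 | 1.177 | 0.396 | 0.280 | 0.488 | 0.912 |
| 22 | 200 | 20 | 0.8 | 1.2 | 1.6 | 0.292 | 0.600 | 0.531 | 0.412 | 0.439 | 0.441 |
| 23 | 200 | 30 | 0.2 | 1.8 | 1.4 | 0.232 | 0.742 | 0.699 | 0.576 | 0.368 | 0.616 |
| 24 | 200 | 40 | 0.6 | 1.4 | 1.2 | 0.330 | 0.417 | 1.151 | 0.492 | 0.810 | 0.659 |
| 25 | 200 | 50 | 1.0 | 2.0 | 2.0 | 0.349 | 1.323 | 1.122 | 0.473 | 0.504 | 0.707 |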
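Table 4. The best value, mean, and standard deviation of CELF, DPSO, and TS_VA_MODE on the six datasets.

| Network | k | CELF Mean | CELF SD | CELF Max | DPSO Mean | DPSO SD | DPSO Max | TS_VA_MODE Mean | TS_VA_MODE SD | TS_VA_MODE Max |
|---|---|---|---|---|---|---|---|---|---|---|
| NetScience | 10 | 21.773 | 0.060 | 21.857 | 20.160 | 1.428 | 22.872 | 19.060 | 2.273 | 22.462 |
| NetScience | 20 | 37.875 | 0.053 | 37.950 | 34.222 | 1.687 | 36.595 | 33.032 | 2.530 | 36.738 |
| NetScience | 30 | 51.462 | 0.051 | 51.550 | 46.553 | 1.491 | 48.960 | 45.066 | 2.136 | 48.353 |
| NetScience | 40 | 64.038 | 0.057 | 64.130 | 60.960 | 1.471 | 63.344 | 59.703 | 2.490 | 63.011 |
| NetScience | 50 | 75.936 | 0.058 | 76.011 | 73.064 | 1.618 | 75.431 | 72.057 | 2.255 | 75.754 |
| Email | 10 | 89.630 | 0.062 | 89.735 | 83.335 | 1.540 | 85.547 | 87.715 | 2.145 | 91.664 |
| Email | 20 | 123.504 | 0.058 | 123.596 | 110.755 | 1.486 | 112.943 | 121.989 | 2.412 | 125.529 |
| Email | 30 | 149.429 | 0.062 | 149.518 | 133.205 | 1.450 | 135.503 | 147.766 | 2.259 | 151.031 |
| Email | 40 | 168.807 | 0.065 | 168.898 | 156.555 | 1.393 | 159.105 | 166.944 | 2.400 | 170.420 |
| Email | 50 | 187.997 | 0.159 | 188.414 | 174.407 | 1.449 | 177.059 | 183.002 | 1.950 | 186.636 |
| Blog | 10 | 55.191 | 0.051 | 55.286 | 48.644 | 1.571 | 51.215 | 48.158 | 2.169 | 51.633 |
| Blog | 20 | 88.598 | 0.053 | 88.685 | 73.516 | 1.348 | 75.746 | 69.874 | 1.765 | 73.343 |
| Blog | 30 | 116.510 | 0.060 | 116.599 | 94.703 | 1.397 | 97.062 | 88.202 | 2.027 | 92.009 |
| Blog | 40 | 141.814 | 0.062 | 141.908 | 115.753 | 1.250 | 118.016 | 101.634 | 2.205 | 105.925 |
| Blog | 50 | 164.787 | 0.161 | 164.991 | 125.691 | 1.332 | 128.012 | 118.806 | 2.227 | 122.176 |
| CA-GrQc | 10 | 153.595 | 0.003 | 153.783 | 123.627 | 1.838 | 126.593 | 75.453 | 2.525 | 80.068 |
| CA-GrQc | 20 | 190.492 | 0.102 | 190.647 | 134.428 | 2.434 | 138.034 | 86.669 | 3.043 | 91.329 |
| CA-GrQc | 30 | 220.964 | 0.106 | 221.161 | 153.496 | 1.921 | 156.546 | 116.426 | 2.888 | 121.246 |
| CA-GrQc | 40 | 242.715 | 0.106 | 242.904 | 167.015 | 2.176 | 170.532 | 136.232 | 3.064 | 140.690 |
| CA-GrQc | 50 | 271.175 | 0.106 | 271.367 | 203.875 | 1.892 | 207.210 | 151.902 | 3.141 | 157.217 |
| CA-HepTh | 10 | 103.093 | 0.025 | 103.275 | 82.444 | 2.160 | 86.001 | 37.061 | 3.241 | 42.211 |
| CA-HepTh | 20 | 160.491 | 0.111 | 160.673 | 140.166 | 1.982 | 143.333 | 64.676 | 3.115 | 68.696 |
| CA-HepTh | 30 | 204.406 | 0.114 | 204.546 | 173.207 | 2.042 | 176.208 | 101.849 | 2.781 | 106.482 |
| CA-HepTh | 40 | 242.115 | 0.101 | 242.288 | 196.222 | 1.884 | 199.212 | 122.978 | 3.006 | 126.936 |
| CA-HepTh | 50 | 274.074 | 0.199 | 274.561 | 234.062 | 1.856 | 237.918 | 163.398 | 2.876 | 168.003 |
| NetHEHT | 10 | 107.151 | 0.109 | 107.358 | 91.591 | 2.045 | 95.359 | 90.153 | 2.673 | 94.540 |
| NetHEHT | 20 | 160.247 | 0.116 | 160.432 | 129.159 | 2.114 | 133.020 | 139.327 | 2.982 | 144.211 |
| NetHEHT | 30 | 205.017 | 0.134 | 205.218 | 165.761 | 1.999 | 169.923 | 195.416 | 2.872 | 199.742 |
| NetHEHT | 40 | 237.770 | 0.112 | 237.936 | 199.913 | 2.336 | 203.515 | 227.393 | 2.873 | 231.906 |
| NetHEHT | 50 | 267.914 | 0.114 | 268.104 | 220.307 | 1.994 | 222.977 | 255.084 | 2.510 | 259.009 |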
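Table 5. The best value, mean, and standard deviation of DCGM++, ENIMNR, and LA-DPSO on the six datasets.

| Network | k | DCGM++ Mean | DCGM++ SD | DCGM++ Max | ENIMNR Mean | ENIMNR SD | ENIMNR Max | LA-DPSO Mean | LA-DPSO SD | LA-DPSO Max |
|---|---|---|---|---|---|---|---|---|---|---|
| NetScience | 10 | 21.097 | 0.097 | 21.297 | 14.972 | 2.104 | 18.307 | 20.980 | 0.241 | 21.378 |
| NetScience | 20 | 32.679 | 0.096 | 32.809 | 27.735 | 1.817 | 30.947 | 35.158 | 0.277 | 35.485 |
| NetScience | 30 | 46.068 | 0.118 | 46.296 | 37.555 | 2.081 | 40.294 | 47.861 | 0.192 | 48.199 |
| NetScience | 40 | 58.449 | 0.126 | 58.638 | 47.814 | 1.812 | 51.688 | 61.698 | 0.203 | 62.137 |
| NetScience | 50 | 69.227 | 0.114 | 69.426 | 59.141 | 2.290 | 62.600 | 74.960 | 0.190 | 75.394 |
| Email | 10 | 86.724 | 0.119 | 86.943 | 85.375 | 1.997 | 88.384 | 85.897 | 0.226 | 86.265 |
| Email | 20 | 116.540 | 0.092 | 116.701 | 112.918 | 1.868 | 116.821 | 121.916 | 0.238 | 122.283 |
| Email | 30 | 139.937 | 0.109 | 140.141 | 138.641 | 2.006 | 141.708 | 148.932 | 0.240 | 149.339 |
| Email | 40 | 158.126 | 0.113 | 158.326 | 157.798 | 1.926 | 161.802 | 168.550 | 0.222 | 168.972 |
| Email | 50 | 173.786 | 0.125 | 173.985 | 171.678 | 2.041 | 174.866 | 188.135 | 0.259 | 188.514 |
| Blog | 10 | 47.180 | 0.112 | 47.366 | 43.124 | 1.784 | 46.508 | 50.006 | 0.232 | 50.359 |
| Blog | 20 | 64.575 | 0.130 | 64.746 | 59.385 | 2.207 | 63.012 | 74.052 | 0.222 | 74.507 |
| Blog | 30 | 80.978 | 0.130 | 81.148 | 70.176 | 2.231 | 73.934 | 99.836 | 0.220 | 100.279 |
| Blog | 40 | 94.632 | 0.116 | 94.793 | 83.857 | 2.025 | 87.005 | 126.186 | 0.222 | 126.583 |
| Blog | 50 | 112.765 | 0.116 | 112.955 | 95.143 | 1.964 | 98.536 | 136.675 | 0.197 | 137.057 |
| CA-GrQc | 10 | 75.404 | 0.325 | 75.927 | 75.027 | 2.560 | 79.879 | 127.327 | 0.432 | 127.970 |
| CA-GrQc | 20 | 87.490 | 0.342 | 88.079 | 88.205 | 2.825 | 92.389 | 139.428 | 0.340 | 140.096 |
| CA-GrQc | 30 | 99.269 | 0.380 | 99.856 | 123.039 | 2.287 | 127.338 | 165.897 | 0.394 | 166.617 |
| CA-GrQc | 40 | 115.695 | 0.288 | 116.248 | 135.490 | 3.057 | 140.131 | 186.401 | 0.439 | 187.076 |
| CA-GrQc | 50 | 128.132 | 0.363 | 128.704 | 144.075 | 2.812 | 147.969 | 223.293 | 0.403 | 223.779 |
| CA-HepTh | 10 | 91.570 | 0.353 | 92.198 | 28.465 | 2.797 | 32.612 | 93.235 | 0.207 | 93.701 |
| CA-HepTh | 20 | 129.354 | 0.380 | 129.910 | 44.825 | 2.504 | 49.003 | 135.327 | 0.356 | 136.050 |
| CA-HepTh | 30 | 181.726 | 0.385 | 182.228 | 79.812 | 2.385 | 84.394 | 186.424 | 0.515 | 187.113 |
| CA-HepTh | 40 | 209.922 | 0.310 | 210.545 | 96.093 | 2.536 | 101.100 | 211.692 | 0.365 | 212.532 |
| CA-HepTh | 50 | 235.687 | 0.314 | 236.177 | 152.718 | 2.627 | 156.991 | 244.504 | 0.414 | 245.100 |
| NetHEHT | 10 | 92.091 | 0.309 | 92.638 | 80.690 | 2.655 | 84.140 | 99.140 | 0.264 | 99.735 |
| NetHEHT | 20 | 135.330 | 0.378 | 135.889 | 92.171 | 2.943 | 96.149 | 148.537 | 0.365 | 149.129 |
| NetHEHT | 30 | 160.538 | 0.320 | 161.173 | 108.008 | 2.673 | 112.294 | 200.148 | 0.418 | 200.736 |
| NetHEHT | 40 | 192.992 | 0.403 | 193.590 | 152.413 | 2.577 | 156.275 | 230.193 | 0.380 | 230.767 |
| NetHEHT | 50 | 227.806 | 0.320 | 228.370 | 189.976 | 2.250 | 193.943 | 265.507 | 0.372 | 266.146 |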
Table 6. The statistical results of the Wilcoxon test on the six algorithms at α = 0.05 on the six networks.
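| LA-DPSO vs. | k | N− | N+ | Z | p-Value |
|---|---|---|---|---|---|
| CELF | 10 | 6 | 0 | −2.201 | 0.028 |
| | 20 | 6 | 0 | −2.201 | 0.028 |
| | 30 | 6 | 0 | −2.201 | 0.028 |
| | 40 | 6 | 0 | −2.201 | 0.028 |
| | 50 | 5 | 1 | −1.992 | 0.046 |
| DPSO | 10 | 0 | 6 | −2.201 | 0.028 |
| | 20 | 0 | 6 | −2.201 | 0.028 |
| | 30 | 0 | 6 | −2.201 | 0.028 |
| | 40 | 0 | 6 | −2.201 | 0.028 |
| | 50 | 0 | 6 | −2.201 | 0.028 |
| TS-VA-MODE | 10 | 2 | 4 | −1.572 | 0.116 |
| | 20 | 2 | 4 | −1.572 | 0.116 |
| | 30 | 0 | 6 | −2.201 | 0.028 |
| | 40 | 0 | 6 | −2.201 | 0.028 |
| | 50 | 0 | 6 | −2.201 | 0.028 |
| DCGM++ | 10 | 1 | 5 | −1.992 | 0.046 |
| | 20 | 1 | 5 | −1.992 | 0.046 |
| | 30 | 0 | 6 | −2.201 | 0.028 |
| | 40 | 0 | 6 | −2.201 | 0.028 |
| | 50 | 0 | 6 | −2.201 | 0.028 |
| ENIMNR | 10 | 0 | 6 | −2.201 | 0.028 |
| | 20 | 0 | 6 | −2.201 | 0.028 |
| | 30 | 0 | 6 | −2.201 | 0.028 |
| | 40 | 0 | 6 | −2.201 | 0.028 |
| | 50 | 0 | 6 | −2.201 | 0.028 |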
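
To make the Z and p-value columns of Table 6 easier to interpret, the following is a minimal sketch of a two-sided Wilcoxon signed-rank test over six paired observations, one per network. It is an illustration under stated assumptions, not the authors' procedure: the function name `wilcoxon_signed_rank` is hypothetical, the paired values are the k = 50 mean spreads of LA-DPSO and DCGM++ copied from Table 5 (the source does not say whether means or best values were compared), and the p-value uses the normal approximation without continuity or tie correction.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (n_neg, n_pos, z, p). Illustrative sketch only: no
    continuity correction and no tie correction of the variance.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1

    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_pos, w_neg)  # test statistic: the smaller rank sum

    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    return n_neg, n_pos, z, p


# Assumed inputs: k = 50 mean spreads from Table 5, in the order
# Netscience, Email, Blog, CA-GrQc, CA-HepTh, NetHEHT.
la_dpso = [74.960, 188.135, 136.675, 223.293, 244.504, 265.507]
dcgm = [69.227, 173.786, 112.765, 128.132, 235.687, 227.806]

n_neg, n_pos, z, p = wilcoxon_signed_rank(la_dpso, dcgm)
print(f"N-={n_neg}, N+={n_pos}, Z={z:.3f}, p={p:.3f}")
```

With six pairs and LA-DPSO ahead on every network, the smaller rank sum is 0, which gives Z ≈ −2.201 and p ≈ 0.028 under this approximation. These are the values that appear in every 0/6 or 6/0 row of Table 6, so with only six networks p = 0.028 is the smallest value this approximation can return.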

Share and Cite

MDPI and ACS Style

Chai, B.; Fu, J.; Zhang, R.; Tang, J. A Landscape-Aware Discrete Particle Swarm Optimization for the Influence Maximization Problem in Social Networks. Symmetry 2025, 17, 435. https://doi.org/10.3390/sym17030435

