1. Introduction
Building effective teams is critical in numerous applications, including product development, project group collaborations, and industry expert recruitment [1,2]. Forming teams efficiently is especially important in industry expert recruitment, where the administration manager is responsible for assembling a team with diverse skill sets for various projects [3]. For example, a software corporation may require proficiency in machine learning, artificial intelligence, software engineering, parallel computing, and programming for its projects. The objective is to assemble a team that can cooperate efficiently and leverage its members' individual abilities to ensure the project's success. The process involves identifying subject matter experts, assessing their ability to cooperate, and choosing the team with the lowest communication cost, which underscores the importance of measuring cooperative ability. This ensures that activities are executed efficiently and that the team members' knowledge is fully utilized. Evaluating candidate teams based on skill coverage and communication cost within their collaboration network is therefore crucial for effective collaboration [4,5,6].
When provided with a group of individuals, details of their skills, and the tasks to be accomplished, identifying a subset of individuals (a team) with minimal communication cost is a challenging problem; it is known to be NP-hard [7]. This paper proposes a metaheuristic-based approach to find a team with the lowest communication cost.
Previous works [8,9,10] addressed the team formation problem using metaheuristic-based approaches with swap operators, typically comparing their results only against traditional algorithms. Better performance and more optimized outcomes can be achieved by combining best-fit heuristic algorithms with appropriate methodologies. Furthermore, few strategies exploit hybrid metaheuristic algorithms to enhance results for the team formation problem.
We employ Particle Swarm Optimization (PSO) with the Modified Swap Operator (MSO) to identify teams with the lowest communication costs [9,11]. To further enhance the performance of the selected team, we use the Jaya algorithm [12] to optimize the outcome by eliminating less desirable solutions and retaining the most favorable ones. The resulting hybrid model, combining PSO and the Jaya algorithm with the Modified Swap Operator, significantly improves our results, forming an acceptable team while minimizing communication costs.
The main contributions of this work are given below.
We introduce a hybrid approach that combines the exploration abilities of the PSO algorithm with the refinement strengths of the Jaya algorithm. This integration allows for a more thorough search of the solution space while effectively eliminating less desirable solutions and selecting the most favorable ones.
We implement our approach on real-world datasets. We compare the performance of our approach with some existing methods in the literature.
The rest of the paper is structured as follows. Section 2 presents a comprehensive review of related work on team formation. Section 3 provides the preliminary background for the proposed methodology. Metaheuristic-based approaches for team formation are discussed in Section 4. The proposed algorithm is given in Section 5, and the experimental results in Section 6. A brief discussion of the proposed algorithm is given in Section 7. Section 8 concludes the paper and suggests directions for future research.
2. Related Work
In the area of team formation studies, the effectiveness and composition of teams are critical components. Using a communication framework based on the distance diameter and a minimum spanning tree is crucial for effective team performance, as it facilitates clear communication channels and optimizes team collaboration, as highlighted in [13]. In the exploration of team formation dynamics in social networks, measuring the distance between team members, whether with or without a team leader, plays a crucial role in enhancing efficiency and fostering collaborative interactions, as discussed in [14,15].
The concept of forming expert teams whose agents are interconnected through social networks has also been investigated. These studies offer valuable insights into the interplay of team composition, structure, and dynamics within social networks, highlighting the importance of social connections for team effectiveness [7,16]. They emphasize the critical role of strategic team formation in enhancing communication and collaboration among team members, leading to improved team performance.
An efficient team, as outlined in [13], is characterized by a minimum spanning tree representing the most effective communication structure and a minimal diameter indicating the shortest path between any two team members. Related studies explore team dynamics by evaluating the proximity between team members in a social network, using techniques with and without a team leader to cultivate effective and collaborative relationships [14,15]. The works [7,16] investigate the formation of expert teams within a social network context, examining how the interconnections between agents influence their performance across various scenarios.
In [17], the authors discuss a way for independent experts to work together in distributed groups, using the expert connection topology graph to find distributed maximal cliques. This approach exhibits high fault tolerance due to the experts' extensive connectivity and allows each expert to choose the most valuable coalition based on combined resources and abilities.
In [18], the authors propose a genetic algorithm-based method for forming groups in collaborative learning, considering various student attributes to optimize group composition. Their findings reveal that groups formed using this approach achieve superior outcomes, including higher task completion rates and increased overall satisfaction among group members, surpassing conventional methods. An improved African Buffalo Optimization algorithm for collaborative team formation in social networks is suggested in [8]. An improved Particle Swarm Optimization technique with the New Swap Operator (I-PSONSO) for the team formation problem is suggested in [9]. An improved Jaya algorithm with the Modified Swap Operator (I-JMSO) for solving the team formation problem is suggested in [10]. An improved Grey Wolf Optimization algorithm (I-GWO) tailored to build teams with minimized communication costs is suggested in [19]. The abbreviations are detailed in Table 1.
In [9], the performance of the I-PSONSO algorithm was tested on the Digital Bibliography &amp; Library Project (DBLP) dataset. In [10], the performance of the I-JMSO algorithm was evaluated using two real-life datasets: the Stack Exchange dataset and the DBLP dataset. In [19], the performance of the I-GWO algorithm was evaluated using two real-life datasets: the Stack Exchange dataset and the ACM dataset.
Based on the above, the following research gaps are identified. Most works improve existing algorithms through modifications such as changes to the swap operator. In the context of team formation, no research has been reported on a new algorithm formed as a synergistic combination of two distinct heuristic algorithms. Moreover, to evaluate a new algorithm, the works discussed above compare it only with standard algorithms such as Jaya and PSO; they do not report performance comparisons against similar improved algorithms from the literature.
3. Preliminaries
This section presents the definitions used in the team formation problem (TFP).
Notation
This section provides the notation used in the TFP, shown in Table 2, and gives team formation examples in Table 3 and Figure 1.
The team formation problem is a combinatorial optimization problem that involves grouping individuals into teams based on their skills, preferences, and other characteristics. The goal is to form efficient teams that meet specific objectives.
Definition 1. [19]: [Team Formation] Given a set of agents, the skills of the agents, and a task, where an agent has a set of skills (abilities) and a task is defined by a set of required skills, the team formation problem requires finding a subset of agents whose combined skills (the union of the skills of the individual agents) form a superset of the skills required for the task.

Definition 2. [19]: [Communication Graph] A communication graph is an undirected weighted graph. The graph's vertices correspond to the agents, and each edge is assigned a weight $w$, where $0 \le w \le 1$. A lower weight indicates that the agents can communicate with each other more easily.

Definition 3. [19]: [Communication Cost (CC) of Two Agents] Let the agents $a_i$ and $a_j$ have skill sets $S_i$ and $S_j$, respectively. The communication cost of the agents is given as:

$CC(a_i, a_j) = 1 - \frac{|S_i \cap S_j|}{|S_i \cup S_j|}$. (1)

Figure 1 shows the communication cost among the agents. The skills of the agents are given in Table 3.
Consider two agents, $a_1$ and $a_2$, with skills
$S_1$ = {distributed data mining, agent computing, intrusion detection} and
$S_2$ = {speech acts, agent computing}, respectively. The intersection of the skills of $a_1$ and $a_2$ is the single skill 'agent computing' (size 1), while the union of their skills consists of 4 distinct skills. To calculate the communication cost between $a_1$ and $a_2$, we divide the size of the intersection by the size of the union, which equals $1/4 = 0.25$; this ratio quantifies the communication affinity between the two agents. Subtracting this value from 1 gives the communication cost between $a_1$ and $a_2$, which is $1 - 0.25 = 0.75$. The same method can be applied consistently to calculate the communication cost between any pair of agents in the network.
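Definition 3 is simply one minus the Jaccard similarity of the two skill sets. A minimal Python sketch, using the skill sets from the example above:

```python
def communication_cost(skills_i, skills_j):
    """Definition 3: 1 - |intersection| / |union| of two agents' skill sets."""
    si, sj = set(skills_i), set(skills_j)
    union = si | sj
    if not union:  # degenerate case: both agents have no skills
        return 0.0
    return 1.0 - len(si & sj) / len(union)

a1 = {"distributed data mining", "agent computing", "intrusion detection"}
a2 = {"speech acts", "agent computing"}
print(communication_cost(a1, a2))  # 1 - 1/4 = 0.75
```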
Definition 4. [19]: [Swap Operator] The swap operator works as follows. Consider the j-th position of any two populations (particles), say $P_1$ and $P_2$; $swap(P_1, P_2, j)$ means the agents at position j of these populations are interchanged. Accordingly, the total communication costs of the teams change. For example, given two teams (particles), we perform a single-point crossover and select the offspring that has the minimum communication cost.
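The swap operator and the single-point crossover described above can be sketched as follows; the team vectors and agent labels are illustrative, not taken from the dataset:

```python
def swap(p1, p2, j):
    """Swap operator (Definition 4): exchange the agents at position j of two teams."""
    p1, p2 = list(p1), list(p2)
    p1[j], p2[j] = p2[j], p1[j]
    return p1, p2

def single_point_crossover(t1, t2, point):
    """Split both parent teams at `point` and recombine into two offspring."""
    c1 = t1[:point] + t2[point:]
    c2 = t2[:point] + t1[point:]
    return c1, c2

parent1 = ["a1", "a2", "a3", "a4"]
parent2 = ["a5", "a6", "a7", "a8"]
print(swap(parent1, parent2, 2))
# (['a1', 'a2', 'a7', 'a4'], ['a5', 'a6', 'a3', 'a8'])
print(single_point_crossover(parent1, parent2, 2))
# (['a1', 'a2', 'a7', 'a8'], ['a5', 'a6', 'a3', 'a4'])
```

In the full algorithm, the offspring with the lower total communication cost would then be kept.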
Definition 5. [19]: [Total Communication Cost (TCC) of a Team] Let a team comprise the agents $a_1, \ldots, a_n$. Let the communication graph for the team be $G = (V, E)$, where $V = \{a_1, \ldots, a_n\}$ and $w(e)$ is the weight of the edge $e \in E$. Then, $TCC$ is the sum of the weights of all the edges in $G$:

$TCC = \sum_{e \in E} w(e)$. (2)

Consider a task that requires specific expertise, denoted as T = {security, machine learning, agent computing, model checking}. We have identified a set of eligible agents who can perform this task. The required skills for the task and the corresponding agents are outlined in Table 3. For example, the 'security' skill is covered by one agent, the 'agent computing' skill by three agents, the 'model checking' skill by one agent, and the 'machine learning' skill by one agent. Using this set of agents, we can form different possible teams, say $T_1$, $T_2$, and $T_3$.
We compute the communication cost between each pair of agents using Definition 3. For team $T_1$, the six edges have respective weights of 1, 1, 0.8, 0.8, 1, and 1; therefore, $TCC(T_1) = 1 + 1 + 0.8 + 0.8 + 1 + 1 = 5.6$. Similarly, the total communication costs calculated for $T_2$ and $T_3$ are 5.55 and 2.8, respectively.
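Definition 5 can be computed directly by summing the pairwise costs of Definition 3 over all team edges. A small sketch with hypothetical agents and skills (the names are illustrative only):

```python
from itertools import combinations

def total_communication_cost(team, skills):
    """Definition 5: sum the pairwise communication costs (edge weights)
    over every pair of agents in the team."""
    total = 0.0
    for i, j in combinations(team, 2):
        si, sj = set(skills[i]), set(skills[j])
        total += 1.0 - len(si & sj) / len(si | sj)
    return total

# Hypothetical skills: pair costs are a-b: 0.5, a-c: 1.0, b-c: 0.5.
skills = {"a": {"ml"}, "b": {"ml", "security"}, "c": {"security"}}
print(total_communication_cost(["a", "b", "c"], skills))  # 2.0
```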
If $T_3$ is deemed the best global solution, a single-point crossover can be executed between $T_3$ and $T_1$ to create new solutions, as illustrated in Figure 2. Likewise, performing a single-point crossover with $T_2$ generates further solutions, with total communication costs of 2.8, 5.55, and 5.6, respectively. Team $T_3$ is preferred since it has the minimum communication cost, 2.8.
4. Metaheuristic-Based Approach for Team Formation
In this section, we describe two standard metaheuristic algorithms renowned for their simplicity, flexibility, absence of derivative mechanisms, and ability to escape local optima, and compare them to conventional optimization techniques [20]. However, it is essential to note that, according to the No Free Lunch theorem [21], no single metaheuristic is universally optimal for all optimization problems.
4.1. Particle Swarm Optimization (PSO)
Particle Swarm Optimization (PSO), a bio-inspired algorithm introduced in [11], is recognized for effectively solving many problems. PSO is distinct from other optimization methods in that it relies solely on the objective function, without needing gradients or derivative information. The algorithm draws inspiration from natural phenomena such as swarming, fish schooling, and bird flocking, and shares similarities with genetic algorithms and evolutionary programming [22]. Sociobiologists suggest that groups of fish or birds benefit from shared knowledge, improving their collective success in tasks like foraging. PSO's strengths include its insensitivity to variable scaling, ease of parallelization, derivative-free nature, few parameters, and robust global search capability; however, its local search ability is comparatively weak. The velocity and position updates in PSO are as follows:

$v_i^{t+1} = v_i^t + c_1 r_1 (pbest_i - x_i^t) + c_2 r_2 (gbest - x_i^t)$, (3)

$x_i^{t+1} = x_i^t + v_i^{t+1}$. (4)

Here, $c_1$ and $c_2$ are learning factors, $r_1$ and $r_2$ are random numbers in $[0, 1]$, $pbest_i$ is the best personal solution, and $gbest$ is the best global solution.
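The continuous form of these updates can be sketched as follows. The default learning factors are conventional choices, not values from this paper, and the paper's discrete team formation variant replaces the difference terms with swap operations:

```python
import random

def pso_step(x, v, pbest, gbest, c1=2.0, c2=2.0):
    """One continuous PSO update per Equations (3) and (4):
    v <- v + c1*r1*(pbest - x) + c2*r2*(gbest - x);  x <- x + v."""
    new_v, new_x = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = random.random(), random.random()
        vi_new = vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vi_new)
        new_x.append(xi + vi_new)
    return new_x, new_v
```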
4.2. Jaya Algorithm
Presented in [23], Jaya is a parameter-free, population-based metaheuristic designed for continuous optimization problems. It combines elements of swarm intelligence and evolutionary algorithms, moving towards the best solutions while avoiding the worst ones. The Jaya algorithm stands out for its simplicity and lack of algorithm-specific parameters, making it user-friendly and applicable to many optimization problems [24]. The update equation of the Jaya algorithm is given by:

$x_{j,k}^{t+1} = x_{j,k}^t + r_1 (x_{j,best}^t - |x_{j,k}^t|) - r_2 (x_{j,worst}^t - |x_{j,k}^t|)$. (5)

Here, $r_1$ and $r_2$ are random numbers in $[0, 1]$, $x_{j,best}$ is the best solution, and $x_{j,worst}$ is the worst solution.
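A minimal sketch of the continuous Jaya update; the discrete team formation variant replaces these arithmetic operations with swap operations:

```python
import random

def jaya_step(x, best, worst):
    """One Jaya update per Equation (5): move toward the best solution
    and away from the worst, component by component."""
    new_x = []
    for xj, bj, wj in zip(x, best, worst):
        r1, r2 = random.random(), random.random()
        new_x.append(xj + r1 * (bj - abs(xj)) - r2 * (wj - abs(xj)))
    return new_x
```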
5. Improved PSO-Jaya Optimization Algorithm (I-PSO-Jaya)
The proposed methodology leverages Particle Swarm Optimization (PSO) to tackle the team formation problem of optimizing team composition for efficient communication [25]. The approach conducts multiple iterations to arrive at a suitable solution. To refine the solution and enhance team diversity, we introduce an agent-swapping technique that replaces the current agent with another agent possessing similar skills. We further optimize the process by integrating the Jaya algorithm with addition and subtraction operators: we utilize the Jaya best-case term and eliminate worst-case solutions from the population, yielding a more accurate estimate of the team with the lowest communication cost.

The flow chart of our metaheuristic technique, IPSO with Jaya (I-PSO-Jaya), is shown in Figure 3, and the accompanying pseudocode in Algorithm 1 represents the algorithm's structure and implementation details. Its nested loops iterate over the number of particles and the maximum number of iterations, driving the algorithm's convergence to an acceptable solution. The velocity and position updates combine PSO, MSO, and Jaya elements, synergistically improving the algorithm's ability to explore and exploit the search space. The communication cost serves as the fitness function to evaluate each team composition. Below, we demonstrate the technique using a specific case extracted from the ACM dataset. The steps of Algorithm 1 can be stated as follows.
Initialization. In the initialization step, we generate the initial population. We set the number of particles (P) and the number of skills (S), where S determines the dimension of the problem. Each particle is generated using a nested loop that iterates from 1 to P and from 1 to S.
Evaluation. Next, we compute the fitness function for each particle i to assess its communication cost and identify the best personal solution and the best global solution in the population. For the team formation problem, the goal is to minimize the total communication cost.
Single-point crossover. To enhance the solution, we perform a single-point crossover between the best global solution and the current solution, and then choose the superior offspring.
Solution update. All equations are adapted to the discrete domain by replacing the arithmetic operations with modified swap operations.
1. Velocity Update: use Equation (3) with the modified swap operation, so that the difference terms are realized by swap operations. Here, $c_1$ and $c_2$ are learning factors, $pbest_i$ is the best personal solution, and $gbest$ is the best global solution.
2. Position Update: the position of each particle is updated based on the new velocity using Equation (4).
3. Crossover Operation: a crossover operation is performed between the best global solution and the current solution to generate new solutions.
4. Jaya Optimization: the best offspring solution is chosen, and the position is updated using the Jaya optimization Equation (5). Here, $r_1$ and $r_2$ are random numbers, $x_{best}$ is the best solution, and $x_{worst}$ is the worst solution.
5. Fitness Evaluation: the fitness of the new solution is evaluated. If the new solution has a lower communication cost than the current best solution, it replaces the current best solution.
Iteration Update: the iteration counter c is incremented by 1 until it reaches MaxIteration, and the global best position is updated to reflect the best solution found so far.
Return the Best Solution: after completing all iterations, the best solution is returned, representing the team subset with the minimum communication cost.
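The steps above can be condensed into a toy, heavily simplified sketch. The agents, skills, team size, and acceptance rule below are illustrative assumptions; the actual algorithm uses the MSO-based velocity update, crossover, and Jaya update described above:

```python
import random
from itertools import combinations

# Hypothetical toy instance: agent -> skills (names are illustrative only).
SKILLS = {
    "a1": {"security"}, "a2": {"agent computing"},
    "a3": {"machine learning"}, "a4": {"model checking"},
    "a5": {"agent computing", "security"},
}
TASK = {"security", "machine learning", "agent computing", "model checking"}

def cost(team):
    """Total communication cost: sum of (1 - Jaccard) over all agent pairs."""
    c = 0.0
    for i, j in combinations(team, 2):
        si, sj = SKILLS[i], SKILLS[j]
        c += 1.0 - len(si & sj) / len(si | sj)
    return c

def covers(team):
    """Feasibility check: the team's combined skills must cover the task."""
    return TASK <= set().union(*(SKILLS[a] for a in team))

def random_team(size=4):
    """Sample random teams until one covers every required skill."""
    while True:
        t = random.sample(sorted(SKILLS), k=size)
        if covers(t):
            return t

def i_pso_jaya(particles=10, iters=30):
    pop = [random_team() for _ in range(particles)]
    gbest = min(pop, key=cost)
    for _ in range(iters):
        for idx, team in enumerate(pop):
            cand = list(team)
            j = random.randrange(len(cand))
            cand[j] = gbest[j]  # swap one position toward the global best
            # Jaya-style filter: accept only feasible candidates that improve
            # (move toward the best, discard worse solutions).
            if len(set(cand)) == len(cand) and covers(cand) and cost(cand) < cost(team):
                pop[idx] = cand
        gbest = min(pop + [gbest], key=cost)
    return gbest, cost(gbest)

random.seed(0)
team, c = i_pso_jaya()
print(team, round(c, 2))
```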
Algorithm 1: Improved PSO with Jaya Optimization (I-PSO-Jaya) Algorithm
Input: P: number of particles, MaxIter: maximum iterations, S: skills
Output: Min cost - the team's best subset with minimum communication cost
Complexity of I-PSO-Jaya:
The time complexity of our proposed algorithm is determined as follows. The initialization and the generation of the initial population are performed once, with cost proportional to the number of particles P and the number of skills S. The fitness computation and best-solution updates are performed for each particle. The main loop iterates MaxIter times, repeating these inner operations in each iteration. Therefore, the overall time complexity of the algorithm grows with the product of MaxIter and the per-iteration cost over P particles.
Efficiency of the I-PSO-Jaya algorithm: The efficiency of our algorithm is primarily measured by its ability to generate highly acceptable solutions, even for complex problems. As the number of skills increases, the algorithm effectively navigates the expanded search space to find good solutions. A large-sized particle swarm with a larger number of iterations contributes significantly to improving the solution quality, ensuring that the algorithm can handle more complex scenarios with greater accuracy.
Illustration of the I-PSO-Jaya Algorithm
To generate the initial population of size P, we use a nested loop that iterates from 1 to S and from 1 to P. We randomly generate 5 initial particles, denoted as $X_1$, $X_2$, $X_3$, $X_4$, and $X_5$. The particles are defined as follows:
Communication cost is calculated using Equation (2). The initial costs are as follows:
For Iteration 1:
Initial Velocities: assume all initial velocities are zero.
Updating Velocity. For the particle with index i in iteration t, the updated velocity is calculated using Equation (3), with assumed values for the learning factors $c_1$ and $c_2$. For the current iteration, the calculation is as follows:
Position Update. Using Equation (4), the calculation of the position update is as follows:
Update using Equation (5), with assumed random values $r_1$ and $r_2$, as follows:
Comparison of Fitness Functions. The solution is updated by comparing the fitness values of the new solution and the current best. Since the new fitness value is not less than the current best, the best solution remains unchanged.
Return the Best Solution for Iteration 1: the acceptable solution at the end of the iteration is the current global best.
For Iteration 2:
Update Velocity using Equation (3), again with assumed learning factors $c_1$ and $c_2$; the calculation is as follows:
Using Equation (4), the calculation of the position update is as follows:
Update using Equation (5), with assumed random values $r_1$ and $r_2$, as follows:
Comparison of Fitness Functions. Since the lowest new fitness value is still greater than the current best, the solution does not improve further.
Return the Best Solution for Iteration 2: the acceptable solution at the end of the second iteration remains unchanged.
For Iteration 3:
Update Velocity using Equation (3), with assumed learning factors $c_1$ and $c_2$; the calculation is as follows:
Position Update: the calculation is as follows:
Update using Equation (5), with assumed random values $r_1$ and $r_2$:
Comparison of Fitness Functions. The new fitness values after the third iteration are compared with the current best. Since the new fitness value is less than the current best, we update the global best solution accordingly.
Return the Best Solution. The acceptable solution at the end of the third iteration is the updated global best.
6. Experimental Results
For the implementation, we used Python 3.9 in a Jupyter notebook (Anaconda Navigator) on Windows 10 with an Intel Core i7 CPU, 16 GB RAM, and a 512 GB SSD. The evaluations were conducted using two real-world datasets: the Academia Stack Exchange dataset and the Association for Computing Machinery (ACM) dataset.
Academia Stack Exchange dataset: We used a real-life dataset, the Stack Exchange dataset, obtained from Academia Stack Exchange (June 2020), for the performance evaluation of the proposed algorithm. The dataset is structured into several key tables, which are outlined below:
Votes (Id, PostId, VoteTypeId, CreationDate, UserId, BountyAmount)–1,048,572 records.
Users (Id, Reputation, CreationDate, DisplayName, LastAccessDate, WebsiteUrl, Location, AboutMe, Views, UpVotes, DownVotes, Age, AccountId)–127,761 records.
Tags (Id, TagName, Count, ExcerptPostId, WikiPostId)–461 records.
Posts (Id, PostTypeId, AcceptedAnswerId, CreationDate, Score, ViewCount, Body, OwnerUserId, LastEditorUserId, LastEditDate, LastActivityDate, Title, Tags, AnswerCount, CommentCount, FavoriteCount)–9285 records.
PostLinks (Id, CreationDate, PostId, RelatedPostId, LinkTypeId)–20,381 records.
PostHistory (Id, PostHistoryTypeId, PostId, RevisionGUID, CreationDate, UserID, Text, ContentLicense, Comment)–368,025 records.
Comments (Id, PostId, Score, Text, CreationDate, UserId, ContentLicense, UserDisplayName)–189,041 records.
Badges (Id, UserId, Name, Date, Class, TagBased)–294,449 records.
Our goal was to apply specific criteria to extract an expert set and a skill set from the dataset. The expert set comprises users with at least 10 posts on academia.stackexchange (461 users), and the skills are the distinct tags of their posts. Two experts become connected if they share post tags (skills). The communication cost of experts i and j is evaluated using Equation (1). We considered the most important shared skills between experts, such as 'writing', 'writing-style', and 'peer-review', extracted from the tags of distinct posts.

The data were collected with the help of author collaboration, reflected in a table of 51,730 records. Connections are established when one author's work is cited by another, creating a network of related data. For example, when an author shares their work with another, they are linked through shared research activities, such as graduate school involvement, entrance examinations, and the publication of scientific articles in journals, conferences, and other outlets. A record in the dataset can be represented as 160,429 = { writing, writing-style, peer-review }, where 160,429 identifies an agent and 'writing', 'writing-style', and 'peer-review' are the skills associated with that agent [26].
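A record in this format can be parsed with a few lines of Python; the helper name below is illustrative:

```python
def parse_agent_record(line):
    """Parse one 'agent = { skill1, skill2, ... }' record into (agent, skill set)."""
    agent, _, rest = line.partition("=")
    skills = rest.strip().strip("{}").split(",")
    return agent.strip(), {s.strip() for s in skills if s.strip()}

agent, skills = parse_agent_record("160429 = { writing, writing-style, peer-review }")
print(agent, sorted(skills))  # 160429 ['peer-review', 'writing', 'writing-style']
```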
Association for Computing Machinery (ACM) dataset: The ACM dataset, a widely recognized benchmark dataset for social networks, was utilized in this study to evaluate the effectiveness and validation of our proposed method. This dataset, available through the GitHub repository as of 2020, specifically focuses on author collaboration within the field of computing.
We collected data on the research skills of 367 agents, representing individual researchers or contributors. The dataset is extensive, containing 3856 lines that detail the skills associated with each agent. For instance, an entry might be represented as '[email protected] = agent computing, distributed data mining', where '[email protected]' identifies the agent and the accompanying list of skills represents that agent's expertise.
This structured data allowed us to accurately assess the performance of our algorithm in forming teams with low communication costs by effectively matching agents with complementary skills. The ACM dataset provided a robust platform for testing, ensuring that the results of our study are valid and generalizable to real-world academic and research settings [27].
Parameter setting: We conducted nine successive trials, varying the number of skills and corresponding iterations, as shown in Table 4. We consider a population size of 100 for every skill.
6.1. Experimental Results for Communication Cost Analysis
By examining Table 5 and Table 6, we can determine the communication costs achieved by I-PSO-Jaya on the Academia Stack Exchange dataset (Table 5) and the ACM dataset (Table 6), respectively: for skill 2, 0.1 and 0.05; for skill 3, 0.1 and 0.04; for skill 4, 0.1 and 0.04; for skill 5, 0.1 and 0.04; for skill 6, 0.1 and 0.03; for skill 7, 0.1 and 0.03; for skill 8, 0.11 and 0.04; for skill 9, 0.34 and 0.03; and for skill 10, 0.37 and 0.04. The communication cost values for the remaining five heuristics ranged from 0.1 to 5.96 for the Academia Stack Exchange dataset and from 0.02 to 1.62 for the ACM dataset.
For the Academia Stack Exchange dataset, the performance of I-PSO-Jaya becomes notably better as the number of skills increases. This is illustrated in experiments 6 through 9, where I-PSO-Jaya achieves the lowest communication costs, significantly outperforming other algorithms. For example, in experiment 9, with 10 skills and 30 iterations, I-PSO-Jaya achieved a communication cost of 0.37, compared to the next best cost of 0.48 by I-PSONSO, 5.07 by I-GWO, and 4.97 by I-JMSO. Similarly, for the ACM dataset, in experiment 5, with 6 skills and 25 iterations, I-PSO-Jaya achieved a communication cost of 0.03, while the next best cost was 0.04 by I-PSONSO, 0.33 by I-GWO, and 0.38 by I-JMSO. In experiments where more skills are required, I-PSO-Jaya consistently achieves the lowest communication costs. The results indicate that I-PSO-Jaya excels, particularly when the required skills increase, as evidenced by the lower communication costs in complex scenarios.
Efficiency of the I-PSO-Jaya algorithm for the Academia Stack Exchange dataset: By examining Table 5, the results from experiments 7, 8, and 9 demonstrate the better efficiency of the I-PSO-Jaya algorithm compared to the other methods tested. In these experiments, the communication cost for I-PSO-Jaya remains consistently lower than that of the other algorithms. Specifically, in experiment 7, the communication cost for I-PSO-Jaya is 0.11, significantly lower than the 0.25 to 3.61 range observed for the other methods. This trend continues in experiments 8 and 9, where I-PSO-Jaya achieves communication costs of 0.34 and 0.37, respectively, outperforming all other methods, which range from 0.51 to 4.00 for experiment 8 and 0.48 to 5.96 for experiment 9.
Efficiency of the I-PSO-Jaya algorithm for the ACM dataset: By examining Table 6, the results from the ACM dataset further highlight the superior efficiency of the I-PSO-Jaya algorithm, particularly in the more complex experiments 7, 8, and 9. In experiment 7, I-PSO-Jaya achieves a communication cost of 0.04, significantly lower than the costs observed for the other methods, which range from 0.05 to 0.86. This trend continues in experiment 8, where I-PSO-Jaya maintains a communication cost of just 0.03, outperforming all other methods, which range from 0.04 to 1.22. Similarly, in experiment 9, the I-PSO-Jaya algorithm records a communication cost of 0.04, again demonstrating its efficiency compared to the other methods, whose costs range from 0.05 to 1.62.
These results underscore the algorithm’s ability to maintain low communication costs even as the number of skills and iterations increases, thereby highlighting its effectiveness in handling more complex scenarios with greater efficiency.
6.2. Experimental Results for Sample Mean Communication Cost Analysis
By examining Table 7 and Table 8, we can determine the sample mean communication costs achieved by I-PSO-Jaya on the Academia Stack Exchange dataset (Table 7) and the ACM dataset (Table 8), respectively: for skill 2, 0.0477 and 0.0502; for skill 3, 0.1681 and 0.1004; for skill 4, 0.238 and 0.1904; for skill 5, 0.3595 and 0.1405; for skill 6, 0.4214 and 0.060; for skill 7, 0.7422 and 0.0704; for skill 8, 0.9864 and 0.0842; for skill 9, 1.3622 and 0.0450; and for skill 10, 1.6313 and 0.0600. The sample mean communication costs for the remaining five heuristics ranged from 0.0613 to 7.2398 for the Academia Stack Exchange dataset and from 0.0511 to 3.5789 for the ACM dataset. We specifically differentiate I-PSO-Jaya from I-PSONSO by focusing on skills 6 to 10; the tables illustrate the growing discrepancy between the two approaches.
The performance improvement becomes more pronounced as the number of required skills increases. For instance, in experiment 9, with 10 skills and 30 iterations, I-PSO-Jaya achieved a communication cost of 1.6313, significantly lower than the next best cost of 1.7036 by I-PSONSO, 7.0825 by I-GWO, and 7.1301 by I-JMSO. Similarly, for the ACM dataset, in experiment 9, with 10 skills and 30 iterations, I-PSO-Jaya attained a communication cost of 0.0600, compared to the following best cost of 0.0619 by I-PSONSO, 3.5659 by I-GWO, and 3.5789 by I-JMSO.
When compared to existing algorithms, I-PSO-Jaya outperforms I-JMSO by 77.48%, I-GWO by 77.49%, and I-PSONSO by up to 6.11% on the Academia Stack Exchange dataset. Additionally, for the ACM dataset, I-PSO-Jaya surpasses I-JMSO by 93.86%, I-GWO by 93.83%, and I-PSONSO by up to 6.41%. This indicates that I-PSO-Jaya is particularly effective in complex scenarios requiring extensive skill coverage, as demonstrated by its exceptional performance in all experiments and superiority to all earlier algorithms.
Efficiency of the I-PSO-Jaya algorithm on the Academia Stack Exchange dataset: By examining Table 7, the analysis of the sample mean communication costs from the Academia Stack Exchange dataset further emphasizes the efficiency of the I-PSO-Jaya algorithm, particularly in the more complex experiments 7, 8, and 9. In experiment 7, the I-PSO-Jaya algorithm achieves a mean communication cost of 0.9864, which is significantly lower than the costs associated with other methods, which range from 1.0685 to 4.5491. This efficiency is consistently observed in experiment 8, where I-PSO-Jaya's mean communication cost is 1.3622, outperforming all other methods, which exhibit mean costs between 1.4243 and 5.7983. Similarly, in experiment 9, I-PSO-Jaya maintains a lower mean communication cost of 1.6313, while the other methods range from 1.7036 to 7.2398.
Efficiency of the I-PSO-Jaya algorithm for the ACM dataset: By examining
Table 8, the sample mean communication costs from the ACM dataset further underline the efficiency of the I-PSO-Jaya algorithm, particularly in the more complex experiments 7, 8, and 9. In experiment 7, I-PSO-Jaya achieves a mean communication cost of 0.0842, markedly lower than the other methods, whose costs range from 0.10059 to 2.4557. This trend continues in experiment 8, where I-PSO-Jaya records a mean communication cost of just 0.0450, significantly outperforming all other methods, which range from 0.0530 to 2.9607. Similarly, in experiment 9, I-PSO-Jaya maintains a low mean communication cost of 0.0600, while the other methods show higher costs ranging from 0.0619 to 3.5789.
These results demonstrate that I-PSO-Jaya consistently achieves lower sample mean communication costs compared to other algorithms, particularly as the complexity of the problem increases. This highlights the superior efficiency of the I-PSO-Jaya algorithm in handling complex optimization scenarios.
6.3. Experimental Results for Standard Deviation of Communication Cost Analysis
By examining
Table 9 and
Table 10, we can determine the following values: for skill 2, the values were 0.0655 and 0.0546; for skill 3, 0.1359 and 0.0855; for skill 4, 0.1503 and 0.1483; for skill 5, 0.2184 and 0.1234; for skill 6, 0.2768 and 0.0796; for skill 7, 0.3714 and 0.0697; for skill 8, 0.4155 and 0.0577; for skill 9, 0.5036 and 0.0401; and for skill 10, 0.5014 and 0.0605. The communication cost values for the five heuristic techniques ranged from 0.045 to 0.7213 for the Academia Stack Exchange dataset and from 0.0318 to 0.9441 for the ACM dataset. We specifically highlight the difference between I-PSO-Jaya and I-PSONSO for skills 6 to 10; the tables show this difference widening.
In the Academia Stack Exchange dataset, the I-PSO-Jaya algorithm shows a marked improvement in reducing the standard deviation of communication costs across all experiments. For instance, in experiment 9, with 10 skills and 30 iterations, the standard deviation for I-PSO-Jaya is 0.5014, significantly lower than the next best algorithm, I-PSONSO, at 0.5199, followed by I-GWO at 0.6065 and I-JMSO at 0.7213. For the ACM dataset, notable examples include experiment 9, with 10 skills and 30 iterations, where I-PSO-Jaya records a standard deviation of 0.0605, whereas the next best algorithm, I-PSONSO, records 0.0691, followed by I-GWO at 0.8774 and I-JMSO at 0.9014, and experiment 5, with 6 skills and 25 iterations, where I-PSO-Jaya achieves a standard deviation of 0.0796, compared to I-PSONSO’s 0.0826.
These results underscore the algorithm’s effectiveness at maintaining low variability in communication costs, a crucial aspect that ensures more predictable and reliable team formation. This further validates the algorithm’s ability to produce consistent outcomes, making it highly effective for complex team formation tasks where stability is crucial.
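The statistics reported in these tables are the usual sample mean and sample standard deviation over repeated runs of each experiment; a minimal sketch (the per-run costs below are hypothetical values, not taken from the tables):

```python
import math

def sample_mean(costs):
    """Mean communication cost over repeated runs of one experiment."""
    return sum(costs) / len(costs)

def sample_std(costs):
    """Sample standard deviation (n - 1 denominator) of the run costs."""
    m = sample_mean(costs)
    return math.sqrt(sum((c - m) ** 2 for c in costs) / (len(costs) - 1))

runs = [0.98, 1.10, 1.05, 0.99, 1.02]  # hypothetical per-run costs
m, s = sample_mean(runs), sample_std(runs)
```

The n − 1 denominator gives the unbiased sample variance, which is appropriate when the runs are a sample of the algorithm's stochastic behavior rather than an exhaustive population.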
6.4. Experimental Results for Computing Time Analysis
According to the experimental results, the I-PSO-Jaya algorithm required more computation time (given in seconds) than the other five algorithms (I-PSONSO, I-GWO, GWO, I-JMSO, and Jaya) for all tests in both datasets, as demonstrated in
Table 11 and
Table 12. The I-PSO-Jaya algorithm required between 6.2458 and 542.7532 s for the Academia Stack Exchange dataset and between 2.9176 and 236.111 s for the ACM dataset.
In the Stack Exchange dataset, specifically for 10 skills, the minimum communication cost of I-PSO-Jaya is 0.37, with a computing time of 542.753 s. In comparison, the other algorithms have minimum communication costs ranging from 0.48 to 5.23, with computing times ranging from 234.074 to 287.941 s (as shown in
Table 5 and
Table 11). Similarly, in the ACM dataset, when the communication cost of I-PSO-Jaya is 0.04 with a computing time of 236.11 s, the minimum costs obtained by the other algorithms range from 0.05 to 1.62, with computing times ranging from 0.1167 to 115.23 s; see
Table 6 and
Table 12.
I-PSO-Jaya is a more intricate algorithm than the other five heuristics. Thus, by performing more computations, I-PSO-Jaya reaches a better solution than that achieved by the other methods.
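To illustrate where the extra computation comes from, the sketch below applies the standard PSO velocity update followed by the standard Jaya update within a single step on a real-valued vector; this is a generic illustration of the two combined rules, not the paper's exact team-encoded implementation, and the coefficient values are assumed defaults:

```python
import random

def pso_jaya_step(x, v, pbest, gbest, best, worst, w=0.7, c1=1.5, c2=1.5):
    """One hybrid step: PSO velocity/position update, then the standard
    Jaya rule, which moves toward the best and away from the worst solution."""
    new_x, new_v = [], []
    for i in range(len(x)):
        # Standard PSO velocity and position update.
        vi = (w * v[i]
              + c1 * random.random() * (pbest[i] - x[i])
              + c2 * random.random() * (gbest[i] - x[i]))
        xi = x[i] + vi
        # Standard Jaya update applied to the PSO-updated position.
        xi = (xi
              + random.random() * (best[i] - abs(xi))
              - random.random() * (worst[i] - abs(xi)))
        new_x.append(xi)
        new_v.append(vi)
    return new_x, new_v
```

Each candidate thus undergoes two update rules per iteration instead of one, which accounts for the higher per-iteration cost relative to plain PSO or plain Jaya.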
6.5. Convergence Analysis
Convergence of I-PSO-Jaya: From
Table 8 and
Table 9, it can be seen that I-PSO-Jaya takes at most 30 iterations to arrive at a better solution than that achieved by the other methods. Thereafter, when the number of iterations is increased, the value does not change significantly.
To assess the convergence of the I-PSO-Jaya algorithm, we conducted experiments with a population size of 100 across 10 different skills. The algorithm was tested with iterations of 5, 10, 15, 20, 25, 30, 35, 40, and 45, utilizing datasets from Academia Stack Exchange and ACM. As shown in
Table 13 and
Table 14, the I-PSO-Jaya algorithm converges to an acceptable solution within 30 iterations. Beyond this point, additional iterations result in only marginal improvements, suggesting that the algorithm has stabilized and that further iterations do not significantly enhance the solution quality.
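The iteration-budget sweep described above can be organized as follows; `run_algorithm` is a hypothetical stand-in for one I-PSO-Jaya run that returns the best communication cost found under a given budget, and the improvement tolerance is an assumed value:

```python
def convergence_sweep(run_algorithm,
                      budgets=(5, 10, 15, 20, 25, 30, 35, 40, 45),
                      pop_size=100, tol=1e-3):
    """Run the solver under each iteration budget and report the first
    budget at which the best cost stops improving by more than `tol`."""
    costs = [run_algorithm(iterations=b, pop_size=pop_size) for b in budgets]
    converged_at = budgets[-1]
    for i in range(1, len(costs)):
        if costs[i - 1] - costs[i] <= tol:  # no significant improvement
            converged_at = budgets[i]
            break
    return costs, converged_at
```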
6.6. Confidence Intervals
Confidence intervals are typically used to determine the range of values within which a population parameter, such as a mean or a difference between means, is likely to lie. When comparing two or more algorithms, we evaluate the uncertainty surrounding the performance measures and use confidence intervals to judge whether our suggested approach is statistically significantly superior to the alternative algorithms. The CI can be computed as CI = x̄ ± z_(α/2) · s/√n, where x̄ is the sample mean, s is the standard deviation of the sample, n is the sample size, and z_(α/2) is the critical value of the standard normal distribution. The value of α is determined by the confidence level C (i.e., α = 1 − C). The sample size is equal to the size of the population. When utilizing a 95% confidence interval (α = 1 − 0.95 = 0.05), the population mean can be determined by referring to
Table 7,
Table 8,
Table 9 and
Table 10.
Here, we compute confidence intervals to assess how well our algorithm performs in comparison to the earlier algorithms, which determines the performance improvement in percentage terms: P = [(Avg(Jaya, I-JMSO, GWO, I-GWO, I-PSONSO) − Avg(I-PSO-Jaya)) / Avg(Jaya, I-JMSO, GWO, I-GWO, I-PSONSO)] × 100, where Avg(Jaya, I-JMSO, GWO, I-GWO, I-PSONSO) denotes the average of the results achieved by the standard Jaya, I-JMSO, GWO, I-GWO, and I-PSONSO algorithms, Avg(I-PSO-Jaya) denotes the average result achieved by I-PSO-Jaya, and P denotes the performance improvement.
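These two computations can be sketched as follows (the 1.96 critical value corresponds to a 95% CI; in the usage line, the sample size of 30 is a hypothetical run count, while the mean and standard deviation are the experiment 9 Academia values):

```python
import math

def confidence_interval(mean, std, n, z=1.96):
    """CI for the mean: mean +/- z * std / sqrt(n); z = 1.96 gives 95%."""
    half = z * std / math.sqrt(n)
    return mean - half, mean + half

def improvement_pct(avg_others, avg_proposed):
    """Relative improvement of the proposed algorithm, in percent."""
    return (avg_others - avg_proposed) / avg_others * 100.0

# Experiment 9 (Academia) mean and std; n = 30 is a hypothetical run count.
lo, hi = confidence_interval(mean=1.6313, std=0.5014, n=30)
```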
Confidence intervals for the Academia Stack Exchange dataset: The CI of the average (mean) communication cost is displayed in
Table 7 for the 9 experiments on the Stack Exchange platform. The data presented in
Table 7 illustrate the mean communication cost for experiments 1 to 9, with the number of skills increasing from 2 by one up to 10 and the number of iterations increasing from 5 by five up to 30, inclusive. As shown in
Table 7, the I-PSO-Jaya communication cost decreases progressively with each iteration compared to the other algorithms: the maximum performance improvement obtained is 73.51%.
Confidence intervals for the ACM dataset: The CI of the average communication cost is displayed in
Table 8 for the 9 experiments on the ACM platform. The data presented in
Table 8 illustrate the mean communication cost for experiments 1 to 9, with the number of skills increasing from 2 by one up to 10. As shown in
Table 8, the I-PSO-Jaya communication cost decreases progressively with each iteration compared to the other algorithms: the maximum performance improvement obtained is 92.69%.
Our proposed approach is designed to provide an optimized solution when more skills are required; when fewer skills are required, the baseline alternative approaches may give better results. This arises because our algorithm is specifically tuned to handle complex, multi-skilled team formations and may be less efficient in simpler contexts where a more straightforward solution suffices.
7. Discussion
Advantages: The I-PSO-Jaya algorithm demonstrates several key advantages over the existing works, namely, I-PSONSO, I-GWO, and I-JMSO, based on the experimental results.
Experimental results for communication cost analysis: The I-PSO-Jaya algorithm consistently achieves lower communication costs, particularly as the number of skills increases. For example, in experiment 9 (10 skills) on the Academia Stack Exchange dataset, I-PSO-Jaya achieved a communication cost of 0.37, significantly outperforming I-PSONSO (0.48), I-GWO (5.07), and I-JMSO (4.97). Similarly, in the ACM dataset, I-PSO-Jaya recorded the lowest communication cost of 0.04 for scenarios with 10 skills, outperforming I-PSONSO (0.05), I-GWO (1.14), and I-JMSO (1.14). These results highlight the I-PSO-Jaya algorithm’s effectiveness in complex, multi-skilled team formations, offering substantial reductions in communication cost compared to the existing algorithms.
Experimental results for sample mean communication cost analysis: I-PSO-Jaya consistently achieves lower mean communication costs as the number of skills increases. In experiment 9 with 10 skills using the Academia Stack Exchange dataset, the mean communication cost is 1.6313, outperforming I-PSONSO (1.7036), I-GWO (7.0825), and I-JMSO (7.1301). Similarly, in the ACM dataset, I-PSO-Jaya achieved the lowest mean communication cost of 0.06, outperforming I-PSONSO (0.0619), I-GWO (3.5659), and I-JMSO (3.5789). These results demonstrate the I-PSO-Jaya algorithm’s superior ability to reduce average communication costs, particularly in complex scenarios, compared to other existing algorithms.
Experimental results for the standard deviation of communication cost analysis: The I-PSO-Jaya algorithm significantly reduces the standard deviation of communication costs, indicating greater consistency and reliability. For example, in experiment 9 (10 skills) on the Academia Stack Exchange dataset, I-PSO-Jaya achieved a standard deviation of 0.5014, outperforming I-PSONSO (0.5199), I-GWO (0.60655), and I-JMSO (0.7213). Similarly, in the ACM dataset, I-PSO-Jaya recorded the lowest standard deviation of 0.0605, outperforming I-PSONSO (0.0691), I-GWO (0.8774), and I-JMSO (0.9014). This indicates that the I-PSO-Jaya algorithm excels at maintaining low variability in communication costs, ensuring more predictable and stable team formations, which is crucial for complex, multi-skilled tasks.
Limitations: Our proposed approach, I-PSO-Jaya, is efficient in scenarios requiring more skills, yielding more satisfactory solutions for such complex, multi-skilled team formations. However, this also means that our algorithm incurs a higher computational time than the others. This is because I-PSO-Jaya is tuned explicitly for handling more complex team formation problems, making it less efficient for cases requiring fewer skills.
8. Conclusions
In this paper, we proposed a hybrid metaheuristic-based approach for team formation. The approach, a synergistic combination of two distinct heuristic algorithms, PSO and Jaya, can find a team with the minimum communication cost. We evaluated the algorithm using two real-world datasets: the Academia Stack Exchange and ACM datasets. Our proposed algorithm outperforms the previous approaches: the maximum improvement obtained is 73.51% for the Academia dataset and 92.69% for the ACM dataset.
As part of future work, we plan to reduce the state space by creating a sub-graph of required skills and agents, utilizing hash tables and one-hot encoding to retain only agents with the necessary skills while excluding others. We also aim to explore alternative metaheuristic-based hybrid algorithms for performance improvement. Another line of work could include addressing multi-objective optimization for team formation and developing a suitable heuristic algorithm.
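A sketch of this reduction, assuming skills are one-hot encoded as bit positions and a hash table maps each skill to its bit (the skill and agent names below are hypothetical):

```python
def skill_masks(skills):
    """Hash table assigning each skill a one-hot bit position."""
    return {s: 1 << i for i, s in enumerate(skills)}

def filter_agents(agents, required, masks):
    """Keep only agents holding at least one required skill.

    `agents` maps agent name -> set of skills; the bitmask intersection
    replaces a per-skill set scan over every agent."""
    req = 0
    for s in required:
        req |= masks[s]
    kept = {}
    for name, skills in agents.items():
        vec = 0
        for s in skills:
            vec |= masks.get(s, 0)
        if vec & req:  # agent's skill vector overlaps the required skills
            kept[name] = skills
    return kept

masks = skill_masks(["ml", "ai", "se", "hpc"])
agents = {"a": {"ml", "se"}, "b": {"hpc"}, "c": {"design"}}
team_pool = filter_agents(agents, ["ml", "ai"], masks)
```

Restricting the metaheuristic's search to `team_pool` shrinks the state space before any swap or update operation runs.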