Article
Peer-Review Record

Parallel PSO for Efficient Neural Network Training Using GPGPU and Apache Spark in Edge Computing Sets

Algorithms 2024, 17(9), 378; https://doi.org/10.3390/a17090378
by Manuel I. Capel 1,*, Alberto Salguero-Hidalgo 2 and Juan A. Holgado-Terriza 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 15 July 2024 / Revised: 10 August 2024 / Accepted: 15 August 2024 / Published: 26 August 2024
(This article belongs to the Collection Parallel and Distributed Computing: Algorithms and Applications)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper proposes an asynchronous parallel PSO, DAPSO, which is especially suitable for distributed environments. The method is compared with a more basic baseline method, DSPSO. The method and the results seem meaningful; however, several issues need to be addressed first.

 

It should be stated more clearly what is methodologically new in comparison to the related work. At the end of the introduction, the authors shall state the main contributions of the paper.

 

There are some issues with the structure of the paper. First, the related work is written too generally; it should be written in the context of the proposed methodology, especially in the field of distributed optimization. Next, DSPSO and DAPSO are described multiple times, e.g. in lines 163 to 169 and again in line 204. Consequently, the paper is quite wordy and should be condensed. The paper has to be reorganized to eliminate such redundancy, and the IMRAD structure shall be enforced.

 

Workflow images of both proposed methods (DSPSO and DAPSO) are missing; they would improve comprehension considerably. The images shall clearly show how the workflow is distributed across multiple processing units (nodes) and where the criterion function is actually executed.

 

Regarding the results:

  • Convergence curves for the case studies used are missing.

  • Some details are missing; for example, the exact ANN parameters, such as the activation function, are not provided.

  • As a baseline, PSO shall be used in the comparison. There shall be a direct comparison in the same table/chart.

  • The hardware used shall be described in detail.

 

More minor issues:

  • The citation style is unusual, e.g. [1][2][3]; it should be [1-3], etc.

  • Abbreviations shall be defined at first use, e.g. RDD.

  • Some sentences are very long and therefore hard to read, e.g. lines 163 to 169.

 

Comments on the Quality of English Language

Some sentences are very long and therefore hard to read, e.g. lines 163 to 169.

Author Response

Comment 1: “It should be stated more clearly what is methodologically new in comparison to the related work. At the end of the introduction, the authors shall state the main contributions of the paper”.

Response to the comment: We have improved the abstract and introduction to clarify the methodological contributions by including more specific information on GPU acceleration of particle swarm optimisation, such as the works of Hangfeng Liu (2021) and Chuan-Chi Wang et al. (2022), and by briefly comparing these with our own proposal. In contrast to these and the other works mentioned in the introduction, distributed PSO implementations built on Apache Spark, such as the one proposed here, scale horizontally by adding more nodes to the cluster and are therefore better suited to processing very large datasets, which is what we set out to do methodologically. We have also shown that performing the fitness computations and velocity updates asynchronously can significantly improve the performance of PSO parallelisation by reducing latency at synchronisation points. We consider this a novelty of our contribution, since it has been developed with Apache Spark and Scala, whereas most attempts over the past decade have relied mainly on CUDA. Finally, at the end of the introduction we have included a list of the main contributions of this work (see the highlighted lines 110-127 on page 3) and a short sentence summarising the ultimate goal of this research.
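To make the asynchronous idea concrete, the following minimal Scala/Spark sketch is purely illustrative and is not the implementation from the paper; names such as Particle, sphere and the swarm size are our own assumptions. It shows the fitness evaluation mapped over an RDD, which scales horizontally with the cluster, and the point where a synchronous (DSPSO-like) scheme introduces the per-iteration barrier that an asynchronous (DAPSO-like) scheme avoids.

import org.apache.spark.sql.SparkSession

object DistributedFitnessSketch {
  final case class Particle(position: Array[Double], velocity: Array[Double])

  // Placeholder objective (sphere function); in the paper the fitness would be
  // the training error of the ANN encoded by the particle's position.
  def sphere(p: Particle): Double = p.position.map(x => x * x).sum

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("pso-fitness-sketch")
      .master("local[*]") // for a standalone run; omit when using spark-submit
      .getOrCreate()
    val sc = spark.sparkContext

    val dim = 10
    val rng = new scala.util.Random(42)
    val swarm = Seq.fill(1024)(
      Particle(Array.fill(dim)(rng.nextDouble()), Array.fill(dim)(0.0))
    )

    // The expensive step, fitness evaluation, is mapped over an RDD so that it
    // runs in parallel on the worker nodes.
    val evaluated = sc.parallelize(swarm).map(p => (p, sphere(p)))

    // Synchronous scheme (DSPSO-like): collect() acts as a barrier, so the
    // driver updates the global best only after the slowest partition has
    // finished. An asynchronous scheme (DAPSO-like) instead lets results flow
    // back as they complete, avoiding this per-iteration wait.
    val globalBest = evaluated.collect().minBy(_._2)
    println(s"best fitness in this sweep: ${globalBest._2}")

    spark.stop()
  }
}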

Comment 2: “the related work is written too generally. It should be written in the context of the proposed methodology, especially in the field of distributed optimization”.

Response to the comment: We have shortened the introduction and focused on the aspects most relevant to the distributed optimisation of PSO, mainly by highlighting recent papers that pursue a scalable implementation of distributed PSO with GPUs and, in particular, with Spark. Our interest in scalable machine learning through the distributed implementation of stochastic algorithms also follows directly from the main objective of this research: using Apache Spark so that the algorithm implementation can efficiently handle and process massive datasets in a distributed and edge computing environment.

Comment 3: “the paper is quite wordy and it should be condensed”

Response to the comment: The introduction has been shortened to focus more on distributed PSO implementations and on the current state-of-the-art improvements to PSO through GPU acceleration. Section 2.1 has also been shortened, and the redundant definitions of DSPSO and DAPSO identified by the reviewer have been removed.

Comment 4: “Workflow images of both proposed methods are missing (DSPSO and DAPSO)”.

Response to the comment: Sections 3.1 (page 7) and 3.2 (page 9) now contain flowcharts of the master and worker nodes for each algorithm, i.e. the DSPSO and DAPSO distributed variants of the PSO algorithm. These flowcharts show which node is responsible for evaluating the fitness function and for calculating the best position of each particle.

Comment 5: “Regarding the results:

-Convergence curves are missing”

Response: Convergence curves for the DAPSO and DSPSO variants have been plotted from the fitness values computed throughout the execution of both implementations and are now shown at the end of Section 4.2 (page 16).

- “Some details are missing; for example, the exact ANN parameters, such as the activation function, are not provided”.

Response: These omissions have been addressed by including new tables with the requested data: the new Tables 2-3 and 5 provide information on the parameters of the networks used for each case study...

- “PSO shall be used in the comparison. There shall be a direct comparison in the same table/chart”.

Response: We would like to point out that the graphs in Figure 7 (performance evaluation with a changing number of particles and iterations) already included the execution time of the classical sequential PSO (the green bar). Nevertheless, we have added new results for the execution times measured for the three implemented algorithms, DSPSO, DAPSO and sequential PSO, on 175,104 data samples, as shown in the new Table 3 and the modified Table 9.

- “The hardware used shall be described in detail.”

Response: This has been addressed by including Table 11, which describes the hardware used in detail.

 

Comment 6: “Minor issues”

  • “There is a strange citation style, e.g. [1][2][3], it should be [1-3]”: All references in the text have been corrected and carefully revised, as the reviewer requested.

  • “Abbreviations shall be defined at the first usage, e.g. RDD”: We would like to point out that a list of all the abbreviations used in this manuscript appears on page 22; nevertheless, we have followed the reviewer's suggestion while avoiding a text so full of parentheses that it becomes tedious to read.

  • “Some sentences are really hard to read as they are very long, e.g. lines 163 to 169.”: The sentence in question was indeed duplicated, as the reviewer pointed out; it has been revised, together with other similarly long sentences, to improve the readability of the text.

Reviewer 2 Report

Comments and Suggestions for Authors

The article presents the PSO algorithm applied to ANN training.
The authors considered modifying the algorithm by using either a constant or a variable number of particles in each iteration (i.e., with and without synchronization after each iteration). A similar approach has been considered for evolutionary algorithms in the past (e.g. the work of the team of Prof. J. Periaux).
The novelty of the work lies in the use of Apache Spark and in the training tests on selected benchmark ANNs. Using PSO for ANN training is possible but often not cost-effective compared with other methods because of its computational cost; a comment on this matter would increase the value of the article.
Another issue is the choice of PSO: were other methods considered, or was the aim to present a way of using Apache Spark for optimization with one algorithm from the evolutionary class?

Author Response

Comment 1: “The novelty in the work is the use of Apache Spark and tests of learning selected benchmark ANNs”.

Response to the comment: Thank you very much for this kind observation on our work.

Comment 2: “The use of PSO for ANN training is possible, but is often not cost-effective… a comment on this matter would increase the value of the article”.

Response to the comment: Thank you for this excellent suggestion. We have added a sentence to the introduction that clarifies why we chose PSO for ANN training: "The choice of Particle Swarm Optimisation (PSO) over other Evolutionary Algorithms (EAs) in a distributed optimisation framework such as Apache Spark is mainly justified by its simplicity and its fast convergence to optimal or near-optimal solutions, which makes it a cost-effective choice for large-scale optimisation problems; moreover, the communication overhead of PSO is minimal compared to other EAs that may require more complex interactions between individuals (e.g. crossover operations in Genetic Algorithms)." In addition, we would like to stress that the distributed implementation of PSO, despite the drawbacks it presents for network training, can be useful for running the algorithm at the edge, given its easier parallelisation and lower resource usage.
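For reference, the standard (textbook) PSO update equations below illustrate why the only information that must be exchanged across the swarm is the global best position; they are the canonical formulation and not necessarily the exact variant or parameter settings used in the paper:

\[
  v_i^{t+1} = \omega\, v_i^{t}
            + c_1 r_1 \bigl(p_i^{\mathrm{best}} - x_i^{t}\bigr)
            + c_2 r_2 \bigl(g^{\mathrm{best}} - x_i^{t}\bigr),
  \qquad
  x_i^{t+1} = x_i^{t} + v_i^{t+1},
\]

where \(\omega\) is the inertia weight, \(c_1, c_2\) are acceleration coefficients and \(r_1, r_2\) are uniform random numbers in \([0,1]\). Only \(g^{\mathrm{best}}\) needs to be shared between workers, whereas, for example, crossover in a Genetic Algorithm requires pairing individuals and exchanging their genomes.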

Further response: Thank you very much for your valuable feedback. You raise a valid point regarding the cost-effectiveness of using Particle Swarm Optimisation (PSO) to train Artificial Neural Networks (ANNs). While PSO may not always be the most cost-effective method compared with traditional gradient-based approaches, genetic algorithms and other techniques, there are specific scenarios where its use is justified by its "intrinsic" parallelisation scheme. The main focus of our article is therefore the parallelisation of PSO using Apache Spark: by exploiting parallel computing resources, the computational overhead typically associated with PSO can be significantly reduced, making it more competitive in terms of computational cost.
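As a purely illustrative aside (a sketch under our own assumptions, not the code from the paper), the plain-Scala fragment below shows what training an ANN with PSO amounts to: a particle's position is the flattened weight vector of a small one-hidden-layer network, and its fitness is the training loss over the dataset. This is why fitness evaluation dominates the cost and is the natural step to parallelise.

// Hypothetical sketch: a particle's position encodes the weights of a tiny
// one-hidden-layer network with tanh activation, flattened as
// [W1 (nIn*nHid), b1 (nHid), W2 (nHid), b2 (1)].
object AnnFitnessSketch {

  def predict(w: Array[Double], x: Array[Double], nIn: Int, nHid: Int): Double = {
    var out = 0.0
    for (h <- 0 until nHid) {
      var a = w(nIn * nHid + h)                           // hidden bias b1(h)
      for (i <- 0 until nIn) a += w(h * nIn + i) * x(i)   // W1(h, i)
      out += w(nIn * nHid + nHid + h) * math.tanh(a)      // W2(h)
    }
    out + w(nIn * nHid + 2 * nHid)                        // output bias b2
  }

  // Fitness of a particle = mean squared error over the training set; this is
  // the quantity each worker would compute for the particles assigned to it.
  def fitness(w: Array[Double], xs: Array[Array[Double]], ys: Array[Double],
              nIn: Int, nHid: Int): Double = {
    val errs = xs.zip(ys).map { case (x, y) =>
      val e = predict(w, x, nIn, nHid) - y
      e * e
    }
    errs.sum / xs.length
  }

  def main(args: Array[String]): Unit = {
    val (nIn, nHid) = (3, 4)
    val dim = nIn * nHid + 2 * nHid + 1                   // length of a particle's position vector
    val rng = new scala.util.Random(1)
    val w  = Array.fill(dim)(rng.nextDouble() - 0.5)      // one random particle
    val xs = Array.fill(8)(Array.fill(nIn)(rng.nextDouble()))
    val ys = xs.map(_.sum)                                // toy regression target
    println(s"fitness (MSE) of one particle: ${fitness(w, xs, ys, nIn, nHid)}")
  }
}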

Comment 3: “Why not use Apache Spark for (distributed) optimization with another algorithm from the evolutionary class?”

Response to the comment: Thank you for this valuable observation. The choice of PSO for distributed optimisation in an Apache Spark environment is driven by its simplicity, efficiency and scalability. While other evolutionary algorithms have their strengths, PSO's specific advantages make it particularly suitable for continuous optimisation problems, especially when fast convergence and low computational overhead are priorities. In a distributed setting, PSO's ease of parallelisation and minimal communication requirements further enhance its appeal, making it a robust choice for large-scale optimisation tasks.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The paper has been corrected sufficiently and therefore it can be accepted.
