

Parallel Computing and Grid Computing: Technologies and Applications

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 July 2025 | Viewed by 6088

Special Issue Editor


Guest Editor
Key Laboratory of Computational Geodynamics, University of Chinese Academy of Sciences, Beijing 100049, China
Interests: parallel computing and grid computing

Special Issue Information

Dear Colleagues,

Parallel computing and grid computing are widely used to solve computational problems, especially in optimization, and a growing number of algorithms and methods have been developed for and applied to massive computing structures and systems.
This Special Issue is devoted to topics in parallel computing and grid computing, including theory and applications. The focus will be on applications involving parallel and grid methods of solving hard computational problems.

Prof. Dr. Huai Zhang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, you can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • parallel computing
  • parallel solvers
  • high-performance computing
  • sparse matrices
  • interconnection networks
  • grid computing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found on the journal website.

Published Papers (5 papers)


Research

16 pages, 696 KiB  
Article
Optimizing Lattice Basis Reduction Algorithm on ARM V8 Processors
by Ronghui Cao, Julong Wang, Liming Zheng, Jincheng Zhou, Haodong Wang, Tiaojie Xiao and Chunye Gong
Appl. Sci. 2025, 15(4), 2021; https://doi.org/10.3390/app15042021 - 14 Feb 2025
Viewed by 338
Abstract
The LLL (Lenstra–Lenstra–Lovász) algorithm is an important method for lattice basis reduction and has broad applications in computer algebra, cryptography, number theory, and combinatorial optimization. However, current LLL algorithms face challenges such as inadequate adaptation to domestic supercomputers and low efficiency. To enhance the efficiency of the LLL algorithm in practical applications, this research focuses on parallel optimization of the LLL_FP (LLL double-precision floating-point type) algorithm from the NTL library on the domestic Tianhe supercomputer using the Phytium ARM V8 processor. The optimization begins with the vectorization of the Gram–Schmidt coefficient calculation and row transformation using the SIMD instruction set of the Phytium chip, which significantly improves computational efficiency. Assembly-level optimization fully utilizes the low-level instructions of the Phytium processor, further increasing execution speed. For memory access, data prefetch techniques were employed to load necessary data before computation, reducing cache misses and accelerating data processing. To further enhance performance, loop unrolling was applied to the core loop, allowing more operations per loop iteration. Experimental results show that the optimized LLL_FP algorithm achieves up to a 42% performance improvement, with a minimum improvement of 34% and an average improvement of 38% in single-core efficiency compared to the serial LLL_FP algorithm. This study provides a more efficient solution for large-scale lattice basis reduction and demonstrates the potential of the LLL algorithm in ARM V8 high-performance computing environments. Full article
(This article belongs to the Special Issue Parallel Computing and Grid Computing: Technologies and Applications)

13 pages, 1240 KiB  
Article
A Parallel Monte Carlo Algorithm for the Life Cycle Asset Allocation Problem
by Xueying Yang, Chen Li, Xu Li and Zhonghua Lu
Appl. Sci. 2024, 14(22), 10372; https://doi.org/10.3390/app142210372 - 11 Nov 2024
Viewed by 1053
Abstract
Life cycle asset allocation is a crucial aspect of financial planning, especially for pension funds. Traditional methods often face challenges in computational efficiency and applicability to different market conditions. This study aimed to innovatively transplant an algorithm from reinforcement learning that enhances the efficiency and accuracy of life cycle asset allocation. We combined tabular methods with Monte Carlo simulations to solve the pension problem. This algorithm was designed to map states in reinforcement learning to key variables in the pension model: wealth, labor income, consumption level, and proportion of risky assets. Additionally, we used cleaned and modeled survey data from Chinese consumers to validate the model’s optimal decision-making in the Chinese market. Furthermore, we optimized the algorithm using parallel computing to significantly reduce computation time. The proposed algorithm demonstrated superior efficiency compared to the traditional value iteration method. Serial execution of our algorithm took 29.88 min, while parallel execution reduced this to 1.42 min, compared to the 41.15 min required by the value iteration method. These innovations suggest significant potential for improving pension fund management strategies, particularly in the context of the Chinese market. Full article
(This article belongs to the Special Issue Parallel Computing and Grid Computing: Technologies and Applications)

19 pages, 4714 KiB  
Article
Optimization Research of Heterogeneous 2D-Parallel Lattice Boltzmann Method Based on Deep Computing Unit
by Shunan Tao, Qiang Li, Quan Zhou, Zhaobing Han and Lu Lu
Appl. Sci. 2024, 14(14), 6078; https://doi.org/10.3390/app14146078 - 12 Jul 2024
Viewed by 1097
Abstract
Currently, research on the lattice Boltzmann method mainly focuses on its numerical simulation and applications, and there is an increasing demand for large-scale simulations in practical scenarios. In response to this situation, this study successfully implemented a large-scale heterogeneous parallel algorithm for the lattice Boltzmann method using OpenMP, MPI, Pthread, and OpenCL parallel technologies on the “Dongfang” supercomputer system. The accuracy and effectiveness of this algorithm were verified through the lid-driven cavity flow simulation. The paper focused on optimizing the algorithm in four aspects: Firstly, non-blocking communication was employed to overlap communication and computation, thereby improving parallel efficiency. Secondly, high-speed shared memory was utilized to enhance memory access performance and reduce latency. Thirdly, a balanced computation between the central processing unit and the accelerator was achieved through proper task partitioning and load-balancing strategies. Lastly, memory access efficiency was improved by adjusting the memory layout. Performance testing demonstrated that the optimized algorithm exhibited improved parallel efficiency and scalability, with computational performance that is 4 times greater than before optimization and 20 times that of a 32-core CPU. Full article
(This article belongs to the Special Issue Parallel Computing and Grid Computing: Technologies and Applications)

16 pages, 15647 KiB  
Article
Numerical Simulation of the Influence of the Baihetan Reservoir Impoundment on Regional Seismicity
by Zitao Wang, Huai Zhang, Yicun Guo and Qiu Meng
Appl. Sci. 2024, 14(12), 5145; https://doi.org/10.3390/app14125145 - 13 Jun 2024
Viewed by 790
Abstract
The Baihetan Reservoir was built for hydropower in China. The rise of the reservoir water has led to a series of earthquakes in the surrounding area. This study proposes fully coupled equations of pore-viscoelasticity and a parallel partition mesh model to study the short- and long-term effects of the Baihetan Reservoir and further calculate the changes in stress, pore pressure, and Coulomb failure stress with time on the major faults. Based on the calculation results, impoundment increases regional seismicity, which is consistent with the seismic catalog. The reservoir impoundment causes an increase in pore pressure in the crust, primarily enhancing Coulomb failure stress beneath the reservoir center. This effect extends to approximately 60 km in length and 20 km in width at a depth layer of 5–10 km. Seismicity varies greatly among different faults. Coulomb failure stress increases on the northern part of the Xiaojiang Fault and Zhaotong-Ludian Fault, and decreases on the southern part of the Xiaojiang Fault and Zemuhe Fault. The Coulomb failure stress is highly correlated with the number of earthquakes along the Xiaojiang Fault. The influence of the reservoir on the local seismicity is mainly limited to the first several months, with only a slight effect later on. The focal depth of the induced earthquakes increases while the magnitude decreases. The earthquakes caused by the impoundment all have a small magnitude, and the Ms4.3 Qiaojia earthquake on 30 March 2022, was more likely a natural event. Full article
(This article belongs to the Special Issue Parallel Computing and Grid Computing: Technologies and Applications)

13 pages, 8672 KiB  
Article
Efficient Parallel FDTD Method Based on Non-Uniform Conformal Mesh
by Kaihui Liu, Tao Huang, Liang Zheng, Xiaolin Jin, Guanjie Lin, Luo Huang, Wenjing Cai, Dapeng Gong and Chunwang Fang
Appl. Sci. 2024, 14(11), 4364; https://doi.org/10.3390/app14114364 - 21 May 2024
Viewed by 1678
Abstract
The finite-difference time-domain (FDTD) method is a versatile electromagnetic simulation technique, widely used for solving various broadband problems. However, when dealing with complex structures and large dimensions, and especially when applying perfectly matched layer (PML) absorbing boundaries, the computational burden becomes tremendous. To reduce the computational time and memory, this paper presents a Message Passing Interface (MPI) parallel scheme based on non-uniform conformal FDTD, which is suitable for convolutional perfectly matched layer (CPML) absorbing boundaries, and adopts a domain decomposition approach, dividing the entire computational domain into several subdomains. More importantly, only one magnetic field exchange is required during the iterations, and the electric field update is divided into internal and external parts, facilitating the synchronous communication of magnetic fields between adjacent subdomains and internal electric field updates. Finally, unmanned helicopters, helical antennas, 100-period folded waveguides, and 16 × 16 phased array antennas are designed to verify the accuracy and efficiency of the algorithm. Moreover, we conducted parallel tests on a supercomputing platform, showing its satisfactory reduction in computational time and excellent parallel efficiency. Full article
(This article belongs to the Special Issue Parallel Computing and Grid Computing: Technologies and Applications)
