Search Results (23)

Search Parameters:
Keywords = exascale computing

25 pages, 1157 KB  
Article
Investigating Supercomputer Performance with Sustainability in the Era of Artificial Intelligence
by Haruna Chiroma
Appl. Sci. 2025, 15(15), 8570; https://doi.org/10.3390/app15158570 - 1 Aug 2025
Viewed by 1223
Abstract
The demand for high-performance computing (HPC) continues to grow, driven by its critical role in advancing innovations in the rapidly evolving field of artificial intelligence. HPC has now entered the era of exascale supercomputers, introducing significant challenges related to sustainability. Balancing HPC performance with environmental sustainability presents a complex, multi-objective optimization problem. To the best of the author’s knowledge, no recent comprehensive investigation has explored the interplay between supercomputer performance and sustainability over a five-year period. This paper addresses this gap by examining the balance between these two aspects across that period. This study collects and analyzes multi-year data on supercomputer performance and energy efficiency. The findings indicate that supercomputers pursuing higher performance often face challenges in maintaining top sustainability, while those focusing on sustainability tend to face challenges in achieving top performance. The analysis reveals that both the performance and power consumption of supercomputers have been rapidly increasing over the last five years. The findings also reveal that the performance of the most computationally powerful supercomputers is directly proportional to power consumption. The energy efficiency gains achieved by some top-performing supercomputers become challenging to maintain in the pursuit of higher performance. The findings of this study highlight the ongoing race toward zettascale supercomputers. This study can provide policymakers, researchers, and technologists with foundational evidence for rethinking supercomputing in the era of artificial intelligence. Full article
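
Energy efficiency in comparisons like the one above is commonly reported as sustained performance per watt (GFlops/W, the metric used by the Green500 list). The abstract does not state its exact metric, so the short C++ sketch below only illustrates that arithmetic with hypothetical Rmax and power figures, not any system's published numbers.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical systems: sustained performance (Rmax, PFlop/s) and power draw (MW).
// All values are illustrative only, not published benchmark results.
struct System {
    std::string name;
    double rmax_pflops;
    double power_mw;
};

int main() {
    std::vector<System> systems = {
        {"exascale-class",   1100.0, 22.0},  // roughly 1.1 EFlop/s at roughly 22 MW
        {"efficiency-tuned",   40.0,  0.6},  // smaller, efficiency-oriented machine
    };

    for (const auto& s : systems) {
        // GFlops/W = (PFlop/s * 1e6 GFlop/PFlop) / (MW * 1e6 W/MW)
        double gflops_per_watt = (s.rmax_pflops * 1e6) / (s.power_mw * 1e6);
        std::cout << s.name << ": " << gflops_per_watt << " GFlops/W\n";
    }
}
```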

21 pages, 428 KB  
Article
Accelerated Numerical Simulations of a Reaction-Diffusion-Advection Model Using Julia-CUDA
by Angelo Ciaramella, Davide De Angelis, Pasquale De Luca and Livia Marcellino
Mathematics 2025, 13(9), 1488; https://doi.org/10.3390/math13091488 - 30 Apr 2025
Cited by 1 | Viewed by 571
Abstract
The emergence of exascale computing systems presents both opportunities and challenges in scientific computing, particularly for complex mathematical models requiring high-performance implementations. This paper addresses these challenges in the context of biomedical applications, specifically focusing on tumor angiogenesis modeling. We present a parallel implementation for solving a system of partial differential equations that describe the dynamics of tumor-induced blood vessel formation. Our approach leverages the Julia programming language and its CUDA capabilities, combining a high-level paradigm with efficient GPU acceleration. The implementation incorporates advanced optimization strategies for memory management and kernel organization, demonstrating significant performance improvements for large-scale simulations while maintaining numerical accuracy. Experimental results confirm the performance gains and reliability of the proposed parallel implementation. Full article
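
The paper's solver is written in Julia with CUDA; purely as a language-neutral illustration of the kind of explicit update such a solver parallelizes, the C++ sketch below performs one forward-Euler step of a 1D reaction-diffusion-advection equation u_t = D*u_xx - v*u_x + r*u*(1 - u). The 1D setting, the logistic reaction term, and all coefficient names are illustrative assumptions, not the paper's angiogenesis model.

```cpp
#include <cstddef>
#include <vector>

// One explicit (forward-Euler) step of u_t = D*u_xx - v*u_x + r*u*(1-u) on a 1D grid.
// Central differences for diffusion, first-order upwind for advection (v > 0 assumed).
void rda_step(std::vector<double>& u, double D, double v, double r,
              double dx, double dt) {
    const std::size_t n = u.size();
    std::vector<double> un = u;  // copy of the previous time level
    for (std::size_t i = 1; i + 1 < n; ++i) {
        double diffusion = D * (un[i - 1] - 2.0 * un[i] + un[i + 1]) / (dx * dx);
        double advection = -v * (un[i] - un[i - 1]) / dx;  // upwind, v > 0
        double reaction  = r * un[i] * (1.0 - un[i]);
        u[i] = un[i] + dt * (diffusion + advection + reaction);
    }
    // Boundary values u[0] and u[n-1] are left fixed (Dirichlet) in this sketch.
}
```

On a GPU, the loop body becomes the per-thread kernel; the memory-management and kernel-organization optimizations mentioned in the abstract concern how this grid update is mapped onto device memory and thread blocks.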

35 pages, 11134 KB  
Article
Error Classification and Static Detection Methods in Tri-Programming Models: MPI, OpenMP, and CUDA
by Saeed Musaad Altalhi, Fathy Elbouraey Eassa, Sanaa Abdullah Sharaf, Ahmed Mohammed Alghamdi, Khalid Ali Almarhabi and Rana Ahmad Bilal Khalid
Computers 2025, 14(5), 164; https://doi.org/10.3390/computers14050164 - 28 Apr 2025
Viewed by 847
Abstract
The growing adoption of supercomputers across various scientific disciplines, particularly by researchers without a background in computer science, has intensified the demand for parallel applications. These applications are typically developed using a combination of programming models within languages such as C, C++, and Fortran. However, modern multi-core processors and accelerators necessitate fine-grained control to achieve effective parallelism, complicating the development process. To address this, developers commonly utilize high-level programming models such as Open Multi-Processing (OpenMP), Open Accelerators (OpenACCs), Message Passing Interface (MPI), and Compute Unified Device Architecture (CUDA). These models may be used independently or combined into dual- or tri-model applications to leverage their complementary strengths. However, integrating multiple models introduces subtle and difficult-to-detect runtime errors such as data races, deadlocks, and livelocks that often elude conventional compilers. This complexity is exacerbated in applications that simultaneously incorporate MPI, OpenMP, and CUDA, where the origin of runtime errors, whether from individual models, user logic, or their interactions, becomes ambiguous. Moreover, existing tools are inadequate for detecting such errors in tri-model applications, leaving a critical gap in development support. To address this gap, the present study introduces a static analysis tool designed specifically for tri-model applications combining MPI, OpenMP, and CUDA in C++-based environments. The tool analyzes source code to identify both actual and potential runtime errors prior to execution. Central to this approach is the introduction of error dependency graphs, a novel mechanism for systematically representing and analyzing error correlations in hybrid applications. By offering both error classification and comprehensive static detection, the proposed tool enhances error visibility and reduces manual testing effort. This contributes significantly to the development of more robust parallel applications for high-performance computing (HPC) and future exascale systems. Full article
(This article belongs to the Special Issue Best Practices, Challenges and Opportunities in Software Engineering)
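
As a concrete instance of the runtime-error classes the tool targets, the C++ fragment below shows a classic OpenMP data race together with its fix. It uses only OpenMP so that it stays minimal and is not taken from the paper; in a real tri-model application the same pattern would be buried inside MPI- and CUDA-related code, which is what makes static detection valuable.

```cpp
#include <cstdio>
#include <omp.h>

int main() {
    const int n = 1000000;
    long long racy_sum = 0;

    // Data race: all threads update 'racy_sum' without synchronization, so the
    // result is nondeterministic. This unsynchronized write to a shared
    // variable is exactly the kind of defect a static analyzer can flag.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        racy_sum += i;
    }

    // Correct version: reduction(+) gives each thread a private copy of the
    // accumulator and combines the copies at the end of the parallel region.
    long long safe_sum = 0;
    #pragma omp parallel for reduction(+ : safe_sum)
    for (int i = 0; i < n; ++i) {
        safe_sum += i;
    }

    std::printf("racy sum    = %lld\n", racy_sum);
    std::printf("reduced sum = %lld\n", safe_sum);
    return 0;
}
```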

35 pages, 9206 KB  
Article
New Strategies Based on Hierarchical Matrices for Matrix Polynomial Evaluation in Exascale Computing Era
by Luisa Carracciuolo and Valeria Mele
Mathematics 2025, 13(9), 1378; https://doi.org/10.3390/math13091378 - 23 Apr 2025
Viewed by 527
Abstract
Advancements in computing platform deployment have acted as both push and pull elements for the advancement of engineering design and scientific knowledge. Historically, improvements in computing platforms were mostly dependent on simultaneous developments in hardware, software, architecture, and algorithms (a process known as co-design), which raised the performance of computational models. However, there are many obstacles to using the sophisticated computing platforms of the Exascale Computing Era effectively. These include, but are not limited to, massive parallelism and its effective exploitation, and the high complexity of programming heterogeneous computing facilities. So, now is the time to create new algorithms that are more resilient, energy-aware, and able to address the demands of increasing data locality and achieve much higher concurrency through high levels of scalability and granularity. In this context, some methods, such as those based on hierarchical matrices (HMs), have been declared among the most promising in the use of new computing resources precisely because of their strongly hierarchical nature. This work begins to assess the advantages and limits of using HMs in operations such as the evaluation of matrix polynomials, which are crucial, for example, in a Graph Convolutional Deep Neural Network (GC-DNN) context. A case study from the GC-DNN context provides some insights into the effectiveness, in terms of accuracy, of the employment of HMs. Full article
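
For reference, the dense baseline that hierarchical-matrix strategies aim to improve on is plain matrix polynomial evaluation, usually arranged in Horner form, p(A) = c_0*I + A*(c_1*I + A*(c_2*I + ...)). The C++ sketch below shows only that dense baseline with a naive matrix product; it involves no hierarchical compression and is illustrative rather than the authors' method.

```cpp
#include <cstddef>
#include <vector>

// Dense n x n matrix stored row-major.
using Matrix = std::vector<double>;

Matrix matmul(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}

// Evaluate p(A) = c[0]*I + c[1]*A + ... + c[d]*A^d by Horner's scheme:
// B = c[d]*I; then repeatedly B = A*B + c[k]*I for k = d-1, ..., 0.
// The coefficient vector c must be non-empty.
Matrix matrix_polynomial(const Matrix& A, const std::vector<double>& c, std::size_t n) {
    Matrix B(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) B[i * n + i] = c.back();   // B = c[d]*I
    for (std::size_t k = c.size() - 1; k-- > 0; ) {
        B = matmul(A, B, n);
        for (std::size_t i = 0; i < n; ++i) B[i * n + i] += c[k];  // B += c[k]*I
    }
    return B;
}
```

In the HM-based strategies studied in the paper, the dense products above are replaced by block-wise low-rank arithmetic, which is where the hierarchical structure pays off.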

14 pages, 781 KB  
Article
Efficient I/O Performance-Focused Scheduling in High-Performance Computing
by Soeun Kim, Sunggon Kim and Hwajung Kim
Appl. Sci. 2024, 14(21), 10043; https://doi.org/10.3390/app142110043 - 4 Nov 2024
Viewed by 2461
Abstract
High-performance computing (HPC) systems are becoming increasingly important as contemporary exascale applications demand extensive computational and data processing capability. To optimize these systems, efficient scheduling of HPC applications is important. In particular, because I/O is a shared resource among applications and is becoming more important due to the emergence of big data, it is possible to improve performance by considering the architecture of HPC systems and scheduling jobs based on I/O resource requirements. In this paper, we propose a scheduling scheme that prioritizes HPC applications based on their I/O requirements. To accomplish this, our scheme analyzes the IOPS of scheduled applications by examining their execution history. Then, it schedules the applications at pre-configured intervals based on their expected IOPS to maximize the available IOPS across the entire system. Compared to the existing first-come first-served (FCFS) algorithm, experimental results using real-world HPC log data show that our scheme reduces total execution time by 305 h and decreases costs by USD 53 when scheduling 10,000 jobs utilizing public cloud resources. Full article
(This article belongs to the Special Issue Distributed Computing Systems: Advances, Trends and Emerging Designs)
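
A minimal sketch of the scheduling idea described above, not the authors' exact algorithm: each job carries an IOPS estimate derived from its execution history, and in each scheduling interval the scheduler greedily admits queued jobs while the aggregate estimate stays within a system-wide IOPS budget. The Job structure, field names, and numbers below are all hypothetical.

```cpp
#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

// Hypothetical job record: expected IOPS estimated from execution history.
struct Job {
    int id;
    double expected_iops;
};

// Greedy admission for one scheduling interval: take queued jobs in order of
// decreasing expected IOPS while the total stays within the system budget.
// Admitted jobs are returned; deferred jobs remain in the queue.
std::vector<Job> schedule_interval(std::vector<Job>& queue, double iops_budget) {
    std::sort(queue.begin(), queue.end(),
              [](const Job& a, const Job& b) { return a.expected_iops > b.expected_iops; });
    std::vector<Job> admitted, deferred;
    double used = 0.0;
    for (const Job& j : queue) {
        if (used + j.expected_iops <= iops_budget) {
            used += j.expected_iops;
            admitted.push_back(j);
        } else {
            deferred.push_back(j);
        }
    }
    queue = std::move(deferred);
    return admitted;
}

int main() {
    std::vector<Job> queue = {{1, 4000}, {2, 1500}, {3, 2500}, {4, 800}};
    for (const Job& j : schedule_interval(queue, 6000.0))
        std::printf("admit job %d (%.0f IOPS)\n", j.id, j.expected_iops);
}
```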

83 pages, 2747 KB  
Review
Mathematical Tools for Simulation of 3D Bioprinting Processes on High-Performance Computing Resources: The State of the Art
by Luisa Carracciuolo and Ugo D’Amora
Appl. Sci. 2024, 14(14), 6110; https://doi.org/10.3390/app14146110 - 13 Jul 2024
Cited by 4 | Viewed by 2424
Abstract
Three-dimensional (3D) bioprinting belongs to the wide family of additive manufacturing techniques and employs cell-laden biomaterials. In particular, these materials, named “bioink”, are based on cytocompatible hydrogel compositions. To be printable, a bioink must have certain characteristics before, during, and after the printing process. These characteristics include achievable structural resolution, shape fidelity, and cell survival. In previous centuries, scientists created mathematical models to understand how physical systems function. Only recently, with the quick progress of computational capabilities, high-fidelity and high-efficiency “computational simulation” tools have been developed based on such models and used as a proxy for real-world learning. Computational science, or “in silico” experimentation, is the term for this novel strategy that supplements pure theory and experiment. Moreover, a certain level of complexity characterizes the architecture of contemporary powerful computational resources, known as high-performance computing (HPC) resources, also due to the great heterogeneity of their structure. Lately, scientists and engineers have begun to develop and use computational models more extensively to also better understand the bioprinting process, rather than solely relying on experimental research, due to the large number of possible combinations of geometrical parameters and material properties, as well as the abundance of available bioprinting methods. This requires a new effort in designing and implementing computational tools capable of efficiently and effectively exploiting the potential of new HPC computing systems available in the Exascale Era. The final goal of this work is to offer an overview of the models, methods, and techniques that can be used for “in silico” experimentation on the physicochemical processes underlying the 3D bioprinting of cell-laden materials thanks to the use of up-to-date HPC resources. Full article

30 pages, 5007 KB  
Article
Temporal-Logic-Based Testing Tool Architecture for Dual-Programming Model Systems
by Salwa Saad, Etimad Fadel, Ohoud Alzamzami, Fathy Eassa and Ahmed M. Alghamdi
Computers 2024, 13(4), 86; https://doi.org/10.3390/computers13040086 - 25 Mar 2024
Cited by 2 | Viewed by 2160
Abstract
Today, various applications in different domains increasingly rely on high-performance computing (HPC) to accomplish computations swiftly. Integrating one or more programming models alongside the programming language in use enhances system parallelism, thereby improving its performance. However, this integration can introduce runtime errors such as race conditions, deadlocks, or livelocks. Some of these errors may go undetected using conventional testing techniques, necessitating the exploration of additional methods for enhanced reliability. Formal methods, such as temporal logic, can be useful for detecting runtime errors since they have been widely used in real-time systems. Additionally, many software systems must adhere to temporal properties to ensure correct functionality. Temporal logics indeed serve as a formal framework that takes the temporal aspect into account when describing changes in elements or states over time. This paper proposes a temporal-logic-based testing tool utilizing instrumentation techniques designed for a dual-level programming model, namely, Message Passing Interface (MPI) and Open Multi-Processing (OpenMP), integrated with the C++ programming language. After a comprehensive study of temporal logic types, we found and proved that linear temporal logic is well suited as the foundation for our tool. Notably, while the tool is currently in development, our approach is poised to effectively address the highlighted examples of runtime errors. This paper thoroughly explores various types and operators of temporal logic to inform the design of the testing tool based on temporal properties, aiming for a robust and reliable system. Full article
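
To make the linear-temporal-logic foundation concrete, the C++ sketch below checks two common temporal patterns over a finite, recorded event trace: G(!deadlock), meaning a deadlock flag never occurs, and G(send -> F recv), meaning every send is eventually followed by a receive. The trace format and event names are hypothetical; they only illustrate the finite-trace interpretation that an instrumentation-based tool works with.

```cpp
#include <cstdio>
#include <string>
#include <vector>

using Trace = std::vector<std::string>;

// G(!p): the event p never occurs anywhere in the finite trace.
bool globally_not(const Trace& t, const std::string& p) {
    for (const auto& e : t)
        if (e == p) return false;
    return true;
}

// G(p -> F q): every occurrence of p is followed, later in the trace,
// by an occurrence of q before the trace ends.
bool response(const Trace& t, const std::string& p, const std::string& q) {
    for (std::size_t i = 0; i < t.size(); ++i) {
        if (t[i] != p) continue;
        bool answered = false;
        for (std::size_t j = i + 1; j < t.size(); ++j)
            if (t[j] == q) { answered = true; break; }
        if (!answered) return false;
    }
    return true;
}

int main() {
    Trace trace = {"mpi_send", "omp_fork", "omp_join", "mpi_recv"};
    std::printf("G !deadlock       : %s\n", globally_not(trace, "deadlock") ? "holds" : "violated");
    std::printf("G(send -> F recv) : %s\n", response(trace, "mpi_send", "mpi_recv") ? "holds" : "violated");
}
```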

17 pages, 539 KB  
Article
First Steps towards Efficient Genome Assembly on ARM-Based HPC
by Kristijan Poje, Mario Brcic, Josip Knezovic and Mario Kovac
Electronics 2024, 13(1), 39; https://doi.org/10.3390/electronics13010039 - 20 Dec 2023
Cited by 1 | Viewed by 1841
Abstract
Exponential advances in computational power have fueled advances in many disciplines, and biology is no exception. High-Performance Computing (HPC) is gaining traction as one of the essential tools in scientific research. Further advances to exascale capabilities will necessitate more energy-efficient hardware. In this article, we present our efforts to improve the efficiency of genome assembly on ARM-based HPC systems. We use vectorization to optimize the popular genome assembly pipeline of minimap2, miniasm, and Racon. We compare different implementations using the Scalable Vector Extension (SVE) instruction set architecture and evaluate their performance in different aspects. Additionally, we compare the performance of autovectorization to hand-tuned code with intrinsics. Lastly, we present the design of a CPU dispatcher included in the Racon consensus module that enables the automatic selection of the fastest instruction set supported by the utilized CPU. Our findings provide a promising direction for further optimization of genome assembly on ARM-based HPC systems. Full article
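
The CPU-dispatcher idea mentioned above can be sketched as follows: at startup the program queries the hardware capability bits and installs a function pointer to the best implementation the processor supports. The C++ sketch below is a hedged illustration, not Racon's actual dispatcher; it probes for SVE through the Linux getauxval(AT_HWCAP) interface, compiles the probe only on AArch64 Linux, and uses a trivial placeholder kernel.

```cpp
#include <cstdio>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

// Two interchangeable implementations of the same kernel (illustrative only).
long sum_scalar(const int* x, long n) {
    long s = 0;
    for (long i = 0; i < n; ++i) s += x[i];
    return s;
}

// Stand-in for a hand-tuned SVE (or otherwise vectorized) version of the kernel.
long sum_vectorized(const int* x, long n) {
    return sum_scalar(x, n);  // placeholder body
}

using Kernel = long (*)(const int*, long);

// Pick the fastest implementation the running CPU supports.
Kernel select_kernel() {
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
    if (getauxval(AT_HWCAP) & HWCAP_SVE)
        return sum_vectorized;   // SVE available at run time
#endif
    return sum_scalar;           // portable fallback
}

int main() {
    int data[] = {1, 2, 3, 4};
    Kernel k = select_kernel();  // dispatch decided once, at startup
    std::printf("sum = %ld\n", k(data, 4));
}
```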

16 pages, 920 KB  
Article
An Architecture for a Tri-Programming Model-Based Parallel Hybrid Testing Tool
by Saeed Musaad Altalhi, Fathy Elbouraey Eassa, Abdullah Saad Al-Malaise Al-Ghamdi, Sanaa Abdullah Sharaf, Ahmed Mohammed Alghamdi, Khalid Ali Almarhabi and Maher Ali Khemakhem
Appl. Sci. 2023, 13(21), 11960; https://doi.org/10.3390/app132111960 - 1 Nov 2023
Cited by 5 | Viewed by 2315
Abstract
As high-performance computing (HPC) continues to develop, exascale computing is on the horizon. Therefore, it is imperative to develop parallel systems, such as graphics processing units (GPUs) and programming models, that can effectively utilise the powerful processing resources of exascale computing. A tri-level programming model comprising message passing interface (MPI), compute unified device architecture (CUDA), and open multi-processing (OpenMP) models may significantly enhance the parallelism, performance, productivity, and programmability of the heterogeneous architecture. However, the use of multiple programming models often leads to unexpected errors and behaviours during run-time. It is also difficult to detect such errors in high-level parallel programming languages. Therefore, this study proposes a parallel hybrid testing tool that employs both static and dynamic testing techniques to address this issue. The proposed tool was designed to identify the run-time errors of C++ and MPI + OpenMP + CUDA systems by analysing the source code during run-time, thereby optimising the testing process and ensuring comprehensive error detection. The proposed tool was able to identify and categorise the run-time errors of tri-level programming models. As contemporary parallel testing tools cannot, at present, be used to test software applications built with tri-level MPI + OpenMP + CUDA programming models, this study proposes the architecture of a parallel testing tool specifically designed to detect run-time errors in such models. Full article

15 pages, 767 KB  
Review
Advances in Computational Approaches for Estimating Passive Permeability in Drug Discovery
by Austen Bernardi, W. F. Drew Bennett, Stewart He, Derek Jones, Dan Kirshner, Brian J. Bennion and Timothy S. Carpenter
Membranes 2023, 13(11), 851; https://doi.org/10.3390/membranes13110851 - 25 Oct 2023
Cited by 5 | Viewed by 4806
Abstract
Passive permeation of cellular membranes is a key feature of many therapeutics. The relevance of passive permeability spans all biological systems as they all employ biomembranes for compartmentalization. A variety of computational techniques are currently utilized and under active development to facilitate the characterization of passive permeability. These methods include lipophilicity relations, molecular dynamics simulations, and machine learning, which vary in accuracy, complexity, and computational cost. This review briefly introduces the underlying theories, such as the prominent inhomogeneous solubility diffusion model, and covers a number of recent applications. Various machine-learning applications, which have demonstrated good potential for high-volume, data-driven permeability predictions, are also discussed. Due to the confluence of novel computational methods and next-generation exascale computers, we anticipate an exciting future for computationally driven permeability predictions. Full article
(This article belongs to the Special Issue Modern Studies on Drug-Membrane Interactions 2.0)
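
For reference, the inhomogeneous solubility-diffusion model named above expresses the membrane permeability coefficient P_m as the inverse of a resistance integrated along the membrane normal z, combining the position-dependent free-energy profile ΔG(z) and local diffusivity D(z). This is the standard form of the model, not a result specific to this review:

```latex
\[
\frac{1}{P_m} \;=\; \int_{z_1}^{z_2} \frac{\exp\!\left(\Delta G(z)/k_B T\right)}{D(z)}\,\mathrm{d}z ,
\]
```

where z_1 and z_2 bound the membrane region, k_B is Boltzmann's constant, and T is the temperature; molecular dynamics simulations of the kind reviewed here supply ΔG(z) and D(z).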

44 pages, 1352 KB  
Review
Data Locality in High Performance Computing, Big Data, and Converged Systems: An Analysis of the Cutting Edge and a Future System Architecture
by Sardar Usman, Rashid Mehmood, Iyad Katib and Aiiad Albeshri
Electronics 2023, 12(1), 53; https://doi.org/10.3390/electronics12010053 - 23 Dec 2022
Cited by 19 | Viewed by 8756
Abstract
Big data has revolutionized science and technology, leading to the transformation of our societies. High-performance computing (HPC) provides the necessary computational power for big data analysis using artificial intelligence and other methods. Traditionally, HPC and big data had focused on different problem domains and had grown into two different ecosystems. Efforts have been underway for the last few years on bringing the best of both paradigms into HPC and big data converged architectures. Designing HPC and big data converged systems is a hard task requiring careful placement of data, analytics, and other computational tasks such that the desired performance is achieved with the least amount of resources. Energy efficiency has become the biggest hurdle in the realization of HPC, big data, and converged systems capable of delivering exascale and beyond performance. Data locality is a key parameter of HPDA system design as moving even a byte costs heavily both in time and energy with an increase in the size of the system. Performance in terms of time and energy is the most important factor for users; energy in particular is the major hurdle in high-performance system design, given the increasing focus on green energy systems due to environmental sustainability. Data locality is a broad term that encapsulates different aspects including bringing computations to data, minimizing data movement by efficient exploitation of cache hierarchies, reducing intra- and inter-node communications, locality-aware process and thread mapping, and in situ and in transit data analysis. This paper provides an extensive review of cutting-edge research on data locality in HPC, big data, and converged systems. We review the literature on data locality in HPC, big data, and converged environments and discuss challenges, opportunities, and future directions. Subsequently, using the knowledge gained from this extensive review, we propose a system architecture for future HPC and big data converged systems. To the best of our knowledge, there is no such review on data locality in converged HPC and big data systems. Full article
(This article belongs to the Special Issue Defining, Engineering, and Governing Green Artificial Intelligence)
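
One of the data-locality techniques listed above, efficient exploitation of cache hierarchies, is easiest to see in loop tiling. The C++ sketch below shows a generic blocked matrix multiplication; the block size and the kernel itself are illustrative and not drawn from the reviewed systems.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Blocked (tiled) matrix multiply C += A * B for n x n row-major matrices.
// Working on blk-by-blk tiles keeps the active sub-blocks resident in cache,
// reducing traffic to main memory compared with the naive triple loop.
void matmul_blocked(const std::vector<double>& A, const std::vector<double>& B,
                    std::vector<double>& C, std::size_t n, std::size_t blk = 64) {
    for (std::size_t ii = 0; ii < n; ii += blk)
        for (std::size_t kk = 0; kk < n; kk += blk)
            for (std::size_t jj = 0; jj < n; jj += blk)
                for (std::size_t i = ii; i < std::min(ii + blk, n); ++i)
                    for (std::size_t k = kk; k < std::min(kk + blk, n); ++k) {
                        double a = A[i * n + k];
                        for (std::size_t j = jj; j < std::min(jj + blk, n); ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```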

11 pages, 1335 KB  
Review
Towards Ab-Initio Simulations of Crystalline Defects at the Exascale Using Spectral Quadrature Density Functional Theory
by Swarnava Ghosh
Appl. Mech. 2022, 3(3), 1080-1090; https://doi.org/10.3390/applmech3030061 - 24 Aug 2022
Cited by 1 | Viewed by 2080
Abstract
Defects in crystalline solids play a crucial role in determining properties of materials at the nano-, meso-, and macroscales, such as the coalescence of vacancies at the nanoscale to form voids and prismatic dislocation loops, the diffusion and segregation of solutes to nucleate precipitates, and phase transitions in magnetic materials via disorder and doping. First principles Density Functional Theory (DFT) simulations can provide a detailed understanding of these phenomena. However, the number of atoms needed to correctly simulate these systems is often beyond the reach of many widely used DFT codes. The aim of this article is to discuss recent advances in first principles modeling of crystal defects using the spectral quadrature method. The spectral quadrature method is linear scaling with respect to the number of atoms, permits spatial coarse-graining, and is capable of simulating non-periodic systems embedded in a bulk environment, which allows the application of appropriate boundary conditions for simulations of crystalline defects. In this article, we discuss the state-of-the-art in ab-initio modeling of large metallic systems of the order of several thousand atoms that are suitable for utilizing exascale computing resources. Full article
(This article belongs to the Special Issue Applied Thermodynamics: Modern Developments)
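
The central object that linear-scaling electronic-structure methods such as Spectral Quadrature approximate is the finite-temperature density matrix, written as the Fermi-Dirac function of the Hamiltonian. The relation below is the standard finite-temperature DFT form, not the paper's specific quadrature formulas; Spectral Quadrature evaluates the entries of this operator function via quadrature rules while exploiting locality, which is what yields a cost linear in the number of atoms.

```latex
\[
\rho \;=\; f\!\left(\frac{H - \mu I}{k_B T}\right),
\qquad
f(x) \;=\; \frac{1}{1 + e^{x}} ,
\]
```

where H is the Hamiltonian, μ the chemical potential, T the electronic temperature, and I the identity.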

12 pages, 1704 KB  
Review
High-Performance Computing in Meteorology under a Context of an Era of Graphical Processing Units
by Tosiyuki Nakaegawa
Computers 2022, 11(7), 114; https://doi.org/10.3390/computers11070114 - 13 Jul 2022
Cited by 5 | Viewed by 6056
Abstract
This short review shows how innovative processing units, including graphical processing units (GPUs), are used in high-performance computing (HPC) in meteorology, introduces current scientific studies relevant to HPC, and discusses the latest topics in meteorology accelerated by HPC computers. The current status surrounding HPC is distinctly complicated in both hardware and software terms, and it changes rapidly, like a fast cascade. For beginners, it is difficult to understand and follow this status; they must overcome the obstacle of catching up on HPC information and connecting it to their studies. HPC systems have accelerated weather forecasts with physics-based models since Richardson’s dream in 1922. Meteorological scientists and model developers have written the codes of the models by making the most of the latest HPC technologies available at the time. Several of the leading HPC systems used for weather forecast models are introduced. Each institute chose an HPC system from many possible alternatives to best match its purposes. Six selected topics in high-performance computing in meteorology are also reviewed: floating points; spectral transform in global weather models; heterogeneous computing; exascale computing; co-design; and data-driven weather forecasts. Full article

32 pages, 822 KB  
Article
A Survey on Malleability Solutions for High-Performance Distributed Computing
by Jose I. Aliaga, Maribel Castillo, Sergio Iserte, Iker Martín-Álvarez and Rafael Mayo
Appl. Sci. 2022, 12(10), 5231; https://doi.org/10.3390/app12105231 - 22 May 2022
Cited by 21 | Viewed by 3741
Abstract
Maintaining a high rate of productivity, in terms of completed jobs per unit of time, in High-Performance Computing (HPC) facilities is a cornerstone in the next generation of exascale supercomputers. Process malleability is presented as a straightforward mechanism to address that issue. Nowadays, the vast majority of HPC facilities are intended for distributed-memory applications based on the Message Passing (MP) paradigm. For this reason, many efforts are based on the Message Passing Interface (MPI), the de facto standard programming model. Malleability aims to rescale executions on the fly, in other words, to reconfigure the number and layout of processes in running applications. Process malleability involves resource reallocation within the HPC system, handling processes of the application, and redistributing data among those processes to resume the execution. This manuscript compiles how different frameworks address process malleability, their main features, their integration in resource management systems, and how they may be used in user codes. This paper is a detailed state-of-the-art review devised as an entry point for researchers who are interested in process malleability. Full article
(This article belongs to the Special Issue State-of-the-Art High-Performance Computing and Networking)
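
One standard MPI building block that malleability frameworks can rely on is dynamic process creation with MPI_Comm_spawn, which launches additional ranks at run time and returns an intercommunicator to them. The C++ sketch below shows only that primitive, with a placeholder worker executable path; the frameworks surveyed in the paper differ in how they combine such mechanisms with the resource manager and with data redistribution.

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Dynamically create two additional ranks running a separate worker
    // executable. "./worker" is a placeholder path used only for illustration.
    char command[] = "./worker";
    MPI_Comm workers = MPI_COMM_NULL;   // intercommunicator to the spawned ranks
    MPI_Comm_spawn(command, MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                   0 /* root */, MPI_COMM_WORLD, &workers, MPI_ERRCODES_IGNORE);

    if (rank == 0)
        std::printf("spawned 2 additional worker ranks\n");

    // A malleable application would now redistribute its data across the
    // enlarged set of processes before resuming the computation.
    MPI_Comm_disconnect(&workers);
    MPI_Finalize();
    return 0;
}
```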

23 pages, 3662 KB  
Article
A Survey of High-Performance Interconnection Networks in High-Performance Computer Systems
by Ping-Jing Lu, Ming-Che Lai and Jun-Sheng Chang
Electronics 2022, 11(9), 1369; https://doi.org/10.3390/electronics11091369 - 25 Apr 2022
Cited by 20 | Viewed by 15807
Abstract
The high-performance interconnection network is the key to realizing high-speed, collaborative, parallel computing among the nodes of a high-performance computer system. Its performance and scalability directly affect the performance and scalability of the whole system. With continuous improvements in the performance of high-performance computer systems, the trend in the development of high-performance interconnection networks is mainly reflected in network sizes and network bandwidths. With the slowdown of Moore’s Law, it is necessary to adopt new packaging design technologies to implement high-performance interconnection networks for high-performance computing. This article analyzes the main interconnection networks used by high-performance computer systems in the Top500 list of November 2021, and it elaborates on the design of representative, state-of-the-art, high-performance interconnection networks, including NVIDIA InfiniBand, Intel Omni-Path, Cray Slingshot/Aries, and custom or proprietary networks, including Fugaku Tofu, Bull BXI, TH Express, and so forth. This article also comprehensively discusses the latest technologies and trends in this field. In addition, based on the analysis of the challenges faced by high-performance interconnection network design in the post-Moore era and the exascale computing era, this article presents a perspective on high-performance interconnection networks. Full article
(This article belongs to the Special Issue New Trends for High-Performance Computing)
