Article

Real-Time Interactive Parallel Visualization of Large-Scale Flow-Field Data

1 School of Computer Science and Engineering, Sichuan University of Science and Engineering, Yibin 644002, China
2 China Aerodynamics Research and Development Center, Mianyang 621050, China
3 Sichuan Provincial Engineering Laboratory of Big Data Visual Analysis, Yibin 644002, China
4 School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644002, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(16), 9092; https://doi.org/10.3390/app13169092
Submission received: 13 July 2023 / Revised: 24 July 2023 / Accepted: 2 August 2023 / Published: 9 August 2023
(This article belongs to the Topic Computational Fluid Dynamics (CFD) and Its Applications)

Abstract
With the increasing demand for high precision in computational fluid dynamics (CFD) numerical simulations, the use of large-scale grids for discretized solutions has become a trend, resulting in explosive growth in the size of flow-field data. To address the challenges that large-scale flow-field data pose for real-time interactive visualization, this paper proposes novel strategies for data partitioning and communication management. Firstly, we propose a data-partitioning strategy based on grid segmentation. This approach constructs metadata to create file viewports for each process and performs grid partitioning. It then reconstructs sub-grids within each process and uses a coordinate-mapping algorithm to map global coordinates to local process coordinates, enabling access to attribute variables through a lookup table. Secondly, we introduce a real-time interactive method for large-scale flow fields. This method leverages a cluster architecture with high-speed interconnection among compute nodes and low-speed interconnection between service nodes and rendering nodes, and it coordinates parallel rendering and synchronized rendering. Experimental results on typical flow-field data show that the proposed data-partitioning strategy speeds up the loading of million-cell-scale data sevenfold, 1.5 times faster than ParaView, while achieving system load balance. Real-time interaction experiments with datasets containing 500 million and 800 million grid cells exhibit millisecond-level latencies, demonstrating that the proposed communication management method meets the real-time interactive visualization demands of large-scale flow-field data.

1. Introduction

Computational fluid dynamics (CFD) numerical simulations often involve applications such as turbulence, vortex motion and separation characteristics, boundary layer transition, shock/boundary layer interactions, unsteady flows, and control, exhibiting multi-region, multi-medium, and multi-scale characteristics [1,2,3,4,5]. The high-dimensional, time-varying, and diverse nature of flow-field data (including unstructured grids, structured grids, and hybrid grids) requires handling complex data types in flow visualization. Additionally, the complexity of unsteady characteristics and physical models, as well as the high resolution of the flow-field grids (with grid numbers ranging from hundreds of millions to billions), leads to massive visualization data size (ranging from GB to PB) [6]. Consequently, large-scale flow-field data visualization poses significant challenges in terms of memory capacity, data I/O, graphic computation, geometric rendering, and network communication management. Traditional serial post-processing software can no longer meet the requirements for the real-time interactive control of large-scale flow-field data, necessitating the adoption of parallel visualization techniques. The process of parallel visualization for large-scale flow-field data involves data preprocessing, data organization and partitioning, parallel feature calculation, parallel rendering, and image synthesis stages [7]. In the data organization and partitioning stage, effective management of large-scale flow-field data to achieve efficient access is a key and challenging aspect of parallel visualization. Furthermore, for the entire parallel visualization process, driving the workflow from a workstation to utilize supercomputers for interactive visualization control while ensuring real-time processing of large-scale complex flow fields presents a major challenge for the parallel visualization of large-scale flow fields.
During the preprocessing and computation stages of computational fluid dynamics (CFD), different computational domains may utilize various software tools for modeling and numerical solving, resulting in the generation of diverse data formats in the CFD-solving process [8]. These formats include EnSight Gold, CGNS, VTK, HDF5, PLOT3D, and Tecplot, among others. The EnSight Gold format has demonstrated advantages in visualization and scalability, and it has found widespread applications in industrial and scientific domains [9]. However, the majority of current research still employs a serial reading approach to handle EnSight Gold data files. This serial reading method is highly inefficient, leading to difficulties in real-time interaction and frequently causing program crashes due to exceeding memory capacity, rendering the file unreadable. Large-scale parallel visualization data preprocessing requires support for these rich data formats. However, the varying organizational structures and approaches among different file types present significant challenges in establishing a unified processing methodology. As far as we know, the open-source visualization software ParaView has proposed a data-partitioning strategy specifically targeting the EnSight Gold format [10]. However, experimental evidence suggests that this strategy merely partitions the data across processes or nodes, resulting in limited improvements in data access performance [11]. For researchers performing flow-field visualization analysis using the EnSight Gold format, an efficient data-partitioning strategy can help overcome the memory limitations of serial I/O and significantly enhance the speed of reading large-scale flow-field data [12]. Such a data-partitioning strategy is of great significance to the engineering practice of large-scale flow visualization.
When dealing with large-scale visualization data in computational fluid dynamics, the limitations of a single machine in memory capacity and rendering efficiency make it unable to meet the demands of real-time interactive control for the parallel visualization of large-scale flow fields [13]. Therefore, visualization feature computation and graphics rendering must rely on supercomputers [14]. From the perspective of supercomputer architecture, the visualization terminal is connected to the rendering nodes via Ethernet (with a bandwidth of hundreds of megabits or gigabits per second), while the rendering nodes are interconnected via high-speed fabrics such as InfiniBand (with a bandwidth of tens of gigabits per second), a bandwidth difference of several hundred to thousands of times. Consequently, the multi-level asymmetric communication between the visualization terminal and the rendering nodes becomes one of the bottlenecks in achieving real-time interactive control for large-scale flow-field visualization. The key to driving visualization computations on a supercomputer from a visualization terminal while ensuring real-time interaction control lies in coordinating and managing this multi-level asymmetric communication.
In summary, our research in this series makes the following contributions. Firstly, we propose a data-partitioning strategy based on grid segmentation by constructing spatial model metadata and reconstructing grid topology through the grid and coordinate-mapping models. This strategy effectively divides the spatial model into sub-datasets and assigns them to different computational nodes, enabling efficient access to large-scale flow-field data. Secondly, we design a real-time interactive method for large-scale flow fields that adequately coordinates communication resources between different cluster systems and between cluster systems and visualization terminals. This method enables real-time visual interaction control for parallel visualization of large-scale flow fields. Lastly, we validate the proposed data-partitioning strategy using typical flow-field data, demonstrating a sevenfold increase in data access speed. The effectiveness of this data-partitioning strategy is further supported by vivid visualizations. Through interactive control experiments with datasets containing 500 million and 800 million grids, we verify that our proposed communication management method achieves real-time visual interaction control for large-scale flow-field data.

2. Related Works

Our research focuses on efficient data access and real-time interactive control in large-scale parallel visualization of flow fields. In this section, we review the relevant work in these areas.
Over the past two decades, researchers have extensively studied out-of-core techniques and parallel methods to meet the demands of real-time interactive visualization of large-scale flow fields. Many of these methods aim to accelerate I/O transfers between memory and disk to achieve efficient access to large-scale data. In 1997, Ueng et al. proposed an interactive streamline construction method based on unstructured tetrahedral grids. The algorithm uses an octree to partition the original data and reconstructs it into subsets that are stored in disk files for fast data retrieval [15]. This out-of-core algorithm significantly reduces the overhead of reading data from disk and achieves good memory performance. However, its algorithm has certain limitations and only supports tetrahedral grid types, making it unable to adapt to more than a dozen other unstructured grid types.
In 2011, Vishwanath et al. proposed the GLEAN framework, which considers the characteristics of applications, analysis, and systems to facilitate simulation time data analysis and I/O acceleration while providing flexible interfaces to meet data analysis needs at the fastest speed [16]. GLEAN leverages the data semantics of applications and fully utilizes different system topologies and characteristics to accelerate I/O. However, this framework is based on two I/O parallel frameworks: PnetCDF and HDF5, thus only supporting netCDF and HDF5 data formats.
In 2015, Hu Shujian et al. addressed the data I/O problem in the parallel computing of geographic grids by constructing a mapping model from logical structure to physical storage structure of grid data and proposing a parallel read/write framework for grid data [17]. This framework can accurately read data with higher performance but is only applicable to the parallel computing of geographic grids and cannot match the I/O processes in the field of scientific visualization.
In 2020, Zhang Xiaoyang et al. proposed a parallel reading method based on fragmenting the MBR, in which the same data block, distributed and stored on different nodes, is divided into two fragments. When reading data from different nodes, only half of the original data needs to be read, reducing the amount of data read and improving the read performance of the storage system [18]. However, this parallel reading method is only applicable to distributed storage files and cannot be used for the parallel reading of centrally stored files.
In 2021, Wang Nianhua et al. proposed a parallel reading method for grid data files with billions of grids and hundreds of GBs in scale. The method stores ultra-large-scale grid data files generated by multiple objects using file grouping, where each group contains multiple files, each containing multiple data partitions. During reading, multiple file processes are used to read files and send them to the corresponding non-file processes to achieve a balanced data load distribution [19]. This scheme uses files as the basic unit for parallel reading. Although EnSight Gold format files also store multiple time-step files, each time step’s data need to be read separately according to different timings and then output as results that can be processed by visualization tools. Therefore, multiple files cannot be directly read in parallel.
With the increasing demands for precision and resolution in numerical simulations, the scale of flow-field visualization data has also grown significantly. As a result, an increasing amount of research is focusing on how to achieve real-time interactive control of large-scale flow-field data. Zhao Qinping et al. proposed a parallel rendering method based on the sort-last architecture using a cluster system with multiple rendering nodes and one fusion node. Each rendering node renders different scenes and the fusion node combines the pixel depth maps and scene images rendered by the rendering nodes into a final image [20]. This research solves the problem of parallel rendering between rendering nodes but it does not address the issue of returning the final image to the user terminal.
Peng Minfeng et al. proposed an architecture called a highly parallel multi-task parallel rendering system suitable for processing large-scale complex scenes, capable of executing multiple rendering tasks simultaneously and supporting multi-screen display [21]. This architecture implements the process of returning rendered images to user terminals, but the returned images are result images rendered separately by each process without compositing the images rendered by each process. The complete rendering result is physically spliced and displayed to users through multiple display devices.
Peng Shixiong et al. designed a data compression algorithm for a parallel rendering system based on InfiniBand. The proposed parallel rendering system consists of server nodes, display nodes, and rendering nodes controlled by service nodes, which return images rendered by rendering nodes to display nodes [22]. This research solves the problem of compositing parallel rendered images and returning them to display nodes, but its display nodes request scene images to be displayed periodically under server control without achieving real-time interaction with scenes.
Despite extensive discussions on real-time interactive methods for the large-scale parallel visualization of flow fields, there is still a need for further research to address the management challenges of efficient data access and network communication for such datasets. In this regard, this paper proposes a data-partitioning strategy based on grid segmentation and designs job management, data communication, and human–machine real-time interactive coordination methods. These steps gradually achieve the real-time interactive exploration of large-scale flow-field data.

3. Data-Partitioning Strategy Based on Grid Segmentation

With the continuous refinement of grid models in CFD applications, the resolution of numerical simulations has been increasing, leading to a rapid growth in the volume of flow-field data from terabytes (TB) to petabytes (PB) or even higher orders of magnitude. Large-scale flow-field data pose a series of challenges for visualization processing, such as storage and I/O issues. The most commonly used approach to address the data access and computational challenges in visualizing large-scale flow-field data is leveraging high-performance computing techniques, including parallel I/O hardware and software, as well as parallel visualization algorithms. This involves partitioning the massive computational tasks and evenly distributing them across numerous computing nodes of supercomputers.
Data organization and partitioning play a crucial role in the parallel visualization process of large-scale flow fields. They serve as essential preprocessing techniques aimed at effectively managing and organizing massive data, enabling efficient access to large-scale datasets. Commonly employed methods for data organization and partitioning include multiresolution-based approaches and block-based approaches. Multiresolution-based methods, albeit effective in reducing the overall data volume, may introduce a trade-off by compromising data quality. These methods employ approximate compression techniques that lower data precision or discard fine-grained details, potentially leading to a loss of data fidelity. On the other hand, block-based approaches may encounter challenges related to imbalanced data partitioning. Significant variations in data volume across different grid blocks can result in uneven distribution, thereby impacting load balance during computation and affecting overall performance.
Both of the aforementioned approaches can impact the effectiveness and quality of parallel visualization. Therefore, this paper proposes a data-partitioning strategy based on grid segmentation. This involves preprocessing flow-field data files to construct metadata (for the sake of convenience, metaInfo is used to refer to this series of data structures when metadata is mentioned later), constructing grid and coordinate-mapping models, reorganizing grid topology, and dividing geometric models by grid type. By parsing the coordinate-mapping model and grid-mapping model and reading flow-field attribute data in parallel, efficient data loading is achieved. In addition, we implement a static load-balancing configuration to address the problem of load imbalance that can occur during grid segmentation. Figure 1 describes the parallel reading process.

3.1. Preprocessing of File Data

Prior to file preprocessing, an in-depth analysis of the file is necessary to understand its contents and identify potential issues. Careful examination of the file allows key information to be extracted, such as the locations of keywords, the number of grid points, and the quantity of each grid type; this information serves as the foundation for constructing the file metadata. After constructing the metadata, we can use it to segment the grids for improved load balancing and performance. Additionally, we calculate the start and end positions of each segmented grid within the file and store this information in the metadata, completing the file preprocessing. This approach is efficient and reliable, enabling us to handle complex file-processing requirements while improving both work efficiency and quality.

3.1.1. The EnSight Gold Data Format

The EnSight Gold format is used to store scientific and engineering visualization data. It can be used directly for 3D visualization and supports various effects such as stereoscopic display and projection. Additionally, the EnSight Gold format can directly describe vector data and is scalable, supporting multiple software platforms. Due to its significant advantages in visualization and scalability, the EnSight Gold format has been widely used in industrial and scientific fields [5]. The format consists of multiple files, with different types of data files stored in files with different suffixes. For example, geometric model files use the .geo suffix, whereas model-related attribute data are stored in files with user-defined suffixes. The EnSight Gold format is commonly used to store unsteady flow-field data and uses .case files to maintain the file-association relationships between various geometry models and attributes, as well as the data organization relationships of different time steps. Figure 2 shows the structure of an EnSight Gold file.
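For illustration, a minimal .case file for a transient dataset might look like the following sketch. The file names and variable names are invented, but the section keywords (FORMAT, GEOMETRY, VARIABLE, TIME) follow the EnSight Gold convention:

```
FORMAT
type:                   ensight gold

GEOMETRY
model:                  flow.geo

VARIABLE
scalar per node:        pressure flow.pres
vector per node:        velocity flow.vel

TIME
time set:               1
number of steps:        3
filename start number:  0
filename increment:     1
time values:            0.0 0.1 0.2
```

Here flow.geo stores the geometric model, the attribute files use the user-defined suffixes .pres and .vel, and the TIME section associates the attribute files with the three time steps.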

3.1.2. Pre-Scanning and Metadata Construction

The storage of grids and grid-point coordinates in EnSight Gold data files differs significantly from other data types. Each part (representing a component of the geometric model) uses a separate set of global point coordinates, including the XYZ three-dimensional coordinate values. The different grid types within a part share the IDs of this coordinate set, and in EnSight Gold, the IDs of points and grids start from 0 by default and increase in storage order. Since the number of parts is not positively correlated with file size (a single part may occupy 3 GB of storage), a part cannot simply be assigned to each process; the number of grids in a part, however, is positively correlated with file size. Therefore, this paper's strategy is to divide the grids in a part into sub-grids and assign them to different processes. To support this, we design a data structure called PartsInfoType to store key information about each part of the geometric model file. Before reading the geometric model file to build the grid, we first scan the geo file and store each part's basic information in PartsInfoType as metadata.
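The paper does not spell out the fields of PartsInfoType, but a minimal sketch of such per-part metadata might look like the following Python fragment. All field names and byte offsets here are hypothetical, chosen only to illustrate what a pre-scan could record:

```python
from dataclasses import dataclass, field

@dataclass
class PartInfo:
    # Hypothetical fields; the paper's PartsInfoType is not spelled out.
    part_id: int            # part number as stored in the .geo file
    num_points: int         # size of the part's global coordinate array
    coord_start_byte: int   # file offset of the first coordinate value
    # element type name -> (cell count, file offset of the first cell's topology)
    elements: dict = field(default_factory=dict)

# PartsInfoType: one entry per part, filled in during the pre-scan.
parts_info: list[PartInfo] = []
parts_info.append(PartInfo(part_id=1, num_points=8,
                           coord_start_byte=244,
                           elements={"hexa8": (1, 436)}))
```

During grid segmentation, the per-element cell counts and offsets recorded here are what allow each process to locate its own byte range without rescanning the file.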

3.1.3. Grid Segmentation

Each part typically comprises multiple types of elements, drawn from 15 topologies such as quadrilaterals and pyramids. The number and topology of each element type are explicitly represented, while the unique identifier of each grid cell (Cell) within each type is implicitly expressed. During the construction of metadata for the geometric model file, we further divide the grid according to the number of processes. To facilitate this, we design a data structure called CellTypes to represent the grid division. This data structure accurately describes the number and topology of the different element types, enabling more efficient management and processing of geometric model files. For instance, consider a part with $N$ different element types, where the $i$-th type has $c_i$ cells, so the part contains $C = \sum_{i=1}^{N} c_i$ cells in total. We launch $P$ processes and assign these cells to them: first, the cells are distributed evenly among the $P$ processes, and the remaining $r = C \bmod P$ cells are then assigned sequentially to processes $0$ to $r-1$ to achieve a static load-balanced configuration. Let vBlock be a vector of length $P$ whose $p$-th element is the number of cells assigned to process $p$. Formula (1) gives the value of each element of vBlock under this division.
$$\mathrm{vBlock}_p = \begin{cases} \lfloor C/P \rfloor, & p \ge r \\ \lfloor C/P \rfloor + 1, & p < r \end{cases} \tag{1}$$
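The static distribution of Formula (1) can be sketched as follows (the function name is ours):

```python
def make_vblock(C: int, P: int) -> list[int]:
    """Static load balancing: distribute C cells over P processes.
    The first r = C mod P processes each receive one extra cell (Formula (1))."""
    base, r = divmod(C, P)
    return [base + 1 if p < r else base for p in range(P)]

print(make_vblock(10, 4))  # -> [3, 3, 2, 2]
```

Because the per-process counts differ by at most one cell, no process ever holds more than one cell beyond its fair share, which is what makes the configuration statically load balanced.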
Binary geometric model files consist of a sequence of bytecodes, with each integer stored as an int. The computed mesh-partition counts cannot be used by the file pointer directly; additional calculations are required to determine the starting and ending byte positions of each partitioned sub-mesh within the file. According to the EnSight Gold data format, each part usually comprises multiple element types, each consisting of tens of thousands to tens of millions of cells, and our mesh-partitioning strategy has each process handle a portion of every element type's cells. Therefore, to obtain the bytecode sequence a process must handle after partitioning, we compute the starting and ending positions of the cells under each element type separately. First, we use the vector vBlock to compute the total number of cells allocated to processes $0$ to $i-1$, as shown in Expression (2), where $\mathrm{vBlock}[s]$ denotes a vector component.
$$\mathrm{preBlock}_i = \sum_{s=0}^{i-1} \mathrm{vBlock}[s] \tag{2}$$
Subsequently, we use the value of $\mathrm{vBlock}[i]$ to determine the number of cells ($\mathrm{curBlock}_i$) allocated to process $i$. We use pointers $\mathrm{realStartPos}_{ijk}$ and $\mathrm{realEndPos}_{ijk}$ to denote the starting and ending byte positions of the cells under element $k$ in part $j$ that process $i$ must handle. The starting position of a sub-mesh is computed as in Expression (3), and the ending position as in Expression (4). In these expressions, $\mathrm{cellStartPos}_{jk}$ denotes the position of the first byte of the first cell under element $k$ in part $j$, numPointsOfCell denotes the number of coordinate points comprising this element type (for instance, a triangle comprises 3 points and a tetrahedron 4), sizeof(INT) denotes the number of bytes occupied by an int in the current runtime environment, and $\mathrm{curBlock}_i$ denotes the number of cells allocated to the current process.
$$\mathrm{realStartPos}_{ijk} = \mathrm{cellStartPos}_{jk} + \mathrm{preBlock}_i \times \mathrm{sizeof(INT)} \times \mathrm{numPointsOfCell} \tag{3}$$
$$\mathrm{realEndPos}_{ijk} = \mathrm{realStartPos}_{ijk} + \mathrm{curBlock}_i \times \mathrm{sizeof(INT)} \times \mathrm{numPointsOfCell} - 1 \tag{4}$$
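Expressions (3) and (4) can be sketched as a small helper. The parameter names follow the text; the 4-byte int size is an assumption about the runtime environment:

```python
SIZEOF_INT = 4  # assumed bytes per int in the runtime environment

def sub_mesh_byte_range(cell_start_pos: int, pre_block: int,
                        cur_block: int, num_points_of_cell: int) -> tuple:
    """Byte range of the cells one process owns (Expressions (3) and (4)).
    cell_start_pos     : offset of the first byte of this element's first cell
    pre_block          : total cells assigned to processes 0..i-1 (Expression (2))
    cur_block          : cells assigned to process i
    num_points_of_cell : points per cell for this element type (e.g. 4 for quads)
    """
    start = cell_start_pos + pre_block * SIZEOF_INT * num_points_of_cell
    end = start + cur_block * SIZEOF_INT * num_points_of_cell - 1
    return start, end

# e.g. a quad element (4 points per cell): skip 100 cells, then read 100 cells
print(sub_mesh_byte_range(0, 100, 100, 4))  # -> (1600, 3199)
```

Each process can thus seek directly to its own start byte and read a contiguous block, instead of walking the file byte by byte.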

3.2. Grid Topology Reconstruction

In our research, we successfully construct the metadata of the geometric model file through precise operations such as pre-scanning and grid segmentation. In the metadata, we carefully preserve the sub-grid information processed by each process, making full preparations for subsequent topological reconstruction. Next, we first map the global coordinates in the geometric model file to the local coordinates in the process. Then, using the point sets maintained by each process, we perform topological reconstruction on the sub-grids and remap them to each process. As a result, a new sub-grid topological structure is formed.
Taking the geometric model shown in Figure 3a as an example, the model consists of three parts, with yellow, red, and blue grids representing one part each. In order to achieve a parallel reading of the grid model, multiple processes can be launched to partition the components. As an illustration, we take four processes as an example. As shown in Figure 3c, the four processes first process the yellow part in parallel. Once completed, they begin processing the red and blue parts. Finally, as shown in Figure 3b, processes 0–3 each store a portion of data for each part, thus achieving efficient and economical parallel reading of geometric model grids.
Our research provides a new approach to the parallel I/O of large-scale flow-field data, with both academic and practical value.

3.2.1. Coordinate-Mapping Algorithm

The metadata constructed through pre-scanning and grid segmentation creates a file viewport for each process, establishing the conditions for parallel reading. Figure 4 shows each process reading its viewport. Each process opens the file separately and maintains its own file pointer; with the assistance of the metadata, it reads only the data within its own viewport, constructs the grid, and performs grid mapping. This achieves more efficient data processing, and the content handled by the different processes does not interfere with one another.
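A minimal sketch of such viewport-based reading is shown below. It assumes 4-byte little-endian integers for the topology bytes; the actual EnSight Gold binary layout also contains headers and may use the machine's native byte order:

```python
import struct

def read_viewport(path: str, start_byte: int, end_byte: int) -> list:
    """Each process opens the file itself, seeks to its own viewport, and
    reads only the bytes it owns; the int topology is then decoded."""
    with open(path, "rb") as f:
        f.seek(start_byte)
        raw = f.read(end_byte - start_byte + 1)
    n = len(raw) // 4  # assumed 4-byte ints, little-endian
    return list(struct.unpack(f"<{n}i", raw))
```

The start and end bytes passed in are exactly the realStartPos and realEndPos values computed from the metadata, so no process ever touches another process's region of the file.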
In EnSight Gold format files, each part shares the same set of coordinate points. Therefore, each process must maintain its own set of points. In this study, we propose a point-set mapping algorithm that enables the reconstruction of mesh topology for process mapping after partitioning the mesh into multiple parts and assigning them to different processes. This algorithm maps the point set in the geometric model part to each process through two mappings, ensuring the accuracy of key operations such as mesh construction and attribute-variable retrieval. Figure 5 illustrates these two mapping processes.
Specifically, we abstract the algorithm as a function f , which maps the point set in the geometric model to the point set in the process, as demonstrated in Expression (5), where P model represents the point set in the geometric model file, and P process represents the point set in the process. To ensure the accuracy of key operations, the mapping function includes two steps: the first mapping and the second mapping.
$$f : P_{\mathrm{model}} \to P_{\mathrm{process}} \tag{5}$$
In the first mapping step, we maintain a coordinate-mapping table $T_{\mathrm{point}}^{i}$, which records the mapping between each point's ID in the geometric model and the corresponding point ID in process $i$. First, we use the metadata constructed during preprocessing to calculate the byte length that process $i$ should read. We then maintain a vector $\mathrm{nodeIdList}_{ijk}$ of length $\mathrm{processNumEleLen}_{ijk}$ to hold all bytes of the topology of the cells under element $k$ in part $j$ that process $i$ needs to process. For each coordinate ID obtained from $\mathrm{nodeIdList}_{ijk}$, we use the ID as an index and the value of pointsCnt (a counter initialized to 1 and incremented in the loop) as the value of the current coordinate in $T_{\mathrm{point}}^{i}$, completing the mapping of one coordinate point. Whenever a new coordinate ID is obtained, we first check whether a mapping already exists in $T_{\mathrm{point}}^{i}$. If it does not, we create the mapping and use the result to construct the sub-grid; if it does, we directly use the existing mapping result to construct the new cell topology. Finally, once all grid topologies are constructed, $T_{\mathrm{point}}^{i}$ holds the complete mapping from global coordinates to local coordinates.
In the second mapping stage, the point set is stored as an object for subsequent attribute-data retrieval. Each process maintains a mapping table $T_{\mathrm{realPoint}}^{i}$ to record the relationship between the coordinate point IDs used by process $i$ in the geometric model file and the IDs within process $i$. A temporary array $T_{\mathrm{temp}}^{i}$ is required for interim mapping, since the number of points used by the current process is only known after all grids have been constructed. Specifically, each coordinate ID in $\mathrm{nodeIdList}_{i}$ is stored as the value of $T_{\mathrm{temp}}^{i}$ at index pointsCnt − 1. Upon completion of grid construction, the temporary table finalizes the mapping, and a memory block of length pointsCnt is copied to $T_{\mathrm{realPoint}}^{i}$.
The first mapping stage is used for sub-grid reconstruction. When acquiring the points that make up a cell's topology, duplication may occur, since a single coordinate point can be shared among multiple cells; to establish a new grid topology, duplicate points must be eliminated. The second mapping stage supports subsequent attribute-data extraction: because attribute data files store grid-center and grid-point data starting by default from the point or cell with ID 0, the global and local coordinates must be remapped. Hence, the coordinate-mapping algorithm proposed in this paper must perform two mappings of the coordinate points.
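The two mappings can be sketched together as follows. Unlike the text, which starts pointsCnt at 1, this sketch uses 0-based local IDs for simplicity, and all names are illustrative:

```python
def map_coordinates(cell_node_ids):
    """Sketch of the two mappings of Section 3.2.1.
    First mapping  (t_point):      global point ID -> local point ID,
                                   deduplicating points shared between cells.
    Second mapping (t_real_point): local point ID -> global point ID,
                                   used later to fetch attribute values.
    Also returns the cell topologies rewritten in local IDs."""
    t_point = {}       # T_point_i : global -> local
    t_real_point = []  # T_realPoint_i : local -> global
    local_cells = []
    for cell in cell_node_ids:
        new_cell = []
        for gid in cell:
            if gid not in t_point:             # first time this point is seen
                t_point[gid] = len(t_real_point)
                t_real_point.append(gid)
            new_cell.append(t_point[gid])      # reuse the existing mapping
        local_cells.append(new_cell)
    return local_cells, t_point, t_real_point

# two quads sharing an edge (global point IDs)
cells = [[10, 11, 12, 13], [11, 14, 15, 12]]
print(map_coordinates(cells)[0])  # -> [[0, 1, 2, 3], [1, 4, 5, 2]]
```

Note how the shared points 11 and 12 receive a local ID only once; the reverse table then lets the process look up attribute values for exactly the points it owns.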

3.2.2. Grid Construction and Mapping

The traditional method of reading EnSight Gold files uses a file pointer to read each byte sequentially from beginning to end. In our implementation, we adopt a preprocessing approach that partitions the file's grid and distributes it evenly among the processes, and we construct metadata for the geometric model file to support subsequent operations. With the metadata, the pointer can be moved randomly within the region a process is responsible for, directly fetching the required data without advancing byte by byte. In the geometric model file, the keyword "element id given" indicates that the file supplies explicit cell IDs; however, these IDs serve only for display numbering, not as the IDs of the cell topology structures. In the file, the cell topology IDs are numbered from 0 by default and increase in storage order. When reading serially, the grid topology can be built directly in the order in which cells are read; since our parallel reading strategy partitions and assigns grids to different processes, building the topology in storage order is impossible. We therefore propose a sub-grid topology construction and cell-ID mapping method that uses the new coordinate point set mapped in Section 3.2.1 to rebuild new grid topologies in each process and map the grids in the geometric model files onto them.
In Section 3.2.1, we maintained a vector called nodeIdList_ijk to store the topology information of cells. This information consists of contiguous bytes and can be read and saved to the vector directly. Using metaInfo.preBlock_ijk and metaInfo.curBlock_ijk in the metadata, we can calculate the starting ID (id_begin) and ending ID (id_end) of the cells under element k in part j that process i should handle; these values serve as the boundary conditions for deciding when sub-grid construction has ended. Since the partitioned sub-grids no longer share one set of coordinate points but instead use the new point set mapped to each process in Section 3.2.1 to construct their topologies, each process must maintain a grid-mapping table T_realCellId_i. The variable originalCellId_ijk (running from id_begin to id_end), which represents the ID of the cell in the geometric model file, is used to iteratively construct the new sub-grid topology. Whenever a new cell topology is constructed, originalCellId_ijk is appended to the grid-mapping table T_realCellId_i. The table's index increases from 0, and its index and value form a mapping between the new grid ID and the original grid ID.
Figure 6 illustrates the process of constructing and mapping a grid. As an example, consider a scenario in which four processes handle a quad-type grid with 5,880,097 cells from a geometric model file. Our data preprocessing program divides the grid evenly among the four processes: process 0 handles 1,470,025 cells, and each of the remaining three processes handles 1,470,024 cells. When processing its assigned sub-grid, process 1 uses the metadata to calculate its starting and ending IDs as 1,470,026 and 2,940,049, respectively, and allocates memory of size 1,470,024 for the grid-mapping table T_realCellId_i. For the cell with ID 1,470,026 in the geometric model file (composed of the four points 170,023, 182,964, 32,689, and 763,284), our coordinate-mapping strategy maps its topology to 0, 1, 2, and 3 and its ID to 0. The other three processes construct their sub-grids and map their grid IDs in the same way.
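The ID ranges in this example can be reproduced with a short sketch, under the assumption that the remainder of an uneven split goes to the lowest-ranked processes; the function names are illustrative, not taken from the authors' code.

```python
def cell_range(rank, nprocs, total_cells):
    """Evenly split total_cells among nprocs; the first (total_cells % nprocs)
    ranks each take one extra cell. Returns the 1-based (id_begin, id_end)
    range of this rank, matching the numbering used in the worked example."""
    base, extra = divmod(total_cells, nprocs)
    count = base + (1 if rank < extra else 0)
    id_begin = rank * base + min(rank, extra) + 1
    id_end = id_begin + count - 1
    return id_begin, id_end

def build_cell_map(id_begin, id_end):
    """T_realCellId: the table index is the new local cell ID (from 0),
    and the stored value is the original cell ID in the model file."""
    return list(range(id_begin, id_end + 1))
```

With the paper's numbers (5,880,097 quad cells, 4 processes), process 1's range evaluates to (1,470,026, 2,940,049), and its mapping table holds 1,470,024 entries.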

3.3. Attribute-Variable Picking

In the EnSight Gold format, flow-field attribute data are usually stored in files with user-defined suffixes. These data are divided into cell-centered and node-centered data according to their storage location in the flow field and are identified by the keywords "element" and "node", respectively. Both kinds of data cover scalar, vector, and tensor physical quantities, whose values may be real or complex. Flow-field attribute data in the EnSight Gold format therefore comprise multiple types of physical quantities and are usually stored in 1–12 separate files. Because we adopt a parallel reading strategy that partitions and remaps the grid, it is particularly important to associate the flow-field attribute data one-to-one with the partitioned sub-grids. As noted in the previous sections, each process maintains a coordinate-mapping table T_realPoint_i and a grid-mapping table T_realCellId_i; these two tables are the key to picking flow-field attribute variables.
The index-to-value relationship of each mapping table reflects the mapping between the coordinate-point set or sub-grid held by a process and the geometric model file, and the length of the table gives the size of the point set maintained by the process or the number of cells in its sub-grid. We can therefore obtain the key information about the point set and sub-grid by parsing the two mapping tables separately. When reading cell-centered data, process i controls the number of iterations based on the length of the mapping table. First, all the attribute data of the cells under element k in part j of the attribute file are read into a list in one pass; this information consists of contiguous bytes and can be read directly. Then, using three pieces of information, namely the part number j, the element type number, and the current iteration count (which is also the ID of the cell currently being processed within the process), we parse from the grid-mapping table T_realCellId_i the ID of the corresponding cell in the geometric model. Using this ID, we obtain the attribute value for the current cell from the attribute list. In this way, the attribute data are associated one-to-one with the partitioned sub-grids. The same method can be applied to the coordinate-point-mapping table T_realPoint_i to associate attribute data with the point sets maintained by the processes.
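The lookup step above reduces to indexing the attribute list through the mapping table; the following sketch assumes one attribute value per cell and a known original ID of the first stored cell (names are illustrative, not the authors' API).

```python
def pick_cell_attributes(attr_list, t_real_cell_id, first_id=1):
    """Reorder per-cell attribute values for this process's sub-grid.

    attr_list holds one value per cell of a given element type, in the
    storage order of the attribute file; t_real_cell_id maps the local
    (new) cell ID, i.e., the list index, back to the original cell ID in
    the geometric model file; first_id is the original ID of the first
    stored cell."""
    return [attr_list[orig_id - first_id] for orig_id in t_real_cell_id]
```

Applying the same function to T_realPoint instead of T_realCellId would pick node-centered values for the point set maintained by the process.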

4. Real-Time Interactive Methods for Large-Scale Flow Fields

To meet the visualization needs of large-scale, complex scenarios, the computationally intensive steps of visual feature extraction and parallel rendering must be performed on supercomputers because of the sheer volume of data involved. Researchers have therefore devoted substantial effort to parallel rendering architectures, gradually overcoming challenges such as parallel image rendering, image composition, and image transmission. However, the parallel visualization of large-scale flow fields still faces a significant technical challenge: how to drive supercomputers from visualization terminals so that complex visual analysis can be performed while satisfying users' interactive requirements and providing real-time feedback for interactive rendering.
From the perspective of supercomputer architecture, the visualization terminals are connected to the rendering nodes over Ethernet (with bandwidths ranging from hundreds of megabits to gigabits per second), while high-speed interconnects such as InfiniBand (with bandwidths of tens of gigabits per second) are employed between the rendering nodes; the bandwidth of these two kinds of connection can differ by several hundred to several thousand times. Consequently, the multi-level asymmetric communication between visualization terminals and rendering nodes has become one of the bottlenecks for real-time interaction in large-scale scientific visualization, yet this problem has not been thoroughly investigated. The complexity of supercomputer system architectures, coupled with the requirement for real-time human–computer interaction in scientific visualization, makes existing parallel rendering architectures unsuitable for current supercomputer systems. In this paper, we address this issue by managing the different communication networks within the supercomputer system, leveraging communication resources effectively, and achieving efficient parallel rendering tailored to the practical needs of real-time interaction in scientific visualization. Figure 7 shows this network communication management method.
The visualization terminal runs on a service node, while multiple rendering processes are launched on the compute nodes. The first step is to establish a connection between the visualization terminal and the compute nodes via Socket communication. Next, the data-partitioning strategy based on grid segmentation is used to load data into each compute node in parallel.
During this process, we construct metadata for the geometric model and use metadata parsing in place of serial byte-by-byte file I/O, reducing the number of inter-process communications between rendering nodes as well as the communication overhead on the server-side high-speed interconnect. After data loading completes, the service node initializes the image-rendering window and sends information such as window size, position, and background to the rendering master node over the Socket interface. The master node then broadcasts the window-initialization information to the other rendering nodes through the MPI communication interface, waking the satellite processes so that they update their renderer information. The service node next issues an image-rendering command to the rendering nodes through a remote call over the Socket. Each rendering node uses visual-feature computation to convert the simulation data loaded in memory into displayable primitives and compresses them into data blocks with a data-compression algorithm to reduce inter-process communication overhead during image composition. Because of the large scale of the flow-field data, once all rendering processes have finished, a binary-swap compositing method driven by MPI communication fuses the images scattered across the rendering nodes, based on image depth information, into a complete simulation image on the rendering master node. The master node sends this image to the service node over the Socket interface, and the visualization terminal renders the received image to the window and waits for user interaction. The user may then perform interactive operations such as scaling, rotation, and translation; the service node retrieves renderer-related information, such as the renderer position, camera position, and camera transformation matrix, and sends rendering commands to the rendering nodes over the Socket interface.
After parallel rendering is completed by the rendering nodes, they synchronously return the rendered results to the visualization terminal for display.
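The binary-swap compositing step can be illustrated with a simulated, MPI-free sketch over 2^k "processes", each initially holding a full-resolution image of (color, depth) fragments; the per-pixel depth test keeps the fragment nearer the camera. This is a schematic of the algorithm only; the actual system exchanges compressed image blocks over MPI.

```python
def depth_composite(frag_a, frag_b):
    """Per-pixel depth test: keep the fragment with the smaller depth."""
    return [a if a[1] <= b[1] else b for a, b in zip(frag_a, frag_b)]

def binary_swap(images):
    """Simulated binary swap. In each round a rank exchanges half of its
    current strip with its partner (rank XOR step) and composites the half
    it keeps; after log2(n) rounds each rank owns a fully composited 1/n
    strip, which the master then gathers into the final image."""
    n = len(images)            # number of ranks; must be a power of two
    width = len(images[0])
    strips = [(0, img[:]) for img in images]   # (pixel offset, strip)
    step = 1
    while step < n:
        new_strips = []
        for rank in range(n):
            partner = rank ^ step
            off, strip = strips[rank]
            _, pstrip = strips[partner]        # partner holds the same region
            half = len(strip) // 2
            if rank < partner:                 # keep the front half
                new_strips.append((off, depth_composite(strip[:half],
                                                        pstrip[:half])))
            else:                              # keep the back half
                new_strips.append((off + half, depth_composite(strip[half:],
                                                               pstrip[half:])))
        strips = new_strips
        step *= 2
    final = [None] * width                     # gather on the master
    for off, strip in strips:
        final[off:off + len(strip)] = strip
    return final
```

Binary swap keeps every rank busy in every round, unlike direct-send compositing to a single master, which is why it scales to the process counts used in Section 5.4.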

5. Results

This study presents four sets of experiments that investigate the impact of the proposed method on real-time interactive visualization in a cluster environment with different numbers of processes.
The experiments were run on a domestic high-performance cluster system with Feiteng-based server CPUs, 1056 compute nodes, a total storage capacity of 2.6 PB, and 64 cores and 128 GB of memory per node. Table 1 lists the detailed attributes of the five flow-field datasets used to verify the data-partitioning strategy based on grid segmentation; the number below each dataset gives its number of time steps, and all data are taken from the first time step. The data scale ranges from millions to tens of millions of grid cells. The data show no correlation between the number of parts in a geometric model and the data scale, whereas the number of grid cells is positively correlated with the data scale.

5.1. Validation of Effectiveness

For preprocessing methods applied to large-scale visualization data, the primary objective is to ensure that the program generates accurate visual images and that the visualization computations are correct. In addition, the speedup ratio (the ratio of serial execution time to parallel execution time) is a key performance metric. To evaluate the data-partitioning strategy based on grid segmentation proposed in this paper, we compared it with traditional serial reading and with the open-source scientific visualization software ParaView (v5.11.0). The visualization analysis was performed on datasets at the scale of tens of millions of grid cells. In the experiment, we read the datasets with 2, 4, 8, 16, 32, and 64 processes. Because of overheads such as inter-process communication and context switching, the execution time varied across processes, so we took the time consumed by the last process to finish as the valid experimental result. To reduce randomness, each group of experiments was repeated five times and the average was taken, and the speedup ratios of the grid-partitioning method and ParaView were calculated. Table 2 shows the experimental results. Since the EnSight Gold format is commonly used to store unsteady data, the datasets selected in this paper are also unsteady; for ease of presentation, Table 2 gives only the results for the first time step.
The experimental results demonstrate that the proposed grid-based data-partitioning strategy effectively organizes the partitioning of flow-field data with tens of millions of grid cells. In every group of experiments on the three datasets, the speedup ratio of the proposed strategy significantly outperformed ParaView. We found that ParaView's data-partitioning strategy serially processes the tens of millions of grid cells in each process and then filters and distributes them to the respective compute nodes; this not only increases execution time but can even take longer than serial reading, so such a strategy improves only system throughput and scalability. In contrast, the method presented in this paper uses the geometric model metadata to let each process independently handle the data in its own viewport, reducing the parallel reading time for datasets with tens of millions of grid cells to approximately one-seventh of the serial reading time. Figure 8 shows visualization images rendered with the proposed method on such a dataset, highlighting characteristics such as cutting planes, streamlines, and surface textures. These images demonstrate that the proposed method correctly segments and allocates large-scale flow-field data and correctly extracts and renders visual features.
To evaluate the impact of the grid-based segmentation method on the parallel reading of all time steps of a dataset, this study launched 16 processes to load all time steps of three datasets at the ten-million-cell scale; the total time consumed was recorded and compared with serial loading. Figure 9 shows the results. For the most time-consuming dataset, DATASET4, reading all 60 time steps serially took about 15 min, whereas parallel reading based on grid segmentation took only about 2 min, greatly improving data-loading efficiency and the interaction experience.

5.2. Analysis of Load Balancing

The grid-based parallel reading method proposed in this study not only reads files in parallel but also achieves static load balancing by segmenting the grid according to the number of processes in the parallel environment when the geometric model metadata are constructed. To evaluate this static load balancing, the following experiment was designed. As described in Section 3.2, the grids of each part are stored separately in each process after segmentation, so a stacked bar chart is used to present the results, as shown in Figure 10. The figure shows that, after segmentation, the minimum difference in the number of grid cells between processes is 0 (DATASET2), indicating that the grid is divided exactly evenly, and the maximum difference is only 15 (DATASET5). The results show that the proposed method achieves static load balancing, improving system throughput and minimizing response time.

5.3. Scalability Analysis

To evaluate the scalability of the proposed data-partitioning strategy based on grid segmentation, experiments were conducted on datasets of different sizes with different numbers of processes. The results (shown in Figure 11) indicate that as the number of processes increased, the parallel reading time did not decrease monotonically; instead, it first decreased and then gradually increased, and once the number of processes exceeded a certain value, the reading time rose sharply. This is because parallel file reading is limited by disk hardware, network bandwidth, and similar factors, while launching many processes incurs overheads such as context switching and inter-process communication. Comparing the results for DATASET1 and DATASET5 shows that when the data volume has not reached a certain scale, these overheads occupy a large proportion of the total parallel time, so the best parallel performance cannot be achieved; once the number of processes exceeds a certain value, the time consumed by these overheads grows sharply as a proportion of the overall reading time. Therefore, for parallel reading, more processes are not always better: the number of processes must be chosen to suit the specific data in order to achieve optimal parallel performance.

5.4. Real-Time Interaction Analysis

When visualizing large-scale flow-field data, users must be given real-time interactive means of observing the data from any angle, making the three-dimensional structure of the data more intuitive. However, because of the large data scale, communication between the visualization terminal and the rendering nodes becomes a bottleneck for real-time interactive visualization. To solve this problem, this paper proposed a multi-level asymmetric communication management method and designed experiments to verify its effectiveness. We launched 64, 128, 256, 512, and 1024 processes to interact with two sets of steady-state data containing 500 million and 800 million grid cells. Figure 12 displays the rendered visualizations of these two large-scale datasets. The left mouse button selects all components and rotates them arbitrarily about the X, Y, and Z axes; the middle button moves components freely within the viewport; and the mouse wheel zooms in and out to inspect model details. Each of these three interactive operations was performed five times per group and the average was taken. Table 3 shows the system response times during real-time interaction. The results show that the proposed multi-level asymmetric communication management method reduces the real-time interaction response time to the millisecond level, meeting the needs of real-time interaction in the large-scale parallel visualization of flow fields.

6. Discussion

The experimental results demonstrate that our method reduces the file-reading time for large-scale datasets to approximately one-seventh of the serial time while exhibiting excellent load-balancing performance. The proposed grid-based data-partitioning strategy accelerates the loading and processing of large-scale flow-field data, mitigating the bottleneck and latency issues associated with single-point I/O. It effectively addresses the challenge of visualizing large-scale flow fields by overcoming the limitation imposed by their immense size, and thus holds significant value for large-scale flow-field visualization. By managing network communication across the different cluster subsystems and efficiently utilizing network resources, we minimize interaction time, reduce operational delays, and meet the real-time human–computer interaction requirements of large-scale scientific visualization. The proposed multi-level asymmetric network communication management method enables large-scale flow-field visualization to perform complex visual-feature computation and rendering on supercomputers driven from visualization terminals. This enhances the real-time, interactive nature of visualization, which is crucial for tasks requiring rapid response and feedback, such as exploratory analysis, parameter adjustment, and dynamic demonstration.
We propose a data-partitioning strategy based on grid segmentation that can effectively exploit the fine-grained parallelism within grids and improve data access efficiency. Currently, the method is only applicable to the EnSight Gold data format, but we believe it has wide applicability and scalability to other types of data formats. Our future work will explore the performance and effectiveness of the method under different data formats.

7. Conclusions

To address the challenge of real-time interactive control in visualizing large-scale flow-field data, this paper proposes a data-partitioning strategy based on grid segmentation and a real-time interactive method for large-scale flow fields. This effectively overcomes the technical obstacles in the parallel visualization of large-scale flow fields. Firstly, we employ geometric model metadata to create file viewports for each process and reconstruct the grid topology through metadata. This involves mapping the new grid topology and coordinate points and resolving the mapping table for attribute-variable retrieval. Secondly, by managing network communication at different levels within the cluster system, we achieve coordination between synchronous and parallel rendering. We design experiments to compare our data-partitioning strategy with serial methods and the approach used in ParaView, validating the effectiveness of the communication management method. The experimental results demonstrate that our data-partitioning strategy improves the loading efficiency of datasets with ten million grid cells by a factor of seven, ensuring load balancing in the system and correctly supporting visualization rendering and computation. Compared to ParaView, our approach exhibits superior performance, avoiding bottlenecks and delays caused by single-point I/O, and addressing the challenge of visualizing large-scale flow fields due to the sheer volume of data. It holds significant value for large-scale flow-field visualization. In interactive scenarios involving 500 million and 800 million grids, our system achieves millisecond-level response times, fully demonstrating the effectiveness of our communication management method. 
This method enables complex visual feature computation and rendering driven by visualization terminals on supercomputers, enhancing real-time and interactive capabilities in visualization tasks that require a rapid response and feedback, such as exploratory analysis, parameter adjustment, and dynamic demonstrations.

Author Contributions

Methodology, Z.H. (Zhouqiao He) and Z.H. (Zhengbin Huang); Software, C.C.; Validation, Z.H. (Zhouqiao He) and X.T.; Formal analysis, W.Z.; Resources, Q.C.; Data curation, Z.H. (Zhouqiao He); Writing—original draft, Z.H. (Zhouqiao He); Writing—review & editing, Z.H. (Zhouqiao He); Supervision, Y.W.; Project administration, Y.W.; Funding acquisition, Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Numerical Windtunnel Project (NNW2019ZT6-A17), the National Defense Basic Research Project (JCKY2022404C001), and the Innovation Fund of Postgraduate, Sichuan University of Science & Engineering (Y2022178).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. After initializing the MPI (Message Passing Interface) environment, in order to reduce communication overhead, each process scans and calculates the data area for which it is responsible for processing and constructs metadata. On this basis, the grid is partitioned and mapped.
Figure 2. The EnSight Gold data format. (a) Organization of the format, and the internal structure of the geometric model file Test.geo. Test.Nvec is a representative attribute file that stores attribute data corresponding to Test.geo and is organized by the Test.case file. (b) Supported grid types.
Figure 3. Illustration of the partitioning of the mesh by processes. (a) Yellow, red, and blue represent the three components of the geometric model; (b) four processes sequentially partition the three components and store the sub-meshes separately; the final partitioning result is shown in (c).
Figure 4. Geometric model file consisting of n parts, each containing multiple different types of elements. The blue frame diagram indicates the sub-grid portions of each part that process 0 is responsible for processing.
Figure 5. Coordinate-mapping algorithm. During the first mapping process, each mesh element is taken and its topology is mapped as an index to the process. This mapping is used only to construct the sub-mesh. During the second mapping process, the number of point sets that the current process needs to maintain is obtained first. Then, a memory copy is performed to complete the actual mapping.
Figure 6. Grid construction and mapping. As indicated by the blue and orange regions in the figure, each process reconstructs its sub-grid and remaps the grid IDs.
Figure 7. Multi-level asymmetric communication management. After loading data, the user interacts and sends control commands to the master process via Socket. The master process awakens the child processes to begin rendering. After undergoing primitive transformation, data compression, data exchange, and other processes, the image is composited in the master process using the binary-swap algorithm. At this point, the rendering processes communicate via MPI (Message Passing Interface). Finally, the complete rendered image is returned to the service node via Socket, waiting for the user’s next interaction.
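The binary-swap compositing step named in Figure 7 can be sketched as follows. This is a minimal, MPI-free simulation of the algorithm's communication pattern, under stated assumptions: compositing is modeled as pixel-wise addition (a stand-in for the real over-operator blending of depth-sorted fragments), and the process count is a power of two. In round r, rank i pairs with rank i XOR 2^r; each partner keeps half of its current image extent and composites the copy received from the other, so after log2(P) rounds each rank owns 1/P of the final image, which the master process then gathers.

```python
def binary_swap(images):
    """Simulate binary-swap compositing over P ranks (sketch).

    images: one full-size image per rank, as equal-length lists of floats.
    Compositing is modeled as addition; returns the image the master
    process would assemble after the final gather.
    """
    P = len(images)
    assert P & (P - 1) == 0, "binary-swap assumes a power-of-two rank count"
    n = len(images[0])
    buf = [list(img) for img in images]   # each rank's local framebuffer
    extents = [(0, n)] * P                # image extent each rank still owns
    for r in range(P.bit_length() - 1):   # log2(P) exchange rounds
        new_ext = list(extents)
        for rank in range(P):
            partner = rank ^ (1 << r)
            if rank > partner:
                continue                  # handle each pair once
            lo, hi = extents[rank]        # partners share the same extent
            mid = (lo + hi) // 2
            # The lower rank keeps and composites [lo, mid); the partner
            # keeps and composites [mid, hi).
            for i in range(lo, mid):
                buf[rank][i] += buf[partner][i]
            for i in range(mid, hi):
                buf[partner][i] += buf[rank][i]
            new_ext[rank] = (lo, mid)
            new_ext[partner] = (mid, hi)
        extents = new_ext
    # Final gather: the master assembles each rank's owned strip.
    out = [0.0] * n
    for rank in range(P):
        lo, hi = extents[rank]
        out[lo:hi] = buf[rank][lo:hi]
    return out
```

In the real system these exchanges run over MPI between rendering processes, while the composited frame travels back to the service node over the Socket link; only the pairing and halving logic is shown here.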
Figure 8. Visual representation of a large-scale dataset comprising tens of millions of grid cells, rendered using a grid-segmentation data-partitioning strategy. This image effectively portrays the distinct features of the dataset, including shear (a), streamlines (b), and surface textures (c). (d) shows the visualization results of the tornado, (e) displays the flight attitude of the aircraft, and (f) illustrates the unsteady separation release scenario.
Figure 9. Using both the serial strategy and the data-partitioning strategy based on grid segmentation, 16 processes were launched to load all time steps of three datasets at the ten-million-grid level, and the total time consumed by each method was recorded.
Figure 10. Using the data-partitioning strategy based on grid segmentation, 4 processes were initiated to load the 5 datasets in Table 1. Also shown is the number of grids maintained by each process after grid segmentation.
Figure 11. Illustration of the trend in the time required to read different datasets as the number of processes increases.
Figure 12. Visual rendering results of the 500 million (a) and 800 million (b) grid-cell datasets.
Table 1. Detailed information on the datasets.
| Dataset | Total (G) | Geo (G) | Variable (G) | Part | Point | Cell |
|---|---|---|---|---|---|---|
| DATASET1 (34) | 0.18 | 0.14 | 0.04 | 13 | 1,737,895 | 6,813,683 |
| DATASET2 (26) | 1.00 | 0.57 | 0.43 | 1 | 12,084,210 | 11,254,272 |
| DATASET3 (168) | 0.83 | 0.58 | 0.25 | 8 | 10,920,710 | 17,963,212 |
| DATASET4 (60) | 3.00 | 2.16 | 0.84 | 1 | 45,000,000 | 44,611,099 |
| DATASET5 (48) | 2.95 | 2.04 | 0.91 | 19 | 40,839,722 | 60,688,365 |
Table 2. Comparison of speedup ratios.
| Dataset / Number of Processes | Method | Metric | 2 | 4 | 8 | 16 | 32 | 64 |
|---|---|---|---|---|---|---|---|---|
| DATASET3, Serial Reading (3997 ms) | Grid Partitioning | Execution Time (ms) | 2035 | 1013 | 673 | 596 | 1117 | 1715 |
| | | Speedup Ratio | 1.96 | 3.95 | 5.94 | 6.71 | 3.58 | 2.33 |
| | ParaView | Execution Time (ms) | 5124 | 4164 | 3303 | 3634 | 4164 | 7267 |
| | | Speedup Ratio | 0.78 | 0.96 | 1.21 | 1.10 | 0.96 | 0.55 |
| DATASET4, Serial Reading (14,842 ms) | Grid Partitioning | Execution Time (ms) | 7844 | 4248 | 2670 | 2383 | 3866 | 5872 |
| | | Speedup Ratio | 1.89 | 3.49 | 5.56 | 6.23 | 3.84 | 2.53 |
| | ParaView | Execution Time (ms) | 18,323 | 14,271 | 11,417 | 11,874 | 13,493 | 26,985 |
| | | Speedup Ratio | 0.81 | 1.04 | 1.30 | 1.25 | 1.10 | 0.55 |
| DATASET5, Serial Reading (14,607 ms) | Grid Partitioning | Execution Time (ms) | 8027 | 4376 | 2461 | 2265 | 3697 | 5592 |
| | | Speedup Ratio | 1.82 | 3.34 | 5.94 | 6.45 | 3.95 | 2.61 |
| | ParaView | Execution Time (ms) | 16,052 | 12,173 | 10,901 | 10,585 | 12,379 | 18,970 |
| | | Speedup Ratio | 0.91 | 1.20 | 1.34 | 1.38 | 1.18 | 0.77 |
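The speedup ratios in Table 2 are the serial reading time divided by the parallel execution time. A trivial check against the DATASET3 row (serial reading: 3997 ms, 2-process grid-partitioning time: 2035 ms):

```python
def speedup(serial_ms, parallel_ms):
    """Speedup ratio as reported in Table 2: serial time / parallel time."""
    return serial_ms / parallel_ms

# DATASET3, grid partitioning with 2 processes:
print(round(speedup(3997, 2035), 2))  # → 1.96, matching Table 2
```

Note that for both methods the ratio peaks at 16 processes and degrades at 32 and 64, so the reported maximum speedups occur at 16 processes.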
Table 3. Real-time interaction response times.
| Dataset / Number of Processes | Interaction Method | 64 | 128 | 256 | 512 | 1024 |
|---|---|---|---|---|---|---|
| 500 Million Grids | Rotation (ms) | 4.697 | 4.832 | 4.948 | 5.547 | 4.198 |
| | Translation (ms) | 4.906 | 4.215 | 4.581 | 5.427 | 4.397 |
| | Scaling (ms) | 4.817 | 5.568 | 5.006 | 4.816 | 4.084 |
| 800 Million Grids | Rotation (ms) | 4.457 | 6.421 | 4.528 | 4.692 | 4.721 |
| | Translation (ms) | 5.463 | 6.197 | 5.122 | 4.371 | 5.486 |
| | Scaling (ms) | 4.582 | 5.078 | 4.490 | 6.572 | 4.743 |
He, Z.; Chen, C.; Wu, Y.; Tian, X.; Chu, Q.; Huang, Z.; Zhang, W. Real-Time Interactive Parallel Visualization of Large-Scale Flow-Field Data. Appl. Sci. 2023, 13, 9092. https://doi.org/10.3390/app13169092
