Article

Design and Application of an Onboard Particle Identification Platform Based on Convolutional Neural Networks

1
National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China
2
University of Chinese Academy of Sciences, Beijing 100049, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(15), 6628; https://doi.org/10.3390/app14156628 (registering DOI)
Submission received: 2 May 2024 / Revised: 14 June 2024 / Accepted: 26 July 2024 / Published: 29 July 2024

Abstract

Space radiation particle detection plays a crucial role in scientific research and engineering practice, especially in particle species identification. Commonly used in-orbit particle identification techniques include the detector telescope method (∆E-E), electrostatic analysis with time of flight (ESA × TOF), time of flight with energy (TOF × E), and pulse shape discrimination (PSD). However, these methods usually do not exploit the full waveform, which is rich in features, and their results can be degraded by random fluctuations in particle energy deposition and by noise. In this study, a low-latency, lightweight onboard FPGA platform for real-time particle identification based on full waveform information was developed, exploiting the classification, robustness, and generalization capabilities of convolutional neural networks (CNNs). The platform constructs diversified input datasets based on the physical features of the waveforms and uses Optuna and PyTorch for model training. The hardware platform is responsible for the real-time inference of waveform data and the dynamic expansion of the dataset. The platform was trained and tested on historical neutron and gamma-ray waveform data; inference on a single waveform takes 4.9 microseconds, with an accuracy of over 97%. The classification expectation FOM (figure-of-merit) value of this CNN model is 133, compared with an FOM value of 0.8 for the traditional PSD algorithm. The platform not only improves the accuracy and efficiency of space particle discrimination but also provides an advanced tool for future space environment monitoring, which is of great value for engineering applications.

1. Introduction

The vastness of cosmic space is filled with all kinds of particles, including the solar wind, cosmic rays, and the Earth’s radiation belts, so various satellite platforms are exposed to space radiation. By studying the composition, energy spectrum, and direction of charged particles in the universe, as well as the changes in particles in space and time, one can obtain important information on the origin of particles, acceleration and propagation processes, lifetimes, and dynamical processes. In addition, the effects of space radiation on various space vehicles and astronauts should not be underestimated, as they can cause electronic equipment malfunctions, interfere with communications, damage electronic components, and jeopardize human health. Therefore, it is crucial to study the physical properties of particles in the space environment, and particle identification is one of the key aspects. Through particle identification, information such as the type, energy, and flux of particles can be determined to assess their impact on spacecraft and astronauts, and it can be used to formulate corresponding protective measures to safeguard the safety and health of spacecraft and astronauts. Particle identification is therefore of great importance in the fields of science and engineering [1].
At the present stage, spaceborne particle identification mainly includes several commonly used methods:
  • The first is the detector telescope method [2] (∆E-E). The main principle is to use one or more pieces of solid-state detectors (SSDs) to form a stack. If the first piece of the sensor is thin enough, the particle deposition energy ∆E can be measured after the particles’ penetration. The second (or multiple) sensors are thick enough to fully absorb the remaining energy of the particles, allowing the total particle energy E to be measured. Particle species can then be distinguished using a two-dimensional spectrum. The detector telescope method is simple, reliable, and widely used in practical engineering applications. However, it requires particles to lose very little energy in the transmission detector, making it less suitable for heavier particles, and the necessary corrections can be troublesome.
  • The second method is the electrostatic analysis time-of-flight method [3] (ESA-TOF). This method uses an electrostatic analyzer (ESA) to screen the energy–charge ratio (E/q) of incident ions based on the scanning voltage (V), and then combines it with the time-of-flight method to obtain the mass–charge ratio of the ions. Electrostatic analyzers are often used to test charged particles or neutral atoms with lower energies. However, due to the spatial limitations of onboard equipment, detecting medium- and high-energy particles requires a substantial increase in the physical structure, which cannot be realized in orbit.
  • The third method is the time-of-flight energy method [4] (TOF × E). It is based on the kinetic energy relation E = mv²/2: the particle energy E is measured directly, the velocity v is obtained from the time of flight over a known path length, and the particle mass m, and hence the species, follows from the two. TOF × E is particularly effective for identifying light particles, especially low-energy light particles. As the particle mass increases, the relative differences in mass between neighboring species decrease, making them more difficult to discriminate. Additionally, isobars (i.e., nuclides with the same mass number but different atomic numbers) cannot be identified by the time-of-flight method alone; it must be combined with other methods for reliable identification.
  • The fourth method is the semiconductor sensor rising-edge energy method [5] (PSA × E) or the scintillator detector waveform analysis method [6,7]. The former uses the pulse waveforms generated by charged particles in a semiconductor sensor: high-speed digital acquisition produces a two-dimensional spectrum of energy (E) versus rising-edge time, from which particles are identified. The latter identifies neutrons and gamma rays from the luminescence waveforms they generate in a scintillator, using PSD analysis. Although these methods use waveform information, they use only a small portion of the waveform rather than all of it, so the results can be affected by fluctuations in particle energy deposition or by noise.
In addition, all four methods require the downlinking of in-orbit scientific data to the ground for particle identification analysis, which is not suitable for resource-constrained satellites, especially in the field of deep space exploration.
In this paper, we propose using the powerful classification ability of convolutional neural networks (CNNs) in machine learning to establish a particle identification platform. This platform includes a particle waveform data software preprocessing module, a CNN software training module, a particle waveform data FPGA forward inference module, and a CNN training set extension module. The main advantages of this CNN-based particle identification platform compared to traditional methods are as follows:
  • Convolutional neural networks are good at separating different features in the data and finally achieving classification;
  • Convolutional neural networks analyze the waveform as a whole, which carries more information than a few extracted parameters and is therefore expected to improve accuracy [8];
  • Convolutional neural networks have good scalability.
FPGAs offer high parallelism, low power consumption, and reconfigurability [9]; thus, they can realize real-time in-orbit processing and model updating, reduce the use of in-orbit resources, and improve the accuracy of detection.
By establishing a particle identification platform based on convolutional neural networks (CNNs), we can further expand traditional particle identification technology, realizing the technical route of “CNN + traditional particle identification technology”. This approach also improves the physical methods of traditional particle identification and interference rejection, enhancing the accuracy of particle identification and the effectiveness of interference rejection. Additionally, by building a generalized CNN onboard particle waveform signal identification FPGA platform, the in-orbit data rate can be greatly reduced, and both reconfigurable and scalable characteristics can be achieved. Therefore, the establishment of a CNN-based particle identification platform is of great significance for future space science exploration and engineering practice.

2. Platform Architecture

In onboard applications, real-time particle identification must not only be accurate but also satisfy the unique requirements of in-orbit operation, such as limited FPGA resources and dynamically changing space particle flux. Firstly, considering the limited onboard resources and the prevalence of FPGAs in space applications, it is especially critical to develop a lightweight CNN architecture for particle identification that is adapted to FPGA platforms. Secondly, the wide range of cosmic particle fluxes, especially during solar activity, necessitates an extremely low-latency network architecture to improve the system's responsiveness and dynamic range. Moreover, the network model is expected to be retrained into more accurate versions as the data are updated; therefore, a modular and reconfigurable FPGA architecture is also crucial.
To meet the above requirements, this paper builds a spaceborne FPGA platform for real-time particle identification and classification using a convolutional neural network. The platform is composed of four parts: the particle waveform data software preprocessing module, the convolutional neural network software training module, the particle waveform data FPGA forward inference module, and the convolutional neural network training set extension module. The platform architecture is shown in Figure 1.

2.1. Particle Waveform Data Software Preprocessing Module

The main function of this module is to process the particle waveform data obtained in orbit or from ground accelerators. Processing includes data normalization, random sampling, random cropping, noise addition, and frequency domain transformation, which improve the robustness and generalization of the model. The processed data are then divided into training, validation, and test data. The training data are used to train the weights and biases of the entire network; the validation data are used to determine whether the hyperparameters of the model are reasonable; and the test data are used for the final particle identification inference and to measure the accuracy of the model's inference.
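The normalization and dataset split described above can be sketched in a few lines. This is a minimal illustration rather than the platform's code: min-max scaling is an assumed normalization method (the text does not name one), the function names are hypothetical, and the 75% / 12.5% / 12.5% split fractions are chosen to match the 18,000 / 3000 / 3000 waveform counts reported in Section 3.1.

```python
import random

def normalize(waveform):
    """Min-max scale one waveform to [0, 1] (assumed method); a flat waveform maps to zeros."""
    lo, hi = min(waveform), max(waveform)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in waveform]
    return [(v - lo) / span for v in waveform]

def split_dataset(samples, fractions=(0.75, 0.125, 0.125), seed=0):
    """Shuffle the samples and split them into training, validation, and test subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_val = int(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # everything left over goes to the test set
    return train, val, test
```

Fixing the seed keeps the split reproducible, which matters when the validation set is reused across hyperparameter trials.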
To construct a software preprocessing module for particle waveform data, it is necessary to understand the characteristics of particle waveform data and to construct a training dataset based on these characteristics. Semiconductor detectors are the most commonly used means of charged particle detection. The electric field strength inside the detector gradually increases from the back end to the front end. When charged particles are incident from the back end, the plasma erosion time along the trajectory of heavier ions is longer because of their short range, higher energy loss rate, and the low electric field strength there. In charge carrier transport, electrons have a higher mobility than holes. Since the incidence is from the back end of the detector, the average distance traveled by the holes increases, which increases the charge carrier transport time. For a given incident energy, variations in the nuclear charge number and mass number of the particles lead to differences in charge collection time. Heavier particles produce current pulses with a longer duration, lower amplitude, longer charge rise time, and a longer time to return to zero, so different particles depositing energy in the sensor produce distinguishable waveforms [10,11,12,13,14].
For the neutral component, take neutrons and gamma rays as an example. When an organic scintillator is irradiated by neutrons or γ-rays, recoil protons or secondary electrons, respectively, are produced and excite luminescence. The light intensity rises quickly to its maximum, whereas the decay is relatively slow and approximately exponential. The luminous decay contains two components, fast and slow. When neutrons and γ-rays irradiate the same scintillator, they produce different ionization densities, so the intensity ratio of the fast and slow components differs. The share of the fast component in the proton-excited fluorescence generated by neutrons is lower than in the electron-excited fluorescence generated by γ-rays, while the share of the slow component is higher. This causes the difference in the decay time of the fluorescence generated by neutrons and γ-rays in the scintillator, which in turn leads to a difference in the shape of the generated pulse [15,16,17]. The waveforms of a charged particle are shown in Figure 2a, where i represents the current generated by the charged particles in the semiconductor sensor, q represents the amount of charge generated by the charged particles in the semiconductor sensor, and U represents the voltage output by the semiconductor sensor and the subsequent front-end preamplifier circuit. The waveforms of a neutral particle are shown in Figure 2b, where i represents the current generated by neutrons and gamma rays in the scintillator sensor.
Based on the above, the waveforms generated when particles interact with semiconductor sensors or scintillators vary in their rising or falling edges. The training dataset therefore needs to include the rising edges, falling edges, and full waveforms in order to extract the waveform characteristics comprehensively. The dataset is further expanded by data augmentation methods, such as random sampling, random cropping, noise addition, and frequency-domain transformations [18], to improve the model's generalization and robustness.
The data augmentation methods are implemented as follows: random downsampling of training data by controlling the extraction interval; random cropping of waveforms by randomizing the starting position and fixing the waveform length; adding Gaussian white noise to simulate real-world random noise; and frequency domain transformation using Fourier transform.
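The four augmentation steps listed above can be sketched as follows. This is an illustrative version using only the Python standard library; the function names, the uniform choice of extraction interval, and the use of the magnitude spectrum as the frequency-domain representation are assumptions, not details taken from the platform.

```python
import cmath
import math
import random

def random_downsample(waveform, max_interval, rng):
    """Keep every k-th sample, with the extraction interval k drawn from 1..max_interval."""
    step = rng.randint(1, max_interval)
    return waveform[::step]

def random_crop(waveform, length, rng):
    """Cut a fixed-length window at a random start; needs len(waveform) >= length."""
    start = rng.randint(0, len(waveform) - length)
    return waveform[start:start + length]

def add_gaussian_noise(waveform, sigma, rng):
    """Add zero-mean Gaussian white noise with standard deviation sigma to every sample."""
    return [v + rng.gauss(0.0, sigma) for v in waveform]

def dft_magnitude(waveform):
    """Magnitude spectrum from a direct O(n^2) discrete Fourier transform."""
    n = len(waveform)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(waveform)))
        for k in range(n)
    ]
```

Passing an explicit `random.Random` instance to each function makes an augmented dataset reproducible from a single seed. For long waveforms an FFT routine would replace the direct transform.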
The overall implementation roadmap of the software’s preprocessing is shown in Figure 3.

2.2. Convolutional Neural Network Software Training and Testing Module

The module consists of three parts: the hyperparameter search module, the training module, and the test and validation module. In building a convolutional neural network, the setting of hyperparameters is a key factor affecting the accuracy of training and final inference. Conventional hyperparameters include the convolutional kernel size, stride, padding, activation function, number of convolutional layers, number of fully connected layers, loss function, number of iterations, learning rate, optimizer, batch size, and dropout rate; combining and testing these manually is very inefficient. Optuna (1.0.0) is an automated machine learning framework focused on hyperparameter optimization, aiming to improve the performance of machine learning models. Its key strengths include a highly flexible design that allows users to customize complex optimization logic and parameter search spaces; an intuitive and powerful API that ensures ease of use and integration; and improved optimization efficiency through effective sampling strategies and pruning techniques. These features have made Optuna a widely adopted tool in research and industry [19,20,21].
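The kind of search that Optuna automates can be illustrated with a plain random search. The sketch below is a stand-in written with the Python standard library only; it does not use Optuna's API or reproduce its sampling and pruning strategies. The search space values and all function names are hypothetical, and the `objective` callable stands for "train the model with these parameters and return the validation accuracy".

```python
import math
import random

# Hypothetical search space; the ranges actually used are given in the article's Table 1.
SEARCH_SPACE = {
    "optimizer": ["SGD", "Adam", "RMSprop"],
    "batch_size": [32, 64, 128, 256],
    "learning_rate": (1e-4, 1e-1),  # sampled log-uniformly
    "dropout": (0.0, 0.5),          # sampled uniformly
}

def sample_params(rng):
    """Draw one hyperparameter combination from the search space."""
    lr_lo, lr_hi = SEARCH_SPACE["learning_rate"]
    d_lo, d_hi = SEARCH_SPACE["dropout"]
    return {
        "optimizer": rng.choice(SEARCH_SPACE["optimizer"]),
        "batch_size": rng.choice(SEARCH_SPACE["batch_size"]),
        "learning_rate": math.exp(rng.uniform(math.log(lr_lo), math.log(lr_hi))),
        "dropout": rng.uniform(d_lo, d_hi),
    }

def random_search(objective, n_trials, seed=0):
    """Evaluate n_trials random combinations and keep the one with the highest score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Sampling the learning rate on a logarithmic scale is a common choice because useful values span several orders of magnitude.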
After determining the optimal hyperparameter combination through Optuna, the particle training waveform data and validation waveform data need to be subjected to forward propagation computation, loss function computation with backward gradient computation, and final training.
The test validation module is used to analyze the accuracy and ability of the particle discrimination model to achieve classification and discrimination under the current training state. By inputting a preprocessed hybrid test dataset, the accuracy of the validation and inference results can be obtained so as to understand the effectiveness of the model training and to provide data support for determining the model’s hyperparameters, as well as determining the model architecture. In practice, the key variables include training set accuracy, loss function value, validation set accuracy, and test set accuracy. The training set accuracy is calculated as shown in Equation (1):
$$\mathrm{TRC}_{per} = \frac{\mathrm{TRC}_{num}}{\mathrm{TR}_{num}} \tag{1}$$
where TRCper denotes training accuracy; TRCnum is the number of training data forward inference results that match with labels; and TRnum is the total number of training data.
The loss function value is calculated as shown in Equation (2):
$$\mathrm{AVE}_{loss} = \frac{\mathrm{T}_{loss}}{\mathrm{TR}_{num}} \tag{2}$$
where AVEloss is the overall average loss function value for training; Tloss is the total loss function value; and TRnum is the total number of training data.
The validation set accuracy is calculated as shown in Equation (3):
$$\mathrm{MDC}_{per} = \frac{\mathrm{MDC}_{num}}{\mathrm{MD}_{num}} \tag{3}$$
where MDCper is the validation set accuracy; MDCnum is the number of validation set forward inference results that match with labels; and MDnum is the total number of the validation set’s data.
The test set accuracy is calculated as shown in Equation (4):
$$\mathrm{TSC}_{per} = \frac{\mathrm{TSC}_{num}}{\mathrm{TS}_{num}} \tag{4}$$
where TSCper is the test set accuracy; TSCnum is the number of test set forward inference results that match with labels; and TSnum is the total number of test set data.
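Equations (1) to (4) share two forms: a ratio of matching inference results to the total count, and a total loss divided by the number of training data. A minimal sketch, with hypothetical function names:

```python
def accuracy(predictions, labels):
    """Equations (1), (3), (4): matching forward-inference results / total number of data."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    matches = sum(1 for p, y in zip(predictions, labels) if p == y)
    return matches / len(labels)

def average_loss(per_sample_losses):
    """Equation (2): total loss function value / number of training data."""
    if not per_sample_losses:
        raise ValueError("at least one sample is required")
    return sum(per_sample_losses) / len(per_sample_losses)
```

The same `accuracy` function serves the training, validation, and test sets; only the data passed in change.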

2.3. Particle Waveform Data FPGA Forward Inference Module

The particle waveform data FPGA forward inference module is the key link for realizing the real-time identification of particle waveforms. To meet task requirements, it must be reconfigurable, portable, and flexibly configurable, with reasonable resource consumption and inference time.
Combined with the obtained fixed-point weights and bias parameters of each CNN layer, the construction process of the FPGA forward inference module for particle waveform data is shown in Figure 4.
To ensure the correctness of forward inference in the FPGA-based CNN architecture, the parameters obtained from training with PyTorch (1.0.0) need to be exported and converted to fixed-point form for input into the FPGA, and the exported parameters must be verified. Given the complexity of performing this verification on the FPGA, a forward inference model is reconstructed in MATLAB. It imports and exports the parameters; computes the convolutional, pooling, and fully connected layers; and produces the output-layer result for each category and the final classification result. Finally, the results of the PyTorch and MATLAB calculations are compared to verify the correctness of the exported parameters.
Fixed-point quantization is not only a necessary step when migrating digital models from software environments to hardware implementations, especially when employing field-programmable gate arrays (FPGAs), but also provides multiple advantages [22,23,24]. Firstly, FPGAs have a limited number of logic cells, and fixed-point quantization helps reduce the logic resources required, enabling more compact hardware designs. Secondly, fixed-point algorithms simplify the hardware implementation of the algorithms compared to floating-point algorithms, reducing processing delays and improving the speed and efficiency of data processing. Furthermore, fixed-point algorithms are usually more energy efficient than floating-point algorithms, which is especially important for power-sensitive application areas. Through quantization, a balance between accuracy and hardware resources can be found, allowing complex algorithms to be implemented on FPGAs, which is crucial for many resource-constrained embedded systems and high-performance computing applications. The convolutional kernel, bias, and input data in this architecture are quantized using a 16-bit fixed-point quantization strategy to improve the inference speed and reduce the resource footprint.
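A 16-bit fixed-point conversion of the kind described above can be sketched as follows. The text states only that 16-bit fixed-point numbers are used; the split into 8 integer and 8 fractional bits (`frac_bits=8`), the saturation behavior, and the function names are assumptions made for illustration.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1

def saturate16(value):
    """Clamp an integer to the signed 16-bit range instead of letting it wrap around."""
    return max(INT16_MIN, min(INT16_MAX, value))

def to_fixed16(value, frac_bits=8):
    """Quantize a float to a signed 16-bit fixed-point integer (assumed Q-format)."""
    return saturate16(round(value * (1 << frac_bits)))

def from_fixed16(q, frac_bits=8):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)

def fixed_mul(a, b, frac_bits=8):
    """Multiply two fixed-point values and rescale the wide product to the same format."""
    product = a * b                           # up to 32 bits wide before rescaling
    return saturate16(product >> frac_bits)   # arithmetic shift rounds toward minus infinity
```

Choosing the number of fractional bits is the accuracy versus range trade-off mentioned in the text: more fractional bits give finer resolution but a smaller representable range before saturation.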
At present, FPGA forward inference modules are typically built with tools such as high-level synthesis (HLS) or Vitis AI, which use compilers and optimizers to automatically turn the model architecture into a deployable hardware bitstream [25,26]. This approach, despite its many advantages, is not fully applicable to satellite-borne devices. Firstly, the transparency and maintainability of the code are a high priority in onboard applications, because they facilitate error detection, system validation, and radiation-hardened design. Secondly, onboard devices tend to be more constrained in power consumption than ground-based devices, which calls for fine optimization and tuning of the design at the RTL level. In addition, onboard equipment often has customized functional requirements, and designing at the RTL level makes it easier to integrate the different parts and improve operational efficiency. Finally, most SoC or AI chips at this stage are not radiation-hardened and thus cannot meet the requirements for in-orbit operation. Therefore, the FPGA forward inference CNN module in this architecture is developed at the RTL level.
The essence of the forward inference calculation of a convolutional neural network is a large number of multiplication and addition operations, together with truncation, shift, concatenation, and other operations on register arrays; thus, the DSP modules inside the FPGA are used to perform the calculations in parallel. At the same time, because particle fluxes in actual particle identification scenarios can be large, the inference time must be reduced as much as possible. Therefore, when resources allow, the FPGA on-chip registers are used directly to store the parameters, minimizing data access operations and reducing the delay of forward inference. All data are quantized as 16-bit fixed-point numbers, which reduces the complexity of the multiply-add operations without causing data overflow, and a parallel computation strategy is adopted to meet the inference time requirements of the actual application environment. Because the convolutional neural network model may change, and the parallel computation strategy will change with the model and with the resources of the hardware platform, the modular design of the entire structure should be as flexible as possible.
The kernel architecture of the convolutional layer is shown in Figure 5.
The computation of the convolutional layer consists of four main levels, which are the computation of the elements inside the convolutional kernel, the computation of the convolutional kernel corresponding to the different input channels of a single filter, the computation of the sliding of a single filter on one-dimensional data, and the computation of multiple filters. Based on the current architectural design of convolutional layer computation, the elements within the convolutional kernel, the one-dimensional data sliding computation, and the computation of multiple output channels can be fully or partially parallelized, whereas the convolutional kernel computation of the input channels is designed to be serial due to the need to accumulate computation between multiple channels.
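The four levels of computation described above map directly onto a four-deep loop nest. The sketch below is a plain software reference for a one-dimensional convolution, not the RTL design; stride 1, zero padding, and the function name are assumptions, and no activation function is applied.

```python
def conv1d(inputs, weights, biases, padding=1):
    """Reference 1D convolution.

    inputs:  [in_channels][length]
    weights: [out_channels][in_channels][kernel_size]
    biases:  [out_channels]
    """
    kernel_size = len(weights[0][0])
    length = len(inputs[0])
    padded = [[0] * padding + list(ch) + [0] * padding for ch in inputs]
    out_len = length + 2 * padding - kernel_size + 1
    outputs = []
    for f, filt in enumerate(weights):          # level 4: multiple filters (output channels)
        row = []
        for pos in range(out_len):              # level 3: sliding of one filter along the data
            acc = biases[f]
            for c, kernel in enumerate(filt):   # level 2: input channels, accumulated serially
                for i, w in enumerate(kernel):  # level 1: elements inside the kernel
                    acc += w * padded[c][pos + i]
            row.append(acc)
        outputs.append(row)
    return outputs
```

In this form the loops over kernel elements, sliding positions, and filters have independent iterations and can be unrolled into parallel hardware, whereas the input-channel loop feeds a single accumulator, which is why the text treats it as serial.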
Convolutional computation is the main consumer of chip resources and time, so quantifying the computing resources and the time delay is a key step in building the platform. Since the convolutional operation is mainly a multiply-add operation with a high parallelism requirement, the dedicated DSP modules within the FPGA are used to construct the operation unit (PE unit). The relationship between the number of DSP modules and the degree of parallelism is shown in Equation (5):
DSPNum = k × n × m
where DSPNum is the number of DSP modules of the FPGA that need to be consumed, k represents the degree of parallelism within the convolutional kernel, n represents the degree of the sliding window’s parallelism, and m represents the degree of the output channel’s parallelism.
The inference time depends on how serially the architecture runs, i.e., how many operations each layer requires to produce its output data. Each computation requires one clock cycle, and the number of clock cycles required to compute a convolutional layer is shown in Equation (6):
$$\mathrm{ClkNum} = x \times y \times z \times a + b \tag{6}$$
where ClkNum is the number of the running clk to be consumed; x represents the degree of serialization within the convolutional kernel; y represents the degree of the serialization of the input channel element computation; z represents the degree of the serialization of the sliding window; a represents the degree of the serialization of the output channel; and b is the number of extra clocks due to synchronization or some bias computation, as determined by the actual program.
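Equations (5) and (6) can be evaluated together to explore the trade-off between DSP usage and latency. In the sketch below, the relation "serial degree = ceil(total / parallel degree)" is an assumption used to connect the two equations, and the layer dimensions and the 100 MHz clock in the usage note are hypothetical values, not figures from the platform.

```python
import math

def conv_dsp_count(k, n, m):
    """Equation (5): DSP modules = kernel x sliding-window x output-channel parallelism."""
    return k * n * m

def conv_clock_count(x, y, z, a, b):
    """Equation (6): clock cycles = product of the four serial degrees + extra clocks b."""
    return x * y * z * a + b

def serial_degree(total, parallel):
    """Assumed relation: passes needed when `parallel` out of `total` items run at once."""
    return math.ceil(total / parallel)

def latency_us(clocks, clock_mhz):
    """Convert a clock count to microseconds for a given clock frequency in MHz."""
    return clocks / clock_mhz
```

As a hypothetical example, take a layer with kernel size 3, 4 input channels, 128 output positions, and 8 output channels, configured with k = 3, n = 16, and m = 4. This uses `conv_dsp_count(3, 16, 4)` = 192 DSP modules. With x = 1, y = 4, z = `serial_degree(128, 16)` = 8, a = `serial_degree(8, 4)` = 2, and b = 2, the layer takes `conv_clock_count(1, 4, 8, 2, 2)` = 66 clocks, or 0.66 µs at an assumed 100 MHz clock.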
The kernel architecture of the pooling layer is shown in Figure 6.
The pooling layer is relatively simple because this architecture uses max pooling with a pooling size of 2. The length of each input channel is therefore halved after pooling, and the larger of each pair of neighboring values is taken as the output value. The pooling operation processes the data within a single input channel in parallel and processes different input channels serially. Finally, all data are flattened and output.
The pooling operation consists purely of comparator processing and therefore does not consume DSP modules. Its time delay is calculated as shown in Equation (7):
$$\mathrm{ClkNum}_{pool} = \mathrm{clk}_{pool} \times \mathrm{clk}_{channel} + c \tag{7}$$
where clkpool is the serial degree of single-channel input data pooling; clkchannel is the serial degree of multi-channel input pooling processing; and c is the number of extra clocks due to synchronization as determined by the actual program.
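The pooling and flattening steps, together with Equation (7), can be sketched as follows. The function names are hypothetical, the channel-major flattening order is an assumption, and the clock values in the usage note are illustrative only.

```python
def max_pool1d(channels, size=2):
    """Max pooling with window and stride `size`; each channel's length is divided by size."""
    return [
        [max(ch[i:i + size]) for i in range(0, len(ch) - size + 1, size)]
        for ch in channels
    ]

def flatten(channels):
    """Concatenate all channels into one vector (channel-major order, an assumption)."""
    return [v for ch in channels for v in ch]

def pool_clock_count(clk_pool, clk_channel, c):
    """Equation (7): single-channel serial degree x multi-channel serial degree + extra clocks."""
    return clk_pool * clk_channel + c
```

If one channel is pooled fully in parallel (clk_pool = 1) and 8 channels are processed one after another (clk_channel = 8) with c = 2, the layer takes `pool_clock_count(1, 8, 2)` = 10 clocks.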
The kernel architecture of the fully connected layer is shown in Figure 7.
The fully connected layer processes all of the flattened data obtained after convolution and pooling and maps them to the final classification outputs. Because the computation is fully connected, every flattened input value has a corresponding weight for each output; thus, increasing the parallelism of this layer directly and significantly increases the number of DSP modules used.
The number of DSP modules used in the fully connected layer is shown in Equation (8):
$$\mathrm{DSPnum}_{fullconnect} = \mathrm{DSP}_{muti} \times \mathrm{DSP}_{outputnode} \tag{8}$$
where DSPmuti is the input node’s multiplication computation parallelism and the DSPoutputnode is the output category computation parallelism.
The fully connected layer's inference time is calculated as shown in Equation (9):
$$\mathrm{ClkNum}_{fullconnect} = (\mathrm{clk}_{muti} + 1) \times \mathrm{clk}_{outputnode} + k \tag{9}$$
where clkmuti is the multiplicative computation seriality, i.e., the number of times a single channel completes the computation, and the +1 accounts for the final bias addition; clkoutputnode is the output category computation seriality; and k is the number of extra clocks due to synchronization as determined by the actual program.
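The fully connected computation and its resource and timing estimates from Equations (8) and (9) can be sketched together. The function names and the numbers in the usage note are hypothetical.

```python
def fully_connected(flat_inputs, weights, biases):
    """One fully connected layer; weights has shape [n_outputs][n_inputs]."""
    return [
        sum(w * x for w, x in zip(row, flat_inputs)) + b
        for row, b in zip(weights, biases)
    ]

def classify(scores):
    """Return the index of the output category with the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])

def fc_dsp_count(dsp_muti, dsp_outputnode):
    """Equation (8): input-multiplication parallelism x output-category parallelism."""
    return dsp_muti * dsp_outputnode

def fc_clock_count(clk_muti, clk_outputnode, k):
    """Equation (9): (multiplication passes + 1 bias addition) x output passes + extra clocks."""
    return (clk_muti + 1) * clk_outputnode + k
```

For a two-category output computed fully in parallel (clk_outputnode = 1) with 32 multipliers per category, `fc_dsp_count(32, 2)` gives 64 DSP modules. If the multiplications need 8 passes and k = 2, `fc_clock_count(8, 1, 2)` gives 11 clocks.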

2.4. Convolutional Neural Network Training Set Extension Module

Forward inference yields the classification results for different particle waveforms. These results are evaluated and compared with the conventional PSD method using the figure of merit (FOM) [17], which is calculated as shown in Equation (10):
$$\mathrm{FoM} = \frac{|P_1 - P_2|}{\mathrm{FWHM}_1 + \mathrm{FWHM}_2} \tag{10}$$
where P1 is the peak position of the CNN classification count distribution or PSD parameter distribution of the first particle, P2 is the corresponding peak position for the second particle, FWHM1 is the full width at half maximum of the first particle's parameter distribution curve, and FWHM2 is the full width at half maximum of the second particle's parameter distribution curve.
In practice, the inferred results are filtered and expanded into a secondary training dataset according to a specified expectation threshold as a way to further improve the accuracy of the model.
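The figure of merit and the threshold-based selection can be sketched as follows. The FOM is taken in its usual form, the separation between the two peaks divided by the sum of the two FWHM values. The Gaussian relation FWHM = 2·sqrt(2·ln 2)·σ applies only if the distributions are approximately Gaussian, and the tuple layout `(waveform, predicted_label, expectation)` and the function names are hypothetical.

```python
import math

def figure_of_merit(p1, p2, fwhm1, fwhm2):
    """Peak separation divided by the sum of the full widths at half maximum."""
    return abs(p1 - p2) / (fwhm1 + fwhm2)

def gaussian_fwhm(sigma):
    """FWHM of a Gaussian with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma

def select_for_retraining(results, threshold):
    """Keep inferred waveforms whose classification expectation reaches the threshold.

    results: iterable of (waveform, predicted_label, expectation) tuples (assumed layout).
    Returns (waveform, predicted_label) pairs for the secondary training dataset.
    """
    return [(w, label) for w, label, expectation in results if expectation >= threshold]
```

A higher threshold admits fewer waveforms but lowers the risk of adding mislabeled examples to the secondary training dataset.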

3. Application and Results

In this section, the measured data of neutron and gamma waveforms are utilized to implement the application of software training, FPGA platform inference, and training set extension of neutron and gamma identification models through a convolutional neural network particle identification platform.

3.1. Neutron and Gamma Preprocessing Data Construction

Neutron and gamma data were taken from waveforms generated by the experimental platform built around the CLYC scintillator sensor [6]. To keep the input size of the convolutional neural network consistent, every input waveform was set to a length of 128 samples and normalized. Subsequently, data augmentation methods such as random cropping, noise addition, and downsampling were used to generate 18,000 training waveforms (9000 each for neutron and gamma), 3000 validation waveforms (1500 each for neutron and gamma), and 3000 test waveforms (1500 each for neutron and gamma), as shown in Figure 8, Figure 9 and Figure 10.

3.2. Neutron and Gamma Convolutional Neural Network Model Construction

The platform's model is based on LeNet-5, a lightweight CNN architecture originally used primarily for handwritten digit recognition, which includes three convolutional layers, two pooling layers, and two fully connected layers. To further reduce the model's complexity, three changes are made: the model uses only one fully connected layer, which significantly reduces computation and hardware resources; it adopts a small 1 × 3 convolutional kernel, which enhances feature extraction and reduces the amount of computation; and it adds a padding operation, which retains edge information and maintains the output size. To reduce the hyperparameter search space and the number of convolutional layer output channels, the activation function and pooling operation are no longer searched, and the model architecture in Figure 11 is directly adopted.
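The effect of the 1 × 3 kernel, the padding, and the size-2 pooling on the data dimensions can be traced with a short calculation. The sketch assumes stride 1, a padding of 1, an input length of 128, and the layer order convolution, pooling, convolution, pooling, convolution, fully connected. The channel counts (4, 8, 16) are hypothetical placeholders, since the actual values are defined in the article's Figure 11.

```python
def conv_out_len(length, kernel=3, padding=1, stride=1):
    """Output length of a 1D convolution."""
    return (length + 2 * padding - kernel) // stride + 1

def pool_out_len(length, size=2):
    """Output length of non-overlapping pooling."""
    return length // size

def trace_shapes(input_len=128, channels=(4, 8, 16)):
    """List (layer, channels, length) for conv1, pool1, conv2, pool2, conv3, flatten.

    `channels` holds hypothetical output channel counts for the three conv layers.
    """
    length = input_len
    shapes = []
    for i, ch in enumerate(channels):
        length = conv_out_len(length)
        shapes.append((f"conv{i + 1}", ch, length))
        if i < 2:  # only the first two convolutional layers are followed by pooling
            length = pool_out_len(length)
            shapes.append((f"pool{i + 1}", ch, length))
    shapes.append(("flatten", 1, channels[-1] * length))
    return shapes
```

With a kernel of 3 and a padding of 1, each convolution preserves the length, so only the two pooling layers shrink the data: 128 → 64 → 32. Under the placeholder channel counts, the single fully connected layer would receive 16 × 32 = 512 inputs.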
The optimization algorithm, learning rate, batch size, and dropout rate are searched using the Optuna framework, and the search ranges are shown in Table 1.
Based on the search results, the classification accuracy essentially stops increasing after 500 trials. Among the trials with higher classification accuracy, the batch size is mainly 64 or 128, and the optimization algorithm is mostly Adam. Therefore, the batch size was restricted to 64 and 128 and the optimization algorithm was fixed to Adam for a new round of 500 trials; the best results are shown in Table 2.
Based on the best result of this search, the batch size is set to 128, the learning rate to 0.0026, and the dropout rate to 0.1197, and 1500 iterations of training are performed. The training accuracy, loss function value, and validation set accuracy curves are shown in Figure 12.
Using the trained convolutional neural network, forward inference was carried out on the test waveforms, covering the full-waveform data, the rising-edge data, the falling-edge data, and the overall test set. The evaluation results are shown in Table 3.
As Table 3 shows, the test accuracy of the trained convolutional neural network exceeds 97% on the rising-edge, falling-edge, and full-waveform data.

3.3. Particle Waveform Data FPGA Forward Inference Module

The premise of FPGA forward inference on particle waveform data is that the exported weight data are correct. The same forward inference model is constructed in PyTorch and MATLAB (1.0.0) and applied to arbitrarily selected neutron and gamma waveforms, and the two sets of outputs are compared to determine whether the exported parameters are correct. The results of the comparison are shown in Table 4.
The PyTorch and MATLAB results agree to the four decimal places shown.
The overall framework is built in Vivado and comprises convolutional layer 1, pooling layer 1, convolutional layer 2, pooling layer 2, convolutional layer 3, and the fully connected layer.
The number of DSP modules required per layer is shown in Table 5.
Based on Formulas (5) and (8), in convolution layer 1, the convolution kernel operates serially with k1 = 1, the sliding window computes 64 elements simultaneously with n1 = 64, and two output channels are computed simultaneously with m1 = 2. In convolution layer 2, the convolution kernel operates serially with k2 = 1, the sliding window computes 32 elements simultaneously with n2 = 32, and two output channels are computed simultaneously with m2 = 2. In convolution layer 3, the convolution kernel operates serially with k3 = 1, the sliding window computes 16 elements simultaneously with n3 = 16, and two output channels are computed simultaneously with m3 = 2. In the fully connected layer, eight multiplications are performed per clock cycle with DSPmuti = 8, and there are two output categories with DSPoutputnode = 2.
The comparison shows that the calculated number of DSPs matches the number actually occupied.
The required inference time is calculated as shown in Table 6.
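The DSP budget can be reproduced with a few lines of arithmetic. The sketch below simply evaluates the products quoted in the text and in Table 5; the dictionary layout and variable names are illustrative, and the 900 available DSP slices are taken from Table 7.

```python
# (kernel parallelism k, sliding-window parallelism n, output-channel parallelism m)
conv_parallelism = {
    "conv1": (1, 64, 2),
    "conv2": (1, 32, 2),
    "conv3": (1, 16, 2),
}

# One DSP per multiplication that happens in the same clock cycle.
dsp = {name: k * n * m for name, (k, n, m) in conv_parallelism.items()}
dsp["fc"] = 8 * 2  # DSPmuti x DSPoutputnode

total_dsp = sum(dsp.values())
utilization = 100 * total_dsp / 900  # 900 DSPs available (Table 7)
print(dsp, total_dsp, round(utilization, 2))
```

Doubling any of the parallelism factors doubles the DSP count of that layer, which is the lever the text refers to when trading resources against latency.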
In Table 6, b1 represents the clock cycles for the first convolution layer used for synchronizing data or bias calculations. c1 represents the clock cycles for the first pooling layer used for synchronizing data. b2 represents the clock cycles for the second convolution layer used for synchronizing data or bias calculations. c2 represents the clock cycles for the second pooling layer used for synchronizing data. b3 represents the clock cycles for the third convolution layer used for synchronizing data or bias calculations. k represents the clock cycles for the fully connected layer used for synchronizing data. The above data are all determined by the actual program.
In convolution layer 1, the convolution kernel operates serially, requiring three clock cycles (x1 = 3), with one input channel (y1 = 1). The sliding window computation requires two passes (z1 = 2), and the output channel computation also requires two passes (a1 = 2). For pooling layer 1, the pooling operation requires one clock cycle (clk1pool = 1), with input channels operating serially over four cycles (clk1channel = 4). In convolution layer 2, the convolution kernel operates serially, requiring three clock cycles (x2 = 3), with four input channels (y2 = 4). The sliding window computation requires two passes (z2 = 2), and the output channel computation requires four passes (a2 = 4). For pooling layer 2, the pooling operation requires one clock cycle (clk2pool = 1), with input channels operating serially over eight cycles (clk2channel = 8). In convolution layer 3, the convolution kernel operates serially, requiring three clock cycles (x3 = 3), with eight input channels (y3 = 8). The sliding window computation requires two passes (z3 = 2), and the output channel computation requires four passes (a3 = 4). In the fully connected layer, all parameters flatten to 256 elements. Each cycle performs eight multiplications; thus, the total number of multiplication clock cycles is clkmuti = 256/8 = 32. The eight multiplications are completed in one clock cycle, so clkoutputnode = 1.
The comparison shows that the actual time required exceeds the theoretical calculation by more than 100 clock cycles; these extra cycles are used for data synchronization, assignment, and other operations.
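The cycle counts can be checked with the short sketch below, which evaluates the compute term of each formula in Table 6 and subtracts it from the measured value to estimate the per-layer synchronization overhead. The layer names are illustrative. Summing the compute terms exactly as listed gives 345 cycles (Table 6 prints 341 for this subtotal), so the overhead computed here is 145 cycles.

```python
CLOCK_HZ = 100e6  # CLK_100M in Table 6

# (compute cycles from the formula, measured cycles from Table 6)
layers = {
    "conv1": (3 * 1 * 2 * 2, 42),
    "pool1": (1 * 4, 10),
    "conv2": (3 * 4 * 2 * 4, 145),
    "pool2": (1 * 8, 18),
    "conv3": (3 * 8 * 2 * 4, 241),
    "fc": ((256 // 8 + 1) * 1, 34),
}

# Synchronization and bias cycles (b1, c1, b2, c2, b3, k) implied by the measurements.
overhead = {name: measured - ideal for name, (ideal, measured) in layers.items()}
ideal_total = sum(ideal for ideal, _ in layers.values())
measured_total = sum(measured for _, measured in layers.values())
latency_us = measured_total * 1e6 / CLOCK_HZ

print(overhead)
print(ideal_total, measured_total, latency_us)
```

At 100 MHz each cycle lasts 10 ns, so the 490 measured cycles correspond to the 4.9 µs single-waveform latency reported for the platform.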
Through layout and routing, the final results of FPGA resource utilization can be obtained, as shown in Table 7. The reason for the high LUT resource usage is that both the model weights and intermediate variables are stored using registers to maximize inference speed. The utilization rate of DSP modules can be further improved by implementing parallel computation within the convolution kernels or increasing the number of multiplication operations per clock cycle in the fully connected layers.
The actual operating platform is the AX7z035 ZYNQ development board, which uses a 422 expansion board and a USB-to-422 converter to input waveform data and output and save the forward inference results. Additionally, the program for the development board is loaded in real-time by an FPGA programmer, and the board is powered by a 220 V to 12 V transformer. The overall hardware architecture is shown in Figure 13.
The FPGA inference results are evaluated with two comparisons. The first compares the FPGA inference results with the labels of the test dataset to determine the actual accuracy. The second compares the FPGA inference results with the floating-point results output by PyTorch to measure the difference between fixed-point and floating-point computation and thus assess the usability of the fixed-point results.
The results of the analysis with the full waveform test dataset are shown in Table 8.
Table 8 shows that the accuracy of fixed-point forward inference is only 0.1 percentage points lower than that of floating-point computation, and the floating-point and fixed-point results agree for 99.9% of the waveforms.
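The three figures in Table 8 are pairwise agreement rates between the test labels, the floating-point predictions, and the fixed-point predictions. A minimal sketch of that bookkeeping follows; the function names and the ten-element toy label lists are hypothetical and are not the paper's data.

```python
def agreement(a, b):
    """Fraction of positions at which two label sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def compare(labels, float_pred, fixed_pred):
    """Return the three compliance rates reported in Table 8."""
    return {
        "float_vs_labels": agreement(labels, float_pred),
        "fixed_vs_labels": agreement(labels, fixed_pred),
        "float_vs_fixed": agreement(float_pred, fixed_pred),
    }


# Toy example: 0 = neutron, 1 = gamma.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
float_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]  # one error
fixed_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]  # one additional flipped decision
rates = compare(labels, float_pred, fixed_pred)
print(rates)
```

If the full-waveform test set holds 1000 waveforms, as the totals in Tables 9 and 10 suggest, the 0.1-point gap corresponds to a single waveform whose decision changes under fixed-point arithmetic.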

3.4. Convolutional Neural Network Training Set Extension and Update

From the forward inference results of the test dataset, the probability of each of the two classes can be obtained for every test waveform. The expectations of the classification labels for neutron and gamma are calculated separately, and the specific calculation method is shown in Equation (11):
$$E = \frac{\exp(\mathrm{output}_{\mathrm{neutron}})}{\exp(\mathrm{output}_{\mathrm{neutron}}) + \exp(\mathrm{output}_{\mathrm{gamma}})} \times \mathrm{label}_{\mathrm{neutron}} + \frac{\exp(\mathrm{output}_{\mathrm{gamma}})}{\exp(\mathrm{output}_{\mathrm{neutron}}) + \exp(\mathrm{output}_{\mathrm{gamma}})} \times \mathrm{label}_{\mathrm{gamma}} \tag{11}$$
where E is the expected value of the particle inference result; $\mathrm{label}_{\mathrm{neutron}}$ is the label for neutron classification by the CNN, which is 0; and $\mathrm{label}_{\mathrm{gamma}}$ is the label for gamma classification by the CNN, which is 1. The exponential terms form the SoftMax calculation for the neutron and gamma classification results, which yields the classification probabilities for neutron and gamma.
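Equation (11) can be evaluated directly from the two network outputs, as in the sketch below. The example inputs are two rows of Table 4, and treating each pair as (neutron output, gamma output) is an assumption; subtracting the maximum before exponentiating is a standard numerical safeguard that does not change the result.

```python
import math

LABEL_NEUTRON = 0
LABEL_GAMMA = 1


def expected_label(out_neutron, out_gamma):
    """SoftMax-weighted expectation of the class label, as in Equation (11)."""
    m = max(out_neutron, out_gamma)  # shift for numerical stability
    e_n = math.exp(out_neutron - m)
    e_g = math.exp(out_gamma - m)
    p_neutron = e_n / (e_n + e_g)
    p_gamma = e_g / (e_n + e_g)
    return p_neutron * LABEL_NEUTRON + p_gamma * LABEL_GAMMA


e_first = expected_label(2.0075, -3.8455)   # first row of Table 4
e_third = expected_label(-8.7143, 7.0735)   # third row of Table 4
print(e_first, e_third)
```

Because the neutron label is 0, the expectation reduces to the SoftMax probability of the gamma class, so values near 0 indicate a confident neutron and values near 1 a confident gamma.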
Table 9 shows the expected distribution of neutron and gamma in the final output of the CNN test dataset.
The distributions of the classification expectations are plotted in Figure 14a,b.
The method for calculating the PSD parameter using pulse waveform analysis [17] is shown in Equation (12):
$$PSD = \frac{Q_t - Q_p}{Q_t} \tag{12}$$
where Qt is the total window charge integral and Qp is the prompt window charge integral. The schematic diagram is shown in Figure 15.
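A charge-integration sketch of this parameter is shown below, reading Equation (12) as the delayed fraction of the total charge, (Qt − Qp)/Qt. The window start, the window lengths, and the two synthetic exponential pulses are hypothetical values chosen for illustration; they are not the windows of Figure 15.

```python
import numpy as np


def psd_parameter(waveform, start, prompt_len, total_len):
    """Return (Qt - Qp) / Qt for windows that both begin at index start."""
    w = np.asarray(waveform, dtype=np.float64)
    if not 0 < prompt_len < total_len:
        raise ValueError("prompt window must be shorter than the total window")
    q_p = w[start:start + prompt_len].sum()  # prompt window charge integral
    q_t = w[start:start + total_len].sum()   # total window charge integral
    if q_t == 0:
        raise ValueError("total charge is zero")
    return (q_t - q_p) / q_t


t = np.arange(128)
fast_pulse = np.exp(-t / 10.0)  # synthetic short-decay pulse
slow_pulse = np.exp(-t / 40.0)  # synthetic long-decay pulse

psd_fast = psd_parameter(fast_pulse, start=0, prompt_len=16, total_len=128)
psd_slow = psd_parameter(slow_pulse, start=0, prompt_len=16, total_len=128)
print(psd_fast, psd_slow)
```

A pulse with a longer decay keeps more of its charge outside the prompt window and therefore yields a larger PSD value, which is what makes the parameter usable for discrimination.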
By calculating the PSD parameters for all neutron and gamma test waveforms, a table of PSD parameters and the frequency of each interval can be obtained, as shown in Table 10.
The distribution is plotted in Figure 16.
From the calculations, the FOM of the CNN classification expectation for neutron and gamma is 133, while the FOM of the PSD method is 0.8. In addition, the waveforms with neutron expectation values less than 0.05 and gamma expectation values greater than 0.95 are selected as the extended dataset, which contains 463 neutrons and 466 gamma rays.
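The size of the extended dataset can be cross-checked against Table 9. The sketch below reads the rows of Table 9 as (expected value, neutron count, gamma count) and sums the bins on the confident side of each threshold; including the bins labelled exactly 0.05 and 0.95 is an assumption about how the bin edges are defined, chosen because it reproduces the reported counts.

```python
# Rows of Table 9 at or beyond the two thresholds:
# (expected value, neutron count, gamma count)
low_bins = [
    (0.0, 0, 0), (0.005, 317, 2), (0.01, 51, 1), (0.015, 29, 0),
    (0.02, 19, 0), (0.025, 16, 0), (0.03, 7, 0), (0.035, 8, 0),
    (0.04, 8, 0), (0.045, 4, 0), (0.05, 4, 1),
]
high_bins = [
    (0.95, 0, 1), (0.955, 1, 2), (0.96, 2, 1), (0.965, 0, 1),
    (0.97, 0, 1), (0.975, 0, 2), (0.98, 0, 2), (0.985, 1, 0),
    (0.99, 0, 3), (0.995, 0, 2), (1.0, 3, 451),
]

NEUTRON_MAX = 0.05  # neutrons are kept when the expectation is this low
GAMMA_MIN = 0.95    # gammas are kept when the expectation is this high

neutrons_selected = sum(n for e, n, _ in low_bins if e <= NEUTRON_MAX)
gammas_selected = sum(g for e, _, g in high_bins if e >= GAMMA_MIN)
print(neutrons_selected, gammas_selected)
```

Only waveforms whose expectation lies close to their own class label are kept, so the extension adds confidently classified samples and leaves ambiguous pulses out of the training set.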

4. Discussion

This study starts from the fact that different particles produce distinguishable pulse waveforms after depositing energy in the sensor. On this basis, a complete software and hardware convolutional neural network particle identification platform is constructed, including data preprocessing, a software CNN architecture, an FPGA forward inference platform, and a dataset expansion module. The platform can be widely applied to particle identification and can be extended on the basis of traditional methods.

4.1. Comprehensive Discussion of Pulse Waveform Datasets

To fully consider all types of pulse waveforms and maximize the extraction of waveform features, the dataset covers full waveforms, rising edges, and falling edges, and downsampling, noise addition, and random cropping are used to reduce the model’s complexity and enhance its robustness. In subsequent studies, when multiple particle species are to be identified, the waveform data can additionally be converted to the frequency domain, and dual-channel data in both the time and frequency domains can be used as input, which is expected to further improve the identification accuracy. In addition, possible pulse pile-up (wave stacking) and temperature variations should also be represented in the training data.

4.2. Training and Optimisation of Neutron and Gamma CNN Networks

In the process of training the neutron and gamma CNN, the choice of hyperparameters plays a crucial role in the identification results. However, the number of hyperparameters and the size of their search space are very large, and efficiency would be greatly reduced if manual trials were used. Optuna utilizes Bayesian optimization methods to greatly improve the efficiency of the hyperparameter search. In this paper, only a few of the hyperparameters are analyzed and optimized; in subsequent work, a wider hyperparameter search space will be used to maximize the accuracy of the network.

4.3. Construction and Application of the FPGA-Based Forward Inference Acceleration Module

The FPGA-based forward inference acceleration module mainly implements the computation of the convolutional, pooling, and fully connected layers of the CNN. Because the particle flux differs with orbital altitude, the required inference time also differs. Through the modular program architecture, different degrees of parallelism can be flexibly constructed, and the forward inference time can be adjusted to achieve the optimal match between resources and latency. In addition, to minimize the inference time, all parameters and intermediate variables of the lightweight model are stored in registers so that all data can be read and computed in parallel.
Although the acceleration module can perform neutron and gamma inference, it does not integrate hardware preprocessing algorithms, and its resources cannot meet the requirements of larger-scale models. In the future, the architecture will therefore be combined with a more universal, dedicated module, such as the irradiation-resistant Vitis AI platform or other onboard AI chips, to enhance performance.

4.4. Convolutional Neural Network Training Set Extension

By comparing the expected FOM value of CNN classification with the FOM value of the PSD method, it can be seen that the CNN classification method is better than the PSD method. Additionally, the dataset can be further expanded based on the calculated expected values.
In practical applications, payloads will encounter extreme high and low temperatures; in addition, front-end sensors operating in orbit for a long period of time will produce corresponding performance degradation, which will have an impact on the back-end data, and these are difficult to simulate in the ground calibration or test. Thus, the acquisition of in-orbit data can further expand the coverage of the training sample set and improve the robustness and generalization of the model. By constructing a waveform library of ground and in-orbit data and iteratively updating the training, the accuracy of the model will be enhanced to a new level.

5. Conclusions

The research points of this paper mainly include the following aspects:
(1) Starting from the fact that different particles deposited in the sensor generate different pulse waveforms, this paper constructs a complete software and hardware convolutional neural network particle identification platform, which includes data preprocessing, a software CNN architecture, an FPGA forward inference platform, and a dataset expansion module. The platform can be widely applied to particle identification and can be extended on the basis of traditional methods.
(2) With the help of the Optuna + PyTorch architecture, a multi-dimensional hyperparameter search of the CNN model is carried out on the multi-angle augmented waveform data of neutrons and gamma rays, and the optimal hyperparameter combination is determined through several iterations. The software CNN model is obtained during this training process. The model is tested on each of the existing test datasets separately, and the test accuracy is better than 97%.
(3) Using the software model, the weights and bias parameters are stored in fixed-point form on the FPGA platform, and hardware forward inference is realized with a parallelized accelerated computing architecture. In the actual test, the forward inference of a single waveform takes 4.9 µs, and the accuracy is better than 97%. Meanwhile, the FOM parameter is used to compare the results of the CNN computation and the PSD operation. The FOM of the CNN is 133, which is better than the PSD FOM of 0.8, further demonstrating the superior classification ability of CNNs.
In subsequent studies, we will expand the dataset types and quantities in more dimensions, such as using both time and frequency domains as inputs or expanding the theoretical waveform dataset through simulation and analysis. We will further improve the feature extraction capability of the model architecture with respect to waveform data, for example by adopting CNN + RNN, combining a network voting model with a multi-task output model, or expanding the hyperparameter ranges searched by Optuna. We will also combine the platform with Vivado HLS or other AI chips to improve the efficiency and accuracy of building the FPGA accelerated inference platform while considering irradiation resistance. It is reasonable to believe that, with the continuous improvement in the computing power and resources of integrated circuits, the application of deep learning to particle identification will become increasingly mature, and its application in space environment detection will gradually move to the foreground.

Author Contributions

Conceptualization, C.B., X.Z. (Xin Zhang), S.Z. (Shenyi Zhang), Y.S., and X.Z. (Xianguo Zhang); funding acquisition, C.B.; investigation, C.B.; methodology, C.B. and X.Z. (Xin Zhang); project administration, Y.S. and X.Z. (Xianguo Zhang); resources, S.Z. (Shenyi Zhang); software, C.B.; supervision, X.Z. (Xin Zhang) and S.Z. (Shenyi Zhang); validation, C.B., Z.W., and S.Z. (Shuai Zhang); visualization, C.B.; writing—original draft, C.B.; writing—review and editing, C.B., X.Z. (Xin Zhang), and S.Z. (Shenyi Zhang). All authors have read and agreed to the published version of the manuscript.

Funding

This project was supported by the National Natural Science Foundation of China (grant no. 42204180).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhu, J. Research on Charged Particle Identification Methods Based on Pulse Shape Analysis. Master’s Thesis, National University of Defense Technology, Changsha, China, 2014.
  2. Zhu, J.; Liu, G.; Yang, Y.; Luo, X. Development of Charged Particle Identification Methods. Nucl. Electron. Detect. Technol. 2014, 34, 194–199+225.
  3. Kong, L.; Zhang, A.; Tian, Z.; Zheng, X.; Wang, W.; Liu, Y.; Ding, J. Integrated ion and neutral particle analyzer for Chinese Mars mission. J. Deep Space Explor. 2019, 6, 142–149.
  4. Zhang, H.; Wang, Z.; Zhang, W. The brief introduction of particle identification. Nucl. Electron. Detect. Technol. 2010, 30, 1473–1479.
  5. Mahata, K.; Shrivastava, A.; Gore, J.; Pandit, S.; Parkar, V.; Ramachandran, K.; Kumar, A.; Gupta, S.; Patale, P. Particle identification using digital pulse shape discrimination in a nTD silicon detector with a 1 GHz sampling digitizer. Nucl. Instrum. Methods Phys. Res. Sect. A Accel. Spectrometers Detect. Assoc. Equip. 2018, 894, 20–24.
  6. Hou, D.; Zhang, S.; Yang, Y.; Wang, Q.; Zhang, B.; Yu, Q. Neutron measurement and inversion based on CLYC scintillator. J. Beijing Univ. Aeronaut. Astronaut. 2021, 47, 106–114.
  7. Wang, Q.; Tuo, X.; Deng, C.; Liu, L.; Cheng, Y.; Zhang, C.; Yang, Y. Characterization of a Cs2LiYCl6: Ce3+ scintillator coupled with two silicon photomultiplier arrays of different sizes. Nucl. Instrum. Methods Phys. Res. Sect. A Accel. Spectrometers Detect. Assoc. Equip. 2019, 942, 162339.
  8. Fobar, D.; Phillips, L.; Wilhelm, A.; Chapman, P. Considerations for Training an Artificial Neural Network for Particle Type Identification. IEEE Trans. Nucl. Sci. 2021, 68, 2350–2357.
  9. Ma, Y.; Cao, Y.; Vrudhula, S.; Seo, J.S. Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks. In Proceedings of the ACM/SIGDA International Symposium on Field-programmable Gate Arrays, Monterey, CA, USA, 22–24 February 2017.
  10. Pausch, G.; Moszyński, M.; Wolski, D.; Bohne, W.; Grawe, H.; Hilscher, D.; Schubart, R.; Angelis, G.D.; Poli, M.D. Application of the pulse-shape technique to proton-alpha discrimination in Si-detector arrays. Nucl. Instrum. Methods Phys. Res. Sect. A Accel. Spectrometers Detect. Assoc. Equip. 1995, 365, 176–184.
  11. Pausch, G.; Ortlepp, H.G.; Bohne, W.; Grawe, H.; Hilscher, D.; Moszynski, M.; Wolski, D.; Chubart, R.; De Angelis, G.; De Poli, M. Identification of light charged particles and heavy ions in silicon detectors by means of pulse-shape discrimination. IEEE Trans. Nucl. Sci. 1996, 43, 1097–1101.
  12. Pausch, G.; Bohne, W.; Fuchs, H.; Hilscher, D.; Homeyer, H.; Morgenstern, H.; Tutay, A.; Wagner, W. Particle identification in solid-state detectors by exploiting pulse shape information. Nucl. Instrum. Methods Phys. Res. Sect. A Accel. Spectrometers Detect. Assoc. Equip. 1992, 322, 43–52.
  13. Quaranta, A.A.; Martini, M.; Ottaviani, G. The pulse shape and the timing problem in solid state detectors—A review paper. IEEE Trans. Nucl. Sci. 1969, 16, 35–61.
  14. Zhu, J.; Liu, G.; Yang, J.; Zhang, L. Pulse Shape Analysis Comparison Research of Charged Particle Identification. In Proceedings of the Seventeenth Annual National Conference on Nuclear Electronics and Nuclear Detection Technology, Shanghai, China, 13–15 August 2014.
  15. Droz, D.; Tykhonov, A.; Wu, X.; Alemanno, F.; Ambrosi, G.; Catanzani, E.; Santo, M.; Kyratzis, D.; Zimmer, S. A neural network classifier for electron identification on the DAMPE experiment. J. Instrum. 2021, 16, P07036.
  16. Astrain, M.; Ruiz, M.; Stephen, A.V.; Sarwar, R.; Carpeño, A.; Esquembri, S.; Murari, A.; Belli, F.; Riva, M. Real-time implementation of the neutron/gamma discrimination in an FPGA-based DAQ MTCA platform using a convolutional neural network. IEEE Trans. Nucl. Sci. 2021, 68, 2173–2178.
  17. Lu, J.; Tuo, X.; Yang, H.; Luo, Y.; Liu, H.; Deng, C.; Wang, Q. Pulse-shape discrimination of SiPM array-coupled CLYC detector using convolutional neural network. Appl. Sci. 2022, 12, 2400.
  18. Khan, A.; Hwang, H.; Kim, H.S. Synthetic Data Augmentation and Deep Learning for the Fault Diagnosis of Rotating Machines. Mathematics 2021, 9, 2336.
  19. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; Association for Computing Machinery: Anchorage, AK, USA, 2019; pp. 2623–2631.
  20. Srinivas, P.; Katarya, R. hyOPTXg: OPTUNA hyper-parameter optimization framework for predicting cardiovascular disease using XGBoost. Biomed. Signal Process. Control 2022, 73, 103456.
  21. Shekhar, S.; Bansode, A.; Salim, A. A Comparative study of Hyper-Parameter Optimization Tools. In Proceedings of the 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), Brisbane, Australia, 8–10 December 2021.
  22. Dai, D.; Zhang, Y.; Zhang, J.; Hu, Z.; Cai, Y.; Sun, Q.; Zhang, Z. Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs. arXiv 2024, arXiv:2401.17544.
  23. Goyal, R.; Vanschoren, J.; Van Acht, V.; Nijssen, S. Fixed-point quantization of convolutional neural networks for quantized inference on embedded platforms. arXiv 2021, arXiv:2102.02147.
  24. Yanamala, R.M.R.; Pullakandam, M. A high-speed reusable quantized hardware accelerator design for CNN on constrained edge device. Des. Autom. Embed. Syst. 2023, 27, 165–189.
  25. Aarrestad, T.; Loncar, V.; Ghielmetti, N.; Pierini, M.; Summers, S.; Ngadiuba, J.; Petersson, C.; Linander, H.; Iiyama, Y.; Di Guglielmo, G.; et al. Fast convolutional neural networks on FPGAs with hls4ml. Mach. Learn. Sci. Technol. 2021, 2, 045015.
  26. Liu, B.; Zhou, Y.; Feng, L.; Fu, H.; Fu, P. Hybrid CNN-SVM Inference Accelerator on FPGA Using HLS. Electronics 2022, 11, 2208.
Figure 1. Structural diagram of the convolutional neural network particle identification platform.
Figure 2. (a) Schematic diagram of waveform signals of different particles with the same energy; (b) neutron and gamma waveform schematics.
Figure 3. Dataset construction flowchart.
Figure 4. Forward inference architecture construction flowchart.
Figure 5. Block diagram of the data flow of the convolutional layer.
Figure 6. Pooling layer data flow block diagram.
Figure 7. Block diagram of the full connectivity layer’s data flow.
Figure 8. Three types of training dataset.
Figure 9. Three types of validation dataset.
Figure 10. Three types of test dataset.
Figure 11. CNN network architecture diagram.
Figure 12. Training and validation set computation results.
Figure 13. Physical diagram of CNN operation platform.
Figure 14. (a) Distribution of neutron classification expectations. (b) Gamma classification expectation distribution.
Figure 15. The pulse shape and time window of neutron and gamma in CLYC.
Figure 16. Neutron–gamma PSD frequency curve.
Table 1. Hyperparameter search range table.
| Search Categories | Search Scope |
| --- | --- |
| Algorithmic space | Adam, RMSprop, SGD |
| Range of learning rates | 0.0001–0.1 |
| Batchsize | 64, 128, 256, 512 |
| Dropout range | 0–0.5 |
Table 2. Optimal search results table.
| Batchsize | Lr | Dropout_rate | Accuracy |
| --- | --- | --- | --- |
| 128 | 0.0026 | 0.119732 | 0.9783 |
| 128 | 0.0028 | 0.093131 | 0.978 |
| 64 | 0.0061 | 0.405455 | 0.978 |
| 128 | 0.0025 | 0.10884 | 0.976 |
Table 3. Table of test accuracy evaluation.
| Test Set Type | Testing Accuracy |
| --- | --- |
| Full waveform test set | 98.1% |
| Rising-edge test set | 99.2% |
| Falling-edge test set | 97.5% |
| Overall dataset | 98.2% |
Table 4. MATLAB vs. Pytorch calculation comparison table.
| MATLAB | Pytorch |
| --- | --- |
| 2.0075, −3.8455 | 2.0075, −3.8455 |
| 1.1923, −3.0042 | 1.1923, −3.0042 |
| −8.7143, 7.0735 | −8.7143, 7.0735 |
| −10.65, 9.2606 | −10.65, 9.2606 |
Table 5. DSP module requirements table.
| Network Layer | Theoretical Calculation Formula | Theoretical Value of the DSP Unit Required | Number of Actual DSP Units Consumed |
| --- | --- | --- | --- |
| First convolution layer | DSPNum1 = k1 × n1 × m1 | 1 × 64 × 2 = 128 | 128 |
| Second convolution layer | DSPNum2 = k2 × n2 × m2 | 1 × 32 × 2 = 64 | 64 |
| Third convolution layer | DSPNum3 = k3 × n3 × m3 | 1 × 16 × 2 = 32 | 32 |
| Full connection | DSPnumfullconnect = DSPmuti × DSPoutputnode | 8 × 2 = 16 | 16 |
| Total | | 240 | 240 |
Table 6. Extrapolation of the timetable.
| Network Layer | Theoretical Calculation Formula | Time Required for Simulation (CLK_100M) | Actual Time Required (CLK_100M) |
| --- | --- | --- | --- |
| First convolution layer | ClkNum1 = x1 × y1 × z1 × a1 + b1 | 3 × 1 × 2 × 2 + b1 | 42 |
| First pooling layer | ClkNumpool1 = clk1pool × clk1channel + c1 | 1 × 4 + c1 | 10 |
| Second convolution layer | ClkNum2 = x2 × y2 × z2 × a2 + b2 | 3 × 4 × 2 × 4 + b2 | 145 |
| Second pooling layer | ClkNumpool2 = clk2pool × clk2channel + c2 | 1 × 8 + c2 | 18 |
| Third convolution layer | ClkNum3 = x3 × y3 × z3 × a3 + b3 | 3 × 8 × 2 × 4 + b3 | 241 |
| Full connection | ClkNumfullconnect = (clkmuti + 1) × clkoutputnode + k | (256/8 + 1) × 1 + k | 34 |
| Total | | 341 + b1 + c1 + b2 + c2 + b3 + k | 490 |
Table 7. Final results of FPGA resource utilization.
| Resource | Utilization | Available | Utilization (%) |
| --- | --- | --- | --- |
| LUT | 143,859 | 171,900 | 83.69 |
| LUTRAM | 97 | 70,400 | 0.14 |
| FF | 70,898 | 343,800 | 20.62 |
| BRAM | 0.50 | 500 | 0.10 |
| DSP | 240 | 900 | 26.67 |
| IO | 36 | 250 | 14.40 |
| BUFG | 3 | 32 | 9.38 |
| MMCM | 1 | 8 | 12.50 |
Table 8. Comparison table of test data and floating-point and fixed-point numbers.
| Test Data and Floating-Point Result Compliance | Test Data and Fixed-Point Compliance | Floating-Point and Fixed-Point Number Compliance |
| --- | --- | --- |
| 97.900% | 97.800% | 99.900% |
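Table 9. Expected value and distribution frequency.
| Expected Value | Neutron Distribution Frequency | Gamma Distribution Frequency |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0.005 | 317 | 2 |
| 0.01 | 51 | 1 |
| 0.015 | 29 | 0 |
| 0.02 | 19 | 0 |
| 0.025 | 16 | 0 |
| 0.03 | 7 | 0 |
| 0.035 | 8 | 0 |
| 0.04 | 8 | 0 |
| 0.045 | 4 | 0 |
| 0.05 | 4 | 1 |
| 0.055 | 1 | 0 |
| 0.06 | 2 | 0 |
| 0.065 | 1 | 0 |
| 0.07 | 0 | 0 |
| 0.075 | 2 | 0 |
| 0.08 | 0 | 0 |
| 0.085 | 0 | 1 |
| 0.09 | 1 | 0 |
| 0.095 | 2 | 0 |
| 0.1 | 3 | 0 |
| 0.2 | 8 | 2 |
| 0.3 | 3 | 0 |
| 0.4 | 0 | 3 |
| 0.5 | 3 | 1 |
| 0.6 | 2 | 2 |
| 0.7 | 2 | 3 |
| 0.8 | 0 | 6 |
| 0.9 | 0 | 7 |
| 0.905 | 0 | 0 |
| 0.91 | 0 | 0 |
| 0.915 | 0 | 0 |
| 0.92 | 0 | 2 |
| 0.925 | 0 | 0 |
| 0.93 | 0 | 0 |
| 0.935 | 0 | 1 |
| 0.94 | 0 | 1 |
| 0.945 | 0 | 1 |
| 0.95 | 0 | 1 |
| 0.955 | 1 | 2 |
| 0.96 | 2 | 1 |
| 0.965 | 0 | 1 |
| 0.97 | 0 | 1 |
| 0.975 | 0 | 2 |
| 0.98 | 0 | 2 |
| 0.985 | 1 | 0 |
| 0.99 | 0 | 3 |
| 0.995 | 0 | 2 |
| 1 | 3 | 451 |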
Table 10. PSD parameter and distribution frequency.
| PSD Value | Neutron Distribution Frequency | Gamma Distribution Frequency |
| --- | --- | --- |
| 0.51 | 0 | 0 |
| 0.52 | 1 | 0 |
| 0.53 | 4 | 3 |
| 0.54 | 24 | 4 |
| 0.55 | 233 | 9 |
| 0.56 | 194 | 13 |
| 0.57 | 41 | 48 |
| 0.58 | 3 | 65 |
| 0.59 | 0 | 78 |
| 0.60 | 0 | 81 |
| 0.61 | 0 | 97 |
| 0.62 | 0 | 70 |
| 0.63 | 0 | 29 |
| 0.64 | 0 | 3 |
| 0.65 | 0 | 0 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Bai, C.; Zhang, X.; Zhang, S.; Sun, Y.; Zhang, X.; Wang, Z.; Zhang, S. Design and Application of an Onboard Particle Identification Platform Based on Convolutional Neural Networks. Appl. Sci. 2024, 14, 6628. https://doi.org/10.3390/app14156628