A Data-Driven Method for Calculating Neutron Flux Distribution Based on Deep Learning and the Discrete Ordinates Method

Li, Yanchao; Zhang, Bin; Yang, Shouhai; Chen, Yixue

doi:10.3390/en17143440


by Yanchao Li ¹, Bin Zhang ¹,*, Shouhai Yang ² and Yixue Chen ¹

¹ School of Nuclear Science and Engineering, North China Electric Power University, Beijing 102206, China

² State Key Laboratory of Nuclear Power Safety Technology and Equipment, China Nuclear Power Engineering Co., Ltd., Shenzhen 518172, China

* Author to whom correspondence should be addressed.

Energies 2024, 17(14), 3440; https://doi.org/10.3390/en17143440

Submission received: 16 April 2024 / Revised: 17 May 2024 / Accepted: 3 June 2024 / Published: 12 July 2024

(This article belongs to the Special Issue Advancements in Nuclear Energy Technology)


Abstract

The efficient and accurate calculation of neutron flux distribution is essential for evaluating the safety of nuclear facilities and the surrounding environment. While traditional numerical simulation methods such as the discrete ordinates (S_N) method and Monte Carlo method have demonstrated excellent performance in terms of accuracy, their complex solving process incurs significant computational costs. This paper explores a data-driven and efficient method for obtaining neutron flux distribution based on deep learning, specifically targeting shielding problems with constant geometry and varying material cross-sections in practical engineering. The proposed method bypasses the intricate numerical transport calculation process of the discrete ordinates method by constructing a surrogate model that captures the correlation between transport characteristics and neutron flux from data characteristics. Simulations were carried out using Kobayashi-1 and Kobayashi-2 geometric models for shielding problems with constant geometry and varying material cross-sections. A series of validations have proved that the data-driven surrogate model demonstrates high generalization ability and reliability, while reducing the time required to obtain neutron flux distribution to 0.1 s without compromising on calculation accuracy compared to the discrete ordinates method.

Keywords:

data-driven; deep learning; discrete ordinates method; surrogate model

1. Introduction

The neutron flux measures the rate at which neutrons pass through a unit area and is the key physical quantity in radiation shielding calculations, because it provides the information needed to ensure the radiation safety of the environment. The efficient and accurate calculation of this quantity is therefore the primary goal of shielding calculations [1]. In practical engineering, judging whether a radiation field is within the safe dose range requires monitoring the complex three-dimensional radiation field in real time. The method used to calculate the neutron flux distribution must therefore adapt to the dynamic conditions of the radiation field and also solve large, geometrically complex problems quickly. However, traditional numerical methods are unable to reconcile the demands of computational efficiency with those of accuracy.

Currently, the commonly used methods [2,3] for solving the neutron flux distribution in the shielding calculations are the discrete ordinates method and the Monte Carlo method. To solve the transport equation by the discrete ordinates method, the transport equation must be discretized with multiple variables and coupled equations must be established. The scale and complexity of the coupled equations increase with the complexity of the shielding structure and scale, and the solution efficiency is directly affected. In order to improve the computational efficiency of the discrete ordinates method, many different methods have been studied. The iterative optimization method enhances computational efficiency by optimizing the solution process [4]. GPU acceleration [5] and CPU parallel acceleration [6,7,8] leverage advanced computing resources and high-performance computer hardware to expedite the solution speed, thereby improving transport computing efficiency. The adaptive method [9,10,11,12,13,14] reduces the scale of transport equations based on estimated local errors in order to save time. The multi-level tree grid [15,16] utilizes coarse grids to represent problem areas that do not require fine grids, thus reducing the scale of transport equations and alleviating solving pressure. The Monte Carlo method uses the continuous energy point cross-sections to simulate the calculation, and the statistical error can be obtained directly. The whole process of particles from appearance to disappearance can be simulated in the discrete system, and it has strong geometric adaptability. Using the Monte Carlo method to simulate large complex reactors means that as the particle simulation scale increases, the particle simulation process slows down and computational efficiency decreases. 
Furthermore, Stochastic Differential Equations (SDEs) are an appropriate mathematical tool for modelling transport equations, and their results are in close agreement with Monte Carlo calculations. E. J. Allen derived the Stochastic Differential Equation and the Stochastic Partial Differential Equation for the neutron angular flux with time in general three-dimensional media [17], and compared the numerical solution of the stochastic differential equation with the Monte Carlo calculation of an independent formula to ensure the accuracy of the derivation. Hajas T established a connection between the Dynamic Monte Carlo (DMC) method and differential equation formalism [18]. A Non-Analog Monte Carlo (NAMC) model of the Stochastic Point-Kinetics equation (SPKe) is developed to determine a noise model that can effectively approximate the DMC locus.

Both the discrete ordinates method and the Monte Carlo method are boundary-dependent and require appropriate boundary conditions to complete a calculation. As a highly parallel information processing system, the neural network possesses robust adaptive learning capabilities and is capable of managing complex nonlinear systems with multiple inputs and outputs [19,20]. The neural network learns the complex relationship between inputs and outputs from rich data and derives an approximate function that describes this relationship. The trained neural network can then determine the target physical quantity from the input, which makes it suitable for binary classification, multi-class classification, and regression problems. With the continuous improvement of neural networks and of computer performance, many studies have focused on combining neural networks with neutron flux calculations. Zhou W studied a radiation field reconstruction method based on BP neural networks [21], in which the accuracy of the reconstructed radiation field is improved by adaptive learning rate attenuation and multiple sampling. Li Z et al. used a deep neural network to calculate cross-sections, taking into account the complex nonlinear relationship between the reactor variables [22]. Zhu O discussed the superiority of the artificial neural network method in neutron spectrum expansion by comparing it with the maximum entropy expansion method [23]. Cao C proposed a “two-step” neutron spectrum unwrapping method based on artificial neural networks to improve the computational efficiency of online wide-range neutron spectrum unwrapping technology [24]. Dos Santos demonstrated the performance of the Deep Rectifier Neural Network (DRNN) in identifying nuclear accidents [25]. Pei C proposed a method to reconstruct the neutron field based on the energy distribution of neural networks [26].
Through transport calculation and reconstruction calculation, the neutron field inside the reactor pressure vessel and in the core region is reconstructed by using the measured data outside the reactor. Song used neural networks to optimize shielding structures for marine reactors [27], and applied the neural network algorithm and the genetic algorithm to optimize the shielding design [28], which is two orders of magnitude faster than the Monte Carlo method. The neural network method is also used in the study of radiation dose [29]. The results of dose computation with the Artificial Neural Network (ANN) are available in less than 2 s. Based on the physics-informed neural network (PINN), Wang J proposed the conservative PINN (cPINN) to solve the neutron diffusion problem in inhomogeneous media, and the proposed BC-imposed method can help to improve the cPINN performance [30]. Overall, the neural network approach allows the neutron flux distribution of the radiation field to be obtained without the need for complex numerical solution methods, and effectively improves the efficiency of obtaining the neutron flux distribution.

In the context of large geometries, both the discrete ordinates and Monte Carlo methods can be time-consuming, taking hours or more to compute the shielding problem. This paper explores the use of deep learning as a potential method for efficiently obtaining neutron flux distribution based on the discrete ordinates method. By incorporating the specific characteristics of changes in shielding structure, the discrete ordinates method is utilized to simulate potential radiation field changes during actual transport. The neutron flux distribution is collected at each instance and used to train a deep learning neural network in order to develop a surrogate model. This model can then be employed to obtain the neutron flux distribution in an efficient manner.

2. Methodology

2.1. Deep Learning Neural Network

Deep learning employs backpropagation algorithms to identify intricate structures within expansive datasets [31]. Deep learning is a machine learning method based on neural networks that learns input data layer by layer through multi-layer neural networks, thereby enabling the modelling of complex nonlinear relationships. The basic structure of the neural network is shown in Figure 1. The relationship between input neurons x, y, z and output neuron u is described by weights ω_{1}, ω_{2}, ω_{3}, bias b, and activation function f, as shown:

u = f (x ω_{1} + y ω_{2} + z ω_{3} + b).

(1)

As the complexity of neural networks increases, their capacity to represent intricate relationships within data improves. The activation function introduces nonlinearity into the output of each neuron, which enhances the neural network’s ability to approximate complex functions. During training, the forward pass produces predictions, and the loss function measures the difference between the predicted and actual values, aggregating the errors across all samples to evaluate overall performance. Using the derivative of the loss function with respect to each weight, the optimizer adjusts the weights along the direction of the negative gradient, in which the loss decreases most rapidly. This iterative process continues until the loss function reaches a satisfactory value. Training thus improves the capacity of the neural network to approximate complex functions.
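The single-neuron relation of Equation (1) and the training loop described above can be illustrated with a minimal sketch. The function names, the toy data, the identity activation used during training, and the learning rate are all assumptions chosen for illustration; they are not the configuration used in this paper.

```python
import math

def elu(v, alpha=1.0):
    """Exponential Linear Unit, one possible choice for the activation f."""
    return v if v > 0 else alpha * (math.exp(v) - 1.0)

def neuron(inputs, weights, bias, f=lambda v: v):
    """Equation (1): u = f(x*w1 + y*w2 + z*w3 + b)."""
    return f(sum(i * w for i, w in zip(inputs, weights)) + bias)

def train_linear_neuron(samples, targets, lr=0.1, steps=2000):
    """Gradient descent on the mean square error of one neuron.

    Assumption: identity activation, so the derivative of f is 1.
    """
    weights, bias = [0.0, 0.0, 0.0], 0.0
    n = len(samples)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0, 0.0], 0.0
        for inputs, target in zip(samples, targets):
            err = neuron(inputs, weights, bias) - target
            for k in range(3):
                grad_w[k] += 2.0 * err * inputs[k] / n
            grad_b += 2.0 * err / n
        # Step against the gradient, i.e. in the direction of decreasing loss.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

# Hypothetical toy data generated from u = 2x - y + 0.5z + 1.
samples = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
targets = [2.0 * x - 1.0 * y + 0.5 * z + 1.0 for x, y, z in samples]
weights, bias = train_linear_neuron(samples, targets)
```

In this toy case the recovered weights and bias approach the generating coefficients; a deep network repeats the same forward pass and gradient update over many neurons and layers.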

However, in practical applications, it is not sufficient for neural networks to demonstrate strong fitting performance; they must also generalize, that is, predict the output accurately for inputs they have not seen before. Both the train loss and the test loss should therefore be monitored.

2.2. Discrete Ordinates Method Transport Solution

The steady-state neutron transport equation is a linear integro-differential equation with six independent variables in the phase space (\vec{r}, E, \vec{Ω}), comprising space \vec{r} (x, y, z), energy E, and angle \vec{Ω} (μ, η), as shown:

\begin{array}{l} \vec{Ω} \cdot \nabla ψ (\vec{r}, E, \vec{Ω}) + Σ_{t} (\vec{r}, E) ψ (\vec{r}, E, \vec{Ω}) \\ = \int_{0}^{\infty} d E^{'} \int_{4 π} Σ_{s} (\vec{r}, E^{'} \to E, {\vec{Ω}}^{'} \to \vec{Ω}) ψ (\vec{r}, E^{'}, {\vec{Ω}}^{'}) d {\vec{Ω}}^{'} + q (\vec{r}, E, \vec{Ω}) \end{array}

(2)

where ψ (\vec{r}, E, \vec{Ω}) is the angular flux, q (\vec{r}, E, \vec{Ω}) is the fixed source, Σ_{t} (\vec{r}, E) is the total cross-section, and Σ_{s} (\vec{r}, E^{'} \to E, {\vec{Ω}}^{'} \to \vec{Ω}) is the scattering cross-section. The first term on the right side of the equation is the scattering source. Because the transport equation contains both differential and integral terms, an exact analytical solution can be obtained only for simple problems. Therefore, numerical approximation is often used to solve practical problems. The discrete ordinates method is a deterministic method that requires each variable to be discretized. The continuous energy variable is divided into several appropriate energy groups by the multi-group approximation, and the continuous direction variable is discretized into M specific discrete directions. The transport equation after angular discretization by the discrete ordinates method is:

{\vec{Ω}}_{m} \cdot \nabla ψ_{m} (\vec{r}) + Σ_{t} (\vec{r}) ψ_{m} (\vec{r}) = Q_{s, m} (\vec{r}) + q_{m} (\vec{r}),

(3)

where m is the discrete direction number, and Q_{s, m} and q_{m} are the scattering and fixed source terms, respectively. For spatial variables, the finite difference method [32] or finite element method [33] is used. The discretization method transforms the original transport equation into a large set of equations that can be easily solved. After solving the neutron angular flux of each phase space, the neutron scalar flux and the flux moment in the phase space are calculated by the numerical integral approximation method, as shown:

ϕ_{n}^{k} (\vec{r}) = \sum_{m = 1}^{M} w_{m} ψ_{m} (\vec{r}) Y_{n}^{* k} ({\vec{Ω}}_{m}),

(4)

where ϕ_{n}^{k} (\vec{r}) is the flux moment of order n, k; Y_{n}^{* k} is the adjoint spherical harmonic; M is the total number of discrete directions; and w_{m} is the weight coefficient corresponding to the discrete direction m. The anisotropic scattering source can be approximated by the following formula:

Q_{s, m} (\vec{r}) \approx \sum_{n = 0}^{N} \frac{2 n + 1}{4 π} Σ_{s, n} (\vec{r}) \sum_{k = - n}^{n} ϕ_{n}^{k} (\vec{r}) Y_{n}^{k} ({\vec{Ω}}_{m}) .

(5)

The discretized equations are solved by source iteration: the neutron angular flux and the scattering source term are updated alternately until the difference in the neutron angular flux between two successive iterations meets the convergence criterion.
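The source iteration loop can be sketched for a much simpler case than the three-dimensional problems treated by ARES. The example below is a hedged illustration only: it assumes one energy group, a one-dimensional slab with vacuum boundaries, isotropic scattering, an S₄ Gauss-Legendre quadrature, and the diamond difference scheme (not the EDW or SC schemes used in this paper). The function name and all parameter values are hypothetical.

```python
# S4 Gauss-Legendre directions (positive half) and weights; weights of all
# four directions sum to 2 over mu in [-1, 1].
MU = [0.3399810435848563, 0.8611363115940526]
WEIGHT = [0.6521451548625461, 0.3478548451374538]

def source_iteration(sigma_t, sigma_s, q, width, n_cells, tol=1e-8, max_iter=10000):
    """Scalar flux in a homogeneous slab with a uniform isotropic source q > 0.

    Illustrative assumptions: one group, 1-D slab, vacuum boundaries,
    isotropic scattering, diamond difference in space.
    """
    dx = width / n_cells
    phi = [0.0] * n_cells
    for iteration in range(1, max_iter + 1):
        # Scattering plus fixed source, per unit mu, from the previous flux.
        emission = [0.5 * (sigma_s * p + q) for p in phi]
        new_phi = [0.0] * n_cells
        for mu, w in zip(MU, WEIGHT):
            # Sweep left-to-right (mu > 0), then right-to-left (mu < 0).
            for cells in (range(n_cells), reversed(range(n_cells))):
                psi_in = 0.0  # vacuum boundary: no incoming angular flux
                for i in cells:
                    psi_c = (emission[i] * dx + 2.0 * mu * psi_in) / (2.0 * mu + sigma_t * dx)
                    psi_in = 2.0 * psi_c - psi_in  # outgoing edge flux
                    new_phi[i] += w * psi_c        # quadrature sum for scalar flux
        change = max(abs(a - b) for a, b in zip(new_phi, phi)) / max(new_phi)
        phi = new_phi
        if change < tol:
            return phi, iteration
    return phi, max_iter

# Hypothetical slab: 10 cm thick, sigma_t = 1.0 /cm, sigma_s = 0.5 /cm.
phi, n_iter = source_iteration(1.0, 0.5, 1.0, 10.0, 100)
```

Each pass recomputes the scattering source from the latest scalar flux and sweeps every direction again, which is why the cost grows quickly with the number of cells, directions, and energy groups in realistic shielding models.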

The discrete ordinates method is widely utilized in shielding calculations due to its advantages of efficient calculation speed, high precision, and suitability for solving deep penetration transport problems. By utilizing the data generated by the discrete ordinates method to train neural networks, each neutron flux value is rigorously solved from the neutron transport equation. This approach can enhance the interpretability of deep learning neural networks to a certain extent. The utilization of this extensive dataset derived from realistic simulations enhances the neural network’s ability to capture the intricate relationships between input parameters and neutron flux distribution. This approach not only can validate the neural network model but also can ensure its applicability to real problems in shielding calculations.

2.3. Dataset Acquisition and Construction

Because the neutron flux depends on the source strength and the cross-section parameters during the shielding transport calculation, the dataset comprises the average source strength of the source region together with the total and scattering cross-sections of each material. The dataset also contains the neutron flux distribution at specific positions within the geometric model. The neutron flux distribution was obtained using the three-dimensional particle transport calculation code ARES [34], which employs the discrete ordinates method for transport simulation. The Kobayashi benchmark [35], developed by the OECD/NEA, is typically used to assess the capability of shielding transport codes. It comprises three independent problems, each divided into two cases: total absorption and semi-scattering. The anisotropy of problems 2 and 3 increases gradually along the channel due to the existence of straight and zigzagging channels. The isolated source region of Kobayashi-1 (Figure 2) is surrounded by a large cube cavity with a side length of 100 cm. Basic Dataset-1 is constructed based on the Kobayashi-1 geometric model to simulate the problem of constant geometry and changing material cross-sections; the isolated source region is wrapped by two different shielding layers. Kobayashi-2 (Figure 3) adds straight channels and enhanced anisotropy, which increases the difficulty of the shielding calculation. Basic Dataset-2 is constructed based on the Kobayashi-2 geometric model to simulate the problem of constant geometry and changing material cross-sections.

The ranges of the source strength and material cross-section parameters for Dataset-1 are given in Table 1; in every sample, the scattering cross-section of a material is required to be less than or equal to its total cross-section. The additional settings for the transport calculation are as follows: the shielding model is divided into a uniform grid of 1 cm × 1 cm × 1 cm cells, the exponential directional weighted (EDW) difference scheme is selected, and the P_N T_N quadrature set of order S₂₄ is used. A total of 886 sets of effective transport data were collected to form Dataset-1.
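One way to generate input samples that respect the constraint on the scattering and total cross-sections is sketched below. The numerical ranges are hypothetical placeholders and do not reproduce Table 1, and the layout of the input vector (one source strength followed by a total and a scattering cross-section for each of three materials) is an assumption made for illustration.

```python
import random

# Hypothetical placeholder ranges; the actual ranges are those of Table 1.
SOURCE_RANGE = (1.0, 10.0)
SIGMA_T_RANGE = (0.01, 1.0)
N_MATERIALS = 3  # assumed: source region plus two shielding layers

def sample_case(rng):
    """Return one input vector [S, sigma_t1, sigma_s1, ..., sigma_t3, sigma_s3]."""
    case = [rng.uniform(*SOURCE_RANGE)]
    for _ in range(N_MATERIALS):
        sigma_t = rng.uniform(*SIGMA_T_RANGE)
        sigma_s = rng.uniform(0.0, sigma_t)  # enforces sigma_s <= sigma_t
        case.extend([sigma_t, sigma_s])
    return case

rng = random.Random(0)  # fixed seed so the sketch is reproducible
cases = [sample_case(rng) for _ in range(5)]
```

Drawing the scattering cross-section from an interval bounded by the sampled total cross-section avoids rejecting unphysical samples after the fact.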

The material cross-sections of Dataset-2, including the P₃ Legendre scattering coefficients, are generated by the ARES cross-section generation code. The ranges from which the nuclide compositions and atom densities of the three regions are randomly generated are shown in Table 2. To ensure that the relationship between the total cross-sections and the scattering cross-sections is not influenced by the material itself, Table 3 presents the total cross-sections of the three regions and the range of source strengths. The additional settings for the transport calculation are as follows: the shielding model is divided into a uniform grid of 1 cm × 1 cm × 1 cm cells, the short characteristic (SC) scheme is selected, and the P_N T_N quadrature set of order S₂₄ is used. A total of 681 sets of effective transport data were collected to form Dataset-2.

To mitigate the impact of order of magnitude differences in neutron flux within the dataset, a crucial preprocessing step is implemented. This step is essential for ensuring the neural network effectively learns and generalizes the neutron flux across diverse magnitude scales. Given the characteristic exponential decline of neutron flux, the log normalization approach is employed, as outlined:

x^{'} = \frac{\log_{10} (x)}{\log_{10} (\max (x))},

(6)

where x is the original data, max(x) is the maximum value of the original data, and x^{'} is the normalized data. This transformation compresses a wide range of neutron flux values to a more manageable scale, which facilitates stable and efficient convergence of neural networks during training. Log normalization helps to handle exponential changes, ensuring that the neural network can recognize subtle differences in the entire distribution of neutron flux. With this preprocessing step, it is easier for the neural network to capture the subtle relationships in the dataset.
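The log normalization of Equation (6) and its inverse can be sketched as follows. The function names and the sample flux values are hypothetical; the sketch also assumes strictly positive flux values whose maximum differs from 1, since the denominator would otherwise be zero.

```python
import math

def log_normalize(flux):
    """Equation (6): x' = log10(x) / log10(max(x)).

    Returns the normalized values and the denominator needed to invert them.
    """
    if min(flux) <= 0.0:
        raise ValueError("flux values must be strictly positive")
    denom = math.log10(max(flux))
    if denom == 0.0:
        raise ValueError("max(flux) must differ from 1")
    return [math.log10(x) / denom for x in flux], denom

def log_denormalize(normalized, denom):
    """Invert Equation (6): x = 10 ** (x' * log10(max(x)))."""
    return [10.0 ** (v * denom) for v in normalized]

# Hypothetical flux values spanning six orders of magnitude.
flux = [1.0e2, 1.0e4, 1.0e8]
normalized, denom = log_normalize(flux)    # approximately [0.25, 0.5, 1.0]
restored = log_denormalize(normalized, denom)
```

Values that differ by six orders of magnitude are mapped into a range of order one, and predictions made in the normalized space can be mapped back to physical flux with the stored denominator.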

To evaluate the efficacy of log normalization, a comparative analysis was conducted using Dataset-1. Table 4 presents the train loss of the two tests, while Figure 4 illustrates the fitting results of the neural network.

The findings indicate that log normalization effectively addresses the increased training difficulty caused by the wide variation in neutron flux.

2.4. Deep Learning Neural Network Topology Construction and Model Training

As the depth and width of neural networks increase, the fitting capabilities of neural networks become more powerful. Therefore, we use deep learning neural networks to construct surrogate models. For a fixed source problem, the discrete ordinates method is used to simulate the transport calculation process, where the source strength and material cross-sections are taken as inputs and the final physical output is the neutron flux distribution. Therefore, the key transport parameters used to describe the problem will serve as the input layer of the deep learning neural network, and the neutron flux distribution at the spatial location of interest will serve as the output layer of the deep learning neural network.

The fully connected neural network is the most fundamental type of neural network and offers greater flexibility than convolutional and recurrent neural networks. It is particularly well suited to regression problems. Furthermore, because the approach is combined with the discrete ordinates method, the input consists of parameters that already describe the shielding problem effectively, so no convolutional feature extraction is required. In this paper, four surrogate models based on fully connected neural networks were developed using the Keras model module of the TensorFlow [36] framework. These surrogate models predict the neutron flux distribution within the radiation field of a specific geometry under various shielding conditions, as depicted in Figure 5, and replace the complex transport calculation process of the discrete ordinates method. The input and output architecture of each surrogate model is designed as follows:

Surrogate Model-1, Surrogate Model-2, and Surrogate Model-3 were trained using Dataset-1 to predict the neutron flux distribution for shielding problems that share the Kobayashi-1 geometry. All three models have an input dimension of 7. The output layer of Surrogate Model-1 contains 2500 neurons, representing the neutron flux in 2500 grids at x = 45.5 cm, 0 cm < y < 50 cm, and 0 cm < z < 50 cm. The output layer of Surrogate Model-2 contains 10,000 neurons, representing the neutron flux in 10,000 grids at x = 55.5 cm, 0 cm < y < 100 cm, and 0 cm < z < 100 cm. The output layer of Surrogate Model-3 contains 2500 neurons, representing the neutron flux in 2500 grids at x = 55.5 cm, 50 cm < y < 100 cm, and 50 cm < z < 100 cm. Surrogate Model-4 was trained using Dataset-2 to predict the neutron flux distribution for shielding problems that share the Kobayashi-2 geometry. The model has an input dimension of 16. Its output layer contains 2880 neurons, representing the neutron flux in 2880 grids that are evenly distributed in the shielded space. For the training phase, each dataset is randomly divided into a training dataset and a test dataset at a ratio of 10:1. This split allows the capacity of the deep learning neural network to generalize across various problems within the specified geometric model to be evaluated on data not used for training.
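A random 10:1 division of a dataset can be sketched as below. The function name, the fixed seed, and the rounding rule for the test-set size are assumptions; the paper does not state the exact sample counts of each subset.

```python
import random

def split_dataset(n_samples, train_parts=10, test_parts=1, seed=0):
    """Randomly split sample indices into training and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Assumed rounding rule: nearest integer to the test fraction.
    n_test = round(n_samples * test_parts / (train_parts + test_parts))
    return indices[n_test:], indices[:n_test]

# 886 is the number of transport cases in Dataset-1.
train_idx, test_idx = split_dataset(886)
# Under this rounding rule: 805 training and 81 test samples.
```

Shuffling the indices rather than the data keeps the input vectors and their flux distributions paired when both are later selected by index.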

The training of the surrogate model includes the selection of the deep learning neural network structure and hyperparameters, but since there is no explicit machine learning theory to guide the hyperparameter design, the optimal structure and hyperparameters are determined by using prior experience and experimental evidence.

The mean absolute error (MAE) and the mean square error (MSE) are commonly used loss functions in deep learning neural networks. They are defined in Equations (7) and (8):

\mathrm{MAE} = \sum_{i = 1}^{n} \frac{| P_{i} - A_{i} |}{n},

(7)

\mathrm{MSE} = \sum_{i = 1}^{n} \frac{{(P_{i} - A_{i})}^{2}}{n},

(8)

which quantify the difference between predicted (P_{i}) and actual (A_{i}) values, where n is the size of the dataset. In shielding calculations, low and high flux values are equally important. Because the MSE loss function squares the error, small deviations contribute very little to the loss, so low flux values cannot be evaluated accurately and the neural network becomes more difficult to train. Therefore, the neural network uses MAE as the loss function.
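The different behaviour of the two loss functions for small errors can be checked with a short sketch. The functions follow Equations (7) and (8), with the absolute value taken in the MAE; the sample values are hypothetical normalized fluxes.

```python
def mae(pred, actual):
    """Equation (7): mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)

def mse(pred, actual):
    """Equation (8): mean square error."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)

# Hypothetical normalized flux values with a uniform error of 0.01 or 0.1.
actual = [1.0, 0.5, 0.1]
pred_small = [a + 0.01 for a in actual]
pred_large = [a + 0.1 for a in actual]

mae_ratio = mae(pred_large, actual) / mae(pred_small, actual)  # about 10
mse_ratio = mse(pred_large, actual) / mse(pred_small, actual)  # about 100
```

A tenfold reduction in error lowers the MAE tenfold but the MSE a hundredfold, so under MSE the remaining small errors contribute comparatively little to the loss that drives training.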

Moreover, to ensure effective deep learning, the weights and biases of the nodes are initialized with the Glorot uniform initializer [37]. The activation function is the Exponential Linear Unit (ELU) [38], which is effective in capturing complex relationships. For optimization, the Adaptive Moment Estimation (Adam) [39] algorithm is chosen for its strong performance in regression problems. The learning rate is adaptive: it is halved if the loss value of the training dataset does not decrease. This learning rate decline strategy contributes to stable and efficient convergence during training. Furthermore, mini-batch training with a fixed batch size [40] was adopted. Overfitting in fully connected neural networks can be prevented by incorporating dropout layers or regularization, but these approaches can significantly increase the computational load. To avoid unnecessary training time, overfitting is instead addressed by comparing the test loss at specific intervals during training and saving the parameters that perform better on the test set. If the test loss does not improve after 400 iterations of training, the neural network stops training. Because only the best-performing parameters are saved, they are not replaced even if overfitting occurs during subsequent training.
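The interplay of learning rate halving, best-parameter saving, and early stopping can be sketched in a framework-independent way. The class below is hypothetical and is not the Keras implementation used in this paper; as a simplification, it monitors a single loss for both the learning rate schedule and the stopping rule. The defaults of 300 and 400 iterations follow the values quoted in the text, and the demonstration uses much shorter patience values.

```python
class TrainingMonitor:
    """Halve the learning rate on a plateau and stop when no progress is made."""

    def __init__(self, initial_lr=1e-4, lr_patience=300, stop_patience=400, factor=0.5):
        self.lr = initial_lr
        self.lr_patience = lr_patience
        self.stop_patience = stop_patience
        self.factor = factor
        self.best_loss = float("inf")
        self.best_params = None
        self.since_best = 0
        self.since_lr_change = 0

    def update(self, loss, params):
        """Record one evaluation; return True if training should stop."""
        if loss < self.best_loss:
            # Improvement: keep these parameters and reset both counters.
            self.best_loss = loss
            self.best_params = params
            self.since_best = 0
            self.since_lr_change = 0
        else:
            self.since_best += 1
            self.since_lr_change += 1
            if self.since_lr_change >= self.lr_patience:
                self.lr *= self.factor
                self.since_lr_change = 0
        return self.since_best >= self.stop_patience

# Short hypothetical loss history with reduced patience values.
monitor = TrainingMonitor(initial_lr=1e-4, lr_patience=3, stop_patience=5)
losses = [1.0, 0.8, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]
stopped_at = None
for step, loss in enumerate(losses):
    if monitor.update(loss, params={"step": step}):
        stopped_at = step
        break
```

Because the saved parameters change only when the monitored loss improves, later iterations that overfit cannot overwrite the best model.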

Through the sensitivity analysis of hyperparameters, the effect of each hyperparameter on the performance of the deep learning neural network can be observed more clearly, and the training of the deep learning neural network can be completed more effectively. Therefore, the hyperparameters of the deep learning neural network are determined by sensitivity analysis. The experiment details were as follows: The sensitivity analysis of the initial learning rate (see Table 5) was conducted with a batch size of 20, ELU as the activation function, and a learning rate decline strategy where the learning rate becomes half of the original when the test loss does not decrease after every 300 iterations. It can be found that when the initial learning rate is 1 × 10⁻³, the model appears to show overfitting. When the initial learning rate is too low, the model loss value is large and underfitting occurs. When the initial learning rate is 1 × 10⁻⁴, the model performs well.

Similarly, for the sensitivity analysis of the batch size (see Table 6), ELU was used as the activation function, the initial learning rate was set to 1 × 10⁻⁴, and the learning rate decline strategy was the same as before. In terms of loss, the model overfits when the batch size is 1 and performs best when the batch size is 10; as the batch size increases further, the performance of the model decreases. Taking into account the time required for each training iteration, the batch size of the model is set to 20.

For the sensitivity analysis of the activation function (see Table 7), the batch size is set to 20, the initial learning rate is 1.00 × 10⁻⁴, and the learning rate decline strategy is to halve the learning rate when the test loss does not decrease for 300 iterations. The results show that the ELU activation function is more advantageous for predicting the neutron flux.

For sensitivity analysis of the learning rate decline strategy (see Table 8), the batch size is 20, the activation function used is ELU, and the initial learning rate is 1.00 × 10⁻⁴. The strategy of adjusting the learning rate when test loss is not improved after 300 iterations can improve the performance of the model.

The training experiments above select the hyperparameters shared by all surrogate models that best fit the neutron flux. However, with an identical structure, the performance of the surrogate models differs because the composition of their output layers differs. Therefore, each surrogate model is trained separately, and a sensitivity analysis is conducted on both the number of hidden layers and the number of neurons in each hidden layer of every surrogate model. Table 9 shows the performance of Surrogate Model-1 for different numbers of hidden layers with 800 neurons in each layer. When the number of layers exceeds eight, the train loss is lower than for the other configurations but the test loss is higher, indicating more pronounced overfitting. Conversely, when the number of layers is three or four, the model’s losses are similar. Table 10 presents the performance of Surrogate Model-1 for varying numbers of neurons per hidden layer with a fixed configuration of four hidden layers. As the number of neurons increases to 1000, both the train loss and the test loss converge to a minimum, whereas performance deteriorates when the number of neurons reaches 1500. In conclusion, for optimal performance, Surrogate Model-1 should have three hidden layers, each with approximately 1000 neurons.

Table 11 illustrates the performance of Surrogate Model-2 across various hidden layers with 800 neurons in each layer. It is evident that as the number of layers exceeds six, the train loss decreases compared to other layers, while the test loss increases, indicating overfitting. The model reaches its minimum loss when there are four layers. Table 12 presents the performance of Surrogate Model-2 across varying numbers of hidden layer neurons, while maintaining a fixed configuration of four hidden layers. It is observed that as the number of neurons increases to 800, both training and test losses converge to a minimum. However, when the number of neurons reaches 1000, the train loss decreases but the test loss increases, suggesting overfitting. In conclusion, for optimal performance, it is determined that Surrogate Model-2 should consist of four hidden layers with approximately 800 neurons in each layer.

Table 13 illustrates the performance of Surrogate Model-3 across various hidden layers with 800 neurons in each layer. Consistent with the sensitivity analysis for Surrogate Model-2, when the number of layers exceeds six, the train loss decreases relative to other layers, while the test loss increases, indicating overfitting. The model attains its lowest loss when there are four layers. Table 14 presents the performance of Surrogate Model-3 across varying numbers of hidden layer neurons, while maintaining a fixed configuration of four hidden layers. When the number of neurons increases to 800, both the train loss and the test loss converge to a minimum. However, when the number of neurons reaches 1000, the train loss decreases while the test loss increases, suggesting overfitting. To sum up, for optimal performance, Surrogate Model-3 should consist of four hidden layers, each with approximately 800 neurons.

Table 15 illustrates the performance of Surrogate Model-4 across various hidden layers with 800 neurons in each layer. Consistent with the sensitivity analysis for Surrogate Model-2, when the number of layers exceeds six, the train loss decreases relative to other layers, while the test loss increases, indicating overfitting. The model attains its lowest loss when there are four layers. Table 16 presents the performance of Surrogate Model-4 across varying numbers of hidden layer neurons, while maintaining a fixed configuration of four hidden layers. When the number of neurons increases to 2000, both the train loss and the test loss converge to a minimum. When the number of neurons reaches 2500, the train loss decreases while the test loss remains unchanged, and overfitting occurs if the number of neurons continues to increase. To sum up, for optimal performance, Surrogate Model-4 should consist of four hidden layers, each with approximately 2000 neurons.

2.5. Model Evaluation

For a fixed source problem, the discrete ordinates method is used to simulate the transport calculation process.

The assessment of the surrogate model’s performance in predicting neutron flux distribution in the radiation field involves two key evaluation metrics: the loss value and the relative error. These metrics provide a comprehensive understanding of the model’s accuracy and predictive capabilities.

Loss Value Evaluation: The loss value of the test dataset serves as a quantitative measure of how well the surrogate model generalizes. Generalization refers to the ability of a deep learning neural network to make correct predictions for inputs that were not seen during training. Such a capability is essential in engineering applications. A lower loss value indicates better agreement between predicted and actual values within the test dataset. This metric is crucial for assessing the overall predictive performance of the neural network across various shielding problems.

Relative Error Evaluation: The relative error offers a normalized measure of the difference between calculated and reference values. It is defined as:

$$\mathrm{Error}_i = \frac{y_{\mathrm{pre},i} - y_{\mathrm{true},i}}{y_{\mathrm{true},i}}, \tag{9}$$

where y_pre is the value calculated by the neural network, y_true is the neutron flux calculated by the discrete ordinates method, and i is the grid number. This metric is used to evaluate the predictive performance of the neural network surrogate model for each grid in various shielding problems. A lower relative error signifies greater accuracy in the model’s predictions. In the shielding calculation, the flux at every grid is equally important, whatever its magnitude. Consequently, it is necessary to accurately calculate the flux at every position. However, MAE can only give the average absolute error over all grids. Furthermore, the absolute error can only assess the numerical discrepancy, whereas the relative error is the ratio of the calculated error to the actual value, which puts errors at grids with very different flux levels on an equal footing. Consequently, it is more appropriate to select the relative error as an additional evaluation criterion.
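A small numerical sketch may help show why the per-grid relative error of Equation (9) complements MAE. The three flux values below are hypothetical and chosen only to span several orders of magnitude; they are not taken from the validation use cases.

```python
def relative_errors(y_pre, y_true):
    """Per-grid relative error: (y_pre_i - y_true_i) / y_true_i."""
    return [(p - t) / t for p, t in zip(y_pre, y_true)]


def mean_absolute_error(y_pre, y_true):
    """Average absolute error over all grids."""
    return sum(abs(p - t) for p, t in zip(y_pre, y_true)) / len(y_true)


# Hypothetical reference and predicted fluxes on three grids.
y_true = [1.0e-2, 1.0e-5, 1.0e-9]
y_pre = [1.01e-2, 1.0e-5, 2.0e-9]

mae = mean_absolute_error(y_pre, y_true)
errors = relative_errors(y_pre, y_true)

# Index of the grid with the largest relative error in magnitude.
worst = max(range(len(errors)), key=lambda i: abs(errors[i]))
print(mae, errors, worst)
```

In this example the MAE is dominated by the high-flux grid, while the relative error exposes the low-flux grid, whose prediction is off by a factor of two.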

Together, the loss value and relative error metrics ensure a thorough assessment of the surrogate model’s performance in capturing the complexities of neutron flux distribution in diverse radiation field problems. These evaluation criteria serve as essential benchmarks for validating the reliability and accuracy of the deep learning neural network model in the context of shielding calculations.

3. Numerical Result Analysis

In this section, the accuracy and limitations of the research methods employed in this paper will be thoroughly analyzed and validated. The study involves the construction of four distinct surrogate models using deep learning neural networks, each tasked with predicting the neutron flux distribution at different spatial locations within a geometrically invariant problem. Specifically, one of the surrogate models (Surrogate Model-1) is dedicated to observing the neutron flux distribution within Shield Region 1 of the Kobayashi-1 geometric configuration, while Surrogate Model-2 and Surrogate Model-3 are focused on Shield Region 2. Surrogate Model-4 is employed to predict the distribution of neutron flux for shielding applications in accordance with the Kobayashi-2 geometric model.

Taking the results calculated by the discrete ordinates method as the reference, the predictions of Surrogate Model-1, Surrogate Model-2, and Surrogate Model-3 will be compared with the discrete ordinates calculations under three validation use cases (see Table 17 and Table 18). In addition, the predictions of Surrogate Model-4 will be compared with the discrete ordinates calculations under three further validation use cases (see Table 19, Table 20, Table 21 and Table 22).

3.1. Prediction of Neutron Flux Distribution in Kobayashi-1 Geometric Shield Region 1

The input layer of Surrogate Model-1 consists of seven neurons, representing the key parameters: the source strength, the total and scattering cross-sections of the source region, the total and scattering cross-sections of Shielding Region 1, and the total and scattering cross-sections of Shielding Region 2. The output layer consists of 2500 neurons, one for the neutron flux of each of the 2500 grids at x = 45.5 cm, 0 cm < y < 50 cm, and 0 cm < z < 50 cm.
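As a rough illustration of how the 2500 output neurons can be associated with grid positions, the sketch below enumerates the cell centres of the output plane. It assumes 1 cm cells with centres at half-integer coordinates (consistent with the plane being located at x = 45.5 cm) and a row-major ordering with y as the slow index; the ordering and the helper name `plane_cell_centres` are assumptions, not details given in the paper.

```python
def plane_cell_centres(y_min, y_max, z_min, z_max, step=1.0):
    """List the (y, z) centres of the cells covering a rectangular plane.

    Assumed ordering: y is the slow index, z is the fast index.
    """
    ny = round((y_max - y_min) / step)
    nz = round((z_max - z_min) / step)
    return [
        (y_min + (j + 0.5) * step, z_min + (k + 0.5) * step)
        for j in range(ny)
        for k in range(nz)
    ]


# Output plane of Surrogate Model-1: 0 cm < y < 50 cm, 0 cm < z < 50 cm.
centres = plane_cell_centres(0.0, 50.0, 0.0, 50.0)
print(len(centres), centres[0], centres[-1])
```

Under these assumptions, output neuron 0 corresponds to the cell centred at (y, z) = (0.5 cm, 0.5 cm) and the last neuron to the cell centred at (49.5 cm, 49.5 cm).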

The prediction performance of Surrogate Model-1 is verified using three validation use cases, and a single prediction by Surrogate Model-1 takes less than 0.1 s. The settings of the discrete ordinates transport calculation are as follows: the shielding model is divided into a uniform grid of 1 cm × 1 cm × 1 cm, the EDW scheme is selected, the iterative convergence criterion is 1 × 10⁻³, and the calculation uses the P_NT_N-S₂₄ quadrature set. Comparing the neutron flux distributions (see Figure 6, Figure 7 and Figure 8) and the error figures (Figure 9, Figure 10 and Figure 11) shows that the predictions of the surrogate model are almost consistent with the S_N calculation results for the three validation use cases, with a maximum error of 6.00 × 10⁻². In summary, when the flux changes little, the surrogate model can deliver a prediction within 0.1 s while retaining the accuracy of the discrete ordinates method.

3.2. Prediction of Neutron Flux Distribution in Kobayashi-1 Geometric Shield Region 2

The input layer of the deep learning neural network for Surrogate Model-2 consists of seven neurons, which represent the key parameters source strength, total cross-sections and scattering cross-sections of the source region, total cross-sections and scattering cross-sections of Shielding Region 1, and total cross-sections and scattering cross-sections of Shielding Region 2. The output layer consists of the neutron flux distribution of 10,000 grids at x = 55.5 cm, 0 cm < y < 100 cm, and 0 cm < z < 100 cm.

Three validation use cases are used to verify the prediction performance of Surrogate Model-2. A single prediction by Surrogate Model-2 takes less than 0.1 s. The settings of the discrete ordinates transport calculation are as follows: the shielding model is divided into a uniform grid of 1 cm × 1 cm × 1 cm, the EDW scheme is selected, the iterative convergence criterion is 1 × 10⁻³, and the calculation uses the P_NT_N-S₂₄ quadrature set. Comparing the prediction results (see Figure 12, Figure 13, Figure 14, Figure 15, Figure 16 and Figure 17) shows that the prediction error of the surrogate model is within ±50% across most of the grids, but in each use case the prediction error is larger on grids with low relative flux, where it exceeds ±100%. The performance of the surrogate model becomes increasingly unstable as the flux magnitude and output dimension of the model increase. A comparison of Table 10 and Table 12 also shows that the test loss of Surrogate Model-2 is significantly higher than that of Surrogate Model-1, which means that Surrogate Model-2 is more difficult to train than Surrogate Model-1. To investigate the cause of this increased prediction error, the output dimension of the surrogate model was reduced to match that of Surrogate Model-1.

The input layer of Surrogate Model-3 is still composed of seven neurons, which are the source strength of the source region, the total cross-sections and scattering cross-sections of the source region, the total cross-sections and scattering cross-sections of Shield Region 1, and the total cross-sections and scattering cross-sections of Shield Region 2. The output layer consists of the neutron flux distribution of 2500 grids at x = 55.5 cm, 50 cm < y < 100 cm, and 50 cm < z < 100 cm.
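To give a feel for what the reduced output dimension means for model size, the sketch below counts trainable parameters for the two configurations. It assumes plain fully connected layers with biases and the four hidden layers of 800 neurons selected in the sensitivity analysis; the function name `dense_parameter_count` is hypothetical, and the authors' actual implementation may differ.

```python
def dense_parameter_count(n_inputs, hidden_layers, n_neurons, n_outputs):
    """Count weights plus biases of a fully connected network."""
    sizes = [n_inputs] + [n_neurons] * hidden_layers + [n_outputs]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


# Surrogate Model-2: 7 inputs, 10,000 outputs (assumed 4 x 800 hidden).
model_2 = dense_parameter_count(7, 4, 800, 10_000)
# Surrogate Model-3: 7 inputs, 2500 outputs (assumed 4 x 800 hidden).
model_3 = dense_parameter_count(7, 4, 800, 2_500)
print(model_2, model_3)
```

Under these assumptions the output layer holds roughly 80% of the parameters of Surrogate Model-2 but only about half of those of Surrogate Model-3, so shrinking the output plane removes most of the parameters that must be fitted from the same seven inputs.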

Three validation use cases are used to verify the prediction performance of Surrogate Model-3. A single prediction by Surrogate Model-3 takes less than 0.1 s. The settings of the discrete ordinates transport calculation are as follows: the shielding model is divided into a uniform grid of 1 cm × 1 cm × 1 cm, the EDW scheme is selected, the iterative convergence criterion is 1 × 10⁻³, and the calculation uses the P_NT_N-S₂₄ quadrature set. Comparing the neutron flux distributions and the error figures (see Figure 18, Figure 19, Figure 20, Figure 21, Figure 22 and Figure 23) shows that the prediction error for most grids is within ±30%, and for grids with large relative flux it is within ±45%, while the prediction error of the surrogate model is still large when the neutron flux value is around 1.0 × 10⁻¹⁶. Compared with Surrogate Model-2, Surrogate Model-3 has better prediction ability, so reducing the output dimension effectively improves the prediction accuracy of the model.

The three surrogate models above were trained on transport data generated with the EDW difference scheme. However, the ARES transport code does not take into account angular flux values below 1.0 × 10⁻¹² when the EDW difference scheme is used. As a result, the reference flux near 1.0 × 10⁻¹² is not calculated precisely enough. This is the main reason for the poor performance of the surrogate model on the low-flux grids. Therefore, more attention should be paid to the prediction results on the other grids, where the low error shows that a surrogate model constructed with a deep learning neural network is feasible and reliable for neutron flux prediction.
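One way to act on this observation is to report error statistics separately for grids whose reference flux lies above and below the 1.0 × 10⁻¹² level. The sketch below does so for a handful of hypothetical flux values; applying the cutoff to the reference scalar flux (rather than to the angular flux inside the transport code) is an assumption made here purely for illustration.

```python
FLUX_CUTOFF = 1.0e-12  # level quoted in the text for the EDW scheme


def split_by_cutoff(y_pre, y_true, cutoff=FLUX_CUTOFF):
    """Split per-grid relative errors by the reference flux level."""
    reliable, low_flux = [], []
    for p, t in zip(y_pre, y_true):
        err = (p - t) / t
        (reliable if t >= cutoff else low_flux).append(err)
    return reliable, low_flux


def max_abs(values):
    """Largest magnitude in a list (0.0 for an empty list)."""
    return max((abs(v) for v in values), default=0.0)


# Hypothetical reference and predicted fluxes on four grids.
y_true = [3.0e-6, 4.0e-9, 5.0e-13, 2.0e-14]
y_pre = [3.3e-6, 3.6e-9, 1.5e-12, 1.0e-14]

reliable, low_flux = split_by_cutoff(y_pre, y_true)
print(max_abs(reliable), max_abs(low_flux))
```

In this toy example the largest error on the grids above the cutoff is about 10%, whereas the grids below it show errors of 50% and 200%, mirroring the pattern described for Surrogate Model-2 and Surrogate Model-3.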

3.3. Prediction of Neutron Flux Distribution in Kobayashi-2 Geometry

The input layer of Surrogate Model-4 is composed of 16 neurons, which are the source strength of the source region, the total cross-sections and scattering coefficients of the source region, the total cross-sections and scattering coefficients of Shielding Region 1, and the total cross-sections and scattering coefficients of Shielding Region 2. The output layer is the neutron flux of 2880 grids uniformly distributed in the shielded space.
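The count of 16 inputs follows from one source strength, three total cross-sections, and four scattering coefficients (P₀ to P₃) for each of the three regions, i.e., 1 + 3 + 3 × 4 = 16. The sketch below assembles such a vector from the values of validation use case 4 in Table 19 and Table 20; the ordering of the features and the helper name `build_input_vector` are assumptions, since the paper does not specify the layout.

```python
REGIONS = ["source", "shield_1", "shield_2"]


def build_input_vector(source_strength, sigma_t, scattering):
    """Flatten the inputs: S, then total cross-sections, then P0-P3
    scattering coefficients, each in region order (assumed layout)."""
    vector = [source_strength]
    for region in REGIONS:
        vector.append(sigma_t[region])
    for region in REGIONS:
        vector.extend(scattering[region])
    return vector


# Validation use case 4 (values from Table 19 and Table 20).
sigma_t = {"source": 1.80, "shield_1": 5.70e-3, "shield_2": 7.26e-1}
scattering = {
    "source": [1.02, 1.05e-1, 3.87e-2, 8.23e-3],
    "shield_1": [5.41e-4, 1.36e-4, 5.24e-5, 7.50e-6],
    "shield_2": [6.02e-1, 2.84e-1, 1.16e-1, 1.66e-2],
}

x = build_input_vector(2.00, sigma_t, scattering)
print(len(x), x)
```

Whatever ordering is chosen, it has to be identical for training and prediction so that each input neuron always receives the same physical quantity.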

Surrogate Model-4 extends the prediction range from 2D to 3D and is trained on SC transport Dataset-2. Three validation use cases were used to verify the prediction performance of Surrogate Model-4, and the prediction time of Surrogate Model-4 did not exceed 0.1 s. The settings of the discrete ordinates transport calculation are as follows: the shielding model is divided into a uniform grid of 1 cm × 1 cm × 1 cm, the SC scheme is selected, the iterative convergence criterion is 1 × 10⁻³, and the calculation uses the P_NT_N-S₂₄ quadrature set. Comparing the S_N results with those of Surrogate Model-4 (see Figure 24, Figure 25, Figure 26, Figure 27, Figure 28 and Figure 29) shows that the prediction error of the surrogate model does not exceed 10% on any grid for the three validation use cases with different characteristics. This clearly indicates that the high prediction error of Surrogate Model-2 and Surrogate Model-3 in the previous section, on the grids with actual flux less than 1.0 × 10⁻¹², is not due to inadequate training of the surrogate model. In summary, the surrogate model can deliver a prediction within 0.1 s while retaining the accuracy of the discrete ordinates method.

The comparison of the training processes and calculation results of the four surrogate models reveals that Surrogate Model-1, Surrogate Model-3, and Surrogate Model-4 exhibit lower train and test losses, as well as lower calculation errors. This indicates that when the output dimension of the surrogate model is limited to roughly 2500 to 2900 (compared with 10,000 for Surrogate Model-2), the calculation accuracy is higher and the training process is more straightforward. Similarly, the predictions of Surrogate Model-1 were more accurate than those of the other surrogate models, suggesting that the surrogate model predicts effectively when the flux variation is minor. When the flux variation is significant, a larger output dimension (Surrogate Model-2) is likely to reduce the calculation accuracy of the surrogate model. Surrogate Model-4 extends the prediction position to three-dimensional space. The calculation results demonstrate that the prediction error of Surrogate Model-4 is less than 10% for the strong anisotropy problem.

4. Summary

This paper explores a rapid method for obtaining neutron flux distribution using deep learning techniques. Comprehensive training data are generated with the discrete ordinates method for geometric models based on Kobayashi-1 and Kobayashi-2. A deep learning neural network is employed to train four surrogate models capable of predicting the neutron flux distribution at various locations. By constructing the dataset with information related to the discrete ordinates method, the interpretability between the input (transport parameters) and output (neutron flux) is enhanced, leading to improved reliability of the surrogate model. The results demonstrate that each individual prediction of the surrogate model takes less than 0.1 s. In terms of prediction accuracy, transport parameters with varying characteristics were used to validate the predictive performance of each surrogate model, and the reasons for the high errors on some grids were elucidated. The predicted grid positions were expanded from two dimensions to three dimensions. The efficacy of the surrogate model in neutron flux calculations is demonstrated through a comparison of the prediction accuracy of the surrogate models trained with the EDW and SC transport datasets. The surrogate model is capable of keeping the prediction error below 10%. Future research should consider more practical engineering problems, compare the results with those of other numerical calculation methods, and summarize and analyze the advantages and disadvantages of surrogate models and other numerical calculation methods.

Author Contributions

Conceptualization, Y.L.; methodology, Y.L.; software, Y.L.; validation, Y.L.; formal analysis, Y.L.; investigation, Y.L.; resources, Y.L. and B.Z.; data curation, Y.L.; writing—original draft preparation, Y.L.; writing—review and editing, B.Z. and S.Y.; visualization, Y.L. and B.Z.; supervision, B.Z., S.Y. and Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

S.Y. was employed by China Nuclear Power Engineering Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Bell, G.I.; Glasstone, S. Nuclear Reactor Theory; US Atomic Energy Commission: Washington, DC, USA, 1970.
Larsen, E.W. An Overview of Neutron Transport Problems and Simulation Techniques. Comput. Methods Transp. 2004, 48, 513–533.
Lewis, E.E.; Miller, W.F. Computational Methods of Neutron Transport; John Wiley and Sons, Inc.: New York, NY, USA, 1993.
Adams, M.L.; Larsen, E.W. Fast Iterative Methods for Discrete Ordinates Particle Transport Calculations. Prog. Nucl. Energy 2002, 40, 3–159.
Gong, C.Y.; Liu, J.; Chi, L.H.; Huang, H.W.; Fang, J.Y.; Gong, Z.H. GPU Accelerated Simulations of 3D Deterministic Particle Transport Using Discrete Ordinates Method. J. Comput. Phys. 2011, 230, 6010–6022.
Baker, R.S.; Koch, K.R. An S_N Algorithm for the Massively Parallel CM-200 Computer. Nucl. Sci. Eng. 1998, 128, 312–320.
Plimpton, S.; Hendrickson, B.; Burns, S.; McLendon, W. Parallel Algorithms for Radiation Transport on Unstructured Grids. In Proceedings of the SC ’00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, Dallas, TX, USA, 4–10 November 2000; p. 25.
Mo, Z.Y.; Zhang, A.Y.; Zhang, Y. A New Parallel Algorithm for Vertex Priorities of Data Flow Acyclic Digraphs. J. Supercomput. 2014, 68, 49–64.
Baker, R.S. A Block Adaptive Mesh Refinement Algorithm for the Neutral Particle Transport Equation. Nucl. Sci. Eng. 2002, 141, 1–12.
Zhang, H.; Lewis, E.E. Spatial Adaptivity Applied to the Variational Nodal P_N Equations. Nucl. Sci. Eng. 2002, 142, 57–63.
Wang, Y.Q.; Ragusa, J.C. Application of hp Adaptivity to the Multigroup Diffusion Equations. Nucl. Sci. Eng. 2009, 161, 22–48.
Lathouwers, D. Goal-Oriented Spatial Adaptivity for the S_N Equations on Unstructured Triangular Meshes. Ann. Nucl. Energy 2011, 38, 1373–1381.
Zhang, B.; Zhang, L.; Liu, C.; Chen, Y.X. Goal-Oriented Regional Angular Adaptive Algorithm for the S_N Equations. Nucl. Sci. Eng. 2018, 189, 120–134.
Zhang, L.; Zhang, B.; Chen, Y.X. Spatial Adaptive Algorithm for Discrete Ordinates Shielding Calculation. At. Energy Sci. Technol. 2018, 52, 2233–2242. (In Chinese)
Liu, C.; Zhang, B.; Zhang, L.; Chen, Y.X. Nonmatching Discontinuous Cartesian Grid Algorithm Based on the Multilevel Octree Architecture for Discrete Ordinates Transport Calculation. Nucl. Sci. Eng. 2020, 194, 1175–1201.
Liu, C.; Zhang, B.; Wang, X.Y.; Zhang, L.; Chen, Y.X. Reformulation and Evaluation of Robust Characteristic-based Discretization for the Discrete Ordinates Equation on Structured Hexahedron Grids. Prog. Nucl. Energy 2020, 126, 103403.
Allen, E.J. Stochastic Difference Equations and A Stochastic Partial Differential Equation for Neutron Transport. J. Differ. Equ. Appl. 2012, 18, 1267–1285.
Hajas, T.Z.; Tolnai, G.; Margoczi, M.; Legrady, D. Noise Term Modeling of Dynamic Monte Carlo Using Stochastic Differential Equations. Ann. Nucl. Energy 2024, 195, 110061.
Berry, J.J.; Gil-Delgado, G.G.; Osborne, A.G.S. Classification of Group Structures for a Multigroup Collision Probability Model Using Machine Learning. Ann. Nucl. Energy 2021, 160, 108367.
Schmidhuber, J. Deep Learning in Neural Networks: An Overview. Neural Netw. 2015, 61, 85–117.
Zhou, W.; Sun, G.M.; Yang, Z.H.; Wang, H.; Fang, L.; Wang, J.Y. BP Neural Network Based Reconstruction Method for Radiation Field Applications. Nucl. Eng. Des. 2021, 380, 111228.
Li, Z.G.; Sun, J.; Wei, C.L.; Sui, Z.; Qian, X.Y. A New Cross-sections Calculation Method in HTGR Engineering Simulator System Based on Machine Learning Methods. Ann. Nucl. Energy 2020, 145, 107553.
Zhu, Q.J.; Tian, L.C.; Yang, X.H.; Gan, L.F.; Zhao, N.; Ma, Y.Y. Advantages of Artificial Neural Network in Neutron Spectra Unfolding. Chin. Phys. Lett. 2014, 31, 69–72.
Cao, C.L.; Gan, Q.; Song, J.; Long, P.C.; Wu, B.; Wu, Y.C. A Two-Step Neutron Spectrum Unfolding Method for Fission Reactors Based on Artificial Neural Network. Ann. Nucl. Energy 2019, 139, 107219.
dos Santos, M.C.; Pinheiro, V.H.C.; do Desterro, F.S.M.; de Avellar, R.K.; Schirru, R.; dos Santos Nicolau, A.; de Lima, A.M.M. Deep Rectifier Neural Network Applied to the Accident Identification Problem in A PWR Nuclear Power Plant. Ann. Nucl. Energy 2019, 133, 400–408.
Cao, C.L.; Gan, Q.; Song, J.; Long, P.C.; Wu, B.; Wu, Y.C. An Artificial Neural Network-based Neutron Field Reconstruction Method for Reactor. Ann. Nucl. Energy 2020, 138, 107195.
Song, Y.M.; Zhao, Y.B.; Li, X.X.; Wang, K.; Zhang, Z.H.; Luo, W.; Zhu, Z.C. A Method for Optimizing the Shielding Structure of Marine Reactors. Nucl. Sci. Eng. 2017, 37, 355–361. (In Chinese)
Song, Y.M.; Zhang, Z.H.; Mao, J.; Lu, C.; Tang, S.Q.; Xiao, F.; Lyu, H.W. Research on Fast Intelligence Multi-objective Optimization Method of Nuclear Reactor Radiation Shielding. Ann. Nucl. Energy 2020, 149, 107771.
Vasseur, A.; Makovicka, L.; Martin, É.; Sauget, M.; Contassot-Vivier, S.; Bahi, J. Dose Calculations Using Artificial Neural Networks: A Feasibility Study for Photon Beams. Nucl. Instrum. Methods Phys. Res. Sect. B Beam Interact. Mater. At. 2008, 266, 1085–1093.
Wang, J.; Peng, X.; Chen, Z. Surrogate Modeling for Neutron Diffusion Problems Based on Conservative Physics-informed Neural Networks with Boundary Conditions Enforcement. Ann. Nucl. Energy 2022, 176, 109234.
LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
Fowler, T.B.; Vondy, D.R. Nuclear Reactor Core Analysis Code; ORNL-TM-2496; Oak Ridge National Laboratory (ORNL): Oak Ridge, TN, USA, 1969.
Semenza, L.A.; Lewis, E.E.; Rossow, E.C. The Application of the Finite Element Method to the Multigroup Neutron Diffusion Equation. Nucl. Sci. Eng. 1972, 47, 302–310.
Chen, Y.X.; Zhang, B.; Zhang, L.; Zheng, J.X.; Zheng, Y.; Liu, C. ARES: A Parallel Discrete Ordinates Transport Code for Radiation Shielding Applications and Reactor Physics Analysis. Sci. Technol. Nucl. Ins. 2017, 2017, 2596727.
Kobayashi, K.; Sugimura, N.; Nagaya, Y. 3D Radiation Transport Benchmark Problems and Results for Simple Geometries with Void Region. Prog. Nucl. Energy 2001, 39, 119–144.
Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Zheng, X. TensorFlow: Large Scale Machine Learning on Heterogeneous Distributed Systems. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283.
Glorot, X.; Bengio, Y. Understanding the Difficulty of Training Deep Feedforward Neural Networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 13–15 May 2010; pp. 249–256.
Clevert, D.A.; Unterthiner, T.; Hochreiter, S. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). arXiv 2015, arXiv:1511.07289.
Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
Smith, L.N. A Disciplined Approach to Neural Network Hyperparameters: Part 1–Learning Rate, Batch Size, Momentum, and Weight Decay. arXiv 2018, arXiv:1803.09820.

Figure 1. Basic structure of the neural network.

Figure 2. Kobayashi-1 geometry model.

Figure 3. Kobayashi-2 geometry model.

Figure 4. The fitting results of the neural network. (a) S_N calculation, (b) no standardization, (c) log standardization.

Figure 5. Prediction of neutron flux distributions by deep learning neural network.

Figure 6. Comparison of neutron flux for validation use case 1: (a) S_N calculation, (b) Surrogate Model-1 prediction.

Figure 7. Comparison of neutron flux for validation use case 2: (a) S_N calculation, (b) Surrogate Model-1 prediction.

Figure 8. Comparison of neutron flux for validation use case 3: (a) S_N calculation, (b) Surrogate Model-1 prediction.

Figure 9. Validation use case 1: The error between S_N calculation and Surrogate Model-1 prediction.

Figure 10. Validation use case 2: The error between S_N calculation and Surrogate Model-1 prediction.

Figure 11. Validation use case 3: The error between S_N calculation and Surrogate Model-1 prediction.

Figure 12. Comparison of neutron flux for validation use case 1: (a) S_N calculation, (b) Surrogate Model-2 prediction.

Figure 13. Comparison of neutron flux for validation use case 2: (a) S_N calculation, (b) Surrogate Model-2 prediction.

Figure 14. Comparison of neutron flux for validation use case 3: (a) S_N calculation, (b) Surrogate Model-2 prediction.

Figure 15. Validation use case 1: The error between S_N calculation and Surrogate Model-2 prediction.

Figure 16. Validation use case 2: The error between S_N calculation and Surrogate Model-2 prediction.

Figure 17. Validation use case 3: The error between S_N calculation and Surrogate Model-2 prediction.

Figure 18. Comparison of neutron flux for validation use case 1: (a) S_N calculation, (b) Surrogate Model-3 prediction.

Figure 19. Comparison of neutron flux for validation use case 2: (a) S_N calculation, (b) Surrogate Model-3 prediction.

Figure 20. Comparison of neutron flux for validation use case 3: (a) S_N calculation, (b) Surrogate Model-3 prediction.

Figure 21. Validation use case 1: The error between S_N calculation and Surrogate Model-3 prediction on 2500 grids at x = 55.5 cm, 50 cm < y < 100 cm, and 50 cm < z < 100 cm.

Figure 22. Validation use case 2: The error between S_N calculation and Surrogate Model-3 prediction on 2500 grids at x = 55.5 cm, 50 cm < y < 100 cm, and 50 cm < z < 100 cm.

Figure 23. Validation use case 3: The error between S_N calculation and Surrogate Model-3 prediction on 2500 grids at x = 55.5 cm, 50 cm < y < 100 cm, and 50 cm < z < 100 cm.

Figure 24. Comparison of neutron flux for validation use case 4: (a) S_N calculation, (b) Surrogate Model-4 prediction.

Figure 25. Comparison of neutron flux for validation use case 5: (a) S_N calculation, (b) Surrogate Model-4 prediction.

Figure 26. Comparison of neutron flux for validation use case 6: (a) S_N calculation, (b) Surrogate Model-4 prediction.

Figure 27. Validation use case 4: The error between S_N calculation and Surrogate Model-4 prediction.

Figure 28. Validation use case 5: The error between S_N calculation and Surrogate Model-4 prediction.

Figure 29. Validation use case 6: The error between S_N calculation and Surrogate Model-4 prediction.

Table 1. The range of source strength, total and scattering cross-sections of Dataset-1.

Zone	S (n·cm⁻³·s⁻¹)	Σ_t (cm⁻¹)	Σ_s (cm⁻¹)
Source region	1–1 × 10¹	5 × 10⁻²–1	5 × 10⁻²–1
Shield zone 1	0	1 × 10⁻⁴–5 × 10⁻²	1 × 10⁻⁴–5 × 10⁻²
Shield zone 2	0	5 × 10⁻²–1	5 × 10⁻²–1

Table 2. All nuclide compositions and atom densities of Dataset-2.

Zone	Nuclide	The Range of Atom Density (barn⁻¹·cm⁻¹)
Source region	²H	1 × 10⁻⁴–1 × 10⁻¹
	¹⁶O	1 × 10⁻⁴–1 × 10⁻¹
	²³⁵U	1 × 10⁻⁶–1 × 10⁻²
	²³⁸U	1 × 10⁻⁴–1 × 10⁻⁴
	⁵⁶Fe	1 × 10⁻⁷–1 × 10⁻⁴
	¹⁰B	1 × 10⁻⁸–1 × 10⁻⁵
	⁹¹Zr	1 × 10⁻⁵–1 × 10⁻²
	¹⁴C	1 × 10⁻⁸–1 × 10⁻⁵
Shield zone 1	²H	5 × 10⁻⁸–1 × 10⁻⁵
	¹⁴N	1 × 10⁻⁷–1 × 10⁻⁴
	¹⁶O	1 × 10⁻⁷–1 × 10⁻⁴
Shield zone 2	²H	1 × 10⁻⁴–1 × 10⁻¹
	¹⁶O	1 × 10⁻⁴–1 × 10⁻¹
	¹⁴C	1 × 10⁻⁶–1 × 10⁻³
	²⁷Al	1 × 10⁻⁵–1 × 10⁻²
	²⁸Si	1 × 10⁻⁴–1 × 10⁻¹
	³²S	1 × 10⁻⁶–1 × 10⁻³
	⁴⁰Ca	1 × 10⁻⁵–1 × 10⁻²
	⁵⁶Fe	1 × 10⁻⁵–1 × 10⁻²

Table 3. The range of source strength and total cross-sections of Dataset-2.

Zone	S (n·cm⁻³·s⁻¹)	Σ_t (cm⁻¹)
Source region	1–1 × 10¹	(1 × 10⁻¹–1) + Σ_s0
Shield zone 1	0	(1 × 10⁻⁴–1 × 10⁻²) + Σ_s0
Shield zone 2	0	(1 × 10⁻¹–1) + Σ_s0

Table 4. Train loss using different standardized methods.

Standardized Method	No Standardization	Log Standardization
Train loss	2.90 × 10⁻³	6.90 × 10⁻³

Table 5. Loss after 10,000 iterations with different initial learning rates.

Initial Learning Rates	1 × 10⁻³	1 × 10⁻⁴	1 × 10⁻⁵
Train loss	9.80 × 10⁻³	2.74 × 10⁻²	1.44 × 10⁻¹
Test loss	7.76 × 10⁻²	4.35 × 10⁻²	1.60 × 10⁻¹

Table 6. Loss values and time spent for different batch sizes.

Batch Size	1	10	20	50	100
Train loss	1.31 × 10⁻²	1.89 × 10⁻²	2.74 × 10⁻²	5.11 × 10⁻²	5.01 × 10⁻²
Test loss	5.63 × 10⁻²	3.56 × 10⁻²	4.35 × 10⁻²	7.70 × 10⁻²	7.01 × 10⁻²
Time Spent (s)	2.41	6.90 × 10⁻¹	5.00 × 10⁻¹	4.70 × 10⁻¹	4.30 × 10⁻¹

Table 7. Loss value for different activation functions.

Activation Function	Relu	ELU	Sigmoid	Tanh
Train loss	4.01 × 10⁻²	2.74 × 10⁻²	4.75	9.13 × 10⁻²
Test loss	9.36 × 10⁻²	4.35 × 10⁻²	4.51	2.90 × 10⁻¹

Table 8. Loss value of different learning rate decline strategies.

Learning Rate Decline	100	200	300	400
Train loss	1.12 × 10⁻¹	2.73 × 10⁻²	2.74 × 10⁻²	2.68 × 10⁻²
Test loss	1.41 × 10⁻¹	4.46 × 10⁻²	4.35 × 10⁻²	4.82 × 10⁻²

Table 9. Surrogate Model-1: Loss values of neural networks with different hidden layers.

Hidden Layers	2	3	4	6	8
Train loss	5.30 × 10⁻³	2.70 × 10⁻³	2.50 × 10⁻³	3.70 × 10⁻³	1.20 × 10⁻³
Test loss	7.40 × 10⁻³	4.90 × 10⁻³	5.20 × 10⁻³	6.10 × 10⁻³	1.85 × 10⁻²

Table 10. Surrogate Model-1: Loss values of neural networks with the number of neurons.

Number of Neurons	200	500	800	1000	1500
Train loss	7.60 × 10⁻³	3.20 × 10⁻³	2.50 × 10⁻³	2.40 × 10⁻³	3.30 × 10⁻³
Test loss	1.35 × 10⁻²	5.50 × 10⁻³	5.20 × 10⁻³	4.80 × 10⁻³	5.30 × 10⁻³

Table 11. Surrogate Model-2: Loss values of neural networks with different hidden layers.

Hidden Layers	3	4	6	8
Train loss	3.77 × 10⁻²	3.26 × 10⁻²	1.26 × 10⁻²	9.60 × 10⁻³
Test Loss	7.26 × 10⁻²	6.14 × 10⁻²	6.90 × 10⁻²	1.19 × 10⁻¹

Table 12. Surrogate Model-2: Loss value of neural network with the number of neurons.

Number of Neurons	200	500	800	1000	1500
Train loss	3.47 × 10⁻²	3.91 × 10⁻²	3.26 × 10⁻²	1.36 × 10⁻²	1.89 × 10⁻²
Test Loss	7.64 × 10⁻²	7.75 × 10⁻²	6.14 × 10⁻²	6.47 × 10⁻²	6.26 × 10⁻²

Table 13. Surrogate Model-3: Loss values of neural networks with different hidden layers.

Hidden Layers	3	4	6	7
Train loss	1.28 × 10⁻²	9.80 × 10⁻³	6.60 × 10⁻³	6.60 × 10⁻³
Test loss	1.90 × 10⁻²	1.43 × 10⁻²	1.65 × 10⁻²	2.01 × 10⁻²

Table 14. Surrogate Model-3: Loss value of neural network with the number of neurons.

Number of Neurons	200	500	800	1000	1500
Train loss	1.42 × 10⁻²	9.70 × 10⁻³	9.80 × 10⁻³	7.20 × 10⁻³	6.70 × 10⁻³
Test loss	1.93 × 10⁻²	1.55 × 10⁻²	1.43 × 10⁻²	1.46 × 10⁻²	1.53 × 10⁻²

Table 15. Surrogate Model-4: Loss values of neural networks with different hidden layers.

Hidden Layers	3	4	6	8
Train loss	1.50 × 10⁻³	1.30 × 10⁻³	7.57 × 10⁻⁴	7.70 × 10⁻⁴
Test loss	2.20 × 10⁻³	1.91 × 10⁻³	2.30 × 10⁻³	2.70 × 10⁻³

Table 16. Surrogate Model-4: Loss value of neural network with the number of neurons.

Number of Neurons	800	1000	1500	2000	2500
Train loss	1.40 × 10⁻³	1.30 × 10⁻³	8.52 × 10⁻³	6.86 × 10⁻⁴	6.23 × 10⁻⁴
Test loss	2.00 × 10⁻³	1.91 × 10⁻³	1.70 × 10⁻³	1.50 × 10⁻³	1.50 × 10⁻³

Table 17. Validation use cases 1–3: Source strength, total cross-sections, and scattering cross-sections values for Source region.

Validation Use Case	S (n·cm⁻³·s⁻¹)	Σ_t (cm⁻¹)	Σ_s (cm⁻¹)
1	9.26	9.24 × 10⁻¹	4.55 × 10⁻¹
2	8.29	6.84 × 10⁻¹	5.35 × 10⁻¹
3	5.32	3.23 × 10⁻¹	8.06 × 10⁻¹

Table 18. Validation use cases 1–3: total cross-sections and scattering cross-sections values for Shielding Zones 1 and 2.

Validation Use Case	Shielding Zone 1 Σ_t (cm⁻¹)	Shielding Zone 1 Σ_s (cm⁻¹)	Shielding Zone 2 Σ_t (cm⁻¹)	Shielding Zone 2 Σ_s (cm⁻¹)
1	3.47 × 10⁻²	8.20 × 10⁻³	3.55 × 10⁻¹	3.29 × 10⁻¹
2	3.02 × 10⁻²	2.33 × 10⁻²	1.21 × 10⁻¹	1.04 × 10⁻¹
3	3.61 × 10⁻²	1.35 × 10⁻²	5.59 × 10⁻¹	5.16 × 10⁻²

Table 19. Validation use cases 4–6: Material source strength and total cross-section for each region.

Validation Use Case	Source Region S (n·cm⁻³·s⁻¹)	Source Region Σ_t (cm⁻¹)	Shielding Zone 1 Σ_t (cm⁻¹)	Shielding Zone 2 Σ_t (cm⁻¹)
4	2.00	1.80	5.70 × 10⁻³	7.26 × 10⁻¹
5	7.60	1.58	8.44 × 10⁻³	1.49
6	3.19	2.81	5.92 × 10⁻³	1.74

Table 20. Validation use case 4: Scattering coefficients.

Zone	P₀ Scattering Coefficients (cm⁻¹)	P₁ Scattering Coefficients (cm⁻¹)	P₂ Scattering Coefficients (cm⁻¹)	P₃ Scattering Coefficients (cm⁻¹)
Source region	1.02	1.05 × 10⁻¹	3.87 × 10⁻²	8.23 × 10⁻³
Shield zone 1	5.41 × 10⁻⁴	1.36 × 10⁻⁴	5.24 × 10⁻⁵	7.50 × 10⁻⁶
Shield zone 2	6.02 × 10⁻¹	2.84 × 10⁻¹	1.16 × 10⁻¹	1.66 × 10⁻²

Table 21. Validation use case 5: Scattering coefficients.

Zone	P₀ Scattering Coefficients (cm⁻¹)	P₁ Scattering Coefficients (cm⁻¹)	P₂ Scattering Coefficients (cm⁻¹)	P₃ Scattering Coefficients (cm⁻¹)
Source region	1.32	3.20 × 10⁻¹	1.27 × 10⁻¹	2.09 × 10⁻²
Shield zone 1	6.35 × 10⁻⁴	1.38 × 10⁻⁴	5.30 × 10⁻⁵	7.59 × 10⁻⁶
Shield zone 2	1.34	6.57 × 10⁻¹	2.69 × 10⁻¹	3.85 × 10⁻²

Table 22. Validation use case 6: Scattering coefficients.

Zone	P₀ Scattering Coefficients (cm⁻¹)	P₁ Scattering Coefficients (cm⁻¹)	P₂ Scattering Coefficients (cm⁻¹)	P₃ Scattering Coefficients (cm⁻¹)
Source region	1.76	6.64 × 10⁻¹	2.69 × 10⁻¹	4.02 × 10⁻²
Shield zone 1	6.72 × 10⁻⁴	7.56 × 10⁻⁵	2.17 × 10⁻⁵	3.08 × 10⁻⁶
Shield zone 2	1.52	7.93 × 10⁻¹	3.25 × 10⁻¹	4.65 × 10⁻²

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, Y.; Zhang, B.; Yang, S.; Chen, Y. A Data-Driven Method for Calculating Neutron Flux Distribution Based on Deep Learning and the Discrete Ordinates Method. Energies 2024, 17, 3440. https://doi.org/10.3390/en17143440

