A Fast Framework for Generating Radioactive Mixture Spectra and Its Application to Remote High-Performance Mixture Identification

Kwan, Chiman; Ayhan, Bulent; Stavola, Adam; Islam, Kazi Aminul; Zhang, Hongfang; Li, Jiang

doi:10.3390/electronics14081688



by Chiman Kwan ¹,*, Bulent Ayhan ¹, Adam Stavola ², Kazi Aminul Islam ³, Hongfang Zhang ⁴ and Jiang Li ⁴

¹ Applied Research LLC, Rockville, MD 20850, USA
² Thomas Jefferson National Accelerator Facility, Newport News, VA 23606, USA
³ Department of Computer Science, Kennesaw State University, Kennesaw, GA 30144, USA
⁴ Department of Electrical and Computer Engineering, Old Dominion University, Norfolk, VA 23529, USA
* Author to whom correspondence should be addressed.

Electronics 2025, 14(8), 1688; https://doi.org/10.3390/electronics14081688

Submission received: 22 February 2025 / Revised: 8 April 2025 / Accepted: 16 April 2025 / Published: 21 April 2025


Abstract:

Remote detection of radioactive materials in mixtures using handheld or portal detectors remains challenging because of low concentrations, environmental interference, sensor noise, and other complications. This work introduces a fast framework for generating realistic mixture spectra and applies the generated data to mixture isotope identification. We examined a range of conventional and recent machine learning and deep learning algorithms. An application to uranium enrichment-level prediction is also included. Extensive simulation experiments validated the efficacy of the proposed framework.

Keywords: deep learning; GADRAS; machine learning; mixtures; radioactive isotopes; remote detection

1. Introduction

The unauthorized use of nuclear materials poses a significant threat to public security and social stability and requires effective interception at customs and border checkpoints. While detectors such as low-cost Sodium Iodide (NaI) and high-performance alternatives are available, their performance may degrade under conditions of low material concentration or when multiple isotopes are present.

Recent advances in machine learning and deep learning have shown great potential for improving the detection and classification of low-concentration nuclear materials in mixtures [1,2,3,4]. Nevertheless, several challenges remain. First, machine learning (ML) models typically require large amounts of training data. Simulation tools such as Gamma Detector Response and Analysis Software (GADRAS) [5] and GEometry ANd Tracking (Geant4) [6,7,8,9] can generate synthetic training data. However, GADRAS is available only to US government employees and contractors, and each license is limited to one person. Geant4 is powerful, but it offers many functionalities that may not be needed in mixture identification, its learning curve is steep, and, compared to GADRAS, it lacks a user-friendly interface. Simpler, license-free mixture generation software would help researchers working in nuclear power plants and private medical radiation research facilities to experiment with new identification algorithms. Second, the spectral data acquired by detectors typically contain background noise and various interferences [10], so robust algorithms are required for accurate material classification. Third, the presence of multiple nuclear materials may further complicate the spectral signatures and make accurate detection more difficult. Spectral unmixing may be needed to classify nuclear materials accurately and to estimate their relative count contributions within mixtures [11,12].

Traditional radiation spectrum analysis relies on examining specific regions of interest (ROIs) within the gamma ray spectrum [13,14,15,16]. As mentioned in [17,18], a key limitation of these approaches is their reduced performance when ROIs exhibit significant overlap with large libraries of radioisotopes. Recently, researchers have taken the entire spectrum into account for isotope identification, enhancing the accuracy of detecting and quantifying various isotopes [19,20]. A key advantage is that the Compton continuum can be taken into account, and using the entire spectrum has been shown to provide some tolerance to gain shift.

Several conventional methods have been developed for analyzing spectral signatures from material mixtures. Non-negativity Constrained Least Squares (NCLS) has been applied to chemical agent detection [21]. Partial Least Squares (PLS) has been utilized for rock composition analysis in Laser-Induced Breakdown Spectroscopy (LIBS) [22]. The Deep Belief Network (DBN) has been applied to hyperspectral image classification tasks [23]. In addition, linear regression (LR) and random forest regression (RFR) [24] are conventional machine learning tools that have been employed for unmixing analysis. A 2019 doctoral dissertation [18] applied deep learning techniques to isotope classification. However, that study did not address the detection or classification of mixtures.

Deep learning has achieved remarkable progress since the seminal work of Hinton’s group in 2012 [25]. Since then, it has been widely applied across various domains, including target detection and classification [26,27], stock market forecasting [28], land cover classification [29], image enhancement [30], and many others [31].

Although GADRAS can be used to generate realistic synthetic mixture spectra for training ML algorithms, the user interface in GADRAS is not tailored for massive spectra generation. It is tedious, labor intensive, and cumbersome to generate thousands of mixtures. It is, therefore, necessary to develop an efficient and fast mixture generation framework.

In this paper, we modified the single-isotope spectrum generation framework known as artificial neural networks for spectroscopic analysis (ANNSA) [18], developed by researchers at the University of Illinois at Urbana-Champaign, for multi-isotope mixture generation. The framework in [18] was originally used to generate single-source spectra at different signal-to-background ratios with several data augmentation parameters, and its output was used for isotope identification. We modified it to generate multi-isotope mixtures and to quantify (estimate the mixing ratios of) the isotopes present in these mixtures. One key reason for the modification is that it allows us to easily generate thousands of mixtures in our experiments. Generating thousands of mixtures using GADRAS would be too tedious and time consuming because some parameters may need to be entered manually. Moreover, as mentioned earlier, GADRAS is limited to US government employees and contractors, so civilians such as medical researchers may not be able to obtain licenses. The modified framework integrates several augmentation parameters into the spectrum generation process, such as integration time, background count rate, signal-to-background ratio, and calibration. With it, one can form multi-isotope mixtures at a user-set signal-to-background ratio and with several other detector and augmentation parameters, such as shielding and shielding density. The framework outputs the foreground and background spectra. It then generates the measured spectrum (foreground + background) through a Poisson process, which gives the measured mixture spectrum realistic counting statistics.
For the three datasets investigated (homogeneous, slightly heterogeneous, and heterogeneous), one deep learning-based algorithm outperformed the other methods by yielding lower root-mean-squared error (RMSE) values.

The main contributions of this work are summarized as follows. First, we propose a novel and efficient framework for rapid generation of mixture spectra. Second, we investigated conventional and deep learning algorithms for relative count contribution estimation for mixtures generated using the proposed fast framework. Third, we applied the proposed framework and various algorithms to uranium enrichment-level prediction.

The remainder of this paper is organized as follows. Section 2 summarizes the fast mixture spectra generation framework. Section 3 presents the two-isotope mixing ratio estimation results obtained with several ML/DL algorithms. Section 4 applies the proposed framework to uranium enrichment-level estimation. Finally, Section 5 provides concluding remarks.

2. Multi-Isotope Spectrum Mixture Generation

2.1. Background

A related study [18] investigated the identification of isotopes in gamma ray spectra. Unlike our work, it considered only single-isotope detection and did not address isotope mixtures or quantification. In that work, isotope and background gamma ray spectra were simulated using GADRAS with a custom NaI detector while varying detector parameters such as the source–detector distance, detector height, FWHM (full width at half maximum), shielding material, and shielding density. The simulated isotope and background spectra served as templates. In addition, [18] used several augmentation parameters so that augmented spectra of these single-isotope templates could be created with different background count rates, signal-to-background ratios, integration times, calibrations, etc. We adapted this single-isotope generation framework for multiple-isotope mixture generation at a user-set signal-to-background ratio (SBR). The modified framework can thus be used not only for isotope identification but also for quantifying the isotopes in a mixture. Before using these foreground and background templates to create isotope mixtures, we first investigated how GADRAS combines multiple-isotope spectra (each with its own activity) when forming a mixed spectrum.

2.2. Examining Spectrum Mixing in GADRAS

For this investigation, the following isotope simulations (single isotopes at different activities and a three-isotope mixture) were conducted in GADRAS with a NaI detector (detector height = 56 cm, detector-to-material distance = 122 cm, no Poisson noise):

¹³⁷Cs, 9.7687 µCi (single isotope);
²²³Ra, 790.07 nCi (single isotope);
²³⁵U, 977.4 nCi (single isotope);
¹³⁷Cs, 150 µCi (single isotope);
¹³⁷Cs, 9.7687 µCi + ²²³Ra, 790.07 nCi + ²³⁵U, 977.4 nCi (three-source mixture).

By multiplying the “¹³⁷Cs, 150 µCi” spectrum by the scalar coefficient 9.7687/150 (since 150 µCi × 9.7687/150 = 9.7687 µCi), we tested the scaling operation by checking whether the resultant spectrum equals the GADRAS-simulated spectrum for “¹³⁷Cs, 9.7687 µCi”. Figure 1 shows that the two spectra (computed, dotted blue line; simulated, solid blue line) are almost the same.

By adding the three separate GADRAS-simulated spectra “¹³⁷Cs, 9.7687 µCi”, “²²³Ra, 790.07 nCi”, and “²³⁵U, 977.4 nCi”, we tested the mixing operation by checking whether the computed spectrum equals the GADRAS-simulated spectrum for the three-isotope mixture “¹³⁷Cs, 9.7687 µCi + ²²³Ra, 790.07 nCi + ²³⁵U, 977.4 nCi”. As can be seen from Figure 2, the two spectra are almost the same. This shows that, when forming a mixture of multiple isotopes, GADRAS adds the spectra of the isotopes at their respective activities, which indicates linear mixing.

In summary, as we anticipated, if template spectra with various detector parameter variations are generated for individual isotopes for a specific detector, or the isotope and background templates from [18] are used, simple addition and multiplication operators can be used to simulate multiple-isotope mixtures using these individual source and background templates.
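The scaling and mixing checks described above can be sketched as follows. This is a minimal illustration, not the comparison code used in the study: `toy_spectrum` is a hypothetical stand-in for a GADRAS run (a single Gaussian peak at an arbitrary channel, with height proportional to activity), so it is linear by construction and only demonstrates the comparison procedure. With real data, the simulated spectra would be loaded from GADRAS output instead.

```python
import numpy as np

CHANNELS = np.arange(1024)

def toy_spectrum(peak_channel, peak_width, activity):
    """Hypothetical stand-in for a simulator run: one Gaussian peak whose
    height is proportional to the source activity (arbitrary channels)."""
    return activity * np.exp(-0.5 * ((CHANNELS - peak_channel) / peak_width) ** 2)

def scale_spectrum(spectrum, activity_from, activity_to):
    """Rescale a spectrum simulated at activity_from to activity_to (same units)."""
    return spectrum * (activity_to / activity_from)

def mix_spectra(spectra):
    """Linear mixing: channel-wise sum of single-isotope spectra."""
    return np.sum(spectra, axis=0)

def max_abs_difference(computed, simulated):
    """Largest channel-wise deviation between two spectra."""
    return float(np.max(np.abs(computed - simulated)))

# Scaling check: 150 uCi spectrum scaled by 9.7687/150 vs. a 9.7687 uCi run.
cs_150 = toy_spectrum(662, 20.0, 150.0)
cs_low = toy_spectrum(662, 20.0, 9.7687)
scaling_error = max_abs_difference(scale_spectrum(cs_150, 150.0, 9.7687), cs_low)

# Mixing check: sum of three single-isotope runs vs. a three-source run.
ra = toy_spectrum(270, 15.0, 0.79007)   # 790.07 nCi expressed in uCi
u = toy_spectrum(186, 12.0, 0.9774)     # 977.4 nCi expressed in uCi
simulated_mixture = cs_low + ra + u     # placeholder for a simulated mixture
mixing_error = max_abs_difference(mix_spectra([cs_low, ra, u]), simulated_mixture)

print(scaling_error, mixing_error)
```

Both differences are at floating-point level for the toy data; with real simulator output, a small nonzero tolerance would be needed.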

2.3. New Spectral Data Generation Framework

Because linear spectral mixing is observed in GADRAS when simulating multiple-isotope mixtures, we used foreground (source) and background templates to form the multi-isotope mixture training and test datasets. Instead of running GADRAS to simulate each parameter variation on the mixture, we exploited linear mixing and used the individual-isotope spectra simulated with GADRAS under those parameter variations. The work in [18] considered the single-isotope identification problem and only augmented single-isotope spectra with various parameter variations using a framework called ANNSA. We first modified the ANNSA framework in [18] so that it can generate multiple-isotope mixture spectra with these variations. In the following, we describe this modification.

The work in [18] studied isotope identification in gamma ray spectra, including spectra in mixture form. In that work, the term “relative count contribution” was used rather than mixing ratio or activity for the isotopes that form the mixture. In this work, we use the term “mixing ratio” to refer to relative count contribution. The modified ANNSA framework treats the background as if it were an isotope in the mixture. The background’s mixing ratio (relative count contribution) in the mixture is assigned according to the user-defined signal-to-background ratio. In the following, we describe this modification, followed by the results.

To introduce the modification steps, we will consider a two-isotope mixture case in which one of the isotopes in the mixture is denoted by X and the other isotope is denoted by Y. The measured gamma ray spectrum for this two-isotope mixture including background is then denoted by Ms. The background spectrum portion in Ms is denoted by Bs. Suppose Xs and Ys correspond to the individual spectra for the two isotopes. Ms can then be depicted as follows which is decomposed into the background and the two isotopes:

Ms = Xs + Ys + Bs

(1)

Let T denote the total number of counts in Ms, and let the mixing ratios (relative count contributions) of Xs, Ys, and Bs be denoted by r_Xs, r_Ys, and r_Bs, respectively, where r_Xs + r_Ys + r_Bs = 1. The count contributions of Xs, Ys, and Bs are then T⋅r_Xs, T⋅r_Ys, and T⋅r_Bs. Let SBR denote the signal-to-background ratio. In terms of the count contributions from the sources and the background, SBR can be expressed as follows:

\begin{aligned} SBR &= \text{source counts} / \text{background counts} \\ &= (T \cdot r_{Xs} + T \cdot r_{Ys}) / (T \cdot r_{Bs}) \\ &= (r_{Xs} + r_{Ys}) / r_{Bs} \end{aligned}

(2)

Using (2) and considering r_Xs + r_Ys + r_Bs = 1, r_Bs is found to be equal to 1/(SBR + 1), and (r_Xs + r_Ys) is found to be equal to SBR/(SBR + 1). The mixture spectrum, Ms, can be written as follows:

\begin{aligned} Ms &= T \cdot {Ms}_{1}^{norm} \\ &= T \cdot r_{Xs} \cdot {Xs}^{norm} + T \cdot r_{Ys} \cdot {Ys}^{norm} + T \cdot r_{Bs} \cdot {Bs}^{norm}, \end{aligned}

(3)

where Ms₁^norm, Xs^norm, Ys^norm, and Bs^norm are the normalized spectra for Ms, Xs, Ys, and Bs, respectively. r_Xs and r_Ys can then be randomly selected or manually set such that their sum (r_Xs + r_Ys) is equal to 1 − r_Bs.
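The relations above can be checked numerically with a short sketch. The function names and the four-channel templates below are hypothetical and chosen only for illustration; the arithmetic follows (2) and (3), with r_Bs = 1/(SBR + 1) and the two source ratios sharing the remaining SBR/(SBR + 1).

```python
import random

def background_ratio(sbr):
    """r_Bs = 1 / (SBR + 1), from (2) together with r_Xs + r_Ys + r_Bs = 1."""
    if sbr < 0:
        raise ValueError("SBR must be non-negative")
    return 1.0 / (sbr + 1.0)

def two_isotope_ratios(sbr, r_x=None, rng=None):
    """Return (r_Xs, r_Ys, r_Bs); r_Xs is drawn at random if not given."""
    r_b = background_ratio(sbr)
    source_total = 1.0 - r_b              # equals SBR / (SBR + 1)
    if r_x is None:
        rng = rng or random.Random()
        r_x = rng.uniform(0.0, source_total)
    if not 0.0 <= r_x <= source_total:
        raise ValueError("r_x must lie in [0, 1 - r_Bs]")
    return r_x, source_total - r_x, r_b

def mixture_spectrum(total_counts, ratios, normalized_templates):
    """Equation (3): T times the ratio-weighted sum of normalized templates."""
    n_channels = len(normalized_templates[0])
    return [
        total_counts * sum(r * t[ch] for r, t in zip(ratios, normalized_templates))
        for ch in range(n_channels)
    ]

# Example with SBR = 3: the background holds 1/4 of the counts.
r_x, r_y, r_b = two_isotope_ratios(sbr=3.0, r_x=0.5)
print(r_x, r_y, r_b)  # 0.5 0.25 0.25

# Toy normalized templates (each sums to 1) for Xs, Ys, and Bs.
xs_norm = [0.5, 0.5, 0.0, 0.0]
ys_norm = [0.0, 0.25, 0.75, 0.0]
bs_norm = [0.25, 0.25, 0.25, 0.25]
ms = mixture_spectrum(1000.0, (r_x, r_y, r_b), (xs_norm, ys_norm, bs_norm))
print(sum(ms))  # total counts T = 1000.0
```

Because every template sums to 1 and the ratios sum to 1, the mixture always carries exactly T counts before any counting noise is applied.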

Consider N two-isotope mixture spectra, each with M channels, formed from a pool of K isotopes (Xs^norm, Ys^norm, …, Zs^norm). The regression problem can then be formulated as shown in (4). The modified formulation includes the background, Bs^norm, as if it were an isotope and also estimates its mixing ratio (relative count contribution).

\begin{bmatrix} {Ms}_{1}^{norm} \\ {Ms}_{2}^{norm} \\ \vdots \\ {Ms}_{N}^{norm} \end{bmatrix} = \begin{bmatrix} r_{1}^{Xs} & r_{1}^{Ys} & \cdots & r_{1}^{Zs} & r_{1}^{Bs} \\ r_{2}^{Xs} & r_{2}^{Ys} & \cdots & r_{2}^{Zs} & r_{2}^{Bs} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ r_{N}^{Xs} & r_{N}^{Ys} & \cdots & r_{N}^{Zs} & r_{N}^{Bs} \end{bmatrix} \begin{bmatrix} {Xs}^{norm} \\ {Ys}^{norm} \\ \vdots \\ {Zs}^{norm} \\ {Bs}^{norm} \end{bmatrix}

(4)

It should be noted that due to the variation in detector-related parameters (such as source distance, height, shielding, shielding density, etc.) and augmentation parameters, there is not a unique gamma ray signature that can represent an isotope. In the mixture gamma ray spectrum generation phase, the isotope templates are picked from a large source template pool in which these templates are simulated with different detector parameters (source distance, height, shielding, shielding density, etc.) using GADRAS. Similarly, for the background template, B, of a specific mixture, background is also picked from the background template pool.

PLS, LR, RFR, and Deep Regression methods are thus found to be more suitable in a scenario like this since they do not directly require unique isotope signatures but rather only mixture spectra and the corresponding mixing ratios (relative count contribution rates) for the isotopes and background in the mixture. Additionally, the spectrum data are affected by Poisson noise due to the randomness of counting events in the detector during actual spectrum measurements. Poisson noise is a statistical noise with a variance proportional to the event counts. At lower count levels, the noise becomes more noticeable and increases errors in estimating the mixing ratio. However, the spectral data generated using the GADRAS template in this study have adequately accounted for multiple parameter variations. Moreover, the consistency of the training and test datasets was maintained, so the effect of Poisson noise did not change the overall performance trend of the experimental method.
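The count dependence of Poisson noise can be made concrete with a small calculation. For a Poisson-distributed channel with expected value N, the standard deviation is the square root of N, so the relative fluctuation is 1/√N. The count levels below are arbitrary illustrative values, not settings from this study.

```python
import math

def relative_poisson_noise(expected_counts):
    """Relative standard deviation sqrt(N) / N = 1 / sqrt(N) of a Poisson count."""
    if expected_counts <= 0:
        raise ValueError("expected_counts must be positive")
    return 1.0 / math.sqrt(expected_counts)

for counts in (100, 10_000, 1_000_000):
    print(counts, relative_poisson_noise(counts))
# 100 counts -> 10 % relative noise, 10,000 -> 1 %, 1,000,000 -> 0.1 %
```

This is why short integration times or low count rates make mixing ratio estimation harder: the fluctuation in each channel is larger relative to the signal.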

2.4. Processing Steps in the Framework

With the modified framework, one can generate a multi-isotope mixture with a user-defined signal-to-background ratio. A Poisson process is also included at the end to create a mixture spectrum with realistic counting statistics. The block diagram for multi-source mixture spectrum data generation is shown in Figure 3. The block diagram is for a two-source mixture generation; however, the framework can be extended for more than two-isotope mixtures in a similar fashion. The block diagram provides the gamma ray mixture spectrum simulation processing steps. The following processing steps are undertaken:

Choose templates: Source (foreground) and background templates are chosen from the GADRAS-simulated template libraries. Note that these templates are not publicly available and must be generated using licensed GADRAS under specific simulation settings.
Normalize templates: The chosen templates are normalized with respect to the sum of channel counts.
Assign mixing ratios: The mixing ratio for the background is assigned based on the signal-to-background ratio specified by the user. The mixing ratios of the sources (foreground) are then either randomly selected or manually set such that the assigned mixing ratios for the background and sources sum to 1.
Form the source spectrum: The source templates, each multiplied by its mixing ratio, are added to form the source spectrum.
Rebin: The source and background templates are rebinned using the “Calibration” parameters, which define a quadratic with three terms: a constant rebinning term (the offset), a linear rebinning term (the gain), and an optional quadratic rebinning term (the non-linear term). Cubic interpolation is used to find the spectrum values at the rebinned channels. This step is applied to the source and background templates separately.
Apply the low-level discriminator (LLD): All spectrum values at and below the LLD parameter are set to 0. This step is applied to the source and background templates separately.
Scale: The mixed-source and background spectra are scaled with the total counts, calculated as the sum of the source (foreground) counts and the background counts as expressed in (5). The source and background counts are computed from the “Integration time”, “Background count rate”, and “Signal-to-background ratio” parameters as described in (6) and (7), respectively, where background_cps denotes the background counts per second.

$\text{total\_counts} = \text{source\_counts} + \text{background\_counts}$

(5)

$\text{source\_counts} = \text{background\_cps} \cdot \text{integration\_time} \cdot \text{signal\_to\_background}$

(6)

$\text{background\_counts} = \text{background\_cps} \cdot \text{integration\_time}$

(7)
Form final measured spectra: The source and background spectra are added, and a Poisson process is applied to create a simulated measured spectrum with realistic counting statistics for the mixed sources and the corresponding background.

Figure 3. Block diagram of the multi-isotope mixture generation framework.
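The processing steps above can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the ANNSA implementation: the templates are hypothetical Gaussian shapes rather than GADRAS output, linear interpolation (`np.interp`) is swapped in for the cubic interpolation mentioned in the text, the direction of the calibration mapping is assumed, and each spectrum is renormalized after rebinning and LLD so that its total matches (6) and (7).

```python
import numpy as np

def normalize(template):
    """Normalize a template by the sum of its channel counts."""
    template = np.asarray(template, dtype=float)
    return template / template.sum()

def rebin(spectrum, offset=0.0, gain=1.0, quadratic=0.0):
    """Resample at channel positions offset + gain*c + quadratic*c**2.
    Assumption: linear interpolation stands in for cubic interpolation."""
    channels = np.arange(spectrum.size, dtype=float)
    positions = offset + gain * channels + quadratic * channels ** 2
    return np.interp(positions, channels, spectrum, left=0.0, right=0.0)

def apply_lld(spectrum, lld):
    """Zero all channels at and below the low-level discriminator."""
    out = spectrum.copy()
    out[: lld + 1] = 0.0
    return out

def generate_mixture(source_templates, source_ratios, background_template, *,
                     integration_time, background_cps, sbr,
                     calibration=(0.0, 1.0, 0.0), lld=0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # Normalize the chosen templates.
    sources = [normalize(t) for t in source_templates]
    background = normalize(background_template)
    # Form the source spectrum as the ratio-weighted sum of source templates.
    source = sum(r * s for r, s in zip(source_ratios, sources))
    # Rebin and apply the LLD to source and background separately.
    source = apply_lld(rebin(source, *calibration), lld)
    background = apply_lld(rebin(background, *calibration), lld)
    # Counts from (6) and (7); their sum is the total in (5).
    background_counts = background_cps * integration_time
    source_counts = background_counts * sbr
    # Assumption: renormalize so each spectrum carries its expected counts.
    source = source / source.sum() * source_counts
    background = background / background.sum() * background_counts
    # Poisson process on the summed expectation gives the measured spectrum.
    measured = rng.poisson(source + background)
    return source, background, measured

# Toy demonstration with hypothetical templates (256 channels).
channels = np.arange(256)

def toy_template(center, width):
    return np.exp(-0.5 * ((channels - center) / width) ** 2) + 0.01

source, background, measured = generate_mixture(
    [toy_template(80, 4.0), toy_template(150, 6.0)], [0.4, 0.2],
    toy_template(30, 40.0),
    integration_time=60.0, background_cps=200.0, sbr=1.5,
    calibration=(1.5, 0.98, 0.0), lld=10, rng=np.random.default_rng(0),
)
print(source.sum(), background.sum(), measured.sum())
```

With SBR = 1.5, the background ratio is 0.4 and the two source ratios (0.4 and 0.2) share the remaining 0.6, so the example is consistent with the ratio assignment in Section 2.3. The sketch does not guard against a spectrum that is entirely removed by the LLD.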

2.5. Detector and Augmentation Parameters

A NaI detector was used to form the source and background templates in [18], and we used these templates to form the mixed-spectrum training and test datasets in our modified framework. Other detectors could also be used, with source and background templates formed accordingly. The detector described in [18] simulates the Ortec 905-3 2 × 2 in NaI(Tl) detector, which is incorporated in the Algorithm Improvement Program (AIP) software package developed by the Department of Homeland Security. The GADRAS parameters used to simulate this detector are shown in Figure 4. In [18], the default energy calibration of the NaI detector model was modified to simulate template spectra so that the detector-measured energies range from 0 MeV to 3.5 MeV. This was done by setting the calibration offset (Order 0 in E) to zero and the calibration gain (Order 1 in E) to 3500. The default number of channels was changed to 1194. The default spectrum length used was 3 MeV.

The simulated source templates in [18] correspond to a total of 29 isotopes. The isotopes used in the source template dataset are derived from the ANSI N42.34-2006 standard [32] for isotope identification devices [18] and consist of ²⁴¹Am, ¹³³Ba, ⁵⁷Co, ⁶⁰Co, ⁵¹Cr, ¹³⁷Cs, ¹⁵²Eu, ⁶⁷Ga, ¹²³I, ¹²⁵I, ¹³¹I, ¹¹¹In, ¹⁹²Ir, ¹⁷⁷ᵐLu, ⁹⁹Mo, ²³⁷Np, ¹⁰³Pd, ²³⁹Pu, ²⁴⁰Pu, ²²⁶Ra, ⁷⁵Se, ¹⁵³Sm, ⁹⁹ᵐTc, ²⁰¹Tl, ²⁰⁴Tl, ²³³U, ²³⁵U, ²³⁸U, and ¹³³Xe. The number of source templates in [18] with all parameter variations is 65,975. The parameter values for the NaI detector are listed below.

Source height (cm): 50, 75, 100, 125, 150;
Source distance (cm): 50, 112.5, 175, 237.5, 300;
Shielding: alum, iron, lead, none;
Shielding density (g/cm²): 1.82 (alum), 4.18 (alum), 7.49 (alum), 1.53 (iron), 3.50 (iron), 6.28 (iron), 0.22 (lead), 0.51 (lead), 0.92 (lead), 0.0 (none);
FWHM at 662 keV (%): 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0.

The number of background templates in [18] is 84 and consists of the following parameters:

FWHM at 662 keV (%): 6, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0;
Location: Albuquerque, Atlanta, Austin, Chicago, Knoxville, Miami;
Cosmic: 0, 1 (0 indicating cosmic effect is not included and 1 indicating it is included).
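The background template count follows directly from the parameter grid: 7 FWHM values × 6 locations × 2 cosmic settings = 84. A short enumeration makes this explicit; the variable names are illustrative only.

```python
from itertools import product

fwhm_percent = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0]
locations = ["Albuquerque", "Atlanta", "Austin", "Chicago", "Knoxville", "Miami"]
cosmic = [0, 1]  # 0: cosmic effect not included, 1: included

# One background template per combination of the three parameters.
background_grid = list(product(fwhm_percent, locations, cosmic))
print(len(background_grid))  # 84
```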

Other than the detector- and background-specific parameters mentioned above, for multi-isotope mixture simulation we adopted the same augmentation parameters as [18]. These augmentation parameters are listed below. Among them, the “mixing ratio” parameter is a new one that we added for multi-isotope mixture generation; the other parameters were used in [18] when generating augmented single-isotope source spectra. Varying these augmentation parameters during mixture spectrum generation creates diversity in the training and test spectrum datasets.

Integration time (s);
Background count rate (background cps);
Signal-to-background ratio;
Calibration;
Low-level discriminator (LLD) parameter;
Mixing ratio for each source and background in the mixture.

3. Investigations with the Two-Isotope Mixtures

3.1. Mixture Identification Algorithms

A brief technical description of the four applied methods for estimating relative count contribution or mixing ratios is presented in this section. Three methods are developed using machine learning techniques, and one is based on deep learning.

Partial Least Squares (PLS)

Let a process be represented by $Y = X \cdot B + E$, where $X \in R^{N \times m}$ and $Y \in R^{N \times l}$ denote the input and output data matrices, respectively, and $B \in R^{m \times l}$ denotes the parameter matrix. Assume $X$ is defined as

X = \begin{bmatrix} x_{1}' \\ x_{2}' \\ \vdots \\ x_{N}' \end{bmatrix} \in R^{N \times m},

where $x_{i} \in R^{m}$ represents the $i$th input observation [22]. In PLS, let $X$ represent the gamma ray spectra and $Y$ denote the corresponding radioactive material compositions. The PLS model aims to predict $Y$ from $X$ in the linear relationship $Y = XB$ by estimating the coefficient matrix $B$.
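One common way to estimate B in a PLS model is the NIPALS algorithm. The sketch below is a compact NumPy version written for illustration; it is not the implementation used in this study, and the toy data (random templates and Dirichlet-distributed mixing ratios) are hypothetical. X and Y are mean-centered before fitting, which the model statement above does not spell out.

```python
import numpy as np

def pls_fit(X, Y, n_components, max_iter=500, tol=1e-12):
    """Estimate B in Y ~ X B with NIPALS PLS (X and Y are mean-centered)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xr, Yr = X - x_mean, Y - y_mean           # residual (deflated) matrices
    W, P, Q = [], [], []
    for _ in range(n_components):
        u = Yr[:, np.argmax(Yr.var(axis=0))].copy()
        t_old = None
        for _ in range(max_iter):
            w = Xr.T @ u
            w /= np.linalg.norm(w)            # X weight vector
            t = Xr @ w                        # X score vector
            q = Yr.T @ t / (t @ t)            # Y loading vector
            u = Yr @ q / (q @ q)              # Y score vector
            if t_old is not None and np.linalg.norm(t - t_old) <= tol * np.linalg.norm(t):
                break
            t_old = t
        p = Xr.T @ t / (t @ t)                # X loading vector
        Xr = Xr - np.outer(t, p)              # deflate X
        Yr = Yr - np.outer(t, q)              # deflate Y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = (np.column_stack(m) for m in (W, P, Q))
    B = W @ np.linalg.inv(P.T @ W) @ Q.T      # coefficients for centered data
    return B, x_mean, y_mean

def pls_predict(X, B, x_mean, y_mean):
    return (np.asarray(X, dtype=float) - x_mean) @ B + y_mean

# Toy data: 200 noise-free mixtures of 3 normalized templates with 40 channels.
rng = np.random.default_rng(0)
templates = rng.random((3, 40))
templates /= templates.sum(axis=1, keepdims=True)
ratios = rng.dirichlet(np.ones(3), size=200)  # each row sums to 1
spectra = ratios @ templates

B, x_mean, y_mean = pls_fit(spectra, ratios, n_components=2)
max_error = np.max(np.abs(pls_predict(spectra, B, x_mean, y_mean) - ratios))
print(B.shape, max_error)
```

Two components suffice here because the ratios in each row sum to 1, which leaves two degrees of freedom after centering. For real use, a maintained implementation such as scikit-learn's `PLSRegression` is usually preferable.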

A deep learning-based dense architecture for multi-input multi-output regression

This study investigates the mixing ratio estimation performance of a dense deep learning model for multi-input multi-output regression. The dense deep learning model is evaluated on the GADRAS datasets, including high-mixing-rate and low-mixing-rate scenarios with two-, three-, four-, and five-source mixtures, which consist of various combinations of 13 radioactive isotopes.

The dense deep learning model is implemented using the Sequential API in Keras [23]. The sequential model consists of three fully connected (Dense) layers with ReLU activations, as illustrated in Figure 5. The Adam optimizer is employed for model optimization. Through empirical tuning, the number of neurons was set to 800 in the first layer, 256 in the second layer, and n in the third layer, where n corresponds to the number of radioactive isotope materials. This model is referred to as Deep Regression (DR) throughout the paper.
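The layer sizes described above can be illustrated with a plain NumPy forward pass. This is only a shape sketch with random, untrained weights, not the Keras model used in the study: the input length of 1194 channels and the output size of 30 are assumptions chosen for illustration, and training with the Adam optimizer is not shown.

```python
import numpy as np

def init_dense(rng, n_in, n_out):
    """Random weights (Glorot-uniform range) and zero biases for one layer."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out)), np.zeros(n_out)

def relu(x):
    return np.maximum(x, 0.0)

def forward(spectra, layers):
    """Pass a batch of spectra through fully connected layers with ReLU."""
    hidden = spectra
    for weights, bias in layers:
        hidden = relu(hidden @ weights + bias)
    return hidden

n_channels = 1194   # assumed input length (one value per spectrum channel)
n_outputs = 30      # assumed n: number of estimated mixing ratios

rng = np.random.default_rng(0)
layer_sizes = [n_channels, 800, 256, n_outputs]
layers = [init_dense(rng, a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

batch = rng.random((4, n_channels))        # four hypothetical input spectra
outputs = forward(batch, layers)
n_parameters = sum(w.size + b.size for w, b in layers)
print(outputs.shape, n_parameters)
```

The ReLU on the output layer keeps the predicted mixing ratios non-negative, which matches their physical meaning.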

Regression algorithms based on Linear Regression and Random Forest

This study further employed linear regression (LR) and random forest regression (RFR) algorithms available in the Keras library [23].

3.2. Homogeneous Dataset Generation and Mixture Analysis Results

For this case, in addition to setting the augmentation parameters (such as the signal-to-background ratio) to constant values, we also set four detector-related parameters to constant values, creating a dataset that is homogeneous except for background variation. That is, all generated two-isotope mixtures have exactly the same source height, source distance, shielding, and shielding density values. To generate this dataset, we computed the mixing ratio (relative count contribution) of the background (rB) from the user-provided signal-to-background ratio. We then randomly picked the mixing ratio of one of the two isotopes in the mixture to be between 0.1 and (1 − rB). Table 1 shows the augmentation and detector parameter settings used in the homogeneous dataset generation.

A total of 5800 two-isotope mixture spectra (source and background) were generated with the same detector and augmentation parameters; 5300 were used for training and 500 for testing. For relative count contribution estimation, we used the foreground spectra of the mixtures (source + background) and compared the average RMSE values of the four estimation methods. Figure 6 shows the RMSE values for the 500 test spectra. Table 2 shows the average RMSE values. As shown in Figure 6 and Table 2, the Deep Regression method outperforms the other approaches by achieving lower RMSE values.
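The RMSE evaluation can be sketched as follows, assuming the RMSE is computed per test spectrum over its estimated mixing ratios and then averaged over the test set. That aggregation order is an assumption, and the ratio values below are made up for illustration.

```python
import numpy as np

def per_spectrum_rmse(true_ratios, predicted_ratios):
    """RMSE over the mixing ratios of each spectrum (one value per row)."""
    true_ratios = np.asarray(true_ratios, dtype=float)
    predicted_ratios = np.asarray(predicted_ratios, dtype=float)
    return np.sqrt(np.mean((true_ratios - predicted_ratios) ** 2, axis=1))

# Two hypothetical test spectra with three ratios each (two isotopes + background).
true_ratios = [[0.5, 0.3, 0.2], [0.1, 0.7, 0.2]]
predicted_ratios = [[0.4, 0.4, 0.2], [0.1, 0.7, 0.2]]

rmse_values = per_spectrum_rmse(true_ratios, predicted_ratios)
average_rmse = float(rmse_values.mean())
print(rmse_values, average_rmse)
```

The per-spectrum values correspond to what a plot over the test spectra would show, while the mean corresponds to a single table entry per method.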

The RMSE values of 500 test spectra generated by four different methods were compared using the Wilcoxon Signed-Rank Test. This non-parametric test evaluates the median differences between two related samples by ranking the differences between paired samples and determining whether the sum of the ranks shows a significant difference [33]. As shown in Table 2, the DR method achieved RMSE values that were significantly different from those of the PLS, LR, and RFR methods (p < 0.05). At the same time, there was no significant difference between the RMSE values of the PLS and LR methods, suggesting that these two methods performed similarly. Figure 7b demonstrates an example test spectrum together with estimated and actual mixing ratios with the four estimation methods.

3.3. Slightly Heterogeneous Dataset Generation and Mixture Analysis Results

In this case, we varied some of the detector parameters (source distance, source height, shielding, and shielding density) and set some of the augmentation parameters (background cps, integration time, signal-to-background ratio, calibration offset and gain, FWHM) to constant values. This limits the diversity of the generated data to the detector-associated parameters: source height, source-to-detector distance, shielding material, and shielding density. We randomly picked the relative count contribution of one of the two isotopes in the mixture to be between 0.1 and (1 − rB). Table 3 shows the fixed augmentation parameters used in generating this slightly heterogeneous dataset.

As in the homogeneous dataset case, 5800 two-isotope mixture spectra (source and background) were generated; 5300 were used for training and 500 for testing. For relative count contribution estimation, we used the foreground spectra of the mixtures (source + background) and compared the average RMSE values of the four estimation methods. Figure 8 shows the RMSE values for the 500 test spectra. Table 4 shows the average RMSE values. Figure 8 and Table 4 show that, even though the average RMSE values increased in comparison to the homogeneous dataset, the Deep Regression method consistently outperforms the other approaches. Figure 9b shows an example test spectrum together with the estimated and actual mixing ratios for the four estimation methods. In addition, the Wilcoxon Signed-Rank Test was conducted on the RMSE values obtained from the four methods. The results indicated that the RMSE values of the Deep Regression method were still significantly different from those of the PLS, LR, and RFR methods (p < 0.05), confirming its stable advantage in the slightly heterogeneous scenario.

3.4. Heterogeneous Dataset Generation and Mixture Analysis Results

This is the most challenging case among the three since, in this case, in addition to varying the detector parameters (source distance, source height, shielding, shielding density), we also varied the augmentation parameters (background cps, integration time, source-to-background, calibration offset and gain, FWHM), making the dataset highly heterogeneous. Table 5 shows how the augmentation parameters are varied and their ranges when generating this heterogeneous dataset.

Figure 10 shows the RMSE values for 500 test spectra. Table 6 shows the average RMSE values. It can be seen from Figure 10 and Table 6 that even though the average RMSE values increased in comparison to the RMSE values of the slightly heterogeneous dataset, the Deep Regression method consistently demonstrates superior performance over other methods on this dataset.

Figure 11b demonstrates an example test spectrum together with estimated and simulated relative count contributions with four estimation methods. Wilcoxon Signed-Rank Test results on this dataset also showed that Deep Regression significantly outperformed other methods.
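For readers unfamiliar with the Wilcoxon Signed-Rank Test, the following sketch shows how paired RMSE values from two methods could be compared. It is a simplified, standard-library implementation using the normal approximation (reasonable for a test set of several hundred spectra) without tie or continuity corrections. The paper does not state which implementation was used, so the function name and the toy values are assumptions; in practice a library routine such as `scipy.stats.wilcoxon` would normally be preferred.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test using the normal approximation.

    Returns (W, p_value), where W is the smaller of the positive and
    negative rank sums. Zero differences are dropped, and tied absolute
    differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4.0
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w - mean) / std  # z <= 0 because w never exceeds the mean
    p_value = 2.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return w, min(1.0, p_value)

# Toy per-spectrum RMSE values (made up): method A is always lower than method B.
rmse_a = [0.010 + 0.0001 * i for i in range(30)]
rmse_b = [v + 0.02 for v in rmse_a]
w, p = wilcoxon_signed_rank(rmse_a, rmse_b)
print(w, p)
```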

3.5. Observations

In all three datasets investigated (homogeneous, slightly heterogeneous, and heterogeneous), the experimental results showed that the Deep Regression method achieved superior performance, yielding the lowest RMSE values among all methods. As anticipated for the homogeneous case, the mixing ratio estimations were highly accurate for almost all of the methods. As the parameter variation in the generated dataset increased, the performance of all methods decreased, although the drop was smaller for the Deep Regression method than for the other methods. While the Deep Regression method demonstrates strong performance across all scenarios, it is important to acknowledge its potential risk of overfitting in certain experimental conditions, particularly when using synthetic datasets with fixed structures. The method exhibits stable performance even as data heterogeneity increases, indicating its solid generalization ability. Therefore, although the possibility of overfitting cannot be entirely dismissed, it did not significantly compromise the robustness and validity of the findings presented in this study.

4. Uranium Enrichment-Level Prediction

In this section, we used the uranium enrichment source templates in [18] to generate an augmented uranium enrichment dataset and examined the performance of four methods on the uranium enrichment prediction problem using gamma ray spectra. We also manually extracted predicted uranium enrichment values from a plot in [18] (Figure 5.5) to obtain a rough idea of the average RMSE values obtained in that study. Even though we cannot make a direct comparison, since the dataset used in [18] and the dataset we generated using the uranium source templates are different, we observed that the average RMSE values we obtained with the Deep Regression method are lower.

4.1. Background

Uranium enrichment is an essential process for producing effective nuclear fuel out of mined uranium by increasing the percentage of uranium-235 which undergoes fission with thermal neutrons [34]. Naturally occurring uranium contains only about 0.72% of 235U with the remaining majority being 238U [35]. As uranium-238 is fissionable rather than fissile, increasing the concentration of uranium-235 is necessary for its application as nuclear fuel [36].

4.2. Data Generation at Different Enrichment Levels

Detecting the uranium enrichment level for a given gamma ray spectrum is of critical importance and has been studied in [18]. We used the proposed spectral data generation codes and uranium/background templates in [18] to generate a uranium enrichment dataset for a similar investigation using our methods (DR, PLS, LR, and RFR). The data generation codes in [18] use uranium and background templates to generate gamma ray spectra at designated enrichment levels with several augmentation parameters. The enrichment levels are randomly selected between 0 and 1 when generating a uranium enrichment dataset. For details, see Chapter 5 in [18]. The following description of how the uranium templates were simulated is taken from [18]. Spectra for individual uranium isotopes were simulated using a coupled MCNP (Monte Carlo N-Particle Transport Code) and GADRAS framework, with isotopes uniformly distributed in a solid uranium sphere of 5.5 cm radius. The MCNP simulation accounted for the physics of self-attenuation in uranium, whereas GADRAS was applied to model the gamma ray spectra detected by a 2″ × 2″ NaI(Tl) detector.

The MCNP simulation setup is illustrated in Figure 12. A 19 cm concrete barrier was placed 180 cm away from the origin, and a 5.5 cm radius bare uranium sphere was located at the origin. A 2″ × 2″ NaI(Tl) cylindrical detector was positioned between the uranium sphere and the concrete block. The concrete block was incorporated in the MCNP simulation to account for backscattered radiation [18]. The simulation was conducted with a total of 10⁸ particles.

In [18], RadSrc [17] was employed to generate the gamma ray intensities of 235U, 238U, and 232U templates. Developed at Lawrence Livermore National Laboratory, this software applies the Bateman equations to model daughter in-growth and calculate their respective specific gamma-ray intensities [18]. The uranium enrichment data generation code consists of augmentation parameters such as source-to-background-ratio, integration time, and calibration parameters for spectrum binning. Sample gamma ray spectra generated with four different enrichment levels for aluminum shielding at density 4.15, FWHM = 7.0 and four different source distance levels can be seen in Figure 13.

4.3. Results and Analyses

There are three separate studies in this sub-section. The first study compares four methods (DR, PLS, LR, RFR) for enrichment-level prediction. We did not use the ROI method, which is not suitable for estimating enrichment levels. The second study focuses on DR only, trained using 5-fold cross-validation with early stopping. The third study compares our results with those in [18].

Analysis-1: Compared four methods

In this analysis, we used 1 × 10⁵ uranium enrichment gamma ray spectrum data for training. For testing, we used 100 gamma ray spectra. For the various methods, we conducted training for 10,000 epochs. Table 7 presents the average RMSE values for the test dataset. Clearly, DR provided the lowest RMSE values, and the other three methods did not perform well. Figure 14 shows the predicted enrichments in comparison to the simulated (ground truth) enrichments. In Figure 14, ideally, the results should be located on the diagonal line as is illustrated with a black line. It can be noticed that the DR method results are mostly scattered on or around the diagonal line.

Analysis-2: DR with 5-fold cross-validation with early stopping

In this analysis, we only considered the DR method since it performed better than other methods. However, instead of training a single model for 10,000 epochs using the 1 × 10⁵ spectrum training dataset, we used 5-fold cross-validation training with early stopping. We introduced k-fold cross-validation [37] with early stopping in Keras [38] when training the Deep Regression model. We observed that this not only saved training time due to early stopping but also improved the accuracy of the predictions since it used the average of five separate models trained with the cross-validation mechanism. A five-fold cross-validation was performed, where each model was trained on 80% of the data and validated on the remaining 20%. This procedure was repeated five times, applying a different 20% split of the dataset for validation while using the remaining data for model training.
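The five-fold splitting procedure described above can be sketched in plain Python. The generator name `k_fold_indices` and the seed are hypothetical; the sketch only illustrates how each fold holds out a different 20% of the training set for validation and does not reproduce the authors' pipeline.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# With 1 x 10^5 training spectra, each fold trains on 80,000 and validates on 20,000.
for fold, (train_idx, val_idx) in enumerate(k_fold_indices(100_000, k=5)):
    print(fold, len(train_idx), len(val_idx))
```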

The validation data were employed to implement early stopping during training [39]. When the validation loss did not improve for a designated number of epochs (set to 200), training was stopped, since continuing beyond that point tends to overfit the training data. A demonstration of early stopping by monitoring the validation loss can be seen in Figure 15. Figure 15 shows that even though the training loss keeps decreasing, the validation loss stops decreasing after some epochs and instead converges. Thus, training is stopped when the validation loss is no longer observed to decrease.
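The patience rule can be illustrated with a small stand-alone function that scans a sequence of validation losses. The function name and the toy loss values are made up for this example. In Keras, the same behavior is typically obtained with the `keras.callbacks.EarlyStopping` callback (for example with `monitor="val_loss"` and `patience=200`); whether the study used exactly that configuration is an assumption.

```python
def early_stopping_epoch(val_losses, patience=200):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs pass without a new
    best validation loss; if that never happens, it runs to the last epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Toy validation losses with a short patience so the effect is visible.
toy_losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74]
print(early_stopping_epoch(toy_losses, patience=3))
```

In the toy sequence the best loss occurs at epoch 2, and training halts three epochs later because no further improvement is seen.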

After training five separate models for each fold, the model decisions from each fold were averaged to reduce variance in the individual model decisions on the 100-spectrum test dataset. The average RMSE values on the enrichment test dataset obtained from five individual fold models, average 5-fold model and single model (no cross-validation and no early stopping) are shown in Table 8. It is observed that the average 5-fold cross-validation model with early stopping yields the lowest RMSE value among all methods. In comparison to the results obtained from a single model without cross-validation or early stopping, there is a considerable reduction in the average RMSE values as well. Predicted enrichments versus simulated enrichments (ground truth) for the test dataset with the average 5-fold cross-validation model and previous single training model (no cross-validation and no early stopping) can also be seen in Figure 16.
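The fold-averaging step can be sketched as follows. The prediction values are invented for illustration and the helper names are hypothetical; the sketch only shows how averaging the outputs of five fold models can cancel part of their individual errors.

```python
import math

def ensemble_average(fold_predictions):
    """Average per-sample predictions across models (one list per fold model)."""
    n_models = len(fold_predictions)
    return [sum(sample) / n_models for sample in zip(*fold_predictions)]

def rmse(predicted, actual):
    """Root-mean-square error between predictions and ground truth."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Made-up enrichment predictions from five fold models for three test spectra.
truth = [0.20, 0.50, 0.80]
fold_predictions = [
    [0.25, 0.45, 0.82],
    [0.17, 0.53, 0.76],
    [0.22, 0.51, 0.83],
    [0.16, 0.47, 0.79],
    [0.21, 0.55, 0.78],
]
averaged = ensemble_average(fold_predictions)
print([round(rmse(p, truth), 4) for p in fold_predictions])
print(round(rmse(averaged, truth), 4))
```

In this toy data the averaged prediction has a lower RMSE than any single fold model, mirroring the trend reported in Table 8; in general, averaging guarantees only that the ensemble RMSE does not exceed the mean of the individual RMSE values.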

Analysis-3: Comparison with results in [18]

In this analysis, we manually extracted predicted enrichment values from a result plot in [18] to gain a rough idea of the average RMSE values observed in [18]. For this, we used Figure 5.5’s plots in [18], which can be seen in Figure 17. Among the four prediction methods used in [18], the CNN1D method is reported to provide somewhat better results than the others, so we visually extracted the predicted enrichment values for the CNN1D method from that plot. It is worth mentioning that the enrichment dataset used in [18] and the dataset we generated using the uranium source templates in [18] are not the same: the shielding densities we noticed in [18] differ from those of the uranium enrichment templates in [15], from which we obtained the templates. The values we extracted for the CNN1D method after the visual screening of the plot can be seen in Table 9. Table 10 presents the average RMSE values for the extracted enrichment prediction results from the plots in [18] and our earlier result with the generated test dataset of 100 spectra. Even though we cannot make a direct comparison since the two datasets are different, it can be noticed that the average RMSE values are lower in our case. More importantly, this improvement is not solely due to minor adjustments in parameters such as shielding density. Our model incorporates a modified mixture generation framework, structured enrichment prediction, and consistent integration of foreground/background templates. Together, these elements contribute to enhanced generalization and improved estimation accuracy. In contrast, the methods in [18] relied on fixed simulated spectra under limited variations. Therefore, our results reflect improvements in model design and data robustness rather than fitting artifacts from specific simulation conditions. Figure 18 shows the predicted enrichments versus simulated enrichments (ground truth) for the visually extracted values from Figure 5.5 in [18].

5. Conclusions

This paper introduces an efficient framework for the rapid generation of mixture spectra for nuclear materials. The generated spectra are similar to those generated by GADRAS, but our framework is considerably more efficient and user-friendly. Moreover, our software is license-free, so civilians can also use our generation framework to synthesize mixture spectra. The new framework was applied to mixing ratio estimation for two-isotope mixtures. Four algorithms were evaluated, and the deep learning algorithm was observed to outperform the others. We also applied the framework to uranium enrichment-level estimation, where it was observed that enrichment levels can be better predicted using a deep learning approach.

A potential research direction involves comparing various algorithms using real spectra, which we plan to do in Phase 2 of our research. Another direction is to investigate whether or not one can apply low-shot transfer learning to build models for new application scenarios such as new location, new shielding, etc.

Author Contributions

Conceptualization, B.A. and C.K.; methodology, B.A., A.S., K.A.I. and H.Z.; writing—original draft preparation, C.K.; writing—A.S., H.Z. and J.L.; project administration, C.K. and J.L.; funding acquisition, C.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the NGA under HM047620C0039. The views, opinions, and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the U.S. Government.

Data Availability Statement

Data are contained within the article.

Acknowledgments

We would also like to thank Mitaire Ojaruega and Kevin Jackman at NGA for fruitful discussions and valuable suggestions over the course of this project. This work was performed when B.A. and C.K. were with Applied Research LLC. B.A. is now with MITRE Corporation and C.K. is with the Johns Hopkins University Applied Physics Laboratory.

Conflicts of Interest

Authors Chiman Kwan and Bulent Ayhan were employed by the company Applied Research LLC. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Durbin, M.; Kuntz, A.; Lintereur, A. Machine Learning Applications for the Detection of Missing Radioactive Sources. In Proceedings of the IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), Manchester, UK, 26 October–2 November 2019; p. 1.
2. Cordone, G.; Brooks, R.R.; Sen, S.; Rao, N.S.; Wu, C.Q.; Berry, M.L.; Grieme, K.M. Regression for Radioactive Source Detection. In Proceedings of the IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), Atlanta, GA, USA, 21–28 October 2017; pp. 1–3.
3. Kim, D.; Yu, D.; Sawant, A.; Choe, M.S.; Choi, E. First Experimental Observation of Plasma Breakdown for Detection of Radioactive Material Using a Gyrotron in Real-Time. In Proceedings of the 18th International Vacuum Electronics Conference (IVEC), London, UK, 24–26 April 2017; pp. 1–2.
4. Eleon, C.; Battiston, F.; Bounaud, M.; Mosbah, M.B.; Passard, C.; Perot, B. Study of Boron Coated Straws and Mixed (¹⁰B/³He) Detectors for Passive Neutron Measurements of Radioactive Waste Drums. In Proceedings of the IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), Sydney, NSW, Australia, 10–17 November 2018; pp. 1–4.
5. GADRAS. Available online: https://osti.gov/biblio/1166695-gadras-drf-user-manual (accessed on 1 January 2021).
6. Geant4. Available online: https://geant4.web.cern.ch/docs/ (accessed on 21 February 2025).
7. Allison, J.; Amako, K.; Apostolakis, J.; Arce, P.; Asai, M.; Aso, T.; Bagli, E.; Bogdanov, A.G.; Burkhardt, H.; Chauvie, S.; et al. Recent Developments in Geant4. Nucl. Instrum. Methods Phys. Res. A 2016, 835, 186–225.
8. Allison, J.; Amako, K.; Apostolakis, J.; Araujo, H.; Dubois, P.A.; Asai, M.; Barrand, G.; Capra, R.; Chauvie, S.; Chytracek, R.; et al. Geant4 Developments and Applications. IEEE Trans. Nucl. Sci. 2006, 53, 270–278.
9. Agostinelli, S.; Allison, J.; Amako, K.; Apostolakis, J.; Araujo, H.; Arce, P.; Asai, M.; Axen, D.; Banerjee, S.; Barrand, G.; et al. Geant4—A Simulation Toolkit. Nucl. Instrum. Methods Phys. Res. A 2003, 506, 250–303.
10. Mano, C.-P.; Chapelle, C.; Der Mesrobian-Kabakian, A.; Patryl, L. Algorithm Development for Low Level Radioxenon 2D Spectra Analysis: A First Case of Study Using Spectral Unmixing for a β-γ Detector. Appl. Radiat. Isot. 2023, 180, 110064.
11. Paradis, H.; Bobin, C.; Bobin, J.; Thevenin, M. Spectral Unmixing Applied to Fast Identification of γ-Emitting Radionuclides Using NaI(Tl) Detectors. Appl. Radiat. Isot. 2020, 155, 108927.
12. Fioretti, V.; Antonelli, L.A.; Gianotti, F.; Marano, G.; Morini, M.; Ricciarini, S.B.; Verrecchia, F.; Vercellone, S.; Bulgarelli, A.; Tavani, M. Machine Learning-Enhanced Discrimination of Gamma-Ray and Hadron Events Using Temporal Features: An ASTRI Mini-Array Analysis. Appl. Sci. 2025, 15, 3879.
13. Olmos, P.; Diaz, J.C.; Perez, J.M.; Gomez, P.; Rodellar, V.; Aguayo, P.; Bru, A.; Garcia-Belmonte, G.; De Pablos, J.L. A New Approach to Automatic Radiation Spectrum Analysis. IEEE Trans. Nucl. Sci. 1991, 38, 971–975.
14. Pilato, V.; Tola, F.; Martinez, J.M.; Huver, M. Application of Neural Networks to Quantitative Spectrometry Analysis. Nucl. Instrum. Methods Phys. Res. A 1999, 422, 423–427.
15. Yoshida, E.; Shizuma, K.; Endo, S.; Oka, T. Application of Neural Networks for the Analysis of Gamma-Ray Spectra Measured with a Ge Spectrometer. Nucl. Instrum. Methods Phys. Res. A 2002, 484, 557–563.
16. Chen, L.; Wei, Y.-X. Nuclide Identification Algorithm Based on KL Transform and Neural Networks. Nucl. Instrum. Methods Phys. Res. A 2009, 598, 450–453.
17. Kamuda, M.; Stinnett, J.; Sullivan, C.J. Automated Isotope Identification Algorithm Using Artificial Neural Networks. IEEE Trans. Nucl. Sci. 2017, 64, 1858–1864.
18. Kamuda, M. Automated Isotope Identification and Quantification Using Artificial Neural Networks. Ph.D. Thesis, University of Illinois at Urbana-Champaign, Urbana, IL, USA, 2019. Available online: https://www.ideals.illinois.edu/items/113773 (accessed on 15 April 2025).
19. Matta, J.T.; Rowe, A.J.; Dion, M.P.; Willis, M.J.; Nicholson, A.D.; Archer, D.E.; Wightman, H.H. Maximum Likelihood Spectrum Decomposition for Isotope Identification and Quantification. IEEE Trans. Nucl. Sci. 2021, 69, 1212–1224.
20. Geissel, H.; Weick, H.; Scheidenberger, C.; Bimbot, R.; Gardes, D. Experimental Studies of Heavy-Ion Slowing Down in Matter. Nucl. Instrum. Methods Phys. Res. B 2002, 195, 3–54.
21. Kwan, C.; Ayhan, B.; Chen, G.; Chang, C.; Wang, J.; Ji, B. A Novel Approach for Spectral Unmixing, Classification, and Concentration Estimation of Chemical and Biological Agents. IEEE Trans. Geosci. Remote Sens. 2006, 44, 409–419.
22. Ayhan, B.; Kwan, C.; Galbacs, G. Gold Fineness Determination Using LIBS Spectra with PLS and Spectral Unmixing Techniques. In Proceedings of the 2nd International Conference on Applied and Theoretical Information Systems Research, Taipei, Taiwan, 27–29 December 2012.
23. Ayhan, B.; Kwan, C. Application of Deep Belief Network to Land Cover Classification Using Hyperspectral Images. In Proceedings of the International Symposium on Neural Networks, Hokkaido, Japan, 21–26 June 2017; Springer: Cham, Switzerland; pp. 269–276.
24. Keras Sequential Model. Available online: https://keras.io/guides/sequential_model/ (accessed on 25 June 2023).
25. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS 2012), Lake Tahoe, NV, USA, 3–8 December 2012; pp. 1097–1105.
26. Ayhan, B.; Kwan, C.; Budavari, B.; Larkin, J.; Gribben, D.; Li, B. Video Activity Recognition with Varying Rhythms. IEEE Access 2020, 8, 191997–192008.
27. Kwan, C.; Gribben, D.; Tran, T. Tracking and Classification of Multiple Human Objects Directly in Compressive Measurement Domain for Low Quality Optical Videos. In Proceedings of the IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA, 10–12 October 2019; pp. 488–494.
28. Rababaah, A.; Sharma, D.K. Integration of Two Different Signal Processing Techniques with Artificial Neural Network for Stock Market Forecasting. Acad. Inf. Manag. Sci. J. 2015, 18, 63–80.
29. Ayhan, B.; Kwan, C.; Budavari, B.; Kwan, L.; Lu, Y.; Perez, D.; Li, J.; Skarlatos, D.; Vlachos, M. Vegetation Detection Using Deep Learning and Conventional Methods. Remote Sens. 2020, 12, 2502.
30. Qu, Y.; Baghbaderani, R.K.; Qi, H.; Kwan, C. Unsupervised Pansharpening Based on Self-Attention Mechanism. IEEE Trans. Geosci. Remote Sens. 2020, 59, 3192–3208.
31. Top 20 Applications of Deep Learning in 2021 Across Industries. Available online: https://www.mygreatlearning.com/blog/deep-learning-applications/#cars (accessed on 16 January 2025).
32. ANSI. Performance Criteria for Hand-Held Instruments for the Detection and Identification of Radionuclides; ANSI N42.34-2006; American National Standards Institute: Washington, DC, USA, 2006.
33. Wilcoxon, F. Individual Comparisons by Ranking Methods. Biom. Bull. 1945, 1, 80–83.
34. EnergyEducation.ca. Uranium Enrichment. Available online: https://energyeducation.ca/encyclopedia/Uranium_enrichment (accessed on 1 August 2022).
35. Bryan, J.C. Introduction to Nuclear Science, 1st ed.; CRC Press: Boca Raton, FL, USA, 2009.
36. Glaser, A. Characteristics of the Gas Centrifuge for Uranium Enrichment and Their Relevance for Nuclear Weapon Proliferation. Sci. Glob. Secur. 2008, 16, 1–25.
37. Machine Learning Mastery. Evaluate Performance of Deep Learning Models in Keras. Available online: https://machinelearningmastery.com/evaluate-performance-deep-learning-models-keras/ (accessed on 7 August 2022).
38. Machine Learning Mastery. How to Stop Training Deep Neural Networks at the Right Time Using Early Stopping. Available online: https://machinelearningmastery.com/how-to-stop-training-deep-neural-networks-at-the-right-time-using-early-stopping/ (accessed on 25 August 2020).
39. Prechelt, L. Early Stopping—But When? In Neural Networks: Tricks of the Trade; Montavon, G., Orr, G.B., Müller, K.-R., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 53–67.

Figure 1. Checking scalar coefficient multiplication in GADRAS.

Figure 2. Checking spectrum addition in GADRAS.

Figure 4. GADRAS parameters used to simulate NaI detector used in [18].

Figure 5. Structure of the dense deep learning model for multi-input multi-output regression.

Figure 6. RMSE values for the four methods (homogeneous dataset—NaI).

Figure 7. Example test spectrum and relative count contribution estimations from the homogeneous test dataset (NaI).

Figure 8. RMSE values for the four methods (slightly heterogeneous dataset—NaI).

Figure 9. Example test spectrum and relative count contribution estimations from the slightly heterogeneous test dataset (NaI).

Figure 10. RMSE values for the four methods (heterogeneous dataset—NaI).

Figure 11. Example test spectrum and relative count contribution estimations from the heterogeneous test dataset (NaI).

Figure 12. MCNP simulation diagram [18].

Figure 13. Sample gamma ray spectra using the uranium enrichment source and background templates in [18].

Figure 14. Predicted enrichments vs. simulated enrichments (ground truth) for the test dataset.

Figure 15. Early stopping demonstration by monitoring validation data loss value.

Figure 16. Predicted enrichments vs. simulated enrichments (ground truth) for the test dataset with average 5-fold cross-validation model and previous single model (no cross-validation and no early stopping).

Figure 17. Predicted enrichments vs. simulated enrichments (ground truth) for the test dataset in [18].

Figure 18. Predicted enrichments vs. simulated enrichments (ground truth) for the visually extracted enrichment values in Figure 5.5 of [18].

Table 1. Augmentation parameters used in homogeneous dataset generation (NaI) with the modified mixture spectrum generation framework.

Parameter Name	Value	Unit
Background cps	200	counts per second (cps)
Integration time	1000	seconds (s)
Source to background	2	ratio
rB	1/(source_to_background + 1)	–
rX_plus_rY	1 − rB	–
calibration	[0, 1, 0]	offset, gain, non-linearity
fwhm	7.5	percent (%)
rX	np.random.uniform (0.1, 1 − rB)	–
rY	rX_plus_rY − rX	–
rcc	[rX, rY, rB]	relative count contributions
Source height	100.0	centimeters (cm)
Source dist	175.0	centimeters (cm)
shielding	‘alum’	aluminum
Shielding density	1.82	g/cm³

Table 2. Comparison of average RMSE on the test dataset for four methods using foreground spectra (Homogeneous dataset—NaI).

Spectrum Type	PLS	DR	LR	RFR
Foreground	0.0081	0.0017	0.0092	0.0251
p-value vs. DR	0.0021	—	0.0014	<0.0001

Table 3. Augmentation parameters used in slightly heterogeneous dataset (NaI) generation with the modified mixture data generation framework.

Parameter Name	Value
integration_time	1000
source_to_background	2
rB	1/(source_to_background + 1)
rX_plus_rY	1 − rB
calibration	[0, 1, 0]
fwhm	7.5
rX	np.random.uniform (0.1, 1 − rB)
rY	rX_plus_rY − rX
rcc	[rX,rY,rB]

Table 4. Comparison of average RMSE values on the test dataset obtained by four methods with source and foreground spectra (slightly heterogeneous dataset—NaI).

Spectrum Type	PLS	DR	LR	RFR
Foreground	0.0310	0.0101	0.0351	0.0379
p-value vs. DR	0.0012	—	0.0007	<0.0001

Table 5. Augmentation parameters used in heterogeneous dataset generation (NaI) with the modified mixture data generation framework.

Parameter Name	Value
background_cps	np.random.poisson (200)
integration_time	10 ** np.random.uniform(np.log10(60), np.log10(600))
source_to_background	np.random.uniform (0.5, 2)
rB	1/(source_to_background + 1)
rX_plus_rY	1 − rB
calibration	[np.random.uniform (0, 10), np.random.uniform (0.9, 1.1), 0]
fwhm	Choice ([7.0, 7.5, 8.0])
rX	np.random.uniform (0.1, 1 − rB)
rY	rX_plus_rY − rX
rcc	[rX,rY,rB]

Table 6. Comparison of average RMSE on the test dataset for four methods using source and foreground spectra (heterogeneous dataset—NaI detector).

Spectrum Type	PLS	DR	LR	RFR
Foreground	0.0310	0.0101	0.0351	0.0379

Table 7. Comparison of average RMSE values on the enrichment test dataset using four different methods.

Spectrum Type	PLS	DR	LR	RFR
Foreground	0.1905	0.0651	0.1917	0.1517

Table 8. Comparison of average RMSE values on the enrichment test dataset across 5-fold-specific models, the 5-fold ensemble model, and a single model without cross-validation or early stopping.

Deep Regression Model	Average RMSE
Fold-0 model with early stopping	0.0528
Fold-1 model with early stopping	0.0517
Fold-2 model with early stopping	0.0511
Fold-3 model with early stopping	0.0484
Fold-4 model with early stopping	0.0581
Average 5-fold cross-validation model with early stopping	0.0434
Single model (no 5-fold cross-validation and no early stop)	0.0651

Table 9. Extracted CNN1D complete enrichment values after visual screening of Figure 5.5 in [18].

Shielding Condition	3% Enrichment	25% Enrichment	50% Enrichment	75% Enrichment
Simulated	0.03	0.18	0.55	0.70
Unshielded	0.03	0.27	0.61	0.76
1.48 Al	0.09	0.26	0.48	0.80
0.45 Lead	0.07	0.30	0.48	0.70
1.42 Lead	0.03	0.25	0.50	0.75

Table 10. Average RMSE value comparisons.

Spectrum Type	Average RMSE
Average 5-fold Deep Regression model	0.0434
Kamuda results for CNN1D (5-fold cross-validation ensemble)	0.0515

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
