1. Introduction
Cosmic-ray-based tomography systems generally consist of several plates positioned around the target material [1,2,3,4]. Such plates are position-sensitive particle detectors that register the passage of electrically charged particles such as muons, making it possible to reconstruct their trajectories. For example, ref. [2] discusses a system that positions three detector plates below and another three above the material of interest to obtain precise information about muon trajectories through the system. In many tomography systems, detector plates are composed of luminescent materials and silicon photomultipliers (SiPMs). When a muon crosses the plate, the luminescent material produces light that is collected by the SiPMs. The signal amplitudes of all SiPMs are then processed to estimate the location of the muon hit.
There are different ways to process the sensor amplitudes to estimate the hit position, and conventional analytical methods, such as Center of Gravity (CoG), have been used for this purpose [5,6]. However, the low performance of such methods motivated researchers to look for alternative techniques. Aguiar et al. simulated a plate detector and used Anger logic to localize muon hits [7]. Recently, thanks to the latest technological developments, machine-learning-based methods have become popular alternatives to analytical methods for estimating the particle hit position. Their benefits, such as better spatial resolution or faster processing, have been explored in various studies. Yonggang et al. implemented a multilayer perceptron (MLP) network on an FPGA [8]. They showed that machine-learning-based methods produce promising results while enabling a more computationally efficient estimation in real-time applications. In [9], the authors use a network architecture similar to that in [8]. However, they took a different approach to the data configuration and the network training: they split the x and y axes into 7 intervals each, first trained a global network to predict the x and y positions roughly, and then used a sub-interval network to find the exact location, training 7 neural networks for x and 7 neural networks for y as sub-interval networks. In addition to the MLP architecture, [10] used a convolutional neural network (CNN) to decode gamma-ray interaction locations, achieving an average spatial resolution of 0.40 mm. They collected the light distribution as a 16-pixel image using sensors placed at the edges of the plate. Then, they used a network consisting of a CNN layer with 200 filters of size 1 × 1 combined with a fully connected (FC) layer.
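To make the decoder of [10] concrete, a minimal NumPy sketch of a 1 × 1-convolution-plus-FC forward pass is given below. All shapes and weights here are illustrative assumptions (random, untrained); the point is only that a 1 × 1 convolution acts as an independent linear map on each pixel's channel vector before the FC head regresses the (x, y) position.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes only: a 4x4 "light image" (16 pixels, 1 channel),
# 200 filters of size 1x1, and an FC head regressing (x, y).
image = rng.random((4, 4, 1))           # single-hit light distribution
w_conv = rng.standard_normal((1, 200))  # 1x1 conv == per-pixel linear map
b_conv = np.zeros(200)
w_fc = rng.standard_normal((4 * 4 * 200, 2)) * 0.01
b_fc = np.zeros(2)

# 1x1 convolution: each pixel's channel vector is mapped independently,
# followed by a ReLU non-linearity.
feat = np.maximum(image @ w_conv + b_conv, 0.0)  # shape (4, 4, 200)
xy = feat.reshape(-1) @ w_fc + b_fc              # shape (2,): predicted (x, y)
print(feat.shape, xy.shape)
```

In a real implementation the weights would of course be learned; the sketch only shows why a 1 × 1 convolution adds per-pixel feature capacity without mixing neighboring pixels.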
Hit positions of the muons are the most important data for cosmic-ray-based tomography systems, since the whole system is designed around the muon trajectories calculated from those hit positions. Consequently, the resolution of the hit position estimation is of utmost importance: improving the estimation method yields better hit position estimates and, in turn, better imaging performance of the tomography system. Although there have been studies to enhance performance on this task, the literature is limited, and a considerable research gap remains, especially for deep-learning-based methods. Studies on the performance of deep neural networks (DNNs) for hit position estimation are very limited; a comparison of DNNs with CoG, and the performance of DNNs for different sensor configurations, are two topics yet to be explored. Therefore, in this study, we investigate how much the positional resolution can be increased by replacing CoG with deep neural networks. We present a fully connected architecture and a CNN-based architecture and evaluate their performances against CoG. Our results cover a comparison of resolution data; optimization of the physical detection system is out of the scope of this study. There are three main contributions of this research:
We present a simulated dataset of muon hit positions on a detector plate.
We present two deep-learning-based methods for the muon hit position estimation task. We compare their performances with CoG. Additionally, we investigate the proposed method’s performance in depth.
We evaluate the effects of different SiPM configurations on performance.
The rest of this paper is organized as follows: Section 2 describes the dataset, Section 3 explains the proposed method, and Section 4 presents and discusses the experimental results. Finally, Section 5 concludes this work.
2. Dataset
We characterized the response of a muon detector based on a plastic scintillator slab to create our dataset. We used the Monte Carlo simulation package GEANT4 [11] version 10.5 to model the main parameters of the detectors. The parameters used in the simulations are:
The plastic scintillator is an EJ-200 with a light output of 10,000 photons/MeV.
The plate surface area is 400 cm² (20 cm × 20 cm), and the plate thickness is 10 mm.
The SiPM size is 6 mm × 6 mm. The spacing between SiPMs placed on the top surface is 20 mm. The spacing was chosen heuristically, reflecting our effort to keep the number of sensors as low as possible due to cost.
The detection efficiency of SiPMs is 40%.
The reflective material used in the simulations is a diffusive reflector: the surface is covered with TiO₂ paint (reflection efficiency 95%).
The simulated detector plate can be seen in Figure 1, where the red squares represent the SiPMs and the green surface represents the scintillator area. In our setup, the SiPMs face the top of the detector plate and are placed symmetrically along both axes, as the plate is square-shaped.
We simulated muon hits only on one specific area of the plate instead of the whole surface.
Figure 2 shows the simulated hit locations as red points. We chose hit locations only in this limited area to save computation time, as the system is periodically invariant under translation. Muons are emitted perpendicular to the detector surface on a grid with a step of 1 mm, yielding 441 different hit positions. For each hit position, we simulated around 10,000 hits, for a total of 4,416,335 (∼441 × 10,000) simulated hits. When a muon hits the detector plate, the scintillation light spreads in the plate and is detected by the SiPMs. We saved the signal amplitudes from the SiPMs and the hit coordinates (x, y) for each hit, thus creating a dataset of hit coordinates and corresponding signal amplitudes. Viewed from the side, the sensors sit in the middle of the plate. Therefore, when a muon hits the exact location of a sensor, light is produced in the plate as usual, and the sensor at the hit location registers the highest signal amplitude, since it is closest to the source.
Finally, we created subsets of this dataset by discarding the outer sensor signal amplitudes. For example, we obtained a sub-dataset for 64 (8 × 8) sensors by discarding the outer signal readings from the data. Following the same logic, we created datasets for 49 (7 × 7), 36 (6 × 6), 25 (5 × 5), and 16 (4 × 4) sensors, separately. This way, we could examine how different SiPM configurations affect performance. Since we obtained the subsets by cropping the outer sensor amplitudes from the 81-sensor (9 × 9) data, the hit locations and the other parameters remain the same across sensor configurations.
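The cropping step described above can be sketched as follows. This is an illustrative NumPy helper under our own naming assumptions (the paper does not publish its preprocessing code); it keeps an n × n block of the 9 × 9 amplitude grid and discards the outer readings. For even n the centering is necessarily approximate, since 9 is odd.

```python
import numpy as np

def crop_sensors(amplitudes, n):
    """Keep an (approximately centred) n x n block of sensor amplitudes
    from a 9 x 9 grid, discarding the outer readings.

    `amplitudes` has shape (hits, 9, 9); hit coordinates are untouched,
    so only the sensor configuration changes between sub-datasets.
    """
    k = amplitudes.shape[1]          # 9 sensors per side in the full data
    off = (k - n) // 2               # symmetric offset where possible
    return amplitudes[:, off:off + n, off:off + n]

# Dummy one-hit sample standing in for the 81-sensor readings.
full = np.arange(81, dtype=float).reshape(1, 9, 9)
print(crop_sensors(full, 5).shape)   # 25-sensor configuration
```

The same call with n = 8, 7, 6, 5, or 4 would produce the 64-, 49-, 36-, 25-, and 16-sensor sub-datasets.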
It is possible to treat each signal amplitude as a pixel and show the illumination from a single-particle hit as an image. In this manner, we present samples from (0, 0) and (20, 20) in Figure 3. The samples show that as the hit coordinate changes, the illumination output changes; we chose hits at (0, 0) and (20, 20) to make this change obvious. Higher signal amplitudes from the sensors, i.e., brighter pixels, shift position according to the hit position.
4. Experimental Results and Discussion
Firstly, we compared the FCN and the CNN to see which model performs better. In this experiment, we used the signal amplitudes from 81 sensors as the input data. The predicted hit coordinates are compared with the actual hit coordinates to obtain an error. We calculated the error as the Euclidean distance and averaged over all predicted hits to obtain a single average error value:

$$E = \frac{1}{N}\sum_{i=1}^{N}\sqrt{\left(x_i-\hat{x}_i\right)^2 + \left(y_i-\hat{y}_i\right)^2},$$

where N is the number of samples, $x_i$ and $y_i$ are the actual hit positions, and $\hat{x}_i$ and $\hat{y}_i$ are the predicted hit positions on the x-axis and y-axis, respectively.
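The average-error metric can be computed in a few lines of NumPy. This is a minimal sketch (function name and toy values are our own), not the paper's evaluation code:

```python
import numpy as np

def mean_euclidean_error(xy_true, xy_pred):
    """Average Euclidean distance between actual and predicted hit
    positions, both given as (N, 2) arrays of (x, y) coordinates."""
    d = np.sqrt(((xy_true - xy_pred) ** 2).sum(axis=1))
    return d.mean()

# Toy check with hypothetical positions in mm.
truth = np.array([[0.0, 0.0], [10.0, 10.0]])
pred  = np.array([[0.3, 0.4], [10.0, 10.5]])
print(mean_euclidean_error(truth, pred))  # (0.5 + 0.5) / 2 = 0.5
```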
Table 1 shows the performance of each network. Both networks achieve an error lower than 0.8 mm; however, the FCN performs better by a small margin. We suspect that our CNN could not extract features more discriminative than those extracted by the FCN. A CNN extracts features by relating each pixel to its neighbors through the convolution operation iterated over the whole image; we therefore suspect that the image is too small to exploit the full capability of the CNN, because in our case the number of pixels with a significant value is fairly low. Considering the results in Table 1, we chose the FCN as our proposed method and continued the experiments using it.
Additionally, we experimented with different SiPM configurations. Initially, our dataset had signal amplitudes from 81 sensors; however, as explained in Section 2, we discarded signals from the outer sensors and evaluated the effect of the number of SiPMs on performance. Considering the potential complexity of cosmic-ray tomography systems, obtaining competitive performance with less material would decrease the cost and implementation complexity.
In Table 2, we present the performance of the FCN for different sensor configurations. The performance is very close for 81, 36, and 25 sensors, with average errors of 0.72 mm, 0.73 mm, and 0.75 mm, respectively. The first big jump in the average error occurs when the sensor count is reduced to 16. From these results, we argue that the number of sensors does not necessarily need to be as high as possible: beyond 25 sensors, adding sensors does not improve the performance by much. For this reason, a trade-off should be made between performance and cost to find a satisfactory operating point. For our case, we find 25 sensors reasonable, as we do not want more than triple the number of sensors to gain 0.03 mm in performance.
Another important point in Table 2 is the overall trend: as there are more sensors, better performance is obtained. This is expected, as more sensors give a more precise measurement of the illumination caused by the hit, which in turn yields a more precise prediction of the position. However, beyond a certain point, increasing the number of sensors is not as effective. We explain this by the behavior of the light produced by the hit. Whenever a hit happens, the hit point becomes the source of the light, so the signals become weaker with distance from the hit position. Eventually, the sensors at far positions receive very small signal amplitudes and hence carry little information: the sensors around the hit position have the highest signal amplitudes, while the outer sensors gradually become insignificant. It should be noted that the average level of the signal amplitudes is similar for different hit locations, although the pattern shifts around.
To have a detailed look at the performance, we also explored the results for the x and y coordinates separately. For this purpose, in Table 3, we report the number of predicted coordinates falling in each error interval. For example, 97.35% of x predictions fall in the 0–1.2 mm error interval. The table shows that similar results are obtained for the models trained on the x and y coordinates separately; similar performance for both models is expected, as our sensor placement is symmetrical. Additionally, only 0.01% and 0.03% of the estimations have an error over 6 mm, which shows that outliers are negligible.
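A per-axis breakdown like the one in Table 3 can be produced with a simple histogram. The function below is an illustrative sketch; only the 0–1.2 mm interval is taken from the text, and the remaining bin edges (and all the data) are made-up assumptions for demonstration:

```python
import numpy as np

def error_breakdown(true_c, pred_c, edges=(0.0, 1.2, 2.4, 6.0, np.inf)):
    """Percentage of per-axis predictions whose absolute error falls in
    each bin delimited by `edges` (bin edges in mm, last bin open-ended)."""
    err = np.abs(true_c - pred_c)
    counts, _ = np.histogram(err, bins=edges)
    return 100.0 * counts / err.size

# Hypothetical x-coordinates (mm): errors are 0.5, 1.0, 3.0, and 7.0.
true_x = np.array([0.0, 5.0, 10.0, 20.0])
pred_x = np.array([0.5, 6.0, 13.0, 27.0])
print(error_breakdown(true_x, pred_x))  # [50.  0. 25. 25.]
```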
Finally, to evaluate the performance against a commonly used conventional method, we compare our results with those obtained with CoG:

$$x_{\mathrm{CoG}} = \frac{\sum_i A_i x_i}{\sum_i A_i}, \qquad y_{\mathrm{CoG}} = \frac{\sum_i A_i y_i}{\sum_i A_i},$$

where $(x_i, y_i)$ are the sensor coordinates on the detector plate, $A_i$ is the corresponding signal amplitude, and $(x_{\mathrm{CoG}}, y_{\mathrm{CoG}})$ are the CoG positions.
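The classic amplitude-weighted CoG estimate is a one-liner in NumPy. The sketch below uses our own function name and a hypothetical three-sensor snippet; it implements the standard CoG weighting, not the paper's exact code:

```python
import numpy as np

def centre_of_gravity(coords, amplitudes):
    """Amplitude-weighted average of sensor coordinates: the classic
    Center of Gravity position estimate."""
    a = amplitudes.reshape(-1)        # (n_sensors,)
    c = coords.reshape(-1, 2)         # (n_sensors, 2) as (x, y)
    return (a[:, None] * c).sum(axis=0) / a.sum()

# Hypothetical 3-sensor snippet: positions in mm, arbitrary amplitudes.
coords = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 20.0]])
amps   = np.array([2.0, 1.0, 1.0])
print(centre_of_gravity(coords, amps))  # [5. 5.]
```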
As can be seen in Table 4, our deep-learning-based methods produce significantly better results than CoG on average. One benefit of analytical methods such as CoG is that they do not require a big dataset to learn from, unlike a deep learning model. In our case, however, a sufficiently large dataset is available, so the significant gain in performance is definitely preferable.
To compare CoG and the FCN in more detail, we visualize hit location estimations for a subset of coordinates, i.e., with a 5 mm step, in Figure 6. The area in this figure corresponds to the red dotted area shown in Figure 2. The dark blue area represents coordinates with no estimations, and the transition from light blue to red represents an increasing number of hits estimated at the respective coordinates. We show only twenty-five hit locations in this display; we did not include all hit locations used in the experiment, in order to present the results clearly, but the results follow the same trend for the other hit locations as well. We present the average errors obtained for these locations together with the estimations. Due to a technical issue in our simulation, some of the data at one hit location have an uncharacteristic distribution of signal amplitudes. To avoid biases from this spurious effect, we use hits at a neighboring location instead, as the results follow the same trend.
For CoG, in Figure 6a,c, hits at the sensor locations are estimated more accurately than hits at other locations. CoG weights the closest sensors the most because of their high signal amplitudes, so this behavior is expected. The performance decreases as the hit locations move away from the sensors. Moreover, the estimations around the sensors are biased towards the sensors, which lowers the performance in those regions. This creates a characteristic pattern in the CoG estimations, which is expected from the nature of the algorithm. The FCN estimations, on the other hand, follow a different pattern. Comparing Figure 6c,d, we can see that the proposed method obtains smaller errors than CoG for all hit locations except four; for these particular locations, the average errors of the two methods are similar. Compared with CoG, the FCN estimations are concentrated more tightly around the actual hit locations, since the errors are smaller.
An important point to consider for the FCN is generalization. In Figure 6b, the estimations at the edges have sharper shapes than the inner ones. We argue that this results from the model characterizing the restricted area of the training data: the model learns that the hits are confined to this particular area, so its estimations are limited to the same area. This issue can be overcome by using a dataset that includes hits from the whole targeted area. To demonstrate that, we simply multiplied the current data symmetrically along the axes to cover an extended area and trained another model on the extended data. The multiplication is performed by first exploiting the symmetry of the signal amplitudes and corresponding hit locations about the y-axis, then exploiting their symmetry about the x-axis. The results of this experiment can be seen in Figure 7. The average error stays the same as before, at 0.72 mm; a similar error is expected since the data are symmetrical. In the figure, previously restricted locations, such as hits along the axes, are now estimated without restriction. This experiment shows that the method can cover the whole plate if the data are generated accordingly; in that case, the restricted area would be the plate itself, which is the desired outcome.
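The symmetric multiplication of the data can be sketched as a mirroring of both the amplitude grids and the hit coordinates. The helper below is an illustrative assumption about the procedure (names, array layout, and sign convention are ours): flipping a sensor-amplitude grid across an axis mirrors the corresponding hit coordinate about that axis, quadrupling the dataset.

```python
import numpy as np

def mirror_dataset(images, xy):
    """Quadruple a dataset by exploiting left-right and up-down symmetry.

    `images` holds per-hit sensor-amplitude grids of shape (hits, h, w);
    `xy` holds hit coordinates of shape (hits, 2), measured relative to
    the symmetry axes. Flipping a grid mirrors the hit position."""
    imgs = [images,
            np.flip(images, axis=2),        # mirror about the y-axis
            np.flip(images, axis=1),        # mirror about the x-axis
            np.flip(images, axis=(1, 2))]   # both mirrors combined
    signs = np.array([[1, 1], [-1, 1], [1, -1], [-1, -1]], dtype=float)
    coords = [xy * s for s in signs]
    return np.concatenate(imgs), np.concatenate(coords)

# Dummy data: 10 identical hits with one bright corner sensor.
imgs = np.zeros((10, 9, 9)); imgs[:, 0, 8] = 1.0
xy = np.tile([7.0, 3.0], (10, 1))
big_imgs, big_xy = mirror_dataset(imgs, xy)
print(big_imgs.shape, big_xy.shape)  # (40, 9, 9) (40, 2)
```

Because the detector and sensor layout are symmetric, the mirrored samples are physically valid, which is why the average error is expected to stay unchanged.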
5. Conclusions
Cosmic-ray tomography systems utilize various detector plates, and their imaging performance depends heavily on the estimation of particle hit positions on those plates. A precise estimation of the hit positions increases the quality of the produced image; conversely, a bad estimation may yield low-quality or even meaningless images. Therefore, a successful position estimation is of utmost importance in such systems. Traditionally, analytical methods such as Center of Gravity have been used for particle hit position estimation. These methods do not require a big dataset, but their performance is relatively poor, as they do not exploit all the information available in a detector layer. Thus, research has been conducted on alternative methods, such as deep learning techniques. So far, deep-learning-based methods have produced promising results for hit position estimation, but a research gap remains.
In this paper, we investigated the performance of two deep-learning-based networks, i.e., an FCN and a CNN, on the muon hit position estimation task. We built a muon hit dataset produced with GEANT4 and obtained an average error of 0.72 mm using the FCN with the 81-sensor data, a significant improvement over an analytical method such as CoG. We also investigated the effects of different sensor configurations in our setup and saw that 25 sensors produce results almost as good as 81 sensors. As future work, the effect of different step sizes between simulated hit positions can be investigated to improve the performance further. Moreover, the dataset can be extended to cover a larger plate area and to include varying hit angles. Additionally, we plan to build a realistic experimental setup to compare the results of this study with real device data.