Article
Peer-Review Record

Unveiling Insights: Harnessing the Power of the Most-Frequent-Value Method for Sensor Data Analysis

Sensors 2023, 23(21), 8856; https://doi.org/10.3390/s23218856
by Victor V. Golovko *, Oleg Kamaev and Jiansheng Sun
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 22 September 2023 / Revised: 21 October 2023 / Accepted: 24 October 2023 / Published: 31 October 2023
(This article belongs to the Special Issue Advances in Particle Detectors and Radiation Detectors)

Round 1

Reviewer 1 Report

The manuscript presents the most frequent value (MFV) method for the analysis of sensor data, which provides a more effective and accurate approach for estimating the most commonly observed value. The manuscript is well prepared and clearly stated. Here are a few comments.

 

1.     According to Equations 1 and 2, the dihesion parameter is very important in the iteration process. The authors also present the iteration equation for the dihesion parameter. It would be easier for readers to understand how the dihesion parameter is determined if the authors could give a simple example.

2.     Why is the confidence level chosen as 68.27% or 95.45%? I suggest the authors give a simple explanation.

3.     It is easy to understand that the MFV method would of course give a more reliable and reasonable value for the estimation of sensor data analysis. However, due to the need for multiple iterations, I would guess that the time required for computation should be much larger than for other general methods, e.g., the mean value method. What is the difference in the required computation time between the MFV method and other general methods if the volume of data is very large?

Author Response

Dear Reviewer,

I would like to sincerely thank you for your valuable suggestions and comments on the manuscript. Your input has been particularly interesting and greatly appreciated. Please find below the detailed responses to each of your comments (they are also highlighted in "red" color in the manuscript).

Thank you once again for your time and effort in reviewing our work.

Best regards,
Victor Golovko

Comment 1:
According to Equations 1 and 2, the dihesion parameter is very important in the iteration process. The authors also present the iteration equation for the dihesion parameter. It would be easier for readers to understand how the dihesion parameter is determined if the authors could give a simple example.

Response 1:
Thank you for your feedback. We really appreciate your suggestion. If you are interested in learning more about how the dihesion parameter is determined, you can find detailed explanations in the books referenced as [Steiner, F. The Most Frequent Value. Introduction to a Modern Conception of Statistics; Akadémiai Kiadó, Budapest, 1991] and [Steiner, F. Optimum Methods in Statistics; Akadémiai Kiadó, Budapest, 1997]. Additionally, if you would like a practical example of how the dihesion is derived, you can refer to [Steiner, F. Most Frequent Value Procedures (a Short Monograph); Geophysical Transactions 1988, 34, 139–260] and [Szabó, N.P.; Balogh, G.P.; Stickel, J. Most Frequent Value-Based Factor Analysis of Direct-Push Logging Data: MFV-based Factor Analysis; Geophysical Prospecting 2018, 66, 530–548] (see lines 153 to 155 in the manuscript).
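
For illustration, the following minimal numerical sketch (in Python) shows how the dihesion can be determined, assuming the standard Steiner iteration formulas for the most frequent value M and the dihesion; the function name, starting values, tolerance, and toy dataset are ours for illustration and are not taken from the manuscript.

import numpy as np

def mfv_dihesion(x, tol=1e-8, max_iter=100):
    # Iterate the most frequent value M and the dihesion eps for a 1-D sample.
    x = np.asarray(x, dtype=float)
    m = np.median(x)                                 # starting location
    eps = np.sqrt(3.0) / 2.0 * (x.max() - x.min())   # common starting dihesion
    for _ in range(max_iter):
        r2 = (x - m) ** 2
        w = eps**2 / (eps**2 + r2)                   # Steiner weights
        m_new = np.sum(w * x) / np.sum(w)            # location update
        r2 = (x - m_new) ** 2
        eps_new = np.sqrt(3.0 * np.sum(r2 / (eps**2 + r2) ** 2)
                          / np.sum(1.0 / (eps**2 + r2) ** 2))
        if abs(m_new - m) < tol and abs(eps_new - eps) < tol:
            return m_new, eps_new
        m, eps = m_new, eps_new
    return m, eps

# Toy dataset: a tight cluster of readings plus one outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 25.0]
m, eps = mfv_dihesion(data)
print(f"MFV = {m:.3f}, dihesion = {eps:.3f}")
# The MFV settles near the cluster around 10, while the arithmetic mean
# of this toy sample is pulled toward the outlier (about 12.1).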

Comment 2:
Why is the confidence level chosen as 68.27% or 95.45%? I suggest the authors give a simple explanation.

Response 2:
Thank you for your question. The confidence levels of 68.27% and 95.45% were chosen because they are widely used in statistical analysis. They indicate the level of certainty or reliability in the results and correspond to 1 and 2 sigma errors, which are measures of the variability or uncertainty in the data (see lines 282-283 in the manuscript).
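
As a quick numerical check (not part of the manuscript), these two levels are simply the two-sided coverage probabilities of a standard normal distribution at 1 and 2 sigma:

from math import erf, sqrt

for k in (1, 2):
    coverage = erf(k / sqrt(2.0))  # P(|Z| <= k sigma) for a standard normal
    print(f"{k} sigma -> {100 * coverage:.2f}% confidence level")
# prints: 1 sigma -> 68.27% confidence level
#         2 sigma -> 95.45% confidence level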

Comment 3:
It is easy to understand that the MFV method would of course give a more reliable and reasonable value for the estimation of sensor data analysis. However, due to the need for multiple iterations, I would guess that the time required for computation should be much larger than for other general methods, e.g., the mean value method. What is the difference in the required computation time between the MFV method and other general methods if the volume of data is very large?

Response 3:
Thank you for your insightful comment. You are correct in understanding that the MFV method provides a more reliable and reasonable estimation for sensor data analysis. However, it is true that the computation time required by the MFV method may be larger than that of other general methods, such as the mean value method.

When dealing with a large volume of data, the difference in computation time between the MFV method and other general methods can be significant. The MFV method may take longer to compute due to its iterative nature, whereas other general methods may be faster as they do not require multiple iterations. It is important to consider this trade-off between accuracy and computation time when choosing the appropriate method for analyzing large volumes of data. The maximum number of iterations reached during estimation of the MFV did not exceed 80; therefore, although the MFV computation takes longer than estimation using the mean value method, it still completes in a reasonable time (see lines 163 to 169 in the manuscript).
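
To make the trade-off concrete, here is a rough timing sketch (illustrative only; the mfv() helper below is our own vectorized re-implementation of a Steiner-type iteration capped at 80 iterations, and the simulated data are hypothetical) comparing a single-pass mean with the iterative MFV:

import time
import numpy as np

def mfv(x, max_iter=80):
    # Steiner-type iteration for the most frequent value, capped at 80 iterations.
    m = np.median(x)
    eps = np.sqrt(3.0) / 2.0 * (x.max() - x.min())
    for _ in range(max_iter):
        r2 = (x - m) ** 2
        m = np.sum(x / (eps**2 + r2)) / np.sum(1.0 / (eps**2 + r2))
        r2 = (x - m) ** 2
        eps = np.sqrt(3.0 * np.sum(r2 / (eps**2 + r2) ** 2)
                      / np.sum(1.0 / (eps**2 + r2) ** 2))
    return m

x = np.random.default_rng(0).normal(35.0, 3.5, size=1_000_000)  # simulated readings

t0 = time.perf_counter(); x.mean(); dt_mean = time.perf_counter() - t0
t0 = time.perf_counter(); mfv(x);   dt_mfv = time.perf_counter() - t0
print(f"mean: {dt_mean:.4f} s   MFV (80 iterations): {dt_mfv:.4f} s")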


Reviewer 2 Report

I am not an expert statistician, so the Editor is of course welcome to disregard the following comments.

But it seems to me that throughout the paper the authors are fond of quoting numbers to a greater accuracy than seems justified by the context, especially given the spread of the data in Table 1.  For example, the authors persist in quoting confidence limits to four figures, even though their '68.27%' confidence limit means that there is, to paraphrase, a ~2-in-3 chance of being right and a ~1-in-3 chance of being wrong.  I doubt if it really matters whether the chance of being wrong about something is (100.00-68.27)% = 31.73% or (100-68)% = 32%.

That aside, I'm not sure that what is said in the 'Conclusions' section is sufficiently specific.  According to Section 1.1, the aim of the paper is to 'establish reliable confidence intervals for the non-uniform background radiation ...'.  In the 'Conclusions' section the two numbers highlighted are 35.19 with an uncertainty range of +3.41 to −3.59, and 34.80 with an uncertainty range of +3.58 to −3.48.  Are these the numbers that are intended to show that the background ambient radiation level around the detector and its shielding is indeed non-uniform — as desired in Section 1.1?  If they are, I think the authors should be clearer about it.  But if I am interpreting these numbers aright, my personal view is that these numbers are not really different.

The paper is probably a good illustration of statistical manipulation, but as to its demonstration of 'non-uniform background radiation' I have my doubts. 

There seems to be some needless repetition of text within the paper, but that is a matter for the Editor.

However, as I said before, I am not an expert statistician.

Author Response

Dear Reviewer,

I would like to express my sincere gratitude for your invaluable suggestions and comments on the manuscript. Your input has proven to be particularly intriguing and highly appreciated. Enclosed herewith are the comprehensive responses addressing each of your comments (which have also been visually emphasized in "green" within the manuscript).

I extend my heartfelt thanks once again for dedicating your time and exerting effort in reviewing our work.

Yours sincerely,
Victor Golovko

I am not an expert statistician, so the Editor is of course welcome to disregard the following comments.

Comment 1:
But it seems to me that throughout the paper the authors are fond of quoting numbers to a greater accuracy than seems justified by the context, especially given the spread of the data in Table 1.  For example, the authors persist in quoting confidence limits to four figures, even though their '68.27%' confidence limit means that there is, to paraphrase, a ~2-in-3 chance of being right and a ~1-in-3 chance of being wrong.  I doubt if it really matters whether the chance of being wrong about something is (100.00-68.27)% = 31.73% or (100-68)% = 32%.

Response 1:
The MFV technique and confidence interval bootstrapping were used to analyze neutron lifetime measurements. We used the original neutron lifetime data from a study (Zhang et al., 2022) as a test to check whether the MFV and bootstrapping algorithms give consistent results. Even though the bootstrap process is random, we found that it is better to use a confidence level specified to the second decimal place (for example, 1 sigma corresponds to 68.27%) in order to replicate the MFV confidence interval for neutron lifetime measurements. Therefore, we adopted the same approach in this study (see lines 185 to 191 in the manuscript).
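
For context, here is a minimal sketch of the percentile bootstrap referred to above (illustrative code with assumed names and made-up data; the sample median stands in for the MFV estimator, which is not reproduced here):

import numpy as np

def percentile_bootstrap_ci(x, statistic, level=68.27, n_boot=10_000, seed=1):
    # Resample with replacement, recompute the statistic, and read off the
    # two-sided percentile interval at the requested confidence level.
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    boot = np.array([statistic(rng.choice(x, size=x.size, replace=True))
                     for _ in range(n_boot)])
    alpha = (100.0 - level) / 2.0
    return np.percentile(boot, [alpha, 100.0 - alpha])

data = np.random.default_rng(2).normal(35.0, 3.5, size=50)  # made-up readings
low, high = percentile_bootstrap_ci(data, np.median, level=68.27)
print(f"68.27% bootstrap interval: [{low:.2f}, {high:.2f}]")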

Comment 2:
That aside, I'm not sure that what is said in the 'Conclusions' section is sufficiently specific.  According to Section 1.1, the aim of the paper is to 'establish reliable confidence intervals for the non-uniform background radiation ...'.  In the 'Conclusions' section the two numbers highlighted are 35.19 with an uncertainty range of +3.41 to −3.59, and 34.80 with an uncertainty range of +3.58 to −3.48.  Are these the numbers that are intended to show that the background ambient radiation level around the detector and its shielding is indeed non-uniform — as desired in Section 1.1?  If they are, I think the authors should be clearer about it.  But if I am interpreting these numbers aright, my personal view is that these numbers are not really different.

Response 2:
Thank you for your feedback. We appreciate your comments regarding the specificity of the 'Conclusions' section. We understand that you are unsure if the highlighted numbers, 35.19 with an uncertainty range of +3.41 to −3.59, and 34.80 with an uncertainty range of +3.58 to −3.48, are intended to demonstrate the non-uniformity of the background radiation as stated in Section 1.1.

To clarify, yes, these numbers are indeed intended to show the MFV and estimated uncertainty of the background ambient radiation level around the detector and its shielding, as mentioned in Section 1.1. We apologize if this was not clearly explained in the 'Conclusions' section, and we will make sure to provide a clearer explanation in the revised version of the paper. Our focus was on determining the confidence intervals based on actual measurements using passive sensors around the detector and its shielding, with a specific level of confidence. We have previously discussed the uneven distribution of background radiation around the detector and its shielding in detail in Golovko et al. (2023). For future dark matter detectors, it would be useful to know the upper level of the confidence interval, which can be used as a conservative estimate of the ambient radiation level for designing the shielding.

Once again, we thank you for your valuable feedback, and we will address these concerns in the revised version of the paper (see lines 397 to 411 in the manuscript).

Comment 3:
The paper is probably a good illustration of statistical manipulation, but as to its demonstration of 'non-uniform background radiation' I have my doubts. 

Response 3:
We appreciate the feedback from the reviewer. We understand that the reviewer has doubts about our demonstration of 'non-uniform background radiation' and we would like to address their concerns.

We would like to acknowledge that the concept of non-uniform background radiation is a complex and nuanced topic. It was previously discussed in the paper by Golovko et al. titled "Ambient Dose and Dose Rate Measurement in SNOLAB Underground Laboratory at Sudbury, Ontario, Canada"; Sensors 2023, 23. Our research aimed to investigate and analyze the confidence intervals of the non-uniformity in background radiation levels, using statistical techniques to support our findings. Recently, we applied a method based on the most frequent value approach combined with bootstrap analysis that provides a more robust way to estimate historical measurements of the 39Ar half-life. This method resulted in an uncertainty a factor of 3 smaller than that of the most precise re-calculated 39Ar half-life measurement at the 68% confidence level, as presented in the paper by Golovko titled "Application of the most frequent value method for 39Ar half-life determination"; The European Physical Journal C (EPJC) 83 (10), 930 (see lines 133 to 137 in the manuscript). The editor of that journal was interested in the idea of reducing the uncertainty using the suggested methods. Moreover, the referee of that paper recommended tagging the paper on arXiv under several categories, so that both nuclear/particle physicists and geophysicists can easily find it.

Comment 4:
There seems to be some needless repetition of text within the paper, but that is a matter for the Editor.

Response 4:
We appreciate the feedback from the reviewer.


However, as I said before, I am not an expert statistician.


Reviewer 3 Report

- line 4: Please remember that this is not a journal specialized in particle physics. At this point of the abstract, and also in the introduction, it could be not clear to the casual reader what you mean by “environmental gamma background”. The authors should first introduce the role of ambient radiation in rare event searches in particle physics (lines 6-7) and only later describe the application of the technique to gamma radiation. A more general introduction to the challenges of rare event searches, already in the abstract, would also be appreciated.

 

- line 88-89: please be more quantitative on the requirements to exclude dose values from the analysis.

 

- line 104-105: This is not a very interesting example, because in most cases the possible outcomes of the measurement are almost continuous, with very little chance of getting exactly the same value more than once, and the mode is estimated by taking a histogram, which, with an appropriate choice of the binning, would work also for the second series of numbers. I suggest stressing here, instead, the advantages of the MFV approach over a histogram-based approach.

 

- Eq. 3: I presume epsilon here is the convergence value of the iterations in Eq. 2. Please specify it in the text.

 

- line 203: I don’t understand how the control dosimeters are used. Also, they give a measurement that is higher than almost all other dosimeters. It is puzzling because my understanding was that these control dosimeters should give a sort of pedestal to be subtracted from the other measurements.

 

- line 273-279: Most of this is already written in lines 215-221. Please remove repetitions.

 

- Table 1: It is not straightforward to understand what badges are used for the measurements at lines 239-249 and what are used for the measurements at lines 250-260. I suggest modifying Table 1 to add this information (e.g. by adding one or two columns indicating in which measurement each badge is used).


Author Response

Dear Reviewer,

I am writing to express my deepest gratitude for your invaluable suggestions and comments on the manuscript. Your input has proven to be particularly engaging and highly appreciated. I have enclosed the comprehensive responses addressing each of your comments, which have also been visually emphasized in "blue" within the manuscript.

I would like to sincerely thank you once again for dedicating your time and exerting effort in reviewing our work.

Yours sincerely,
Victor Golovko

- line 4: Please remember that this is not a journal specialized in particle physics. At this point of the abstract, and also in the introduction, it could be not clear to the casual reader what you mean by “environmental gamma background”. The authors should first introduce the role of ambient radiation in rare event searches in particle physics (lines 6-7) and only later describe the application of the technique to gamma radiation. A more general introduction to the challenges of rare event searches, already in the abstract, would also be appreciated.

Response:
We appreciate the reviewer's feedback and acknowledge their concerns regarding the clarity of our terminology and the need for a more general introduction to rare event searches. We will address these points in the revised version of our manuscript.

Regarding the term "environmental gamma background," we understand that it may not be immediately clear to readers outside the field of particle physics (see lines 5 to 6). In response to the reviewer's suggestion, we will provide a brief introduction to the role of ambient radiation in rare event searches in particle physics. This will help contextualize our study and clarify the significance of the environmental gamma background in our research.

Furthermore, we agree that a more general introduction to the challenges of rare event searches would enhance the abstract. We will expand the introductory section to provide a broader overview of the difficulties faced in detecting rare events, including the importance of minimizing background noise and optimizing signal-to-noise ratios (see lines 37 to 42). By doing so, we aim to make our research more accessible to a wider audience, including readers who may not be familiar with the specific field of particle physics.

Thank you for your valuable feedback, which will undoubtedly improve the clarity and comprehensibility of our manuscript. We will incorporate these suggestions into the revised version to ensure that our research is more effectively communicated to both specialized and non-specialized readers.

- line 88-89: please be more quantitative on the requirements to exclude dose values from the analysis.

Response:
Additional specific information about the criteria for excluding dose values from the analysis can be found in a separate publication by Golovko et al. (2023) (see lines 97 to 98).
 
- line 104-105: This is not a very interesting example, because in most cases the possible outcomes of the measurement are almost continuous, with very little chance of getting exactly the same value more than once, and the mode is estimated by taking a histogram, which, with an appropriate choice of the binning, would work also for the second series of numbers. I suggest stressing here, instead, the advantages of the MFV approach over a histogram-based approach.

Response:
The example provided here is a basic made-up dataset used to demonstrate the distinction between mode and MFV. To explore more recent and intriguing datasets, as well as the benefits of using the MFV approach compared to other statistical methods, you can refer to the following sources: Zhang et al. (2017), Zhang et al. (2018), Zhang et al. (2022), and Golovko (2023) (see lines 116 to 121).
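
As a small aside (made-up data, not from the manuscript), the following snippet illustrates why a histogram-based mode estimate depends on the bin width for near-continuous measurements, which is the limitation the MFV approach avoids by working directly on the data values:

import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(10.0, 0.5, size=40)  # continuous readings: no exact repeats

for bins in (5, 20):
    counts, edges = np.histogram(data, bins=bins)
    peak = np.argmax(counts)
    mode_estimate = 0.5 * (edges[peak] + edges[peak + 1])  # centre of the tallest bin
    print(f"{bins:2d} bins -> histogram mode estimate {mode_estimate:.2f}")
# The estimate shifts with the binning choice; the MFV iteration requires no binning.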

- Eq. 3: I presume epsilon here is the convergence value of the iterations in Eq. 2. Please specify it in the text.

Response:
Done (see lines 176 to 177).

- line 203: I don’t understand how the control dosimeters are used. Also, they give a measurement that is higher than almost all other dosimeters. It is puzzling because my understanding was that these control dosimeters should give a sort of pedestal to be subtracted from the other measurements.

Response:
Details on how the control dosimeters are used are provided elsewhere (see lines 242 to 243).

- line 273-279: Most of this is already written in lines 215-221. Please remove repetitions.

Response:
Done (the paragraph removed).
 

- Table 1: It is not straightforward to understand what badges are used for the measurements at lines 239-249 and what are used for the measurements at lines 250-260. I suggest modifying Table 1 to add this information (e.g. by adding one or two columns indicating in which measurement each badge is used).

Response:
More information about the positions of the TLD badges mentioned in Table 1 can be found in Golovko et al. (2023) (see lines 275 to 276).
 

