Article
Peer-Review Record

Can Data and Machine Learning Change the Future of Basic Income Models? A Bayesian Belief Networks Approach

by Hamed Khalili
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 21 December 2023 / Revised: 10 January 2024 / Accepted: 12 January 2024 / Published: 23 January 2024

Round 1

Reviewer 1 Report (New Reviewer)

Comments and Suggestions for Authors

The article addresses the prediction of households' vulnerability to future poverty. The data are the main point of originality of this work. The methodology is appropriate and I find the results interesting and well discussed. However, I think that there are several methodological details and clarifications to be added in order to make the article suited for publication. Please, see below.

It is not clear why the authors consider Bayesian networks as a machine learning method. Basically, they are multivariate statistical models, thus this affirmation should be better explained. Also, they say that Bayesian networks are an "explainable" machine learning method. Again, the authors should say why Bayesian networks can be considered explainable.

Title: in my opinion, the specific method used (Bayesian networks) should appear in the title. The term "machine learning" is too vague.

Table 1: I suggest adding a column indicating the nature of each variable (discrete or continuous), together with the list of the levels.

In Section 4, the formal definition of Bayesian networks must be added. Usually, a Bayesian network is defined as composed of a qualitative part (DAG) and a quantitative part (conditional probability tables). Consider adding a citation for the definition: there are many textbooks that can be followed (Neapolitan, 2003 is a possibility).

Line 258: which scoring metric was used?

Line 264-265: which inference algorithm was used?

Lines 260-261: it is not correct to say that the DAG was computed. It was estimated. What do the authors mean by Bayesian computations? Did they use priors on DAGs and/or on parameters? Details must be provided.

Software: is pyAgrum the software employed? It is not clear. Please, clarify and provide a citation.

Figure 4: legends overlap the lines. It would be better to move them to a corner.

The need to discretize continuous variables is a limitation of Bayesian networks that may influence the results. This should be stated in the conclusions.

Author Response

Comments and Suggestions for Authors

The article addresses the prediction of households' vulnerability to future poverty. The data are the main point of originality of this work. The methodology is appropriate and I find the results interesting and well discussed. However, I think that there are several methodological details and clarifications to be added in order to make the article suited for publication. Please, see below.

  1. It is not clear why the authors consider Bayesian networks as a machine learning method. Basically, they are multivariate statistical models, thus this affirmation should be better explained. Also, they say that Bayesian networks are an "explainable" machine learning method. Again, the authors should say why Bayesian networks can be considered explainable.
  2. Title: in my opinion, the specific method used (Bayesian networks) should appear in the title. The term "machine learning" is too vague.
  3. Table 1: I suggest adding a column indicating the nature of each variable (discrete or continuous), together with the list of the levels.
  4. In Section 4, the formal definition of Bayesian networks must be added. Usually, a Bayesian network is defined as composed of a qualitative part (DAG) and a quantitative part (conditional probability tables). Consider adding a citation for the definition: there are many textbooks that can be followed (Neapolitan, 2003 is a possibility).
  5. Line 258: which scoring metric was used?
  6. Line 264-265: which inference algorithm was used?
  7. Lines 260-261: it is not correct to say that the DAG was computed. It was estimated. What do the authors mean by Bayesian computations? Did they use priors on DAGs and/or on parameters? Details must be provided.
  8. Software: is pyAgrum the software employed? It is not clear. Please, clarify and provide a citation.
  9. Figure 4: legends overlap the lines. It would be better to move them to a corner.
  10. The need to discretize continuous variables is a limitation of Bayesian networks that may influence the results. This should be stated in the conclusions.

Author’s Reply:

  1. BBNs are presented as interpretable AI methods in the literature, e.g., in Mihaljevic et al., 2021. The reason for calling them interpretable is now briefly explained in Section 1, supported by two added references, which are listed in the references section.
  2. The title has been updated to name the Bayesian approach used in this paper. However, I would prefer to keep the first, generic part of the title, as we expect to elaborate on the subject and data of this study with further ML methods in the near future.
  3. The type of each variable has been added to Table 1.
  4. A definition and a formal representation of BBNs have been added at the beginning of Section 4, together with Equations 1 and 2.
  5. This study uses the BIC score (Koller & Friedman, 2009), a log-likelihood score with an additional penalty for network complexity that guards against overfitting. Please see the highlighted paragraph in lines 273-285.
  6. Please see the highlighted phrase. We are examining the feasibility of obtaining reliable inferences about the posterior probabilities of cash accessibility for any household in an upcoming year of interest.
  7. The required change has been made. Please see the updated, highlighted paragraph in lines 273-285.
  8. The required change has been made. Please see the highlighted paragraph in lines 273-285.
  9. Figure 4 has been redrawn. The Y-axis labels and the subplots' legend positions have been updated.
  10. The proposed statement has been added to the conclusion and discussion section.
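
To make reply 5 concrete, the following is a minimal sketch of a BIC-style score for a discrete network: the maximized log-likelihood minus (log N / 2) times the number of free parameters. It is written in plain Python and is not the code used in the paper; the function name `bic_score`, its arguments, and the two-variable toy data are assumptions made purely for illustration.

```python
import math
from collections import Counter


def bic_score(data, parents, cardinality):
    """BIC of a discrete DAG: max log-likelihood - (log N / 2) * free parameters.

    data        -- list of dicts mapping variable name -> observed state
    parents     -- dict mapping variable name -> tuple of parent names (the DAG)
    cardinality -- dict mapping variable name -> number of possible states
    (All three arguments are hypothetical conventions chosen for this sketch.)
    """
    n = len(data)
    log_lik = 0.0
    n_params = 0
    for var, pa in parents.items():
        # Counts of (parent configuration, child state) and of parent configuration.
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        marginal = Counter(tuple(row[p] for p in pa) for row in data)
        # Maximum-likelihood CPT entries are count / parent-configuration count.
        for (pa_state, _), count in joint.items():
            log_lik += count * math.log(count / marginal[pa_state])
        # Free parameters: (states - 1) per parent configuration.
        n_configs = math.prod(cardinality[p] for p in pa)
        n_params += (cardinality[var] - 1) * n_configs
    return log_lik - 0.5 * math.log(n) * n_params


# Toy data (made up): in `copied`, B always equals A; in `independent`, it does not.
card = {"A": 2, "B": 2}
copied = [{"A": 0, "B": 0}] * 50 + [{"A": 1, "B": 1}] * 50
independent = [{"A": a, "B": b} for a in (0, 1) for b in (0, 1)] * 25

with_edge = {"A": (), "B": ("A",)}
no_edge = {"A": (), "B": ()}
```

On the `copied` data the structure with the edge A -> B scores higher, whereas on the `independent` data the likelihoods tie and the complexity penalty favours the structure without the edge, which is the overfitting guard the reply refers to.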

 

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

The main objective of the study is to examine the feasibility of predicting the vulnerability of households to future poverty based on the welfare characteristics of existing households. Is the goal achieved?

 

The topic is relevant.

The abstract and introduction are not clear enough.

The paper needs to provide more detail on the methodology and include appropriate citations to support the argument. The results and the conclusion are not very clear. They need more explanation. The tables would benefit from a more detailed description.

Author Response

Comments and Suggestions for Authors

The main objective of the study is to examine the feasibility of predicting the vulnerability of households to future poverty based on the welfare characteristics of existing households.

  1. Is the goal achieved? The topic is relevant.
  2. The abstract and introduction are not clear enough.
  3. The paper needs to provide more detail on the methodology
  4. and include appropriate citations to support the argument.
  5. The results and the conclusion are not very clear. They need more explanation.
  6. The tables would benefit from a more detailed description.

Author’s reply:

  1. As discussed in the conclusion and discussion section, the different metrics applied in our study show that it is possible, to some extent, to converge toward a balanced solution between highly precise prediction of relatively wealthier groups and the lowest possible number of false negatives.
  2. The introduction has been updated.
  3. A definition and a formal representation of BBNs have been added at the beginning of Section 4, together with Equations 1 and 2.
  4. The reason for choosing BBNs is now explained and highlighted.
  5. The results, limitations, and future scope are now explicitly discussed.
  6. In the previous review round, an extra table (Table 1) was added to explain Tables 2 and 3.
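
Reply 1 weighs precision against false negatives. As a reminder of how these two quantities are read off a binary confusion matrix, here is a minimal sketch; the function name and the counts are hypothetical and are not results from the study, and which household group is treated as the positive class is left open here.

```python
def precision_and_false_negative_rate(tp, fp, fn):
    """Return (precision, false-negative rate) from confusion-matrix counts.

    tp -- true positives, fp -- false positives, fn -- false negatives.
    Both ratios fall back to 0.0 when their denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    false_negative_rate = fn / (fn + tp) if (fn + tp) else 0.0
    return precision, false_negative_rate


# Made-up counts for illustration only.
precision, fnr = precision_and_false_negative_rate(tp=80, fp=20, fn=10)
```

Raising the decision threshold typically lowers `fp` (raising precision) while raising `fn`, which is the balance the reply describes.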

 

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

Dear Author, even though the article has merits, I have a few questions aimed at strengthening the approach to the paper's topic. The main issue is that basic income is tied to a financial year, which starts on January 1 and ends on December 31. Every year, new laws and financial regulations appear and can change the values associated with the basic income (new fiscal rules are usually imposed at the beginning of each year). Although I understand the reason for analyzing the Iranian case (as the first country in the world to provide a basic income system to all its citizens), the data, as mentioned in Table 1, correspond to the Iranian years (1395, 1396, 1397, 1398), which in the European calendar do not coincide with fiscal years (each runs from March 20 of one year to March 20 of the next).

Then, it is important to explain whether the figures represent results processed by the authors. At the same time, you have to explain how these figures were generated (using the software......, insert a capture designed in ......etc.). Also, you need to find a way to generate each figure at a higher resolution, because it is important to include all these figures within the article (not as supplementary files).

The last section could be divided into two: Discussions and Conclusions.

One question I have concerns fiscal regulations. In what way could rules that differ from country to country give rise to specific features that are incompatible with the approach chosen here? Is this approach sufficiently comprehensive for the fiscal rules of each country?

An easier aspect to solve is that in the case of Figures 3 and 4, the titles must start with a capital letter. Good luck!

Comments on the Quality of English Language

It is mandatory to apply several rounds of proofreading. Extensive spell-checking is necessary.

Author Response

  1. Dear Author, even though the article has merits, I have a few questions aimed at strengthening the approach to the paper's topic. The main issue is that basic income is tied to a financial year, which starts on January 1 and ends on December 31. Every year, new laws and financial regulations appear and can change the values associated with the basic income (new fiscal rules are usually imposed at the beginning of each year). Although I understand the reason for analyzing the Iranian case (as the first country in the world to provide a basic income system to all its citizens), the data, as mentioned in Table 1, correspond to the Iranian years (1395, 1396, 1397, 1398), which in the European calendar do not coincide with fiscal years (each runs from March 20 of one year to March 20 of the next).
  2. Then, it is important to explain whether the figures represent results processed by the authors. At the same time, you have to explain how these figures were generated (using the software......, insert a capture designed in ......etc.).
  3. Also, you need to find a way to generate each figure at a higher resolution, because it is important to include all these figures within the article (not as supplementary files).
  4. The last section could be divided into two: Discussions and Conclusions.
  5. One question I have concerns fiscal regulations. In what way could rules that differ from country to country give rise to specific features that are incompatible with the approach chosen here? Is this approach sufficiently comprehensive for the fiscal rules of each country?
  6. An easier aspect to solve is that in the case of Figures 3 and 4, the titles must start with a capital letter. Good luck!

Author’s reply:

  1. In this paper we discuss only the Iranian case. European dates are mentioned solely to translate the Iranian calendar into the European one. Our analysis is not intended as a universal model for all countries. It aims to examine the feasibility of predicting future vulnerability to poverty based on a rich existing dataset from Iran.
  2. The Jupyter Notebook to reproduce all results and figures is included in the supplementary material of this paper.
  3. Currently, Figures 1 and 5 are generated in PDF format; the others are generated in JPEG format. I hope all formats can be inserted into the main text of the final document, so that we do not need to place them in the supplementary material.
  4. The conclusion and discussion are tightly connected, so we prefer to keep them together.
  5. As mentioned in point 1, we do not claim to have achieved a generic approach that can be applied to all countries.
  6. Figures 3, 4, and 7 have been redrawn with capitalized axis labels.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Review of "A Bayesian Approach to Examine the Feasibility of Integrating Machine Learning to Recognize Households' Eligibility in a Basic Income System"

Abstract:

The abstract of the paper lacks key elements necessary to orient the reader effectively.

Introduction:

It does not provide a clear summary of the study's aim, research methods, data sources, or main findings. Furthermore, the introduction fails to articulate the significance and contribution of the study, which is essential for engaging the reader and establishing context.

Literature Review:

The absence of a dedicated literature review section is a notable shortcoming. While there is a brief discussion in the introduction, it does not sufficiently build the conceptual framework of the study. A comprehensive literature review is essential to contextualize the research, demonstrate awareness of prior work in the field, and highlight the gaps the study aims to address.

Data and Variables:

The paper does not adequately justify the selection of the 30 variables used in the analysis. The absence of a theoretical or conceptual framework leaves the reader wondering why these specific variables were chosen. Moreover, there is no explanation regarding the time period for the data used. Given that the paper refers to time periods such as 2016-2017 and 2017-2018, it is crucial to clarify whether the data is still relevant in the context of the current year, 2023.

Methodology:

The paper mentions a Bayesian approach but lacks a detailed description of the methodology employed. A clear exposition of the model, data preprocessing, and model training procedures is necessary for the reader to assess the validity and reliability of the study's findings.

Findings and Discussion:

Another significant shortcoming of the paper is the absence of a discussion of the findings. The reader is left uninformed about how the results of the analysis compare with existing research or how they contribute to the field. Without this critical analysis and interpretation, the findings remain disconnected and lack practical implications.

Limitations and Further Studies:

The paper does not address its limitations or suggest avenues for further research. It is essential to acknowledge any constraints or potential sources of bias in the study's design or data. Additionally, proposing future research directions would enhance the paper's completeness and scholarly value.

Conclusion:

 

In summary, the paper "A Bayesian Approach to Examine the Feasibility of Integrating Machine Learning to Recognize Households' Eligibility in a Basic Income System" requires significant improvement in several key areas. A clear statement of the research aim, robust methodology description, comprehensive literature review, justification for variable selection, and a discussion of findings are essential elements that should be incorporated to enhance the paper's validity and relevance. Addressing these issues will strengthen the paper and contribute to its overall quality.

Comments on the Quality of English Language

Moderate editing of English language required.

Author Response


1. Introduction:

The introduction has now been updated to reflect the study's significance and contribution.

2. Literature review:

The references to the previous literature section and the procedure are now visible in the text.

3. Data and variables:

The selection of the 30 variables is in line with the criteria used by the Ministry of Social Affairs in the case country examined. In addition, although the data will no longer be updated after 2020, the importance of the forecasting method can be demonstrated for future periods.

4. Discussions and limitations of the study:

This part has been extensively expanded and now covers the required points.

Reviewer 2 Report

Comments and Suggestions for Authors

 

1. The summary section should clearly state the background, methods, and results so that readers can obtain valuable information.

2. The content mentions using the official welfare statistics of Iranian citizens and a Bayesian network approach for analysis. While this suggests a scientific approach, more details about the methodology, data sources, and the Bayesian network's specific application would be beneficial.

3. The language used is clear and concise, making it easy for the reader to understand. However, some sentences are long and could be broken down for better readability.

4. The content lacks citations or references to support the claims and data used. Including references to relevant studies and data sources would enhance the paper's credibility.

5. The content hints at the potential implications of the research but does not explicitly state any findings or conclusions. A brief summary of the expected outcomes or insights would be helpful.

6. The content addresses an interesting and relevant topic—the use of machine learning in determining eligibility for basic income. However, it needs to provide more details about the methodology and include proper citations to strengthen its argument. Additionally, a clear conclusion or summary of expected results would improve the overall impact of the paper.

Overall, the content has the potential to contribute to the discussion on the implementation of basic income systems and the role of machine learning in eligibility determination, but it requires further development and refinement.

Author Response


1. Introduction:

The introduction has now been updated to reflect the study's significance and contribution.

2. Literature review:

The references to the previous literature section and the procedure are now visible in the text.

3. Data and variables:

The selection of the 30 variables is in line with the criteria used by the Ministry of Social Affairs in the case country examined. In addition, although the data will no longer be updated after 2020, the importance of the forecasting method can be demonstrated for future periods.

4. Discussions and limitations of the study:

This part has been extensively expanded and now covers the required points.

Reviewer 3 Report

Comments and Suggestions for Authors

1. The overall idea of the article is concise and clear. The expression is clear and easy to understand. In the article, you used expressions such as Figures 1 and 2, but in the rest of the article, I did not see the figures, so I cannot distinguish the actual performance of the proposed solution.

2. Have you ever conducted research on other papers with similar research content? The proposed model and algorithm perform well on the dataset used in the paper, but lack comparison with other articles to prove the superiority or inferiority of this scheme.

Comments on the Quality of English Language

Appropriate English expression ability.

Author Response

  1. The figure numbers have now been updated.
  2. The following improvements to the content have been made:

    1. Introduction:

    The introduction has now been updated to reflect the study's significance and contribution.

    2. Literature review:

    The references to the previous literature section and the procedure are now visible in the text.

    3. Data and variables:

    The selection of the 30 variables is in line with the criteria used by the Ministry of Social Affairs in the case country examined. In addition, although the data will no longer be updated after 2020, the importance of the forecasting method can be demonstrated for future periods.

    4. Discussions and limitations of the study:

    This part has been extensively expanded and now covers the required points.

Reviewer 4 Report

Comments and Suggestions for Authors


The paper deals with an important and timely topic: How to correctly identify households in need of transfer payments based on limited information.

I am a bit baffled about the lengthy discussion of basic income. I understand that a transition from a basic income to a means-tested one is necessary but this does not seem to play any role in the analysis. Both abstract and introduction would benefit from a focus on the main research question.

I am also a bit confused in the results section. Tables 2 and 3 present the findings but in a way that makes it very hard to follow. Maybe a graph would have been more useful? Or a ranking? I recommend a re-write that makes the reader's life easier.

Comments on the Quality of English Language

Typos such as "form" rather than "from" in Table 2 and 3 captions. Some sentences hard to understand.

Author Response

  1. The tables have now been updated using a confusion matrix.
  2. In addition, the following improvements have been made:

    1. Introduction:

    The introduction has now been updated to reflect the study's significance and contribution.

    2. Literature review:

    The references to the previous literature section and the procedure are now visible in the text.

    3. Data and variables:

    The selection of the 30 variables is in line with the criteria used by the Ministry of Social Affairs in the case country examined. In addition, although the data will no longer be updated after 2020, the importance of the forecasting method can be demonstrated for future periods.

    4. Discussions and limitations of the study:

    This part has been extensively expanded and now covers the required points.
