Article
Peer-Review Record

Is It Sufficient to Select the Optimal Class Number Based Only on Information Criteria in Fixed- and Random-Parameter Latent Class Discrete Choice Modeling Approaches?

Econometrics 2024, 12(3), 22; https://doi.org/10.3390/econometrics12030022
by Péter Czine 1, Péter Balogh 2,3,*, Zsanett Blága 4,5, Zoltán Szabó 6, Réka Szekeres 5, Stephane Hess 7 and Béla Juhász 5
Submission received: 21 May 2024 / Revised: 30 June 2024 / Accepted: 30 July 2024 / Published: 8 August 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The topic of the manuscript is interesting and would make a valuable contribution to the literature. However, I am a bit unsure as to what its aims exactly are. The introduction suggests a theoretical approach investigating the sufficiency of information criteria to determine the number of latent classes in RCL specifications. This type of paper simulates a model with known specifications (here, a specific number of classes) and then tests different methods to determine which one can accurately recover the given characteristics of the model. Sometimes, an illustration using empirical data is provided.

The manuscript I have reviewed only provides the illustration, and concludes that the IC indicates that 3-class and 4-class LCs are preferred, but other criteria have to be considered to determine the correct number of classes. But no one really knows the actual number of classes of an RCL model based on the paper’s sample of 1011 responses. The excellent empirical work of the paper suggests 3 or 4 classes, but there is no theoretical justification for this. In essence, my comment is that I would have expected more theory in this paper, given its title and the statements in the introduction. The empirical work is methodologically solid and conducted in a thorough and competent manner.

Minor issue: Section 2 starts with the statement: "This chapter will describe[..]". This is not a chapter.

My main suggestion would be either to transparently present the paper as an applied econometric investigation or to add more theory. The former implies that more detail about the experiment, its aims, and the implications of 3/4 classes should be given. At the moment, there is only a minimum of information to understand what it is about. If the latter, then there should be much more theoretical background on information criteria and their role in determining the number of latent classes.
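As an illustration of the selection step under discussion, the following minimal Python sketch computes AIC and BIC for several candidate class counts and reports which count each criterion prefers. The log-likelihoods and parameter counts are invented for illustration and are not the paper's results; the function name `information_criteria` is hypothetical, and using the 1011 respondents as the sample size in BIC is an assumption (some analysts use the number of choice observations instead).

```python
import math

def information_criteria(log_lik, n_params, n_obs):
    """Return (AIC, BIC) for a fitted model; lower values are preferred."""
    aic = 2 * n_params - 2 * log_lik
    bic = n_params * math.log(n_obs) - 2 * log_lik
    return aic, bic

# Hypothetical fits: class count -> (final log-likelihood, number of parameters).
# These values are made up for illustration and are NOT taken from the paper.
fits = {2: (-8200.0, 21), 3: (-8100.0, 32), 4: (-8070.0, 43)}

# Assumption: the sample size entering BIC is the number of respondents.
n_obs = 1011

table = {c: information_criteria(ll, k, n_obs) for c, (ll, k) in fits.items()}
best_aic = min(table, key=lambda c: table[c][0])
best_bic = min(table, key=lambda c: table[c][1])

for c, (aic, bic) in sorted(table.items()):
    print(f"{c} classes: AIC = {aic:.1f}, BIC = {bic:.1f}")
print("Preferred by AIC:", best_aic, "| preferred by BIC:", best_bic)
```

With these invented numbers the two criteria disagree, because BIC penalizes each additional parameter more heavily than AIC once the sample exceeds about eight observations. Such disagreement is one reason further criteria may be needed to settle the class count.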

Comments on the Quality of English Language

The authors need to thoroughly review the manuscript for typos and odd sentences, e.g., the last sentence of the abstract, which appears to be missing a pronoun or a verb, or the sentence at the top of page 6.

Author Response

Reviewer 1.

Questions:

Comments and Suggestions for Authors

Answers:

Thank you very much for your criticism and helpful suggestions!

We have tried to improve the paper on several points in line with your suggestions, highlighting its contribution to the literature.

The research was not described in detail because the experiment has already been published in a previous paper with a different focus, which is cited in this paper. A detailed description of the survey is given in that paper (Blága et al., 2023).

The description in that study is as follows: “Our research was conducted from March 2021 to September 2021 during the third wave of the coronavirus pandemic in Debrecen, the second-largest city in Hungary (202,402 people) [12]. Before the data collection, we consulted with several healthcare professionals in the field/doctors, and we also managed to conduct an online focus group interview with eight participants. The aim was to narrow down the range of factors influencing the choice of coronavirus vaccine to be tested. As a result of the process, we were able to identify seven vaccine attributes, which were as follows:

(1) country of origin (USA/European Union/Hungary/Russia/China);

(2) type of technology used in the production (old/new);

(3) the effectiveness of the vaccine (60–70%/71–90%/more than 90%);

(4) the type of possible side effect (according to the package leaflet/long-term);

(5) duration of protection provided by the vaccine (6 months/12 months/lifelong);

(6) the number of doses required to develop protection (1 dose/2 doses);

(7) the price of the vaccine (HUF 2000/HUF 6000/HUF 10,000/HUF 14,000).

 

After selecting the vaccine attributes, we designed the structure of our questionnaire, which consisted of three major parts. First, our respondents had to evaluate six statements (on a scale of 1 to 5) regarding the precautionary measures recommended by the National Public Health Centre during a pandemic, and then we presented the decision situations of the discrete choice experiment that formed the basis of our research. In the last section of the questionnaire, we asked questions about the COVID-19 pandemic and collected the sociodemographic characteristics of our respondents.

Following the design of the questionnaire, we conducted a pilot study with the participation of 83 individuals mostly working or receiving higher education, based on convenience sampling. Our goal was to provide feedback on difficult-to-understand parts of the questionnaire and to obtain preliminary information about respondents’ preferences, thus establishing a Bayesian-type experimental design. In our pilot study, we included the seven attributes presented earlier, and we created our D-efficient type design with Ngene 1.2 software [13]. Efficient experimental designs allow researchers to obtain reliable parameter estimates with a significantly smaller sample size. One such type is the D-efficient experimental design, which increases the efficiency of the design by minimizing the D-error (a determinant of the asymptotic variance-covariance matrix, which is an approximation of the real variance–covariance matrix) [14]. Our experimental design included 16 choices, each of which included three COVID-19 vaccine alternatives and one “no choice” option. Given the high number of decision-making situations, blocking was used, so respondents were faced with only a subset of situations (eight situations). An example of the decision situation of our pilot study is shown in Table 1.

Based on the results of our pilot study, the type of technology used to make the vaccine, the number of doses required for vaccination, and the price of the vaccine were omitted from our final design (these attributes did not have a significant impact on the choices; their coefficients did not differ significantly from zero). Using the significant coefficients, we designed a Bayesian D-efficient experimental design in which, as in the pilot study, the decision situations included three vaccine alternatives and one no-choice option. We again used blocking to avoid the fatigue effect [15]: the 32 decision situations of the experimental design were arranged into four blocks, so, as in the pilot study, each respondent had to choose in only eight decision situations. The vaccine attributes included in the final questionnaire are illustrated in Table 2, and an example of a decision situation is shown in Table 3.

This cross-sectional study was conducted by surveying active residents of Hajdú-Bihar County, who came to one of the three vaccination points in Debrecen (University of Debrecen, Kenézy Gyula Hospital, and the Outpatient Clinic) to receive the vaccination. Participation in the study was voluntary and the questionnaires were completed anonymously. The questionnaires were filled in at these three vaccination points in Debrecen from March 2021 to September 2021. As no distribution data are available for the studied population (vaccinated inhabitants of Hajdú-Bihar County), we cannot support the representativeness of the collected sample. It is also important to emphasize that a further limitation of our sample stems from the fact that at the time of data collection, vaccination of people under the age of 30 was already taking place in Hungary.”
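The D-error mentioned in the quoted design description can be illustrated with a small Python sketch. It assumes a plain multinomial logit model with fixed prior coefficients and computes the D-error as det(Ω)^(1/K), where Ω is the inverse of the Fisher information matrix and K is the number of parameters. It is not the Ngene implementation and not the authors' design: the toy designs, the zero priors, and all function names are hypothetical. A Bayesian D-error would additionally average this quantity over draws from a prior distribution.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for one choice set."""
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def information_matrix(design, beta):
    """Fisher information of an MNL model, summed over choice sets.

    design: list of choice sets; each choice set is a list of alternatives;
    each alternative is a list of K attribute levels.
    """
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for choice_set in design:
        utils = [sum(b * x for b, x in zip(beta, alt)) for alt in choice_set]
        probs = mnl_probabilities(utils)
        # Probability-weighted mean attribute vector of the choice set.
        mean = [sum(p * alt[j] for p, alt in zip(probs, choice_set)) for j in range(k)]
        for p, alt in zip(probs, choice_set):
            dev = [alt[j] - mean[j] for j in range(k)]
            for a in range(k):
                for b in range(k):
                    info[a][b] += p * dev[a] * dev[b]
    return info

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def d_error(design, beta):
    """D-error = det(AVC)^(1/K); det(AVC) = 1 / det(information)."""
    det_info = determinant(information_matrix(design, beta))
    if det_info <= 0:
        return math.inf  # parameters not identified by this design
    return det_info ** (-1.0 / len(beta))

# Hypothetical two-attribute designs with two alternatives per choice set.
priors = [0.0, 0.0]  # assumed prior coefficients (zero priors)
design_a = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 0.0]]]
design_b = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]  # no contrast

print("D-error, design A:", d_error(design_a, priors))
print("D-error, design B:", d_error(design_b, priors))
```

Design B offers identical alternatives in every choice set, so it carries no information about the coefficients and its D-error is infinite. A design search would prefer the candidate with the smaller D-error.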

Blaga, Z.; Czine, P.; Takacs, B.; Szilagyi, A.; Szekeres, R.; Wachal, Z.; Hegedus, C.; Buchholcz, G.; Varga, B.; Priksz, D.; Bombicz, M.; Szabo, A.M.; Kiss, R.; Gesztelyi, R.; Romanescu, D.D.; Szabo, Z.; Szucs, M.; Balogh, P.; Szilvassy, Z.; Juhasz, B. Examination of Preferences for COVID-19 Vaccines in Hungary Based on Their Properties—Examining the Impact of Pandemic Awareness with a Hybrid Choice Approach. Int. J. Environ. Res. Public Health 2023, 20, 1270. https://doi.org/10.3390/ijerph20021270

 

Questions:

Comments on the Quality of English Language

Answers:

Thank you for your suggestion! We modified our original text according to your comments.

 

Reviewer 2 Report

Comments and Suggestions for Authors

I would like to express my gratitude to the authors for their work. I have thoroughly reviewed the document titled “Is it sufficient to select the optimal class number based only on information criteria in fixed and random parameter latent class discrete choice modelling approaches?” and I found it to be an exceptionally well-done piece of work.

The text addresses the modeling of stated choice experiments, highlighting the limitations of the multinomial logit (MNL) model and proposing alternatives such as mixed logit (MXL) models and random parameter latent class (RLC) models to capture preference heterogeneity. It compares the effectiveness of these models based on a study of respondents' preferences for COVID-19 vaccines, using data from a survey conducted in Debrecen, Hungary. It concludes by recommending the use of additional criteria, beyond information criteria, to determine the optimal number of classes in discrete choice models.

Highlighting some of the strengths of the presented work, I believe the text presents a detailed and rigorous comparison of different discrete choice models, providing a clear evaluation of their advantages and limitations in terms of capturing preference heterogeneity. Additionally, practical recommendations are offered to analysts on how to determine the optimal number of classes in these models, based on criteria beyond standard information criteria.

The sample size (n=1011) is adequate to provide statistically significant and robust results in the analysis of discrete choice preferences. However, it is important to consider that the bias introduced by the specific context of vaccination in Hungary in 2021 could limit the generalizability of the results to other populations or situations. For future work, it is recommended to both expand the sample size and consider other geographic regions.

In any case, I consider the work to be of high quality and that it addresses an interesting and current topic.

No formulation errors are observed, and the results are consistent with the models described in the document.

The bibliography is adequate, with no detected self-citations.

The text is well-written, with no detected grammatical or spelling errors.

For all the above reasons, there is no obstacle to this document continuing with the publication process in Econometrics.

Author Response

 

Reviewer 2.

Question:

Comments and Suggestions for Authors

Answer:

Thank you very much for the detailed, positive and helpful review! The study has been modified in line with the reviewers' points.
