Article
Peer-Review Record

Maladaptive Cognitions in Adolescents and Young Adults When They Play: The Dysfunctional Cognitions in Gaming Scale (DCG)

Sustainability 2022, 14(23), 16109; https://doi.org/10.3390/su142316109
by Iván Sánchez-Iglesias 1, Mónica Bernaldo-de-Quirós 2, Francisco J. Estupiñá 2,*, Ignacio Fernández-Arias 2, Marta Labrador 2, Marina Vallejo-Achón 2, Jesús Saiz 3 and Francisco J. Labrador 2
Reviewer 1: Anonymous
Reviewer 3:
Submission received: 5 November 2022 / Revised: 27 November 2022 / Accepted: 29 November 2022 / Published: 2 December 2022
(This article belongs to the Section Psychology of Sustainability and Sustainable Development)

Round 1

Reviewer 1 Report

I thoroughly reviewed your study titled Maladaptive cognitions in adolescents when they play: The Dysfunctional Cognitions in Gaming Scale (DCG). I think your work will significantly contribute to the technology addiction field. I also think that the necessary analyzes have been made. In general, I congratulate you on your work. However, I tried to list some good things to do below one by one.

- In the summary part, how the information was collected, how many stages it was collected, which analysis methods and programs were used should be written briefly in one or two sentences.

- Your literature section is well written, but it would be better if there were some more resources.

- Necessary analyzes were made for the validity and reliability of a scale. Congratulations.

-You wrote in the limitations as “although there are participants between 12 and 22 years old, and invariance of the instrument was found when applying it in the age groups, it would be convenient to study the validity of the instrument in university students and adult” How old are the university students in your country? Also, you mentioned adolescents in the article's title, but there are participants up to the age of 22. Aren't these statements contradictory?

- 4.1. The text under Maladaptive Cognitions, Gambling Frequency, Perceived Health, and PUGV would be well developed with additional resources.

- You said that the scale was prepared for adolescents, but it would be better if it were validated on adults. So what would you say is a scale for adults? Are they a little incomprehensible?

 

- Expressed as scales in some places for Preoccupation, Self-esteem, and Compulsion factors. Let's correct them as factors.

Author Response

Reviewer 1

I thoroughly reviewed your study titled Maladaptive cognitions in adolescents when they play: The Dysfunctional Cognitions in Gaming Scale (DCG). I think your work will significantly contribute to the technology addiction field. I also think that the necessary analyzes have been made. In general, I congratulate you on your work. However, I tried to list some good things to do below one by one.

We would like to thank the editor and the three reviewers for the chance to submit a new and improved version of the manuscript. We have corrected some typos and grammatical errors, the numbering of some tables and some inaccuracies in the list of references. We have added missing information and clarified some issues. Thank you for your keen eye and thoughtful comments.

- In the summary part, how the information was collected, how many stages it was collected, which analysis methods and programs were used should be written briefly in one or two sentences.

We have added some information at the beginning of the Discussion section.

- Your literature section is well written, but it would be better if there were some more resources.

We have tried to keep the introduction concise yet comprehensive. However, if the reviewer wants to point out where we should strengthen the literature review, we will do so promptly.

- Necessary analyzes were made for the validity and reliability of a scale. Congratulations.

Thank you for your kind words.

-You wrote in the limitations as “although there are participants between 12 and 22 years old, and invariance of the instrument was found when applying it in the age groups, it would be convenient to study the validity of the instrument in university students and adult” How old are the university students in your country? Also, you mentioned adolescents in the article's title, but there are participants up to the age of 22. Aren't these statements contradictory?

Students can enroll in college when they are 18 years old. However, the participants (including adults) were recruited only from schools (the equivalent to 7th to 11th grade, Basic and Advanced Professional Training in the US system), so they were not university students. We have made this explicit in the Participants section.

To include the entire study population in the title, we have changed it to “Maladaptive cognitions in adolescents and young adults when they play: The Dysfunctional Cognitions in Gaming Scale (DCG)”.

- 4.1. The text under Maladaptive Cognitions, Gambling Frequency, Perceived Health, and PUGV would be well developed with additional resources.

Section 4.1. deals with the interpretation of the results obtained, which is related to the introductory section. If the reviewer suggests some specific additions, we will be happy to expand the discussion on these results. 

- You said that the scale was prepared for adolescents, but it would be better if it were validated on adults. So what would you say is a scale for adults? Are they a little incomprehensible?

In the limitations section we acknowledge that it would be desirable to study the validity of the DCG scale in older adults (i.e., over 22 years of age).  We have added this to the text to clarify the issue. The same scale could yield different results and have different psychometric properties in that population.

- Expressed as scales in some places for Preoccupation, Self-esteem, and Compulsion factors. Let's correct them as factors.

We have changed scale to factor (but have kept the term subscale).

Reviewer 2 Report

After reviewing the article submitted for evaluation, the following issues have been noted.
 The title, abstract and keywords reflect the contents of the text.  Throughout the work, all the objectives and hypotheses are answered. The author/s, handles an important bibliography, so that the ideas and arguments used during the work are perfectly justified, denoting a wide knowledge of the subject. It is recommended to include more current bibliographical references in the bibliography.

During the work, both the instruments used for data collection and the sample used are justified. The variables taken into account are clear.

As for the originality of the work, it is an important research in the current context, since more and more cases of young people with gambling addiction are found. Therefore, it is a highly topical subject, of which there are still not many studies carried out. The data, as stated in the work, has been collected directly by the author/s for the realization of the study.

The work presents a correct scientific structure, which allows a simple and structured reading of it.
The work stands out for the large sample used, 3,831 participants, and for its internationality, which allows contextualizing the phenomenon under study in different realities.

Throughout the development of the text and in the conclusions, the information provided is contrasted with scientific literature, which allows the text to be perfectly justified. The discussion presented is contrasted with previous studies, so that they have been able to analyze the information obtained with a much more holistic view.
In the tables it is recommended to indicate that they are self-elaborated, in case they are. 

It is recommended to create a detailed theoretical framework on the state of the question.

Author Response

Reviewer 2

After reviewing the article submitted for evaluation, the following issues have been noted.

We would like to thank the editor and the three reviewers for the chance to submit a new and improved version of the manuscript. We have corrected some typos and grammatical errors, the numbering of some tables and some inaccuracies in the list of references. We have added missing information and clarified some issues. Thank you for your keen eye and thoughtful comments.

 The title, abstract and keywords reflect the contents of the text.  Throughout the work, all the objectives and hypotheses are answered. The author/s, handles an important bibliography, so that the ideas and arguments used during the work are perfectly justified, denoting a wide knowledge of the subject. It is recommended to include more current bibliographical references in the bibliography.

We have tried to keep up to date with bibliographic references. However, if the reviewer can identify any areas where the literature review needs to be strengthened, we will do so right away.

During the work, both the instruments used for data collection and the sample used are justified. The variables taken into account are clear.

Thank you.

As for the originality of the work, it is an important research in the current context, since more and more cases of young people with gambling addiction are found. Therefore, it is a highly topical subject, of which there are still not many studies carried out. The data, as stated in the work, has been collected directly by the author/s for the realization of the study.

The work presents a correct scientific structure, which allows a simple and structured reading of it.

The work stands out for the large sample used, 3,831 participants, and for its internationality, which allows contextualizing the phenomenon under study in different realities.

Thank you for your kind words. This manuscript is part of a larger project and many resources were devoted to population sampling and data collection.

Throughout the development of the text and in the conclusions, the information provided is contrasted with scientific literature, which allows the text to be perfectly justified. The discussion presented is contrasted with previous studies, so that they have been able to analyze the information obtained with a much more holistic view.

In the tables it is recommended to indicate that they are self-elaborated, in case they are.

As usual, all tables and figures were composed to present the results in an orderly manner, and arranged to fit the Sustainability template style.

It is recommended to create a detailed theoretical framework on the state of the question.

We have added some information to the introduction to clarify some ideas, although no additional citations. We believe we have adequately put the topic of problematic cognitions in gaming into context, but we may have missed relevant literature. To date, we believe that the relevant literature on the role of cognitions in video game disorders is not very extensive. In essence, this literature stems from the work of Forrest, Moudiab, Spada, King, and Delfabbro, already mentioned in the theoretical introduction. The work presented here attempts to follow a parallel line of research, which could result in an alternative theoretical model. However, any clues that help to improve the theoretical framework (for this or future work) are welcome.

Reviewer 3 Report

The manuscript reports the results of an instrumental study in which the so-called Dysfunctional Cognitions in Gaming Scale (DCG) has been developed and tested in a large Spanish sample of adolescents and young adults. The sampling procedure stands out especially.

The conceptual framework developed is relevant, with potentially significant implications for problems evaluation/intervention derived from the consumption of video games in the target population.

In my opinion, the manuscript has the potential to be published in the journal, but there are still some issues that need to be improved (major revision), which I summarize below:

Introduction

1.       The arguments presented in the introduction are pertinent, as well as the review of previous studies and works. The main problem or limitation I see in this section is that the objective of this study is not sufficiently justified. The authors point out discrepancies in the interpretation of the factorial structure of the IGCS in two previous studies (Chinese and French versions). Based on these discrepancies, they are committed to developing a new instrument (see the “Instruments” section). Why a new instrument and not one tested in China and France? Interpretation discrepancies of factorial structures are more or less frequent, especially in EFA framework. I do not think this situation will be a reason enough to justify the need to develop a new instrument (even if it is based on the previous one). Furthermore, the discrepancies are based on only two studies (in two different countries). On the one hand, two studies may not be sufficient to reach firm conclusions when it comes to validating instrument scores using psychometric models. On the other hand, the discrepancies could be explained by cultural differences. In both cases, it seems premature to scrap the instrument in favour of a new one.

Materials and Methods

2.       The age variable has been coded as a dichotomous variable: 12 to 16 years and 17 to 22 years. Is there an academic reason (for example, level of studies, course) or another type? This issue should be clarified, especially for the assessment of metric invariance (MI) results.

3.       As I pointed out before, it is not clear why the authors chose to develop a new instrument based on the one proposed by King and Delfabbro. Now we must add why the authors chose the study by Forrest et al. and the creation of 16 items in particular (note: “IGDC” must be a typo, line 128). Why not directly use the previous instrument? Why 16 items? Furthermore, the study by Forrest et al. clamoured for a 4-factor structure, and the authors propose 3-factor structure (but do not discuss this divergence). On the other hand, the study by Forrest et al. implies a definition of the target population (implicit) different from that proposed in the manuscript: 16-65 years versus 12-22 years. All these issues should be clarified.

4.       A 5-point response scale has been included: from "never" to "always". These labels are usually discouraged in any manual or list on how to craft items. The label "never" may make sense for evaluating the absence of a variable ("absolute zero"), but the label "always" does not seem to be justified. For example, in IGDS9-SF “very often” is used. Could the authors clarify why they have chosen the label “always”?

5.       Table 1. Although skewness and kurtosis are reported later, the descriptive information should also include this information per item.

6.       General comment on table numbering: on several occasions the table number does not coincide with the one cited in the text.

7.       IGDS9-SF: the first sentence is confusing: “This is a Spanish validated translation of the original scale [16].” [16] refers to the Portuguese version of the original version, so it cannot be the “Spanish validated translation”.

8.       GHQ-12: “This is a Spanish validation of the original GHQ-12 [18]”. Either [18] is not the correct reference, or the authors do not explain the translation, adaptation and validation procedure of the Spanish version.

9.       GHQ-12: the internal consistency of the total score is low (.569). This can later affect the results shown when correlating scores with DCG. If relevant, one could leave only the internal consistency value of the two subscales (and consequently only show these correlations in results).
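As a point of reference for the internal consistency value discussed in this comment, the following is a minimal sketch of how Cronbach's alpha is usually computed from a respondents-by-items score matrix. The function name and the toy data are hypothetical and are not taken from the manuscript; the authors' own analysis may have used a different coefficient or software.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                               # number of items
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of the sum score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)


# Hypothetical toy data: 5 respondents, 3 perfectly consistent items.
toy_scores = np.array([
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
    [5, 6, 7],
])
alpha_toy = cronbach_alpha(toy_scores)
```

With real questionnaire data, low values such as the .569 mentioned above would arise when the item variances account for a large share of the total score variance.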

10.   Line 189: typo? – “factor analysis (EFA). And semTools…”

11.   Could the authors comment on whether the two subsamples used for EFA and CFA are similar or comparable in variables such as sex, age and any other relevant variable?

Results

12.   Parallel Analysis (PA). The decision to consider three factors is somewhat borderline (the third factor could be left out). In fact, Figure 1 also reflects that the existence of a single factor could be considered (a large difference between the eigenvalues of the first and second factors). Therefore, the option of three factors is not as clear as it may seem, so it would be convenient to add substantive, practical criteria, etc., to underpin the decision taken by the authors.

13.   Parallel Analysis (PA): In relation to the above, PA supports different analysis procedures and strategies. When a factor is at the limit (as is the case), using one strategy or another can more clearly decant the factor, or simply favor one decision more than another. PA admits the use of both Pearson and polychoric correlations as input matrix, it admits different estimation methods to extract the fictitious eigenvalues (simple random), and it also admits the mean criterion and the 95th percentile criterion. The authors have used principal component analysis as the extraction method (assuming from the Pearson correlation matrix) and the mean criterion (as indicated in Figure 1). Would the same result be obtained in case of using other combinations?
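To make the two retention criteria named in this comment concrete, here is a minimal sketch of Horn's parallel analysis using principal component eigenvalues of the Pearson correlation matrix. It is an illustration under stated assumptions, not the authors' procedure: the function name, the number of simulations, and the synthetic two-factor data are all hypothetical, and polychoric input is not covered.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Compare observed eigenvalues with eigenvalues of random normal data."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    rng = np.random.default_rng(seed)

    # Observed eigenvalues (descending) of the Pearson correlation matrix.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues of uncorrelated random data of the same shape.
    simulated = np.empty((n_sims, p))
    for s in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[s] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]

    thresholds = {
        "mean": simulated.mean(axis=0),
        "percentile": np.percentile(simulated, percentile, axis=0),
    }

    # Retain leading factors until an observed eigenvalue fails the threshold.
    retained = {}
    for name, threshold in thresholds.items():
        below = observed <= threshold
        retained[name] = int(np.argmax(below)) if below.any() else p
    return observed, thresholds, retained


# Hypothetical synthetic data: 500 respondents, 8 items, 2 underlying factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.8
loadings[4:, 1] = 0.8
items = factors @ loadings.T + 0.6 * rng.standard_normal((500, 8))

observed, thresholds, retained = parallel_analysis(items)
```

The percentile threshold is never lower than the mean threshold in practice, so a borderline factor can be retained under the mean criterion and dropped under the 95th percentile criterion, which is the sensitivity the comment asks about.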

14.   EFA (1): the communalities could be removed from Table 2. They are not very informative (in addition, the inclusion of the skewness and kurtosis values in Table 1 would be compensated). It is usual to interpret the EFA solution from the output factor loadings both in the factor to which they belong (from the point of view of interpretation), but also in other factors. Table 2 shows high factor loadings in factors other than those interpreted by the authors (cross-loadings), but they have not been interpreted. Sometimes the decision is made (factor loadings marked in bold) by a few tenths (for example, item 12 loads .562 in Preoccupation and .589 in Self-esteem; only need to consider item 12 in Self-esteem?).

15.   EFA (2): The correlations between factors are very high, consistent with the information shown in Figure 1 (first eigenvalue is much higher than the second eigenvalue). On this issue, see comments below.

16.   Estimation methods: gives the impression that in EFA the variables have been considered as continuous (also in PA), while in CFA an estimator for categorical data (DWLS) is used. The variables are the same, so the criteria should be unified (assuming that the estimators do not have to be the same in EFA and CFA) and always use the same input matrix (Pearson or polychoric). On the other hand, the input matrix used for CFA-MG is not reflected in the text. In my opinion, the most appropriate in all cases/analysis would be to use polychoric.

17.   CFA (model): the high correlations in EFA (higher still in CFA, as a consequence of setting the cross-loadings to zero) raise the possibility, even the necessity, of considering the scale as potentially unidimensional (perhaps not strict unidimensionality but from the point of view of essential unidimensionality; see Reise, 2012; Reise et al., 2013). This situation raises two ways of analysis or evaluation of CFA models. On the one hand, strict unidimensionality can be tested from a unifactorial model. On the other hand, a bifactor CFA model can be applied, with the three factors proposed as facets, and indices (such as ECV and hierarchical omega) can be obtained to assess the strength or factorial determination of the common variance to all items (general factor of the bifactor model).

Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multiv. Behav. Res. 47, 667–696. doi:10.1080/00273171.2012.715555

Reise, S. P., Scheines, R., Widaman, K. F., Haviland, M. G. (2013). Multidimensionality and structural coefficient bias in structural equation modeling: a bifactor perspective. Educ. Psychol. Meas. 73, 5–26. doi:10.1177/0013164412449831
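The two bifactor indices mentioned in comment 17 can be sketched as follows. This is a minimal illustration that assumes standardized loadings from an orthogonal bifactor solution, with each item loading on the general factor and on at most one group factor; the function name and the loading values are hypothetical and are not estimates from the manuscript.

```python
import numpy as np


def bifactor_indices(general, specific):
    """Return (ECV, omega hierarchical) from standardized bifactor loadings.

    general:  shape (p,), loadings on the general factor.
    specific: shape (p, k), loadings on k group factors (0 where absent).
    """
    general = np.asarray(general, dtype=float)
    specific = np.asarray(specific, dtype=float)

    # Explained common variance: share of common variance due to the general factor.
    general_common = (general ** 2).sum()
    ecv = general_common / (general_common + (specific ** 2).sum())

    # Omega hierarchical: share of total score variance due to the general factor.
    uniqueness = 1.0 - general ** 2 - (specific ** 2).sum(axis=1)
    general_var = general.sum() ** 2
    group_var = (specific.sum(axis=0) ** 2).sum()
    omega_h = general_var / (general_var + group_var + uniqueness.sum())
    return ecv, omega_h


# Hypothetical loadings: 6 items, one general factor, two group factors.
general_loadings = np.full(6, 0.7)
specific_loadings = np.zeros((6, 2))
specific_loadings[:3, 0] = 0.3
specific_loadings[3:, 1] = 0.3

ecv, omega_h = bifactor_indices(general_loadings, specific_loadings)
```

High values of both indices would favor reporting a single total score, which is the question raised about total versus subscale scores.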

18.   CFA (correlation with other variables): for me it does not make sense to recommend use of both the total and the subscales scores. Either we have evidence in favor of one or the other. Therefore, it does not make sense to me to correlate all the scores derived from the factorial solution. Since the correlations between factors are very high, it seems likely that the most appropriate recommendation would be to use a single total score. The application of the bifactor CFA can shed light on this issue. Note that the correlations of the scores of the subscales with other variables (Table 5) are very similar (hypothesis of high variance common to all items). Although keeping the three factors may make sense from a substantive point of view, the use of subscale scores is not discriminative in relation to other variables, they do not differ from the predictive/discriminative power of the total score. This is already reflected in the cross-loadings shown in Table 2. In summary, the authors can bet on the model of three correlated factors as a theoretical proposal and complement the analysis with unifactorial and bifactorial CFA in order to assess which scores are the ones that should be used (total or subscales).

19.   CFA-MG: I have several comments on this point. First of all, to evaluate metric invariance (MI) using CFA-MG, the starting point is to make a separate CFA for each group (see Brown, 2015 – reference [30] in the manuscript). If at this level the CFA model does not hold up in any of the groups, then there is no point in continuing. Secondly, some values of the fit indices are already inadequate at the configural level, according to the RVs themselves set by the authors (page 6). Third, that the chi-square difference test obtains a p-value < 0.05 is indicative of lack of invariance, not the problem of sensitivity to large chi-square sample sizes. Fourth, review the term “robust” to refer to the differences in chi-square between models nested in CFA-MG (note Table 3). In summary, the data provided does not seem to support the MI of the instrument.

20.   In general, the issue of invariance is complex and restricted. Applying CFA-MG in assessment contexts with adolescents can be too demanding. In adolescents, it can be assumed that there may be jumps or changes in maturation (or learning) between an age value and the following year, or important differences in the way of understanding the construct (i.e., the content of the items) between boys and girls. In addition, both variables (sex, age) can interact in maturational/learning terms. Faced with this situation, two possibilities occur to me: a) apply CFA-MG to other models (unifactorial, bifactor) also included in previous analyses, or (if a) it is unsuccessful) b) report lack of invariance (explicitly assume this difficulty and address it in the text), differentiating total scores and percentiles by sex and also by age, and suggesting that this issue be further explored in the future (through MIMIC models, for example, introducing sex and age as covariates in the global CFA model: see Brown, 2015).

Author Response

Reviewer 3

The manuscript reports the results of an instrumental study in which the so-called Dysfunctional Cognitions in Gaming Scale (DCG) has been developed and tested in a large Spanish sample of adolescents and young adults. The sampling procedure stands out especially.

The conceptual framework developed is relevant, with potentially significant implications for problems evaluation/intervention derived from the consumption of video games in the target population.

In my opinion, the manuscript has the potential to be published in the journal, but there are still some issues that need to be improved (major revision), which I summarize below:

We would like to thank the editor and the three reviewers for the chance to submit a new and improved version of the manuscript. We have corrected some typos and grammatical errors, the numbering of some tables and some inaccuracies in the list of references. We have added missing information and clarified some issues. Thank you for your keen eye and thoughtful comments.

Introduction

  1. The arguments presented in the introduction are pertinent, as well as the review of previous studies and works. The main problem or limitation I see in this section is that the objective of this study is not sufficiently justified. The authors point out discrepancies in the interpretation of the factorial structure of the IGCS in two previous studies (Chinese and French versions). Based on these discrepancies, they are committed to developing a new instrument (see the “Instruments” section). Why a new instrument and not one tested in China and France? Interpretation discrepancies of factorial structures are more or less frequent, especially in EFA framework. I do not think this situation will be a reason enough to justify the need to develop a new instrument (even if it is based on the previous one). Furthermore, the discrepancies are based on only two studies (in two different countries). On the one hand, two studies may not be sufficient to reach firm conclusions when it comes to validating instrument scores using psychometric models. On the other hand, the discrepancies could be explained by cultural differences. In both cases, it seems premature to scrap the instrument in favour of a new one.

We address these issues in our response to comment #3.

Materials and Methods

  2. The age variable has been coded as a dichotomous variable: 12 to 16 years and 17 to 22 years. Is there an academic reason (for example, level of studies, course) or another type? This issue should be clarified, especially for the assessment of metric invariance (MI) results.

The dichotomy responds to an academic criterion. The participants were recruited from the Spanish compulsory education system (12 to 16 years old, the equivalent to 7th to 11th grade in the US system) and higher education vocational courses (17 years old and older, equivalent to Basic and Advanced Professional Training in the US system).

  3. As I pointed out before, it is not clear why the authors chose to develop a new instrument based on the one proposed by King and Delfabbro. Now we must add why the authors chose the study by Forrest et al. and the creation of 16 items in particular (note: “IGDC” must be a typo, line 128). Why not directly use the previous instrument? Why 16 items? Furthermore, the study by Forrest et al. clamoured for a 4-factor structure, and the authors propose 3-factor structure (but do not discuss this divergence). On the other hand, the study by Forrest et al. implies a definition of the target population (implicit) different from that proposed in the manuscript: 16-65 years versus 12-22 years. All these issues should be clarified.

The reviewer is right; we needed to justify the objective of our study further. Indeed, the lack of a clear factor structure would not be, by itself, enough to justify the development of a new instrument. As the reviewer points out, discrepancies in factor structure are common in psychometric measures across cultures and languages. However, King and Delfabbro found that adolescents with IGD had different problematic cognitions about gaming from those without IGD, with large observed effect sizes [8]. This strong association between gaming cognitions and IGD symptoms could be explained by the fact that the scale items were developed precisely from the common cognitions found in primary studies on IGD [7] (i.e., subjects with gaming problems). We think that maladaptive cognitions also appear with some frequency in subjects without IGD. Given that these subjects constitute the vast majority of the population, it would be useful to have a new instrument to detect problematic cognitions in this general population, and to create sorting scales to discriminate adequately among these subjects. The DCG was developed from a situational framework common to all gamers (with or without IGD). We have added these ideas to the manuscript, at the end of the introduction.

Our population of interest is the age range of 12 to 22 years. This has been developed in the introduction. While it is true that Forrest et al. used a broader age range, King and Delfabbro developed their Internet Gaming Cognition Scale using school students 12 years and older (M = 14.1 years, SD = 1.5).

“IDCG” was a typographical error.

The DCG was conceived as a brief screening measure, to be administered as a stand-alone instrument or along with other instruments such as the IGDS9-SF [15]. As such, it only has 16 items. This has been made explicit in the manuscript (both in the introduction and in section 2.2.1 DCG scale).

  4. A 5-point response scale has been included: from "never" to "always". These labels are usually discouraged in any manual or list on how to craft items. The label "never" may make sense for evaluating the absence of a variable ("absolute zero"), but the label "always" does not seem to be justified. For example, in IGDS9-SF “very often” is used. Could the authors clarify why they have chosen the label “always”?

This study is part of a larger project; the two instruments were developed by two independent teams of researchers who apparently used different criteria to anchor the Likert scales. The label “always” seems to be the semantic counterpart of “never”. Although the reviewer is right that “always” does not make sense in this context, its use is widespread. Its presence can positively skew the scores of the items. This issue has been addressed in the limitations section, taking into account the newly added skewness values of Table 1.

  5. Table 1. Although skewness and kurtosis are reported later, the descriptive information should also include this information per item.

We have added skewness and kurtosis values, for each item, in Table 1.

  6. General comment on table numbering: on several occasions the table number does not coincide with the one cited in the text.

You are absolutely right. We have corrected the numbering of the tables, and rearranged them to match the text layout.

  7. IGDS9-SF: the first sentence is confusing: “This is a Spanish validated translation of the original scale [16].” [16] refers to the Portuguese version of the original version, so it cannot be the “Spanish validated translation”.

There was an error in preparing the citations and references. Citation [15] refers to the Spanish validated translation, and [16] is the original English scale by Pontes & Griffiths (2015). We have corrected them.

  8. GHQ-12: “This is a Spanish validation of the original GHQ-12 [18]”. Either [18] is not the correct reference, or the authors do not explain the translation, adaptation and validation procedure of the Spanish version.

The citation for the Spanish validation is [17], but it is difficult to see because it was placed next to the section heading (“2.2.3. General Health Questionnaire, GHQ-12 [17]”). We moved the citation to the beginning of the paragraph (“This is a Spanish validation [17] of the original GHQ-12 [18]”).

  9. GHQ-12: the internal consistency of the total score is low (.569). This can later affect the results shown when correlating scores with DCG. If relevant, one could leave only the internal consistency value of the two subscales (and consequently only show these correlations in results).

The total GHQ score is widely used in the clinical context and other applied settings. We prefer to keep it in the results even though its poor internal consistency may affect other outcomes.

  10. Line 189: typo? – “factor analysis (EFA). And semTools…”

You are correct. We have changed it to “…factor analysis (EFA); and semTools…”.

  11. Could the authors comment on whether the two subsamples used for EFA and CFA are similar or comparable in variables such as sex, age and any other relevant variable?

Because the sample is very large, we have relied on random assignment to ensure homogeneity of the subsamples.
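The reviewer's question about comparability can also be checked directly. The following is a minimal Python sketch of one such check, not something the authors report doing: it splits a sample at random and computes standardized mean differences between the halves. The records, the field names `age` and `female`, and the function names are hypothetical and generated only for illustration.

```python
import random
import statistics


def random_split(records, seed=0):
    """Shuffle a copy of the records and cut it into two halves."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def standardized_difference(a, b):
    """Mean difference between two groups divided by their pooled SD."""
    pooled_sd = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


# Hypothetical participants (synthetic values, not the study data).
gen = random.Random(42)
people = [{"age": gen.randint(12, 22), "female": gen.randint(0, 1)}
          for _ in range(2000)]

efa_half, cfa_half = random_split(people, seed=1)
smd_age = standardized_difference([p["age"] for p in efa_half],
                                  [p["age"] for p in cfa_half])
smd_sex = standardized_difference([p["female"] for p in efa_half],
                                  [p["female"] for p in cfa_half])
print(f"SMD age: {smd_age:.3f}, SMD sex: {smd_sex:.3f}")
```

Standardized differences close to zero would suggest that the random assignment produced comparable halves on the variables examined.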

Results

  1. Parallel Analysis (PA). The decision to consider three factors is somewhat borderline (the third factor could be left out). In fact, Figure 1 also reflects that the existence of a single factor could be considered (a large difference between the eigenvalues of the first and second factors). Therefore, the option of three factors is not as clear as it may seem, so it would be convenient to add substantive, practical criteria, etc., to underpin the decision taken by the authors.
  2. Parallel Analysis (PA): In relation to the above, PA supports different analysis procedures and strategies. When a factor is at the limit (as is the case), using one strategy or another can more clearly decant the factor, or simply favor one decision more than another. PA admits the use of both Pearson and polychoric correlations as input matrix, it admits different estimation methods to extract the fictitious eigenvalues (simple random), and it also admits the mean criterion and the 95th percentile criterion. The authors have used principal component analysis as the extraction method (assuming from the Pearson correlation matrix) and the mean criterion (as indicated in Figure 1). Would the same result be obtained in case of using other combinations?

We tried several estimation methods, obtaining the same results. We have stated this in the Results section, adding “As the presence of a third factor was somewhat marginal, several factoring methods were used (maximum likelihood, minimal residual, unweighted least squares, and minimum rank factor analysis), combining them with Pearson's correlation as input matrices, and using the mean and 95th percentile criteria. The results were all the same. In addition, when deciding how many factors to extract, we have taken into account theoretical criteria. However, the large difference between the eigenvalues of the first and second factors also suggests that the existence of a single factor should also be considered and discussed”.
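To make the two retention criteria in this exchange concrete, here is a minimal Python sketch of parallel analysis. It assumes the simplest variant only: principal-component eigenvalues of a Pearson correlation matrix, compared with eigenvalues from simulated normal data. It is not the authors' code, it does not cover polychoric input or the other factoring methods they list, and the function name and demonstration data are hypothetical.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, seed=0):
    """Compare observed eigenvalues with those of random normal data.

    Returns the observed eigenvalues, the mean and 95th-percentile
    thresholds, and the number of factors retained under each criterion.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    sims = np.empty((n_sims, p))
    for s in range(n_sims):
        random_data = rng.standard_normal((n, p))
        corr = np.corrcoef(random_data, rowvar=False)
        sims[s] = np.sort(np.linalg.eigvalsh(corr))[::-1]

    thresholds = {"mean": sims.mean(axis=0),
                  "p95": np.percentile(sims, 95, axis=0)}

    retained = {}
    for name, threshold in thresholds.items():
        # Count leading eigenvalues that exceed the random threshold.
        below = observed <= threshold
        retained[name] = int(np.argmax(below)) if below.any() else p
    return observed, thresholds, retained


# Synthetic demonstration: eight items driven by one common factor.
rng = np.random.default_rng(1)
n_obs, n_items = 500, 8
common = rng.standard_normal((n_obs, 1))
noise = rng.standard_normal((n_obs, n_items))
demo = 0.7 * common + np.sqrt(1 - 0.7 ** 2) * noise

observed, thresholds, retained = parallel_analysis(demo, n_sims=100)
print(retained)
```

Because the 95th-percentile threshold is stricter than the mean threshold, a borderline factor such as the third one discussed above is the kind that may be retained under one criterion and dropped under the other.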

  1. EFA (1): the communalities could be removed from Table 2. They are not very informative (in addition, the inclusion of the skewness and kurtosis values in Table 1 would be compensated). It is usual to interpret the EFA solution from the output factor loadings both in the factor to which they belong (from the point of view of interpretation), but also in other factors. Table 2 shows high factor loadings in factors other than those interpreted by the authors (cross-loadings), but they have not been interpreted. Sometimes the decision is made (factor loadings marked in bold) by a few tenths (for example, item 12 loads .562 in Preoccupation and .589 in Self-esteem; should item 12 be considered only in Self-esteem?).

The communalities have been removed from Table 2. Also, we have added skewness and kurtosis values, for each item, in Table 1.

We have considered the cross-loadings, as well as other results (the high eigenvalue for the first factor, and a high hierarchical omega and explained common variance, ECV, in a bifactor model, as suggested below).

  1. EFA (2): The correlations between factors are very high, consistent with the information shown in Figure 1 (first eigenvalue is much higher than the second eigenvalue). On this issue, see comments below.
  2. Estimation methods: gives the impression that in EFA the variables have been considered as continuous (also in PA), while in CFA an estimator for categorical data (DWLS) is used. The variables are the same, so the criteria should be unified (assuming that the estimators do not have to be the same in EFA and CFA) and always use the same input matrix (Pearson or polychoric). On the other hand, the input matrix used for CFA-MG is not reflected in the text. In my opinion, the most appropriate in all cases/analysis would be to use polychoric.

As the variables have five ordered categories, we have considered them as continuous for the purposes of all analyses. We used a Pearson’s correlation matrix as input in all the analyses (including CFA, where we used DWLS estimation method to address the multivariate non-normal distribution). We have specified this in the appropriate sections.

  1. CFA (model): the high correlations in EFA (higher still in CFA, as a consequence of setting the cross-loadings to zero) raise the possibility, even the necessity, of considering the scale as potentially unidimensional (perhaps not strict unidimensionality but from the point of view of essential unidimensionality; see Reise, 2012; Reise et al., 2013). This situation raises two ways of analysis or evaluation of CFA models. On the one hand, strict unidimensionality can be tested from a unifactorial model. On the other hand, a bifactor CFA model can be applied, with the three factors proposed as facets, and indices (such as ECV and hierarchical omega) can be obtained to assess the strength or factorial determination of the common variance to all items (general factor of the bifactor model).

Reise S. P. (2012). The rediscovery of bifactor measurement models. Multiv. Behav. Res. 47 667–696. 10.1080/00273171.2012.715555

Reise S. P., Scheines R., Widaman K. F., Haviland M. G. (2013). Multidimensionality and structural coefficient bias in structural equation modeling: a bifactor perspective. Educ. Psychol. Meas. 73 5–26. 10.1177/0013164412449831

We carried out the bifactor model as suggested; we discussed the ECV and hierarchical omega. Thank you for the references, these papers were very useful.
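For readers unfamiliar with the two indices, the sketch below shows how ECV and hierarchical omega are commonly computed from the standardized loadings of a bifactor model in which each item loads on the general factor and on one group factor. The loadings are invented for illustration and are not taken from the manuscript; the function name is hypothetical.

```python
def ecv_and_omega_h(general, specific):
    """Compute ECV and hierarchical omega from standardized loadings.

    general:  list of general-factor loadings, one per item.
    specific: dict mapping a group-factor name to {item index: loading}.
    """
    gen_sq = sum(g ** 2 for g in general)
    spec_sq = sum(l ** 2 for f in specific.values() for l in f.values())
    # ECV: share of common variance attributable to the general factor.
    ecv = gen_sq / (gen_sq + spec_sq)

    # Unique variance of each item: 1 minus its squared loadings.
    item_spec_sq = [0.0] * len(general)
    for factor in specific.values():
        for i, loading in factor.items():
            item_spec_sq[i] += loading ** 2
    unique = sum(1 - g ** 2 - s for g, s in zip(general, item_spec_sq))

    # Omega hierarchical: general-factor variance over total score variance.
    gen_var = sum(general) ** 2
    spec_var = sum(sum(f.values()) ** 2 for f in specific.values())
    omega_h = gen_var / (gen_var + spec_var + unique)
    return ecv, omega_h


# Invented loadings: six items, two group factors.
general = [0.7] * 6
specific = {"A": {0: 0.3, 1: 0.3, 2: 0.3},
            "B": {3: 0.3, 4: 0.3, 5: 0.3}}
ecv, omega_h = ecv_and_omega_h(general, specific)
print(f"ECV = {ecv:.3f}, omega_h = {omega_h:.3f}")
```

High values of both indices indicate that most of the reliable variance in the total score reflects the general factor, which is the sense in which a scale can be treated as essentially unidimensional.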

  1. CFA (correlation with other variables): for me it does not make sense to recommend use of both the total and the subscales scores. Either we have evidence in favor of one or the other. Therefore, it does not make sense to me to correlate all the scores derived from the factorial solution. Since the correlations between factors are very high, it seems likely that the most appropriate recommendation would be to use a single total score. The application of the bifactor CFA can shed light on this issue. Note that the correlations of the scores of the subscales with other variables (Table 5) are very similar (hypothesis of high variance common to all items). Although keeping the three factors may make sense from a substantive point of view, the use of subscale scores is not discriminative in relation to other variables, they do not differ from the predictive/discriminative power of the total score. This is already reflected in the cross-loadings shown in Table 2. In summary, the authors can bet on the model of three correlated factors as a theoretical proposal and complement the analysis with unifactorial and bifactorial CFA in order to assess which scores are the ones that should be used (total or subscales).

We want to thank the reviewer for this useful summary. We used ECV and omega hierarchical coefficients as the strategy to decide that the measure was “unidimensional enough”, after finding several clues of unidimensionality in other outcomes. Nevertheless, we have kept the correlations (and other results) of the subscales with other measures for research purposes, but recommended the use of the total scores for assessment and other applied purposes. Moreover, we removed the scoring scales for the subscales, leaving only the scale for the global factor (but, this time, separated by gender and age).

  1. CFA-MG: I have several comments on this point. First of all, to evaluate metric invariance (MI) using CFA-MG, the starting point is to make a separate CFA for each group (see Brown, 2015 – reference [30] in the manuscript). If at this level the CFA model does not hold up in any of the groups, then there is no point in continuing. Secondly, some values of the fit indices are already inadequate at the configural level, according to the RVs themselves set by the authors (page 6). Third, that the chi-square difference test obtains a p-value < 0.05 is indicative of lack of invariance, not the problem of sensitivity to large chi-square sample sizes. Fourth, review the term “robust” to refer to the differences in chi-square between models nested in CFA-MG (note Table 3). In summary, the data provided does not seem to support the MI of the instrument.

We stand corrected. We have changed the discourse in the corresponding sections.

  1. In general, the issue of invariance is complex and restricted. Applying CFA-MG in assessment contexts with adolescents can be too demanding. In adolescents, it can be assumed that there may be jumps or changes in maturation (or learning) between an age value and the following year, or important differences in the way of understanding the construct (i.e., the content of the items) between boys and girls. In addition, both variables (sex, age) can interact in maturational/learning terms. Faced with this situation, two possibilities occur to me: a) apply CFA-MG to other models (unifactorial, bifactor) also included in previous analyses, or (if a) it is unsuccessful) b) report lack of invariance (explicitly assume this difficulty and address it in the text), differentiating total scores and percentiles by sex and also by age, and suggesting that this issue be further explored in the future (through MIMIC models, for example, introducing sex and age as covariates in the global CFA model: see Brown, 2015).

Thank you for all these options and your comments. We have decided to acknowledge the lack of invariance, provide percentile rank scales segmented by gender and age, and propose future lines of work with MIMIC models.

Round 2

Reviewer 3 Report

I congratulate the authors for the review. I believe that the authors have satisfactorily addressed all my suggestions and comments.
