A Cautionary Note Regarding Multilevel Factor Score Estimates from Lavaan
Abstract
1. Introduction
Types of Factor Score Estimate
2. Motivating Example
2.1. Maximum Likelihood Estimate
2.2. Expected a Posteriori Estimate
3. Factor Score Estimates from Lavaan
3.1. Data and Method
- mlm <- '
- level: 1
- Yw =~ a*Y
- level: 2
- Yb =~ a*Y
- '
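To try the model syntax above without access to the original data file, one can simulate a stand-in data set with the same two columns that the code in Appendix D uses (`Y` and `g`). The following is only a minimal sketch and not the author's data-generating procedure: the object name `simData`, the seed, the number of clusters, the cluster size, and both variance values are assumptions chosen for illustration.

```r
# Hypothetical stand-in for exampleData.txt; every numeric setting is an assumption
set.seed( 1 )
J     <- 100    # number of clusters (assumed)
n     <- 10     # observations per cluster (assumed, balanced design)
var.b <- 0.13   # between-cluster variance (assumed)
var.w <- 0.91   # within-cluster variance (assumed)

# Cluster indicator coded 1..J, as the loops in Appendix D expect
g <- rep( 1:J, each = n )

# Random-intercept structure: Y = between-cluster part + within-cluster part
Yb <- rnorm( J, mean = 0, sd = sqrt( var.b ) )
Yw <- rnorm( J * n, mean = 0, sd = sqrt( var.w ) )
simData <- data.frame( g = g, Y = Yb[ g ] + Yw )

# Quick look at the simulated structure
str( simData )
```

Passing `simData` in place of `exampleData` to `sem( mlm, data = ..., cluster = "g" )` should then let the model run, although the resulting estimates will differ from those reported in this article.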
3.2. Results
- MLE SE_MLE EAP SE_EAP
- [1,] 0.01350424 NA 0.01350424 NA
- [2,] 0.32330489 NA 0.32330489 NA
- [3,] -0.17971389 NA -0.17971389 NA
- [4,] 0.03329527 NA 0.03329527 NA
- [5,] -0.13949739 NA -0.13949739 NA
- [6,] -0.63425187 NA -0.63425187 NA
- [7,] -0.21836302 NA -0.21836302 NA
- [8,] 0.51805248 NA 0.51805248 NA
- [9,] 0.05040566 NA 0.05040566 NA
- [10,] -0.35429797 NA -0.35429797 NA
- .
- .
- .
- MLE SE_MLE EAP SE_EAP
- [1,] 0.02267735 0.301777 0.01350424 0.1797065
- [2,] 0.54291830 0.301777 0.32330489 0.1797065
- [3,] -0.30178930 0.301777 -0.17971389 0.1797065
- [4,] 0.05591196 0.301777 0.03329527 0.1797065
- [5,] -0.23425468 0.301777 -0.13949739 0.1797065
- [6,] -1.06508424 0.301777 -0.63425187 0.1797065
- [7,] -0.36669188 0.301777 -0.21836302 0.1797065
- [8,] 0.86995334 0.301777 0.51805248 0.1797065
- [9,] 0.08464504 0.301777 0.05040566 0.1797065
- [10,] -0.59496425 0.301777 -0.35429797 0.1797065
- .
- .
- .
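The two listings above can be compared directly. The sketch below copies the first ten rows of each and reads the first listing as the lavaan output and the second as the custom-built estimates, which is the order in which the code in Appendix D prints them (an assumption about the listings, since they carry no captions). The object names are hypothetical.

```r
# First listing: column "MLE" (identical to its "EAP" column)
lavaan.mle <- c( 0.01350424, 0.32330489, -0.17971389, 0.03329527, -0.13949739,
                 -0.63425187, -0.21836302, 0.51805248, 0.05040566, -0.35429797 )

# Second listing: columns "MLE" and "EAP"
custom.mle <- c( 0.02267735, 0.54291830, -0.30178930, 0.05591196, -0.23425468,
                 -1.06508424, -0.36669188, 0.86995334, 0.08464504, -0.59496425 )
custom.eap <- c( 0.01350424, 0.32330489, -0.17971389, 0.03329527, -0.13949739,
                 -0.63425187, -0.21836302, 0.51805248, 0.05040566, -0.35429797 )

# Do the values labeled "MLE" in the first listing equal the EAPs of the second?
print( isTRUE( all.equal( lavaan.mle, custom.eap ) ) )

# Ratio of EAP to MLE in the second listing (the shrinkage factor)
shrink <- custom.eap / custom.mle
print( round( range( shrink ), 4 ) )

# The same ratio should hold for the reported standard errors
print( round( 0.1797065 / 0.301777, 4 ) )
```

For these ten rows, the scores labeled "MLE" in the first listing coincide with the EAP column of the second, and the EAPs are the MLEs shrunken by a constant factor of about 0.5955.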
4. Discussion
Conclusions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Derivation of Equations (1)–(4)
Appendix B. Derivation of Equations (10) and (11)
Appendix C. Lavaan Output
- lavaan 0.6-12 ended normally after 14 iterations
- Estimator ML
- Optimization method NLMINB
- Number of model parameters 3
- Number of observations 1000
- Number of clusters [g] 100
- Model Test User Model:
- Test statistic 0.000
- Degrees of freedom 0
- Parameter Estimates:
- Standard errors Standard
- Information Observed
- Observed information based on Hessian
- Level 1 [within]:
- Latent Variables:
- Estimate Std.Err z-value P(>|z|)
- Yw =~
- Y (a) 1.000
- Intercepts:
- Estimate Std.Err z-value P(>|z|)
- .Y 0.000
- Yw 0.000
- Variances:
- Estimate Std.Err z-value P(>|z|)
- .Y 0.000
- Yw 0.911 0.043 21.213 0.000
- Level 2 [g]:
- Latent Variables:
- Estimate Std.Err z-value P(>|z|)
- Yb =~
- Y (a) 1.000
- Intercepts:
- Estimate Std.Err z-value P(>|z|)
- .Y -0.000 0.047 -0.000 1.000
- Yb 0.000
- Variances:
- Estimate Std.Err z-value P(>|z|)
- .Y 0.000
- Yb 0.134 0.032 4.173 0.000
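As a rough plausibility check, the quantities that the code in Appendix D derives from the fitted model can be recomputed by hand from the rounded estimates printed above. This is a sketch only: it assumes a balanced design, so that the average cluster size is simply the number of observations divided by the number of clusters, and it uses the three-decimal estimates from the output, so the results agree with the reported standard errors only up to rounding.

```r
# Estimates copied from the lavaan output above (rounded to three decimals)
var.Yw <- 0.911   # within-cluster variance of Yw
var.Yb <- 0.134   # between-cluster variance of Yb

# Average cluster size, assuming a balanced design
n <- 1000 / 100

# Reliability of the cluster mean, as computed in Appendix D
rel <- var.Yb / ( var.Yb + var.Yw / n )

# Standard errors of the MLE and of the EAP, following Appendix D
mle.se <- sqrt( var.Yw / n )
eap.se <- rel * mle.se

print( round( c( rel = rel, SE_MLE = mle.se, SE_EAP = eap.se ), 3 ) )
```

The values come out near 0.595, 0.302, and 0.180, respectively, in line with the standard errors of 0.301777 and 0.1797065 reported in Section 3.2.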
Appendix D. R Code
- # Set working directory
- wd <- file.path( "C:/MyFolder" )
- setwd( wd )
- # Read example data
- exampleData <- read.table( "exampleData.txt", header = TRUE, sep = "\t" )
- J <- length( unique( exampleData$g ) )
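- # Count the observations in each cluster (cluster IDs are assumed to be coded 1..J)
- nn <- rep( NA, J )
- for ( i in 1:J ) {
-   nn[ i ] <- length( which( exampleData$g == i ) )
- }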
- n <- mean( nn )
- # Specify and run the example model in lavaan
- # install.packages( "lavaan" )
- library( lavaan )
- mlm <- '
- level: 1
- Yw =~ a*Y
- level: 2
- Yb =~ a*Y
- '
- fit <- sem( mlm, data = exampleData, cluster = "g" )
- # Obtain factor scores from lavaan
- lavMles <- lavPredict( fit, level = 2, method = "Bartlett",
-                        se = "standard" )
- lavEaps <- lavPredict( fit, level = 2, method = "regression",
-                        se = "standard" )
- lavMles.se <- attributes( lavMles )$se[[1]]
- lavMles <- cbind( lavMles, rep( lavMles.se[1], J ) )
- colnames( lavMles ) <- c( "MLE", "SE_MLE" )
- lavEaps.se <- attributes( lavEaps )$se[[1]]
- lavEaps <- cbind( lavEaps, rep( lavEaps.se[1], J ) )
- colnames( lavEaps ) <- c( "EAP", "SE_EAP" )
- lavFscores <- cbind( lavMles, lavEaps )
- print( lavFscores[ 1:10, ] )
- # Obtain "custom-built" factor scores
- # The MLE of each cluster's score is its observed cluster mean
- mles <- rep( NA, J )
- for ( i in 1:J ) {
-   mles[ i ] <- mean( as.numeric( exampleData$Y[ which( exampleData$g == i ) ] ) )
- }
- params <- parameterEstimates(fit)
- # Extract the variance estimates by column name rather than by position
- var.Yb <- params[ which( params$lhs == "Yb" & params$op == "~~" &
-   params$rhs == "Yb" & params$level == 2 ), "est" ]
- var.Yw <- params[ which( params$lhs == "Yw" & params$op == "~~" &
-   params$rhs == "Yw" & params$level == 1 ), "est" ]
- rel <- var.Yb/(var.Yb + var.Yw/n)
- eaps <- rel*mles
- mle.se <- sqrt( var.Yw/n )
- mles <- cbind( mles, rep( mle.se, J ) )
- colnames( mles ) <- c( "MLE", "SE_MLE" )
- eap.se <- rel*mle.se
- eaps <- cbind( eaps, rep( eap.se, J ) )
- colnames( eaps ) <- c( "EAP", "SE_EAP" )
- fscores <- cbind( mles, eaps )
- print( fscores[ 1:10, ] )
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zitzmann, S. A Cautionary Note Regarding Multilevel Factor Score Estimates from Lavaan. Psych 2023, 5, 38-49. https://doi.org/10.3390/psych5010004