A SAS Macro for Automated Stopping of Markov Chain Monte Carlo Estimation in Bayesian Modeling with PROC MCMC
Abstract
1. Introduction
2. Automatic Monitoring of Stopping Criteria in Markov Chain Monte Carlo Estimation
2.1. Convergence and Stopping Criteria in Markov Chain Monte Carlo
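For reference, the two quantities that the stopping criteria in this section (and the macro parameters PSRconv and ESSconv used in the listings below) refer to are the potential scale reduction factor and the effective sample size. In the standard Gelman–Rubin formulation (the exact computations in PROC MCMC may differ in detail), for $m$ chains of length $n$, with $W$ the average within-chain variance and $B = n$ times the variance of the chain means,

$$\widehat{\mathrm{PSR}} = \sqrt{\frac{\tfrac{n-1}{n}\,W + \tfrac{1}{n}\,B}{W}}, \qquad \mathrm{ESS} = \frac{mn}{1 + 2\sum_{k=1}^{\infty} \rho_k},$$

where $\rho_k$ is the lag-$k$ autocorrelation of a monitored parameter's chain. As the macro calls in the listings suggest, sampling continues until all monitored parameters satisfy both thresholds (PSR at most PSRconv, ESS at least ESSconv) or the maximum number of iterations (maxnmc) is reached.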
2.2. The %automcmc Macro
2.2.1. Macro Parameters for Model Estimation
2.2.2. Output Customization
3. Examples: Applying the %automcmc Macro
3.1. Example 1: 1PL Model with Informative Priors (Single Chain)
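The listing below parameterizes the 1PL model via item intercepts $d_j$ rather than difficulties $b_j$, which tends to improve mixing (cf. Stone & Zhu, 2015, as noted in Appendix A). Written out, the model is

$$P(y_{ij} = 1 \mid \theta_i) = \mathrm{logistic}(a\,\theta_i - d_j) = \mathrm{logistic}\bigl(a(\theta_i - b_j)\bigr), \qquad b_j = d_j / a, \qquad \theta_i \sim N(0, 1),$$

so the difficulties are recovered inside the DO loop (b[j]=d[j]/a) and can be monitored alongside a.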
Listing 1: PROC MCMC Code for Example 1.
PROC MCMC DATA=IRTdat NBI=5000 NMC=25000 OUTPOST=ex1_2_out SEED=1000 NTHREADS=-1 MONITOR=(a b);
  ARRAY b[10]; ARRAY d[10]; ARRAY p[10];
  PARMS a 1;
  PARMS d: 0;
  PRIOR a: ~ LOGNORMAL(0, VAR=1);
  PRIOR d: ~ NORMAL(0, VAR=1);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10;
    p[j]=LOGISTIC(a*theta-d[j]);
    b[j]=d[j]/a;
  END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
RUN;
Listing 2: Execution of PROC MCMC Code (Example 1) with the %automcmc Macro.
%MACRO proc_mcmc;
PROC MCMC DATA=IRTdat NBI=5000 NMC=25000 OUTPOST=ex1_2_out SEED=1000 NTHREADS=-1 MONITOR=(a b);
  ARRAY b[10]; ARRAY d[10]; ARRAY p[10];
  PARMS a 1;
  PARMS d: 0;
  PRIOR a: ~ LOGNORMAL(0, VAR=1);
  PRIOR d: ~ NORMAL(0, VAR=1);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10;
    p[j]=LOGISTIC(a*theta-d[j]);
    b[j]=d[j]/a;
  END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
%MEND;

%automcmc(ESSconv=1000, PSRconv=1.01, maxnmc=1E6, output=no, PROCLOG=on);
3.2. Example 2: 1PL Model with Informative Priors (Multiple Chains)
Listing 3: Estimation of PROC MCMC Code (Example 2) with the %automcmc Macro.
%automcmc(ESSconv=1000, PSRconv=1.01, maxnmc=1E6, output=no, chains=3);
3.3. Example 3: 1PL Model with Uninformative Priors (Multiple Chains)
Listing 4: Generating Starting Values (Example 3) with the %automcmc Macro.
%MACRO proc_mcmc;
PROC MCMC DATA=IRTdat NBI=0 NMC=1 OUTPOST=ex3_out SEED=1000 NTHREADS=-1 MONITOR=(a b);
  ARRAY b[10]; ARRAY d[10]; ARRAY p[10];
  PARMS a 1;
  PARMS d: 0;
  PRIOR a: ~ LOGNORMAL(0, VAR=2);
  PRIOR d: ~ NORMAL(0, VAR=100);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10;
    p[j]=LOGISTIC(a*theta-d[j]);
    b[j]=d[j]/a;
  END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
%MEND;

%automcmc(ESSconv=0, PSRconv=0, maxnmc=1E6, biratio=0, output=no, results=no, chains=3);
Listing 5: Estimation of PROC MCMC Code (Example 3) with the %automcmc Macro and Given Starting Values.
%MACRO proc_mcmc;
PROC MCMC DATA=IRTdat NBI=5000 NMC=25000 OUTPOST=ex3_out SEED=1000 NTHREADS=-1 MONITOR=(a b);
  ARRAY b[10]; ARRAY d[10]; ARRAY p[10];
  PARMS a 1;
  PARMS d: 0;
  PRIOR a: ~ LOGNORMAL(0, VAR=5);
  PRIOR d: ~ NORMAL(0, VAR=10000);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10;
    p[j]=LOGISTIC(a*theta-d[j]);
    b[j]=d[j]/a;
  END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
%MEND;

%automcmc(ESSconv=1000, PSRconv=1.01, maxnmc=1E6, output=no, chains=3, svdat=yes);
3.4. Example 4: 3PL Model with Informative Priors (Single Chain)
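Listing 6 extends the same intercept parameterization to the 3PL model with a pseudo-guessing parameter $c_j$ (prior Beta(5, 20) in the code):

$$P(y_{ij} = 1 \mid \theta_i) = c_j + (1 - c_j)\,\mathrm{logistic}(a_j\,\theta_i - d_j), \qquad b_j = d_j / a_j.$$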
Listing 6: PROC MCMC Code for Example 4.
PROC MCMC DATA=IRT3pldat NBI=5000 NMC=25000 OUTPOST=ex4_out SEED=1000 NTHREADS=-1 MONITOR=(a b c);
  ARRAY a[10]; ARRAY b[10]; ARRAY c[10]; ARRAY d[10]; ARRAY p[10];
  PARMS a1 1 c1 0.2 d1 0;
  PARMS a2 1 c2 0.2 d2 0;
  PARMS a3 1 c3 0.2 d3 0;
  PARMS a4 1 c4 0.2 d4 0;
  PARMS a5 1 c5 0.2 d5 0;
  PARMS a6 1 c6 0.2 d6 0;
  PARMS a7 1 c7 0.2 d7 0;
  PARMS a8 1 c8 0.2 d8 0;
  PARMS a9 1 c9 0.2 d9 0;
  PARMS a10 1 c10 0.2 d10 0;
  PRIOR a: ~ LOGNORMAL(0, VAR=1);
  PRIOR c: ~ BETA(5, 20);
  PRIOR d: ~ NORMAL(0, VAR=1);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10;
    p[j]=c[j]+(1-c[j])*LOGISTIC(a[j]*theta-d[j]);
    b[j]=d[j]/a[j];
  END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
RUN;
3.5. Example 5: 1PL Model with Hierarchical Priors Using Arrays
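In this example the item difficulties receive a hierarchical prior. Written out, the prior structure coded in Listings 7 and 8 is

$$b_j \sim N(\mu_b, \sigma_b^2), \qquad \mu_b \sim \mathrm{Uniform}(-6, 6), \qquad \sigma_b^2 \sim \mathrm{InverseGamma}(0.01, \mathrm{scale} = 0.01), \qquad a \sim \mathrm{LogNormal}(0, 1),$$

with response probabilities $p_{ij} = \mathrm{logistic}(a\,\theta_i - b_j)$ and starting values for all parameters supplied in the BEGINCNST/ENDCNST block.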
Listing 7: PROC MCMC Code for Example 5.
PROC MCMC DATA=IRTdat NBI=5000 NMC=25000 OUTPOST=ex4_out SEED=1000 NTHREADS=-1;
  ARRAY b[10]; ARRAY p[10];
  PARMS a;
  PARMS b:;
  PARMS mub varb;
  PRIOR b: ~ NORMAL(mub, VAR=varb);
  PRIOR mub ~ UNIFORM(-6, 6);
  PRIOR varb ~ IGAMMA(.01, SCALE=.01);
  PRIOR a ~ LOGNORMAL(0, VAR=1);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10;
    p[j]=LOGISTIC(a*theta-b[j]);
  END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
  BEGINCNST;
    mub=0; varb=1; a=1;
    b1=0; b2=0; b3=0; b4=0; b5=0; b6=0; b7=0; b8=0; b9=0; b10=0;
  ENDCNST;
RUN;
Listing 8: PROC MCMC Code for Example 5 Without ARRAY Statements.
PROC MCMC DATA=IRTdat NBI=5000 NMC=25000 OUTPOST=ex4_out SEED=1000 NTHREADS=-1;
  PARMS a;
  PARMS b1-b10;
  PARMS mub varb;
  PRIOR b1 ~ NORMAL(mub, VAR=varb);
  PRIOR b2 ~ NORMAL(mub, VAR=varb);
  PRIOR b3 ~ NORMAL(mub, VAR=varb);
  PRIOR b4 ~ NORMAL(mub, VAR=varb);
  PRIOR b5 ~ NORMAL(mub, VAR=varb);
  PRIOR b6 ~ NORMAL(mub, VAR=varb);
  PRIOR b7 ~ NORMAL(mub, VAR=varb);
  PRIOR b8 ~ NORMAL(mub, VAR=varb);
  PRIOR b9 ~ NORMAL(mub, VAR=varb);
  PRIOR b10 ~ NORMAL(mub, VAR=varb);
  PRIOR mub ~ UNIFORM(-6, 6);
  PRIOR varb ~ IGAMMA(.01, SCALE=.01);
  PRIOR a ~ LOGNORMAL(0, VAR=1);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  p1=LOGISTIC(a*theta-b1);
  p2=LOGISTIC(a*theta-b2);
  p3=LOGISTIC(a*theta-b3);
  p4=LOGISTIC(a*theta-b4);
  p5=LOGISTIC(a*theta-b5);
  p6=LOGISTIC(a*theta-b6);
  p7=LOGISTIC(a*theta-b7);
  p8=LOGISTIC(a*theta-b8);
  p9=LOGISTIC(a*theta-b9);
  p10=LOGISTIC(a*theta-b10);
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
  BEGINCNST;
    mub=0; varb=1; a=1;
    b1=0; b2=0; b3=0; b4=0; b5=0; b6=0; b7=0; b8=0; b9=0; b10=0;
  ENDCNST;
RUN;
4. Automated Stopping in Related Software
5. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Listing A1: SAS Code for IRT Examples.

* Population model: 1PL (a = 1, b_j = -0.9, -0.7, ..., 0.9);
DATA IRTdat;
  ARRAY item(10) item01-item10;
  DO person=1 TO 250;
    theta=NORMAL(12345);
    DO j=1 TO 10;
      item(j)=(LOGISTIC(1*(theta-(-0.9+(j-1)/5)))>RANUNI(12345));
    END;
    OUTPUT;
  END;
  DROP theta j;
RUN;

* Population model: 3PL (a_j = 0.80, 0.85, ..., 1.25; b_j = -0.9, -0.7, ..., 0.9);
* guessing: c_j = .20;
DATA IRT3pldat;
  ARRAY item(10) item01-item10;
  DO person=1 TO 250;
    theta=NORMAL(12345);
    DO j=1 TO 10;
      item(j)=(.20+.80*LOGISTIC((0.75+0.05*j)*(theta-(-0.9+(j-1)/5)))>RANUNI(12345));
    END;
    OUTPUT;
  END;
  DROP theta j;
RUN;

**********************;
* Examples 1 + 2     *;
* informative priors *;
**********************;
* Model parameterized with intercepts d_j (instead of difficulties b_j) to improve mixing (see Stone & Zhu, 2015);
%MACRO proc_mcmc;
PROC MCMC DATA=IRTdat NBI=5000 NMC=25000 OUTPOST=ex1_2_out SEED=1000 NTHREADS=-1 MONITOR=(a b);
  ARRAY b[10]; ARRAY d[10]; ARRAY p[10];
  PARMS a 1;
  PARMS d: 0;
  PRIOR a: ~ LOGNORMAL(0, VAR=1);
  PRIOR d: ~ NORMAL(0, VAR=1);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10; p[j]=LOGISTIC(a*theta-d[j]); b[j]=d[j]/a; END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
%MEND;

* Example 1: informative priors, single chain;
%automcmc(ESSconv=1000, PSRconv=1.01, maxnmc=1E6, output=no, PROCLOG=on);

* Example 2: informative priors, 3 chains;
%automcmc(ESSconv=1000, PSRconv=1.01, maxnmc=1E6, output=no, chains=3);

************************;
* Example 3            *;
* uninformative priors *;
************************;
* Generating starting values based on a model with more informative priors (only 1 iteration necessary);
%MACRO proc_mcmc;
PROC MCMC DATA=IRTdat NBI=0 NMC=1 OUTPOST=ex3_out SEED=1000 NTHREADS=-1 MONITOR=(a b);
  ARRAY b[10]; ARRAY d[10]; ARRAY p[10];
  PARMS a 1;
  PARMS d: 0;
  PRIOR a: ~ LOGNORMAL(0, VAR=2);
  PRIOR d: ~ NORMAL(0, VAR=100);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10; p[j]=LOGISTIC(a*theta-d[j]); b[j]=d[j]/a; END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
%MEND;

* Example 3: generating starting values, 3 chains (no convergence criteria applied);
%automcmc(ESSconv=0, PSRconv=0, maxnmc=1E6, biratio=0, output=no, results=no, chains=3);

* Apply original target model with uninformative priors (based on the generated starting values);
%MACRO proc_mcmc;
PROC MCMC DATA=IRTdat NBI=5000 NMC=25000 OUTPOST=ex3_out SEED=1000 NTHREADS=-1 MONITOR=(a b);
  ARRAY b[10]; ARRAY d[10]; ARRAY p[10];
  PARMS a 1;
  PARMS d: 0;
  PRIOR a: ~ LOGNORMAL(0, VAR=5);
  PRIOR d: ~ NORMAL(0, VAR=10000);
  RANDOM theta ~ NORMAL(0, VAR=1) SUBJECT=person;
  DO j=1 TO 10; p[j]=LOGISTIC(a*theta-d[j]); b[j]=d[j]/a; END;
  MODEL item01~BINARY(p1);
  MODEL item02~BINARY(p2);
  MODEL item03~BINARY(p3);
  MODEL item04~BINARY(p4);
  MODEL item05~BINARY(p5);
  MODEL item06~BINARY(p6);
  MODEL item07~BINARY(p7);
  MODEL item08~BINARY(p8);
  MODEL item09~BINARY(p9);
  MODEL item10~BINARY(p10);
%MEND;

* 3 chains;
%automcmc(ESSconv=1000, PSRconv=1.01, maxnmc=1E6, output=no, chains=3, svdat=yes);
References
- van de Schoot, R.; Winter, S.D.; Ryan, O.; Zondervan-Zwijnenburg, M.; Depaoli, S. A systematic review of Bayesian articles in psychology: The last 25 years. Psychol. Methods 2017, 22, 217–239.
- Muthén, B.; Asparouhov, T. Bayesian structural equation modeling: A more flexible representation of substantive theory. Psychol. Methods 2012, 17, 313–335.
- Depaoli, S.; Clifton, J.P. A Bayesian approach to multilevel structural equation modeling with continuous and dichotomous outcomes. Struct. Equ. Model. A Multidiscip. J. 2015, 22, 327–351.
- Zitzmann, S.; Lüdtke, O.; Robitzsch, A.; Hecht, M. On the performance of Bayesian approaches in small samples: A comment on Smid, McNeish, Miocevic, and van de Schoot (2020). Struct. Equ. Model. A Multidiscip. J. 2021, 28, 40–50.
- Geyer, C.J. Practical Markov chain Monte Carlo. Stat. Sci. 1992, 7, 473–483.
- Zitzmann, S.; Hecht, M. Going beyond convergence in Bayesian estimation: Why precision matters too and how to assess it. Struct. Equ. Model. 2019, 26, 646–661.
- Muthén, L.K.; Muthén, B.O. Mplus User’s Guide, 8th ed.; Muthén & Muthén: Los Angeles, CA, USA, 1998–2017.
- Zitzmann, S.; Weirich, S.; Hecht, M. Using the effective sample size as the stopping criterion in Markov chain Monte Carlo with the Bayes module in Mplus. Psych 2021, 3, 336–347.
- SAS Institute Inc. SAS/STAT® 15.1 User’s Guide; SAS Institute Inc.: Cary, NC, USA, 2018.
- Ames, A.J.; Samonte, K. Using SAS PROC MCMC for item response theory models. Educ. Psychol. Meas. 2015, 75, 585–609.
- Mai, Y.; Zhang, Z. Software packages for Bayesian multilevel modeling. Struct. Equ. Model. A Multidiscip. J. 2018, 25, 650–658.
- Moeyaert, M.; Rindskopf, D.; Onghena, P.; Van den Noortgate, W. Multilevel modeling of single-case data: A comparison of maximum likelihood and Bayesian estimation. Psychol. Methods 2017, 22, 760–778.
- Miočević, M.; Gonzalez, O.; Valente, M.J.; MacKinnon, D.P. A tutorial in Bayesian potential outcomes mediation analysis. Struct. Equ. Model. A Multidiscip. J. 2018, 25, 121–136.
- Wurpts, I.C.; Miočević, M.; MacKinnon, D.P. Sequential Bayesian data synthesis for mediation and regression analysis. Prev. Sci. 2022, 23, 378–389.
- Leventhal, B.C.; Stone, C.A. Bayesian analysis of multidimensional item response theory models: A discussion and illustration of three response style models. Meas. Interdiscip. Res. Perspect. 2018, 16, 114–128.
- Stone, C.A.; Zhu, X. Bayesian Analysis of Item Response Theory Models Using SAS; SAS Institute Inc.: Cary, NC, USA, 2015.
- Austin, P.C.; Lee, D.S.; Leckie, G. Comparing a multivariate response Bayesian random effects logistic regression model with a latent variable item response theory model for provider profiling on multiple binary indicators simultaneously. Stat. Med. 2020, 39, 1390–1406.
- Zhang, Z. Modeling error distributions of growth curve models through Bayesian methods. Behav. Res. 2016, 48, 427–444.
- McNeish, D. Fitting residual error structures for growth models in SAS PROC MCMC. Educ. Psychol. Meas. 2017, 77, 587–612.
- Gelman, A.; Rubin, D.B. Inference from iterative simulation using multiple sequences. Stat. Sci. 1992, 7, 457–472.
- Asparouhov, T.; Muthén, B. Bayesian analysis using Mplus: Technical implementation (Version 4). 2023. Available online: http://www.statmodel.com/download/Bayes2.pdf (accessed on 25 August 2023).
- Link, W.A.; Eaton, M.J. On thinning of chains in MCMC. Methods Ecol. Evol. 2012, 3, 112–115.
- Kass, R.E.; Carlin, B.P.; Gelman, A.; Neal, R.M. Markov Chain Monte Carlo in practice: A roundtable discussion. Am. Stat. 1998, 52, 93–100.
- Vehtari, A.; Gelman, A.; Simpson, D.; Carpenter, B.; Bürkner, P.-C. Rank-normalization, folding, and localization: An improved R̂ for assessing convergence of MCMC (with discussion). Bayesian Anal. 2021, 16, 667–718.
- de Ayala, R.J. The Theory and Practice of Item Response Theory; The Guilford Press: New York, NY, USA, 2009.
- Zitzmann, S. A computationally more efficient and more accurate stepwise approach for correcting for sampling error and measurement error. Multivar. Behav. Res. 2018, 53, 612–632.
- Hecht, M.; Zitzmann, S. A computationally more efficient Bayesian approach for estimating continuous-time models. Struct. Equ. Model. A Multidiscip. J. 2020, 27, 829–840.
- Hecht, M.; Gische, C.; Vogel, D.; Zitzmann, S. Integrating out nuisance parameters for computationally more efficient Bayesian estimation—An illustration and tutorial. Struct. Equ. Model. A Multidiscip. J. 2020, 27, 483–493.
- Plummer, M. JAGS: Just another Gibbs sampler (Version 4.3.1). Available online: https://sourceforge.net/projects/mcmc-jags/ (accessed on 16 May 2023).
- Hecht, M.; Weirich, S.; Zitzmann, S. Comparing the MCMC efficiency of JAGS and Stan for the multi-level intercept-only model in the covariance- and mean-based and classic parametrization. Psych 2021, 3, 751–779.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).