A Comparison of Methods for Synthesizing Results from Previous Research to Obtain Priors for Bayesian Structural Equation Modeling

Finch, Holmes

doi:10.3390/psych6010004

by

Holmes Finch

Department of Educational Psychology, Ball State University, Muncie, IN 47306, USA

Psych 2024, 6(1), 45-88; https://doi.org/10.3390/psych6010004

Submission received: 20 November 2023 / Revised: 12 December 2023 / Accepted: 15 December 2023 / Published: 3 January 2024

(This article belongs to the Section Psychometrics and Educational Measurement)

Abstract

Bayesian estimation of latent variable models provides some unique advantages to researchers working with small samples and complex models when compared with the more commonly used maximum likelihood approach. A key aspect of Bayesian modeling involves the selection of prior distributions for the parameters of interest. Prior research has demonstrated that using default priors, which are typically noninformative, may yield biased and inefficient estimates. Therefore, it is recommended that data analysts obtain useful, informative priors from prior research whenever possible. The goal of the current simulation study was to compare several methods designed to combine results from prior studies that will yield informative priors for regression coefficients in structural equation models. These methods include noninformative priors, Bayesian synthesis, pooled analysis, aggregated priors, standard meta-analysis, power priors, and the meta-analytic predictive methods. Results demonstrated that power priors and meta-analytic predictive priors, used in conjunction with Bayesian estimation, may yield the most accurate estimates of the latent structure coefficients. Implications for practice and suggestions for future research are discussed.

Keywords:

Bayesian estimation; priors; structural equation modeling

1. Introduction

Bayesian parameter estimation in statistical modeling represents a fundamental shift from the frequentist methods that are commonly used in statistical science and data analysis. In contrast to the frequentist paradigm, which relies solely on the observed data to obtain model parameter estimates, Bayesian estimation involves the use of prior distributional information about the parameters in conjunction with the data [1]. In addition, whereas frequentist methods estimate a population value for a given parameter (e.g., a regression coefficient) using a single value, within the Bayesian paradigm, the population parameter is estimated as a distribution of values. The Bayesian approach combines prior information regarding the nature of the parameter distribution with information taken from the sample data in order to estimate a posterior distribution for the parameter. In practice, when a single-value estimate of a model parameter is desired, such as a regression coefficient linking dependent and independent variables, the mean, median, or mode of the posterior distribution is calculated [2].

A key aspect of conducting Bayesian analysis involves specifying the prior distribution for each of the model parameters [3]. A prior is typically specified in terms of both its location (e.g., the mean of the distribution) and its scale (e.g., the variance of the distribution). Priors can be either informative or noninformative, where informative priors are typically drawn from previous research and/or practice and will be fairly specific in terms of both their mean and variance [3]. In contrast, noninformative (sometimes referred to as diffuse) priors are not based on prior research but rather are deliberately selected so as to constrain the posterior distribution for the parameter as little as possible. Noninformative priors are commonly used when little or no useful information is available for setting the prior distribution [2]. When noninformative priors are used, the data drive the estimation of the posterior distribution with relatively less input from the priors, whereas when informative priors are used, the posterior reflects both the influence of the data and the influence of the prior information that the researcher brings to the analysis.

Prior research has explored a number of methods for obtaining informative priors in the context of observed variable and factor analysis models [4]. The purpose of the current study was to extend this work by focusing on the development of priors for Bayesian modeling using the results from earlier studies in the context of estimating structure coefficients relating latent variables in a structural equation model (SEM). Several methods have been demonstrated to be effective for synthesizing prior research to obtain priors in factor analysis and observed variable regression contexts [2,5,6,7,8]. Recent findings suggest that power priors may be among the most effective of these methods [9]. However, other researchers have found that in certain contexts (e.g., with multilevel models), techniques such as Bayesian dynamic borrowing may yield a more accurate estimation of model parameters [9,10,11]. Power priors have also been shown to be particularly effective in replication studies [12]. Bayesian meta-analytic techniques have also been used by researchers as a way to improve priors for subsequent analyses [12].

The choice of approach for synthesizing information from previous research in order to obtain priors for the target study can have an impact on the ultimate results of the analyses, making this an important decision for researchers. In particular, some of the techniques that are described below apply differential weights to results from prior research based on their similarity to the target data. Other methods weight prior research results based upon the variability of the estimates reported there. Still other approaches do not weight prior information at all when synthesizing it for prior derivation. Thus, a researcher using Bayesian estimation should carefully consider how they will derive priors using results from previously published studies and/or existing data. The current study was designed to help address the question of which approach(es) might be most useful in the context of structure coefficients in SEM by extending earlier work in this area, which has primarily focused on observed variable regression or factor analysis. The manuscript is organized as follows: First, a brief review of SEM is provided, followed by a discussion of approaches for synthesizing previous research results in order to obtain priors for a subsequent Bayesian analysis. Next, the research goals are discussed, and the simulation study carried out to address them is described. Finally, the results and their implications are discussed, and ideas for future research are given.

1.1. Structural Equation Modeling

SEMs allow researchers to relate latent variables to one another in a fashion similar to regression models for observed variables. The SEM can be written as:

η = B γ + ζ

(1)

where

$η =$ endogenous latent variable(s)
$γ =$ exogenous latent variable(s)
$B =$ coefficient linking the endogenous and exogenous variables with one another
$ζ =$ random error with a mean of 0 and a variance of ϕ.

In turn, the latent endogenous variable can be expressed as:

y = τ + Λ η + ε

(2)

where

$y =$ vector of observed indicator variables
$τ =$ vector of latent intercepts
$Λ =$ matrix of factor loadings
$ε =$ vector of random errors.
A similar model can be written for the exogenous latent variables.
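To make Equations (1) and (2) concrete, the following Python sketch simulates one exogenous and one endogenous latent variable, each measured by several indicators. It is an illustration only: the function names, the zero intercepts, the unit variances of the exogenous factor and the disturbance, and the residual standard deviation are assumptions made here, not settings taken from the study.

```python
import random

def simulate_sem(n, b, loading, n_indicators=5, resid_sd=0.6, seed=0):
    """Simulate a two-factor SEM: eta = b * gamma + zeta, with indicators for each factor.

    Assumptions (illustrative): intercepts tau = 0, gamma ~ N(0, 1), zeta ~ N(0, 1),
    and a common loading and residual SD for every indicator.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gamma = rng.gauss(0.0, 1.0)          # exogenous latent variable
        zeta = rng.gauss(0.0, 1.0)           # structural disturbance
        eta = b * gamma + zeta               # Equation (1)
        # Equation (2) with tau = 0, applied to each latent variable
        x = [loading * gamma + rng.gauss(0.0, resid_sd) for _ in range(n_indicators)]
        y = [loading * eta + rng.gauss(0.0, resid_sd) for _ in range(n_indicators)]
        rows.append((gamma, eta, x, y))
    return rows

def latent_slope(rows):
    """Least-squares slope of eta on gamma, computed from the simulated latent scores."""
    g = [r[0] for r in rows]
    e = [r[1] for r in rows]
    g_bar = sum(g) / len(g)
    e_bar = sum(e) / len(e)
    sxy = sum((gi - g_bar) * (ei - e_bar) for gi, ei in zip(g, e))
    sxx = sum((gi - g_bar) ** 2 for gi in g)
    return sxy / sxx

rows = simulate_sem(n=5000, b=0.5, loading=0.8)
slope = latent_slope(rows)   # should lie close to the generating value b = 0.5
```

In practice the latent scores are unobserved, so the coefficient would be estimated from the indicators with SEM software; the slope on the simulated latent scores is shown only to confirm that the generating value is recoverable.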

In the context of the current study, the researcher would be interested in identifying an informative prior distribution for $B$ to be used in the context of Bayesian estimation. Such priors would potentially provide a more accurate estimation of the relationship between latent traits. More specifically, the researcher would find studies using the same (or similar) latent traits that reported estimates of $B$. These estimates could then be summarized in some way (to be described below), with the summarized value(s) in turn being used as informative priors in a Bayesian estimation of the SEM. There are a number of techniques that have been suggested for producing such summaries in the context of slopes for observed variable regression [2,13,14]. However, there has been very little research examining how various synthesis methods work in the context of obtaining priors for structure coefficients with latent variable SEMs. Following is a discussion of several methods for such synthesis that have been found to be effective in the observed regression context.

1.2. Bayesian Synthesis

One approach that has been suggested for using results from previous research to develop priors for Bayesian estimation is known as Bayesian synthesis [8]. This method, which is sometimes also referred to as augmented data-dependent priors (AUDP), involves the sequential analysis of independent datasets containing the same measures as the target data. It should be noted that this method assumes the availability of such datasets prior to the conduct of the target study. These data might be obtained in the context of a longitudinal study involving the collection of unique samples for each version of the study, such as the Programme for International Student Assessment (PISA) or the General Social Survey (GSS). The results from the analyses of each previously collected dataset are then combined in order to develop priors for the analysis of the subsequent dataset. As an example, consider the case where a researcher has access to data from the prior five years of data collection for the Youth Risk Behavior Survey (YRBS), which is conducted by the U.S. Centers for Disease Control and Prevention [15]. In addition, assume that the researcher is interested in examining the relationship between the latent variables depressive symptoms and exposure to violence using the most recent version of the YRBS. The Bayesian synthesis approach would involve fitting an SEM linking these two latent variables using the first year of the prior data and obtaining estimates for the structure coefficients of interest as well as their standard errors. Next, the researcher would fit the same SEM to the second year’s dataset using the coefficients and the squared standard errors from the first study as priors for the structure coefficients in the second year’s analysis. This sequential set of analyses would be repeated for each subsequent year. The results of the year prior to the target year would then serve as the priors for fitting the SEM to the target data. Research has shown that if one can assume these previous datasets are exchangeable with one another, the Bayesian synthesis approach to finding priors yields accurate results in the context of a linear growth model [8].
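The sequential logic described above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in the study: it assumes that each earlier dataset can be summarized by a data-only coefficient estimate and standard error, and that both the prior and the likelihood are approximately normal, so that each update is a precision-weighted average. The function name and the example numbers are hypothetical.

```python
def sequential_synthesis(estimates, std_errors):
    """Chain normal priors across earlier datasets.

    estimates, std_errors: per-dataset coefficient estimates and standard errors,
    ordered from the oldest dataset to the most recent one.
    Returns the (mean, variance) that would serve as the prior for the target analysis.
    """
    # The first dataset's estimate and squared standard error seed the chain.
    mean, var = estimates[0], std_errors[0] ** 2
    for est, se in zip(estimates[1:], std_errors[1:]):
        prior_prec = 1.0 / var          # precision carried over from earlier datasets
        data_prec = 1.0 / se ** 2       # precision contributed by the current dataset
        var = 1.0 / (prior_prec + data_prec)
        mean = var * (prior_prec * mean + data_prec * est)
    return mean, var

# Hypothetical estimates from five earlier survey years
prior_mean, prior_var = sequential_synthesis(
    [0.95, 1.05, 1.10, 0.98, 1.02], [0.20, 0.18, 0.22, 0.19, 0.20]
)
```

Because every update adds precision, the prior variance shrinks with each additional dataset, which is one reason the exchangeability assumption noted above matters.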

There are some important issues that need to be considered when using the sequential Bayesian synthesis approach. First, the methodology described above assumes that the measures used in each year are the same. If this is not the case, then the variables must first be equated statistically before the sequential analyses can be conducted [8]. Second, in many real-world situations, the researcher does not have access to full datasets, including the same variables, but rather prior information must be obtained from summary data in published research. If the earlier data are in the form of means, variances, and correlation (or covariance) matrices, SEMs can be fit to the data based on this summary information. If, however, earlier studies only report the parameter values and standard errors (e.g., structure coefficients from SEM) of interest, then the sequential approach outlined above cannot be applied. One alternative involves averaging results from earlier studies in order to obtain priors for use in the target analysis. This approach is referred to as aggregated data-dependent priors (AGDP). Kaplan, et al. [2] described this method in the context of both single and multilevel regression models. Finally, when the raw data from the earlier studies are available, the measures are the same as for the target, and the earlier samples were drawn from the same population as the target sample, the researcher can pool the data and conduct a single analysis.

1.3. Meta-Analysis

When prior information is drawn from published research and/or summary data, meta-analysis offers a viable alternative for obtaining prior values for the target Bayesian estimation. Meta-analysis is a statistical approach used to combine results for one or more target parameters (e.g., regression coefficients, correlation coefficients, and effect sizes) from multiple studies while accounting for study features such as sample size and variability in the parameter estimates [16]. The results of a meta-analysis, which typically include both point estimates and measures of variability for the parameter(s) of interest, can then be directly incorporated as priors for Bayesian estimation involving the target dataset.

Assuming that the researcher is able to obtain estimates of the parameter of interest (e.g., SEM structure coefficient) from published research, the random effects meta-analysis model can be used to summarize the previously published values for use as priors in the Bayesian estimation of the SEM for the target dataset. This technique, which was applied in the current study, yields an estimate of the parameter of interest (e.g., SEM structure coefficient) based upon information from previous studies using the model:

θ_{j} = θ + v_{j} + e_{j}

(3)

where

$θ_{j} =$ parameter estimate from study j
$θ =$ overall parameter value (e.g., structure coefficient)
$v_{j} =$ between-study variation; $N (0, τ^{2})$
$e_{j} =$ random error; $N (0, σ_{j}^{2})$.

The overall value, $θ$, is estimated as a weighted mean using Equation (4):

\bar{θ} = \frac{\sum_{j = 1}^{J} w_{j} θ_{j}}{\sum_{j = 1}^{J} w_{j}}

(4)

where

w_{j} = \frac{1}{σ_{j}^{2} + τ^{2}}

with $σ_{j}^{2}$ denoting the sampling variance of the estimate from study j and $τ^{2}$ the between-study variance. Estimation of $τ^{2}$ is typically done using a method described by DerSimonian and Laird [17]. A meta-analytic approach offers potential advantages over the simple averaging of prior parameter estimates by accounting for sample size and variation. The reader interested in a detailed technical description of meta-analysis is encouraged to read any of the excellent discussions of the method, such as Card [16].
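As an illustration of Equations (3) and (4), the following Python sketch computes a random-effects pooled estimate using the DerSimonian and Laird moment estimator of the between-study variance. The function name and the example values are hypothetical, and the inputs are assumed to be study-level coefficient estimates together with their sampling variances (squared standard errors). It is a sketch of the general technique rather than the code used in the study.

```python
def dersimonian_laird(estimates, variances):
    """Random-effects pooling of study estimates.

    estimates: study-level parameter estimates (theta_j)
    variances: within-study sampling variances of those estimates
    Returns (pooled estimate, variance of the pooled estimate, tau^2).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    fixed_mean = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q statistic measures heterogeneity around the fixed-effect mean
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                     # truncated at zero
    w_re = [1.0 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    pooled_var = 1.0 / sum(w_re)
    return pooled, pooled_var, tau2

# Hypothetical structure coefficient estimates from three earlier studies
pooled, pooled_var, tau2 = dersimonian_laird([0.8, 1.0, 1.2], [0.04, 0.04, 0.04])
```

The pooled estimate and its variance could then be supplied as the mean and variance of a normal prior for the corresponding coefficient in the target analysis.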

1.4. Meta-Analytic Predictive Method

The standard meta-analysis approach to summarizing prior information described above weights the data based upon the variance within and between the studies. However, in the context of obtaining priors for Bayesian estimation, it may be the case that data from some previous studies are more similar to the current target data than are others. Thus, it would be beneficial for the meta-analysis to build this heterogeneity of data sources into the calculation of the prior values. Standard meta-analysis would not account for the relative similarity between the target and previous data, but rather only for the variance within and between studies for the parameter(s) of interest. One approach to account for the similarity of results from previous studies to those of the target data is the meta-analytic predictive (MAP) method [13,18]. In this context, the goal is to link the target model parameters and the prior study parameters with hyper-parameters that provide information about both parameter types, as in Equation (5).

p (θ_{*}, θ_{1}, \dots, θ_{J} | Ψ)

(5)

where

$θ_{*} =$ parameters for the target data
$θ_{J} =$ parameters for the prior dataset J
$Ψ =$ hyper-parameters for the target and prior data.

The priors are essentially the posterior probability of the target data parameters based upon the prior datasets, referred to as the marginal posterior and expressed as:

p_{MAP} (θ_{*} | Y_{1}, \dots, Y_{J})

(6)

where

$Y_{J} =$ prior dataset J.

The posterior distribution of the parameters for the target data is then expressed as:

p (θ_{*} | Y_{*}) \propto p (Y_{*} | θ_{*}) p_{MAP} (θ_{*})

(7)

where

$Y_{*} =$ target data.

It should be noted that Equation (6) does not require the actual data from previous studies but instead can incorporate prior estimates of the parameter(s) of interest.

As discussed previously, in some cases, the prior datasets differ from the target data, potentially creating heterogeneity between the Bayesian prior and the target data. In order to account for this potential heterogeneity, Schmidli, et al. [13] introduced a mixture component and weight into the calculation of $p_{MAP}$. This robust $p_{MAP}$ prior takes the form:

p_{MAPr} = (1 - w) p_{MAP} (θ_{*}) + w p_{v} (θ_{*})

(8)

where

$p_{v} (θ_{*}) =$ mixture component
$w =$ probability that the prior information is not relevant to the target data;
derived from multivariate normal unit information prior

For the current study, both $p_{MAP}$ and $p_{MAPr}$ were used.
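The mixture in Equation (8) can be illustrated with a short Python sketch. For simplicity it assumes that both the MAP component and the vague mixture component are univariate normal densities; the function names and all parameter values are hypothetical, and the sketch is not the RBesT implementation.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def robust_map_density(theta, map_mean, map_sd, vague_mean, vague_sd, w):
    """Equation (8): (1 - w) * p_MAP(theta) + w * p_v(theta).

    Both components are taken to be normal here (an illustrative assumption).
    w is the prior probability that the earlier information is not relevant.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    informative = normal_pdf(theta, map_mean, map_sd)
    vague = normal_pdf(theta, vague_mean, vague_sd)
    return (1.0 - w) * informative + w * vague

# Density far from the MAP mean, without and with the vague component
plain = robust_map_density(5.0, 1.0, 0.1, 0.0, 10.0, w=0.0)
robust = robust_map_density(5.0, 1.0, 0.1, 0.0, 10.0, w=0.2)
```

With w = 0 the robust prior reduces to the MAP prior. With w > 0 the vague component gives the prior heavier tails, so target data that conflict with the earlier studies are discounted less severely.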

1.5. Power Priors

Another approach designed to account for heterogeneity in the samples used to obtain priors and the target data is known as the power prior method. As was alluded to earlier, samples in prior studies may be drawn from overlapping but not identical populations to the one from which the target data were drawn. Likewise, the parameters of interest might not be constant over time, making estimates obtained in some previous studies less optimal than others for use as priors for the current target analysis. An alternative approach to dealing with this lack of homogeneity involves the incorporation of information about the similarity of prior and current samples into the weighting of the priors. This weighting can be done using power priors, which were initially described by Ibrahim and Chen [19]. Following is a brief description of power priors. A more thorough discussion appears in Ibrahim, et al. [7].

The basic power prior is defined as:

π (θ | D_{0}, a_{0}) \propto {L (θ | D_{0})}^{a_{0}} π_{0} (θ)

(9)

where

$θ =$ parameters for target data
$D_{0} =$ historic data
$L (θ | D_{0}) =$ likelihood function for $θ$ given historical data $D_{0}$
$π_{0} (θ) =$ initial prior for $θ$ before $D_{0}$ is observed
$a_{0} =$ weight of historic data relative to the likelihood of the current data, $L (θ | D)$
$D =$ target data.

The weight parameter, $a_{0}$, ranges between 0 and 1 and accounts for the heterogeneity of the data from prior studies vis-à-vis the target data. Values of $a_{0}$ closer to 1 indicate that the historic data are more similar to the target data and thus should play a greater role in the estimation of the parameters for the target. Based on the power prior, the posterior distribution of $θ$ for the target data is:

π (θ | D, D_{0}, a_{0}) \propto L (θ | D) {L (θ | D_{0})}^{a_{0}} π_{0} (θ)

(10)

The power prior is essentially a likelihood function raised to a power ($a_{0}$), where the power reflects the homogeneity of the historic data relative to the target data. Ibrahim, et al. (2015) [7] cited several advantages of the power prior approach versus other techniques for obtaining priors:

Propriety relative to the posterior distribution.
Semi-automatic prior elicitation scheme for variable subset selection and general model selection.
All asymptotics associated with a likelihood also apply to the power prior.

The $a_{0}$ parameter can either be estimated as part of the modeling process (DP_random) or supplied by the researcher (DP). In this latter case, Ibrahim, et al. [7] recommend using several values for $a_{0}$ and then conducting a sensitivity analysis in order to identify the optimal value based on model fit. When $a_{0}$ is estimated rather than supplied by the researcher, the full prior specification for the model parameters is:

π (θ, a_{0} | D_{0}) \propto {L (θ | D_{0})}^{a_{0}} π_{0} (θ) π_{0} (a_{0})

(11)

where

$π_{0} (θ) =$ initial prior for model parameters, for example, structure coefficients
$π_{0} (a_{0}) =$ initial prior for $a_{0}$ .

In the current study, both estimated and predefined fixed values for $a_{0}$ were used.
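The effect of the weight $a_{0}$ in Equations (9) and (10) is easiest to see in a simple conjugate case. The Python sketch below assumes a normal mean with known variance and a flat initial prior, in which case raising the historical likelihood to the power $a_{0}$ is equivalent to shrinking the historical sample size to $a_{0} n_{0}$. This is a didactic special case with hypothetical names and values, not the SEM setting or the bayesDP implementation used in the study.

```python
def power_prior_posterior(hist_mean, hist_n, target_mean, target_n, a0, sigma2=1.0):
    """Posterior mean and variance of a normal mean under a fixed-weight power prior.

    Assumes known variance sigma2 and a flat initial prior pi_0(theta).
    a0 = 0 ignores the historical data; a0 = 1 pools it fully with the target data.
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie between 0 and 1")
    effective_n = a0 * hist_n + target_n      # historical data count for a0 * n0 cases
    post_mean = (a0 * hist_n * hist_mean + target_n * target_mean) / effective_n
    post_var = sigma2 / effective_n
    return post_mean, post_var

# Sensitivity of the posterior to a0, mirroring fixed weights of 0.25, 0.50, and 0.75
results = {a0: power_prior_posterior(1.5, 100, 1.0, 100, a0) for a0 in (0.25, 0.50, 0.75)}
```

As $a_{0}$ increases, the posterior mean moves toward the historical mean and the posterior variance decreases, which shows the trade-off between borrowing precision and importing bias when the historical and target data differ.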

1.6. Study Goals

The primary goal of this study was to extend earlier work comparing various methods for determining prior distributions for use in the Bayesian estimation of structure coefficients in SEM. The approaches selected for inclusion were drawn from earlier research and included the following methods for determining the priors for Bayesian estimation: noninformative Bayes priors, Bayes synthesis, pooled Bayes, AGDP, standard meta-analysis, $p_{MAP}$, $p_{MAPr}$, and power priors. A simulation study design was used to compare the various methods for obtaining priors in the context of Bayesian estimation. An analysis of an empirical dataset was also conducted in order to demonstrate the methodology in practice.

2. Materials and Methods

A total of 1000 replications were generated for each combination of simulation conditions. A three-factor SEM (Figure 1) was simulated with 5 observed indicators for each latent trait. The data were simulated under a pure, simple structure. The indicators were simulated using the standard normal distribution with a mean of 0 and a variance of 1. The structure coefficients, A and B in Figure 1, were of primary interest. Structure coefficient A was simulated with the value 1, and coefficient B was simulated with the value 0.5. Data were generated using the R lavaan package, and the models were fit using Mplus [20] as well as the R packages RBesT, bayesDP, and metafor. In addition, the R MplusAutomation library was used to integrate the Mplus data analysis package into the R script used to generate the data and summarize the results. Following is a description of the manipulated study conditions.

2.1. Factor Loadings

The factor loadings connecting the individual observed indicator variables to the latent traits were either 0.6 or 0.8. For a given study condition (e.g., loadings of 0.8), all 15 indicators had the same loading. These values were selected to represent cases where less than 50% of the variance in the observed indicators was accounted for by the factor (loadings = 0.6) and when more than 50% of the variance in the indicators was accounted for by the factor (loadings = 0.8).
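The 50% threshold follows from squaring the loading. Assuming standardized indicators and a unit-variance factor, so that the squared loading equals the proportion of indicator variance explained by the factor, the two conditions give:

```latex
\lambda = 0.6:\quad \lambda^{2} = 0.36 \;(36\%), \qquad
\lambda = 0.8:\quad \lambda^{2} = 0.64 \;(64\%)
```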

2.2. Number of Previous Datasets

For each simulation replication, a target dataset was generated using the methodology described above. It was for this target dataset that the outcome variables (described below) were collected. In addition to the target dataset, further datasets were generated using the same methodology as was employed for the target data. These datasets served the role of data used in prior research. For each of these datasets, an SEM based on Figure 1 was fitted to the data, and the structure coefficient estimates were then used to obtain prior distributional values using the methods described earlier in the manuscript. Three conditions for the number of prior datasets were included in this study: 5, 10, and 15. These conditions correspond to scenarios in which researchers who collected the target data had access to data from 5, 10, or 15 previous studies using the same observed indicators.

2.3. Structure Coefficient Heterogeneity

The structure coefficients for the previous datasets were simulated to be either homogeneous with those of the target data (A = 1 and B = 0.5) or heterogeneous with respect to the target values. For the heterogeneous conditions, structure coefficient A was simulated to be either 0.2, 0.5, 0.8, 1.2, 1.5, or 1.8. For structure coefficient B, the heterogeneous values were −0.3, 0, 0.3, 0.7, 1.0, or 1.3. The heterogeneous conditions correspond to situations in which the parameters of interest for the prior data differed from those of the target data.

2.4. Sample Size

The sample sizes used in the current study were simulated to be either 200, 500, or 1000. These values were selected so as to correspond to small, moderate, or relatively large samples.

2.5. Methods for Determining Priors

The parameters of interest in the current study were the structure coefficients, A and B, from Figure 1. Multiple approaches for determining the prior distributions when estimating these parameters were used, including noninformative priors, Bayes synthesis, AGDP, meta-analysis, $p_{MAP}$, $p_{MAPr}$, and the power priors. These methods were selected because they have been shown to be effective in prior research. For example, in the context of latent variable growth models, Bayesian synthesis and pooled data analysis were both shown to be effective tools for synthesizing previous results into priors [8]. Likewise, power priors have been shown to be effective tools for determining priors for regression coefficients in multilevel modeling [2] as well as for single-level regression models [19]. Meta-analytic Bayesian prior techniques have also been found useful in observed variable modeling. Given these results from earlier research, primarily with observed variable models, it was of interest to ascertain how these methods might work in the latent variable context, given the ubiquity of latent variable models in psychological research. For the power priors, the random effects model (DP_random) was used to estimate the optimal value of $a_{0}$. In addition, values for $a_{0}$ of 0.75 (DP_75), 0.50 (DP_50), or 0.25 (DP_25) were included in the study. The use of these values corresponds to the case where the researcher supplies their own $a_{0}$ and then uses sensitivity analysis to determine the optimal value. Finally, estimation using the pooled data was included in the study.

2.6. Study Outcomes

For each structure coefficient, three outcomes were included in this study: mean squared error (MSE), relative bias, and the empirical standard error associated with each of the structure coefficients (A and B) in the SEM in Figure 1. The MSE is calculated as:

MSE = \frac{\sum_{r = 1}^{R} {(\hat{θ_{r}} - θ)}^{2}}{R}

(12)

where

$θ =$ data-generating value of the structure coefficient
$\hat{θ_{r}} =$ estimated value of the structure coefficient for replication r
$R =$ total number of simulation replications.

The relative bias for each replication was calculated as:

Bias = \frac{\hat{θ_{r}} - θ}{θ}

(13)

The mean relative bias taken across the R replications was then used to assess the performance of the various approaches to determining the priors. Finally, the empirical standard error was simply the standard deviation of the $\hat{θ_{r}}$ taken across the R replications for each combination of the simulation study conditions. An analysis of variance (ANOVA) was used to determine which of the manipulated conditions and their interactions were associated with each of the outcome variables.
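The three outcomes can be computed from a vector of replication estimates as in the following Python sketch. The function name and example values are hypothetical. The use of the sample standard deviation (with an R − 1 denominator) for the empirical standard error is an assumption, since the text does not state which denominator was used.

```python
import statistics

def simulation_outcomes(estimates, true_value):
    """Summarize replication estimates of one coefficient.

    Returns (MSE, mean relative bias, empirical standard error).
    """
    n_reps = len(estimates)
    # Equation (12): average squared deviation from the data-generating value
    mse = sum((est - true_value) ** 2 for est in estimates) / n_reps
    # Equation (13), averaged across replications
    mean_relative_bias = sum((est - true_value) / true_value for est in estimates) / n_reps
    # Standard deviation of the estimates (sample SD assumed here)
    empirical_se = statistics.stdev(estimates)
    return mse, mean_relative_bias, empirical_se

# Hypothetical estimates of coefficient A (true value 1) from three replications
mse, bias, se = simulation_outcomes([0.9, 1.0, 1.1], true_value=1.0)
```

Note that relative bias is undefined when the data-generating value is 0, so a different bias measure would be needed for a coefficient simulated at that value.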

3. Results

3.1. Mean Squared Error

The results for the two structure coefficients were nearly identical, and therefore the following discussion focuses only on coefficient A, which had a population value of 1. The ANOVA results indicated that the interaction of the prior method and the heterogeneity of the data used for determining the priors ($F_{72,11} = 43.01, p < 0.001, η^{2} = 0.73$) and the interaction of the method and sample size ($F_{24,2} = 78.31, p < 0.001, η^{2} = 0.99$) were statistically significantly associated with the MSE. The MSE values by method and heterogeneity appear in Table 1. For all of the methods, the closer the structure coefficient values in the prior set were to the population-generating value for the target data (1), the lower the MSE. With respect to the methods themselves, the lowest MSE values were generally associated with the DP_random method for obtaining prior values. However, this was not universally the case, thus leading to the statistically significant interaction reported above. When the mean of the structure coefficient for the prior set was equal to that of the data-generating value for the target set (heterogeneity = 1), the sequential Bayesian synthesis approach yielded the smallest MSE values, followed by the two $p_{MAP}$ methods. The largest MSE in this case was associated with the noninformative Bayesian estimate and the DP_random method. When the mean of the Bayesian prior set was less than that for the target population, DP_random yielded the smallest MSE value, with the exception of a value of 0.8. In that latter case, DP_25 and DP_50 had the smallest MSE values. When the mean structure coefficient for the prior set was larger than that of the target population, DP_random also had the smallest MSE, except for a prior structure mean of 1.2, in which case DP_25 and DP_50 had the lowest MSE. Across the heterogeneous prior set conditions, MSE for the noninformative Bayes prior was not influenced by the degree of difference between the prior data and the target data. This result was to be expected, given that the noninformative Bayes estimate did not make use of the previous information.

MSE by sample size and method appears in Table 2. For all methods, the MSE declined concomitantly with increases in sample size. The lowest MSE was associated with DP_random, followed by DP_25, DP_50, and DP_75. Finally, the impact of sample size on MSE was most marked for $p_{MAPr}$, with a difference in MSE of approximately 0.055. This approach also had the largest MSE across all sample size conditions.

3.2. Bias

The results of the ANOVA indicated that the interaction of method and the heterogeneity of the data used for determining the priors ($F_{72,11} = 22.84, p < 0.001, η^{2} = 0.68$) and the interaction of method and factor loadings ($F_{24,4} = 71.13, p < 0.001, η^{2} = 0.99$) were statistically significantly associated with bias in the structure coefficient estimates. Table 3 includes bias results for each method based on the heterogeneity of the Bayesian prior data and the target data. When the structure coefficient mean for the target and Bayesian prior data were the same (i.e., 1), the lowest bias results were associated with the two $p_{MAP}$ methods and the sequential Bayesian synthesis approach. In contrast, the greatest bias in the equal structure coefficient case was associated with the noninformative Bayes approach.

When the mean structure value for the Bayes prior data was below that of the target data, all methods yielded negatively biased estimates. The lowest bias was associated with DP_random across the Bayes prior structure coefficient values. The next lowest bias was associated with DP_25, DP_50, and DP_75. The $p_{MAP}$ techniques yielded the third largest degree of bias across non-equivalent mean coefficient conditions, despite having the least bias when the target and prior data had equal structure coefficient values. The worst-performing method in this respect was the meta-analysis technique. The noninformative Bayes approach was largely unaffected by the degree of heterogeneity, which was to be expected given that it did not make use of information from prior datasets.

Bias was larger for all methods when the factor loadings were smaller (Table 4). In addition, across factor loading values, the lowest bias was associated with DP_random. The largest difference in bias between the 0.6 and 0.8 loading conditions was associated with the Naïve Bayes method (approximately 0.09) and AGDP (approximately 0.08). For the other approaches used in this study, the difference in bias between loadings of 0.6 and 0.8 was never larger than 0.05.

3.3. Standard Error

ANOVA results revealed that the interaction of method and heterogeneity of the data used for determining the priors (F_{66,42} = 21.76, p < 0.001, η^{2} = 0.76) and method by sample size (F_{22,6} = 1577.43, p < 0.001, η^{2} = 0.99) were statistically significantly associated with the MSE. Table 5 includes the standard errors for the structure coefficient by level of heterogeneity and method. Across levels of heterogeneity, DP_random yielded the lowest standard error. Several other methods yielded comparably low standard errors, including AGDP, DP_75, p_{MAP}, and p_{MAPr}. The largest standard errors were associated with the synthesized Bayes and noninformative Bayes approaches. For all of the methods examined here, except noninformative Bayes, standard errors increased in value concomitantly with an increase in the divergence of the prior data mean structure value from that of the target data. With respect to sample size (Table 6), for all methods studied here, standard errors were smaller for larger samples. As discussed above, DP_random, AGDP, DP_75, p_{MAP}, and p_{MAPr} had the smallest standard error values.
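As a rough illustration of how an effect size such as η^{2} can be obtained from simulation output, the following sketch fits a factorial ANOVA to a toy data frame and divides each term's sum of squares by the total. The data, the method labels (m1 to m3), and the object names are hypothetical, and the sketch computes the classical (not partial) η^{2}; it is not the code used to produce the values reported above.

```r
# Hypothetical simulation summary: 20 replications per method-by-sample-size cell.
set.seed(1)
sim <- expand.grid(method = c("m1", "m2", "m3"),
                   n = c(250, 500, 1000),
                   rep = 1:20)

# Toy outcome: standard errors shrink with n and are slightly larger for m1.
sim$se <- 0.2 * sqrt(250 / sim$n) +
  0.03 * (sim$method == "m1") +
  rnorm(nrow(sim), sd = 0.01)

# Factorial ANOVA with the method-by-sample-size interaction.
fit <- aov(se ~ method * factor(n), data = sim)
tab <- summary(fit)[[1]]

# Classical eta-squared: each term's sum of squares over the total sum of squares.
eta_sq <- setNames(tab[["Sum Sq"]] / sum(tab[["Sum Sq"]]),
                   trimws(rownames(tab)))
round(eta_sq, 3)
```

Because the classical version partitions the total variability, the values (including the residual term) sum to 1, which makes it easy to see which design factor accounts for most of the variation in the outcome.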

3.4. Prior and Posterior Distributions

In order to gain greater insight into the relationship between the prior and posterior distributions for each of the methods, histograms for each method appear in Appendix A. These histograms reflect the prior and posterior distributions for the structure coefficient in the condition with a sample size of 1000 and heterogeneity = 1 for each of the methods included in the study. Note that the population value for the coefficient was 1. An examination of these figures reveals a high level of congruence between the posterior and prior distributions around the population value of 1 for the p_{MAP} technique. This result supports the findings that this approach generally yielded lower levels of bias and MSE, as outlined above. The spread of both the prior and posterior was also smaller than was the case for the other methods.

Perhaps the two best performers in the heterogeneous prior data case were DP_random and p_{MAP}. The prior and posterior distributions for each of these methods appear at the end of Appendix A. From these results, we can see that the prior distributions for both methods are centered well above the population value for the target variable, which was 1. The posterior means for both techniques were close to the population-generating value. However, the variability in both the prior and posterior distributions for DP_random was much lower than that for p_{MAP}. This result provides insight into the lower bias associated with DP_random as compared to the other methods included in this study.

4. Discussion

The goal of this study was to compare several approaches for identifying prior distributions to be used in the Bayesian estimation of structure coefficients in a latent variable SEM. The results presented above revealed differences in the performance of the various approaches with respect to accurate estimation of the structure values. The results of this study are, in some respects, similar to previous findings for observed variable models. For example, the Bayesian synthesis approach worked well when the prior data were similar to the target data with respect to relationships and distributions, which was also found for multilevel regression models [2,8]. This study also found that, generally speaking, the p_{MAPr} approach to meta-analytic synthesis was better than either standard meta-analysis or the non-robust p_{MAP}, which extends similar findings for observed variable models [13]. The power prior method was found to be perhaps the most effective tool for developing prior distribution information based on prior research. Prior work on the power prior did not compare it with as many other methods as did the current study. However, with respect to the methods to which it had been compared previously, specifically the Naïve Bayes and AGDP techniques, power priors were found to yield more accurate results for regression models [19]. The current study found similar positive results for power priors in the context of latent variable models over a wider range of alternative techniques.
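To make the role of the weight in a power prior concrete, the sketch below uses a normal approximation: when a historical estimate with a given standard error is treated as a normal likelihood and raised to a power a0 between 0 and 1 (with a flat initial prior), the resulting prior keeps the same mean and has its variance divided by a0. The function name, the historical values, and the weights are illustrative assumptions only, and this closed-form shortcut is not the bayesDP routine used in the simulation.

```r
# Normal approximation to the historical evidence: estimate est0, standard error se0.
# Raising a normal likelihood to the power a0 (0 < a0 <= 1) keeps its mean and
# multiplies its precision by a0, so the variance becomes se0^2 / a0.
power_prior_normal <- function(est0, se0, a0) {
  stopifnot(a0 > 0, a0 <= 1)
  c(mean = est0, var = se0^2 / a0)
}

# Hypothetical historical result and illustrative fixed weights.
est0 <- 0.5
se0 <- 0.1
weights <- c(0.25, 0.50, 0.75, 1.00)

# One row per weight: smaller weights give wider (less informative) priors.
priors <- t(sapply(weights, function(a0) power_prior_normal(est0, se0, a0)))
rownames(priors) <- paste0("a0=", weights)
priors
```

A weight of 1 uses the historical result at full strength, whereas weights closer to 0 discount it; estimating the weight from the data, rather than fixing it, lets the degree of discounting respond to how well the historical and target data agree.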

From the findings described above, several recommendations for practice can be made. First, the use of noninformative priors is typically not recommended when the researcher has access to information from prior research. Estimates based on the noninformative priors generally had larger MSE and SE values than most of the other techniques examined in this research. On the other hand, when the prior information available to the researcher was heterogeneous with the target data, the relative estimation bias for the noninformative Bayes estimator was lower than that of several of the other methods. Nonetheless, noninformative Bayes never outperformed all of the other methods for setting priors and is thus not generally recommended for use when prior information is available to the researcher.

A second recommendation to come from these findings is that the power prior approach DP_random appears to yield the least biased results across most study conditions. This reduced bias vis-à-vis the other methods was particularly notable when the prior data available for obtaining prior distributions diverged from that of the target dataset. Indeed, the more heterogeneous these prior datasets were from the target, the greater the advantage provided by DP_random with respect to estimation bias. Furthermore, the DP_random technique also consistently yielded the smallest standard errors of the structure coefficient estimate. Taken together, these results suggest that using the Bayesian estimator with DP_random power priors was likely to yield the most accurate and efficient estimates of the structure coefficients for the SEM. A third finding from this study that may have direct application to research practice is that having as few as five previous results available upon which to draw information for determining priors is sufficient to yield relatively accurate and efficient estimates for the target data using the DP_random approach. Having more previous data for determining the priors did not significantly impact the accuracy of the results for the target data.

A fourth finding from this study relevant for practice was that methods for Bayesian analysis involving actual analysis of the prior datasets (i.e., Bayes synthesis or pooled estimation) did not generally yield more accurate results than techniques that developed priors from summary data (e.g., power priors and meta-analysis). When the population parameters for the previous and target data were identical, the synthesis and pooled approaches generally yielded the most accurate and efficient structure coefficient estimates. This result is in keeping with the assumption that previously collected datasets will be homogeneous with the target data [8] when the Bayesian synthesis technique is to be used. Thus, when researchers have such datasets available to them and the data are homogeneous with the current target data, the Bayesian synthesis method in particular may be a useful alternative for researchers to use when developing priors. Such situations are not uncommon in the context of large data collection programs such as the aforementioned PISA, YRBS, and GSS. However, in many other research scenarios, such datasets are not available, and thus, synthesis and pooling are not realistic options. In addition, when the researcher cannot verify (or at least very reasonably assume) that the prior data are homogeneous with respect to the current data, the synthesis technique may not be appropriate for developing priors. Again, this recommendation fits with prior recommendations in the literature [8].
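The sensitivity of sequential synthesis to heterogeneity can be sketched with a simple conjugate normal update, in which the posterior from one dataset becomes the prior for the next. The function name and all numeric values below are hypothetical, and the closed-form update is a simplification of the MCMC-based estimation carried out in Mplus; the last element of each vector plays the role of the target data.

```r
# Sequential normal updating: the posterior after dataset k is the prior for dataset k + 1.
sequential_synthesis <- function(est, se, prior_mean = 0, prior_var = 1e6) {
  state <- c(mean = prior_mean, var = prior_var)
  for (k in seq_along(est)) {
    # Posterior precision is the sum of the prior and data precisions.
    prec <- 1 / state[["var"]] + 1 / se[k]^2
    state <- c(mean = (state[["mean"]] / state[["var"]] + est[k] / se[k]^2) / prec,
               var = 1 / prec)
  }
  state
}

se <- rep(0.1, 6)

# Homogeneous case: five prior datasets agree with the target value of 1.
homog <- sequential_synthesis(c(1, 1, 1, 1, 1, 1), se)

# Heterogeneous case: five prior datasets center on 0.5, the target on 1.
hetero <- sequential_synthesis(c(0.5, 0.5, 0.5, 0.5, 0.5, 1), se)

rbind(homog, hetero)
```

In the heterogeneous case, the accumulated prior information outweighs the single target dataset, so the final estimate is pulled well below the target value, which is consistent with the recommendation that synthesis be reserved for prior data that can reasonably be assumed homogeneous with the target.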

Study Limitations and Directions for Future Research

Although this study was designed to cover a number of real-world conditions faced by researchers in practice, it definitely has limitations that should be addressed in future studies. Perhaps foremost, future research should consider more complex SEMs than the one used in this study. The relatively simple model used here was selected so as to allow for a clear understanding of how accurately specific structure coefficients could be estimated. However, certainly more complex models should be considered in future research. In addition, future studies should include a wider array of sample sizes, particularly smaller values than the smallest used here, 250. Again, the purpose of this study was to investigate the performance of these methods for determining priors in relatively straightforward conditions. However, researchers are sometimes faced with small samples, and Bayesian estimation has been recommended for use in such cases (e.g., Kaplan, 2016). At the same time, it is also the case that with small samples, the choice of priors is very important, as they will have a greater impact on the final parameter estimates than is the case with larger samples. Thus, future work examining the performance of the prior determination methods studied here should include a range of small sample sizes. Research also needs to be done examining a greater array of latent variable modeling situations, including models with complex factor structure (e.g., bifactor and second-order factor), latent class models, and multiple groups factor models used for invariance testing. Finally, future work should also consider cases where the latent structure is misspecified. In practice, researchers may not always correctly specify their latent variable models, perhaps ignoring cross-loadings for the factors or using an incorrect number of latent variables. It is important to learn whether some methods for determining priors are more accurate in such cases.

5. Conclusions

Bayesian estimation is a powerful tool for researchers to employ when working with latent variable models. It has been shown to be particularly useful with small samples and complex, difficult-to-fit models. One key aspect of properly fitting such models is the determination of prior distributions to be used in the estimation process. The results of this study suggest that the power prior approach, for which the weight applied to results from previous studies is estimated, may be a particularly appealing technique for developing priors of the structure coefficients. It tended to yield the least biased and most efficient estimates across a variety of study conditions.

It is hoped that researchers can apply these results to their own work. The approaches studied here, particularly the power prior method, are very easy to carry out using the R software environment (version 4.3.2). Therefore, if researchers are able to identify prior studies or existing datasets that are related to their own work, they can use that information to develop priors using the synthesis methods described here. For example, a researcher working in the area of achievement goal motivation could take factor analysis and SEM results from earlier studies, enter them into an Excel spreadsheet, and then apply the methods studied here, such as power priors, in order to obtain appropriate priors for the analysis of their own data using a Bayesian estimator. Given the results of this study, it seems clear that in many situations, the default noninformative priors available in software packages may not yield the most accurate estimates of structure coefficients in an SEM. Thus, by synthesizing results reported in the literature using an approach such as power priors, the researcher can obtain more accurate SEM structure coefficient estimates than would be the case simply using default priors available in software.
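As a minimal sketch of this workflow, the code below takes a small table of previously reported structure coefficients and standard errors, combines them with inverse-variance weights, and formats the result as a prior statement of the kind used in the Mplus scripts in Appendix B. The study values, object names, and the parameter label beta1 are hypothetical, and simple fixed-effect pooling stands in here for the more elaborate power prior and meta-analytic predictive methods compared in this study.

```r
# Hypothetical coefficients and standard errors transcribed from earlier studies.
prior_studies <- data.frame(study = c("S1", "S2", "S3"),
                            est = c(0.42, 0.55, 0.47),
                            se = c(0.10, 0.08, 0.12))

# Inverse-variance (fixed-effect) pooling of the reported coefficients.
w <- 1 / prior_studies$se^2
pooled_est <- sum(w * prior_studies$est) / sum(w)
pooled_var <- 1 / sum(w)

# Format the pooled mean and variance as a normal prior for a labeled parameter.
prior_line <- sprintf("beta1~N(%.3f, %.4f);", pooled_est, pooled_var)
prior_line
```

The resulting string could be placed under MODEL PRIORS for a structure coefficient labeled beta1; in practice, the pooled variance might be inflated to reflect uncertainty about how comparable the earlier studies are to the target data.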

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Posterior and Prior Distributions for Methods Used to Obtain Bayesian Priors

Appendix B. R Simulation Script

library(lavaan)
library(semTools)
library(spatstat.utils)
library(tcltk)
library(Matrix)
library(gdata)
library(bain)
library(gtools)
library(MIIVsem)
library(simsem)
library(MplusAutomation)
library(metafor)
library(bayesDP)
library(RBesT)
library(MLmetrics)

# Replace function is needed to fill in parameters in Mplus script with specified values
loopReplace <- function(text, replacements) {
  for (v in names(replacements)) {
    text <- gsub(sprintf("\\[\\[%s\\]\\]", v), replacements[[v]], text)
  }
  return(text)
}

set.seed(31092)

simulation1.coef <- NULL
simulation1.se <- NULL
simulation1.cover <- NULL
simulation1.bias <- NULL

n <- 200
coef1 <- 1
coef2 <- 0.5

#SIMULATION MODEL#
population.model <- ' f1 =~ x1 + 0.6*x2 + 0.6*x3 + 0.6*x4 + 0.6*x5
f2 =~ x6 + 0.6*x7 + 0.6*x8 + 0.6*x9 + 0.6*x10
f3 =~ x11 + 0.6*x12 + 0.6*x13 + 0.6*x14 + 0.6*x15
f3 ~ 1*f1 + 0.5*f2'

population.model.hetero <- ' f1 =~ x1 + 0.6*x2 + 0.6*x3 + 0.6*x4 + 0.6*x5
f2 =~ x6 + 0.6*x7 + 0.6*x8 + 0.6*x9 + 0.6*x10
f3 =~ x11 + 0.6*x12 + 0.6*x13 + 0.6*x14 + 0.6*x15
f3 ~ 0.5*f1 + 0.25*f2
'

for(z in 1:100) {

#################
##GENERATE DATA##
#################
sim.data1 <- simulateData(population.model.hetero, sample.nobs=n)
sim.data2 <- simulateData(population.model.hetero, sample.nobs=n)
sim.data3 <- simulateData(population.model.hetero, sample.nobs=n)
sim.data4 <- simulateData(population.model.hetero, sample.nobs=n)
sim.data5 <- simulateData(population.model.hetero, sample.nobs=n)
sim.data6 <- simulateData(population.model, sample.nobs=n)

####### Create MLE Syntax for First Data Set #######
mle.script <- mplusObject(
  TITLE = "SEM for data1;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = ml;",
  MODEL = "
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 f2;
  ",
  # Confidence intervals are requested through the OUTPUT section
  OUTPUT = "CINTERVAL;",
  usevariables = paste0("x", 1:15),
  rdata = sim.data1)

##RUN MLE IN MPLUS##
mle.result <- mplusModeler(mle.script, "sim.data1", modelout = "mle.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR MLE##
# Columns 3 and 4 of the unstandardized table hold the estimate and its standard error
mle.par <- mle.result$results$parameters$unstandardized
mle.coef.f1 <- mle.par[mle.par$paramHeader == "F3.ON" & mle.par$param == "F1", 3]
mle.se.f1 <- mle.par[mle.par$paramHeader == "F3.ON" & mle.par$param == "F1", 4]
mle.coef.f2 <- mle.par[mle.par$paramHeader == "F3.ON" & mle.par$param == "F2", 3]
mle.se.f2 <- mle.par[mle.par$paramHeader == "F3.ON" & mle.par$param == "F2", 4]
mle.coef.f1
mle.se.f1
mle.coef.f2
mle.se.f2

# Columns 4 and 8 of the confidence interval table are used as the lower and upper limits
mle.ci <- mle.result$results$parameters$ci.unstandardized
mle.low.f1 <- mle.ci[mle.ci$paramHeader == "F3.ON" & mle.ci$param == "F1", 4]
mle.high.f1 <- mle.ci[mle.ci$paramHeader == "F3.ON" & mle.ci$param == "F1", 8]
mle.cover.f1 <- ifelse(coef1 >= mle.low.f1 & coef1 <= mle.high.f1, 1, 0)
mle.low.f2 <- mle.ci[mle.ci$paramHeader == "F3.ON" & mle.ci$param == "F2", 4]
mle.high.f2 <- mle.ci[mle.ci$paramHeader == "F3.ON" & mle.ci$param == "F2", 8]
mle.cover.f2 <- ifelse(coef2 >= mle.low.f2 & coef2 <= mle.high.f2, 1, 0)

####### Create Bayes Syntax for First Data Set #######
bayes1.script <- mplusObject(
  TITLE = "SEM for data1;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = "
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 f2;
  ",
  usevariables = paste0("x", 1:15),
  rdata = sim.data1)

##RUN BAYES1 IN MPLUS##
bayes1.result <- mplusModeler(bayes1.script, "sim.data1", modelout = "bayes1.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR BAYES1##
# Columns 3 and 4 hold the posterior estimate and its standard deviation
bayes1.par <- bayes1.result$results$parameters$unstandardized
bayes1.coef.f1 <- bayes1.par[bayes1.par$paramHeader == "F3.ON" & bayes1.par$param == "F1", 3]
bayes1.se.f1 <- bayes1.par[bayes1.par$paramHeader == "F3.ON" & bayes1.par$param == "F1", 4]
bayes1.coef.f2 <- bayes1.par[bayes1.par$paramHeader == "F3.ON" & bayes1.par$param == "F2", 3]
bayes1.se.f2 <- bayes1.par[bayes1.par$paramHeader == "F3.ON" & bayes1.par$param == "F2", 4]
bayes1.coef.f1
bayes1.se.f1
bayes1.coef.f2
bayes1.se.f2
bayes1.var.f1 <- bayes1.se.f1^2
bayes1.var.f2 <- bayes1.se.f2^2
df.f <- data.frame(bayes1.coef.f1, bayes1.coef.f2, bayes1.var.f1, bayes1.var.f2)

# Columns 6 and 7 are used as the lower and upper credible interval limits
bayes1.low.f1 <- bayes1.par[bayes1.par$paramHeader == "F3.ON" & bayes1.par$param == "F1", 6]
bayes1.high.f1 <- bayes1.par[bayes1.par$paramHeader == "F3.ON" & bayes1.par$param == "F1", 7]
bayes1.cover.f1 <- ifelse(coef1 >= bayes1.low.f1 & coef1 <= bayes1.high.f1, 1, 0)
bayes1.low.f2 <- bayes1.par[bayes1.par$paramHeader == "F3.ON" & bayes1.par$param == "F2", 6]
bayes1.high.f2 <- bayes1.par[bayes1.par$paramHeader == "F3.ON" & bayes1.par$param == "F2", 7]
bayes1.cover.f2 <- ifelse(coef2 >= bayes1.low.f2 & coef2 <= bayes1.high.f2, 1, 0)

####### Create Bayes Syntax for Second Data Set #######
# Priors for the structure coefficients come from the posterior for data set 1
bayes2.script <- mplusObject(
  TITLE = "SEM for data2;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = loopReplace("
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 (beta1);
    f3 on f2 (beta2);
    MODEL PRIORS:
    beta1~N([[bayes1.coef.f1]], [[bayes1.var.f1]]);
    beta2~N([[bayes1.coef.f2]], [[bayes1.var.f2]]);
  ", df.f),
  usevariables = paste0("x", 1:15),
  rdata = sim.data2)

##RUN BAYES2 IN MPLUS##
bayes2.result <- mplusModeler(bayes2.script, "sim.data2", modelout = "bayes2.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR BAYES2##
bayes2.par <- bayes2.result$results$parameters$unstandardized
bayes2.coef.f1 <- bayes2.par[bayes2.par$paramHeader == "F3.ON" & bayes2.par$param == "F1", 3]
bayes2.se.f1 <- bayes2.par[bayes2.par$paramHeader == "F3.ON" & bayes2.par$param == "F1", 4]
bayes2.coef.f2 <- bayes2.par[bayes2.par$paramHeader == "F3.ON" & bayes2.par$param == "F2", 3]
bayes2.se.f2 <- bayes2.par[bayes2.par$paramHeader == "F3.ON" & bayes2.par$param == "F2", 4]
bayes2.coef.f1
bayes2.se.f1
bayes2.coef.f2
bayes2.se.f2
bayes2.var.f1 <- bayes2.se.f1^2
bayes2.var.f2 <- bayes2.se.f2^2
df.f <- data.frame(bayes2.coef.f1, bayes2.coef.f2, bayes2.var.f1, bayes2.var.f2)

####### Create Bayes Syntax for Third Data Set #######
# Priors for the structure coefficients come from the posterior for data set 2
bayes3.script <- mplusObject(
  TITLE = "SEM for data3;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = loopReplace("
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 (beta1);
    f3 on f2 (beta2);
    MODEL PRIORS:
    beta1~N([[bayes2.coef.f1]], [[bayes2.var.f1]]);
    beta2~N([[bayes2.coef.f2]], [[bayes2.var.f2]]);
  ", df.f),
  usevariables = paste0("x", 1:15),
  rdata = sim.data3)

##RUN BAYES3 IN MPLUS##
bayes3.result <- mplusModeler(bayes3.script, "sim.data3", modelout = "bayes3.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR BAYES3##
bayes3.par <- bayes3.result$results$parameters$unstandardized
bayes3.coef.f1 <- bayes3.par[bayes3.par$paramHeader == "F3.ON" & bayes3.par$param == "F1", 3]
bayes3.se.f1 <- bayes3.par[bayes3.par$paramHeader == "F3.ON" & bayes3.par$param == "F1", 4]
bayes3.coef.f2 <- bayes3.par[bayes3.par$paramHeader == "F3.ON" & bayes3.par$param == "F2", 3]
bayes3.se.f2 <- bayes3.par[bayes3.par$paramHeader == "F3.ON" & bayes3.par$param == "F2", 4]
bayes3.coef.f1
bayes3.se.f1
bayes3.coef.f2
bayes3.se.f2
bayes3.var.f1 <- bayes3.se.f1^2
bayes3.var.f2 <- bayes3.se.f2^2
df.f <- data.frame(bayes3.coef.f1, bayes3.coef.f2, bayes3.var.f1, bayes3.var.f2)

####### Create Bayes Syntax for Fourth Data Set #######
# Priors for the structure coefficients come from the posterior for data set 3
bayes4.script <- mplusObject(
  TITLE = "SEM for data4;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = loopReplace("
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 (beta1);
    f3 on f2 (beta2);
    MODEL PRIORS:
    beta1~N([[bayes3.coef.f1]], [[bayes3.var.f1]]);
    beta2~N([[bayes3.coef.f2]], [[bayes3.var.f2]]);
  ", df.f),
  usevariables = paste0("x", 1:15),
  rdata = sim.data4)

##RUN BAYES4 IN MPLUS##
bayes4.result <- mplusModeler(bayes4.script, "sim.data4", modelout = "bayes4.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR BAYES4##
bayes4.par <- bayes4.result$results$parameters$unstandardized
bayes4.coef.f1 <- bayes4.par[bayes4.par$paramHeader == "F3.ON" & bayes4.par$param == "F1", 3]
bayes4.se.f1 <- bayes4.par[bayes4.par$paramHeader == "F3.ON" & bayes4.par$param == "F1", 4]
bayes4.coef.f2 <- bayes4.par[bayes4.par$paramHeader == "F3.ON" & bayes4.par$param == "F2", 3]
bayes4.se.f2 <- bayes4.par[bayes4.par$paramHeader == "F3.ON" & bayes4.par$param == "F2", 4]
bayes4.coef.f1
bayes4.se.f1
bayes4.coef.f2
bayes4.se.f2
# Prior variances for the next step are the squared posterior SDs from data set 4
bayes4.var.f1 <- bayes4.se.f1^2
bayes4.var.f2 <- bayes4.se.f2^2
df.f <- data.frame(bayes4.coef.f1, bayes4.coef.f2, bayes4.var.f1, bayes4.var.f2)

####### Create Bayes Syntax for Fifth Data Set #######
# Priors for the structure coefficients come from the posterior for data set 4
bayes5.script <- mplusObject(
  TITLE = "SEM for bayes5;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = loopReplace("
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 (beta1);
    f3 on f2 (beta2);
    MODEL PRIORS:
    beta1~N([[bayes4.coef.f1]], [[bayes4.var.f1]]);
    beta2~N([[bayes4.coef.f2]], [[bayes4.var.f2]]);
  ", df.f),
  usevariables = paste0("x", 1:15),
  rdata = sim.data5)

##RUN BAYES5 IN MPLUS##
bayes5.result <- mplusModeler(bayes5.script, "sim.data5", modelout = "bayes5.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR BAYES5##
bayes5.par <- bayes5.result$results$parameters$unstandardized
bayes5.coef.f1 <- bayes5.par[bayes5.par$paramHeader == "F3.ON" & bayes5.par$param == "F1", 3]
bayes5.se.f1 <- bayes5.par[bayes5.par$paramHeader == "F3.ON" & bayes5.par$param == "F1", 4]
bayes5.coef.f2 <- bayes5.par[bayes5.par$paramHeader == "F3.ON" & bayes5.par$param == "F2", 3]
bayes5.se.f2 <- bayes5.par[bayes5.par$paramHeader == "F3.ON" & bayes5.par$param == "F2", 4]
bayes5.coef.f1
bayes5.se.f1
bayes5.coef.f2
bayes5.se.f2
bayes5.var.f1 <- bayes5.se.f1^2
bayes5.var.f2 <- bayes5.se.f2^2
df.f <- data.frame(bayes5.coef.f1, bayes5.coef.f2, bayes5.var.f1, bayes5.var.f2)

####### Create Bayes Syntax for Sixth Data Set #######
# Target data: priors for the structure coefficients come from the posterior for data set 5
bayes6.script <- mplusObject(
  TITLE = "SEM for bayes6;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = loopReplace("
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 (beta1);
    f3 on f2 (beta2);
    MODEL PRIORS:
    beta1~N([[bayes5.coef.f1]], [[bayes5.var.f1]]);
    beta2~N([[bayes5.coef.f2]], [[bayes5.var.f2]]);
  ", df.f),
  usevariables = paste0("x", 1:15),
  rdata = sim.data6)

##RUN BAYES6 IN MPLUS##
bayes6.result <- mplusModeler(bayes6.script, "sim.data6", modelout = "bayes6.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR BAYES6##
bayes6.par <- bayes6.result$results$parameters$unstandardized
bayes6.coef.f1 <- bayes6.par[bayes6.par$paramHeader == "F3.ON" & bayes6.par$param == "F1", 3]
bayes6.se.f1 <- bayes6.par[bayes6.par$paramHeader == "F3.ON" & bayes6.par$param == "F1", 4]
bayes6.coef.f2 <- bayes6.par[bayes6.par$paramHeader == "F3.ON" & bayes6.par$param == "F2", 3]
bayes6.se.f2 <- bayes6.par[bayes6.par$paramHeader == "F3.ON" & bayes6.par$param == "F2", 4]
bayes6.coef.f1
bayes6.se.f1
bayes6.coef.f2
bayes6.se.f2
bayes6.var.f1 <- bayes6.se.f1^2
bayes6.var.f2 <- bayes6.se.f2^2
df.f <- data.frame(bayes6.coef.f1, bayes6.coef.f2, bayes6.var.f1, bayes6.var.f2)

# Credible interval limits are taken from the data set 6 results (columns 6 and 7)
bayes6.low.f1 <- bayes6.par[bayes6.par$paramHeader == "F3.ON" & bayes6.par$param == "F1", 6]
bayes6.high.f1 <- bayes6.par[bayes6.par$paramHeader == "F3.ON" & bayes6.par$param == "F1", 7]
bayes6.cover.f1 <- ifelse(coef1 >= bayes6.low.f1 & coef1 <= bayes6.high.f1, 1, 0)
bayes6.low.f2 <- bayes6.par[bayes6.par$paramHeader == "F3.ON" & bayes6.par$param == "F2", 6]
bayes6.high.f2 <- bayes6.par[bayes6.par$paramHeader == "F3.ON" & bayes6.par$param == "F2", 7]
bayes6.cover.f2 <- ifelse(coef2 >= bayes6.low.f2 & coef2 <= bayes6.high.f2, 1, 0)

###DATA POOLING###
pooled.data <- rbind(sim.data1, sim.data2, sim.data3, sim.data4, sim.data5, sim.data6)

####### Create ML Syntax for Pooled Data Set #######
ml.pooled.script <- mplusObject(
  TITLE = "SEM for pooled data;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = ml;",
  MODEL = "
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 f2;
  ",
  # Confidence intervals are requested through the OUTPUT section
  OUTPUT = "CINTERVAL;",
  usevariables = paste0("x", 1:15),
  rdata = pooled.data)

##RUN POOLED ML IN MPLUS##
ml.pooled.result <- mplusModeler(ml.pooled.script, "pooled.data", modelout = "mlpooled.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR POOLED ML##
ml.pooled.par <- ml.pooled.result$results$parameters$unstandardized
ml.pooled.coef.f1 <- ml.pooled.par[ml.pooled.par$paramHeader == "F3.ON" & ml.pooled.par$param == "F1", 3]
ml.pooled.se.f1 <- ml.pooled.par[ml.pooled.par$paramHeader == "F3.ON" & ml.pooled.par$param == "F1", 4]
ml.pooled.coef.f2 <- ml.pooled.par[ml.pooled.par$paramHeader == "F3.ON" & ml.pooled.par$param == "F2", 3]
ml.pooled.se.f2 <- ml.pooled.par[ml.pooled.par$paramHeader == "F3.ON" & ml.pooled.par$param == "F2", 4]
ml.pooled.coef.f1
ml.pooled.se.f1
ml.pooled.coef.f2
ml.pooled.se.f2

# Columns 4 and 8 of the confidence interval table are used as the lower and upper limits
ml.pooled.ci <- ml.pooled.result$results$parameters$ci.unstandardized
ml.pooled.low.f1 <- ml.pooled.ci[ml.pooled.ci$paramHeader == "F3.ON" & ml.pooled.ci$param == "F1", 4]
ml.pooled.high.f1 <- ml.pooled.ci[ml.pooled.ci$paramHeader == "F3.ON" & ml.pooled.ci$param == "F1", 8]
ml.pooled.cover.f1 <- ifelse(coef1 >= ml.pooled.low.f1 & coef1 <= ml.pooled.high.f1, 1, 0)
ml.pooled.low.f2 <- ml.pooled.ci[ml.pooled.ci$paramHeader == "F3.ON" & ml.pooled.ci$param == "F2", 4]
ml.pooled.high.f2 <- ml.pooled.ci[ml.pooled.ci$paramHeader == "F3.ON" & ml.pooled.ci$param == "F2", 8]
ml.pooled.cover.f2 <- ifelse(coef2 >= ml.pooled.low.f2 & coef2 <= ml.pooled.high.f2, 1, 0)

####### Create Bayes Syntax for Pooled Data Set #######
pooled.script <- mplusObject(
  TITLE = "SEM for pooled data;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = "
    f1 by x1-x5;
    f2 by x6-x10;
    f3 by x11-x15;
    f3 on f1 f2;
  ",
  usevariables = paste0("x", 1:15),
  rdata = pooled.data)

##RUN POOLED BAYES IN MPLUS##
pooled.result <- mplusModeler(pooled.script, "pooled.data", modelout = "pooled.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR POOLED BAYES##
pooled.par <- pooled.result$results$parameters$unstandardized
pooled.coef.f1 <- pooled.par[pooled.par$paramHeader == "F3.ON" & pooled.par$param == "F1", 3]
pooled.se.f1 <- pooled.par[pooled.par$paramHeader == "F3.ON" & pooled.par$param == "F1", 4]
pooled.coef.f2 <- pooled.par[pooled.par$paramHeader == "F3.ON" & pooled.par$param == "F2", 3]
pooled.se.f2 <- pooled.par[pooled.par$paramHeader == "F3.ON" & pooled.par$param == "F2", 4]
pooled.coef.f1
pooled.se.f1
pooled.coef.f2
pooled.se.f2

# Columns 6 and 7 are used as the lower and upper credible interval limits
pooled.low.f1 <- pooled.par[pooled.par$paramHeader == "F3.ON" & pooled.par$param == "F1", 6]
pooled.high.f1 <- pooled.par[pooled.par$paramHeader == "F3.ON" & pooled.par$param == "F1", 7]
pooled.cover.f1 <- ifelse(coef1 >= pooled.low.f1 & coef1 <= pooled.high.f1, 1, 0)
pooled.low.f2 <- pooled.par[pooled.par$paramHeader == "F3.ON" & pooled.par$param == "F2", 6]
pooled.high.f2 <- pooled.par[pooled.par$paramHeader == "F3.ON" & pooled.par$param == "F2", 7]
pooled.cover.f2 <- ifelse(coef2 >= pooled.low.f2 & coef2 <= pooled.high.f2, 1, 0)

###AGDP USE OF AVERAGE PRIOR ESTIMATES###

#mean() averages only its first argument (the next two are read as trim and na.rm),
#so the five estimates are combined with c() before averaging

bayes.coef.f1.mean<-mean(c(bayes1.coef.f1,bayes2.coef.f1,bayes3.coef.f1,bayes4.coef.f1,bayes5.coef.f1))

bayes.coef.f2.mean<-mean(c(bayes1.coef.f2,bayes2.coef.f2,bayes3.coef.f2,bayes4.coef.f2,bayes5.coef.f2))

bayes.var.f1.mean<-mean(c(bayes1.var.f1,bayes2.var.f1,bayes3.var.f1,bayes4.var.f1,bayes5.var.f1))

bayes.var.f2.mean<-mean(c(bayes1.var.f2,bayes2.var.f2,bayes3.var.f2,bayes4.var.f2,bayes5.var.f2))

df.f<-data.frame(bayes.coef.f1.mean,bayes.coef.f2.mean,bayes.var.f1.mean,bayes.var.f2.mean)

agdp.script <- mplusObject(
  TITLE = "SEM for AGDP;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = loopReplace("
  f1 by x1-x5;
  f2 by x6-x10;
  f3 by x11-x15;
  f3 on f1 (beta1);
  f3 on f2 (beta2);
  MODEL PRIORS:
  beta1~N([[bayes.coef.f1.mean]], [[bayes.var.f1.mean]]);
  beta2~N([[bayes.coef.f2.mean]], [[bayes.var.f2.mean]]);
  ", df.f),
  usevariables=c("x1","x2","x3","x4","x5","x6","x7","x8","x9","x10",
                 "x11","x12","x13","x14","x15"),
  rdata=sim.data6)

##RUN AGDP BAYES IN MPLUS##

agdp.result = mplusModeler(agdp.script, "sim.data6", modelout = "agdp.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR AGDP BAYES##

agdp.coef.f1<-agdp.result$results$parameters$unstandardized[agdp.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & agdp.result$results$parameters$unstandardized$param == 'F1',3]

agdp.se.f1<-agdp.result$results$parameters$unstandardized[agdp.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & agdp.result$results$parameters$unstandardized$param == 'F1',4]

agdp.coef.f2<-agdp.result$results$parameters$unstandardized[agdp.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & agdp.result$results$parameters$unstandardized$param == 'F2',3]

agdp.se.f2<-agdp.result$results$parameters$unstandardized[agdp.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & agdp.result$results$parameters$unstandardized$param == 'F2',4]

agdp.coef.f1

agdp.se.f1

agdp.coef.f2

agdp.se.f2

agdp.low.f1<-agdp.result$results$parameters$unstandardized[agdp.result$results$parameters$unstandardized$paramHeader=='F3.ON' & agdp.result$results$parameters$unstandardized$param=='F1',6]

agdp.high.f1<-agdp.result$results$parameters$unstandardized[agdp.result$results$parameters$unstandardized$paramHeader=='F3.ON' & agdp.result$results$parameters$unstandardized$param=='F1',7]

agdp.cover.f1<-ifelse(coef1>=agdp.low.f1 & coef1<=agdp.high.f1,1,0)

agdp.low.f2<-agdp.result$results$parameters$unstandardized[agdp.result$results$parameters$unstandardized$paramHeader=='F3.ON' & agdp.result$results$parameters$unstandardized$param=='F2',6]

agdp.high.f2<-agdp.result$results$parameters$unstandardized[agdp.result$results$parameters$unstandardized$paramHeader=='F3.ON' & agdp.result$results$parameters$unstandardized$param=='F2',7]

agdp.cover.f2<-ifelse(coef2>=agdp.low.f2 & coef2<=agdp.high.f2,1,0)

###PARTIAL DISCOUNTING PRIORS USING bayesDP###

##BASE THE HISTORICAL DATA ON THE AGDP VALUES##

####### Create Bayes Syntax for Sixth Data Set using Naive priors #######

bayes6b.script <- mplusObject(
  TITLE = "SEM for data6 naive priors;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = "
  f1 by x1-x5;
  f2 by x6-x10;
  f3 by x11-x15;
  f3 on f1 f2;
  ",
  usevariables=c("x1","x2","x3","x4","x5","x6","x7","x8","x9","x10",
                 "x11","x12","x13","x14","x15"),
  rdata=sim.data6)

##RUN NAIVE-PRIOR BAYES MODEL FOR DATA SET 6 IN MPLUS##

bayes6b.result = mplusModeler(bayes6b.script, "sim.data6", modelout = "bayes6b.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR BAYES6B##

bayes6b.coef.f1<-bayes6b.result$results$parameters$unstandardized[bayes6b.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & bayes6b.result$results$parameters$unstandardized$param == 'F1',3]

bayes6b.se.f1<-bayes6b.result$results$parameters$unstandardized[bayes6b.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & bayes6b.result$results$parameters$unstandardized$param == 'F1',4]

bayes6b.coef.f2<-bayes6b.result$results$parameters$unstandardized[bayes6b.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & bayes6b.result$results$parameters$unstandardized$param == 'F2',3]

#the original line read from bayes1.result here; bayes6b.result is the fitted object for this model
bayes6b.se.f2<-bayes6b.result$results$parameters$unstandardized[bayes6b.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & bayes6b.result$results$parameters$unstandardized$param == 'F2',4]

bayes_dp.random.coef.f1.fit<-bdpnormal(mu_t=bayes6b.coef.f1,sigma_t=bayes6b.se.f1,N_t=nrow(sim.data6),
  mu0_t=pooled.coef.f1,sigma0_t=sqrt(pooled.se.f1),N0_t=nrow(pooled.data), method="mc")

bayes_dp.random.coef.f1 <- round(median(bayes_dp.random.coef.f1.fit$posterior_treatment$posterior_mu),4)

#left unrounded, as for the other bayes_dp.*.se values (round() with no digits would return a whole number)
bayes_dp.random.se.f1 <- (mean(bayes_dp.random.coef.f1.fit$posterior_treatment$posterior_sigma2))

summary(bayes_dp.random.coef.f1.fit)

bayes_dp.random.coef.f2.fit<-bdpnormal(mu_t=bayes6b.coef.f2,sigma_t=bayes6b.se.f2,N_t=nrow(sim.data6),
  mu0_t=pooled.coef.f2,sigma0_t=sqrt(pooled.se.f2),N0_t=nrow(pooled.data), method="mc")

bayes_dp.random.coef.f2 <- round(median(bayes_dp.random.coef.f2.fit$posterior_treatment$posterior_mu),4)

bayes_dp.random.se.f2 <- (mean(bayes_dp.random.coef.f2.fit$posterior_treatment$posterior_sigma2))

summary(bayes_dp.random.coef.f2.fit)

bayes_dp.75.coef.f1.fit<-bdpnormal(mu_t=bayes6b.coef.f1,sigma_t=bayes6b.se.f1,N_t=nrow(sim.data6),
  mu0_t=pooled.coef.f1,sigma0_t=sqrt(pooled.se.f1),N0_t=nrow(pooled.data), method="mc", alpha_max=0.75, fix_alpha=TRUE)

bayes_dp.75.coef.f1 <- round(median(bayes_dp.75.coef.f1.fit$posterior_treatment$posterior_mu),4)

bayes_dp.75.se.f1 <- (mean(bayes_dp.75.coef.f1.fit$posterior_treatment$posterior_sigma2))

summary(bayes_dp.75.coef.f1.fit)

bayes_dp.75.coef.f2.fit<-bdpnormal(mu_t=bayes6b.coef.f2,sigma_t=bayes6b.se.f2,N_t=nrow(sim.data6),
  mu0_t=pooled.coef.f2,sigma0_t=sqrt(pooled.se.f2),N0_t=nrow(pooled.data), method="mc", alpha_max=0.75, fix_alpha=TRUE)

bayes_dp.75.coef.f2 <- round(median(bayes_dp.75.coef.f2.fit$posterior_treatment$posterior_mu),4)

bayes_dp.75.se.f2 <- (mean(bayes_dp.75.coef.f2.fit$posterior_treatment$posterior_sigma2))

summary(bayes_dp.75.coef.f2.fit)

bayes_dp.50.coef.f1.fit<-bdpnormal(mu_t=bayes6b.coef.f1,sigma_t=bayes6b.se.f1,N_t=nrow(sim.data6),
  mu0_t=pooled.coef.f1,sigma0_t=sqrt(pooled.se.f1),N0_t=nrow(pooled.data), method="mc", alpha_max=0.50, fix_alpha=TRUE)

bayes_dp.50.coef.f1 <- round(median(bayes_dp.50.coef.f1.fit$posterior_treatment$posterior_mu),4)

bayes_dp.50.se.f1 <- (mean(bayes_dp.50.coef.f1.fit$posterior_treatment$posterior_sigma2))

summary(bayes_dp.50.coef.f1.fit)

bayes_dp.50.coef.f2.fit<-bdpnormal(mu_t=bayes6b.coef.f2,sigma_t=bayes6b.se.f2,N_t=nrow(sim.data6),
  mu0_t=pooled.coef.f2,sigma0_t=sqrt(pooled.se.f2),N0_t=nrow(pooled.data), method="mc", alpha_max=0.50, fix_alpha=TRUE)

bayes_dp.50.coef.f2 <- round(median(bayes_dp.50.coef.f2.fit$posterior_treatment$posterior_mu),4)

bayes_dp.50.se.f2 <- (mean(bayes_dp.50.coef.f2.fit$posterior_treatment$posterior_sigma2))

summary(bayes_dp.50.coef.f2.fit)

bayes_dp.25.coef.f1.fit<-bdpnormal(mu_t=bayes6b.coef.f1,sigma_t=bayes6b.se.f1,N_t=nrow(sim.data6),
  mu0_t=pooled.coef.f1,sigma0_t=sqrt(pooled.se.f1),N0_t=nrow(pooled.data), method="mc", alpha_max=0.25, fix_alpha=TRUE)

bayes_dp.25.coef.f1 <- round(median(bayes_dp.25.coef.f1.fit$posterior_treatment$posterior_mu),4)

bayes_dp.25.se.f1 <- (mean(bayes_dp.25.coef.f1.fit$posterior_treatment$posterior_sigma2))

summary(bayes_dp.25.coef.f1.fit)

bayes_dp.25.coef.f2.fit<-bdpnormal(mu_t=bayes6b.coef.f2,sigma_t=bayes6b.se.f2,N_t=nrow(sim.data6),
  mu0_t=pooled.coef.f2,sigma0_t=sqrt(pooled.se.f2),N0_t=nrow(pooled.data), method="mc", alpha_max=0.25, fix_alpha=TRUE)

bayes_dp.25.coef.f2 <- round(median(bayes_dp.25.coef.f2.fit$posterior_treatment$posterior_mu),4)

bayes_dp.25.se.f2 <- (mean(bayes_dp.25.coef.f2.fit$posterior_treatment$posterior_sigma2))

summary(bayes_dp.25.coef.f2.fit)

bayes_dp.random.low.f1<-quantile(bayes_dp.random.coef.f1.fit$posterior_treatment$posterior_mu, .025)

bayes_dp.random.high.f1<-quantile(bayes_dp.random.coef.f1.fit$posterior_treatment$posterior_mu, .975)

bayes_dp.random.cover.f1<-ifelse(coef1>=bayes_dp.random.low.f1 & coef1<=bayes_dp.random.high.f1,1,0)

bayes_dp.random.low.f2<-quantile(bayes_dp.random.coef.f2.fit$posterior_treatment$posterior_mu, .025)

bayes_dp.random.high.f2<-quantile(bayes_dp.random.coef.f2.fit$posterior_treatment$posterior_mu, .975)

bayes_dp.random.cover.f2<-ifelse(coef2>=bayes_dp.random.low.f2 & coef2<=bayes_dp.random.high.f2,1,0)

bayes_dp.75.low.f1<-quantile(bayes_dp.75.coef.f1.fit$posterior_treatment$posterior_mu, .025)

bayes_dp.75.high.f1<-quantile(bayes_dp.75.coef.f1.fit$posterior_treatment$posterior_mu, .975)

bayes_dp.75.cover.f1<-ifelse(coef1>=bayes_dp.75.low.f1 & coef1<=bayes_dp.75.high.f1,1,0)

bayes_dp.75.low.f2<-quantile(bayes_dp.75.coef.f2.fit$posterior_treatment$posterior_mu, .025)

bayes_dp.75.high.f2<-quantile(bayes_dp.75.coef.f2.fit$posterior_treatment$posterior_mu, .975)

bayes_dp.75.cover.f2<-ifelse(coef2>=bayes_dp.75.low.f2 & coef2<=bayes_dp.75.high.f2,1,0)

bayes_dp.50.low.f1<-quantile(bayes_dp.50.coef.f1.fit$posterior_treatment$posterior_mu, .025)

bayes_dp.50.high.f1<-quantile(bayes_dp.50.coef.f1.fit$posterior_treatment$posterior_mu, .975)

bayes_dp.50.cover.f1<-ifelse(coef1>=bayes_dp.50.low.f1 & coef1<=bayes_dp.50.high.f1,1,0)

bayes_dp.50.low.f2<-quantile(bayes_dp.50.coef.f2.fit$posterior_treatment$posterior_mu, .025)

bayes_dp.50.high.f2<-quantile(bayes_dp.50.coef.f2.fit$posterior_treatment$posterior_mu, .975)

bayes_dp.50.cover.f2<-ifelse(coef2>=bayes_dp.50.low.f2 & coef2<=bayes_dp.50.high.f2,1,0)

bayes_dp.25.low.f1<-quantile(bayes_dp.25.coef.f1.fit$posterior_treatment$posterior_mu, .025)

bayes_dp.25.high.f1<-quantile(bayes_dp.25.coef.f1.fit$posterior_treatment$posterior_mu, .975)

bayes_dp.25.cover.f1<-ifelse(coef1>=bayes_dp.25.low.f1 & coef1<=bayes_dp.25.high.f1,1,0)

bayes_dp.25.low.f2<-quantile(bayes_dp.25.coef.f2.fit$posterior_treatment$posterior_mu, .025)

bayes_dp.25.high.f2<-quantile(bayes_dp.25.coef.f2.fit$posterior_treatment$posterior_mu, .975)

bayes_dp.25.cover.f2<-ifelse(coef2>=bayes_dp.25.low.f2 & coef2<=bayes_dp.25.high.f2,1,0)

###META ANALYSIS FOR BAYES PRIORS###

#CREATE FILE FOR PRIOR COEFFICIENT AND STANDARD ERROR ESTIMATES#

bayes.coef.f1.meta<-c(bayes1.coef.f1,bayes2.coef.f1,bayes3.coef.f1,bayes4.coef.f1,bayes5.coef.f1)

bayes.coef.f2.meta<-c(bayes1.coef.f2,bayes2.coef.f2,bayes3.coef.f2,bayes4.coef.f2,bayes5.coef.f2)

bayes.se.f1.meta<-c(bayes1.se.f1,bayes2.se.f1,bayes3.se.f1,bayes4.se.f1,bayes5.se.f1)

bayes.se.f2.meta<-c(bayes1.se.f2,bayes2.se.f2,bayes3.se.f2,bayes4.se.f2,bayes5.se.f2)

df.meta<-data.frame(bayes.coef.f1.meta,bayes.coef.f2.meta,bayes.se.f1.meta,bayes.se.f2.meta)

#RUN META ANALYSIS FOR PREVIOUS ESTIMATES#

f1.meta<-rma(yi=bayes.coef.f1.meta, sei=bayes.se.f1.meta, data=df.meta, method="DL")

coef.f1.meta<-as.numeric(f1.meta$beta)

se.f1.meta<-as.numeric(f1.meta$se)

f2.meta<-rma(yi=bayes.coef.f2.meta, sei=bayes.se.f2.meta, data=df.meta, method="DL")

coef.f2.meta<-as.numeric(f2.meta$beta)

se.f2.meta<-as.numeric(f2.meta$se)

meta.var.f1<-se.f1.meta^2

meta.var.f2<-se.f2.meta^2

df.f<-data.frame(coef.f1.meta,coef.f2.meta,meta.var.f1,meta.var.f2)

####### Create Bayes Syntax for Sixth Data Set for META ANALYSIS #######

meta.script <- mplusObject(
  TITLE = "SEM for META;",
  VARIABLE = "USEVARIABLES = x1-x15;",
  ANALYSIS = "ESTIMATOR = bayes;",
  MODEL = loopReplace("
  f1 by x1-x5;
  f2 by x6-x10;
  f3 by x11-x15;
  f3 on f1 (beta1);
  f3 on f2 (beta2);
  MODEL PRIORS:
  beta1~N([[coef.f1.meta]], [[meta.var.f1]]);
  beta2~N([[coef.f2.meta]], [[meta.var.f2]]);
  ", df.f),
  usevariables=c("x1","x2","x3","x4","x5","x6","x7","x8","x9","x10",
                 "x11","x12","x13","x14","x15"),
  rdata=sim.data6)

##RUN META BAYES IN MPLUS##

meta.result = mplusModeler(meta.script, "sim.data6", modelout = "meta.inp", run = 1L)

##EXTRACT STRUCTURE COEFFICIENTS AND STANDARD ERRORS FOR META BAYES##

meta.coef.f1<-meta.result$results$parameters$unstandardized[meta.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & meta.result$results$parameters$unstandardized$param == 'F1',3]

meta.se.f1<-meta.result$results$parameters$unstandardized[meta.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & meta.result$results$parameters$unstandardized$param == 'F1',4]

meta.coef.f2<-meta.result$results$parameters$unstandardized[meta.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & meta.result$results$parameters$unstandardized$param == 'F2',3]

meta.se.f2<-meta.result$results$parameters$unstandardized[meta.result$results$parameters$unstandardized$paramHeader == 'F3.ON' & meta.result$results$parameters$unstandardized$param == 'F2',4]

meta.coef.f1

meta.se.f1

meta.coef.f2

meta.se.f2

meta.low.f1<-meta.result$results$parameters$unstandardized[meta.result$results$parameters$unstandardized$paramHeader=='F3.ON' & meta.result$results$parameters$unstandardized$param=='F1',6]

meta.high.f1<-meta.result$results$parameters$unstandardized[meta.result$results$parameters$unstandardized$paramHeader=='F3.ON' & meta.result$results$parameters$unstandardized$param=='F1',7]

meta.cover.f1<-ifelse(coef1>=meta.low.f1 & coef1<=meta.high.f1,1,0)

meta.low.f2<-meta.result$results$parameters$unstandardized[meta.result$results$parameters$unstandardized$paramHeader=='F3.ON' & meta.result$results$parameters$unstandardized$param=='F2',6]

meta.high.f2<-meta.result$results$parameters$unstandardized[meta.result$results$parameters$unstandardized$paramHeader=='F3.ON' & meta.result$results$parameters$unstandardized$param=='F2',7]

meta.cover.f2<-ifelse(coef2>=meta.low.f2 & coef2<=meta.high.f2,1,0)

###RBEST ANALYSIS###

rbest.study<-c(1,2,3,4,5)

rbest.n<-c(nrow(sim.data1),nrow(sim.data2),nrow(sim.data3),nrow(sim.data4),nrow(sim.data5))

rbest.coef.f1<-c(bayes1.coef.f1,bayes2.coef.f1,bayes3.coef.f1,bayes4.coef.f1,bayes5.coef.f1)

rbest.se.f1<-c(bayes1.se.f1,bayes2.se.f1,bayes3.se.f1,bayes4.se.f1,bayes5.se.f1)

rbest.coef.f2<-c(bayes1.coef.f2,bayes2.coef.f2,bayes3.coef.f2,bayes4.coef.f2,bayes5.coef.f2)

rbest.se.f2<-c(bayes1.se.f2,bayes2.se.f2,bayes3.se.f2,bayes4.se.f2,bayes5.se.f2)

df.rbest.f1<-data.frame(rbest.coef.f1,rbest.se.f1,rbest.n,rbest.study)

df.rbest.f2<-data.frame(rbest.coef.f2,rbest.se.f2,rbest.n,rbest.study)

map_f1<-gMAP(cbind(rbest.coef.f1,rbest.se.f1)~1|rbest.study, family=gaussian,
  data=df.rbest.f1, weights=rbest.n,
  tau.dist="HalfNormal",tau.prior=44, beta.prior=cbind(0,88))

map_f2<-gMAP(cbind(rbest.coef.f2,rbest.se.f2)~1|rbest.study, family=gaussian,
  data=df.rbest.f2, weights=rbest.n,
  tau.dist="HalfNormal",tau.prior=44, beta.prior=cbind(0,88))

map_f1.auto <- automixfit(map_f1, Nc=3)

map_f1.auto

map_f1.robust <- robustify(map_f1.auto, weight = .1, mean = 0, sigma=10)

map_f1.robust

map_f2.auto <- automixfit(map_f2, Nc=3)

map_f2.auto

map_f2.robust <- robustify(map_f2.auto, weight = .1, mean = 0, sigma=10)

map_f2.robust

rbest.auto.f1<-postmix(map_f1.auto,m=bayes6.coef.f1,n=nrow(sim.data6))

rbest.robust.f1<-postmix(map_f1.robust,m=bayes6.coef.f1,n=nrow(sim.data6))

rbest.auto.f2<-postmix(map_f2.auto,m=bayes6.coef.f2,n=nrow(sim.data6))

rbest.robust.f2<-postmix(map_f2.robust,m=bayes6.coef.f2,n=nrow(sim.data6))

rbest.auto.coef.f1<-rbest.auto.f1[1,1]*rbest.auto.f1[2,1]+rbest.auto.f1[1,2]*rbest.auto.f1[2,2]+rbest.auto.f1[1,3]*rbest.auto.f1[2,3]

rbest.auto.se.f1<-rbest.auto.f1[1,1]*rbest.auto.f1[3,1]+rbest.auto.f1[1,2]*rbest.auto.f1[3,2]+rbest.auto.f1[1,3]*rbest.auto.f1[3,3]

rbest.auto.coef.f2<-rbest.auto.f2[1,1]*rbest.auto.f2[2,1]+rbest.auto.f2[1,2]*rbest.auto.f2[2,2]+rbest.auto.f2[1,3]*rbest.auto.f2[2,3]

rbest.auto.se.f2<-rbest.auto.f2[1,1]*rbest.auto.f2[3,1]+rbest.auto.f2[1,2]*rbest.auto.f2[3,2]+rbest.auto.f2[1,3]*rbest.auto.f2[3,3]

rbest.robust.coef.f1<-rbest.robust.f1[1,1]*rbest.robust.f1[2,1]+rbest.robust.f1[1,2]*rbest.robust.f1[2,2]+rbest.robust.f1[1,3]*rbest.robust.f1[2,3]

rbest.robust.se.f1<-rbest.robust.f1[1,1]*rbest.robust.f1[3,1]+rbest.robust.f1[1,2]*rbest.robust.f1[3,2]+rbest.robust.f1[1,3]*rbest.robust.f1[3,3]

rbest.robust.coef.f2<-rbest.robust.f2[1,1]*rbest.robust.f2[2,1]+rbest.robust.f2[1,2]*rbest.robust.f2[2,2]+rbest.robust.f2[1,3]*rbest.robust.f2[2,3]

rbest.robust.se.f2<-rbest.robust.f2[1,1]*rbest.robust.f2[3,1]+rbest.robust.f2[1,2]*rbest.robust.f2[3,2]+rbest.robust.f2[1,3]*rbest.robust.f2[3,3]

rbest.auto.f1.hi<-rbest.auto.coef.f1+2*rbest.auto.se.f1

rbest.auto.f1.lo<-rbest.auto.coef.f1-2*rbest.auto.se.f1

rbest.robust.f1.hi<-rbest.robust.coef.f1+2*rbest.robust.se.f1

rbest.robust.f1.lo<-rbest.robust.coef.f1-2*rbest.robust.se.f1

rbest.auto.f2.hi<-rbest.auto.coef.f2+2*rbest.auto.se.f2

rbest.auto.f2.lo<-rbest.auto.coef.f2-2*rbest.auto.se.f2

rbest.robust.f2.hi<-rbest.robust.coef.f2+2*rbest.robust.se.f2

rbest.robust.f2.lo<-rbest.robust.coef.f2-2*rbest.robust.se.f2

rbest.auto.f1.cover<-ifelse((coef1>=rbest.auto.f1.lo & coef1<=rbest.auto.f1.hi),1,0)

rbest.auto.f2.cover<-ifelse((coef2>=rbest.auto.f2.lo & coef2<=rbest.auto.f2.hi),1,0)

rbest.robust.f1.cover<-ifelse((coef1>=rbest.robust.f1.lo & coef1<=rbest.robust.f1.hi),1,0)

rbest.robust.f2.cover<-ifelse((coef2>=rbest.robust.f2.lo & coef2<=rbest.robust.f2.hi),1,0)

###CALCULATE BIAS###

mle.bias.f1<-(mle.coef.f1-coef1)/coef1

mle.bias.f2<-(mle.coef.f2-coef2)/coef2

bayes1.bias.f1<-(bayes1.coef.f1-coef1)/coef1

bayes1.bias.f2<-(bayes1.coef.f2-coef2)/coef2

bayes6.bias.f1<-(bayes6.coef.f1-coef1)/coef1

bayes6.bias.f2<-(bayes6.coef.f2-coef2)/coef2

ml.pooled.bias.f1<-(ml.pooled.coef.f1-coef1)/coef1

ml.pooled.bias.f2<-(ml.pooled.coef.f2-coef2)/coef2

pooled.bias.f1<-(pooled.coef.f1-coef1)/coef1

pooled.bias.f2<-(pooled.coef.f2-coef2)/coef2

agdp.bias.f1<-(agdp.coef.f1-coef1)/coef1

agdp.bias.f2<-(agdp.coef.f2-coef2)/coef2

meta.bias.f1<-(meta.coef.f1-coef1)/coef1

meta.bias.f2<-(meta.coef.f2-coef2)/coef2

bayes_dp.random.bias.f1<-(bayes_dp.random.coef.f1-coef1)/coef1

bayes_dp.random.bias.f2<-(bayes_dp.random.coef.f2-coef2)/coef2

bayes_dp.75.bias.f1<-(bayes_dp.75.coef.f1-coef1)/coef1

bayes_dp.75.bias.f2<-(bayes_dp.75.coef.f2-coef2)/coef2

bayes_dp.50.bias.f1<-(bayes_dp.50.coef.f1-coef1)/coef1

bayes_dp.50.bias.f2<-(bayes_dp.50.coef.f2-coef2)/coef2

bayes_dp.25.bias.f1<-(bayes_dp.25.coef.f1-coef1)/coef1

bayes_dp.25.bias.f2<-(bayes_dp.25.coef.f2-coef2)/coef2

rbest.auto.bias.f1<-(rbest.auto.coef.f1-coef1)/coef1

rbest.auto.bias.f2<-(rbest.auto.coef.f2-coef2)/coef2

rbest.robust.bias.f1<-(rbest.robust.coef.f1-coef1)/coef1

rbest.robust.bias.f2<-(rbest.robust.coef.f2-coef2)/coef2

####COMBINE RESULTS####

coef.results<-cbind(mle.coef.f1,bayes1.coef.f1,bayes6.coef.f1,ml.pooled.coef.f1,pooled.coef.f1,agdp.coef.f1,meta.coef.f1,

bayes_dp.random.coef.f1,bayes_dp.75.coef.f1,bayes_dp.50.coef.f1,bayes_dp.25.coef.f1,rbest.auto.coef.f1,rbest.robust.coef.f1,

mle.coef.f2,bayes1.coef.f2,bayes6.coef.f2,ml.pooled.coef.f2,pooled.coef.f2,agdp.coef.f2,meta.coef.f2,

bayes_dp.random.coef.f2,bayes_dp.75.coef.f2,bayes_dp.50.coef.f2,bayes_dp.25.coef.f2,rbest.auto.coef.f2,rbest.robust.coef.f2)

se.results<-cbind(mle.se.f1,bayes1.se.f1,bayes6.se.f1,ml.pooled.se.f1,pooled.se.f1,agdp.se.f1,meta.se.f1,

bayes_dp.random.se.f1,bayes_dp.75.se.f1,bayes_dp.50.se.f1,bayes_dp.25.se.f1,rbest.auto.se.f1,rbest.robust.se.f1,

mle.se.f2,bayes1.se.f2,bayes6.se.f2,ml.pooled.se.f2,pooled.se.f2,agdp.se.f2,meta.se.f2,

bayes_dp.random.se.f2,bayes_dp.75.se.f2,bayes_dp.50.se.f2,bayes_dp.25.se.f2,rbest.auto.se.f2,rbest.robust.se.f2)

cover.results<-cbind(mle.cover.f1,mle.cover.f2,bayes1.cover.f1,bayes1.cover.f2,bayes6.cover.f1,bayes6.cover.f2,

ml.pooled.cover.f1,ml.pooled.cover.f2,pooled.cover.f1,pooled.cover.f2,agdp.cover.f1,agdp.cover.f2,meta.cover.f1,meta.cover.f2,

bayes_dp.random.cover.f1,bayes_dp.random.cover.f2,bayes_dp.75.cover.f1,bayes_dp.75.cover.f2,

bayes_dp.50.cover.f1,bayes_dp.50.cover.f2,bayes_dp.25.cover.f1,bayes_dp.25.cover.f2,rbest.auto.f1.cover,rbest.auto.f2.cover,

rbest.robust.f1.cover,rbest.robust.f2.cover)

bias.results<-cbind(mle.bias.f1,bayes1.bias.f1,bayes6.bias.f1,ml.pooled.bias.f1,pooled.bias.f1,agdp.bias.f1,meta.bias.f1,

bayes_dp.random.bias.f1,bayes_dp.75.bias.f1,bayes_dp.50.bias.f1,bayes_dp.25.bias.f1,rbest.auto.bias.f1,rbest.robust.bias.f1,

mle.bias.f2,bayes1.bias.f2,bayes6.bias.f2,ml.pooled.bias.f2,pooled.bias.f2,agdp.bias.f2,meta.bias.f2,

bayes_dp.random.bias.f2,bayes_dp.75.bias.f2,bayes_dp.50.bias.f2,bayes_dp.25.bias.f2,rbest.auto.bias.f2,rbest.robust.bias.f2)

simulation1.coef<-rbind(simulation1.coef, coef.results)

simulation1.se<-rbind(simulation1.se, se.results)

simulation1.cover<-rbind(simulation1.cover, cover.results)

simulation1.bias<-rbind(simulation1.bias, bias.results)

}

#CALCULATE MSE#

actual1<-rep(coef1,z)

actual2<-rep(coef2,z)

ml.mse1<-MSE(actual1,simulation1.coef[,1])

bayes1.mse1<-MSE(actual1,simulation1.coef[,2])

bayes6.mse1<-MSE(actual1,simulation1.coef[,3])

ml_pool.mse1<-MSE(actual1,simulation1.coef[,4])

bayes_pool.mse1<-MSE(actual1,simulation1.coef[,5])

agdp.mse1<-MSE(actual1,simulation1.coef[,6])

meta.mse1<-MSE(actual1,simulation1.coef[,7])

dp_random.mse1<-MSE(actual1,simulation1.coef[,8])

dp_75.mse1<-MSE(actual1,simulation1.coef[,9])

dp_50.mse1<-MSE(actual1,simulation1.coef[,10])

dp_25.mse1<-MSE(actual1,simulation1.coef[,11])

rbest_auto.mse1<-MSE(actual1,simulation1.coef[,12])

rbest_robust.mse1<-MSE(actual1,simulation1.coef[,13])

ml.mse2<-MSE(actual2,simulation1.coef[,14])

bayes1.mse2<-MSE(actual2,simulation1.coef[,15])

bayes6.mse2<-MSE(actual2,simulation1.coef[,16])

ml_pool.mse2<-MSE(actual2,simulation1.coef[,17])

bayes_pool.mse2<-MSE(actual2,simulation1.coef[,18])

agdp.mse2<-MSE(actual2,simulation1.coef[,19])

meta.mse2<-MSE(actual2,simulation1.coef[,20])

dp_random.mse2<-MSE(actual2,simulation1.coef[,21])

dp_75.mse2<-MSE(actual2,simulation1.coef[,22])

dp_50.mse2<-MSE(actual2,simulation1.coef[,23])

dp_25.mse2<-MSE(actual2,simulation1.coef[,24])

rbest_auto.mse2<-MSE(actual2,simulation1.coef[,25])

rbest_robust.mse2<-MSE(actual2,simulation1.coef[,26])

simulation1.mse<-cbind(ml.mse1,bayes1.mse1,bayes6.mse1,ml_pool.mse1,bayes_pool.mse1,agdp.mse1,meta.mse1,

dp_random.mse1,dp_75.mse1,dp_50.mse1,dp_25.mse1,rbest_auto.mse1,rbest_robust.mse1,

ml.mse2,bayes1.mse2,bayes6.mse2,ml_pool.mse2,bayes_pool.mse2,agdp.mse2,meta.mse2,

dp_random.mse2,dp_75.mse2,dp_50.mse2,dp_25.mse2,rbest_auto.mse2,rbest_robust.mse2)

simulation1.coef.mean<-colMeans(simulation1.coef)

simulation1.se.mean<-colMeans(simulation1.se)

simulation1.cover.mean<-colMeans(simulation1.cover)

simulation1.bias.mean<-colMeans(simulation1.bias)

simulation1.mse.mean<-colMeans(simulation1.mse)
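
The summary statistics accumulated by the script above (relative bias, MSE, and interval coverage) can be illustrated in isolation. The following is a minimal base R sketch, not part of the original simulation: the true coefficient, the number of replications, and the simulated estimates are all invented for illustration, and a normal-theory 95% interval stands in for the Mplus credible interval.

```r
# Toy illustration only: every value below is a hypothetical stand-in.
set.seed(123)
true_coef <- 0.5     # hypothetical population structure coefficient
n_reps    <- 1000    # hypothetical number of replications

est <- rnorm(n_reps, mean = 0.55, sd = 0.10)  # simulated point estimates
se  <- rep(0.10, n_reps)                      # simulated standard errors

# 95% interval limits and a 0/1 coverage indicator for each replication
lower <- est - qnorm(0.975) * se
upper <- est + qnorm(0.975) * se
cover <- as.integer(true_coef >= lower & true_coef <= upper)

rel_bias <- (est - true_coef) / true_coef  # relative bias per replication
mse      <- mean((est - true_coef)^2)      # mean squared error

summary_stats <- c(mean_est = mean(est),
                   rel_bias = mean(rel_bias),
                   mse      = mse,
                   coverage = mean(cover))
print(round(summary_stats, 4))
```

Because MSE equals the squared bias plus the variance of the estimates, a method can show low bias and still have a large MSE when its estimates vary widely across replications.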

References

Gill, J. Bayesian Methods: A Social and Behavioral Sciences Approach; Chapman & Hall/CRC: Boca Raton, FL, USA, 2008.
Kaplan, D. Bayesian Statistics for the Social Sciences; The Guilford Press: New York, NY, USA, 2016.
Congdon, P. Applied Bayesian Modeling; Wiley: New York, NY, USA, 2014.
Smid, S.C.; Winter, S.D. Dangers of the defaults: A tutorial on the impact of default priors when using Bayesian SEM with small samples. Front. Psychol. 2020, 11, 611963.
Du, H.; Bradbury, T.N.; Lavner, J.A.; Meltzer, A.L.; McNulty, J.K.; Neff, L.A.; Karney, B.R. A comparison of Bayesian synthesis approaches for studies comparing two means: A tutorial. Res. Synth. Methods 2020, 11, 36–65.
Haddad, T.; Himes, A.; Thompson, L.; Irony, T.; Nair, R. Incorporation of stochastic engineering models as prior information in Bayesian medical device trials. J. Biopharm. Stat. 2017, 27, 1089–1103.
Ibrahim, J.G.; Chen, M.H.; Gwon, Y.; Chen, F. The power prior: Theory and applications. Stat. Med. 2015, 34, 3724–3749.
Marcoulides, K.M. A Bayesian Synthesis Approach to Data Fusion Using Augmented Data-Dependent Priors. Unpublished Doctoral Dissertation, Arizona State University, Tempe, AZ, USA, 2017.
Liu, G.F. A dynamic power prior for borrowing historical data in noninferiority trials with binary endpoint. Pharm. Stat. 2018, 17, 61–73.
Kaplan, D.; Chen, J.; Yavuz, S.; Lyu, W. Bayesian dynamic borrowing of historical information with applications to the analysis of large-scale assessments. Psychometrika 2023, 88, 1–30.
Vehtari, A.; Gelman, A.; Gabry, J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat. Comput. 2017, 27, 1413–1432.
Held, L.; Micheloud, C.; Pawel, S. The assessment of replication success based on relative effect size. Ann. Appl. Stat. 2022, 16, 706–720.
Schmidli, H.; Gsteiger, S.; Roychoudhury, S.; O’Hagan, A.; Spiegelhalter, D.; Neuenschwander, B. Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics 2014, 70, 1023–1032.
Viele, K.; Berry, S.; Neuenschwander, B.; Amzal, B.; Chen, F.; Enas, N.; Thompson, L. Use of historical control data for assessing treatment effects in clinical trials. Pharm. Stat. 2014, 13, 41–54.
Centers for Disease Control and Prevention. Youth Risk Behavior Survey, Data Summary and Trends Report. 2011–2021. Available online: https://www.cdc.gov/healthyyouth/data/yrbs/pdf/YRBS_Data-Summary-Trends_Report2023_508.pdf (accessed on 19 November 2023).
Card, N. Applied Meta-Analysis for Social Science Research; The Guilford Press: New York, NY, USA, 2012.
DerSimonian, R.; Laird, N. Meta-analysis in clinical trials revisited. Contemp. Clin. Trials 2015, 45, 139–145.
Gelman, A.; Stern, H.S.; Carlin, J.B.; Dunson, D.B.; Vehtari, A.; Rubin, D.B. Bayesian Data Analysis; Chapman & Hall/CRC: New York, NY, USA, 2013.
Ibrahim, J.G.; Chen, M.H. Power prior distributions for regression models. Stat. Sci. 2000, 15, 46–60.
Muthén, L.K.; Muthén, B.O. Mplus User’s Guide, 8th ed.; Muthén & Muthén: Los Angeles, CA, USA, 1998–2022.

Figure 1. Data-generating SEM.

Table 1. Structure coefficient MSE by method and heterogeneity between the current dataset and the data used for priors.

Heterogeneity	1	0.2	0.5	0.8	1.2	1.5	1.8
Noninformative *	0.1744	0.1774	0.1761	0.1738	0.1712	0.1712	0.1709
Bayes synthesis	0.0024	0.1701	0.1656	0.0292	0.0271	0.1678	0.4395
Pooled	0.002	0.175	0.1721	0.0294	0.0307	0.1772	0.4477
AGDP	0.0063	0.0776	0.0762	0.0153	0.0162	0.0647	0.1471
Meta-analysis	0.003	0.2255	0.2219	0.0382	0.0381	0.2294	0.5824
DP_random	0.0127	0.0127	0.0125	0.012	0.0126	0.0111	0.014
DP_75	0.0037	0.0632	0.0616	0.0117	0.013	0.0509	0.112
DP_50	0.0045	0.0455	0.044	0.0095	0.0115	0.0369	0.0776
DP_25	0.0064	0.0236	0.0222	0.0077	0.0104	0.0214	0.0404
pMAP	0.0026	0.209	0.2048	0.0372	0.0361	0.2151	0.5395
pMAPR	0.0026	0.2112	0.2068	0.0375	0.0355	0.2116	0.5284

* Noninformative = noninformative priors; Bayes synthesis = Bayesian synthesis method; pooled = analysis of pooled data, including target and previous data; AGDP = priors taken from mean results across earlier studies; meta-analysis = standard meta-analysis; DP_random = power prior with estimated weight; DP_75 = power prior with weight of 0.75; DP_50 = power prior with weight of 0.50; DP_25 = power prior with weight of 0.25; pMAP = predictive meta-analysis; pMAPR = robust predictive meta-analysis.
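
To make the fixed-weight power priors in the note concrete (weights of 0.25, 0.50, and 0.75), the sketch below works through the simplest conjugate case: a normal estimate with known standard error, a flat initial prior, and historical information raised to a power alpha. It is an illustrative simplification and not the bayesDP implementation; the function name `power_prior_posterior` and all numeric inputs are hypothetical.

```r
# Simplified normal-normal power prior with a fixed weight alpha.
# Assumption: raising a normal likelihood N(m0, s0^2) to the power alpha
# multiplies the historical precision by alpha.
power_prior_posterior <- function(m, s, m0, s0, alpha) {
  stopifnot(alpha >= 0, alpha <= 1, s > 0, s0 > 0)
  prec_current    <- 1 / s^2       # precision of the current estimate
  prec_historical <- alpha / s0^2  # discounted historical precision
  post_var  <- 1 / (prec_current + prec_historical)
  post_mean <- post_var * (prec_current * m + prec_historical * m0)
  c(mean = post_mean, sd = sqrt(post_var))
}

# Hypothetical current estimate (0.50, SE 0.12) and historical estimate (0.80, SE 0.04)
alphas <- c(0, 0.25, 0.50, 0.75, 1)
res <- t(sapply(alphas, function(a)
  power_prior_posterior(m = 0.50, s = 0.12, m0 = 0.80, s0 = 0.04, alpha = a)))
rownames(res) <- paste0("alpha=", alphas)
print(round(res, 4))
```

With alpha = 0 the historical data are ignored and the current estimate is returned; as alpha grows the posterior mean is pulled toward the historical value and the posterior SD shrinks, which is the mechanism that can produce bias when the historical and current populations differ.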

Table 2. Structure coefficient MSE by method and sample size.

N	200	500	1000
Noninformative *	0.1994	0.1698	0.1522
Bayes synthesis	0.1482	0.1367	0.1444
Pooled	0.1494	0.1473	0.1465
AGDP	0.0654	0.0548	0.0527
Meta-analysis	0.1946	0.1891	0.1899
DP_random	0.0239	0.009	0.0047
DP_75	0.0587	0.0453	0.0315
DP_50	0.0454	0.032	0.021
DP_25	0.0293	0.017	0.0103
pMAP	0.1855	0.1748	0.1729
pMAPR	0.2600	0.2112	0.2068

* Noninformative = noninformative priors; Bayes synthesis = Bayesian synthesis method; pooled = analysis of pooled data, including target and previous data; AGDP = priors taken from mean results across earlier studies; meta-analysis = standard meta-analysis; DP_random = power prior with estimated weight; DP_75 = power prior with weight of 0.75; DP_50 = power prior with weight of 0.50; DP_25 = power prior with weight of 0.25; pMAP = predictive meta-analysis; pMAPR = robust predictive meta-analysis.

Table 3. Structure coefficient bias by method and heterogeneity between the current dataset and the data used for priors.

Heterogeneity	1	0.2	0.5	0.8	1.2	1.5	1.8
Noninformative *	0.187	0.191	0.177	0.18	0.183	0.179	0.184
Bayes synthesis	0.003	−0.4101	−0.4046	−0.165	0.1571	0.4051	0.659
Pooled	0.0063	−0.4167	−0.4131	−0.167	0.1695	0.4178	0.6664
AGDP	0.011	−0.2677	−0.2655	−0.0975	0.0976	0.2354	0.3605
Meta-analysis	0.0042	−0.4728	−0.4688	−0.1893	0.1875	0.475	0.7593
DP_random	0.0181	−0.134	−0.121	0.111	0.062	0.113	0.129
DP_75	0.007	−0.2442	−0.242	−0.0933	0.0914	0.2088	0.3156
DP_50	0.008	−0.2033	−0.2012	−0.0763	0.0761	0.1692	0.2535
DP_25	0.0102	−0.1349	−0.1328	−0.0481	0.0529	0.1095	0.1623
pMAP	0.0034	−0.455	−0.4504	−0.1874	0.1832	0.4603	0.7311
pMAPR	0.0023	−0.4573	−0.4527	−0.1884	0.1816	0.4565	0.7233

* Noninformative = noninformative priors; Bayes synthesis = Bayesian synthesis method; pooled = analysis of pooled data, including target and previous data; AGDP = priors taken from mean results across earlier studies; meta-analysis = standard meta-analysis; DP_random = power prior with estimated weight; DP_75 = power prior with weight of 0.75; DP_50 = power prior with weight of 0.50; DP_25 = power prior with weight of 0.25; pMAP = predictive meta-analysis; pMAPR = robust predictive meta-analysis.
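
For the "standard meta-analysis" method in the note, the pooled estimate and its squared standard error serve as the mean and variance of a normal prior. The sketch below computes a DerSimonian-Laird random-effects estimate by hand in base R so the arithmetic is visible. The function name `dl_meta` and the five study estimates are hypothetical; in practice a dedicated meta-analysis package would normally be used instead.

```r
# Hand-rolled DerSimonian-Laird random-effects pooling (illustrative only).
dl_meta <- function(yi, sei) {
  stopifnot(length(yi) == length(sei), length(yi) >= 2, all(sei > 0))
  vi <- sei^2
  wi <- 1 / vi                              # fixed-effect weights
  k  <- length(yi)
  mu_fe <- sum(wi * yi) / sum(wi)           # fixed-effect pooled mean
  Q     <- sum(wi * (yi - mu_fe)^2)         # Cochran's Q
  C     <- sum(wi) - sum(wi^2) / sum(wi)
  tau2  <- max(0, (Q - (k - 1)) / C)        # between-study variance, truncated at 0
  wi_re <- 1 / (vi + tau2)                  # random-effects weights
  mu_re <- sum(wi_re * yi) / sum(wi_re)
  se_re <- sqrt(1 / sum(wi_re))
  c(estimate = mu_re, se = se_re, tau2 = tau2)
}

# Hypothetical coefficient estimates and standard errors from five earlier studies
yi  <- c(0.42, 0.55, 0.48, 0.61, 0.50)
sei <- c(0.10, 0.12, 0.09, 0.11, 0.10)
fit <- dl_meta(yi, sei)

prior_mean <- unname(fit["estimate"])  # mean of the normal prior
prior_var  <- unname(fit["se"])^2      # variance of the normal prior
print(round(c(prior_mean = prior_mean, prior_var = prior_var), 4))
```

Because the pooled standard error shrinks as studies are added, the resulting prior can be very tight, which is one way a meta-analytic prior can dominate a small current sample.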

Table 4. Structure coefficient bias by method and factor loadings.

Factor loading	0.6	0.8
Noninformative *	0.194	0.171
Bayes synthesis	0.349	0.342
Pooled	0.383	0.369
AGDP	0.247	0.164
Meta-analysis	0.427	0.416
DP_random	0.116	0.077
DP_75	0.186	0.138
DP_50	0.162	0.12
DP_25	0.154	0.11
pMAP	0.409	0.405
pMAPR	0.381	0.376

* Noninformative = noninformative priors; Bayes synthesis = Bayesian synthesis method; pooled = analysis of pooled data, including target and previous data; AGDP = priors taken from mean results across earlier studies; meta-analysis = standard meta-analysis; DP_random = power prior with estimated weight; DP_75 = power prior with weight of 0.75; DP_50 = power prior with weight of 0.50; DP_25 = power prior with weight of 0.25; pMAP = predictive meta-analysis; pMAPR = robust predictive meta-analysis.
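
The AGDP method in the note takes its prior mean and variance from simple averages over the earlier studies. A minimal base R sketch of that averaging follows, with hypothetical values. It also shows a common R pitfall: `mean()` averages only its first argument, so separate scalar estimates must first be combined with `c()`.

```r
# Hypothetical posterior means and variances from five earlier studies
prev_coef <- c(0.42, 0.55, 0.48, 0.61, 0.50)
prev_var  <- c(0.010, 0.014, 0.008, 0.012, 0.010)

# Averages used as the mean and variance of the normal prior
agdp_prior_mean <- mean(prev_coef)
agdp_prior_var  <- mean(prev_var)

# Pitfall: passed as separate arguments, the second and third values are
# matched to mean()'s trim and na.rm parameters, and only the first value
# contributes to the result.
first_only <- mean(0.42, 0.55, 0.48, 0.61, 0.50)

print(c(agdp_prior_mean = agdp_prior_mean,
        agdp_prior_var  = agdp_prior_var,
        first_only      = first_only))
```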

Table 5. Structure coefficient standard error by method and heterogeneity between the current dataset and the data used for priors.

Heterogeneity	1	0.2	0.5	0.8	1.2	1.5	1.8
Noninformative *	0.1243	0.1219	0.1235	0.1291	0.1292	0.1246	0.1298
Bayes synthesis	0.0839	0.1093	0.1001	0.0841	0.1173	0.1349	0.1534
Pooled	0.0335	0.0408	0.0376	0.0336	0.0439	0.0486	0.0541
AGDP	0.0336	0.0412	0.0376	0.0338	0.0447	0.0498	0.0565
Meta-analysis	0.0573	0.0757	0.0679	0.0573	0.0835	0.0986	0.1151
DP_random	0.0211	0.0279	0.0248	0.0211	0.0303	0.035	0.0403
DP_75	0.0311	0.0398	0.0359	0.0321	0.0421	0.0488	0.0507
DP_50	0.0389	0.0489	0.0468	0.0441	0.0501	0.0522	0.0554
DP_25	0.0401	0.0510	0.0483	0.0459	0.0527	0.0556	0.0589
$p_{MAP}$	0.0342	0.0381	0.0363	0.0351	0.0422	0.0492	0.0581
$p_{MAPR}$	0.0331	0.0382	0.035	0.0334	0.0415	0.049	0.0575

* Noninformative = noninformative priors; Bayes synthesis = Bayesian synthesis method; pooled = analysis of pooled data, including target and previous data; AGDP = priors taken from mean results across earlier studies; meta-analysis = standard meta-analysis; DP_random = power prior with estimated weight; DP_75 = power prior with weight of 0.75; DP_50 = power prior with weight of 0.50; DP_25 = power prior with weight of 0.25; $p_{MAP}$ = predictive meta-analysis; $p_{MAPR}$ = robust predictive meta-analysis.

Table 6. Structure coefficient standard error by method and sample size.

N	200	500	1000
Noninformative *	0.1608	0.1132	0.0940
Bayes synthesis	0.1689	0.1045	0.0732
Pooled	0.0647	0.0366	0.026
AGDP	0.1108	0.0654	0.0318
Meta-analysis	0.1198	0.0779	0.0383
DP_random	0.0573	0.0299	0.0262
DP_75	0.0588	0.0305	0.0274
DP_50	0.0597	0.0320	0.0285
DP_25	0.0603	0.0356	0.0284
$p_{MAP}$	0.0556	0.0399	0.0279
$p_{MAPR}$	0.0554	0.0398	0.0278

* Noninformative = noninformative priors; Bayes synthesis = Bayesian synthesis method; pooled = analysis of pooled data, including target and previous data; AGDP = priors taken from mean results across earlier studies; meta-analysis = standard meta-analysis; DP_random = power prior with estimated weight; DP_75 = power prior with weight of 0.75; DP_50 = power prior with weight of 0.50; DP_25 = power prior with weight of 0.25; $p_{MAP}$ = predictive meta-analysis; $p_{MAPR}$ = robust predictive meta-analysis.

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Finch, H. A Comparison of Methods for Synthesizing Results from Previous Research to Obtain Priors for Bayesian Structural Equation Modeling. Psych 2024, 6, 45-88. https://doi.org/10.3390/psych6010004
