Article

Comparison of Seven Non-Linear Mixed Effect Model-Based Approaches to Test for Treatment Effect

by Estelle Chasseloup and Mats O. Karlsson *,† on behalf of the Alzheimer’s Disease Neuroimaging Initiative
Pharmacometrics Group, Pharmacy Department, Uppsala University, 751 23 Uppsala, Sweden
* Author to whom correspondence should be addressed.
† Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf accessed on 11 August 2022.
Pharmaceutics 2023, 15(2), 460; https://doi.org/10.3390/pharmaceutics15020460
Submission received: 6 December 2022 / Revised: 4 January 2023 / Accepted: 9 January 2023 / Published: 30 January 2023
(This article belongs to the Special Issue Recent Advances in Population Pharmacokinetics and Pharmacodynamics)

Abstract

Analyses of longitudinal data with non-linear mixed-effects models (NLMEM) are typically associated with high power, but sometimes at the cost of inflated type I error. Approaches to overcome this problem were published recently, such as model-averaging across drug models (MAD), individual model-averaging (IMA), and the combined Likelihood Ratio Test (cLRT). This work aimed to assess seven NLMEM approaches in the same framework: treatment effect assessment in balanced two-armed designs using real natural history data with or without the addition of a simulated treatment effect. The approaches are MAD, IMA, cLRT, standard model selection (STDs), structural similarity selection (SSs), randomized cLRT (rcLRT), and model-averaging across placebo and drug models (MAPD). The assessment included type I error, using Alzheimer’s Disease Assessment Scale-cognitive (ADAS-cog) scores from 817 untreated patients, and power and accuracy in the treatment effect estimates after the addition of simulated treatment effects. The model selection and averaging among a set of pre-selected candidate models were driven by the Akaike information criterion (AIC). The type I error rate was controlled only for IMA and rcLRT; the inflation observed otherwise was explained by placebo model misspecification and selection bias. Both IMA and rcLRT had reasonable power and accuracy except under a low typical treatment effect.

1. Introduction

Population model-based (pharmacometric) approaches, through the use of NLMEM, considerably improve power when analyzing longitudinal data [1,2,3,4]. However, the assumptions involved in NLMEM, e.g., the absence of model misspecification or asymptotic conditions, can impact the performance of such approaches in terms of type I error, power, and accuracy of the treatment effect estimates [5]. As the development of a reasonable model often implies a data-driven trial-and-error process across many models, type I error inflation related to multiple testing is a legitimate concern. Furthermore, despite all the efforts invested in rationalizing the choice of one of the candidate models, that choice inevitably leads to selection bias, and relying on a single selected model can hinder inference by discarding the model structure uncertainty and dismissing the inherent model misspecification [6].
Over recent years, multiple approaches have been developed to overcome these caveats. Model-averaging across drug models (MAD) weights the outcome of interest from a set of pre-selected models according to a goodness-of-fit-based metric [7,8,9,10] to prevent selection bias and handle model structure uncertainty. Individual model averaging (IMA) [11] uses mixture models to test for treatment effect, which mitigates the consequences of both placebo and drug model misspecification and improves the conditions of application of the likelihood ratio test (LRT). The combined LRT (cLRT) [12] combines an alternative cut-off value for the LRT with MAD to handle model structure uncertainty.
The pre-selection of a set of possible candidate models prior to the data analysis, recommended in the ICH E9 guidance [13], is a common alternative to handle model selection bias and its consequences in terms of bias in the estimates. The restriction of the set of candidate models also inherently reduces the type I error inflation caused by multiple testing. MAD, IMA, and cLRT were assessed separately in different contexts of treatment effect or dose-response assessment using real or simulated data. This work aimed to assess MAD, IMA, and cLRT together with four other related approaches in the same framework: treatment effect assessment in balanced two-armed designs using real data. The additional approaches were standard model selection (STDs), structural similarity selection (SSs), randomized-cLRT (rcLRT), and model-averaging across placebo and drug models (MAPD).
Three evaluation aspects were considered: type I error, power, and accuracy of the treatment effect estimate (assessed via the root mean squared error (RMSE)). The first aspect was assessed using real natural history data, while the latter two were assessed on the same natural history data modified by the addition of various simulated treatment effects. Model candidate pre-selection is an inherent part of the model-averaging approaches. In this work, it was generalized to all the approaches to provide a common scope to the seven NLMEM approaches for the evaluation. The AIC was used for selection and weighting according to previous recommendations [8,9].

2. Materials and Methods

For parameter estimation, NONMEM [14] version 7.5.0 was used. The simulations or randomizations and re-estimations were performed using PsN [15,16] version 5.2.1 through the Stochastic Simulation and Estimation or the randtest functions. Runs with a failed minimization status or an unreportable number of significant digits were removed from the analysis (see Appendix A for more details). The first-order conditional estimation (FOCE) method was used for all models, without the interaction option, as the residual error model was additive. The processing of the results was performed with the statistical software R [17] version 4.1.2.

2.1. Data

Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging, positron emission tomography, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment and early Alzheimer’s disease. For up-to-date information, see www.adni-info.org.
The real natural history data were longitudinal ADAS-cog scores ranging from 0 to 70, previously published and detailed elsewhere [18]. Due to the high number of categories, the data were treated as continuous. In this work, we used 817 individuals (aged from 55 to 91 years old), with ADAS-cog evaluations at 0, 6, 12, 18, 24, and 36 months, for a total observation count of 3597. The Baseline Mini-Mental State (BMMS) score was also collected for all individuals and was used to describe the baseline ADAS-cog scores.
The study population was randomized to two study arms, representing placebo (TRT = 0) and treatment (TRT = 1). In the base scenario used to assess type I error, all subjects’ data were their natural disease progression. To assess the power and the accuracy of the treatment effect estimates, the original data were also modified by adding various treatment effect functions to the individuals allocated to the treated arm. Offset (Equation (2)) and time-linear (Equation (3)) models were used to generate the treatment effect scenarios: with (30% CV) or without IIV on the treatment effect parameters, and with a low (2-point increase) or a high (8-point increase) typical treatment effect at the end of the study. Crossing the two drug models, the presence or absence of IIV, and the two effect sizes yielded eight treatment effect scenarios.
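As an illustration, the addition of a simulated treatment effect to the treated arm can be sketched as below. This is a hedged reconstruction, not the authors’ PsN/NONMEM workflow; the function name, data layout, and the multiplicative coding of the 30% CV IIV are assumptions.

```python
import numpy as np

def add_treatment_effect(obs, time, trt, model="offset",
                         theta_de=2.0, iiv_cv=0.0, rng=None):
    """Add a simulated treatment effect to the treated arm (illustrative).

    obs      : ADAS-cog scores, one entry per observation
    time     : observation times in months (0 to 36)
    trt      : treatment indicator per observation (0 = placebo, 1 = treated)
    model    : "offset" (Equation (2)) or "linear" (Equation (3), full
               effect reached at 36 months)
    theta_de : typical treatment effect at end of study (2 = low, 8 = high)
    iiv_cv   : CV of the IIV on the effect (0.3 for the 30% CV scenarios)
    """
    rng = rng or np.random.default_rng(0)
    # In the paper, eta is drawn once per individual; here a single draw
    # stands in for one subject's effect (a simplification for brevity).
    eta = rng.normal(0.0, iiv_cv) if iiv_cv > 0 else 0.0
    effect = theta_de * (1.0 + eta)          # about 30% CV when iiv_cv = 0.3
    if model == "linear":
        effect = effect * np.asarray(time) / 36.0
    return np.asarray(obs) + np.asarray(trt) * effect
```

With `model="offset"` and `theta_de=2.0`, a treated subject’s scores are shifted by 2 points at every visit; with `model="linear"`, the shift grows from 0 at baseline to 2 points at 36 months.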

2.2. Models

The published disease model is described extensively elsewhere [18] and summarized in Equation (1). The corresponding NONMEM code is provided in Appendix B. The disease model is time-linear (Equation (1a)), including covariate effects on the slope (Equation (1c)), and a baseline model links the baseline value to BMMS (Equation (1b)).
$\mathrm{ADAS}_{\mathrm{cog},i}(t) = \mathrm{ADAS}_{\mathrm{cog},i}(0) + \alpha_i\,t + \varepsilon$ (1a)
$\mathrm{ADAS}_{\mathrm{cog},i}(0) = (\Theta_{\mathrm{baseline}} + \Theta_{\mathrm{intercept}} \cdot \mathrm{BMMS}_i) + \eta_{1,i}$ (1b)
$\alpha_i = f(\mathrm{Cov}_i, \Theta, \eta_{2,i})$ (1c)
where $\Theta$ denotes fixed-effect parameters, $\eta_i \sim N(0, \omega^2)$ are additive individual random effects, and $\varepsilon \sim N(0, \sigma^2)$ is the residual error for each observation. Four alternative disease models were considered for MAPD and are presented in Table 1.
Offset with or without IIV (Equation (2)) and disease-modifying with or without IIV (Equation (4)) models were considered as treatment effect models for the type I error assessment. For the power and accuracy assessment, a time-linear model (Equation (3)) was used instead of the disease-modifying model to avoid any disease model assumption in the simulation of the treatment effect.
$\Theta_{DE} + \eta$ (2)
$(\Theta_{DE} + \eta) \cdot \dfrac{t}{36}$ (3)
$\alpha \cdot \bigl(1 - (\Theta_{DE} + \eta)\bigr)$ (4)
with $\alpha$ being the disease model slope.

2.3. Description of Modelling Approaches

For all the approaches, the Akaike information criterion (AIC) is used to compare the fits of the set of candidate models. The AIC is hence used to select the best-fitting candidate, which serves as the alternative hypothesis (H1) in the statistical test, except for the model-averaging approaches, i.e., MAD and MAPD, for which no selection occurs but an AIC-based weight is computed for each candidate model. The LRT is then used to conclude on the presence of a treatment effect, except for the model-averaging approaches, cLRT, and rcLRT, for which the alternatives are described below.
In the STDs approach (Equation (5), Figure 1a), the null hypothesis (H0) consists of a placebo model applied to all subjects, and H1 adds a drug model to the treated subjects. The LRT is used to discriminate between the best selected model and H0 to conclude on the presence of a treatment effect, using ΔOFV as the test statistic. The distribution of this test statistic under H0 is unknown. In the LRT, it is assumed to follow a χ² distribution with ν degrees of freedom, with α = 0.05 and ν the number of additional parameters estimated in H1 compared to H0. cLRT and rcLRT assume a different distribution for that test statistic under H0, the alternative distribution being obtained by replicating the model selection procedure and the test statistic computation n = 100 times over n different data sets. For cLRT (Equation (5), Figure 1a), the distribution is obtained with n data sets simulated under H0; for rcLRT (Equation (5), Figure 1a), the distribution is obtained with n randomized data sets differing by the treatment allocation assignment.
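The rcLRT cut-off computation can be condensed into a few lines once the n ΔOFV values from the randomized data sets are available; the helper names below are illustrative, and the model fitting itself (in NONMEM/PsN) is assumed to have been done upstream.

```python
import numpy as np

def randomization_cutoff(dofv_null, alpha=0.05):
    """Empirical LRT critical value from randomized data sets.

    dofv_null : delta-OFV (H1 minus H0) of the best AIC-selected drug model
                on each of the n data sets with permuted treatment allocation
    Returns the alpha-quantile of that null distribution; a more negative
    delta-OFV means a better H1 fit, so H0 is rejected below the cut-off.
    """
    return float(np.quantile(np.asarray(dofv_null, dtype=float), alpha))

def reject_h0(dofv_observed, dofv_null, alpha=0.05):
    """Reject H0 when the observed delta-OFV falls below the empirical cut-off."""
    return bool(dofv_observed < randomization_cutoff(dofv_null, alpha))
```

The same skeleton describes cLRT if `dofv_null` comes from data sets simulated under H0 instead of randomized treatment allocations.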
In SSs (Equation (6), Figure 1c), the drug model is fitted to all subjects in H0, but H1 allows different estimates for the treated individuals. The LRT is used to conclude on the presence of a treatment effect using the best model according to the AIC.
The two model-averaging approaches, MAD and MAPD, have the H0 and H1 hypotheses constructed according to STDs (Equation (5) and Figure 1b). Instead of selecting a unique best candidate model via a selection step, the model-averaging approaches assign an AIC-based weight to each candidate model (Equation (7)), with $AIC_{\min}$ being the minimum AIC among the candidate models. Hence, each model from the pre-defined set contributes to the computation of the metric of interest proportionally to its relative weight, contrary to the selection-based methods, where only the best candidate model is used to draw conclusions. MAD considered a unique H0 and multiple H1 via the formulation of various drug models and a unique placebo model, while MAPD differed by also considering various placebo models in the set of pre-defined models. In that respect, MAPD differed from the six other approaches by considering multiple disease models instead of only the published disease model.
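The AIC-based weighting of Equation (7) can be sketched in a few lines; this is a minimal sketch assuming the standard Akaike-weight convention exp(−ΔAIC/2), with an illustrative function name.

```python
import numpy as np

def akaike_weights(aic):
    """AIC-based model-averaging weights over a set of candidates.

    Each candidate's weight is exp(-dAIC/2), with dAIC taken relative to
    the best (lowest-AIC) candidate, normalised so the weights sum to one.
    """
    aic = np.asarray(aic, dtype=float)
    delta = aic - aic.min()
    w = np.exp(-delta / 2.0)
    return w / w.sum()
```

An OFV gap of about 111 points between the two best candidates, as observed here, drives the weight of every other candidate toward zero, which is why the total weight assigned to H0 was negligible.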
In IMA (Equation (8a), Figure 1c), all subjects have, through a mixture feature, a probability Θ MIX of being described by the drug model. This probability is fixed to the placebo allocation rate (0.5) in H0 but estimated based on the treatment allocation in H1. The LRT is used to conclude on the presence of a treatment effect using the best model according to the AIC.
$H_0^{\mathrm{Pub},0}:\ \mathrm{Plb}_{\mathrm{Pub}}$
$H_1^{\mathrm{Pub},d}:\ \mathrm{Plb}_{\mathrm{Pub}} + f_{\mathrm{drug},d}(TRT)$ (5)
where $\mathrm{Plb}_{\mathrm{Pub}}$ is the published placebo model and $f_{\mathrm{drug},d}(TRT)$ is drug model $d$, which depends on the treatment allocation $TRT$.
$H_0^{\mathrm{Pub},d}:\ \mathrm{Plb}_{\mathrm{Pub}} + f_{\mathrm{drug},d}$
$H_1^{\mathrm{Pub},d}:\ \mathrm{Plb}_{\mathrm{Pub}} + \begin{cases} f_{\mathrm{drug},d} & \text{if } TRT = 0 \\ f'_{\mathrm{drug},d} & \text{if } TRT = 1 \end{cases}$ (6)
where the same drug model $d$ is applied to all individuals, the prime indicating that different parameter estimates are allowed between the two arms in H1.
$Wt_{p,d} = \dfrac{\exp\bigl(-(AIC_{p,d} - AIC_{\min})/2\bigr)}{\sum_{p=1}^{P} \sum_{d=0}^{D} \exp\bigl(-(AIC_{p,d} - AIC_{\min})/2\bigr)}$ (7)
$\text{Mixture model:}\ \begin{cases} \mathrm{Plb}_{\mathrm{Pub}} & \text{if } Mix = 1 \\ \mathrm{Plb}_{\mathrm{Pub}} + f_{\mathrm{drug},d} & \text{if } Mix = 2 \end{cases}$ (8a)
$H_0^{\mathrm{Pub},d}:\ \Pr(Mix = 1) = \Pr(Mix = 2) = \Theta_{\mathrm{MIX}} = 0.5\ \mathrm{FIX}$ (8b)
$H_1^{\mathrm{Pub},d}:\ \Pr(Mix = 1) = (1 - TRT)\,\Theta_{\mathrm{MIX}} + TRT\,(1 - \Theta_{\mathrm{MIX}}), \quad \Pr(Mix = 2) = 1 - \Pr(Mix = 1)$ (8c)
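As a minimal illustration of the IMA allocation probabilities in Equation (8), the H1 probability of the pure placebo submodel can be written as follows (the function name is hypothetical, not from the paper):

```python
def pr_mix1(trt, theta_mix):
    """Pr(Mix = 1), i.e., probability of the pure placebo submodel, under H1.

    Untreated subjects (trt = 0) get theta_mix; treated subjects (trt = 1)
    get 1 - theta_mix. Under H0, theta_mix is instead fixed to the
    allocation rate of 0.5 for all subjects, making both submodels
    equally likely regardless of arm.
    """
    return (1 - trt) * theta_mix + trt * (1 - theta_mix)
```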

2.4. Approaches Assessment

For each of the seven approaches, the type I error rate was assessed first using the raw natural history data, modified only to randomly allocate (1:1) each subject to an artificial placebo or treated arm. The allocation was repeated N = 100 times to mimic N random trials without treatment effect. The type I error rate was computed over the N trials as the frequency with which H0 was rejected and was considered controlled when falling within the 2.5th–97.5th percentiles of a binomial distribution with a probability of success of 5% on N trial replicates.
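The binomial acceptance interval used above can be computed exactly from the Binomial(N = 100, p = 0.05) distribution; a self-contained sketch (the helper name is illustrative):

```python
from math import comb

def binom_acceptance(n=100, p=0.05, lo=0.025, hi=0.975):
    """Range of H0-rejection counts compatible with a true 5% error rate.

    Returns the smallest count whose CDF reaches the 2.5th percentile and
    the smallest count whose CDF reaches the 97.5th percentile of
    Binomial(n, p).
    """
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    cdf, acc = [], 0.0
    for v in pmf:
        acc += v
        cdf.append(acc)
    lower = next(k for k, c in enumerate(cdf) if c >= lo)
    upper = next(k for k, c in enumerate(cdf) if c >= hi)
    return lower, upper
```

For N = 100 and p = 0.05 this gives 1 to 10 rejections (i.e., 1–10%), so the 6% rate reported later for IMA and rcLRT falls within the controlled range.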
When the type I error was controlled, power and accuracy were assessed using the data modified by the addition of a treatment effect to the subjects allocated to the treated arm. N simulations were performed for each of the eight treatment effect scenarios. The power was computed as the frequency with which H0 was rejected over N trials. Regarding the model-averaging approaches, the type I error and power were computed as the percentage of the weights allocated to any of the H1 considered in the set of the candidate models.
The accuracy in the treatment effect estimates was assessed only when using the data modified by the addition of a simulated treatment effect, using the RMSE according to Equation (9), where $\Theta_{DE,i}$ is the true value used in the simulations and $\hat{\Theta}_{DE,i}$ is the estimated value from the $i$-th trial.
$\mathrm{RMSE} = \sqrt{\dfrac{\sum_{i=1}^{N} \bigl(\hat{\Theta}_{DE,i} - \Theta_{DE,i}\bigr)^2}{N}}, \quad N = 100$ (9)
For IMA, $\hat{\Theta}_{DE,i}$ was computed according to Equation (10) to account for the submodel allocation probability:
$\hat{\Theta}_{DE,i,\mathrm{IMA}} = (2\,\Theta_{\mathrm{MIX},i} - 1)\,\hat{\Theta}_{DE,i}$ (10)
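Equations (9) and (10) translate directly into code; a short sketch with illustrative helper names:

```python
import numpy as np

def rmse(theta_hat, theta_true):
    """Root mean squared error of the N trial estimates (Equation (9))."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    return float(np.sqrt(np.mean((theta_hat - theta_true) ** 2)))

def ima_effect(theta_hat, theta_mix):
    """IMA-adjusted treatment effect estimate (Equation (10)).

    Scales the raw estimate by (2*theta_mix - 1): when theta_mix = 0.5
    (the allocation carries no information) the adjusted effect is zero,
    and when theta_mix = 1 the raw estimate is kept unchanged.
    """
    theta_mix = np.asarray(theta_mix, dtype=float)
    return (2.0 * theta_mix - 1.0) * np.asarray(theta_hat, dtype=float)
```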

3. Results

The type I error for each approach is available in Table 2. Only IMA and rcLRT had controlled type I error (6%). All the other approaches had 100% type I error except SSs, for which the type I error was inflated to 17%. The model-averaging approaches assigned a negligible total weight to any H0 hypothesis (6 × 10⁻²⁴). Details about the drug models selected in the N trials, their corresponding dOFV, and the critical cut-off value for the LRT are presented in Figure 2A for all but the model-averaging approaches. The model-averaging results are presented in Figure 2B, with the total relative weight allocated to any of the H0 or H1 hypotheses. The minimization status is available in Appendix A in Figure A1 and Figure A2. The cLRT and rcLRT alternative distributions used for the determination of the cut-off value in the statistical test are presented in Figure A5 in Appendix C. The summary of the model fits (number of estimated parameters and OFV) is provided in Appendix D, in Table A1 for the models used in the type I error computation for all the approaches but MAPD, and in Table A2 for the models used in MAPD, showing that the four proposed alternative disease models for MAPD improved the OFV significantly compared to the published disease model.
Power and accuracy in treatment effect estimates (RMSE) were investigated for IMA and rcLRT, as they were the only two approaches with controlled type I error. The results (power and RMSE) for the eight investigated treatment effect scenarios are presented in Table 3. The minimization status is available in Appendix A.2, in Figure A3 for rcLRT and in Figure A4 for IMA. For the high typical treatment effect scenarios (8-point), IMA and rcLRT had 100% power regardless of the simulated treatment effect model. For the low typical treatment effect scenarios (2-point), rcLRT had higher power than IMA when simulating the treatment effect with the offset models, whereas the opposite was true when simulating with the time-linear model. The RMSE was always higher for IMA for all eight scenarios tested.

4. Discussion

Seven NLMEM approaches were compared in the same context of treatment effect assessment in balanced two-armed trials using real natural history data. The comparison scope was first the type I error using the natural history data observed without any treatment. For approaches with controlled type I error, power and accuracy in the drug estimates were evaluated using the natural history data modified by the addition of different simulated treatment effects. Among the seven approaches tested, only two (IMA and rcLRT) had controlled type I error and were consequently assessed on data with a simulated treatment effect. IMA and rcLRT had similar results in terms of power: 100% power in the presence of a high typical treatment effect but lower in the presence of a low typical treatment effect, except for rcLRT when an offset drug model was used to simulate the treatment effect (100% power). rcLRT had consistently better RMSE than IMA.
The STDs approach type I error results (100%) could be anticipated from the fit of the four drug models on one randomization of the treatment allocation (see Table A1 in Appendix D.1). Out of the four models, offset or disease-modifying with or without IIV, the two models with IIV had a significant drop in OFV according to the LRT. The disease-modifying model with IIV had a drop of −133.54, compared to a critical value of −5.99 for the LRT, about 111 OFV points lower than the offset drug model with IIV, leaving no chance of selection for another candidate model even after the parameter-count penalty introduced by the AIC. Previous investigations [11] of the STDs approach without the AIC selection step already outlined the uncontrolled type I error of the approach. Such uncontrolled type I error was attributed to the placebo model misspecification leaving room for additional model components and to other possible violations of the standard LRT assumptions, such as not fulfilling the asymptotic properties. In this case, there was a pre-selection of H1 models using the AIC. Another common way of model selection is to make multiple tests of different H1s against the H0 and then select the H1 associated with the lowest p-value, given that it is below the predetermined cut-off. Both procedures suffer from multiple testing and greedy behavior.
The cLRT approach [12] was introduced to account for the multiple testing of drug models and the structural model uncertainty in the computation of the critical value by using Monte Carlo simulation under H0. cLRT had controlled type I error in the context of simulated data [12] but had a 100% type I error rate with the real natural history data and the published disease model used in our study. The alternative computation of cut-off values for cLRT was unable to prevent the type I error inflation.
Even though cLRT accounts for multiple testing in the computation of the critical value via Monte Carlo simulations, it still assumes that the structure of the placebo model is adequate by simulating under that model to compute an alternative cut-off value for the statistical test. By computing the critical value using randomization of the treatment allocation, rcLRT incorporates the uncertainty of the placebo model in the computation of the critical value, removing any placebo model assumption from the process. The success of this approach (controlled type I error with a rate of 6%) could also be anticipated from the fit of the drug models on the natural history data (see Table A1 in Appendix D.1), as the dOFV of the best drug model used to compute the critical value is the same as the one used to test for treatment effect. This ensures that the distribution used for the critical value computation is of the same magnitude as the model selected by the AIC step, which is critical to limiting type I error inflation. Appendix C illustrates the consequent difference in the typical values of the cut-off distributions obtained by cLRT and rcLRT, ranging, respectively, between −2 to −8 and −195 to −240. The success of this approach also supports the hypothesis that placebo model misspecification is the major factor involved in the type I error inflation of STDs and cLRT.
Aside from alternatives to the cut-off value used in the statistical test, SSs proposed another way to control the type I error inflation observed with STDs. SSs was built on the hypothesis that the main inflation factor is the tested drug model describing features of the data that were not included in H0. Accordingly, SSs fits the drug model to all the subjects in H0 and allows for different estimates between the arms in H1. The expectation was that the drop in OFV observed in H1 for STDs, corresponding to an improvement of the placebo model rather than a treatment effect, would be included in the OFV of H0 and hence removed from the dOFV between H1 and H0. The results showed that the approach helped to decrease the type I error inflation (17% instead of 100%) but was not enough to control it. Further investigations would be necessary to decide whether and to what extent the remaining inflation should be attributed to multiple testing or to the magnitude of the placebo model misspecification still present.
Pre-selection of the set of candidate drug models prior to the data analysis is a recommended practice to limit type I error inflation [13]. Previous publications showed its application with NLMEM in combination with model-averaging techniques, which helped integrate drug model misspecification into the prediction of key metrics to better plan later stages of drug development [8,9,12]. To our knowledge, in the NLMEM context, the averaging step was in these studies performed over a set of multiple drug model candidates and not over a set of both placebo and drug model candidates. In this work, the MAD approach illustrates the former and MAPD the latter. MAD showed type I error control in previous publications on simulated data (method 3 from [8]), which was not the case with the real natural history data used in our study (type I error rate of 100%). For the model-averaging approaches, the type I error was computed as the percentage of the relative weights assigned to any H1, as the weights are usually used to weight the output of the respective models in the computation of an effect metric. Because the weights were AIC-based, the favored model among the set of candidates was also the model with the lowest OFV (disease-modifying model with IIV), and because of the significant gap between the lowest and the second-lowest OFV (111 points), the total relative weight assigned to any H0 was negligible (≈10⁻²⁵). This result was also predictable from the model fit on a single allocation randomization (see Table A1 in Appendix D.1). The addition of multiple placebo models in the set of candidate models did not help to reduce the type I error inflation and also resulted in a 100% type I error rate, even though the four alternative placebo models proposed all significantly improved the OFV (decreases between −23.62 and −60.15 OFV points).
For the Box–Cox transformation, the t-distribution, and the time-exponential model, the dOFV pattern across the four drug models was the same as with the published placebo model. The drug models with IIV had significant dOFV, with the disease-modifying model with IIV being the best one, with a dOFV of about −100 points. The model with IIV on RUV had only the disease-modifying model with IIV as a significant treatment effect model, with a drop of −68.84 points. The maximum difference between the model with the lowest OFV (13,585.26 for the t-distribution placebo model with the disease-modifying with IIV drug model) and the model with the highest OFV (13,768.66 for the published model without treatment effect), i.e., 183.40 points, also led to a negligible total relative weight (6 × 10⁻²⁴) assigned to any H0. We can note that the multiplicity of H0 models increased the total weight assigned to H0 by less than 10⁻²⁵. Both MAD and MAPD suffered from the gap in OFV between the published model without treatment effect and the model with the best treatment effect, even though the set of pre-selected drug model candidates was restricted to only four models.
Power and accuracy in treatment effect estimates were assessed for IMA and rcLRT on the natural history data modified by the addition of offset or time-linear treatment effects, with or without IIV and with a low or a high typical treatment effect. Both approaches had similar power and reasonably good RMSE in the presence of a high typical treatment effect. However, in the presence of a low typical treatment effect simulated with an offset model, only rcLRT had good power and RMSE. When using time-linear treatment effect models with a low typical treatment effect, both IMA and rcLRT had unsatisfactory power and poor RMSE. These poor performances can be explained by the combination of two main factors: (1) the difficulty of distinguishing the drug model from the placebo model, as the added treatment effect was simulated with the same mathematical function as the placebo model; and (2) the magnitude of the treatment effect (2 ADAS-cog points at 36 months), which might be of the same magnitude as the model misspecification. The performance of IMA in the low typical treatment effect scenarios can be explained by the additional degree of freedom brought by the mixture model, allowing some over-fitting associated with a much lower OFV, misleading the AIC selection process.
Aside from these two specific simulation scenarios, the RMSE was overall better for rcLRT. This loss in accuracy for IMA can be explained by the fact that the formula used to compute the final treatment effect combines two parameter estimates, the treatment effect estimate and the mixture proportion, contrary to rcLRT, where the treatment effect is captured by the treatment effect estimate alone (see Equation (10)).
Overall, the performances of the approaches were well aligned with the OFV obtained for each approach with a single fit of the different models (results presented in Appendix D). The use of real data together with a model that was developed, assessed, and published using the same data places this work in an interestingly realistic context with real-life model misspecifications. In contrast, the addition of a simulated treatment effect to create scenarios for power and accuracy assessment might lack some real-life complexity. Nonetheless, it highlighted the dangerous combination between features of the natural history data left undescribed by the placebo model and the greedy behavior of the test statistic (dOFV) and/or selection criterion (AIC).
The scope of this work was restricted to treatment effects for balanced two-armed designs. While it is difficult to extrapolate the results further for most of the approaches, IMA and the standard approach without the selection step were assessed regarding type I error in unbalanced designs with respect to treatment effect and dose-response elsewhere [19]. The results were consistent with the ones presented here.

5. Conclusions

This work compared seven NLMEM approaches to test for treatment effects in the same framework using real natural history data. All approaches but IMA and rcLRT had inflated type I error. This can be explained by the misspecification of the placebo model, arising from the use of real natural history data, which was absent from the previous assessments of cLRT and MAD. Under such circumstances, the five remaining approaches (STDs, SSs, MAD, MAPD, and cLRT) suffered from the greedy behavior of the AIC criterion in the selection or weighting step, often dismissing the null hypothesis. rcLRT handles the placebo model misspecification by calibrating the cut-off values for the statistical test via a randomization test, while IMA handles it by introducing the drug model already in the null hypothesis via a mixture model. Both IMA and rcLRT showed promising results regarding power, bias, and accuracy using natural history data modified by the addition of various simulated treatment effects. However, neither approach was flawless: IMA had low power to detect a low typical treatment effect, and both showed poor performance in the scenarios combining a low typical treatment effect and a treatment effect addition similar to the placebo model.

Author Contributions

Conceptualization, M.O.K.; methodology, M.O.K. and E.C.; investigation, E.C.; resources, M.O.K.; data curation, E.C.; writing—original draft preparation, E.C.; writing—review and editing, M.O.K.; visualization, E.C.; supervision, M.O.K.; project administration, M.O.K. and E.C.; funding acquisition, M.O.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was incorporated into a Ph.D. project (Estelle Chasseloup) granted by the Institut de Recherches Internationales Servier. Financial support from the Swedish Research Council Grant 2018-03317 is acknowledged. Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Restrictions apply to the availability of these data. Data were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and are available on request at https://adni.loni.usc.edu/, accessed on 11 August 2022.

Acknowledgments

The computations/data handling was enabled by resources in project SNIC 2021/22-769 provided by the Swedish National Infrastructure for Computing (SNIC) at UPPMAX, partially funded by the Swedish Research Council through grant agreement no. 2018-05973.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

    The following abbreviations are used in this manuscript:
ADAS-cog: Alzheimer's Disease Assessment Scale-cognitive
AIC: Akaike Information Criteria
BMMS: Baseline Mini-Mental State
cLRT: Combined Likelihood Ratio Test
dOFV: Difference in Objective Function Value
IIV: Inter-Individual Variability
IMA: Individual Model Averaging
LRT: Likelihood Ratio Test
MAD: Model Averaging Across Drug models
MAPD: Model Averaging Across Placebo and Drug models
NLMEM: Non-Linear Mixed Effects Models
OFV: Objective Function Value
rcLRT: Randomized Combined Likelihood Ratio Test
RMSE: Root Mean Squared Error
RUV: Residual Unexplained Variability
SSs: Structural Similarity Selection
STDs: Standard Model Selection

Appendix A. Minimization Status

Appendix A.1. Type I Error

Figure A1. Minimization status for the models fitted to the natural history data for all approaches except MAPD. Numbers indicate the count per status. DM: disease-modifying, IIV: inter-individual variability, SSE: stochastic simulation and estimation.
Figure A2. Minimization status for the MAPD approach on natural history data, faceted by placebo model. Numbers indicate the count per status. DM: disease-modifying, IIV: inter-individual variability, RUV: residual unexplained variability.

Appendix A.2. Power

Figure A3. Minimization status for the rcLRT approach on data with various additions of simulated treatment effects. The plot is faceted horizontally by the function used to simulate the treatment effect and vertically by the typical size of the treatment effect and the treatment allocation used. Numbers indicate the count per status. IIV: inter-individual variability, TDE: typical drug effect, TL: time-linear.
Figure A4. Minimization status for the IMA approach on data with various additions of simulated treatment effects. The plot is faceted horizontally by the function used to simulate the treatment effect and its typical size, and vertically by the drug model fitted. Numbers indicate the count per status. IIV: inter-individual variability, TDE: typical drug effect, TL: time-linear.

Appendix B. NONMEM Code of the Published Placebo Model

$PROBLEM    Published model
$INPUT      C ID TIME DV BMMS INVF AGE APOF SEX EDU ARM
;ID   : 817 individuals
;TIME : months
;DV   : ADAS-cog score
;BMMS : baseline MMSE
;INVF : inverse of baseline ADAS
;AGE  : years
;APOF : ApoE 0=non carrier, 1=hetero, 2=homo-carrier
;SEX  : 1=male
;EDU  : education level in years
;ARM  : fake random TRT allocation
$DATA     ../data/data.csv IGNORE=@
$ABBREVIATED COMRES=3 PROTECT
$PRED

; --------- Baseline model
INT=THETA(2)    ;baseline ADAS-cog
BSLP = THETA(3)
MM1 = BSLP*BMMS
BSL = (INT + MM1) + ETA(1)

; --------- Covariates

BAS1=INVF**THETA(5)

;age effect
AGE1 = (AGE/75)**THETA(6)

;ApoE effect
APF = 0
IF(APOF.GT.0)THEN   ;0=non-carrier
APF = 1
ENDIF
APO = THETA(7)

;SEX effect
GEN = 0
IF(SEX.EQ.1)THEN   ;1=male
GEN = 1
ENDIF
GEN1 = THETA(8)**GEN

;education
EDC = (EDU/15)**THETA(9)

; --------- Disease progression model
SLP=THETA(1)/12   ; disease progression
ISLP = SLP*BAS1*AGE1*(APO**APF)*GEN1*EDC + ETA(2)
ADASCOG=BSL + ISLP*TIME

F=ADASCOG
W=THETA(4)
Y=F+W*EPS(1)

$THETA  4 ; PRM TH1 PLB SLOPE
 (0,60) ; PRM TH2 BASE INTERCEPT
 -1.69 ; PRM TH3 BASE SLOPE
 (0,3) ; PRM TH4 RUV ADD
 (1,3,5) ; PRM TH5 COV GAM INVF
 -1 ; PRM TH6 COV AGE
 1 ; PRM TH7 COV APO
 1 ; PRM TH8 COV SEX
 0 FIX ; PRM TH9 COV EDU
$OMEGA  BLOCK(2)
 9  ; PRM OM1 BASE
 0.01 0.09  ; PRM OM2 PLB SLOPE
$SIGMA  1  FIX  ;   PRM SIG1
$ESTIMATION MAXEVAL=9999 METHOD=1 NOABORT
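As a reading aid, the prediction equation of the $PRED block above can be transcribed into plain Python. The sketch below is illustrative only (it uses the $THETA initial estimates and sets the random effects to zero); it is not the estimation code.

```python
def adascog_pred(time, bmms, invf, age, apoe_carrier, male, edu,
                 theta, eta1=0.0, eta2=0.0):
    """Individual ADAS-cog prediction of the published placebo model.

    theta follows the $THETA order: [placebo slope, base intercept,
    base slope, RUV add, gamma INVF, age, ApoE, sex, education].
    """
    baseline = theta[1] + theta[2] * bmms + eta1          # BSL
    cov = (invf ** theta[4]                               # baseline severity
           * (age / 75) ** theta[5]                       # age, centered at 75 y
           * theta[6] ** (1 if apoe_carrier else 0)       # ApoE carrier status
           * theta[7] ** (1 if male else 0)               # sex
           * (edu / 15) ** theta[8])                      # education, centered at 15 y
    slope = theta[0] / 12 * cov + eta2                    # ISLP, per month
    return baseline + slope * time                        # ADASCOG

# $THETA initial estimates; with neutral covariates (all multipliers = 1)
# the prediction reduces to intercept + slope * time
theta = [4, 60, -1.69, 3, 3, -1, 1, 1, 0]
pred_12m = adascog_pred(12, 25, 1, 75, 0, 0, 15, theta)
```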

Appendix C. Alternative Distribution for the Cut-Off Value Used in the Statistical Tests on the Natural History Data

Figure A5. Distribution of the N cut-off values computed from simulations under H0 for cLRT (left panel), or from randomizations of the treatment allocation column for rcLRT (right panel). Each of the N cut-offs was taken as the 5th percentile of the dOFV (H1-H0) computed over n = 100 data sets.
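The randomization procedure behind the rcLRT cut-offs can be illustrated with a toy example. The sketch below stands in for the NLMEM fits with ordinary least-squares fits of a simple linear disease-progression model and uses n·log(RSS/n) as the OFV; the simulated data and variable names are assumptions for illustration, not the models of this study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy longitudinal data: 100 untreated subjects, 5 visits each
n_subj, n_vis = 100, 5
time = np.tile(np.arange(n_vis), n_subj)
slopes = np.repeat(rng.normal(0.5, 0.2, n_subj), n_vis)
y = 20 + slopes * time + rng.normal(0, 1.5, n_subj * n_vis)
arm = np.repeat(rng.integers(0, 2, n_subj), n_vis)  # fake treatment allocation

def ofv(X, y):
    """OFV (-2 log-likelihood up to a constant) of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return len(y) * np.log(rss / len(y))

def dofv(arm, time, y):
    X0 = np.column_stack([np.ones_like(y), time])   # H0: no drug effect
    X1 = np.column_stack([X0, arm * time])          # H1: extra slope in treated arm
    return ofv(X1, y) - ofv(X0, y)                  # <= 0, H1 nests H0

# Re-randomize the subject-level allocation to build the null distribution,
# then take the 5th percentile as the cut-off (mirroring rcLRT with n = 100)
null_dofv = [dofv(rng.permutation(arm[::n_vis]).repeat(n_vis), time, y)
             for _ in range(100)]
cutoff = np.percentile(null_dofv, 5)
reject = dofv(arm, time, y) < cutoff   # treatment effect significant at ~5%
```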

Appendix D. Model Descriptions

Appendix D.1. Models Fitted on Natural History Data

Table A1. Summary of the models fitted to the natural history data for all approaches except MAPD (n = 1). IIV: inter-individual variability, OFV: objective function value, dOFV: difference in OFV between the model and its reference (Ref), Prm_nb: number of estimated parameters.
Run_nb | Ref | Description | Prm_nb | OFV | dOFV
Models fitted on real natural history data
2 | NA | Published model | 11 | 13,768.66 | -
3 | 2 | Published + Offset | 12 | 13,767.64 | −1.02
4 | 2 | Published + Offset IIV | 13 | 13,746.54 | −22.12
5 | 2 | Published + Disease modifying | 12 | 13,765.02 | −3.64
6 | 2 | Published + Disease modifying IIV | 13 | 13,635.12 | −133.54
Models fitted on simulated natural history data
7 | NA | Published model on simulated data | 11 | 13,567.91 | -
12 | 7 | Published + Offset | 12 | 13,567.91 | 0
13 | 7 | Published + Offset IIV | 13 | 13,567.91 | 0
14 | 7 | Published + Disease modifying | 12 | 13,561.34 | −6.57
15 | 7 | Published + Disease modifying IIV | 13 | 13,557.64 | −10.27
IMA models fitted on real natural history data
100 | NA | Published + Offset base | 12 | 13,765.13 | -
101 | 100 | Published + Offset full | 13 | 13,765.11 | −0.02
102 | 100 | Published + Offset IIV base | 13 | 13,695.61 | −69.52
103 | 102 | Published + Offset IIV full | 14 | 13,694.39 | −1.23
104 | 100 | Published + Disease modifying base | 12 | 13,473.09 | −292.04
105 | 104 | Published + Disease modifying full | 13 | 13,471.41 | −1.67
106 | 104 | Published + Disease modifying IIV base | 13 | 13,455.68 | −17.41
107 | 106 | Published + Disease modifying IIV full | 14 | 13,454.20 | −1.48
SSs models fitted on real natural history data
50 | NA | Published + Offset base | 12 | 13,766.41 | -
54 | 50 | Published + Offset full | 13 | 13,766.40 | −0.01
51 | 50 | Published + Offset IIV base | 13 | 13,729.93 | −36.48
56 | 51 | Published + Offset IIV full | 15 | 13,728.82 | −1.11
52 | 50 | Published + Disease modifying base | 12 | 13,686.89 | −79.53
57 | 52 | Published + Disease modifying full | 13 | 13,683.10 | −3.79
55 | 50 | Published + Disease modifying IIV base | 13 | 13,182.59 | −583.82
58 | 55 | Published + Disease modifying IIV full | 15 | 13,180.55 | −2.05
Table A2. Summary of the models fitted to the natural history data for the MAPD approach (n = 1). dOFV: difference in OFV between the model and its reference (Ref), IIV: inter-individual variability, OFV: objective function value, Prm_nb: number of estimated parameters, RUV: residual unexplained variability.
Run_nb | Ref | Description | Prm_nb | OFV | dOFV
Published placebo model
2 | NA | Published placebo model | 11 | 13,768.66 | -
3 | 2 | Published + Offset | 12 | 13,767.64 | −1.02
4 | 2 | Published + Offset IIV | 13 | 13,746.54 | −22.12
5 | 2 | Published + Disease modifying | 12 | 13,765.02 | −3.64
6 | 2 | Published + Disease modifying IIV | 13 | 13,635.12 | −133.54
Published placebo model with t-distribution transformation
19 | NA | Alternative placebo model | 12 | 13,708.54 | -
203 | 19 | Pub t-dist + Offset | 13 | 13,707.82 | −0.72
204 | 19 | Pub t-dist + Offset IIV | 14 | 13,691.71 | −16.83
207 | 19 | Pub t-dist + Disease modifying | 13 | 13,705.21 | −3.33
208 | 19 | Pub t-dist + Disease modifying IIV | 14 | 13,585.26 | −123.28
Published placebo with IIV on RUV
240 | NA | Alternative placebo model | 12 | 13,708.51 | -
241 | 240 | Pub IIV on RUV + Offset | 13 | 13,708.01 | −0.5
242 | 240 | Pub IIV on RUV + Offset IIV | 14 | 13,708.01 | −0.5
243 | 240 | Pub IIV on RUV + Disease modifying | 13 | 13,707.30 | −1.21
244 | 240 | Pub IIV on RUV + Disease modifying IIV | 14 | 13,639.67 | −68.84
Published placebo model with Boxcox transformation
250 | NA | Alternative placebo model | 12 | 13,711.92 | -
251 | 250 | Pub Boxcox + Offset | 13 | 13,710.97 | −0.94
252 | 250 | Pub Boxcox + Offset IIV | 14 | 13,692.13 | −19.78
253 | 250 | Pub Boxcox + Disease modifying | 13 | 13,709.59 | −2.32
254 | 250 | Pub Boxcox + Disease modifying IIV | 14 | 13,590.41 | −121.51
Published placebo with time-exponential
270 | NA | Alternative placebo model | 12 | 13,745.04 | -
271 | 270 | Pub time-exp + Offset | 13 | 13,738.74 | −6.3
272 | 270 | Pub time-exp + Offset IIV | 14 | 13,724.31 | −20.73
273 | 270 | Pub time-exp + Disease modifying | 13 | 13,737.09 | −7.95
274 | 270 | Pub time-exp + Disease modifying IIV | 14 | 13,616.90 | −128.14

Appendix D.2. rcLRT Models Fitted on Data Modified by the Addition of a Simulated Treatment Effect

Table A3. Summary of the rcLRT models fitted to the data modified by the addition of various simulated treatment effects (n = 1). CV: coefficient of variation, dOFV: difference in OFV between the model and its reference (Ref), IIV: inter-individual variability, OFV: objective function value, Prm_nb: number of estimated parameters, TDE: typical drug effect.
Run_nb | Ref | Description | Prm_nb | OFV | dOFV
Data modified by the addition of offset drug model, TDE = 2
44 | NA | Published placebo model | 11 | 13,844.33 | -
170 | 44 | Published plb + Offset | 12 | 13,765.80 | −78.53
171 | 44 | Published plb + Offset IIV | 13 | 13,761.33 | −83
172 | 44 | Published plb + Time linear | 12 | 13,834.29 | −10.04
173 | 44 | Published plb + Time linear IIV | 13 | 13,830.05 | −14.28
Data modified by the addition of offset IIV drug model, TDE = 2 with 30%CV
24 | NA | Published placebo model | 11 | 13,858.20 | -
175 | 24 | Published plb + Offset | 12 | 13,782.99 | −75.21
174 | 24 | Published plb + Offset IIV | 13 | 13,776.00 | −82.2
176 | 24 | Published plb + Time linear | 12 | 13,848.41 | −9.79
177 | 24 | Published plb + Time linear IIV | 13 | 13,781.10 | −77.1
Data modified by the addition of time-linear drug model, TDE = 2
74 | NA | Published placebo model | 11 | 13,768.85 | -
179 | 74 | Published plb + Offset | 12 | 13,768.84 | −0.01
180 | 74 | Published plb + Offset IIV | 13 | 13,763.69 | −5.16
178 | 74 | Published plb + Time linear | 12 | 13,765.69 | −3.16
181 | 74 | Published plb + Time linear IIV | 13 | 13,762.61 | −6.24
Data modified by the addition of time-linear IIV drug model, TDE = 2 with 30%CV
64 | NA | Published placebo model | 11 | 13,771.71 | -
183 | 64 | Published plb + Offset | 12 | 13,771.70 | −0.01
184 | 64 | Published plb + Offset IIV | 13 | 13,766.42 | −5.3
185 | 64 | Published plb + Time linear | 12 | 13,768.81 | −2.91
182 | 64 | Published plb + Time linear IIV | 13 | 13,765.42 | −6.29
Data modified by the addition of offset drug model, TDE = 8
360 | NA | Published placebo model | 11 | 15,077.75 | -
364 | 360 | Published plb + Offset | 12 | 13,765.80 | −1311.95
368 | 360 | Published plb + Offset IIV | 13 | 13,761.33 | −1316.41
372 | 360 | Published plb + Time linear | 12 | 14,857.43 | −220.32
376 | 360 | Published plb + Time linear IIV | 13 | 14,845.45 | −232.3
Data modified by the addition of offset IIV drug model, TDE = 8 with 30%CV
361 | NA | Published placebo model | 11 | 15,185.64 | -
365 | 361 | Published plb + Offset | 12 | 13,981.87 | −1203.77
369 | 361 | Published plb + Offset IIV | 13 | 13,917.96 | −1267.68
373 | 361 | Published plb + Time linear | 12 | 14,987.39 | −198.25
377 | 361 | Published plb + Time linear IIV | 13 | 14,961.95 | −223.69
Data modified by the addition of time-linear drug model, TDE = 8
362 | NA | Published placebo model | 11 | 13,893.49 | -
366 | 362 | Published plb + Offset | 12 | 13,878.02 | −15.47
370 | 362 | Published plb + Offset IIV | 13 | 13,873.39 | −20.1
374 | 362 | Published plb + Time linear | 12 | 13,765.69 | −127.8
378 | 362 | Published plb + Time linear IIV | 13 | 13,762.61 | −130.88
Data modified by the addition of time-linear IIV drug model, TDE = 8 with 30%CV
363 | NA | Published placebo model | 11 | 13,916.08 | -
367 | 363 | Published plb + Offset | 12 | 13,902.06 | −14.02
371 | 363 | Published plb + Offset IIV | 13 | 13,896.40 | −19.68
375 | 363 | Published plb + Time linear | 12 | 13,799.26 | −116.82
379 | 363 | Published plb + Time linear IIV | 13 | 13,792.27 | −123.81

Appendix D.3. IMA Models Fitted on Data Modified by the Addition of a Simulated Treatment Effect

Table A4. Summary of the IMA models fitted to the data modified by the addition of various simulated treatment effects (n = 1). CV: coefficient of variation, dOFV: difference in OFV between the model and its reference (Ref), IIV: inter-individual variability, OFV: objective function value, Prm_nb: number of estimated parameters, TDE: typical drug effect.
Run_nb | Ref | Description | Prm_nb | OFV | dOFV
Data modified by the addition of offset drug model, TDE = 2
312 | NA | Published plb + Offset base | 12 | 13,807.81 | -
313 | 312 | Published plb + Offset full | 13 | 13,759.81 | −48
324 | 312 | Published plb + Offset IIV base | 13 | 13,721.88 | −85.93
325 | 324 | Published plb + Offset IIV full | 14 | 13,712.10 | −9.78
326 | 312 | Published plb + Time linear base | 12 | 13,843.79 | 35.98
327 | 326 | Published plb + Time linear full | 13 | 13,834.31 | −9.49
328 | 326 | Published plb + Time linear IIV base | 13 | 13,629.18 | −214.61
329 | 328 | Published plb + Time linear IIV full | 14 | 13,627.83 | −1.35
Data modified by the addition of offset IIV drug model, TDE = 2 with 30%CV
304 | NA | Published plb + Offset base | 12 | 13,822.38 | -
305 | 304 | Published plb + Offset full | 13 | 13,776.66 | −45.72
302 | 304 | Published plb + Offset IIV base | 13 | 13,732.38 | −90
303 | 304 | Published plb + Offset IIV full | 14 | 13,722.67 | −99.7
306 | 302 | Published plb + Time linear base | 12 | 13,857.50 | 125.11
307 | 306 | Published plb + Time linear full | 13 | 13,848.43 | −9.07
308 | 306 | Published plb + Time linear IIV base | 13 | 13,641.31 | −216.19
309 | 308 | Published plb + Time linear IIV full | 14 | 13,639.74 | −1.57
Data modified by the addition of time-linear drug model, TDE = 2
816 | NA | Published plb + Offset base | 12 | 13,766.88 | -
817 | 816 | Published plb + Offset full | 13 | 13,766.46 | −0.42
818 | 816 | Published plb + Offset IIV base | 13 | 13,690.94 | −75.94
819 | 818 | Published plb + Offset IIV full | 14 | 13,689.29 | −1.64
812 | 816 | Published plb + Time linear base | 12 | 13,768.85 | 1.97
813 | 812 | Published plb + Time linear full | 13 | 13,765.70 | −3.15
808 | 812 | Published plb + Time linear IIV base | 13 | 13,557.07 | −211.77
809 | 808 | Published plb + Time linear IIV full | 14 | 13,556.76 | −0.32
Data modified by the addition of time-linear IIV drug model, TDE = 2 with 30%CV
852 | NA | Published plb + Offset base | 12 | 13,769.77 | -
853 | 852 | Published plb + Offset full | 13 | 13,769.39 | −0.38
854 | 852 | Published plb + Offset IIV base | 13 | 13,693.59 | −76.18
855 | 854 | Published plb + Offset IIV full | 14 | 13,691.98 | −1.61
850 | 852 | Published plb + Time linear base | 12 | 13,771.71 | 1.94
851 | 850 | Published plb + Time linear full | 13 | 13,768.82 | −2.9
762 | 850 | Published plb + Time linear IIV base | 13 | 13,558.84 | −212.87
763 | 762 | Published plb + Time linear IIV full | 14 | 13,558.49 | −0.35
Data modified by the addition of offset drug model, TDE = 8
552 | NA | Published plb + Offset base | 12 | 14,303.38 | -
553 | 552 | Published plb + Offset full | 13 | 13,710.38 | −593.01
554 | 552 | Published plb + Offset IIV base | 13 | 14,283.48 | −19.9
555 | 554 | Published plb + Offset IIV full | 14 | 13,700.30 | −583.18
556 | 552 | Published plb + Time linear base | 12 | 15,040.48 | 737.1
557 | 556 | Published plb + Time linear full | 13 | 14,854.14 | −186.34
558 | 556 | Published plb + Time linear IIV base | 13 | 14,933.37 | −107.11
559 | 558 | Published plb + Time linear IIV full | 14 | 14,776.40 | −156.97
Data modified by the addition of offset IIV drug model, TDE = 8 with 30%CV
524 | NA | Published plb + Offset base | 12 | 14,402.80 | -
525 | 524 | Published plb + Offset full | 13 | 13,911.77 | −491.03
522 | 524 | Published plb + Offset IIV base | 13 | 14,328.56 | −74.24
523 | 522 | Published plb + Offset IIV full | 14 | 13,845.67 | −482.89
526 | 522 | Published plb + Time linear base | 12 | 15,144.65 | 816.08
527 | 526 | Published plb + Time linear full | 13 | 14,984.37 | −160.27
528 | 526 | Published plb + Time linear IIV base | 13 | 15,027.12 | −117.52
529 | 528 | Published plb + Time linear IIV full | 14 | 14,877.69 | −149.43
Data modified by the addition of time-linear drug model, TDE = 8
844 | NA | Published plb + Offset base | 12 | 13,893.24 | -
845 | 844 | Published plb + Offset full | 13 | 13,875.75 | −17.48
846 | 844 | Published plb + Offset IIV base | 13 | 13,818.27 | −74.96
847 | 846 | Published plb + Offset IIV full | 14 | 13,816.83 | −1.44
842 | 844 | Published plb + Time linear base | 12 | 13,885.17 | −8.06
843 | 842 | Published plb + Time linear full | 13 | 13,762.09 | −123.08
848 | 842 | Published plb + Time linear IIV base | 13 | 13,783.23 | −101.94
849 | 848 | Published plb + Time linear IIV full | 14 | 13,687.60 | −95.63
Data modified by the addition of time-linear IIV drug model, TDE = 8 with 30%CV
784 | NA | Published plb + Offset base | 12 | 13,915.84 | -
785 | 784 | Published plb + Offset full | 13 | 13,899.91 | −15.93
786 | 784 | Published plb + Offset IIV base | 13 | 13,840.96 | −74.88
787 | 786 | Published plb + Offset IIV full | 14 | 13,839.97 | −0.99
788 | 784 | Published plb + Time linear base | 12 | 13,907.35 | −8.49
789 | 788 | Published plb + Time linear full | 13 | 13,795.58 | −111.77
782 | 788 | Published plb + Time linear IIV base | 13 | 13,798.92 | −108.43
783 | 782 | Published plb + Time linear IIV full | 14 | 13,705.10 | −93.82

Figure 1. Workflow illustration of the different methods.
Figure 2. Panel (A) illustrates the type I error results for the non-model-averaging approaches: the colored dots and the associated black boxplot correspond to the distribution of the dOFV of the H1 hypothesis selected by the AIC selection step in each of the 100 trials. The distribution of the critical value used for the statistical test for each approach is indicated by the red boxplot. Panel (B) illustrates the proportion of the total relative weight associated with either the H0 or the H1 hypothesis.
Table 1. Alternative disease models for the model averaging across placebo and drug models approach.
Modified Component | Modification
Structural model | Time-exponential
RUV model | IIV on RUV
IIV model | Boxcox transformation of η1
IIV model | t-distribution of η1
IIV: Inter-individual variability, RUV: Residual unexplained variability.
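As stated in the abstract, selection and averaging among the candidate models were driven by AIC. A minimal sketch of the Akaike weights commonly used to combine candidate fits is given below; the OFV values and parameter counts are made up for illustration.

```python
import math

def akaike_weights(ofvs, n_params):
    """Akaike weights from OFVs (-2 log-likelihood) and parameter counts."""
    aics = [ofv + 2 * k for ofv, k in zip(ofvs, n_params)]
    best = min(aics)
    raw = [math.exp(-(a - best) / 2) for a in aics]  # relative likelihoods
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical candidate fits: OFV and number of estimated parameters
weights = akaike_weights([13768.7, 13746.5, 13635.1], [11, 13, 13])
```

The weights sum to one, and the model with the lowest AIC receives the largest weight.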
Table 2. Type I error per approach using the real natural history data (N = 100).
Approach | Placebo Model | Type I Error (%) [1.64–11.28% *]
STDs | Published | 100
SSs | Published | 17
cLRT | Published | 100
rcLRT | Published | 6
MAD | Published | 100 †
MAPD | Pre-selected set | 100 †
* 2.5th and 97.5th percentiles of a binomial distribution with a probability of success of 5% on 100 trial replicates. † Average of the percentage of the relative weights assigned to any H1.
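The acceptance interval quoted in the column header can be reproduced in pure Python, assuming it is the exact (Clopper-Pearson) 95% interval for 5 events out of 100 replicates; the sketch below solves the binomial tail equations by bisection.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for x successes in n."""
    def solve(f):                       # bisection; f is decreasing in p
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # lower bound solves P(X >= x | p) = alpha/2, upper solves P(X <= x | p) = alpha/2
    lower = 0.0 if x == 0 else solve(lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)))
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper

low, high = clopper_pearson(5, 100)   # ~0.0164, ~0.1128
```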
Table 3. Power and RMSE for the approaches with controlled type I error, on data modified by the addition of simulated treatment effects, for the eight investigated scenarios (N = 100). RMSE: root mean squared error, IIV: inter-individual variability.
Simulation Model | Typical Treatment Effect | rcLRT Power (%) | rcLRT RMSE | IMA Power (%) | IMA RMSE
Offset | 2 | 100 | 0.26 | 37 | 0.83
Offset IIV | 2 | 100 | 0.42 | 33 | 0.75
Time-linear | 2 | 6 | 1.29 | 63 | 1.54
Time-linear IIV | 2 | 6 | 1.29 | 67 | 1.58
Offset | 8 | 100 | 0.26 | 100 | 0.48
Offset IIV | 8 | 100 | 0.29 | 100 | 0.41
Time-linear | 8 | 100 | 0.55 | 100 | 0.57
Time-linear IIV | 8 | 100 | 0.57 | 100 | 0.61
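For reference, the RMSE column summarizes the spread of the estimated typical treatment effect around its true value across the N = 100 replicates; a minimal sketch (the estimates below are made up):

```python
import math

def rmse(estimates, true_value):
    """Root mean squared error of replicate estimates around the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

# Hypothetical treatment-effect estimates around a true typical effect of 2
error = rmse([1.8, 2.3, 2.1, 1.7], 2.0)
```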

