1. Introduction
The debate concerning whether placebo treatment works for medical conditions has continued since Beecher’s landmark paper “The Powerful Placebo” was published in 1955 [1]. For the regular clinical setting, expert consensus holds that placebo effects should be considered part of the treatment and maximized to improve treatment outcomes [2].
On the other hand, placebos are used in clinical trials designed to assess the clinical effects of new treatment approaches in comparison with placebo. Placebo-controlled, double-blind, randomized trials are considered the gold standard for achieving a high level of evidence in such comparisons [3]. However, patients receiving placebos often experience symptomatic improvement, particularly for subjective rather than objective outcomes [4]. Thus, it is not surprising that clinical trials, such as those for major depressive disorder, are often confronted with a high variability of placebo response, which complicates the interpretation of clinical outcomes. As a consequence, approximately half of the trials of newer marketed antidepressants in the US Food and Drug Administration (FDA) database failed to demonstrate superiority over placebo [5].
However, such variations of placebo responses also occur in other chronic diseases. For inflammatory bowel disease, for example, the placebo rates of clinical improvement and remission vary between 5 and 50% in Crohn’s disease and between 10 and 35% in ulcerative colitis [6]. Interestingly, comparable to the placebo effects in trials of neuropsychiatric disorders, the placebo response in trials of rheumatoid arthritis (RA) has risen over the last two decades [7]. In 32 selected placebo-controlled trials on the effects of biological and targeted synthetic disease-modifying agents, an increase in placebo ACR50 and ACR70 responses was reported, which remained significant after controlling for potential confounders. The authors attributed this effect to a possible shift in the RA phenotype, changes in trial design, and expectation bias. Indeed, outcome scores for rheumatic diseases including RA, psoriatic arthritis (PsA), and axial spondyloarthritis (SpA) include subjective assessments of disease activity.
Thus, placebo effects may vary not only in neuropsychiatric but also in immune-mediated diseases. The aim of this meta-analysis is to further analyze the effects of placebos on disease activities in clinical trials with RA, PsA, and SpA, and to examine the placebo effects on power calculations of clinical trials in the most prominent rheumatic diseases.
2. Literature Search and Methods
2.1. Used Guidelines
Study selection, assessment of eligibility criteria, data extraction, and statistical analysis were performed in accordance with the methodology guidelines from Cochrane [8]. The findings are reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [9].
2.2. Data Sources and Searches
To identify all relevant publications about the placebo effect in chronic arthritic diseases, a systematic literature search was performed in the bibliographic databases Medline (PubMed) and The Cochrane Library (via Wiley). Double-blind placebo-controlled randomized clinical trials, clinical trials, evaluation studies, and validation studies conducted in rheumatoid arthritis (RA), axial spondyloarthritis (SpA), and psoriatic arthritis (PsA) were searched. Juvenile rheumatoid arthritis (jRA) was not included in this review, to avoid an age-related bias, and because of the low number of available studies. The final search was restricted to full-text articles that were written in English and published between 1990 and November 2018 in peer-reviewed journals.
2.3. Study Selection
To be included in the analysis, studies had to meet the following criteria: (1) be double-blind placebo-controlled randomized clinical trials, clinical trials, evaluation studies, or validation studies; (2) include adult participants with RA, SpA, or PsA; (3) compare placebo with active treatments including non-steroidal anti-inflammatory drugs (NSAIDs), biological disease-modifying anti-rheumatic drugs (bDMARDs), conventional synthetic DMARDs (csDMARDs), targeted synthetic DMARDs (tsDMARDs), and other immunosuppressant agents; (4) include at least 50 participants in each study group (placebo/active comparator); (5) report objectively measured disease-specific and clinically relevant primary endpoints (see Table S1). Studies without the required study design (e.g., meta-analyses, single-blind studies, long-term extension studies that were open-label or lacked a placebo group, and post hoc analyses) and duplicates were excluded. In particular, studies not meeting these primary endpoints, for example, those with nonclinical and radiographic outcomes, or with quality of life, biomarkers, or pharmacokinetics as outcomes, were excluded.
Figure S1 summarizes the steps of the selection process.
2.4. Risk of Bias
A simple risk-of-bias assessment of individual studies was performed by two independent reviewers. Trials were defined as having a low risk of bias if they fulfilled the following three criteria: (i) adequate concealment of allocation, (ii) inclusion of at least 50 patients per study group, and (iii) a dropout rate of less than 15%.
Additionally, the authors had no selection bias and were not supported by any of the pharmaceutical companies whose products were used in the included clinical studies.
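The three low-risk criteria listed in this section can be expressed as a simple rule. The following Python sketch is for illustration only and is not the procedure the reviewers used; the `Trial` record, its field names, and the function `is_low_risk` are hypothetical, and the dropout rate is assumed to be stored as a fraction between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Hypothetical per-trial record; field names are assumptions."""
    allocation_concealed: bool   # (i) adequate concealment of allocation
    n_placebo: int               # participants in the placebo group
    n_active: int                # participants in the active comparator group
    dropout_rate: float          # fraction of participants lost, 0.0 to 1.0


def is_low_risk(trial: Trial) -> bool:
    """Return True only if all three low-risk criteria are fulfilled."""
    large_enough = min(trial.n_placebo, trial.n_active) >= 50   # (ii)
    low_dropout = trial.dropout_rate < 0.15                     # (iii)
    return trial.allocation_concealed and large_enough and low_dropout
```

A trial failing any single criterion is not classified as low risk under this rule.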
2.5. Data Collection Process
A data extraction form was developed for data collection. One reviewer (K.R.) extracted and selected the data, whereas a second reviewer (M.S.) was consulted when necessary, and doubts were discussed until consensus was reached. For the selected studies, the extracted data included: first author, name of trial, year and month of publication, study sites, number of patients in each study group (placebo/active comparator), drug class of the active treatment being investigated (NSAIDs, bDMARDs, csDMARDs, tsDMARDs, others), route of administration of the study drug (oral/parenteral), baseline characteristics of the placebo group (percentage of female participants, percentage of participants of Caucasian ethnicity, mean age, mean duration of symptoms/disease, concomitant and prior medication), results in clinical outcomes (depending on the specific disease), and time point of outcome measurement (possibly more than one time point). The time point at which the primary endpoint was assessed was defined as the duration of the trial. Study sites were reported as the continent(s) in which the studies were conducted. For studies investigating more than one dosage of the active study drug, results in clinical outcomes were reported as the mean of the results of all the active comparator groups.
2.6. Data Synthesis and Analysis
Information on each publication was collected in an Excel sheet and then transferred into SPSS (IBM Corp. Released 2017. IBM SPSS Statistics for Macintosh, Version 25.0. Armonk, NY, USA: IBM Corp.) and R (R Core Team, 2021) for further analyses and comparisons.
Descriptive statistics were calculated using unweighted data. The Kolmogorov–Smirnov test, chi-square test, and binomial test were performed to analyze whether the variables describing baseline characteristics of the placebo groups followed a normal distribution. Non-normally distributed variables were then compared using the Mann–Whitney U test for independent samples, whereas normally distributed variables were compared using the t-test for independent samples.
Analyses of correlations and linear regressions between clinical endpoints and baseline characteristics of placebo groups were performed for studies conducted in the same disease after weighting of data according to the number of participants in the placebo groups.
For bivariate correlations, the Pearson correlation coefficient and p-values are reported for significant correlations with p < 0.05 and |r| > 0.3. R², p-values for ANOVA, and regression coefficients (beta coefficients: b0 = intercept; b1 = slope) were calculated using linear regressions.
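A participant-weighted correlation and simple linear regression of the kind described here can be sketched as follows. This is an illustrative Python sketch, not the SPSS or R code used for the analysis; the function name `weighted_fit` and its arguments are hypothetical, the weights are assumed to be the placebo group sizes, and p-values are deliberately not computed.

```python
from math import sqrt


def weighted_fit(x, y, w):
    """Weighted Pearson r and weighted least-squares line y = b0 + b1 * x.

    x: baseline characteristic per study (e.g., % DMARD-naive)
    y: clinical endpoint per study (e.g., placebo ACR50 in %)
    w: weight per study (assumed here: placebo group size)
    Returns (r, b0, b1, r_squared).
    """
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("x, y and w need equal length of at least 2")
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    if var_x == 0 or var_y == 0:
        raise ValueError("x and y must each vary across studies")
    r = cov / sqrt(var_x * var_y)
    b1 = cov / var_x          # slope
    b0 = my - b1 * mx         # intercept
    return r, b0, b1, r * r   # for one predictor, R^2 equals r^2
```

With a single predictor, the weighted R² equals the square of the weighted correlation coefficient, which is why one function returns both.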
2.7. Sample Size Calculations
Sample size calculations were performed with the SAS system (SAS Institute Inc., Cary, NC, USA, 2013), using the POWER procedure with Fisher’s exact conditional test for two proportions, an alpha of 0.05, and a nominal statistical power of 0.9. Sample sizes were calculated for exemplary randomized placebo-controlled phase III trials, assuming that response rates from phase I/II studies were available.
2.8. Role of the Funding Source
Funding was provided for publication fees only.
4. Discussion
Although the literature on placebo effects in various diseases is growing, this is the first study to systematically analyze the placebo effect on disease activity in the clinical context of several chronic arthritic diseases. There are multiple disease-modifying agents for these diseases and a range of outcomes that include subjective (e.g., assessment of disease activity and pain) and objective measurements (e.g., laboratory and radiographic biomarkers). To assess the evidence for the effects of new treatments, the treatment approaches have to be tested formally in double-blind, randomized, placebo-controlled trials. This provides an excellent opportunity to explore the placebo effect and its determinants in chronic arthritic diseases.
This analysis examined the influence of placebo effects in treatment studies with placebo and active comparator groups of patients with RA, PsA, and axial SpA. The literature search yielded 152 eligible studies, with a total of 21,616 participants included in the placebo groups. To identify determinants of higher response in placebo groups, this study aimed to identify any strong, significant correlations between baseline characteristics of the placebo groups and clinically relevant endpoints. Analyses of bivariate correlations and linear regressions revealed two major determinants of the placebo effect in chronic arthritic diseases, namely, (i) DMARD naïvety and (ii) early stage of disease.
In clinical studies conducted in RA, placebo group participants who were completely DMARD-naïve before entry into the clinical trial showed an ACR20 response of 73.0%, an ACR50 of 57.8%, and an ACR70 of 45.0% (compared to an ACR20 of 85.5%, ACR50 of 70.9%, and ACR70 of 65.0% in the active comparator groups, respectively). This is quite impressive, considering that ACR50 implies an improvement of symptoms of at least 50%. In contrast, placebo group participants who were previously treated with DMARDs (DMARD-naïvety of 0%) showed an ACR20 of 21.3%, an ACR50 of 7.8%, and an ACR70 of only 2.4% (compared to an ACR20 of 50.8%, ACR50 of 26.5%, and ACR70 of 11.5% in the active comparator group). Unfortunately, information on the proportions of DMARD-naïve study participants was not given in the included studies conducted in SpA and PsA. However, any prior bDMARD treatment of patients with PsA correlated negatively with ACR70 scores in the placebo groups, again indicating that the absence of prior experience with treatments is associated with more neutral expectations towards treatment response, and thus critically determines the magnitude of the placebo effect.
The second determinant of increased response in placebo groups of studies conducted in chronic arthritic diseases is an early stage of disease. Whereas only a moderate positive correlation could be observed for placebo group participants in RA achieving high ACR20 scores (R² of 0.713), a strong positive correlation was observed for the proportions of patients achieving clinically relevant ACR50 and ACR70 responses (R² of 0.972 and 0.970, respectively) when the disease duration for all study participants was less than two years. Here, an ACR50 of 42.3% and an ACR70 of 27.3% could be achieved in the placebo groups, whereas the active comparator groups yielded 57.4% and 42.6%, respectively. When only around 20% of patients had an early disease stage, the percentages declined to 9.6 for ACR50 and 2.9 for ACR70 in the placebo groups (versus 28.7 and 13.5 in the active comparator groups, respectively). Similarly, longer disease duration was associated with significantly lower ASAS response criteria in the placebo groups of clinical studies conducted in SpA. A meta-analysis of randomized controlled trials in fibromyalgia also observed lower placebo effect sizes in trials of participants with longer mean disease duration and consistently concluded that early intervention in fibromyalgia is more likely to give a good outcome [11]. Taken together, these results reflect the importance of early treatment, which significantly increases the chances of achieving good clinical outcomes. In fact, the longer a patient is affected by a disease, the more complicated the disease becomes; a patient’s expectancies may decrease, and as a result, it becomes harder to improve outcomes by active treatment, placebo, or any other factors that influence contextual response.
In addition to the clinical relevance of the presented data, it is also important to raise awareness of the underestimated impact of the placebo effect on sample size calculations. For example, if phase I/II studies conducted in RA suggest a response rate of 70% for a new active drug, the calculated number of patients for each arm of a randomized placebo-controlled phase III trial is 496 if 100% of patients are DMARD-naïve (with an expected ACR50 response of 60% in the placebo arm) (see Figure 3). However, the calculated number of required study participants for each arm of the trial is only 16 if all RA patients are pre-treated with a DMARD (0% DMARD-naïve; with an expected ACR50 response of 10% in the placebo arm). Similarly, 140 patients per study arm are needed when the patients were diagnosed with RA less than two years ago (with an expected ACR50 response of 40% in the placebo group and 60% in the active comparator group), compared to only 21 when the diagnosis was made longer ago (expected ACR50 of 10%).
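The dependence of the required sample size on the expected placebo response can be illustrated with a short calculation. The Python sketch below is not the SAS POWER procedure used in this analysis: it assumes the classical normal-approximation formula for comparing two proportions, with a continuity correction as a stand-in for Fisher’s exact conditional test, so its results may differ from the SAS output by a few patients per arm. The function name `n_per_arm` and its parameter names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_active, p_placebo, alpha=0.05, power=0.9):
    """Approximate patients per arm for a two-sided test of two proportions.

    Assumption: normal approximation with continuity correction,
    used here only as a rough substitute for Fisher's exact test.
    """
    if p_active == p_placebo:
        raise ValueError("response rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)            # quantile for target power
    delta = abs(p_active - p_placebo)
    p_bar = (p_active + p_placebo) / 2              # pooled response rate
    var_sum = p_active * (1 - p_active) + p_placebo * (1 - p_placebo)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar)) + z_beta * sqrt(var_sum)) ** 2
    n /= delta ** 2                                 # uncorrected sample size
    n_cc = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2   # continuity correction
    return ceil(n_cc)


if __name__ == "__main__":
    # Scenarios taken from the text: (active response, placebo response)
    for p_act, p_pla in [(0.70, 0.60), (0.70, 0.10), (0.60, 0.40), (0.60, 0.10)]:
        print(p_act, p_pla, n_per_arm(p_act, p_pla))
```

Under this approximation, the required number of patients per arm grows sharply as the expected placebo response approaches the response of the active drug, which is the pattern described above.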
Furthermore, the difference in sample size is not only crucial for the sponsor conducting the clinical trial; large placebo responses in such trials may also keep effective medication from reaching the market. In this context, it might be necessary to shift from patient-reported towards more objective outcome measurements. However, a recent meta-analysis revealed that both objective and subjective outcome measures in the placebo arms of RA trials improved to a clinically meaningful extent, at least within the five clinical trials included in that analysis. The authors of this meta-analysis therefore concluded that the observed placebo responses may be more than just a psychological phenomenon [12]. It is therefore becoming essential to improve our understanding of the underlying mechanisms.
Limitations
Although the systematic literature search identified a large number of eligible studies, there are several limitations to this analysis. First, it is well known that there is a publication bias against negative results. Second, this analysis used only aggregate data and not individual patient data of the placebo groups. Therefore, it was not possible to investigate any correlations between clinical outcome and individual patient characteristics, including burden of disease, expectations, beliefs, anticipation of clinical improvement, and attitude towards the therapy and the medical staff. These parameters are well-known determinants of the placebo effect [13]. Additionally, as the studies included in the analysis were heterogeneous, conducted under different conditions, and assessed different clinical outcomes, cross-disease comparisons were limited to variables that were available for all studies. For the same reason, it was not possible to conduct a meaningful multivariate analysis. Furthermore, only 7.11% of all studies identified by the applied search strategy were considered eligible for further analysis. The selection of only studies addressing clinical outcomes as a primary endpoint was driven by previous findings that placebos are ineffective for almost all objective outcomes (e.g., radiographic progression) [14]. The analysis was restricted to studies with at least 50 participants per group because the sample size (i) significantly reduces the risk of bias and (ii) is a major determinant of placebo effects in osteoarthritis [14]. Only studies published after 1990 were included since they were considered more likely to fulfill current standards of clinical trial design and therefore to be more suitable for comparisons. A further limitation of this study is the lack of three-armed trials that include a non-treatment control group. Although such an arm is essential to distinguish improvements in the placebo group from phenomena such as spontaneous remission or regression to the mean, which are often mistakenly understood as placebo effects [15], a trial arm without any treatment has to be considered unethical [16].