**Worldwide Review and Meta-Analysis of Cohort Studies Measuring the Effect of Mammography Screening Programmes on Incidence-Based Breast Cancer Mortality**

### **Amanda Dibden <sup>1</sup>, Judith Offman <sup>2</sup>, Stephen W. Duffy <sup>1,</sup>\* and Rhian Gabe <sup>1</sup>**


Received: 12 March 2020; Accepted: 13 April 2020; Published: 15 April 2020

**Abstract:** In 2012, the Euroscreen project published a review of incidence-based mortality evaluations of breast cancer screening programmes. In this paper, we update this review to October 2019 and expand its scope from Europe to worldwide. We carried out a systematic review of incidence-based mortality studies of breast cancer screening programmes, and a meta-analysis of the estimated effects of both invitation to screening and attendance at screening, with adjustment for self-selection bias, on incidence-based mortality from breast cancer. We found 27 valid studies. The results of the meta-analysis showed a significant 22% reduction in breast cancer mortality with invitation to screening, with a relative risk of 0.78 (95% CI 0.75–0.82), and a significant 33% reduction with actual attendance at screening (RR 0.67, 95% CI 0.61–0.75). Breast cancer screening in the routine healthcare setting continues to confer a substantial reduction in mortality from breast cancer.

**Keywords:** breast cancer; screening; mammography; incidence-based mortality

#### **1. Introduction**

Reviews of randomised controlled trials (RCTs) of mammography screening estimate that invitation to screening reduces risk of death from breast cancer by around 20% [1,2]. However, as the majority of RCTs were carried out over 30 years ago, they do not take account of changes in breast cancer incidence [3], mortality [4], screening techniques and treatments that have occurred over time. Furthermore, the results may not be representative of the effectiveness of individual population mammography screening programmes [5], which are affected by factors such as varying round lengths, radiographer skill and technology [6]. While RCTs provide reliable evidence and proof of principle that mammography screening is likely to be beneficial, once population screening programmes have been introduced, randomisation to a non-interventional control is no longer ethical and it is necessary to measure the effectiveness of screening in practice through observational studies.

Cohort studies have been used to achieve this objective, but there can be important, subtle differences in the methods of analysis used. One method is to use incidence-based mortality (IBM) [7], where deaths from breast cancer are only included if they occur in women diagnosed after screening has been introduced [8]. This avoids contamination by deaths that occur during the screening period in women who were diagnosed before screening started, which would bias results against screening [9]. The aim of this review is to provide an overview of all IBM studies evaluating the impact of mammography screening on breast cancer mortality and to establish an up-to-date estimate of the long-term benefit of breast screening.

#### **2. Materials and Methods**

#### *2.1. Search Strategy*

A systematic search of PubMed was performed in October 2019 with search terms based on those used by Njor et al. in their review of European IBM studies (Euroscreen project) [8]. Inclusion criteria were that (i) the study used IBM in the analysis, (ii) the study outcome was breast cancer mortality and (iii) the paper was in English. No restrictions were placed on the age of study participants, so as to include as many studies, and hence women, as possible.

#### *2.2. Selection of Sources of Evidence*

The titles and abstracts were initially assessed for relevance. A random selection of 100 papers was independently assessed by three reviewers (A.D., S.W.D. and J.O.) for accuracy. Following observation of more than 90% agreement among reviewers, the remainder of the papers were assessed by one reviewer (A.D.). The main text of the potentially eligible papers was then assessed by two reviewers (A.D. and S.W.D.) in order to make a final decision regarding inclusion in the review. We prepared a list of variables to extract from each paper (if available). These included programme characteristics, person-years accrued and relative risks associated with invitation and/or exposure to screening, as well as the proportion participating in screening (the latter to assist in correction for self-selection bias).

#### *2.3. Statistical Methods*

Random effects meta-analyses were undertaken to obtain overall estimates of the effects of (i) invitation to screening and (ii) attendance at screening on risk of breast cancer mortality [10]. It is important to note that the effect of invitation pertains to populations offered screening, whereas the effect of attendance pertains to women who actually take up the offer of screening and is thus affected by the participation rate. Analyses were repeated stratified by age group: (i) 50 and over, (ii) under 50. We chose age 50 to stratify the data because the majority of studies reported on the effects of screening in women aged 50–69, reflecting many national screening programmes, and in order to provide separate evidence for women under 50 years where possible, as there has been uncertainty about whether screening younger women is effective and, hence, cost-effective. Heterogeneity between studies was assessed using the χ<sup>2</sup> test. Where studies used overlapping data, the largest study was chosen on the basis of better precision with a smaller variance.

Statistical analyses were conducted using Stata Version 13 (StataCorp, College Station, TX, USA) [11].
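The pooling step can be illustrated with a minimal Python sketch. It is not the Stata code used for the analyses, and it assumes the DerSimonian-Laird estimator, a common choice for random effects meta-analysis; the text does not state which estimator was used. The function names are hypothetical. The inputs are the invitation estimates quoted as range endpoints in Section 3.2, not the full set of studies, so the output is not expected to reproduce the pooled estimates reported in this paper.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_rr_and_se(rr, lower, upper):
    """Return the log relative risk and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(rr), se


def random_effects_pool(studies):
    """Pool (rr, lower, upper) tuples; DerSimonian-Laird is an assumption."""
    logs, ses = zip(*(log_rr_and_se(*s) for s in studies))
    w = [1 / se ** 2 for se in ses]  # inverse-variance (fixed effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, logs)) / sum(w)
    # Cochran's Q, the chi-squared heterogeneity statistic on df = k - 1
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logs))
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if df > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for se in ses]  # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, logs)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "lower": math.exp(pooled - Z_95 * se_pooled),
        "upper": math.exp(pooled + Z_95 * se_pooled),
        "q": q,
        "df": df,
        "tau2": tau2,
    }


# Illustrative subset only: invitation estimates quoted in Section 3.2
invitation = [
    (0.72, 0.64, 0.79),  # [41]
    (0.81, 0.64, 1.01),  # [47]
    (0.58, 0.44, 0.75),  # [56]
    (0.83, 0.73, 0.95),  # [18]
    (0.73, 0.63, 0.84),  # [55]
    (0.94, 0.68, 1.29),  # [53]
    (0.75, 0.63, 0.89),  # [50]
    (0.97, 0.73, 1.28),  # [25]
]
result = random_effects_pool(invitation)
print({key: round(value, 3) for key, value in result.items()})
```

Working on the log scale keeps the confidence limits symmetric around the estimate, and Cochran's Q is compared with a χ<sup>2</sup> distribution on k − 1 degrees of freedom to test for heterogeneity.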

#### Adjustment for Self-Selection Bias

Studies have shown that women who do not comply with an invitation to screening usually, but not invariably, have a higher risk of breast cancer mortality than those who choose to attend, resulting in a bias in favour of screening [12]. In order to account for such self-selection bias in studies reporting the effect of attending, we used the statistical adjustment proposed by Duffy et al. [13]. This uses the relative risk of attenders versus non-attenders from the current study, the participation rate, and the risk of death in non-attenders versus uninvited women from an appropriate external source (Table 1). The relative risk of non-attenders versus uninvited women of 1.17 (95% CI 1.08–1.26) reported in the Swedish Organised Service Screening Evaluation Group (SOSSEG) study was used in this review, as it was a large population-based service screening study investigating IBM [14].

**Table 1.** Statistical adjustments to estimate effect of invitation and attendance to screening.


*RRA* = the relative risk of breast cancer death associated with attending screening versus not attending; *p* = the proportion of women who attend screening; *Dr* = the relative risk of breast cancer death for non-attenders versus uninvited = 1.17; Formulae for the variance, and thus the 95% confidence intervals, of these estimates can be found elsewhere [13].
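A minimal sketch of the adjustment follows, assuming the formulae take the form that follows from the definitions above: the invited population is a mixture of attenders (risk *RRA* × *Dr*) and non-attenders (risk *Dr*), and the uninvited population is a mixture of would-be attenders and would-be non-attenders with an average risk of 1. The function name and the participation rate of 0.75 are hypothetical choices for illustration; the unadjusted relative risk of 0.54 is the pooled attendance estimate reported in Section 3.3. Variances are not computed here; see [13].

```python
def adjust_for_self_selection(rr_a, p, d_r=1.17):
    """Return (RR1, RR2) from the attender vs non-attender relative risk.

    rr_a : relative risk of breast cancer death, attenders vs non-attenders
    p    : proportion of invited women who attend screening
    d_r  : relative risk, non-attenders vs uninvited (1.17 from SOSSEG [14])

    Assumed forms, derived from the definitions in the text:
      RR1 = d_r * (p * rr_a + (1 - p))              invited vs uninvited
      RR2 = p * rr_a * d_r / (1 - (1 - p) * d_r)    attenders vs uninvited
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    denominator = 1 - (1 - p) * d_r
    if denominator <= 0:
        raise ValueError("adjustment undefined: (1 - p) * d_r must be below 1")
    rr1 = d_r * (p * rr_a + (1 - p))
    rr2 = p * rr_a * d_r / denominator
    return rr1, rr2


# Pooled unadjusted attendance estimate from Section 3.3;
# p = 0.75 is a hypothetical participation rate
rr1, rr2 = adjust_for_self_selection(rr_a=0.54, p=0.75)
print(f"RR1 = {rr1:.3f}, RR2 = {rr2:.3f}")
```

When *Dr* = 1 (no self-selection), *RR*2 reduces to *RRA* and *RR*1 is the participation-weighted average of *RRA* and 1, which is a quick check that the assumed forms behave sensibly.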

#### **3. Results**

#### *3.1. Literature Selection*

The literature search identified a total of 5232 titles from three searches performed in PubMed (see Appendix A for details of searches 1, 2 and 3), and 43 were deemed relevant for our review after assessment of abstracts and full text (Figure 1). Of these, four studies assessed the effectiveness of screening programmes outside Europe: one each from Canada [15] and the USA [16], and two from New Zealand [17,18]. The remaining 39 studies were European, with twelve from Sweden [14,19–29], nine from Finland [30–38], five from Norway [39–43], four each from Italy [44–47] and Denmark [48–51], two each from the Netherlands [52,53] and the UK [54,55], and one from Spain [56].

**Figure 1.** Literature search flow diagram.

There was overlap between some papers, whereby authors used the same data to estimate the effect of different outcomes or updated results with longer follow-up. This resulted in sixteen exclusions (one paper from Denmark [51], four from Finland [31,34,35,37], two from Italy [44,46], three from Norway [39,42,43] and six from Sweden [19,21,22,26,27,29]). The remaining twenty-seven papers included in this review, representing independent populations, are summarized in Table 2.





*Notes to Table 2:* screening began in 1995; estimated from data in the paper; the average interval between the first and second, and second and third rounds was 38 months (range 22–65), but was 23 months between rounds 3 and 4; <sup>8</sup> the 15-year period 1991–2005 was partitioned into observation periods of two years accrual and up to nine years follow-up.

#### *3.2. Study Findings*

Whilst the majority of studies included women in the age range 50–64 years, the youngest age of invitation was 35 years and the oldest 83 years. Most countries invite women every two years, with the range between 18 months and three years. Table 3 shows the unadjusted relative risk for the effect of invitation and attendance on the outcome of incidence-based breast cancer mortality as reported in the studies, and corresponding relative risks adjusted for self-selection bias as described above. The effect sizes and the participation rates reported in the studies suggest differences in risk of breast cancer mortality within countries as well as between countries. Participation rates ranged from 44% in Canada to above 90% in Finland.

The studies reviewed used one or more of three types of comparison group to estimate breast cancer mortality in an uninvited population: contemporaneous, regional and historical. (i) The contemporaneous comparison group comprises women not yet invited, during the same time period and in the same region as the women invited. (ii) The regional comparison group is often concurrent with the invited women, but in a region not yet invited. (iii) The historical comparison group comprises women from an epoch not yet invited.

Ten studies compared the screening population with a contemporaneous comparison group, five of which estimated the effect of invitation to screening and seven the effect of attending screening. The effect of invitation fell within a narrow range, from 0.72 (95% CI 0.64–0.79) [41] to 0.81 (95% CI 0.64–1.01) [47], and the effect of attendance ranged from 0.38 (95% CI 0.30–0.49) [17] to 0.67 (95% CI 0.49–0.97) [38].

A further six studies used a historical comparison group to compare the impact of introducing screening in a particular region or country, the majority of which reported both the effect of invitation and the effect of attendance. Compared with the studies that used a contemporaneous comparison group, the range of effect sizes was wider for invitation, from 0.58 (95% CI 0.44–0.75) [56] to 0.83 (95% CI 0.73–0.95) [18], but narrower for attendance, from 0.52 (95% CI 0.46–0.59) [23] to 0.66 (95% CI 0.58–0.75) [32].

Only four studies used a regional comparison group without any adjustment for differences in underlying cancer incidence between regions. All four estimated the effect of invitation, with the results ranging from 0.73 (95% CI 0.63–0.84) [55] to 0.94 (95% CI 0.68–1.29) [53]. Just one study reported the effect of attending screening and found a 29% reduction in breast cancer mortality (RR 0.71, 95% CI 0.62–0.80) [20] in women who attended screening compared to those who did not.

The remaining seven studies used a combination of regional and historical data and, again, all estimated the effect of invitation, with effect estimates ranging from 0.75 (95% CI 0.63–0.89) [50] to 0.97 (95% CI 0.73–1.28) [25], which is almost identical to the results of those studies that used a regional control group alone. The two studies that estimated the effect of attendance reported relative risks of 0.60 (95% CI 0.49–0.74) [50] and 0.68 (95% CI 0.59–0.79) [49] respectively.




**Table 3.** *Cont.*

study. *BMJ* **2005**, *330*, 220. [50]; <sup>5</sup> Not included in meta-analysis due to later paper by Beau et al.; <sup>6</sup> Calculated from data in the paper; <sup>7</sup> Unadjusted RR calculated from data in the paper; <sup>8</sup> RR calculated from data in the paper. The accrual period for women aged 50–64 was 2001–2003 and for women aged 45–49 and 65–69 was 2006–2008; <sup>9</sup> Excludes prevalent cases; <sup>10</sup> *RR*1 and *RR*2 are as reported by authors in the paper; <sup>11</sup> Attendance rate not reported so taken from Giordano L, von Karsa L, Tomatis M, Majek O, de Wolf C, Lancucki L, et al. Mammographic screening programmes in Europe: organization, coverage and participation. *J. Med. Screen.* **2012**, *19*, 72–82. [57] as region reported invites women until the age of 74; <sup>12</sup> Attendance rate not reported so taken from Swedish Organised Service Screening Evaluation Group. Reduction in breast cancer mortality from the organised service screening with mammography: 2. Validation with alternative analytic methods. *Cancer Epidemiol. Biomark. Prev.* **2006**, *15*, 52–56. [58]; <sup>13</sup> One study only.

#### *3.3. Meta-Analysis by Age-Group of Women*

There were twenty-two studies that assessed the effect of invitation to screening (Table 3 and Figure 2). All studies had a relative risk of less than or equal to 1, with the largest studies achieving statistical significance. The largest studies were those by SOSSEG [14] and Johns et al. [54] who found a 20–30% reduction in breast cancer mortality. The pooled rate ratio was 0.78 (95% CI 0.75–0.82) with significant heterogeneity (*p* < 0.001).

**Figure 2.** Effect of invitation on risk of breast cancer mortality [14,18,20,23–25,28,30,32,33,36,38,41,47–49,52–56].

Fourteen studies reported on the effect of being screened, eight of which also reported on the effect of being invited. All but one study reported a statistically significant result with the largest studies again by SOSSEG [14] and Johns et al. [54] with relative risks of 0.54 and 0.55 respectively. This therefore led to a pooled estimate of 0.54 (95% CI 0.49–0.59) with significant heterogeneity at *p* < 0.001. However, as discussed previously, the effect estimate of being screened is likely to be subject to self-selection bias. Therefore, an adjustment was made to account for this.

To be able to calculate the adjustments for self-selection suggested by Duffy et al. [13], it is necessary to know the proportion of women attending screening. This is not reported in the paper by Thompson et al. [16] and so the adjusted relative risks cannot be estimated. However, this study is small and therefore omission of this study in the calculation of the pooled relative risk would not have a substantial effect.

The intention-to-treat estimate, *RR*1, was 0.76 (95% CI 0.71–0.83) with significant heterogeneity (*p* < 0.001). This estimate is almost identical to the effect size presented in Figure 2 (RR 0.78). The adjusted pooled relative risk for the effect of being screened, *RR*2, was 0.67 (95% CI 0.61–0.75), and again there was significant heterogeneity between studies (*p* < 0.001; Figure 3).

**Figure 3.** Estimated effect of attendance adjusted for self-selection on risk of breast cancer mortality (*RR*2) [14,15,17,20,23,32,33,38,40,45,49,50,54].

When assessing the effect of invitation in women aged 50 and over (Figure 4), whilst the pooled relative risk was similar to that in women of all ages, the *p*-value was 0.175, suggesting no heterogeneity between studies. However, there was still evidence of heterogeneity when assessing the effect of attendance with a relative risk, adjusted for self-selection, of 0.74 (95% CI 0.64–0.85) and a *p*-value of <0.001 (Figure 5).


**Figure 5.** Estimated effect of attendance adjusted for self-selection on risk of breast cancer mortality (*RR*2) in women aged 50 and over [15,32,33,38,40,45,49,50,54].

There were five studies that reported on the effect of screening in women under 50 years. Four studies [18,20,30,53] reported the effect of invitation, with a pooled relative risk of 0.81 (95% CI 0.74–0.87) and no evidence of heterogeneity (*p* = 0.418). Two studies [15,20] reported the effect of attendance at screening, with an adjusted relative risk of 0.73 (95% CI 0.65–0.82).

#### **4. Discussion**

This systematic review and meta-analysis of IBM studies estimates that the risk of death from breast cancer in women invited for screening is reduced by 22% compared to women not invited (RR 0.78, 95% CI 0.75–0.82), with similar results across age groups. This result is consistent with earlier overviews of cohort studies [5,6] and with the RCTs of breast cancer screening, which suggest that invitation to screening reduces mortality by approximately 20% [1].

The studies with contemporaneous control groups are likely to be least biased, with the regional and historical comparison groups potentially affected by differences in the underlying risk of breast cancer mortality between regions or across time periods respectively. When assessing the effect of invitation, the results of studies with contemporaneous control groups ranged from 0.72 (95% CI 0.64–0.79) in the Norwegian study by Weedon-Fekjær [41] to 0.81 (95% CI 0.64–1.01) in the Italian study by Paci [47].

The relative risk, *RR*2, estimates the effect of attendance adjusted for self-selection bias. Using population-specific attendance rates and *Dr* = 1.17 from the SOSSEG study [14] results in a mortality decrease of 33% (RR 0.67, 95% CI 0.61–0.75). This is slightly more conservative than the relative risk estimated by Broeders et al. [5], although it is unclear whether their result is adjusted for self-selection bias.

The relative risk, *RR*1, estimating the effect of invitation to screening from the relative risk of attendance adjusting for potential self-selection bias, is almost identical to the pooled effect estimated directly. The agreement between the two measures suggests that the adjustment method is valid.

There appears to be significant heterogeneity of the effect of invitation in the meta-analysis of all studies, which disappears when the analysis is stratified by age. This suggests that the effect of invitation differs between age groups, and that the varying age distributions among studies contribute to the heterogeneity of the effect for all ages combined. The heterogeneity of the effect of attendance adjusted for self-selection bias (*RR*2), however, is present both for women screened at any age and for women screened over the age of 50 years, suggesting that variation in age distributions is not entirely responsible for heterogeneity among studies. It is likely that this is partly due to differing screening regimens and practices.

Attendance rates varied between studies, from 0.44 in the Canadian study [15] to 0.92 in the study by Sarkeala et al. [32]. Canada differs from other countries in respect of protocol for call and recall, requiring women to self-refer in some provinces. When this opportunistic screening is taken into account, the attendance rate is estimated to be 63.1% [59]. Additionally, the screening programme has a high retention rate, with nearly 80% of previous participants attending a subsequent screen within 36 months.

IBM studies have been the focus of this review but they are not without their limitations. The main limitation is the identification of an appropriate comparison group in the absence of screening [8]. In addition, they are prospective studies and require a long follow-up period to accumulate enough deaths to achieve statistical power [60] and to see the full benefit of screening. Results from the Swedish Two-County Trial suggest that the full benefit of screening requires follow-up of at least 20 years [61]. The majority of studies included in this review had at least 10 years follow-up, with some having over 20 years. In addition, the length of the accrual period should be equal in comparison groups and should be equal to the length of the follow-up period. The effect of screening will be underestimated if the accrual period is shorter than the follow-up period, as more cases will accrue in the screened population than the non-screened population. Seventeen studies in this review had equal accrual and follow-up periods, with ten studies having a shorter accrual period.

Our results update and confirm those of the Euroscreen project for IBM studies [5,8]. Case-control studies reviewed in the Euroscreen project tended to find rather stronger effects than the IBM studies, with an effect of invitation of 0.69 (95% CI 0.57–0.83) and an effect of attendance adjusted for self-selection bias of 0.52 (95% CI 0.42–0.65). This may be due to ascertainment biases in the case-control approach [12]. In addition, the Euroscreen project estimated the effect from trend studies to be between 28% and 36%, which is comparable to the results in this review [62].

There have been suggestions of alternative analysis methods using the IBM approach. Tabar et al. [63] suggest using as the endpoint the incidence of breast cancers subsequently proving fatal within ten or twenty years. This method links exposure to endpoint more accurately, but requires a long follow-up period. Sasieni et al. [64] propose a method for estimating the expected number of deaths in the population without screening, which can be used when there is no contemporaneous comparison group. Neither of these methods has been used extensively up to now.

#### **5. Conclusions**

IBM studies yield estimates uncontaminated by pre-screening cancers. Results from these international studies indicate that inviting women to screening results in a 22% reduction in breast cancer mortality and that attending screening reduces the risk of death by 33%. Breast cancer screening in the routine healthcare setting continues to confer a substantial reduction in mortality from breast cancer.

**Author Contributions:** Conceptualization, S.W.D. and A.D.; methodology, A.D., J.O., S.W.D. and R.G.; data extraction, A.D.; writing (original draft preparation), A.D., S.W.D. and R.G.; writing (review and editing), A.D., J.O., S.W.D. and R.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research is funded by the National Institute for Health Research (NIHR) Policy Research Programme, conducted through the Policy Research Unit in Cancer Awareness, Screening and Early Diagnosis, 106/0001. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **Appendix A**

#### *Search Strategy*

The following search terms were used to conduct the review of the literature in PubMed. These search terms were taken from the previous review by Njor et al. [8]; however, our search was restricted to three of the four searches performed by Njor et al. and to papers in English only.


#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

