Multiple Testing, Cut-Point Optimization, and Signs of Publication Bias in Prognostic FDG–PET Imaging Studies of Head and Neck and Lung Cancer: A Review and Meta-Analysis

Diagnostics 2020, 10(12), 1030
Submission received: 19 October 2020 / Revised: 25 November 2020 / Accepted: 29 November 2020 / Published: 1 December 2020
(This article belongs to the Section Medical Imaging and Theranostics)


Positron emission tomography (PET) imaging with 2-deoxy-2-[18F]-fluorodeoxyglucose (FDG) has been proposed as a prognostic marker in radiotherapy. Various uptake metrics and cut points have been used, potentially leading to inflated effect estimates. Here, we performed a systematic review and meta-analysis of the prognostic value of pretreatment FDG–PET in head and neck squamous cell carcinoma (HNSCC) and non-small cell lung cancer (NSCLC), with tests for publication bias. Hazard ratios (HRs) for overall survival (OS), disease free survival (DFS), and local control were extracted or derived from the 57 included studies. Tests for publication bias were performed, and the number of statistical tests and cut-point optimizations in each study was recorded. Egger's regression of the association of SUVmax with OS/DFS yielded p = 0.08/p = 0.02 for HNSCC and p < 0.001/p = 0.014 for NSCLC. None of these four outcomes showed a significant association with SUVmax after adjusting for publication bias, whereas all four did in the conventional meta-analysis. The numbers of statistical tests and cut points were high, with no indication of improvement over time. Our analysis showed significant evidence of publication bias leading to inflated estimates of the prognostic value of SUVmax. We suggest that improved management of these complexities, including predefined statistical analysis plans, is critical for a reliable assessment of FDG–PET.

1. Introduction

Positron emission tomography (PET) offers a non-invasive method to assess functional biological characteristics of a tumor in an individual patient with cancer. A number of positron emitting tracers have been developed to study various aspects of tumor biology [1,2,3,4,5,6,7]. However, clinical practice in cancer PET imaging is still dominated by very few tracers, with 2-deoxy-2-[18F]-fluorodeoxyglucose (FDG) being the clinical workhorse for most tumor sites. FDG–PET imaging is primarily used for staging purposes, as a supplement to anatomical images, but advances in the availability of PET imaging have led to an increased interest in the feasibility of PET guided radiotherapy planning [8]. Several studies have investigated the prognostic value of FDG–PET, and dose escalation to PET-positive areas within the tumor is one of the potential strategies for increasing the effect of radiotherapy [9,10,11,12].
Despite the widespread use of FDG, only a few prospective studies exist for this tracer; the value of FDG as a prognostic factor has mainly been tested in retrospective cohort studies, for example [13]. There are major limitations to this approach to evidence generation in medicine, notably the lack of prospective clinical studies with study registration, formalized sample size estimation, and a predefined statistical analysis plan. All of this leads to a risk of “fishing expeditions” with a high risk of false positive findings, exaggerated effect sizes, and subsequent publication bias. Multiple testing is a well-described contributor to false positive findings when comparisons are made for several subgroups or multiple variables without adjustment of the type I error rate (false positives).
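To illustrate why unadjusted multiple testing inflates false positives, the following minimal sketch computes the probability of at least one false positive among several tests. It assumes that all tests are independent and that every null hypothesis is true, which is a simplification not made in the studies reviewed; the function names are hypothetical, and the Bonferroni threshold is shown only as one common adjustment.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among n independent tests
    when every null hypothesis is true (simplifying assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(k, round(familywise_error_rate(k), 3), bonferroni_threshold(k))
```

Under these assumptions, 20 unadjusted tests at the 5% level give roughly a 64% chance of at least one spurious finding.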
Searching for a standardized uptake value (SUV) cut-point that dichotomizes the patient group into poor and good prognosis groups, and that minimizes the p-value when comparing outcome between the resulting groups, is an approach frequently used in imaging studies. This “optimization” approach invalidates a simple interpretation of the resulting p-value and is associated with a substantial increase in the rate of type I errors [14,15]. Deciding before the start of the analysis to use the median SUV as a cut-point does not introduce this bias. However, in many cases this might not be a biologically meaningful way to classify patients into prognostic subgroups.
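The effect of cut-point optimization on the type I error rate can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the analysis used in the cited references: the marker and the outcome are simulated as independent Gaussian variables, the outcome is continuous rather than a survival time, groups are compared with a large-sample z-test instead of a log-rank test, and candidate cut-points are restricted to splits that leave at least 10% of patients in each group. All names and defaults are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def z_test_p(low, high):
    """Two-sided p-value of a large-sample z-test for a difference in means."""
    n1, n2 = len(low), len(high)
    m1, m2 = sum(low) / n1, sum(high) / n2
    v1 = sum((y - m1) ** 2 for y in low) / (n1 - 1)
    v2 = sum((y - m2) ** 2 for y in high) / (n2 - 1)
    z = (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


def false_positive_rates(n_sim=300, n=100, alpha=0.05, seed=1):
    """Return (rate with a predefined median split, rate with the 'optimal' split)."""
    rng = random.Random(seed)
    hits_median = hits_optimized = 0
    for _ in range(n_sim):
        # Marker and outcome are independent, so any "significant" split is a false positive.
        pairs = sorted((rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n))
        outcome = [y for _, y in pairs]  # outcomes ordered by marker value
        p_by_split = {
            k: z_test_p(outcome[:k], outcome[k:])
            for k in range(n // 10, n - n // 10 + 1)
        }
        hits_median += p_by_split[n // 2] < alpha
        hits_optimized += min(p_by_split.values()) < alpha
    return hits_median / n_sim, hits_optimized / n_sim


if __name__ == "__main__":
    median_rate, optimized_rate = false_positive_rates()
    print(f"median split: {median_rate:.3f}, optimized split: {optimized_rate:.3f}")
```

In this setup the predefined median split should stay close to the nominal 5% level, while picking the split with the smallest p-value yields a several-fold higher false positive rate.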
In the current study, we reviewed the methodology of published studies of the prognostic value of FDG in head and neck squamous cell carcinoma (HNSCC) and non-small cell lung cancer (NSCLC), and assessed the evidence of publication bias. We included studies where patients received radiotherapy (RT), possibly in combination with other treatment modalities. The hazard ratios (HRs) for overall survival (OS), disease free survival (DFS), or local control (LC) were considered as outcomes and analyzed as a function of FDG uptake metrics.

2. Materials and Methods

2.1. Search Strategy and Eligibility Criteria for Studies

Published reports on the prognostic value of pretreatment FDG–PET in two common tumors, HNSCC and NSCLC, were included.
We searched PubMed for published reports using the following search strings with ‘human’ filter:
HNSCC: (FDG OR “18-F”) AND (“Head and neck” OR “HNSCC” OR “SCCHN”) AND (radiotherapy OR chemoradi* OR radio* OR chemo-radi*) NOT review
NSCLC: (FDG OR “18-F”) AND (“non-small cell lung cancer” OR “NSCLC”) AND (radiotherapy OR chemoradi* OR radio* OR chemo-radi*) NOT review
In addition, manual screening of the selected articles, reviews, and meta-analyses was used to complement the search. The final date of the search was 1 November 2018, and we did not restrict publication date prior to this date. In this analysis, only articles in English were included. Studies reporting the HR for OS, DFS, or LC versus SUVmax or SUVpeak were included. The primary treatment was required to include RT, but studies with mixed cohorts including RT, chemoradiotherapy, and surgery were also allowed. However, studies with only a few patients receiving RT were excluded [16]. No restrictions on the study design were used. Where multiple reports with overlapping patient cohorts were available, only data from the largest study were included.

2.2. Data Extraction

Data were extracted for each study by MMC and entered into the meta-analysis software Review Manager (RevMan), version 5.3 [17]. There was no attempt to obtain unpublished data. Data were analyzed by MMC and IRV. The HR for OS, DFS, and LC for each trial was extracted or derived from the available data. An HR above 1 implied a survival benefit for lower SUVmax. Both the HR and its confidence interval (CI) were required from each study for inclusion in the meta-analysis. Data from multivariate analysis (MVA) were prioritized over data from univariate analysis (UVA) when both were reported.
If the HR with CI was not stated in a report, one of two methods was used for its estimation. If the HR was given with a p-value, but without the CI, we assumed a normal distribution of the logarithm of the HR and estimated the CI by first finding the z-value of the standard normal distribution corresponding to the reported p-value. We then calculated the standard error of the ln(HR) estimate as $S_{\ln(\mathrm{HR})} = \ln(\mathrm{HR}) / z$.
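The conversion from a reported HR and p-value to a confidence interval can be sketched as follows. This is a minimal illustration, assuming the reported p-value is two-sided and comes from a Wald-type test of ln(HR) = 0; the function name and the use of the absolute value (so that HR < 1 also gives a positive standard error) are choices made for this sketch and are not stated in the source.

```python
import math
from statistics import NormalDist


def ci_from_hr_and_p(hr: float, p_value: float, level: float = 0.95):
    """Return (standard error of ln(HR), lower CI bound, upper CI bound).

    Assumes 0 < p_value < 1, hr > 0 and hr != 1, and a two-sided p-value.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - p_value / 2.0)   # z-value matching the p-value
    se = abs(math.log(hr)) / z                    # S = ln(HR) / z
    z_crit = std_normal.inv_cdf(0.5 + level / 2.0)
    lower = math.exp(math.log(hr) - z_crit * se)
    upper = math.exp(math.log(hr) + z_crit * se)
    return se, lower, upper


if __name__ == "__main__":
    print(ci_from_hr_and_p(2.0, 0.05))
```

As a sanity check on the logic, a two-sided p-value of exactly 0.05 places the lower 95% confidence bound at HR = 1.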
A number of studies did not report an HR, but only a p-value, together with the outcome at one or more specified points in time, in most cases in the form of a plot of Kaplan–Meier curves. In these cases, we estimated the HR from the relationship $\mathrm{HR}(t) = \ln(p_2(t)) / \ln(p_1(t))$, where $p_1(t)$ is the Kaplan–Meier estimate for low SUVmax at time t and $p_2(t)$ is the estimate for high SUVmax. When possible, we sampled the ratio at multiple time-points, ranging from the first time an event occurred in both groups to the end of follow-up, and averaged the resulting HR(t) estimates. The CI was then calculated from the p-value as explained above. The methodology was previously described in more detail [18]. Variances were then calculated and used as study weights in the meta-analysis, using RevMan [17].
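The Kaplan–Meier based estimate can be sketched as below. The sketch assumes that the survival fractions have already been read off the published curves at matching time-points, that each fraction lies strictly between 0 and 1, and that the averaging is a plain arithmetic mean; the source does not specify the type of average, and the function name is hypothetical.

```python
import math


def hr_from_km(low_uptake_survival, high_uptake_survival):
    """Average of HR(t) = ln(p2(t)) / ln(p1(t)) over the sampled time-points.

    low_uptake_survival:  Kaplan-Meier estimates p1(t) for the low-SUVmax group
    high_uptake_survival: Kaplan-Meier estimates p2(t) for the high-SUVmax group
    """
    if len(low_uptake_survival) != len(high_uptake_survival):
        raise ValueError("both groups must be sampled at the same time-points")
    ratios = []
    for p1, p2 in zip(low_uptake_survival, high_uptake_survival):
        if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
            raise ValueError("survival estimates must lie strictly between 0 and 1")
        ratios.append(math.log(p2) / math.log(p1))
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Synthetic exponential curves with a true hazard ratio of 2 (illustrative only).
    times = [12, 24, 36]
    p_low = [math.exp(-0.01 * t) for t in times]
    p_high = [math.exp(-0.02 * t) for t in times]
    print(hr_from_km(p_low, p_high))
```

The relationship is exact when the hazards are proportional, which is why the synthetic exponential curves recover the hazard ratio used to generate them.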
Assessment of publication bias was made visually by ordering the studies in forest plots according to variance and by funnel plots. We did not systematically assess the risk of bias in the individual studies. The risk of bias across studies was assessed using the so-called Egger–Var method as a formal test for publication bias [19]. We performed the quantitative assessment using Egger's method as follows. For each endpoint and comparison, a linear fit of ln(HR) versus the standard error of the study estimate was performed to assess whether the effect sizes of the included studies depended on study precision. Formally, the regression equation was
$$\ln(\mathrm{HR}_i) = \alpha + \beta\,\mathrm{SE}_i + \varepsilon_i, \qquad (1)$$
weighted by the inverse variance of each study. Here, $\alpha$ and $\beta$ were the fitting parameters, $\mathrm{SE}_i$ was the standard error of ln(HR) of the ith study, and $\varepsilon_i$ was the residual error, assumed to have a normal distribution. If $\beta$ was different from zero at the 95% confidence level, it was concluded that the effect size estimates depended on the study precision, a clear indication of publication bias. The intercept $\alpha$ in Equation (1) was the extrapolation to zero SE and was used to estimate a publication bias corrected value of ln(HR).
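The weighted regression can be sketched with a closed-form weighted least-squares fit. This is a minimal illustration and not the authors' implementation: it returns only the point estimates of the intercept and slope, uses weights of 1/SE², and omits the significance test of the slope, which would need a t-distribution not available in the Python standard library. The function name is hypothetical.

```python
import math


def egger_regression(log_hr, se):
    """Weighted least-squares fit of ln(HR_i) = alpha + beta * SE_i.

    Weights are the inverse variances 1 / SE_i**2.
    Returns (alpha, beta); exp(alpha) is the HR extrapolated to SE = 0.
    """
    if len(log_hr) != len(se) or len(se) < 3:
        raise ValueError("need at least three studies with matching inputs")
    weights = [1.0 / s ** 2 for s in se]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, se)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, log_hr)) / w_sum
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, se, log_hr))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, se))
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    return alpha, beta


if __name__ == "__main__":
    # Synthetic studies in which less precise studies report larger effects.
    se = [0.1, 0.2, 0.3, 0.4]
    log_hr = [0.1 + 1.5 * s for s in se]
    alpha, beta = egger_regression(log_hr, se)
    print(f"bias-corrected HR = {math.exp(alpha):.3f}, slope = {beta:.3f}")
```

A positive slope means that less precise studies report larger effects; the intercept is the effect expected from a hypothetical study with zero standard error.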
The number of statistical tests and the use of cut-point optimization were assessed independently by MMC and IRV, and disagreements were resolved at a consensus meeting. Only statistical tests of the association between an imaging metric on one side and oncological outcome or baseline characteristics on the other were counted. When the same question was addressed in univariate and multivariate analysis, the corresponding p-value was only counted once. Similarly, it was only counted once if it was part of a rational model building procedure, including forward or backward elimination. In cases where a large number of multivariate models with a functional image metric were examined outside of a model building procedure, the multivariate tests were included in the count of statistical tests, e.g., Schwartz et al. [20].

3. Results

The study selection process for this analysis is presented in Figure 1. Of the 930 studies identified by the initial search, 57 were analyzable. A total of 178 full text articles were screened, and 133 of these were excluded, as data were not assessable (not reporting HR, univariate log-rank test without p-value, no cut-point for SUV). In other words, 25.3% of the screened full text reports were included. Twelve studies were added from manual cross-referencing of articles, reviews, and browsing for a total of 57 included studies; 27 studies in patients with HNSCC [20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46], and 30 studies in NSCLC [47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76].
Study characteristics are summarized in Table 1 and Table 2 for HNSCC and NSCLC, respectively. The vast majority of studies were retrospective analyses—20 studies of HNSCC (74%) and 26 studies of NSCLC (87%). The included studies comprised a total of 5102 patients—1704 patients in the HNSCC group and 3398 in the NSCLC group. The median study size was 74. The NSCLC studies were generally larger, with a median study sample size of 95 patients compared to 58 in the HNSCC group. Twenty-seven studies did not perform MVA for DFS or OS, which were the primary endpoints of this analysis. A few studies reported no events until long follow-up, which gave rise to additional uncertainty in the HR estimate [54,68]. One study reported no events in the low-uptake group, giving rise to an infinite HR estimate, and the study had to be excluded [77]. A single study was excluded due to problems with interpretation of the KM plots [78].
The patient cohorts in both the HNSCC and NSCLC group were quite heterogeneous with respect to stage, treatment, and follow-up time.
Figure 2 and Figure 3 display the forest plots for SUVmax as a predictor of DFS and OS for HNSCC and NSCLC, respectively, with the studies ordered according to inverse variance from top to bottom. There was a trend for HR to decrease with decreasing variance, and publication bias is therefore suspected. The HR estimate from the pooled analysis is shown by the diamond-shaped mark, and it favored low SUV. However, this should be interpreted with caution due to the suspicion of publication bias. The same trend was observed for LC (Supplemental Figure S1). The data entered in RevMan and HR for UVA and MVA for all studies are listed in the Supplemental Table S1.
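The pooled estimate shown in a forest plot can be sketched with inverse-variance weighting. The source states only that study variances were used as weights in RevMan; the fixed-effect form below is an assumption made for illustration, and a random-effects model would add a between-study variance term to each weight. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def pool_fixed_effect(log_hr, se, level=0.95):
    """Inverse-variance (fixed-effect) pooled HR with its confidence interval.

    Returns (pooled HR, lower bound, upper bound).
    """
    weights = [1.0 / s ** 2 for s in se]
    w_sum = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_hr)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (
        math.exp(pooled),
        math.exp(pooled - z_crit * pooled_se),
        math.exp(pooled + z_crit * pooled_se),
    )


if __name__ == "__main__":
    # Two synthetic studies; the more precise one dominates the pooled estimate.
    print(pool_fixed_effect([0.0, 1.0], [0.1, 1.0]))
```

Because the weights are 1/SE², precise studies dominate the pooled value, which is why a trend of HR against variance matters for how the pooled estimate is read.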
When Egger's regression was applied, neither OS nor DFS appeared to be significantly associated with FDG uptake in either HNSCC or NSCLC. The regression slopes were significantly greater than zero in three of the four cases: DFS for HNSCC (p = 0.02) and both OS (p < 0.001) and DFS (p = 0.014) for NSCLC. See the supplement for details and the associated plots (Supplemental Figure S2).
Figure 4 shows a plot of the number of patients, number of statistical tests, and number of cut-point optimizations against the year of publication. Unfortunately, there is little sign of a consistent improvement in study characteristics, i.e., number of statistical tests, number of cut-point analyses, and size of study population over time. For HNSCC studies, there was a statistically significant increase in the number of cut-point optimizations versus time (Spearman rho = 0.5, p = 0.02), while there was no significant increase in the number of patients for later studies and data were also in accordance with no change in the number of tests performed (Figure 4A). Data for the NSCLC studies are shown in Figure 4B, where the Spearman rank correlation coefficients were statistically in agreement with no change over time, in either the number of patients, number of cut-points, or number of tests performed (p > 0.27 for all coefficients).
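For readers unfamiliar with the rank correlation used here, the following sketch computes Spearman's rho as the Pearson correlation of average ranks. It returns only the coefficient, not the p-value reported in the text, and the example numbers are purely illustrative and are not taken from the included studies. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Illustrative values only: publication year vs. number of cut-point optimizations.
    years = [2002, 2004, 2007, 2009, 2014, 2017]
    n_cutpoints = [0, 1, 1, 2, 1, 3]
    print(round(spearman_rho(years, n_cutpoints), 3))
```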
Power calculations were not performed in any of the included studies, and only three studies mentioned or adjusted for multiple testing [27,33,60].

4. Discussion

Publication bias is a well-known problem: it enriches the literature with positive studies (true or false) that are not balanced by studies with negative findings, which are more likely to remain unpublished. This in turn inflates the estimated effect size of an intervention or the discriminatory power of a diagnostic test. The inflated effect sizes from individual studies carry over to a meta-analysis [79], thus reducing the value of the meta-analysis in evidence-based medicine. Indeed, our systematic analysis of prognostic studies of FDG uptake found statistically significant evidence of publication bias. Small studies are at particular risk of inflated effect-size bias [80], and it is thus a concern that the median study size was only 58 and 95 patients in published HNSCC and NSCLC studies, respectively. The TRIPOD reporting guidelines [81] attempt to address the problem by requiring a sample size justification in reporting, but this was not provided in any of the studies included here. Additionally, it might be argued that the general reporting guidelines of TRIPOD, albeit relevant, are not sufficiently specific for adequate reporting of image-based prognostic studies. In particular, it is an important concern that a large number of possible predictors can be extracted from a PET scan: SUVmax, SUVpeak, SUVmean, MTV, and TLG, just to name a few. Multiple comparisons, post-hoc searches for positive associations, and scanning for ‘optimal’ cut-off values separating the low and the high uptake groups increase the risk of false positive findings [82], as also discussed by Vesselle et al. in the context of FDG prognostication [73].
A limitation of our study was that we did not have access to individual patient data, which led to the exclusion of some reports. Most of the included studies (80.7%) were conducted as retrospective studies, without a predefined data analysis plan. While this might be defensible in the explorative setting, it increases the risk of overestimating the effect size if the cohort studies are not followed by controlled trials or studies with pre-specified protocols. In particular, with FDG, we would argue that we are beyond the exploratory phase and should perform larger studies with predefined protocols, to unequivocally reveal the prognostic or predictive role of FDG uptake in cancers that are common in the two sites studied in the present work. With the high number of correlations that are testable in image-based prognostication, it appears prudent to require predefined research protocols and, perhaps, publication of raw data to allow independent validation of findings, regardless of the chosen cut-point or predictor. It is possible that a functional imaging specific extension to the TRIPOD or REMARK guidelines could be of use. When published studies perform tens of comparisons and multiple cut-point optimizations in datasets of fewer than 100 patients, without correction for multiple comparisons, the field is bound to be dominated by false or exaggerated correlations, which will ultimately harm patients if applied in clinical decision making and harm a promising field of research by misusing resources.
It is a substantial challenge to accommodate cross-study synthesis of data in a meta-analysis while at the same time allowing the individual authors to appropriately handle the coding of image metrics in their study. Decisions to use continuous coding of SUV, logarithmic transformation, or a limited number of cut-points are all fair (if performed correctly), but they hamper the ability to perform a meta-analysis. It appears to us that the complexity of these analyses implies that publication of the raw modeling data is a necessity for meaningful synthesis of data. We believe that the observations of the current study imply that such a synthesis is necessary for real progress.

5. Conclusions

Functional imaging with FDG or other tracers remains a promising tool for prognostication, prediction, and treatment selection for cancer patients. However, the current study points to issues limiting the interpretation, including inadequate sample sizes, lack of predefined analysis plans, lack of correction for multiple testing, and post-hoc cut-point optimizations. These issues result in a high risk of inflated effect sizes or false positive correlations that must be addressed to avoid leading the field astray.

Table 1. Characteristics of HNSCC studies included in the meta-analysis.
| Author | Year | Tumor Type | Patients | Endpoints | MVA * | Uptake Metric | Cut-Off Value | SUV Threshold | Reconstruction Algorithm | Treatment | Stage | Median Follow Up Time | Data Extraction |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Akagunduz et al. | 2015 | HN | 62 | LRFS, DFS, OS | No | SUVmax, SULmax, MTV | 10.15 (SUL) | Fitted | - | RT/CRT | | 18 months | KM |
| Allal et al. | 2004 | HNSCC | 120 | LC, DFS, OS | Yes | SUVmax | 4.76 | | - | RT +/− CT, surgery +/− RT | I-IV | 48 months | HR |
| Baschnagel et al. | 2015 | HNSCC | 74 | LC, LRC, DFS | Yes | SUVmax | 13.8 | Median | - | CRT | T1-T4, N0-N3 | 35 months | HR |
| Brun et al. | 2002 | HNSCC | 47 | CR, LRC, OS | No | SUV, MR | 9.0 | Median | Iterative ML | RT, CRT | II-IV | 3.3 years | HR |
| Cacicedo et al. | 2017 | HNSCC | 58 | DFS, LRC, DMFS, OS | Yes | SUVmax | 11.85 (SUV-T), 5.4 (SUV-N) | Median | - | Surgery + RT, RT +/− CT | III-IVB | 31 months | KM |
| Chan et al. | 2017 | OHSCC | 124 | OS, RFS | Yes | SUVmax, SUVmean, MTV, TLG, entropy, contrast, busyness, complexity | 14.22 | | OSEM | CRT | III-IV | 28.7 months | HR |
| Chung et al. | 2009 | SCC | 82 | CR, DFS | Yes | MTV, SUV > 2.5 | 10.0 | Median | OSEM | RT, CRT | I-IV | 34.8 months | HR |
| Halfpenny et al. | 2002 | HNSCC | 58 | Survival | Yes | SUVpeak | 10.0 | Fitted | FBP | Surgery, +/−RT | I-IV | 39 months | HR |
| Higgins et al. | 2012 | HNSCC | 8 | DFS, LRC, DMFS, OS | No | SUVmax, SUVmean, TLG | 15.4 | Median | OSEM | RT, CRT | III-IV (97%) | 15 months | KM |
| Kim et al. | 2007 | OSCC | 52 | LC, DFS, OS | Yes | SUVmax | 6.0 | Median | - | Surgery +/− RT/CRT | I-IV | 36 months | HR |
| Kitajima et al. | 2014 | Laryngeal | 51 | PFS, LC, NPFS, DMFS | Yes | SUVmax | 4.6 | Fitted | RAMLA | RT +/− CT, surgery +/− CRT | | 48.6 months | KM |
| Komar et al. | 2014 | HNSCC | 22 | OS | No | SUVmax, MATV | 11.74 | Median | - | Surgery +/− CRT, RT | I-IV | 41 months | KM |
| Koyasu et al. | 2014 | SCC | 108 | DFS | Yes | SUVmax, MTV, TLG | 10.0 | Fitted | 3D iterative | RT +/− CT, surgery +/− RT | I-IV | 36.4 months | HR |
| Kunkel et al. | 2003 | OSCC | 44 | OS | Yes | SUVpeak | 5.6 | Median | - | RT (preop.) + surgery | I-IV | 38 months | HR |
| Liao et al. | 2009 | OSCC | 109 | LC, DFS, DSS, OS | No | SUVmax | 19.3 | Median | ML, OSEM | Surgery + RT/CRT | III-IV | 39 months | HR |
| Machtay et al. | 2009 | HNSCC | 60 | DFS, OS | Yes | SUVmax | 9.0 | Literature | - | RT, CRT, surgery + CRT/RT | I-IV | - | HR |
| Minn et al. | 1997 | HNSCC | 37 | OS | No | SUVlean, MR | 9.0 | Median | - | RT +/− surgery | II-IV | 43 months | HR |
| Moon et al. | 2015 | NPC | 44 | DFS | Yes | SUVmax, SUVmean, TLG, MTV | 7.8 | Fitted | OSEM | CRT | II-IVB | 40 months | HR |
| Ng et al. | 2016 | OHSCC | 86 | PFS, OS | Yes | SUVmax, SUVmean, TLG, MTV | 19.44 | Fitted | - | CRT | III-IVB | 28 months | HR |
| Preda et al. | 2016 | HNSCC | 57 | DFS | Yes | SUVmax | 5.75 | Fitted | OSEM | Surgery + RT +/− CT, RT + CT | T1-T4, N0-N2 | 21.3 months | HR |
| Roh et al. | 2007 | SCC | 79 | DFS, LC, OS | No | SUVmax | 8.0 | Fitted | - | Surgery +/− RT or RT +/− CT | III-IV | 36 months | KM |
| Schwartz et al. | 2004 | HNSCC | 54 | LRFS, DFS, OS | No | SUVmax | 9.0 | Median | FBP | RT +/− CT, surgery +/− RT | I-IV | 17.5 months | KM |
| Schwartz et al. | 2015 | HNSCC | 74 | PFS, OS | No | SUVmax, MTV | 15.07 | Median | - | CRT | III-IV | 4.2 years | HR |
| Suzuki et al. | 2014 | OPSCC + HPSCC | 49 | OS | Yes | SUVmax | 8.0 | Fitted | OSEM | Surgery + RT +/− CT, RT + CT | I-IV | 33 months | HR |
| Suzuki-Shibata et al. | 2017 | OTSCC | 33 | PFS, OS | Yes | SUVmax, MTV | 15.7 | Fitted | FORE-OSEM | CRT | II-IVA | 36 months | HR |
| Torizuka et al. | 2009 | HNSCC | 50 | LC, DFS | No | SUVpeak, SUV cont. variable | 7.0 | Fitted | OSEM | RT, CRT, surgery +/− CRT | I-IV | 15 months | KM |
| Xie et al. | 2010 | NPC | 62 | OS, DFS | No | SUVmax | 8.0 | Fitted | - | CRT | III-IVB | 61 months | KM |
* Performed multivariate analysis (MVA) in regards to the endpoints analyzed in this study: DFS and OS. Studies listed with author in italic are performed as prospective studies. Other studies are retrospective studies. HNSCC: head and neck squamous cell carcinoma, HN: head and neck cancer, LRFS: local recurrence free survival, DFS: disease free survival, OS: overall survival, SUV: standardized uptake value, SUL: lean body mass corrected standardized uptake value, MTV: metabolic tumor volume, RT: radiotherapy, CRT: chemoradiotherapy, KM: Kaplan-Meier, LC: local control, HR: hazard ratio, LRC: locoregional control, CR: complete response, ML: maximum likelihood, DMFS: distant metastasis free survival, CT: chemotherapy, OHSCC: oropharyngeal or hypopharyngeal squamous cell carcinoma, RFS: recurrence free survival, TLG: total lesion glycolysis, OSEM: ordered-subset expectation maximum, SCC: squamous cell carcinoma, FBP: filtered back projection, OSCC: oral squamous cell carcinoma, PFS: progression free survival, NPFS: nodal progression free survival, RAMLA: row-action maximum-likelihood algorithm, MATV: metabolically active tumor volume, DSS: disease-specific survival, MR: metabolic rate, NPC: nasopharyngeal carcinoma, OPSCC: oropharyngeal squamous cell carcinoma, HPSCC: hypopharyngeal squamous cell carcinoma, OTSCC: oral tongue squamous cell carcinoma, and FORE-OSEM: Fourier rebinning- ordered-subset expectation maximum.
Table 2. Characteristics of NSCLC studies included in the meta-analysis.
| Author | Year | Tumor Type | Patients | Endpoints | MVA * | Uptake Metric | Cut-Off Value | SUV Threshold | Reconstruction Algorithm | Treatment | Stage | Median Follow Up Time | Data Extraction |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ahuja et al. | 1998 | NSCLC | 155 | OS | Yes | SUVpeak (SUR) 80% of max | 10.0 | Fitted | - | Surgery, RT, CRT | I-IV | 20.9 months | HR |
| Aoki et al. | 2016 | NSCLC | 74 | LC | Yes | SUVmax, AID | 4.0 | Literature | - | SBRT | I | 24.5 months | HR |
| Borst et al. | 2005 | NSCLC | 51 | DSS, OS | No | SUVmax, SUV cont. variable | 15.0 | Median | OSEM | CRT | I-III | 17 months | KM |
| Carvalho et al. | 2013 | NSCLC | 220 | OS | No | MTV, SUVmax, SUV | 10.12 | Median | OSEM | RT, CRT | I-IIIB | 1.47 years | KM |
| Cerfolio et al. | 2005 | NSCLC | 315 | OS, DFS | Yes | SUVmax | 10.0 | Median | Iterative | Surgery +/− CRT | I-IV | 26 months | HR |
| Chen et al. | 2012 | NSCLC | 105 | PFS, OS | Yes | TLG, MTV, SUVmax | 15.0 | Fitted | OSEM | Surgery, CT, RT or CRT | I-IV | 3.1 years | HR |
| Clarke et al. | 2012 | NSCLC | 82 | OS, RFS, DFS, CSS, RR, LR, DM | No | SUVmax | 4.75 | Median | - | SBRT | I | 2 years | KM |
| Hamamoto et al. | 2011 | NSCLC | 26 | LFF | No | SUVmax | 5.0 | Fitted | - | SBRT | I | 21 months | KM |
| Hofheinz et al. | 2016 | NSCLC | 31 | PFS, OS | No | SUV, SURtc, Kslope | 7.6 | Fitted | PSF + TOF | CRT and/or surgery | T1-4N0-3M0 | - | HR |
| Horne et al. | 2014 | NSCLC | 95 | LC, PFS, OS | Yes | SUVmax | 5.0 | Literature | - | SBRT | IA-IB | 16.33 months | HR |
| Huyn et al. | 2015 | NSCLC | 161 | DFS, OS | Yes | SUVmax, MTV | 14.0 | Fitted | OSEM | Surgery +/− CT and/or RT | IIIA-N2 | 20 months | HR |
| Imamura et al. | 2011 | NSCLC | 62 | OS, PFS | No | SUVmax | 6.0 | Median | 3D-RAMLA | CT or CRT | IIB-IV | 464 days | KM |
| Jiang et al. | 2018 | NSCLC | 151 | OS | No | SUVmax | 13.8 | Median | - | CRT, RT, CT | I-IV | 10 years | HR |
| Kohutek et al. | 2015 | NSCLC | 211 | OS | Yes | SUVmax, GTV | 3.0 | Fitted | OSEM | SBRT | T1-2N0M0 | 25.2 months | HR |
| Lee et al. | 2012 | NSCLC | 205 | OS | Yes | SUVmax | 13.0 | Fitted | Iterative | Neoadj. CRT, + surgery | IIIA | 1.6 years | HR |
| Nair et al. | 2014 | NSCLC | 163 | PFS, OS, LRFS, DMFS | Yes | SUVmax | 7.0 | Median | - | RT, SBRT | T1-2N0M0 | 16 months | KM |
| Nawara et al. | 2012 | NSCLC | 91 | OS | No | SUVmax, SUVmean | 7.0 | Median | Iterative | RT +/− induction CT | I-IIIB | - | KM |
| Pyka et al. | 2015 | NSCLC | 45 | DSS, OS | Yes | SUVmax, SUVmean, MTV, COV, entropy, coarseness, contrast, correlation | 11.2 (OS), 12.3 (DSS) | Fitted | OSEM | SBRT | T1-2N0M0 | 21.4 months | KM |
| Sasaki et al. | 2005 | NSCLC | 162 | OS, DFS | Yes | SUVmax | 5.0 | Fitted | Iterative | Surgery +/− RT or RT/CRT | I-IIIB | 17 months | HR |
| Shirai et al. | 2017 | NSCLC | 45 | LC, PFS, OS | No | SUVmax | 5.5 | Median | - | C-ion RT | I | 28.9 months | KM |
| Sugawara et al. | 1999 | NSCLC | 38 | OS | No | SUVlean | 8.72 | Median | Hanning filter | Surgery, CRT | I-IV | 26.5 months | KM |
| Takeda et al. | 2011 | NSCLC | 95 | LC | No | SUVmax | 6.0 | Fitted | DRAMA | SBRT | IA-IIIB | 16 months | KM |
| Takeda et al. | 2014 | NSCLC | 152 | OS, DFS, LC, RC, DMC, CSS | Yes | SUVmax | 3.35 (LC), 3.64 (RC), 2.47 (DMC, DFS), 2.55 (CSS, OS) | Fitted | RAMLA | SBRT | T1-2N0M0 | 25.3 months | HR |
| Takeda et al. | 2017 | NSCLC | 26 | LC, PFS, OS | No | SUV, MTV, TLG, entropy, dissimilarity, HILAE, zone percentage | 8.18 | Median | OSEM | SBRT | T1-2N0M0 | 36 months | KM |
| Ulger et al. | 2014 | NSCLC | 103 | OS, RFS, DFS | Yes | SUVmax | 10.7 | Median | - | 3D-CRT | IIIA-IIIB | 22.63 months | HR |
| Vansteenkiste et al. | 1999 | NSCLC | 125 | OS | Yes | SUVmax | 7.0 | Fitted | - | Surgery +/− induction CT, RT +/− induction CT | I-IIIB | 19 months (mean) | HR |
| Vesselle et al. | 2007 | NSCLC | 208 | OS, DFS | Yes | SUVmax | 7.0 | Fitted | Hanning filter | Surgery +/− neoadj. or adjuvant therapy | I-IV | 37 months | HR |
| Vu et al. | 2013 | NSCLC | 50 | OS, RFS | No | SUVmax, TLG, MTV | 6.43 | Median | - | SBRT | I | 25.1 months | HR |
| Xiang et al. | 2012 | NSCLC | 84 | LRFS, DMFS, PFS, OS | Yes | SUVmax | 14.2 | Median | - | High dose proton + CT | III | 19.2 months | HR |
| Yilmaz et al. | 2018 | NSCLC | 67 | PFS, OS | Yes | SUVmax | 15.0 | Median | - | CRT | III | 20.7 months | HR |
* Performed multivariate analysis (MVA) in regards to the endpoints analyzed in this study: DFS and OS. Studies listed with author in italic are performed as prospective studies. Other studies are retrospective studies. NSCLC: non-small cell lung cancer, OS: overall survival, SUV: standardized uptake value, SUR: standardized uptake ratio, RT: radiotherapy, CRT: chemoradiotherapy, HR: hazard ratio, LC: local control, AID: average iodine density, SBRT: stereotactic body radiation therapy, DSS: disease-specific survival, OSEM: ordered-subset expectation maximum, KM: Kaplan-Meier, MTV: metabolic tumor volume, DFS: disease free survival, PFS: progression free survival, TLG: total lesion glycolysis, CT: chemotherapy, RFS: recurrence-free survival, CSS: cause-specific survival, RR: regional relapse, LR: local relapse, DM: distant metastasis, LFF: local failure free rate, PSF: point spread function, TOF: time of flight, RAMLA: row-action maximum-likelihood algorithm, GTV: gross tumor volume, LRFS: local recurrence-free survival, DMFS: distant metastasis free survival, COV: coefficient of variation, DRAMA: dynamic row-action expectation maximization algorithm, RC: regional control, DMC: distant metastasis control, HILAE: high-intensity large-area emphasis, and 3D-CRT: three-dimensional conformal radiotherapy.
