Article

Deep Learning Models Based on Pretreatment MRI and Clinicopathological Data to Predict Responses to Neoadjuvant Systemic Therapy in Triple-Negative Breast Cancer

by Zhan Xu 1, Zijian Zhou 1, Jong Bum Son 1, Haonan Feng 2, Beatriz E. Adrada 3, Tanya W. Moseley 3, Rosalind P. Candelaria 3, Mary S. Guirguis 3, Miral M. Patel 3, Gary J. Whitman 3, Jessica W. T. Leung 3, Huong T. C. Le-Petross 3, Rania M. Mohamed 3, Bikash Panthi 1, Deanna L. Lane 3, Huiqin Chen 2, Peng Wei 2, Debu Tripathy 4, Jennifer K. Litton 4, Vicente Valero 4, Lei Huo 5, Kelly K. Hunt 6, Anil Korkut 7, Alastair Thompson 8, Wei Yang 3, Clinton Yam 4, Gaiane M. Rauch 3,9,† and Jingfei Ma 1,*,†
1 Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030, USA
2 Department of Biostatistics, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030, USA
3 Department of Breast Imaging, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030, USA
4 Department of Breast Medical Oncology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030, USA
5 Department of Pathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030, USA
6 Department of Breast Surgical Oncology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030, USA
7 Department of Bioinformatics & Computational Biology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030, USA
8 Section of Breast Surgery, Baylor College of Medicine, 7200 Cambridge St., Houston, TX 77030, USA
9 Department of Abdominal Imaging, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030, USA
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Cancers 2025, 17(6), 966; https://doi.org/10.3390/cancers17060966
Submission received: 6 February 2025 / Revised: 27 February 2025 / Accepted: 4 March 2025 / Published: 13 March 2025
(This article belongs to the Special Issue Advances in Triple-Negative Breast Cancer)

Simple Summary

Deep learning models based on pretreatment MRI and clinicopathological data were constructed to predict responses to neoadjuvant systemic therapy in triple-negative breast cancer. The best-performing deep learning model developed from the pretreatment multiparametric breast MRI and clinicopathological data acquired from 282 patients with stage I–III triple-negative breast cancer enrolled in a prospective clinical trial achieved an AUC of 0.76 in the internal testing dataset. The deep learning model achieved an AUC of 0.72 in the external I-SPY 2 trial testing dataset. Tumor volume preprocessing affected the model performance; the 3D model frameworks, tumor ROI selection, and data inputs had minimal impact on the model performance.

Abstract

Purpose: To develop deep learning models for predicting the pathologic complete response (pCR) to neoadjuvant systemic therapy (NAST) in patients with triple-negative breast cancer (TNBC) based on pretreatment multiparametric breast MRI and clinicopathological data. Methods: The prospective institutional review board-approved study [NCT02276443] included 282 patients with stage I–III TNBC who had multiparametric breast MRI at baseline and underwent NAST and surgery during 2016–2021. Dynamic contrast-enhanced MRI (DCE), diffusion-weighted imaging (DWI), and clinicopathological data were used for the model development and internal testing. Data from the I-SPY 2 trial (2010–2016) were used for external testing. Four variables with a potential impact on model performance were systematically investigated: 3D model frameworks, tumor volume preprocessing, tumor ROI selection, and data inputs. Results: Forty-eight models with different variable combinations were investigated. The best-performing model in the internal testing dataset used DCE, DWI, and clinicopathological data with the originally contoured tumor volume, the tight bounding box of the tumor mask, and ResNeXt50, and achieved an area under the receiver operating characteristic curve (AUC) of 0.76 (95% CI: 0.60–0.88). The best-performing models in the external testing dataset achieved an AUC of 0.72 (95% CI: 0.57–0.84) using only DCE images (originally contoured tumor volume, enlarged bounding box of tumor mask, and ResNeXt50) and an AUC of 0.72 (95% CI: 0.56–0.86) using only DWI images (originally contoured tumor volume, enlarged bounding box of tumor mask, and ResNet18). Conclusions: We developed 3D deep learning models based on pretreatment data that could predict pCR to NAST in TNBC patients.

1. Introduction

Triple-negative breast cancer (TNBC) is an aggressive subtype of breast cancer that accounts for approximately 20% of all breast cancers [1]. TNBC is a heterogeneous disease with diverse molecular subtypes and is typically treated with neoadjuvant systemic therapy (NAST) followed by surgery, although the response to NAST varies across these subtypes [2]. Rates of pathologic complete response (pCR) after NAST range from 20% to 64% [2]. Patients with TNBC treated with NAST who achieve a pCR have favorable disease-free and overall survival, while up to 80% of patients with residual disease after NAST will develop metastatic disease [3]. The recent approval of immunotherapy as NAST for TNBC has led to increased rates of pCR, at the expense of higher toxicity [4]. Therefore, the accurate early prediction of whether a patient will have a pCR to NAST is highly important. The development of early noninvasive biomarkers of response to NAST could enable personalized treatments and minimize toxicity by permitting the triage of responders to less aggressive treatment strategies [5] or treatment de-escalation, sparing nonresponders from ineffective treatment regimens [6].
Prior studies have integrated baseline and posttreatment data to identify tumor changes that predict treatment responses in TNBC. Some studies used morphological, kinetic, and diffusion parameters from dynamic contrast-enhanced MRI (DCE) and diffusion-weighted imaging (DWI) to predict pCR status via linear-regression-based analysis [7,8,9,10]. Several recent studies have improved prediction accuracy using deep learning (DL) models that incorporate MRI findings [11,12,13,14]. However, these DL-based studies relied on input data from both pre-NAST scans and follow-up scans acquired several months later, which potentially precluded the early triage of patients to alternative treatments.
To overcome this limitation, other recent studies have attempted to use only baseline data to predict pCR to NAST. One model [15] used baseline DCE images and clinical information from the I-SPY 1 trial and achieved a testing AUC of 0.85. Another model incorporated molecular information with imaging data, achieving an AUC of 0.83 in an independent testing dataset [16]. However, both models were trained and tested across all breast cancer subtypes. To the best of our knowledge, there are no reports on the use of DL models for pCR prediction based on baseline data exclusively for TNBC.
The goal of our study was to develop DL models for predicting pCR to NAST in TNBC patients using pretreatment multiparametric breast MRI and clinicopathological data. Our study utilized the largest dataset of TNBC patients, whereas all previously published studies have included heterogeneous cohorts spanning all breast cancer subtypes. Further, our study used only the data that were available before the start of the neoadjuvant treatment, potentially enabling more personalized clinical management of the patients at an earlier time point.

2. Materials and Methods

2.1. Patient Data

This prospective study was approved by the institutional review board of MD Anderson Cancer Center. The internal dataset was from patients with stage I–III TNBC enrolled in an institutional clinical trial [NCT02276443] in which the patients were prospectively monitored for a response to NAST.
A total of 299 patients with biopsy-confirmed TNBC were enrolled between April 2016 and September 2021. The inclusion criteria were the availability of baseline multiparametric MRI and clinicopathological data, receipt of NAST, and definitive surgery with assessment of pCR status, with pCR defined as the absence of residual invasive disease in the breast and axilla. The exclusion criteria were inflammatory breast cancer; missing pCR status after surgery; the absence of multiparametric MRI (DCE or DWI) before NAST; incomplete clinical information; incomplete or uninterpretable MR images; or failed manual tumor segmentation due to technical issues. In total, 282 patients from the clinical trial were included in this study (Figure 1). These patients were randomly divided in a 5:1 ratio into a development (training and validation) dataset (n = 235) and an independent internal testing dataset (n = 47). The NAST regimen in the internal trial consisted of 4 cycles of dose-dense doxorubicin and cyclophosphamide followed by 12 cycles of paclitaxel or other agents. Data from these patients have been previously reported [13,17,18,19].
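The 5:1 random split described above can be illustrated with a minimal sketch. The function name, the fixed seed, and the use of integer patient IDs are assumptions made for illustration; the source does not describe how the randomization was implemented.

```python
import random

def split_patients(patient_ids, test_fraction=1 / 6, seed=0):
    """Randomly split patient IDs into development and internal testing sets."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed is an assumption, used for reproducibility
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)  # a 5:1 ratio puts 1/6 in testing
    return ids[n_test:], ids[:n_test]

# 282 hypothetical patient IDs, matching the cohort size reported in the text
development, testing = split_patients(range(282))
print(len(development), len(testing))  # 235 47
```

With 282 patients, one sixth is exactly 47, which reproduces the reported sizes of the development (n = 235) and internal testing (n = 47) datasets.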
The external testing dataset was from the I-SPY 2 trial [20], in which patients received 12 weekly cycles of paclitaxel alone (standard of care) or with experimental agents followed by 4 cycles of doxorubicin and cyclophosphamide every 2 to 3 weeks [21]. A total of 985 patients, including 242 with TNBC, were enrolled during 2010–2016. Patients with cancer types other than TNBC were excluded, and the identical inclusion and exclusion criteria used in the internal trial were applied, resulting in 62 patients with TNBC in the external testing dataset (Figure 1).

2.2. Imaging and Clinicopathological Data

The image acquisition parameters for DCE and DWI are detailed in Table A1 in Appendix B. For DCE, the subtraction images (referred to as DCE hereafter) obtained by subtracting the precontrast images from the images acquired 2.5 min after contrast agent injection were used for development and testing. For DWI, only the b = 800 s/mm² images (referred to as DWI hereafter) were used. The external image data were first resampled to the voxel size of the internal image data and then both datasets were processed with the same procedure thereafter. Semiautomatic segmentations with MIM Maestro 7.2.5 (MIM Software Inc., Cleveland, OH, USA), performed by two fellowship-trained breast radiologists [G.M.R. and R.M.M.] with 5 and 15 years of experience in breast MRI, respectively, were used as the reference masks for tumor voxels in both the DCE and DWI images for the internal dataset; the publicly available tumor segmentations for the I-SPY 2 dataset were used as the reference masks for the external testing dataset. Categorical clinicopathological data (clinical stage, T category, N category, stromal tumor-infiltrating lymphocyte level, Ki-67 index) were converted into numerical values. The image data were self-normalized at the subject level, while clinical information was normalized at the group level.
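The difference between subject-level and group-level normalization can be sketched as follows. The source does not state which normalization formula was used; z-scoring (zero mean, unit standard deviation) is assumed here purely for illustration, and all function names and values are hypothetical.

```python
from statistics import mean, pstdev

def zscore(values, mu, sigma):
    """Shift and scale values with the given mean and standard deviation."""
    return [(v - mu) / sigma for v in values]

def normalize_subject_image(voxels):
    """Subject level: statistics come only from this subject's own voxels."""
    return zscore(voxels, mean(voxels), pstdev(voxels))

def normalize_clinical_feature(cohort_values):
    """Group level: statistics come from one feature across the whole cohort."""
    return zscore(cohort_values, mean(cohort_values), pstdev(cohort_values))

# Two hypothetical subjects whose images differ only in intensity scale
images = {"subject_a": [0.0, 10.0, 20.0], "subject_b": [100.0, 300.0, 500.0]}
normalized_images = {sid: normalize_subject_image(v) for sid, v in images.items()}

# One hypothetical clinical feature (age) normalized across the cohort
ages = [40.0, 50.0, 60.0]
normalized_ages = normalize_clinical_feature(ages)
```

Under this assumption, subject-level normalization removes scanner- or acquisition-dependent intensity scaling between subjects, whereas group-level normalization preserves the relative differences between patients for each clinical variable.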

2.3. Models

In previous studies of MRI-based pCR prediction in TNBC, several variables were identified as potentially affecting model performance [11,13,14,16,22,23,24,25,26,27,28,29,30]. Therefore, we systematically investigated the pCR prediction performance of the models with those variables:
(1) Two 3D frameworks for transfer learning: ResNet18 [22,23,26,31], with ~5 million parameters and ~1000 trainable parameters, and ResNeXt50 [16,24], with ~30 million parameters and ~4000 trainable parameters. These two frameworks were chosen for their popularity and reported success in applications of medical-imaging-based prediction/classification.
(2) Tumor volume preprocessing: normalized tumor volume [11,13,25] and original tumor matrix size [14,16,26,27,28,29,30]. The normalization referred to the median tumor size across all 282 subjects in the internal dataset, defined as the median MRI-measured longest diameter (cm) in the baseline study, which was 2.8 cm. The histogram of the tumor volume normalization ratio is shown in Figure A1 in Appendix B. The purpose here was to evaluate whether using the original or the normalized tumor volume in preprocessing would impact the model performance, since both approaches [11,13,14,16,25,26,27,28,29,30] are reported in the literature.
(3) Tumor ROI selection (Figure 2, Figure A2 in Appendix B): ROI1, which included voxels within the aforementioned reference masks of tumors [11,16,27]; ROI2, which included voxels within a tight bounding box of the reference masks [13,14,25]; and ROI3, which included voxels within an enlarged bounding box of the ROI2 (dilated by 5 mm along the top, bottom, left, and right sides) [28,29,30].
(4) Data inputs: DCE images only; DWI images only; DCE and DWI images; or DCE and DWI images plus clinicopathological information.
For each input channel of the image, the base layer parameters of the chosen transfer-learned 3D framework were locked and then linked to the top layer, which remained unlocked for training. In models that fused multiple inputs, the base layer features of each input were combined through concatenation and then linked to the top layer.
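The three ROI options in item (3) can be sketched with a toy mask. This is only an illustration: the mask is represented as a set of (slice, row, column) voxel indices, the voxel spacing is a made-up value, and the 5 mm dilation is assumed to apply in-plane only (top, bottom, left, and right), which is one reading of the description above.

```python
import math

def tight_bounding_box(mask_voxels):
    """ROI2: smallest axis-aligned box (inclusive bounds) containing the mask."""
    zs, ys, xs = zip(*mask_voxels)
    return (min(zs), max(zs)), (min(ys), max(ys)), (min(xs), max(xs))

def enlarged_bounding_box(box, margin_mm, spacing_mm, shape):
    """ROI3: grow the box in-plane by margin_mm, clipped to the image extent."""
    (z0, z1), (y0, y1), (x0, x1) = box
    _, ny, nx = shape
    _, sy, sx = spacing_mm
    my = math.ceil(margin_mm / sy)  # margin in voxels along rows
    mx = math.ceil(margin_mm / sx)  # margin in voxels along columns
    return ((z0, z1),
            (max(y0 - my, 0), min(y1 + my, ny - 1)),
            (max(x0 - mx, 0), min(x1 + mx, nx - 1)))

# ROI1: the voxels inside the reference mask itself (hypothetical toy mask)
mask = {(5, 20, 30), (5, 21, 31), (6, 22, 30)}
roi2 = tight_bounding_box(mask)
roi3 = enlarged_bounding_box(roi2, margin_mm=5.0,
                             spacing_mm=(1.6, 1.0, 1.0),  # assumed spacing
                             shape=(24, 64, 64))
```

ROI2 and ROI3 need only the extent of the tumor rather than its exact contour, which is why they are less dependent on accurate voxel-level segmentation than ROI1.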
Forty-eight prediction models were evaluated with the internal testing dataset. Thirty-six models were evaluated with the external testing dataset because the clinicopathological information available from the I-SPY 2 dataset was insufficient for the models that required it. The model structures are illustrated in Figure 2, and the network configurations and software environment for model training are detailed in Appendix A. For models incorporating the DCE image series, the difference image, obtained by subtracting the precontrast image from the image acquired at 2.5 min after contrast agent injection, was used as the input. For models involving DWI, only the b = 800 s/mm² data were used. The three voxel selection methods for model training are detailed under ‘Data Pre-processing’ and in Figure A2. Each of the 48 alternative models was cross-validated as described in ‘Model Development’. The receiver operating characteristic (ROC) curves of the top-performing models are presented in the ‘Evaluation’ section and in Figure 3.
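The model counts follow directly from the four variables listed above, as the short enumeration below shows. The string labels are illustrative names chosen here, not identifiers from the study.

```python
from itertools import product

frameworks = ["ResNet18", "ResNeXt50"]
volume_preprocessing = ["original", "normalized"]
rois = ["ROI1", "ROI2", "ROI3"]
data_inputs = ["DCE", "DWI", "DCE+DWI", "DCE+DWI+clinical"]

# Every combination of the four variables is one candidate model
internal_models = list(product(frameworks, volume_preprocessing, rois, data_inputs))

# Models that need clinicopathological data could not be tested externally
external_models = [m for m in internal_models if "clinical" not in m[3]]

print(len(internal_models), len(external_models))  # 48 36
```

That is, 2 × 2 × 3 × 4 = 48 models for internal testing and 2 × 2 × 3 × 3 = 36 models for external testing.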

2.4. Statistical Analysis

Age, body mass index, and longest tumor diameter were summarized using their means and standard deviations; clinical stage, tumor category, nodal category, and NAST regimen were summarized using the number of subjects and percentages; and the stromal tumor-infiltrating lymphocyte level and Ki-67 index were summarized using their median and interquartile range.
Differences between the pCR and non-pCR groups in age, body mass index, and longest tumor diameter were assessed using a t-test; differences in clinical stage, tumor category, and nodal category were assessed using the Pearson χ² test; and differences in stromal tumor-infiltrating lymphocytes and Ki-67 level were assessed using Fisher’s exact test [32]. Statistical analyses were carried out using R version 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria). p values of less than 0.05 were considered statistically significant.
For each prediction model, performance was measured by the mean of the areas under the receiver operating characteristic (ROC) curve (AUCs) obtained on the testing dataset by the models from each of the five cross-validation folds. The AUCs were compared between models with DeLong’s method.
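The mean-AUC metric can be sketched in a few lines. The AUC is computed here from its rank interpretation (the probability that a pCR case receives a higher score than a non-pCR case, with ties counted as one half); the labels and fold scores are made-up toy values, not study data.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_fold_auc(labels, fold_scores):
    """Average the testing AUC over the models trained in each fold."""
    return sum(auc(labels, scores) for scores in fold_scores) / len(fold_scores)

# Toy testing set: 1 = pCR, 0 = non-pCR (hypothetical values)
labels = [1, 1, 0, 0]
fold_scores = [
    [0.9, 0.8, 0.2, 0.1],  # fold 1 model, AUC 1.0
    [0.9, 0.4, 0.5, 0.1],  # fold 2 model, AUC 0.75
    [0.6, 0.7, 0.3, 0.2],  # fold 3 model, AUC 1.0
    [0.3, 0.8, 0.5, 0.4],  # fold 4 model, AUC 0.5
    [0.7, 0.6, 0.6, 0.1],  # fold 5 model, AUC 0.875 (one tie)
]
result = mean_fold_auc(labels, fold_scores)  # 0.825
```

Each fold's model is scored on the same held-out testing set, so the mean summarizes how the training procedure performs rather than a single trained network.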

3. Results

Patient demographics and baseline clinical information are presented in Table 1. The development dataset, internal testing dataset, and external testing dataset were composed of 235 patients (age 50.0 ± 11.7 years), 47 patients (age 49.2 ± 10.6 years), and 62 patients (age 48.8 ± 10.5 years), respectively. The longest tumor diameter was significantly smaller in patients with a pCR than in patients without a pCR in both the internal dataset (p < 0.001) and the external testing dataset (p = 0.017). In addition, in the internal dataset, stromal tumor-infiltrating lymphocyte level (p < 0.001) and Ki-67 index (p = 0.012) were higher in the pCR group than in the non-pCR group, and there was an association between pCR and a lower clinical stage (p = 0.018), T category (p < 0.001), and N category (p < 0.001). The percentages of pCR cases in the development dataset, internal testing dataset, and external testing dataset were 48% (113/235), 40% (19/47), and 45% (28/62), respectively.

3.1. Top-Performing Models

For the internal testing dataset, the top-performing DL model using only DCE images achieved a mean AUC of 0.69 (95% confidence interval [CI]: 0.53–0.84), which resulted from the combination of the original tumor volume, ROI2, and ResNet18 (Table 2, Figure A3 in Appendix B). The top-performing models using DWI images only and using both DCE and DWI images also resulted from the same combination, achieving mean AUCs of 0.72 (95% CI: 0.56–0.86) for DWI images only (Table 2, Figure A4 in Appendix B), and 0.73 (95% CI: 0.57–0.86) using both DCE and DWI images (Table 2, Figure A5 in Appendix B). The top-performing model using DCE images, DWI images, and clinicopathological data achieved a mean AUC of 0.76 (95% CI: 0.60–0.88), which resulted from the combination of the original tumor volume, ROI2, and ResNeXt50 (Table 2, Figure A6 in Appendix B).
The logistic regression model using only clinical information achieved a mean AUC of 0.61 (range: 0.59–0.67) in the internal testing dataset.
For the external testing dataset, the top-performing DL model using only DCE images achieved a mean AUC of 0.72 (95% CI: 0.57–0.84), which resulted from the combination of the original tumor volume, ROI3, and ResNeXt50 (Table 3, Figure A3 in Appendix B). The top-performing DL model using only DWI images also achieved a mean AUC of 0.72 (95% CI: 0.56–0.86), which resulted from the combination of the original tumor volume, ROI3, and ResNet18 (Table 3, Figure A4 in Appendix B). The top-performing DL model using both DCE and DWI images achieved a mean AUC of 0.71 (95% CI: 0.56–0.84), which also resulted from the combination of the original tumor volume, ROI3, and ResNet18 (Table 3, Figure A5 in Appendix B).
The top-performing models by data inputs for the internal testing dataset (four models) and external testing dataset (three models) are summarized in Figure 3, and the confidence interval of each model is plotted in Figure A3, Figure A4, Figure A5 and Figure A6. The best-performing model in the internal testing dataset achieved a mean AUC of 0.76 and used DCE images, DWI images, and clinicopathological data. The two best-performing models in the external testing dataset each achieved a mean AUC of 0.72, one using only DCE images and the other using only DWI images.
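The width of the reported confidence intervals reflects the small testing sets. The source does not state how the 95% CIs were computed; the sketch below uses a percentile bootstrap, which is one common choice and is an assumption here. The scores are synthetic, and only the class counts (19 pCR cases among 47 patients) are taken from the internal testing dataset.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample patients with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # both classes are needed to define an AUC
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic scores for 19 pCR and 28 non-pCR cases (illustrative only)
rng = random.Random(1)
labels = [1] * 19 + [0] * 28
scores = [rng.gauss(0.6, 0.2) if y else rng.gauss(0.45, 0.2) for y in labels]
point_estimate = auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
```

With fewer than 50 patients, intervals of this kind are typically wide, which is consistent with the broad CIs reported for both testing datasets.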

3.2. Impact of 3D Frameworks, Tumor Volume Preprocessing, and Tumor ROI Selection

The 3D frameworks did not affect model performance. Across all paired combinations, model performance did not differ significantly between ResNet18 and ResNeXt50. Among the seven top-performing models, five were built on ResNet18, and two were built on ResNeXt50.
Tumor volume preprocessing had a major impact on model performance. In models with paired combinations, using the original tumor volume generally resulted in a higher mean AUC than using the normalized tumor volume. For example, across models using DCE images only, eight (two with internal testing dataset, six with external testing dataset) of the twelve paired comparisons achieved higher mean AUCs with the original tumor volume than with the normalized tumor volume.
Tumor ROI selection did not affect model performance. In the seven top-performing models, replacing the selected ROI with one of the other two ROI options in the paired combination resulted in a difference in the mean AUC of at most 0.04.

4. Discussion

In this study, we developed and validated 3D DL models based on pretreatment data for predicting pCR in TNBC patients undergoing NAST. The best-performing model in the internal testing dataset included DCE images, DWI images, and clinicopathological data and achieved a mean AUC of 0.76. The best-performing model in the external testing dataset included only DCE images or DWI images and achieved a similar mean AUC of 0.72. Furthermore, we found that tumor volume preprocessing affected model performance, but 3D model frameworks, tumor ROI selection, and data inputs had minimal impact on model performance.
Several previous studies used only pretreatment data to predict pCR in breast cancer patients [14,16,28,33]. In a study including 356 patients (unknown number with TNBC) with different breast cancer molecular subtypes, models using DCE and molecular information were developed [16]. The best-performing DL model used the image kinetics of DCE and molecular information and achieved an AUC of 0.83, while the model directly using DCE images and molecular data achieved an AUC of 0.79. In another study using the pretreatment data of 536 breast cancer patients (unknown number with TNBC), in which 429 patients were used for training and 107 were used for testing, the best-performing DL model used T1-weighted images, T2-weighted images, and clinical information and achieved an AUC of 0.89 [28]. The model using only T1-weighted images achieved an AUC of 0.73, the model using only T2-weighted images achieved an AUC of 0.66, and the model using clinical information alone achieved an AUC of 0.83. In contrast to prior reports, which were based on retrospective data analysis, our study used a well-curated prospectively collected dataset including only TNBC patients. Our top-performing models using only DCE images achieved AUCs of 0.69 for the internal testing dataset and 0.72 for the external testing dataset, which are comparable to the AUCs of the models from prior reports under similar settings.
Similarly to previous studies that showed the superior performance of models with multiple inputs over models with single inputs [9,16,31,32], our study showed that the best-performing model used DCE, DWI, and clinicopathological information. Among the models using DCE images alone or DWI images alone, we observed a large variation in AUC values in both the internal and external testing datasets. A logistic regression model using only clinicopathological information also performed poorly.
This study differs from previous studies in the investigation of input variables that affect model performance. We found that ResNet18, with its lower computational demands and faster performance in both model development and testing, may be better suited than ResNeXt50 for clinical applications. Unlike previous studies [11,13,25] that normalized tumor volume during preprocessing, we found that the original tumor volume without normalization was satisfactory for our model performance. Interestingly, we found that tumor ROI selection did not affect the model performance. Previous studies [11,13,14,16,25,27,28,29,30] have reported a good prediction performance with either accurate segmentation or bounding boxes, consistent with our findings. The lack of influence of the tumor ROI selection method on model performance suggests that our models can be used without often labor-intensive manual tumor segmentation.
The patients in our internal testing dataset received NAST regimens different from those used for patients in the external testing dataset. However, our top-performing model achieved a similar and even slightly better performance in the external testing dataset, suggesting that our model is stable across NAST regimens.
This study has several limitations. First, our DL models were developed only with DCE images, DWI images, and clinical information. Including other imaging modalities [11] (such as T2W images) or features [17,18,19] (such as radiomic information) might further improve prediction accuracy. A recent study reported that a DL model built from pathological slides acquired during the baseline biopsy of 540 breast cancer patients achieved an AUC of 0.85 [34]. Our results may be improved upon with the addition of digitized pathological information. Second, the methods used to integrate multiple inputs need further investigation. Our DL framework fused different inputs by concatenating the top-layer elements of each input at the “feature level”. Although this is a well-accepted data fusion strategy for classification purposes, some investigators [35,36] have also proposed fusion at the “decision level”, which is independent of the disparity between the original data dimensions among the modalities. Finally, the current models need to be validated on a larger dataset to improve their statistical confidence and reliability. We were not able to test our model based on DCE images, DWI images, and clinicopathological data on the external testing dataset due to insufficient available clinicopathological information. The comparable AUCs in the internal and external testing datasets demonstrated the reliability of our model across various NAST regimens. We would also expect a narrower confidence interval with a larger testing dataset.
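The distinction between feature-level and decision-level fusion mentioned above can be made concrete with a small sketch. The feature vectors and probabilities are hypothetical, and averaging is only one possible decision-level rule, assumed here for simplicity; it is not taken from the cited work.

```python
def feature_level_fusion(features_per_input):
    """Concatenate per-input feature vectors for one shared classifier."""
    fused = []
    for features in features_per_input:
        fused.extend(features)
    return fused

def decision_level_fusion(probabilities_per_input):
    """Combine pCR probabilities from separately trained per-input classifiers."""
    return sum(probabilities_per_input) / len(probabilities_per_input)

# Hypothetical features from three inputs with different dimensions
dce_features = [0.2, 0.7]
dwi_features = [0.5]
clinical_features = [1.0, 0.0, 0.3]
fused = feature_level_fusion([dce_features, dwi_features, clinical_features])

# Hypothetical per-input pCR probabilities
combined = decision_level_fusion([0.6, 0.7, 0.5])
```

Feature-level fusion yields a vector whose length depends on each input's dimensionality, whereas decision-level fusion combines one probability per input regardless of the original data dimensions.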

5. Conclusions

In conclusion, we were able to develop 3D DL models for predicting pCR to NAST in TNBC based on multiparametric MR images and clinical information acquired before the initiation of therapy. The model integrating DCE, DWI, and clinical information provided the best prediction performance. This model could be used for the triaging of TNBC patients to the most successful and least toxic treatment regimens while achieving the best outcomes.

Author Contributions

The authors confirm their contributions to this manuscript as follows: Conceptualization, Z.X., Z.Z., G.M.R. and J.M.; Data curation, B.E.A., T.W.M., M.S.G., M.M.P., G.J.W., J.W.T.L., H.T.C.L.-P. and L.H.; Formal analysis, Z.X., J.B.S., D.L.L., R.M.M., B.P., R.P.C., P.W., W.Y. and J.M.; Methodology, Z.X., Z.Z., J.B.S., H.C., H.F., D.T. and J.M.; Supervision, G.M.R., J.M. and C.Y.; Writing, Z.X., J.K.L., V.V., K.K.H., A.K., A.T., G.M.R. and J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This study has received funding from the National Institutes of Health/National Cancer Institute (Cancer Center Support Grant P30 CA016672). This study was supported by the University of Texas MD Anderson Cancer Center Breast Cancer Moonshot Program, Robert D. Moreton Distinguished Chair Funds in Diagnostic Radiology, and the Cancer Prevention and Research Institute of Texas Multi-Investigator Research Award (RP16710-C1-CPRIT). This work was supported in part by the generous philanthropic contributions to the Moon Shots Program of The University of Texas MD Anderson Cancer Center, the Shari Sella Memorial Fund (to D.T.), the Winterhof Fund (to D.T.), the Gayle Monroe Kuoni Breast Medical Oncology Research Endowment (to D.T.), the Still Water Foundation (to D.T.), the Suzanne Potter ARTEMIS Fund (to C.Y.), the Amelia Handegan Fund (to C.Y.), and a Conquer Cancer Career Development Award supported by Fleur Fairman (2020CDABC-5423266503, to C.Y.). Dr. Yam was additionally supported by a Conquer Cancer Career Development Award supported by Fleur Fairman, the 2018 Gianni Bonadonna Breast Cancer Research Fellowship (Conquer Cancer Foundation), the Allison and Brian Grove Endowed Fellowship for Breast Medical Oncology, and the Susan Papizan Dolan Fellowship in Breast Oncology.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of The University of Texas M. D. Anderson Cancer Center (protocol code: 2014-0185 and date of approval: 16 July 2014).

Informed Consent Statement

Written informed consent was received from all patients.

Data Availability Statement

Data are available from the corresponding author upon request.

Acknowledgments

The authors would like to thank Stephanie Deming of the Research Medical Library, The University of Texas MD Anderson Cancer Center, for editing the manuscript.

Conflicts of Interest

The funders had no role in the design of this study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results. K.K.H. serves on the Medical Advisory Board for ArmadaHealth, AstraZeneca, and receives research funding from Cairn Surgical, Eli Lilly & Co., and Lumicell. J.L. received grant or research support from Novartis, Medivation/Pfizer, Genentech, GSK, EMD-Serono, AstraZeneca, Medimmune, Zenith, and Merck; participated in the Speaker’s Bureau for MedLearning, Physician’s Education Resource, Prime Oncology, Medscape, Clinical Care Options, and Medpage; and receives royalty from UpToDate. The spouse of A.T. works for Eli Lilly. D.T. declares research contracts with Pfizer, Novartis, and Polyphor and is a consultant of AstraZeneca, GlaxoSmithKline, OncoPep, Gilead, Novartis, Pfizer, Personalis, and Sermonix. W.Y. receives royalties from Elsevier. J.M. is a consultant of C4 Imaging, L.L.C., and an inventor of United States patents licensed to Siemens Healthineers and GE Healthcare. For the remaining authors, none were declared.

Abbreviations

The following abbreviations are used in this manuscript:
AUC = area under the receiver operating characteristic curve
DCE = dynamic contrast-enhanced MRI
DL = deep learning
DWI = diffusion-weighted imaging
NAST = neoadjuvant systemic therapy
pCR = pathologic complete response
ROC = receiver operating characteristic
TNBC = triple-negative breast cancer
3D = three-dimensional

Appendix A

Technical Details of Model Construction

The DCE images were first downsampled into a matrix size of 64 × 64 × 24, and the DWI images were downsampled into a matrix size of 64 × 64 × 16 to be fed to the 3D DL model. The initial patch size was 64 × 64 × 4, with a pooling/stride size of 2. The top layers of each input were linked to a 3D global average pooling layer and then a dropout layer at a ratio of 0.5. If multiple input channels were considered in the model, the top layer of each channel was flattened, then fused by concatenating into a dense layer at a size of 256. The rectified linear unit (ReLU) was used as the activation function. Subsequently, two additional dense layers were added, sized at 128 and 32, respectively. These layers were interconnected and interspersed with two dropout layers, with dropout ratios of 0.5 and 0.25, respectively. To predict the probability of each patient’s pCR, a classification layer with a Softmax activation was positioned atop the model.
Binary cross-entropy loss was used as the loss function to train the network. The loss was optimized using an Adam optimizer with an initial learning rate of 1 × 10⁻⁵. The learning rate was decreased by 0.2 if the validation loss did not improve in three consecutive training epochs. The training process was terminated if the validation loss did not improve in seven consecutive training epochs. The model with the minimal validation loss was saved for testing. Data augmentation, including random flipping, scaling, translation, and rotation, was configured with the same settings as another related work [12].
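The learning-rate schedule and stopping rule described above can be simulated on a list of validation losses. This is a simplified sketch rather than the training code used in the study: "decreased by 0.2" is interpreted as multiplying the learning rate by 0.2 (an assumption), the function name is hypothetical, and the loss values are invented.

```python
def simulate_schedule(val_losses, lr=1e-5, factor=0.2,
                      lr_patience=3, stop_patience=7):
    """Return (best_epoch, last_epoch, final_lr) for a validation-loss trace."""
    best_loss, best_epoch, last_epoch = float("inf"), None, None
    since_best = 0       # epochs since the best validation loss
    since_lr_change = 0  # non-improving epochs since the last LR reduction
    for epoch, loss in enumerate(val_losses):
        last_epoch = epoch
        if loss < best_loss:
            # Improvement: this is the checkpoint that would be saved
            best_loss, best_epoch = loss, epoch
            since_best = since_lr_change = 0
            continue
        since_best += 1
        since_lr_change += 1
        if since_lr_change >= lr_patience:
            lr *= factor  # assumed meaning of "decreased by 0.2"
            since_lr_change = 0
        if since_best >= stop_patience:
            break  # early stopping after seven epochs without improvement
    return best_epoch, last_epoch, lr

# Invented validation losses: the best value occurs at epoch 5
val_losses = [0.70, 0.65, 0.66, 0.67, 0.68, 0.64, 0.65,
              0.66, 0.67, 0.68, 0.69, 0.70, 0.71, 0.72]
best_epoch, last_epoch, final_lr = simulate_schedule(val_losses)
```

In this trace the learning rate is reduced three times, training stops at epoch 12, and the model from epoch 5, which has the minimal validation loss, is the one that would be kept for testing.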
All model training was performed using an NVIDIA DGX1 system with dual 20-core Intel Xeon E5-2698 2.2-GHz CPUs, 512 GB of DDR4 RAM, and eight NVIDIA Tesla V100 32 GB GPUs with a total of 256 GB of GPU memory (NVIDIA, Santa Clara, CA, USA). The software environment included an Ubuntu Linux 18.04.6 operating system, Python 3.8.12, CUDA 11.1, cuDNN 7.6.5, and TensorFlow 2.8.0.

Appendix B

Table A1. Scan parameters of DCE, DWI, and clinical information.
(A) Internal Dataset

| DCE: 3D T1-Weighted DISCO | Value | DWI: FOCUS | Value | Clinical Information |
|---|---|---|---|---|
| Reconstruction matrix size | 512 × 512 | Reconstruction matrix size | 80 × 80 | Age |
| Field-of-view, mm | 300 × 300 | Field-of-view, mm | 160 × 160 | Clinical stage |
| Number of slices | 112~192 | Number of slices | 16 | T category |
| Slice thickness/gap, mm | 3.2/−1.6 | Slice thickness/gap, mm | 4/0 | N category |
| Flip angle, degrees | 12 | Flip angle, degrees | 90 | sTIL index |
| TR/TE1/TE2, ms | 6/1.1/2.3 | TR/TE, ms | 4000/70 | Ki-67 index |
| Number of temporal phases | 32~64 | b-values, s/mm² | 100, 800 | BMI |

Note: All MRI scans were performed on a 3T GE Discovery 750w scanner (GE Healthcare, Chicago, IL, USA) using an 8-channel bilateral phased array breast coil. BMI = body mass index; DCE = dynamic contrast-enhanced MRI; DISCO = differential subsampling with Cartesian ordering; DWI = diffusion-weighted imaging; FOCUS = field-of-view optimized and constrained undistorted single-shot; sTIL = stromal tumor-infiltrating lymphocyte; and 3D = three-dimensional.

(B) External Dataset

| DCE: 3D T1-Weighted | Value | DWI: 2D SE-EPI | Value | Clinical Information |
|---|---|---|---|---|
| Reconstruction matrix size | 512 × 512 | Reconstruction matrix size | 256 × 256 | Age |
| Field-of-view, mm | (260~360, 300~360) | Field-of-view, mm | (260~360, 300~360) | Clinical stage |
| Number of slices | >60 | Number of slices | variable | T category |
| Slice thickness/gap, mm | ≤2.5/0 | Slice thickness/gap, mm | 4–5/0 | |
| Flip angle, degrees | 10–20 | Flip angle, degrees | 90 | |
| TR/TE, ms | 4~10/1.4~4.8 | TR/TE, ms | 4000~10,600/50~100 | |
| Number of temporal phases | 68–256 | b-values, s/mm² | 0, 100, 600, 800 | |

Note: MRI scans were performed using 1.5T or 3.0T GE scanners (Optima MR450w; Signa HDx; Signa HDxt; DISCOVERY MR750; DISCOVERY MR750w; GE Healthcare, Chicago, IL, USA), 1.5T or 3.0T Philips scanners (Achieva; Intera; Ingenia; Philips Healthcare, Best, The Netherlands), or 1.5T or 3.0T Siemens scanners (Avanto; Espree; Symphony; SymphonyTim; Verio; TrioTim; Prisma_fit; Siemens Medical Solutions, Erlangen, Germany). DCE = dynamic contrast-enhanced MRI; DWI = diffusion-weighted imaging; and SE-EPI = spin-echo echo-planar imaging.
Figure A1. Histogram of tumor volume normalization ratio. MRLD = MRI-measured longest diameter (cm) at baseline; MRLDref = median MRLD across all subjects in the internal dataset, which was 2.8 cm. A ratio of “2” indicates that the tumor volume is halved. The median MRLD across all subjects in the external dataset was also 2.8 cm.
Figure A2. Three types of voxel selection for model training from one slice of a DCE image. The dashed line indicates the exact manually segmented tumor mask (reference mask; ROI1); the thin yellow line indicates the tight bounding box of the reference mask (ROI2); and the thick red line indicates the enlarged bounding box dilated by 5 mm along the top, bottom, left, and right sides of ROI2 (ROI3).
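The three voxel selections in Figure A2 can be sketched for a single 2D slice as follows. This is an illustrative plain-Python version, not the authors' preprocessing code. It assumes that voxels outside the chosen region are set to zero (rather than cropped) and that the 5 mm margin is rounded up to a whole number of pixels; the helper names `tight_bbox`, `enlarge_bbox`, and `select_voxels` are hypothetical.

```python
import math


def tight_bbox(mask):
    """Inclusive (row_min, row_max, col_min, col_max) of the nonzero mask pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask is empty")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1], cols[0], cols[-1]


def enlarge_bbox(bbox, shape, pixel_mm, margin_mm=5.0):
    """Dilate a bounding box by margin_mm on all four sides, clipped to the image."""
    r0, r1, c0, c1 = bbox
    n_rows, n_cols = shape
    pad_r = math.ceil(margin_mm / pixel_mm[0])  # assumed rounding: whole pixels, up
    pad_c = math.ceil(margin_mm / pixel_mm[1])
    return (max(0, r0 - pad_r), min(n_rows - 1, r1 + pad_r),
            max(0, c0 - pad_c), min(n_cols - 1, c1 + pad_c))


def select_voxels(image, mask, roi, pixel_mm=(1.0, 1.0)):
    """Keep the pixels inside the requested ROI type and zero everything else."""
    if roi == "ROI1":  # exact segmentation mask
        return [[px if m else 0 for px, m in zip(img_row, mask_row)]
                for img_row, mask_row in zip(image, mask)]
    box = tight_bbox(mask)  # ROI2: tight bounding box
    if roi == "ROI3":       # ROI3: bounding box enlarged by 5 mm
        box = enlarge_bbox(box, (len(mask), len(mask[0])), pixel_mm)
    elif roi != "ROI2":
        raise ValueError(f"unknown ROI type: {roi}")
    r0, r1, c0, c1 = box
    return [[px if r0 <= r <= r1 and c0 <= c <= c1 else 0
             for c, px in enumerate(row)]
            for r, row in enumerate(image)]


# Toy 10 x 10 slice with an irregular five-pixel "tumor" and 2.5 mm pixels.
mask = [[0] * 10 for _ in range(10)]
for r, c in [(4, 3), (4, 4), (5, 4), (5, 5), (5, 6)]:
    mask[r][c] = 1
image = [[1] * 10 for _ in range(10)]
roi3 = select_voxels(image, mask, "ROI3", pixel_mm=(2.5, 2.5))
```

In the toy slice, ROI1 keeps 5 pixels, ROI2 keeps the 2 × 4 box around them, and ROI3 keeps a 6 × 8 box, since 5 mm corresponds to two 2.5 mm pixels on each side.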
Figure A3. The receiver operating characteristic (ROC) curve of the cross-validated classifiers and the ROC curve of the final model (the average of these classifiers) for the top-performing model using only dynamic contrast-enhanced MRI (DCE) images in the internal testing dataset and external testing dataset. AUC = area under the ROC curve.
Figure A4. The receiver operating characteristic (ROC) curve of the cross-validated classifiers and the ROC curve of the final model (the average of these classifiers) for the top-performing model using only diffusion-weighted imaging (DWI) images in the internal testing dataset and external testing dataset. AUC = area under the ROC curve.
Figure A5. The receiver operating characteristic (ROC) curve of the cross-validated classifiers and the ROC curve of the final model (the average of these classifiers) for the top-performing model using dynamic contrast-enhanced MRI (DCE) images and diffusion-weighted imaging (DWI) images in the internal testing dataset and external testing dataset. AUC = area under the ROC curve.
Figure A6. The receiver operating characteristic (ROC) curve of the cross-validated classifiers and the ROC curve of the final model (the average of these classifiers) for the top-performing model using dynamic contrast-enhanced MRI (DCE) images, diffusion-weighted imaging (DWI) images, and clinical information in the internal testing dataset. AUC = area under the ROC curve.
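Figures A3 to A6 compare the individual cross-validated classifiers with a final model formed as their average. A minimal sketch of that comparison is given below in plain Python. It assumes that "average" means the mean of the per-patient predicted pCR probabilities across the fold classifiers, and it computes the AUC with the rank-based (Mann-Whitney) formulation; the function names and the toy scores are hypothetical and are not study data.

```python
def average_predictions(fold_probs):
    """Mean predicted probability per patient across the fold classifiers."""
    n_folds = len(fold_probs)
    return [sum(per_patient) / n_folds for per_patient in zip(*fold_probs)]


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Mean, minimum, and maximum AUC across the fold classifiers."""
    return sum(aucs) / len(aucs), min(aucs), max(aucs)


# Toy example: 1 = pCR, 0 = non-pCR, two hypothetical fold classifiers.
labels = [1, 1, 0, 0]
fold_probs = [
    [0.9, 0.4, 0.5, 0.1],
    [0.6, 0.8, 0.3, 0.7],
]
fold_aucs = [roc_auc(labels, probs) for probs in fold_probs]
mean_auc, min_auc, max_auc = summarize(fold_aucs)
final_auc = roc_auc(labels, average_predictions(fold_probs))
```

In this toy case each fold classifier misorders one pair of patients and reaches an AUC of 0.75, while the averaged scores separate the two classes completely, which illustrates why the curve of the final model can differ from the curves of its constituent classifiers.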

References

  1. Foulkes, W.D.; Smith, I.E.; Reis-Filho, J.S. Triple-Negative Breast Cancer. N. Engl. J. Med. 2010, 363, 1938–1948.
  2. Lee, J.S.; Yost, S.E.; Yuan, Y. Neoadjuvant Treatment for Triple Negative Breast Cancer: Recent Progresses and Challenges. Cancers 2020, 12, 1404.
  3. Esserman, L.J.; Berry, D.A.; DeMichele, A.; Carey, L.; Davis, S.E.; Buxton, M.; Hudis, C.; Gray, J.W.; Perou, C.; Yau, C.; et al. Pathologic complete response predicts recurrence-free survival more effectively by cancer subset: Results from the I-SPY 1 TRIAL—CALGB 150007/150012, ACRIN 6657. J. Clin. Oncol. 2012, 30, 3242–3249.
  4. Schmid, P.; Rugo, H.S.; Adams, S.; Schneeweiss, A.; Barrios, C.H.; Iwata, H.; Diéras, V.; Henschel, V.; Molinero, L.; Chui, S.Y.; et al. Atezolizumab plus nab-paclitaxel as first-line treatment for unresectable, locally advanced or metastatic triple-negative breast cancer (IMpassion130): Updated efficacy results from a randomised, double-blind, placebo-controlled, phase 3 trial. Lancet Oncol. 2020, 21, 44–59.
  5. Gentile, D.; Martorana, F.; Karakatsanis, A.; Caruso, F.; Caruso, M.; Castiglione, G.; Di Grazia, A.; Pane, F.; Rizzo, A.; Vigneri, P.; et al. Predictors of mastectomy in breast cancer patients with complete remission of primary tumor after neoadjuvant therapy: A retrospective study. Eur. J. Surg. Oncol. 2024, 50, 108732.
  6. Karagiannis, G.S.; Pastoriza, J.M.; Wang, Y.; Harney, A.S.; Entenberg, D.; Pignatelli, J.; Sharma, V.P.; Xue, E.A.; Cheng, E.; D’alfonso, T.M.; et al. Neoadjuvant chemotherapy induces breast cancer metastasis through a TMEM-mediated mechanism. Sci. Transl. Med. 2017, 9, eaan0026.
  7. Sharma, U.; Danishad, K.K.A.; Seenu, V.; Jagannathan, N.R. Longitudinal study of the assessment by MRI and diffusion-weighted imaging of tumor response in patients with locally advanced breast cancer undergoing neoadjuvant chemotherapy. NMR Biomed. 2009, 22, 104–113.
  8. Fangberget, A.; Nilsen, L.B.; Hole, K.H.; Holmen, M.M.; Engebraaten, O.; Naume, B.; Smith, H.-J.; Olsen, D.R.; Seierstad, T. Neoadjuvant chemotherapy in breast cancer-response evaluation and prediction of response to treatment using dynamic contrast-enhanced and diffusion-weighted MR imaging. Eur. Radiol. 2011, 21, 1188–1199.
  9. Li, X.; Abramson, R.G.; Arlinghaus, L.R.; Kang, H.; Chakravarthy, A.B.; Abramson, V.G.; Farley, J.; Kelley, M.C.; Meszoely, I.M.; Means-Powell, J.; et al. Combined DCE-MRI and DW-MRI for Predicting Breast Cancer Pathological Response After the First Cycle of Neoadjuvant Chemotherapy. Investig. Radiol. 2015, 50, 195–204.
  10. Cho, N.; Im, S.-A.; Park, I.-A.; Lee, K.-H.; Li, M.; Han, W.; Noh, D.-Y.; Moon, W.K. Breast cancer: Early Prediction of Response to Neoadjuvant Chemotherapy Using Parametric Response Maps for MR Imaging. Radiology 2014, 272, 385–396.
  11. Comes, M.C.; Fanizzi, A.; Bove, S.; Didonna, V.; Diotaiuti, S.; La Forgia, D.; Latorre, A.; Martinelli, E.; Mencattini, A.; Nardone, A.; et al. Early prediction of neoadjuvant chemotherapy response by exploiting a transfer learning approach on breast DCE-MRIs. Sci. Rep. 2021, 11, 14123.
  12. Choi, J.H.; Kim, H.-A.; Kim, W.; Lim, I.; Lee, I.; Byun, B.H.; Noh, W.C.; Seong, M.-K.; Lee, S.-S.; Kim, B.I.; et al. Early prediction of neoadjuvant chemotherapy response for advanced breast cancer using PET/MRI image deep learning. Sci. Rep. 2020, 10, 21149.
  13. Zhou, Z.; Adrada, B.E.; Candelaria, R.P.; Elshafeey, N.A.; Boge, M.; Mohamed, R.M.; Pashapoor, S.; Sun, J.; Xu, Z.; Panthi, B.; et al. Prediction of pathologic complete response to neoadjuvant systemic therapy in triple negative breast cancer using deep learning on multiparametric MRI. Sci. Rep. 2023, 13, 1171.
  14. Ravichandran, K.; Braman, N.; Janowczyk, A.; Madabhushi, A. A deep learning classifier for prediction of pathological complete response to neoadjuvant chemotherapy from baseline breast DCE-MRI. In Medical Imaging 2018: Computer-Aided Diagnosis; Mori, K., Petrick, N., Eds.; SPIE: Bellingham, WA, USA, 2018; Volume 11.
  15. Braman, N.; Adoui, M.E.; Vulchi, M.; Turk, P.; Etesami, M.; Fu, P.; Bera, K.; Drisis, S.; Varadan, V.; Plecha, D.; et al. Deep learning-based prediction of response to HER2-targeted neoadjuvant chemotherapy from pre-treatment dynamic breast MRI: A multi-institutional validation study. arXiv 2020, arXiv:2001.08570.
  16. Peng, Y.; Cheng, Z.; Gong, C.; Zheng, C.; Zhang, X.; Wu, Z.; Yang, Y.; Yang, X.; Zheng, J.; Shen, J. Pretreatment DCE-MRI-Based Deep Learning Outperforms Radiomics Analysis in Predicting Pathologic Complete Response to Neoadjuvant Chemotherapy in Breast Cancer. Front. Oncol. 2022, 12, 846775.
  17. Musall, B.C.; Rauch, D.E.; Mohamed, R.M.; Panthi, B.; Boge, M.; Candelaria, R.P.; Chen, H.; Guirguis, M.S.; Hunt, K.K.; Huo, L.; et al. Diffusion Tensor Imaging for Characterizing Changes in Triple-Negative Breast Cancer During Neoadjuvant Systemic Therapy. J. Magn. Reson. Imaging 2024, 60, 1367–1376.
  18. Panthi, B.; Mohamed, R.M.; Adrada, B.E.; Boge, M.; Candelaria, R.P.; Chen, H.; Hunt, K.K.; Huo, L.; Hwang, K.-P.; Korkut, A.; et al. Longitudinal dynamic contrast-enhanced MRI radiomic models for early prediction of response to neoadjuvant systemic therapy in triple-negative breast cancer. Front. Oncol. 2023, 13, 1264259.
  19. Mohamed, R.M.; Panthi, B.; Adrada, B.E.; Boge, M.; Candelaria, R.P.; Chen, H.; Guirguis, M.S.; Hunt, K.K.; Huo, L.; Hwang, K.-P.; et al. Multiparametric MRI-based radiomic models for early prediction of response to neoadjuvant systemic therapy in triple-negative breast cancer. Sci. Rep. 2024, 14, 16073.
  20. Yee, D.; DeMichele, A.M.; Yau, C.; Isaacs, C.; Symmans, W.F.; Albain, K.S.; Chen, Y.Y.; Krings, G.; Wei, S.; Harada, S.; et al. Association of Event-Free and Distant Recurrence-Free Survival with Individual-Level Pathologic Complete Response in Neoadjuvant Treatment of Stages 2 and 3 Breast Cancer: Three-Year Follow-up Analysis for the I-SPY2 Adaptively Randomized Clinical Trial. JAMA Oncol. 2020, 6, 1355–1362.
  21. Li, W.; Newitt, D.C.; Gibbs, J.; Wilmes, L.J.; Jones, E.F.; Arasu, V.A.; Strand, F.; Onishi, N.; Nguyen, A.A.-T.; Kornak, J.; et al. Predicting breast cancer response to neoadjuvant treatment using multi-feature MRI: Results from the I-SPY 2 TRIAL. NPJ Breast Cancer 2020, 6, 63.
  22. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  23. Moon, J.; Kim, J.; Shin, Y.; Hwang, S. Confidence-Aware Learning for Deep Neural Networks. In Proceedings of the 37th International Conference on Machine Learning, Virtual, 13–18 July 2020.
  24. Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5987–5995.
  25. Liu, M.Z.; Mutasa, S.; Chang, P.; Siddique, M.; Jambawalikar, S.; Ha, R. A novel CNN algorithm for pathological complete response prediction using an I-SPY TRIAL breast MRI database. Magn. Reson. Imaging 2020, 73, 148–151.
  26. Dammu, H.; Ren, T.; Duong, T.Q. Deep learning prediction of pathological complete response, residual cancer burden, and progression-free survival in breast cancer patients. PLoS ONE 2023, 18, e0280148.
  27. Jin, C.; Yu, H.; Ke, J.; Ding, P.; Yi, Y.; Jiang, X.; Duan, X.; Tang, J.; Chang, D.T.; Wu, X.; et al. Predicting treatment response from longitudinal images using multi-task deep learning. Nat. Commun. 2021, 12, 1851.
  28. Joo, S.; Ko, E.S.; Kwon, S.; Jeon, E.; Jung, H.; Kim, J.-Y.; Chung, M.J.; Im, Y.-H. Multimodal deep learning models for the prediction of pathologic response to neoadjuvant chemotherapy in breast cancer. Sci. Rep. 2021, 11, 18800.
  29. Qu, Y.H.; Zhu, H.T.; Cao, K.; Li, X.T.; Ye, M.; Sun, Y.S. Prediction of pathological complete response to neoadjuvant chemotherapy in breast cancer using a deep learning (DL) method. Thorac. Cancer 2020, 11, 651–658.
  30. Ha, R.; Chin, C.; Karcich, J.; Liu, M.Z.; Chang, P.; Mutasa, S.; Van Sant, E.P.; Wynn, R.T.; Connolly, E.; Jambawalikar, S. Prior to Initiation of Chemotherapy, Can We Predict Breast Tumor Response? Deep Learning Convolutional Neural Networks Approach Using a Breast MRI Tumor Dataset. J. Digit. Imaging 2019, 32, 693–701.
  31. Duanmu, H.; Bhattarai, S.; Li, H.; Shi, Z.; Wang, F.; Teodoro, G.; Gogineni, K.; Subhedar, P.; Kiraz, U.; Janssen, E.A.M.; et al. A spatial attention guided deep learning system for prediction of pathological complete response using breast cancer histopathology images. Bioinformatics 2022, 38, 4605–4612.
  32. Abuhadra, N.; Sun, R.; Yam, C.; Rauch, G.M.; Ding, Q.; Lim, B.; Thompson, A.M.; Mittendorf, E.A.; Adrada, B.E.; Damodaran, S.; et al. Predictive Roles of Baseline Stromal Tumor-Infiltrating Lymphocytes and Ki-67 in Pathologic Complete Response in an Early-Stage Triple-Negative Breast Cancer Prospective Trial. Cancers 2023, 15, 3275.
  33. Cain, E.H.; Saha, A.; Harowicz, M.R.; Marks, J.R.; Marcom, P.K.; Mazurowski, M.A. Multivariate machine learning models for prediction of pathologic response to neoadjuvant therapy in breast cancer using MRI features: A study using an independent validation set. Breast Cancer Res. Treat. 2019, 173, 455–463.
  34. Li, F.; Yang, Y.; Wei, Y.; He, P.; Chen, J.; Zheng, Z.; Bu, H. Deep learning-based predictive biomarker of pathological complete response to neoadjuvant chemotherapy from histological images in breast cancer. J. Transl. Med. 2021, 19, 348.
  35. Pei, X.; Zuo, K.; Li, Y.; Pang, Z. A Review of the Application of Multi-modal Deep Learning in Medicine: Bibliometrics and Future Directions. Int. J. Comput. Intell. Syst. 2023, 16, 44.
  36. Zahavy, T.; Krishnan, A.; Magnani, A.; Mannor, S. Is a picture worth a thousand words? A deep multi-modal architecture for product classification in e-commerce. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, AAAI 2018, New Orleans, LA, USA, 2–7 February 2018; pp. 7873–7880.
Figure 1. Flowcharts of patient recruitment and exclusion for the internal and external datasets. The external data (I-SPY 2 Imaging Cohort 1) are publicly available and were downloaded from The Cancer Imaging Archive. DCE = dynamic contrast-enhanced MRI; DWI = diffusion-weighted imaging; pCR = pathologic complete response; TNBC = triple-negative breast cancer.
Figure 2. Workflow of training the deep learning (DL) prediction models. In “Model development”, training folds are indicated with blue and validation folds are indicated with orange. AUC = area under the ROC curve; pCR = pathologic complete response; ROI = region of interest; and TNBC = triple-negative breast cancer.
Figure 3. Receiver operating characteristic curves of the best-performing models with the input of the dynamic contrast-enhanced MRI (DCE) images only, the diffusion-weighted imaging (DWI) images only, DCE with DWI images, and DCE and DWI with clinicopathological information.
Table 1. Demographics and baseline clinical information of patients in the (A) internal and (B) external dataset.
(A)

| Internal Data | All Patients | pCR | non-pCR | p Value |
|---|---|---|---|---|
| No. of patients | 282 | 132 | 150 | |
| Age, mean ± SD, y | 49.8 ± 11.4 | 49.5 ± 11.5 | 50.1 ± 11.5 | 0.36 |
| BMI, mean, kg/m² | 28.9 | 28.5 | 29.2 | 0.15 |
| Longest tumor diameter, mean ± SD, cm | 3.4 ± 1.5 | 2.9 ± 1.3 | 3.6 ± 1.8 | <0.001 |
| Clinical stage, n (%) | | | | 0.018 |
| I | 34 (12) | 19 (14) | 15 (10) | |
| II | 206 (73) | 100 (76) | 106 (71) | |
| III | 42 (15) | 13 (10) | 29 (19) | |
| T category, n (%) | | | | <0.001 |
| T1 | 51 (18) | 31 (23) | 20 (13) | |
| T2 | 189 (67) | 91 (69) | 98 (65) | |
| T3 | 36 (13) | 8 (6) | 28 (19) | |
| T4 | 6 (2) | 2 (2) | 4 (3) | |
| N category, n (%) | | | | <0.001 |
| N0 | 181 (64) | 92 (70) | 89 (59) | |
| N1 | 72 (26) | 30 (23) | 42 (28) | |
| N2 | 7 (2) | 3 (2) | 4 (3) | |
| N3 | 22 (8) | 7 (5) | 15 (10) | |
| Stromal tumor-infiltrating lymphocyte level, %, median (IQR) | 10 (4–20) | 20 (4–30) | 10 (4–20) | <0.001 |
| Ki-67 index, %, median (IQR) | 70 (50–90) | 75 (50–90) | 70 (51–87) | 0.012 |
| NAST regimen, n | | | | |
| Doxorubicin + Paclitaxel | 233 | 121 | 112 | |
| Doxorubicin + Enzalutamide | 11 | 3 | 8 | |
| Doxorubicin + Panitumumab | 15 | 2 | 13 | |
| Doxorubicin + Everolimus | 7 | 0 | 7 | |
| Doxorubicin + Atezolizumab | 12 | 5 | 7 | |
| Doxorubicin + Alpelisib | 4 | 1 | 3 | |

(B)

| External Data | All Patients | pCR | non-pCR | p Value |
|---|---|---|---|---|
| No. of patients | 62 | 28 | 34 | |
| Age, mean ± SD, y | 48.8 ± 10.5 | 49.4 ± 10.4 | 48.2 ± 10.9 | 0.62 |
| Longest tumor diameter, mean ± SD, cm | 4.1 ± 2.3 | 3.5 ± 1.1 | 4.7 ± 2.9 | 0.017 |
| T category, n (%) | | | | 0.11 |
| T2 | 5 (8) | 1 (4) | 4 (12) | |
| T3 | 56 (90) | 26 (92) | 30 (88) | |
| N/A | 1 (2) | 1 (4) | 0 (0) | |
| NAST regimen, n | | | | |
| Paclitaxel | 13 | 3 | 10 | |
| Paclitaxel + MK-2206 | 9 | 6 | 3 | |
| Paclitaxel + AMG 386 | 15 | 7 | 8 | |
| Paclitaxel + Ganetespib | 1 | 0 | 1 | |
| Paclitaxel + Ganitumab | 24 | 12 | 12 | |

Note: BMI = body mass index; IQR = interquartile range; NAST = neoadjuvant systemic therapy; pCR = pathologic complete response; SD = standard deviation; and N/A = not available.
Table 2. AUCs by data inputs and other variables in the internal testing dataset.
Internal Testing Dataset

| Input | Tumor volume | ResNet18, ROI1 | ResNet18, ROI2 | ResNet18, ROI3 | ResNeXt50, ROI1 | ResNeXt50, ROI2 | ResNeXt50, ROI3 |
|---|---|---|---|---|---|---|---|
| DCE-only models | Normalized | 0.63 (0.58, 0.7) | 0.61 (0.59, 0.65) | 0.63 (0.55, 0.68) | 0.52 (0.44, 0.61) | 0.48 (0.39, 0.62) | 0.48 (0.45, 0.53) |
| DCE-only models | Original | 0.68 (0.62, 0.7) | 0.69 (0.66, 0.74) | 0.68 (0.66, 0.7) | 0.66 (0.65, 0.67) | 0.65 * (0.64, 0.66) | 0.65 * (0.65, 0.66) |
| DWI-only models | Normalized | 0.60 (0.55, 0.68) | 0.59 (0.49, 0.67) | 0.61 (0.46, 0.74) | 0.61 (0.61, 0.62) | 0.61 (0.59, 0.63) | 0.57 (0.56, 0.59) |
| DWI-only models | Original | 0.69 (0.67, 0.72) | 0.72 (0.69, 0.75) | 0.70 (0.69, 0.71) | 0.53 (0.51, 0.56) | 0.58 (0.53, 0.73) | 0.60 (0.54, 0.73) |
| DCE + DWI models | Normalized | 0.57 (0.5, 0.63) | 0.60 (0.59, 0.64) | 0.59 (0.54, 0.62) | 0.58 (0.54, 0.61) | 0.59 (0.58, 0.6) | 0.57 (0.56, 0.58) |
| DCE + DWI models | Original | 0.71 (0.67, 0.76) | 0.73 (0.71, 0.74) | 0.72 (0.68, 0.76) | 0.71 (0.66, 0.78) | 0.70 (0.65, 0.78) | 0.71 (0.67, 0.77) |
| DCE + DWI + clinical information models | Normalized | 0.68 (0.66, 0.72) | 0.69 (0.66, 0.72) | 0.69 (0.66, 0.73) | 0.67 (0.64, 0.70) | 0.68 (0.65, 0.69) | 0.67 (0.63, 0.70) |
| DCE + DWI + clinical information models | Original | 0.74 (0.67, 0.79) | 0.73 (0.67, 0.81) | 0.71 (0.65, 0.76) | 0.72 (0.69, 0.75) | 0.76 (0.72, 0.78) | 0.72 (0.68, 0.75) |

Note: Values in the table are mean (minimum, maximum) areas under the receiver operating characteristic curve (AUCs). The top-performing models are highlighted in bold. DCE = dynamic contrast-enhanced MR images; DWI = diffusion-weighted imaging images; ROI1 = accurate tumor segmentation; ROI2 = bounding box over tumor; and ROI3 = enlarged bounding box dilated by 5 mm along the top, bottom, left, and right sides of ROI2. * Higher AUC than the paired model with the other tumor volume preprocessing method, p < 0.05.
Table 3. AUCs by data inputs and other variables in the external testing dataset.
External Testing Dataset

| Input | Tumor volume | ResNet18, ROI1 | ResNet18, ROI2 | ResNet18, ROI3 | ResNeXt50, ROI1 | ResNeXt50, ROI2 | ResNeXt50, ROI3 |
|---|---|---|---|---|---|---|---|
| DCE-only models | Normalized | 0.41 (0.33, 0.48) | 0.46 (0.43, 0.57) | 0.53 (0.49, 0.54) | 0.43 (0.40, 0.53) | 0.44 (0.40, 0.58) | 0.44 (0.42, 0.53) |
| DCE-only models | Original | 0.69 * (0.63, 0.69) | 0.67 * (0.64, 0.69) | 0.67 * (0.64, 0.69) | 0.71 * (0.71, 0.72) | 0.71 * (0.71, 0.72) | 0.72 * (0.71, 0.72) |
| DWI-only models | Normalized | 0.61 (0.52, 0.58) | 0.57 (0.45, 0.63) | 0.68 (0.59, 0.76) | 0.53 (0.51, 0.54) | 0.62 (0.60, 0.68) | 0.61 (0.57, 0.71) |
| DWI-only models | Original | 0.66 (0.65, 0.67) | 0.70 (0.68, 0.73) | 0.72 (0.71, 0.72) | 0.66 (0.64, 0.66) | 0.68 (0.60, 0.72) | 0.69 (0.68, 0.69) |
| DCE + DWI models | Normalized | 0.62 (0.49, 0.60) | 0.61 (0.38, 0.62) | 0.68 (0.57, 0.68) | 0.59 (0.52, 0.65) | 0.63 (0.63, 0.65) | 0.64 (0.62, 0.65) |
| DCE + DWI models | Original | 0.67 (0.64, 0.70) | 0.70 (0.69, 0.72) | 0.71 (0.67, 0.73) | 0.66 (0.55, 0.68) | 0.67 (0.50, 0.70) | 0.71 (0.60, 0.72) |

Note: Values in the table are the mean (minimum, maximum) areas under the receiver operating characteristic curve (AUCs). The top-performing models are highlighted in bold. DCE = dynamic contrast-enhanced MR images; DWI = diffusion-weighted imaging images; ROI1 = accurate tumor segmentation; ROI2 = bounding box over tumor; and ROI3 = enlarged bounding box dilated by 5 mm along the top, bottom, left, and right sides of ROI2. * Higher AUC than the paired model with the other tumor volume preprocessing method, p < 0.05.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xu, Z.; Zhou, Z.; Son, J.B.; Feng, H.; Adrada, B.E.; Moseley, T.W.; Candelaria, R.P.; Guirguis, M.S.; Patel, M.M.; Whitman, G.J.; et al. Deep Learning Models Based on Pretreatment MRI and Clinicopathological Data to Predict Responses to Neoadjuvant Systemic Therapy in Triple-Negative Breast Cancer. Cancers 2025, 17, 966. https://doi.org/10.3390/cancers17060966
