Figure 1.
Bioinformatics analysis of gene targets for increasing MHC-I expression in breast cancer. (A,B) Boxplots display MHC-I signature scores for patients from the TCGA BRCA and METABRIC cohorts, categorized by ER status and breast cancer subtypes. In the TCGA BRCA cohort (A), scores are shown for ER-negative (n = 224) and ER-positive (n = 796) groups; subtypes include Basal (n = 191), Her2 (n = 81), LumA (n = 567), and LumB (n = 219). In the METABRIC cohort (B), ER statuses are negative (n = 609) and positive (n = 1817), with subtypes Basal (n = 209), Her2 (n = 224), LumA (n = 700), and LumB (n = 475). Each boxplot represents the median MHC-I signature score with whiskers indicating variability outside the upper and lower quartiles and individual data points plotted as dots. Statistical significance, determined using the Wilcoxon rank-sum test, is denoted by asterisks: **** for p < 0.0001, *** for p < 0.001, and * for p < 0.05. (C) Kaplan-Meier survival curves illustrating overall survival (OS) in all (TCGA BRCA) and ER-positive (TCGA BRCA ER Pos) breast cancer patients with high (red) versus low (blue) MHC-I expression in the TCGA BRCA cohort. The y-axis indicates survival probability, and the x-axis represents OS in years. The p-values from the log-rank test are labeled. (D,E) GSEA based on MHC-I signature score in ER-positive breast cancer patients. Dot plot is shown in (D). The x-axis represents the negative logarithm of the adjusted p-value (−log10 (p.adjust)), while the y-axis denotes the enrichment score. The size of the bubbles corresponds to the adjusted p-value, and the color gradient indicates the enrichment score. Enrichment scores for the indicated pathways are shown in (E), with normalized enrichment scores (NES) and adjusted p-values. (F) A flow chart for identifying negatively correlated factors of MHC-I expression by bioinformatics analysis. PCG, protein-coding genes.
(G) Venn diagram illustrating the overlap of gene sets under different criteria in breast cancer. The diagram shows the intersection of five gene sets: (1) ER-positive (ER pos) breast cancer with MHC-I correlation less than or equal to −0.15 and p-value < 0.05 (blue), (2) genes with ER pos_exp (log2(tpm + 0.001)) ≥ 1 (pink), (3) BRCA-positive (BRCA Pos) with prognostic score less than −12 (yellow), (4) genes risky in more than three tumor types (green), and (5) BRCA with prognostic scores less than −12 (light green). The numbers within the diagram represent the count of genes falling into each category or intersection. The central intersection highlighted in red represents 243 genes common to all five criteria. (H) Bubble plot illustrating KEGG pathway enrichment analysis. Pathways are represented along the y-axis, with the x-axis displaying the gene ratio, defined as the proportion of genes within the pathway relative to the total number of input genes. Each bubble’s size reflects the gene count associated with the respective pathway, with larger bubbles indicating a higher number of genes. The color gradient of the bubbles corresponds to the statistical significance of the enrichment, represented by the −log2(p-value), where darker shades denote more significant p-values.
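The two quantities plotted in (H) can be sketched as follows. This is a minimal illustration of the definitions given in the caption, not the authors' enrichment pipeline; the function names and the toy gene symbols in the usage note are hypothetical.

```python
import math


def gene_ratio(pathway_genes, input_genes):
    """Return (gene count, gene ratio) for one pathway.

    Gene ratio = input genes found in the pathway / total input genes,
    following the definition in the caption.
    """
    input_set = set(input_genes)
    if not input_set:
        raise ValueError("input gene list is empty")
    hits = input_set & set(pathway_genes)
    return len(hits), len(hits) / len(input_set)


def neg_log2_p(p_value):
    """Significance on the -log2 scale used for the bubble color."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log2(p_value)
```

For example, with a hypothetical input list of four genes of which two belong to a pathway, `gene_ratio` returns a count of 2 and a ratio of 0.5; the count drives bubble size and the transformed p-value drives bubble color.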
Figure 2.
Partially verifying candidate genes for their role in the increase of MHC-I expression in breast cancer cells by a gene knockout approach. (A) Clustered heatmap illustrating the Spearman correlations between 243 candidate genes (rows) and various TIL signatures (columns) in BRCA ER-positive samples. TIL signatures were computed using cell-type identification by estimating relative subsets of RNA transcripts (CIBERSORT), which estimates the relative proportions of immune cell types from gene expression data. The color scale ranges from blue (negative correlation) to red (positive correlation). Both genes and TIL types are hierarchically clustered, with dendrograms showing the relationship among them. This heatmap highlights key gene-TIL associations within the tumor microenvironment. (B) Scatter plot depicting the correlation between gene expression and two types of TIL scores, activated NK cells (y-axis) and CD8+ T cells (x-axis). Each point represents a gene, with the position determined by its correlation with activated NK cells or CD8+ T cells. The plot is divided into four quadrants by red and black dashed lines at correlation thresholds of −0.1 and 0.0. The shaded gray area highlights the lower left quadrant where both correlations are negative, indicating candidate genes (n = 144 out of 243) that are negatively correlated with both activated NK cells and CD8+ T cell scores (R < −0.1). (C) KEGG pathway enrichment analysis of genes identified from the analysis in (B). The y-axis lists the significantly enriched KEGG pathways, while the x-axis represents the gene ratio, calculated as the proportion of genes involved in each pathway relative to the total number of input genes. Each bubble’s size indicates the number of enriched genes in the respective pathway, and the color distinguishes different pathways. The right panel lists the specific genes associated with each pathway, highlighting their involvement in critical cellular processes.
(D) Partially verifying the candidate genes for their role in increasing the transcription of MHC-I family genes in MCF7 cells. The indicated genes were knocked out by the CRISPR-Cas9 method, followed by qPCR analysis of the mRNA levels of MHC-I family genes, including HLA-A, HLA-B, HLA-C, and HLA-DOB. Each red dot corresponds to a specific gene. (E) Effects of PIP5K1A-, NCKAP1-, or CYFIP1-deficiency on transcription of MHC-I family genes in MCF7 cells. The control (gNC) and PIP5K1A-, NCKAP1-, or CYFIP1-deficient (gPIP5K1A, gNCKAP1, gCYFIP1) MCF7 cells were harvested for qPCR analysis of the mRNA levels of the indicated genes. (F) Effects of PIP5K1A-, NCKAP1-, or CYFIP1-deficiency on MHC-I expression on the cell surface of MCF7 cells. The control (gNC) and PIP5K1A-, NCKAP1-, or CYFIP1-deficient (gPIP5K1A, gNCKAP1, gCYFIP1) MCF7 cells were harvested for flow cytometry analysis with PE-conjugated HLA-A2 antibody. The data shown are the mean ± SD (n = 3) from one representative experiment (E,F). ns, not significant, *** p < 0.001, **** p < 0.0001. Data are representative of three independent experiments.
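The quadrant selection described in (B) amounts to keeping genes whose correlation with both TIL scores falls below −0.1. A minimal sketch, assuming the per-gene correlations are already available as a mapping; the function name and the placeholder gene labels in the usage note are hypothetical and not taken from the study.

```python
def lower_left_quadrant(correlations, threshold=-0.1):
    """Return genes negatively correlated with both TIL scores.

    `correlations` maps gene -> (r_cd8, r_nk), where r_cd8 is the
    correlation with the CD8+ T cell score and r_nk the correlation
    with the activated NK cell score. A gene is kept only if both
    values are strictly below `threshold`.
    """
    return sorted(
        gene
        for gene, (r_cd8, r_nk) in correlations.items()
        if r_cd8 < threshold and r_nk < threshold
    )
```

Calling it on a mapping such as `{"GENE_A": (-0.2, -0.3), "GENE_B": (-0.2, 0.05)}` keeps only `GENE_A`, since `GENE_B` fails the NK cell criterion.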
Figure 3.
PIP5K1A-, NCKAP1-, or CYFIP1-deficiency activates the intrinsic interferon response and causes growth suppression of breast cancer cells. (A) Heatmap showing the expression profiles of significantly differentially expressed genes in PIP5K1A-, NCKAP1-, or CYFIP1-deficient MCF7 cells. Rows represent individual genes that were identified as significantly differentially expressed through differential expression analysis. Columns represent the different experimental conditions (gNC, gCYFIP1, gNCKAP1, gPIP5K1A). The color scale indicates the Z-score of gene expression, with red indicating upregulation and blue indicating downregulation. Key immune-related genes, including HLA-A, HLA-C, HLA-B, and TAP1/2, are highlighted on the left. The genes are clustered into two major groups based on their expression patterns, with the number of genes in each cluster shown on the right (n = 774 and n = 814). (B) Dot plot representing the results of hallmark GSEA following the deficiency of PIP5K1A (pink), CYFIP1 (blue), or NCKAP1 (yellow) in MCF7 cells. The y-axis lists various HALLMARK pathways, while the x-axis shows the normalized enrichment score (NES) for each pathway. Each dot represents the corresponding NES for a specific pathway. (C) Effects of PIP5K1A-, NCKAP1-, or CYFIP1-deficiency on transcription of interferon-stimulated genes. The control (gNC) and PIP5K1A-, NCKAP1-, or CYFIP1-deficient (gPIP5K1A, gNCKAP1, gCYFIP1) MCF7 cells were harvested for qPCR analysis of the mRNA levels of the indicated genes. (D) Effect of PIP5K1A-, NCKAP1-, or CYFIP1-deficiency on cell viability of MCF7 cells. The viability of the indicated cells was measured by CCK-8 assay. (E) Effect of PIP5K1A-, NCKAP1-, or CYFIP1-deficiency on colony formation of MCF7 cells. The clonogenic efficiency of the indicated cells was assessed by a 2D colony formation assay. Shown are representative colony images (left) and quantification of colonies (right), scale bar: 5 mm.
(F) Effect of PIP5K1A-, NCKAP1-, or CYFIP1-deficiency on MCF7 cell proliferation in the presence or absence of fulvestrant treatment. The indicated cells were left untreated or treated with fulvestrant (50 nM) for 36 h, followed by cell viability measurement by CCK-8 assay. (G) Boxplots showing the gene expression levels of PIP5K1A, NCKAP1, and CYFIP1 in normal (blue, n = 292) and tumor (red, n = 1099) tissues from the TCGA BRCA dataset. The y-axis represents gene expression levels measured in transcripts per million (TPM), log2-transformed [log2(tpm + 0.001)]. The horizontal lines within the boxes indicate the medians, while the boxes represent the interquartile ranges (IQR). Whiskers extend to data points within 1.5 times the IQR, with outliers depicted as individual points beyond the whiskers. p-values from the Wilcoxon rank-sum test are indicated for each comparison. (H) Kaplan-Meier survival curves illustrating overall survival (OS) probabilities for patients stratified by high (red) and low (blue) expression levels of PIP5K1A (High, n = 508; Low, n = 378), NCKAP1 (High, n = 358; Low, n = 517), and CYFIP1 (High, n = 616; Low, n = 254) in TCGA BRCA. The y-axis represents survival probability, and the x-axis indicates time in years. Log-rank p-values are labeled for each gene, indicating the statistical significance of the differences in survival outcomes between the high and low expression cohorts. The data shown are the mean ± SD (n = 3) from one representative experiment (C–F). ** p < 0.01, *** p < 0.001, **** p < 0.0001. Data are representative of three independent experiments.
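The transformation and whisker rule described for (G) can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption, since plotting libraries differ slightly in how they compute quartiles.

```python
import math
import statistics


def log2_tpm(tpm, pseudocount=0.001):
    """log2(tpm + 0.001), the expression scale stated in the caption."""
    return math.log2(tpm + pseudocount)


def tukey_whiskers(values, k=1.5):
    """Return (lower whisker, upper whisker, outliers) for one boxplot.

    Whiskers reach the most extreme data points lying within k * IQR
    of the box edges; points beyond that range are reported as outliers.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return min(inside), max(inside), outliers
```

The pseudocount keeps the logarithm defined for genes with zero TPM, which then sit at roughly −9.97 on the transformed axis.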
Figure 4.
Bioinformatics analysis of data from the CCLE for the potential dual-effectors that regulate both MHC-I expression and cell survival in breast cancer. (A) Scatter plot of CYFIP1, NCKAP1, and PIP5K1A gene expression (log2-transformed) versus MHC-I signature scores across various breast cancer cell lines from the CCLE. Each dot represents a distinct cell line, color-coded by subtype. The plot includes a regression line (blue) with its 95% confidence interval (gray), illustrating a significant negative correlation between MHC-I scores and the indicated gene expression. (B) Lollipop plot displaying the Spearman correlation coefficients between the candidate genes and MHC-I scores across CCLE cell lines. Each line represents a gene, color-coded according to its correlation coefficient (cor) with MHC-I scores, ranging from yellow (positive correlation) to purple (negative correlation). The size of the dot at the end of each line corresponds to the statistical significance of the correlation, represented by −log10(p-value). Larger dots indicate higher statistical significance. (C) Boxplot analysis of gene dependency scores for the candidate genes across CCLE BRCA cell lines. The y-axis represents gene dependency, with lower scores indicating higher dependency. Each boxplot summarizes the distribution of dependency scores, with individual dots representing outliers. Genes with highly negative scores (left side, inset) are potential dual-effectors.
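The Spearman coefficient shown in (B) is the Pearson correlation computed on ranks. A minimal standard-library sketch, with ties given their average rank; the function names are hypothetical, and in practice a statistics package would also supply the p-value used for the dot size.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

Because only ranks enter the calculation, any monotonic relationship between gene expression and the MHC-I score gives a coefficient of magnitude 1, whether or not it is linear.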
Figure 5.
Identifying multiple genes as potential dual-effectors that regulate both MHC-I expression and cell survival in breast cancers. (A) Verifying the candidate genes for their role in increasing transcription of MHC-I family genes in MCF7 cells. The indicated genes were knocked out by the CRISPR-Cas9 method, followed by qPCR analysis of the mRNA levels of MHC-I family genes, including HLA-A, HLA-B, HLA-C, and HLA-DOB. Each red dot corresponds to a specific gene. (B) Effects of DIS3-, TBP-, or EXOC1-deficiency on transcription of MHC-I family genes in MCF7 cells. The control (gNC) and DIS3-, TBP-, or EXOC1-deficient (gDIS3, gTBP, gEXOC1) MCF7 cells were harvested for qPCR analysis of the mRNA levels of the indicated genes. (C) Effects of DIS3-, TBP-, or EXOC1-deficiency on MHC-I expression on the cell surface of MCF7 cells. The control (gNC) and DIS3-, TBP-, or EXOC1-deficient (gDIS3, gTBP, gEXOC1) MCF7 cells were harvested for FACS analysis with PE-conjugated HLA-A2 antibody. (D) Effects of DIS3-, TBP-, or EXOC1-deficiency on transcription of interferon-stimulated genes. The control (gNC) and DIS3-, TBP-, or EXOC1-deficient (gDIS3, gTBP, gEXOC1) MCF7 cells were harvested for qPCR analysis of the mRNA levels of the indicated genes. (E) Effect of DIS3-, TBP-, or EXOC1-deficiency on cell viability of MCF7 cells. The viability of the indicated cells was measured by CCK-8 assay. (F) Kaplan-Meier survival curves illustrating overall survival (OS) probabilities for patients stratified by high (red) and low (blue) expression levels of DIS3 (High, n = 608; Low, n = 262), TBP (High, n = 309; Low, n = 562), and EXOC1 (High, n = 592; Low, n = 286) in TCGA BRCA. The y-axis represents survival probability, and the x-axis indicates time in years. Log-rank p-values are labeled for each gene, indicating the statistical significance of the differences in survival outcomes between the high and low expression cohorts.
The data shown are the mean ± SD (n = 3) from one representative experiment (B–E). ns, not significant, ** p < 0.01, *** p < 0.001, **** p < 0.0001. Data are representative of three independent experiments.
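The survival probabilities on the y-axis of the Kaplan-Meier curves in (F) come from the product-limit estimator, which multiplies the fraction of at-risk patients surviving each event time. A minimal sketch with a hypothetical function name; it does not reproduce the high/low expression cutoff or the log-rank test used in the figure.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `times` are follow-up durations; `events` are 1 for an observed death
    and 0 for a censored observation. Censored patients leave the risk
    set without lowering the survival estimate.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    paired = sorted(zip(times, events))
    at_risk = len(paired)
    survival = 1.0
    curve = []
    i = 0
    while i < len(paired):
        t = paired[i][0]
        deaths = 0
        removed = 0
        # Process all patients sharing this follow-up time together.
        while i < len(paired) and paired[i][0] == t:
            deaths += paired[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Running the function separately on the high and low expression groups would give the two step curves that are compared in each panel.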
Figure 6.
Correlation of candidate genes with MHC-I expression and TILs in clinical breast cancer at single-cell scale. (A) UMAP plot representing single-cell RNA sequencing data of the tumor microenvironment (TME) from HR+ and Her2+ breast cancer patients (n = 45). Each point corresponds to an individual cell, with colors indicating different major cell types within the TME. (B) Bar plot showing the proportions of major cell types within the TME across patients, arranged by their cancer cells’ average MHC-I score. Each bar represents a single patient, with colors depicting the various cell types present in the TME. (C) Boxplots displaying the proportions of CD8+ T cells, CD4+ T cells, regulatory T cells, NK cells, and cancer epithelial cells within the microenvironment of tumor samples categorized by high and low MHC-I expression. Statistical significance between the high (red) and low (blue) MHC-I groups was determined using the Wilcoxon rank-sum test, with p-values labeled for each cell type. The y-axis quantifies the proportion of each respective cell type. (D) Heatmap comparing the mean MHC-I scores of cancer cells that do or do not express the indicated genes (PIP5K1A, NCKAP1, CYFIP1, DIS3, TBP, EXOC1). The color gradient from blue to red represents the range of mean module scores, with blue indicating lower scores and red indicating higher scores. (E) Heatmap showing the correlation between the expression levels of selected genes (PIP5K1A, NCKAP1, CYFIP1, DIS3, TBP, EXOC1) in tumor cells and various cell populations within the tumor microenvironment. Rows represent different cell types, while columns correspond to the indicated genes. The color scale reflects the correlation coefficients, with red indicating positive correlations and blue indicating negative correlations. Hierarchical clustering was performed on both genes and cell types to identify patterns of association between gene expression and immune cell infiltration within the tumor microenvironment.
(F) Correlation scatter plots showing the relationship between the average expression levels of selected genes in tumor cells and the proportions of CD8+ T cells and cancer epithelial cells within the tumor microenvironment. Each row corresponds to a specific cell type, while each column represents a gene of interest. The Spearman correlation coefficients (R) and p-values are indicated within each plot, with the blue line representing the fitted regression line and the shaded area indicating the 95% confidence interval.