*Article* **The Identification of RNA Modification Gene** *PUS7* **as a Potential Biomarker of Ovarian Cancer**

**Huimin Li <sup>1</sup> , Lin Chen <sup>1</sup> , Yunsong Han <sup>1</sup> , Fangfang Zhang <sup>1</sup> , Yanyan Wang <sup>1</sup> , Yali Han <sup>1</sup> , Yange Wang <sup>1</sup> , Qiang Wang <sup>2,\*</sup> and Xiangqian Guo <sup>1,\*</sup>**


**Simple Summary:** RNA modifications are involved in a variety of diseases, including cancers. Given the lack of efficient and reliable biomarkers for early diagnosis of ovarian cancer (OV), this study was designed to explore the role of RNA modification genes (RMGs) in the diagnosis of OV. We first selected PUS7 (Pseudouridine Synthase 7) as a diagnostic biomarker candidate through the analysis of differentially expressed genes using TCGA and GEO data. Then, we evaluated its specificity and sensitivity using Receiver Operating Characteristic (ROC) analysis in TCGA and GEO data. The protein expression, mutation, protein interaction networks, correlated genes, related pathways, biological processes, cell components, and molecular functions of PUS7 were analyzed as well. The upregulation of PUS7 protein in OV was confirmed by the staining images in HPA and tissue arrays. In conclusion, the findings of the present study point towards the potential of PUS7 as a diagnostic marker and therapeutic target for ovarian cancer.

**Abstract:** RNA modifications are reversible, dynamically regulated, and involved in a variety of diseases such as cancers. Given the lack of efficient and reliable biomarkers for early diagnosis of ovarian cancer (OV), this study was designed to explore the role of RNA modification genes (RMGs) in the diagnosis of OV. Herein, 132 RMGs were retrieved from PubMed, and 638 OV and 18 normal ovary samples from The Cancer Genome Atlas (TCGA) and GSE18520 cohorts were collected for differential analysis. *PUS7* (Pseudouridine Synthase 7), a differentially expressed RMG (DEG-RMG), was identified as a diagnostic biomarker candidate and evaluated for its specificity and sensitivity using Receiver Operating Characteristic (ROC) analysis in TCGA and GEO data. The protein expression, mutation, protein interaction networks, correlated genes, related pathways, biological processes, cell components, and molecular functions of *PUS7* were analyzed as well. The upregulation of PUS7 protein in OV was confirmed by the staining images in HPA and tissue arrays. Collectively, the findings of the present study point towards the potential of PUS7 as a diagnostic marker and therapeutic target for ovarian cancer.

**Keywords:** DEGs; diagnosis; ovarian cancer; PUS7; RMGs

#### **1. Introduction**

Ovarian cancer (OV) is the leading cause of death among gynecologic malignancies in most developed countries [1,2]. It accounts for an estimated 239,000 new cases and 152,000 deaths worldwide annually [3]. A woman's lifetime risk of developing ovarian cancer is approximately 1 in 78, and her lifetime risk of dying of ovarian cancer is approximately 1 in 108 [4]. Four out of five OV patients are diagnosed at an advanced stage [5], and of these, only 30% survive more than 5 years [4].

**Citation:** Li, H.; Chen, L.; Han, Y.; Zhang, F.; Wang, Y.; Han, Y.; Wang, Y.; Wang, Q.; Guo, X. The Identification of RNA Modification Gene *PUS7* as a Potential Biomarker of Ovarian Cancer. *Biology* **2021**, *10*, 1130. https://doi.org/10.3390/biology10111130

Academic Editors: Shibiao Wan, Yiping Fan, Chunjie Jiang and Shengli Li

Received: 29 September 2021 Accepted: 30 October 2021 Published: 3 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The lack of a practical screening strategy and the asymptomatic characteristic of OV contribute to the late presentation of the disease. Hence, the efficient and early detection of OV is pivotal to improving the survival of ovarian cancer patients.

Post-transcriptional modifications affect RNA stability, localization, structure, splicing, or function [6]. Different RNAs have been found to carry numerous types of modifications [7,8]. For example, mRNA modifications include N6-methyladenosine (m6A), inosine (I), 5-methylcytosine (m5C), and 5-hydroxymethylcytosine (hm5C). Deregulated RNA modifications are reported to be associated with several pathological processes such as tumorigenesis, cardiovascular diseases, and neurological disorders [9]. RNA modification enzymes have been generally considered important decorations for RNAs [10], and dysregulation and mutation in RNA modification genes are involved in the development of numerous cancers including lung cancer, bladder cancer, leukemia, prostate cancer, breast cancer, etc. [11]. For example, Alpha-Ketoglutarate Dependent Dioxygenase (*FTO*) was reported to be a prognostic factor for lung squamous cell carcinoma and to promote cell proliferation and invasion [12]. Methyltransferase Like 3 (*METTL3*), acting as an oncogene in lung cancer, upregulated *EGFR* and *TAZ* expression and promoted growth, survival, and invasion of human lung cancer cells [13]. NOP2/Sun RNA Methyltransferase 2 (*NSUN2*) was reported to be overexpressed in breast cancer and to be associated with cancer progression [14]. Elongator Acetyltransferase Complex Subunit 3 (*ELP3*), responsible for mcm5s2 modification, has been found to be upregulated in breast cancer and to facilitate cancer cell metastasis [15]. tRNA methyltransferase 9B (*TRM9L/TRMT9B*) has been shown to be downregulated in breast cancer [16]. Similarly, in renal cell carcinomas, G3BP Stress Granule Assembly Factor 1 (*G3BP1*) has been shown to promote tumor progression and metastasis [17]. Taken together, RNA modification genes play pivotal roles in human cancers.

Pseudouridine synthases (PUS) are divided into six families (TruA, TruB, TruD, RsuA, RluA, and Pus10) [18]. PUS7 is the only member of the TruD family that is involved in the modification of tRNAs, at position Tyr35 in pre-tRNA, at position 13 in cytoplasmic tRNA, and at numerous nucleotides in mRNAs. PUS7 is the only pseudouridine synthase to possess a consensus sequence (UGUAR) for substrate recognition [19]. *PUS7* was also reported to be associated with human myeloid malignancies in embryonic stem cells [20]. However, no reports have so far addressed the role of *PUS7* in OV.

In this study, PUS7 was identified as a novel and potential biomarker for early diagnosis, using transcriptional profiles in the GEO and TCGA databases, ROC, HPA, and Oncomine analyses. Protein–protein interaction (PPI); GSEA pathway; and GO analyses, including the biological process (BP), cell component (CC), and molecular function (MF) terms, were also performed to provide in-depth insights into PUS7.

#### **2. Materials and Methods**

#### *2.1. Data Collection*

The RMGs were collected from PubMed using the keyword "RNA modification". The transcriptome profiles, including datasets GSE18520 and TCGA, were obtained from GEO (https://www.ncbi.nlm.nih.gov/gds, accessed on 15 October 2019) [21] and UCSC Xena (https://xena.ucsc.edu/, accessed on 16 October 2019) [22], respectively. A total of 53 OV and 10 normal cases from GSE18520 (platform: GPL570) and 585 OV and 8 normal cases from TCGA (Affymetrix Human Genome U133 Plus 2.0 Array) were used in the following analyses.

#### *2.2. Differential Expression Analysis*

GEO2R, an interactive web tool that allows users to compare gene expression between different groups of samples in a GEO dataset, was used to identify the differentially expressed genes (DEGs) in GSE18520. SangerBox was used to analyze the TCGA expression profile of ovarian cancer. A *p* value < 0.05 and |log2FC| > 1 were used as the cut-off criteria to screen out DEGs. The DEGs of the two datasets are listed in Supplementary Table S1. Subsequently, the RMGs that overlapped with the DEGs of both GSE18520 and TCGA were selected using Venny 2.1 and were used for further analysis. The volcano plots of DEGs in GSE18520 and TCGA and the heatmaps of DEGs-RMGs in GSE18520 and TCGA were generated with the SangerBox web tool.
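The screening logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GEO2R, SangerBox, or Venny workflow itself: the gene names, log2FC values, and *p* values are invented, and `select_degs` is a hypothetical helper. The direction check at the end mirrors the exclusion of genes with contrary expression trends reported in the Results.

```python
def select_degs(stats, p_cutoff=0.05, lfc_cutoff=1.0):
    """Return genes with p < p_cutoff and |log2FC| > lfc_cutoff.

    `stats` maps a gene symbol to a (log2FC, p value) pair.
    """
    return {
        gene
        for gene, (log2fc, p_value) in stats.items()
        if p_value < p_cutoff and abs(log2fc) > lfc_cutoff
    }


# Invented example data (tumor vs. normal); not taken from the study.
geo_stats = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (0.4, 0.0001),   # significant but small fold change
    "GENE_C": (-2.1, 0.03),
    "GENE_D": (1.5, 0.20),     # large fold change but not significant
}
tcga_stats = {
    "GENE_A": (1.2, 0.01),
    "GENE_C": (1.3, 0.002),
    "GENE_E": (2.0, 0.001),
}
rmgs = {"GENE_A", "GENE_C", "GENE_F"}  # hypothetical RNA modification gene list

geo_degs = select_degs(geo_stats)
tcga_degs = select_degs(tcga_stats)

# Genes that are DEGs in both cohorts and also RMGs (the Venn intersection).
overlap = geo_degs & tcga_degs & rmgs

# Keep only genes that change in the same direction in both cohorts.
consistent = {g for g in overlap if geo_stats[g][0] * tcga_stats[g][0] > 0}

print(sorted(overlap))
print(sorted(consistent))
```

In this toy input, `GENE_C` passes both cut-offs in each cohort but is down in one and up in the other, so it survives the intersection and is then dropped by the direction check.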

#### *2.3. PUS7 Protein Level Analysis of OV Tissues in HPA and Tissue Array*

The protein expression of PUS7 was analyzed using HPA data [23]. A tissue chip (HOvaC070PT01) was purchased from SHANGHAI OUTDO BIOTECH CO., LTD. A total of 12 OV samples and 2 healthy ovary samples were retrieved from HPA, and 65 OV samples and 5 healthy ovary samples were obtained from the tissue array. The one case with an equivocal staining result was excluded, and the baseline characteristics of the remaining 64 cases of OV tissues in the tissue array are described in Supplementary Table S2. The immunohistochemistry (IHC) staining intensity was graded from 0 to 3 (0, negative; 1, weak; 2, moderate; and 3, strong). The staining quantity was graded from 0 to 3 (0, none; 1, <25%; 2, 25–75%; and 3, >75%) according to the percentage of positive cells in the HPA database. The staining quantity was graded from 0 to 4 (0, none; 1, <25%; 2, 25–50%; 3, 50–75%; and 4, >75%) in the tissue array. The staining scores were calculated by multiplying the staining intensity by the staining quantity.
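As a worked illustration of the scoring scheme, the sketch below computes the staining score as intensity grade times quantity grade for both grading scales. The function names are hypothetical, and the handling of values that fall exactly on a boundary (25%, 50%, 75%) is an assumption, since the text does not specify it.

```python
def quantity_grade(percent_positive, scheme):
    """Map the percentage of positive cells to a quantity grade.

    scheme="hpa":          0 none; 1 <25%; 2 25-75%; 3 >75%
    scheme="tissue_array": 0 none; 1 <25%; 2 25-50%; 3 50-75%; 4 >75%
    Assumption: boundary values (exactly 50% or 75%) go to the lower grade.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive < 25:
        return 1
    if scheme == "hpa":
        return 2 if percent_positive <= 75 else 3
    if scheme == "tissue_array":
        if percent_positive <= 50:
            return 2
        if percent_positive <= 75:
            return 3
        return 4
    raise ValueError("scheme must be 'hpa' or 'tissue_array'")


def staining_score(intensity, percent_positive, scheme):
    """Staining score = intensity grade (0-3) x quantity grade."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity * quantity_grade(percent_positive, scheme)


# Strong staining in 80% of cells under each grading scale.
print(staining_score(3, 80, "hpa"))
print(staining_score(3, 80, "tissue_array"))
```

Because the two scales have different maxima (9 for HPA, 12 for the tissue array), scores are comparable within a scale but not across the two.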

#### *2.4. PUS7 Gene Expression Analysis Using TCGA and GEO Datasets*

The *PUS7* expression analysis was carried out using TCGA and GSE119056 expression profiling data. An ROC analysis (a method frequently used to assess binary classification) was subsequently performed to evaluate how effectively the expression level of a gene of interest discriminates between OV and healthy samples. The area under the curve (AUC) value ranges from 0.5 to 1.0, where 0.5 indicates discrimination no better than chance and 1.0 indicates perfect discrimination.
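To make the AUC interpretation concrete, the following minimal sketch computes the AUC directly from its rank-based definition: the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample, with ties counted as one half. It is not the GraphPad Prism procedure used in the study, `roc_auc` is a hypothetical helper, and the expression values are invented.

```python
def roc_auc(tumor_values, normal_values):
    """AUC as the fraction of (tumor, normal) pairs ranked correctly.

    A pair counts 1 if the tumor value is higher, 0.5 if tied, 0 otherwise.
    """
    if not tumor_values or not normal_values:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for t in tumor_values:
        for n in normal_values:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_values) * len(normal_values))


# Invented expression values for illustration only.
tumor = [7.9, 8.4, 9.1, 6.8]
normal = [6.5, 7.0, 6.9]

auc = roc_auc(tumor, normal)
print(round(auc, 3))
```

Here 10 of the 12 tumor-normal pairs are ordered correctly, so the AUC is 10/12. Completely separated groups give 1.0, and fully overlapping groups give values near 0.5.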

#### *2.5. PUS7 Gene Expression Analysis Using Oncomine Database*

The gene expression of *PUS7* was explored using the Oncomine database (https://www.oncomine.org/resource/main.html, accessed on 25 October 2019) [24]. The thresholds applied in Oncomine were a *p*-value ≤ 0.05 and a fold change (FC, tumors vs. controls) > 1.

#### *2.6. Protein–Protein Interaction (PPI) Network Analyses*

STRING (https://string-db.org/, accessed on 22 October 2019) [25], a database used to predict and analyze functional interactions between proteins, was used to identify the functional protein–protein interactions (PPIs) of *PUS7*. GeneMANIA (http://genemania.org/, accessed on 24 October 2019) [26] was used to identify gene networks involving *PUS7*.

#### *2.7. The Mutation and Correlation Analyses of PUS7*

The mutation analysis of *PUS7* was performed through cBioPortal (https://www.cbioportal.org/, accessed on 27 October 2019) [27]. The Gene Expression Profiling Interactive Analysis (GEPIA) database (http://gepia.cancer-pku.cn/, accessed on 27 October 2020) [28] was employed to analyze the genes correlated with *PUS7* based on TCGA data.

#### *2.8. Pathways and BP, CC, and MF Analyses*

Gene set enrichment analysis (GSEA) was carried out to identify potential cellular pathways associated with PUS7. The TCGA-OV dataset was divided into a high group (25%) and a low group (75%) based on the PUS7 mRNA expression. A nominal *p*-value < 0.01 and a false discovery rate (FDR) q-value < 0.05 were considered significant for enriched gene sets. Using 312 genes positively correlated (R > 0.3, *p* < 0.05) with *PUS7* derived from the cBioPortal analysis, the BP, CC, and MF analyses were carried out through the Database for Annotation, Visualization, and Integrated Discovery (DAVID, https://david.ncifcrf.gov/, accessed on 19 November 2020) [29] and visualized with bubble diagrams based on *p* values < 0.05.
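The 25%/75% grouping can be sketched as a simple rank-based split. This is an illustrative assumption about how such a split might be done, not the authors' procedure: `split_high_low` is a hypothetical helper, the sample identifiers and expression values are invented, and the group size is rounded down when the cohort size is not divisible by four.

```python
def split_high_low(expression, high_fraction=0.25):
    """Split samples into high and low groups by expression rank.

    `expression` maps a sample ID to its expression value. The top
    `high_fraction` of samples (rounded down, at least one) form the
    high group; the remainder form the low group.
    """
    if not 0 < high_fraction < 1:
        raise ValueError("high_fraction must be between 0 and 1")
    if len(expression) < 2:
        raise ValueError("need at least two samples to form two groups")
    ranked = sorted(expression, key=expression.get, reverse=True)
    n_high = max(1, int(len(ranked) * high_fraction))
    return ranked[:n_high], ranked[n_high:]


# Invented expression values for eight hypothetical samples.
pus7_expression = {
    "S1": 5.1, "S2": 6.3, "S3": 9.2, "S4": 4.8,
    "S5": 7.0, "S6": 5.9, "S7": 8.5, "S8": 6.1,
}

high_group, low_group = split_high_low(pus7_expression)
print(high_group)
print(low_group)
```

The resulting sample labels would then serve as the two phenotype classes for the enrichment analysis.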

#### *2.9. Statistical Analysis*

The statistical analyses were performed using SPSS ver. 26.0. The Student's *t*-test and the rank-sum test were used to evaluate the difference in PUS7 expression between the OV and normal samples. The ROC curve was constructed from the *PUS7* expression profiles of the OV and normal samples using GraphPad Prism 8.0. A *p* value < 0.05 was considered statistically significant.

#### **3. Results**

#### *3.1. The Identification of DEGs-RMGs of OV Data in TCGA and GEO*

A total of 132 RMGs (Supplementary Table S3) were retrieved from PubMed. TCGA AffyU133a expression profiles and GSE18520 cohorts of ovarian cancer were downloaded from UCSC Xena and the GEO databases, respectively. A total of 1142 DEGs in the TCGA dataset and 5215 DEGs in the GSE18520 dataset (Supplementary Table S1) were obtained between the OV and normal samples through SangerBox-limma and GEO2R analysis, respectively, and the volcano plots of DEGs are presented in Figure 1A,B. The RMGs and DEGs from the two cohorts were intersected to screen out the overlapping RMGs and DEGs for diagnostic biomarker analysis. As a result, two genes, *WDR77* and *PUS7*, were identified as differentially expressed RMGs (Figure 1C). *WDR77* was excluded since it exhibited contrary expression trends between OV and normal samples in TCGA and GSE18520 (Figure 2A,B). In contrast, *PUS7* showed consistently higher expression in OV than in normal cases; thus, *PUS7* was considered a potential diagnostic biomarker and was subjected to further analyses.

#### *3.2. Expression Validation and Mutation Analysis for PUS7 in Ovarian Cancer*

To validate the overexpression of *PUS7* in OV relative to normal samples, an Oncomine analysis was performed on ovarian cancers of different pathological types, which showed that *PUS7* expression is elevated in OV samples with fold change > 1 and *p* < 0.05 (as presented in Figure 3A,B and Table 1). Moreover, Figure 3C,D displays the corresponding ROC curves of *PUS7* in the TCGA and GSE18520 datasets, indicating the potential of *PUS7* to discriminate OV from normal tissue. The IHC results from HPA showed the overexpression of PUS7 at the protein level (Figure 4A,B). To further explore the overexpression of PUS7 at the protein level in OV samples, a tissue array analysis was performed. Typical staining images from the tissue array are shown in Figure 4C, and the staining scores confirmed the upregulation of PUS7 protein in OV tissues (Figure 4D). Since mutations in RNA modification genes have been reported to be associated with several types of human cancers, the mutation analysis of PUS7 was performed in cBioPortal, which revealed a fusion of PUS7 with SRSF Protein Kinase 2 (SRPK2) in serous ovarian cancer (Table 2).

**Figure 1.** The identification of DEGs-RMGs using OV data in TCGA and GEO. (**A**,**B**) The volcano plot of DEGs between OV and normal samples in GSE18520 and TCGA data. (**C**) *WDR77* and *PUS7* were identified as the overlapping genes of DEGs in both datasets.

**Figure 2.** The heatmaps of differentially expressed RMGs. (**A**) The heatmap of the expression profile of overlapping genes of RMGs and DEGs in normal tissues and OV tissues in the TCGA dataset. (**B**) The heatmaps of the expression profile for overlapping genes of RMGs and DEGs in normal tissues and OV tissues in the GSE18520 dataset.

**Figure 3.** The differential expression analysis and ROC analysis of *PUS7* in OV and normal tissues. (**A**,**B**) The expression analysis of *PUS7* in TCGA and GSE18520 cohorts, respectively. (**C**,**D**) The ROC analysis of PUS7 between OV and normal samples in TCGA and GSE18520 cohorts. The ROC curve is plotted as sensitivity% vs. 100% − specificity%. A *p* < 0.05 was considered a significant difference.


**Table 1.** The comparison analysis of *PUS7* in ovarian cancer and normal tissue in different cohorts (Oncomine).

**Figure 4.** PUS7 protein expression was significantly higher in OV tissues than normal tissues. (**A**) Representative IHC images of PUS7 in normal (**left**) and OV (**right**) tissues in HPA. (**B**) Statistical analysis of the protein expression of PUS7 according to the staining scores of OV and normal tissues. (**C**) Representative IHC images of PUS7 in normal (**left**) and OV (**right**) tissues according to tissue microarray. (**D**) Statistical analysis of the protein expression of PUS7 according to the staining scores of OV and normal tissues. *p* < 0.05 was considered significant.

**Table 2.** The mutation distribution of PUS7 in ovarian cancer according to cBioPortal.


SRPK2: SRSF Protein Kinase 2; PUS7: Pseudouridine Synthase 7.
