A Novel Method for Identifying Essential Genes by Fusing Dynamic Protein–Protein Interactive Networks
Abstract
1. Introduction
2. Materials and Methods
2.1. Materials
2.2. Methods
2.2.1. Constructing Dynamic Protein–Protein Interactive Networks
Algorithm 1 Dynamic Network Construction
Input: A static PPI (S_PPI) network represented as graph G = (V, E, W), a time series of the gene expression profile of each gene in G, parameter k.
Output: The active networks of each time point.
Step1: Identify two categories of genes, the time-dependent genes and the time-independent genes, using Equations (1) and (2), according to their expression profiles.
Step2: Filter out the noise genes in the time-independent genes.
Step3: Identify the active genes of each time point from the remaining two categories of genes by judging whether or not their expression values are above the threshold (calculated by Equation (3)).
Step4: Map the active genes of each time point to the S_PPI network and extract the active networks of each time point.
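Steps 3 and 4 of the algorithm above can be illustrated with a minimal Python sketch. Equation (3) is not reproduced in this section, so `active_threshold` is a hypothetical stand-in: it assumes a k-sigma style cutoff (mean plus k standard deviations, damped for low-variance profiles), and the paper's actual threshold may differ. The function names, the edge-list representation, and the toy data are illustrative assumptions. Steps 1 and 2 (gene classification and noise filtering) are not covered.

```python
from statistics import mean, pstdev


def active_threshold(profile, k):
    """Hypothetical stand-in for Equation (3): a k-sigma style cutoff."""
    mu = mean(profile)
    sigma = pstdev(profile)
    # f approaches 1 for low-variance profiles, pulling the cutoff toward the mean.
    f = 1.0 / (1.0 + sigma ** 2)
    return mu + k * sigma * (1.0 - f)


def active_networks(edges, expression, k):
    """Return one edge list per time point (Steps 3 and 4).

    edges: list of (gene_u, gene_v, weight) tuples from the static network.
    expression: dict mapping gene -> list of expression values over time.
    """
    n_points = len(next(iter(expression.values())))
    thresholds = {g: active_threshold(p, k) for g, p in expression.items()}
    networks = []
    for t in range(n_points):
        # Step 3: a gene is active at time t if it exceeds its own threshold.
        active = {g for g, p in expression.items() if p[t] > thresholds[g]}
        # Step 4: keep only static edges whose two endpoints are both active.
        networks.append([(u, v, w) for u, v, w in edges
                         if u in active and v in active])
    return networks


# Toy data (made up for illustration only).
toy_edges = [("A", "B", 1.0), ("B", "C", 1.0)]
toy_expression = {"A": [1, 5, 1], "B": [1, 5, 5], "C": [5, 1, 1]}
toy_nets = active_networks(toy_edges, toy_expression, k=0.5)
print(toy_nets)  # [[], [('A', 'B', 1.0)], []]
```

In this toy case only the second time point has both endpoints of an edge active, so it is the only non-empty active network.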
2.2.2. Fusing the Active Protein–Protein Interactive Networks of Each Time Point
Algorithm 2 Active PPI network fusion
Input: Active networks of each time point, parameter K.
Output: Final fused network.
Step1: Construct the adjacency matrix Wi of the ith active network (i = 1, 2, …, M) using Equation (5).
Step2: Construct Pi and Si of the ith active network using Equations (8) and (9).
Step3: Calculate the similarities between any two networks based on the Euclidean distance of their adjacency matrices.
Step4: Select the nearest two active networks Gi and Gj, initialize the iterated matrices, and let t = 0.
Step5: Compute the updated matrices using Equations (10) and (11), and let t = t + 1.
Step6: Repeat Step 5 until t = 20.
Step7: Compute the fused network R of Gi and Gj using Equation (12).
Step8: Let Wr = R, and construct Pr and Sr of the fused network R using Equations (8) and (9).
Step9: Find the nearest active network Gk to R among the remaining active networks, initialize the iterated matrices, and let t = 0.
Step10: Compute the updated matrices using Equations (10) and (11), and let t = t + 1.
Step11: Repeat Step 10 until t = 20.
Step12: Compute the fused network of R and Gk using Equation (12); the fused network is again named R.
Step13: Remove Gk from the active network list and repeat Steps 8 to 12 until all the active networks are fused into a final network.
Step14: Output the final fused network.
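The greedy fusion order described in Algorithm 2 can be sketched in plain Python. Equations (5) and (8) to (12) are not reproduced in this section, so the kernels and the update rule below are assumptions borrowed from the standard similarity network fusion scheme (a full kernel P, a K-nearest-neighbour kernel S, the cross update P_a ← S_a P_b S_aᵀ, and an average as the fused result); the paper's own formulas may differ. All function names are hypothetical, and every network is assumed to be a square weight matrix over the same ordered gene set.

```python
import math


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def distance(a, b):
    """Euclidean distance between two adjacency matrices (Step 3)."""
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def full_kernel(w):
    """Assumed form of P: diagonal 1/2, off-diagonal weights scaled to sum to 1/2."""
    n = len(w)
    p = []
    for i in range(n):
        off = sum(w[i][j] for j in range(n) if j != i)
        p.append([0.5 if j == i else (w[i][j] / (2 * off) if off else 0.0)
                  for j in range(n)])
    return p


def local_kernel(w, k):
    """Assumed form of S: keep each node's k strongest neighbours, row-normalised."""
    n = len(w)
    s = []
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: -w[i][j])[:k]
        total = sum(w[i][j] for j in nearest)
        s.append([w[i][j] / total if (j in nearest and total) else 0.0
                  for j in range(n)])
    return s


def fuse_pair(wa, wb, k, iterations=20):
    """Steps 4 to 7 (and 9 to 12): cross-diffuse two networks, then average."""
    sa, sb = local_kernel(wa, k), local_kernel(wb, k)
    pa, pb = full_kernel(wa), full_kernel(wb)
    for _ in range(iterations):
        # Both updates use the matrices from the previous iteration.
        pa, pb = (matmul(matmul(sa, pb), transpose(sa)),
                  matmul(matmul(sb, pa), transpose(sb)))
    n = len(wa)
    return [[(pa[i][j] + pb[i][j]) / 2 for j in range(n)] for i in range(n)]


def fuse_all(networks, k, iterations=20):
    """Fuse at least two networks in nearest-first order (Steps 4 to 14)."""
    remaining = list(networks)
    pairs = ((a, b) for a in range(len(remaining))
             for b in range(a + 1, len(remaining)))
    i, j = min(pairs, key=lambda ab: distance(remaining[ab[0]], remaining[ab[1]]))
    fused = fuse_pair(remaining[i], remaining[j], k, iterations)
    remaining = [w for idx, w in enumerate(remaining) if idx not in (i, j)]
    while remaining:
        # Step 9: the current fused matrix R plays the role of Wr.
        nearest = min(range(len(remaining)),
                      key=lambda idx: distance(fused, remaining[idx]))
        fused = fuse_pair(fused, remaining.pop(nearest), k, iterations)
    return fused


# Toy 3-node networks (made up for illustration only).
w1 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
w2 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
w3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
fused_toy = fuse_all([w1, w2, w3], k=2)
```

The sketch recomputes both kernels from the fused matrix on every round, which mirrors Step 8 (Wr = R); on genome-scale networks a dense list-of-lists representation like this would be far too slow, and a numerical array library would be the natural replacement.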
2.2.3. Ranking Genes in the Fused Network
Algorithm 3 FDP
Input: A static PPI network represented as graph G = (V, E, W), gene expression profile, ortholog data sets between yeast and 99 other organisms (ranging from H. sapiens to E. coli), stopping error, parameters α and λ.
Output: FDP values of genes.
Step1: Construct active networks of each time point using the Dynamic Network Construction algorithm.
Step2: Fuse these active networks into a final fused network using the Active PPI network fusion algorithm.
Step3: Calculate the orthologous scores of each node in the final fused network using Equation (14).
Step4: Construct matrix H and normalize all its entries by row.
Step5: Initialize pr with pr^0 = d, and let t = 0.
Step6: Compute pr^(t+1) using Equation (13), and let t = t + 1.
Step7: Repeat Step 6 until the stopping error is reached.
Step8: Calculate the IFE value of each gene in the final fused network using Equations (15) and (16).
Step9: Calculate the FDP value of each gene in the final fused network by linearly combining its pr value and IFE value (see Equation (17)).
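Steps 4 to 7 and Step 9 can be sketched as follows. Equations (13) and (17) are not reproduced in this section, so the sketch assumes a random walk with restart of the form pr^(t+1) = α·H·pr^t + (1 − α)·d and a convex combination FDP = λ·pr + (1 − λ)·IFE; both forms are assumptions and may differ from the paper. The IFE values (Equations (15) and (16)) and the orthologous scores d (Equation (14)) are taken as given inputs. The names `eps` and `max_iter` are hypothetical, with `max_iter` added only as a safety cap.

```python
def row_normalise(matrix):
    """Step 4: scale each row of H to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def random_walk(h, d, alpha, eps=1e-8, max_iter=1000):
    """Steps 5 to 7 under the assumed update pr <- alpha * H * pr + (1 - alpha) * d."""
    pr = list(d)  # Step 5: pr^0 = d
    for _ in range(max_iter):
        nxt = [alpha * sum(hij * pj for hij, pj in zip(row, pr)) + (1 - alpha) * di
               for row, di in zip(h, d)]
        change = sum(abs(a - b) for a, b in zip(nxt, pr))
        pr = nxt
        if change < eps:  # Step 7: stop once the update is below the stopping error
            break
    return pr


def fdp_scores(pr, ife, lam):
    """Step 9 under the assumed form FDP = lam * pr + (1 - lam) * IFE."""
    return [lam * p + (1 - lam) * f for p, f in zip(pr, ife)]


# Toy example (made up for illustration only): a 3-node star network.
h_toy = row_normalise([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
pr_toy = random_walk(h_toy, d=[1.0, 0.0, 0.0], alpha=0.5)
ranking = sorted(range(3), key=lambda i: -fdp_scores(pr_toy, [0.2, 0.9, 0.1], 0.5)[i])
```

For this toy star network the walk settles near pr = (2/3, 1/3, 1/3). Under the assumed combination, λ = 1 would rank genes by pr alone and λ = 0 by IFE alone.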
3. Results
3.1. Effects of Parameter λ
3.2. Comparing with Other Methods
3.3. Evaluation in Terms of Jackknife Curves
3.4. Evaluation in Terms of Precision-Recall Curve
4. Conclusions
Author Contributions
Funding
Conflicts of Interest
Network | Genes in S_PPI | Edges in S_PPI | Genes in D_PPI | Essential Genes in S_PPI | Essential Genes in D_PPI |
---|---|---|---|---|---|
DIP_PPI | 5093 | 24743 | 2759 | 1167 | 827 |
SC_net | 4746 | 15166 | 2559 | 1130 | 785 |
Top N / λ | 0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1 |
---|---|---|---|---|---|---|---|---|---|---|---|
100 | 47 | 70 | 79 | 82 | 85 | 87 | 90 | 90 | 89 | 90 | 92 |
200 | 108 | 111 | 130 | 147 | 151 | 154 | 159 | 163 | 164 | 165 | 168 |
300 | 156 | 166 | 176 | 193 | 201 | 211 | 215 | 215 | 223 | 229 | 226 |
400 | 201 | 212 | 225 | 230 | 242 | 252 | 249 | 273 | 280 | 285 | 277 |
Top N / λ | 0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1 |
---|---|---|---|---|---|---|---|---|---|---|---|
100 | 37 | 58 | 77 | 82 | 88 | 88 | 90 | 91 | 90 | 91 | 90 |
200 | 87 | 100 | 134 | 147 | 152 | 154 | 158 | 163 | 162 | 164 | 167 |
300 | 131 | 156 | 182 | 199 | 209 | 217 | 222 | 222 | 226 | 232 | 221 |
400 | 176 | 199 | 224 | 248 | 255 | 266 | 269 | 275 | 280 | 277 | 272 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Zhang, F.; Peng, W.; Yang, Y.; Dai, W.; Song, J. A Novel Method for Identifying Essential Genes by Fusing Dynamic Protein–Protein Interactive Networks. Genes 2019, 10, 31. https://doi.org/10.3390/genes10010031