PRRGNVis: Multi-Level Visual Analysis of Comparison for Predicted Results of Recurrent Geometric Network
Abstract
1. Introduction
- A new multi-level visual analysis method for protein structure comparison is designed.
- The visualization method displays the local similarity and deviation of the RGN protein prediction results, so that researchers can analyze them in detail.
- Our work helps to illuminate the limitations of the RGN and the directions in which it can be improved.
- In Section 2, related work on protein structure comparison, comparative visualization, and the predictions of the RGN in the CASP experiment is introduced.
- In Section 3, the data preparation for the multi-angle analysis is introduced.
- In Section 4, an overview of PRRGNVis is given.
- In Section 5, the design details of the multi-level visualization method are introduced.
- In Section 6, representative protein chains are selected from the prediction results of the RGN to demonstrate the effectiveness of the multi-level method and to provide directions for exploring the characteristics and limitations of the RGN.
- In Section 7, the advantages, limitations, and future research directions of the RGN are analyzed based on the above visualization methods and results.
2. Related Work
2.1. Structure Comparison
2.2. Comparative Visualization
2.3. Protein Prediction-Based Deep Learning
3. Overview
4. Multi-Angle Analysis of RGN Predicted Results
- (1) Visual analysis angle. The tertiary file is transformed into a PDB file, which provides a data interface for the multi-level comparison visual analysis framework.
- (2) Conformational analysis angle. The tertiary file is transformed into torsion angle data, and various similarity comparison standards are calculated, which provides data support for the multi-level comparison visual analysis framework.
4.1. RGN Prediction Process and Data Acquisition
4.1.1. Calculation Stage of RGN
4.1.2. Geometry Stage of RGN
4.1.3. Evaluation Stage of RGN
4.2. Multi-Angle Analysis of Tertiary Data
- (1) Visual analysis angle. The tertiary file is transformed into a PDB file, which is used for data import and analysis in the visualization framework.
- (2) Conformational analysis angle. The tertiary file is transformed into a torsion angle file. The torsion angle data determine the conformation of the predicted structure and are an important factor in structure comparison.
4.2.1. Transformation of Tertiary File into PDB File
- (1) Create parsers for the input formats, tertiary data (.tertiary) and sequence data (.fasta), and for the output format, PDB data (.pdb).
- (2) Set up a reference mapping between the sequence information and the protein amino acid types, which is used for sequence mapping in the conversion process.
- (3) Combine the parsed data with the amino acid type mapping to write the information of one residue in the PDB data format.
- (4) Repeat step (3) until the information of every residue has been converted from the tertiary data format to the PDB data format.
- (5) Output the PDB data of each protein sequence.
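The conversion steps above can be sketched as follows. This is a minimal illustration, not the authors' converter: the function name `tertiary_to_pdb`, the assumption that the coordinates are already parsed into a list with three backbone atoms (N, CA, C) per residue, and the `scale` factor (which assumes picometre input, as in ProteinNet) are all hypothetical choices made for the example.

```python
# Hypothetical sketch: map a one-letter sequence plus backbone coordinates
# to fixed-column PDB ATOM records. Names and defaults are assumptions.

# Reference mapping between one-letter codes and PDB residue names (step 2).
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}
BACKBONE = ("N", "CA", "C")  # assumed atom order within each residue


def tertiary_to_pdb(sequence, coords, chain_id="A", scale=0.01):
    """Return PDB text for a chain.

    sequence: one-letter amino acid string of length L.
    coords:   list of 3*L (x, y, z) tuples ordered N, CA, C per residue.
    scale:    multiplier to angstroms (0.01 assumes picometre input).
    """
    if len(coords) != 3 * len(sequence):
        raise ValueError("expected three backbone atoms per residue")
    lines = []
    for i, (x, y, z) in enumerate(coords):
        res_index = i // 3                      # residue this atom belongs to
        atom_name = BACKBONE[i % 3]
        res_name = THREE_LETTER.get(sequence[res_index], "UNK")
        # Steps 3 and 4: write one fixed-column ATOM record per atom.
        lines.append(
            "ATOM  {:5d} {:<4s} {:3s} {:1s}{:4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2s}".format(
                i + 1, " " + atom_name, res_name, chain_id, res_index + 1,
                x * scale, y * scale, z * scale, 1.0, 0.0, atom_name[0],
            )
        )
    lines.append("END")
    return "\n".join(lines)  # step 5: the caller writes this to a .pdb file
```

Parsing of the .tertiary and .fasta files (step 1) is left out because their exact layout is not described here; the sketch starts from already parsed values.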
4.2.2. Transformation of Tertiary File into Torsion Angle Data
- (1) From the coordinates of the four atoms, the vectors of the three bonds are calculated.
- (2) The normal vectors of the two peptide planes are then calculated from the bond vectors obtained in step (1).
- (3) The torsion angle between the two peptide planes is calculated from the normal vectors obtained in step (2).
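The three steps can be illustrated with a short sketch. It is an assumption-laden example rather than the paper's own formula: the symbol names (`b1`, `b2`, `b3`, `n1`, `n2`) are introduced here for illustration, and the signed `atan2` form of the dihedral angle is used in place of whatever expression the original applies to the two normal vectors.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def torsion_angle(p1, p2, p3, p4):
    """Signed torsion angle in degrees defined by four consecutive atoms."""
    # Step 1: the three bond vectors.
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    # Step 2: normals of the planes (p1, p2, p3) and (p2, p3, p4).
    n1, n2 = cross(b1, b2), cross(b2, b3)
    # Step 3: angle between the normals; atan2 keeps the sign.
    x = dot(n1, n2)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))
```

For example, `torsion_angle((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))` describes a planar cis arrangement and gives 0°, while moving the last atom to `(1, -1, 0)` gives a trans arrangement with a magnitude of 180°.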
4.3. Structure Comparison Standard of RGN Predicted Results
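dRMSD is one of the similarity standards reported for the RGN predictions, with a threshold of 3 Å. As a minimal sketch, assuming the common definition of dRMSD as the root-mean-square difference between corresponding intra-structure pairwise distances, it can be computed without superimposing the two structures. The function names below are hypothetical, and `fraction_within` only illustrates how a "percentage below threshold" figure could be derived from per-chain values.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """All distances between atom pairs (i < j) of one structure."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(predicted, reference):
    """Distance-based RMSD between two equally long coordinate lists."""
    if len(predicted) != len(reference):
        raise ValueError("structures must have the same number of atoms")
    if len(predicted) < 2:
        raise ValueError("at least two atoms are required")
    d_pred = pairwise_distances(predicted)
    d_ref = pairwise_distances(reference)
    squared = sum((p - r) ** 2 for p, r in zip(d_pred, d_ref))
    return math.sqrt(squared / len(d_pred))


def fraction_within(values, threshold):
    """Share of values strictly below the threshold (e.g. dRMSD < 3)."""
    return sum(v < threshold for v in values) / len(values)
```

Because only internal distances are compared, the result does not change when one structure is translated or rotated, which is why no alignment step appears in the sketch.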
5. Multi-Level Visual Design and Analysis of RGN Predicted Results
5.1. The Prediction Accuracy of the RGN
5.2. Differences between Structures
5.3. Structural Stability
5.4. Interactive Exploration
6. Demonstration
6.1. Visualization of RGN Prediction Accuracy
6.2. Visual Analysis of Differences and Stability
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Field Name | Field Meaning |
---|---|
ID | Predicted dataset protein ID |
Classification | Prediction method classification: TBM and FM |
Primary | Protein amino acid chain |
Evolutionary | Position-specific scoring matrix (PSSM) |
Tertiary | Three-dimensional atomic representation of protein |
Mask | Position indicator, presence or absence of residue atom |
Percentage of predictions meeting each similarity standard / average value
Prediction Classification | dRMSD (<3 Å) | TM-Score (>0.5) | GDT_TS (>50) |
---|---|---|---|
TBM | 10% / 5.32 Å | 35% / 0.4 | 35% / 40 |
FM | 0% / 9.8 Å | 0% / 0.15 | 0% / 18 |
Parameter | Reference Value |
---|---|
Proportion of φ and ψ angles falling in the allowed regions of the Ramachandran plot | 90% |
ω angle | ±180° |
Maximum deviation | 8.5 Å–13.5 Å |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Y.; Feng, L.; Wang, Q.; Xu, Y.; Guo, D. PRRGNVis: Multi-Level Visual Analysis of Comparison for Predicted Results of Recurrent Geometric Network. Appl. Sci. 2022, 12, 8465. https://doi.org/10.3390/app12178465