An Augmented Sample Selection Framework for Prediction of Anticancer Peptides
Abstract
1. Introduction
- This article proposes an augmented sample selection framework that removes noisy samples from the augmented set, so that data augmentation does not degrade model performance.
- This article designs a pseudo-label screening mechanism based on prediction uncertainty, confidence, and label consistency, which improves the quality of the augmented samples that pass screening.
2. Results
2.1. Performance Evaluation Metrics
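The four metrics reported in the result tables (accuracy, specificity, F1-score and MCC) have standard confusion-matrix definitions. The sketch below computes them from raw counts. The function name and the returned dictionary layout are illustrative, and returning 0.0 when a denominator vanishes is an assumption, not something the article specifies.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, specificity, F1 and MCC from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews correlation coefficient; defined as 0.0 here when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "specificity": specificity, "f1": f1, "mcc": mcc}


if __name__ == "__main__":
    print(binary_metrics(tp=40, tn=45, fp=5, fn=10))
```

The tables appear to list these quantities multiplied by 100.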
2.2. Comparison of ACPs-ASSF with Other Methods
2.2.1. Performance Comparison under Different Numbers of Augmented Samples
2.2.2. Performance Comparison under Different Perturbation Factors
2.3. Visualization of ACPs-ASSF Selected Samples
2.4. Hyperparametric Sensitive Verification of ACPs-ASSF
2.5. Performance of Various Feature Combinations
3. Discussion
4. Materials and Methods
4.1. Datasets
4.2. Feature Extraction
4.2.1. BPF
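Assuming BPF here denotes the binary profile feature used in earlier ACP predictors, each residue is encoded as a 20-dimensional one-hot vector and the vectors of the first `k` residues are concatenated. The sketch below is illustrative only: the function name, the default `k=7`, the zero padding for short peptides and the all-zero block for non-standard residues are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def bpf(sequence, k=7):
    """One-hot encode the first k residues of a peptide into a flat 20*k vector."""
    seq = sequence.upper()[:k]
    vec = []
    for residue in seq:
        # A non-standard residue matches nothing and yields an all-zero block.
        vec.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    # Pad peptides shorter than k with zero blocks (an assumption of this sketch).
    vec.extend([0] * (20 * (k - len(seq))))
    return vec


if __name__ == "__main__":
    print(len(bpf("FLPLLAGLAANFLPKIFCKITRK")))
```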
4.2.2. OPE
4.2.3. CKSAAGP
4.2.4. AAC
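Assuming AAC denotes amino acid composition, the feature is the relative frequency of each of the 20 standard residues in the peptide. A minimal sketch follows; the function name is hypothetical, and ignoring non-standard residues in the denominator is an assumption.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def aac(sequence):
    """Return the 20-dimensional amino acid composition of a peptide."""
    seq = sequence.upper()
    # Count only standard residues so the frequencies sum to 1.
    n = sum(1 for residue in seq if residue in AMINO_ACIDS)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(aa) / n for aa in AMINO_ACIDS]


if __name__ == "__main__":
    print(aac("AAKK"))
```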
4.2.5. AAIF
4.3. Data Augmentation
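The result tables vary a perturbation factor between 0.001 and 0.005, which suggests that augmented samples are produced by slightly perturbing feature vectors. The sketch below shows one common form of such feature-space augmentation and should not be read as the article's exact procedure: the uniform noise in [-1, 1], the `copies` parameter and the function name are all assumptions.

```python
import random


def perturb_augment(features, labels, factor, copies, seed=0):
    """Create `copies` noisy variants of every feature vector.

    Each variant adds factor * u to every feature, with u drawn uniformly
    from [-1, 1] (an assumed noise model), and inherits the source label.
    """
    rng = random.Random(seed)
    aug_x, aug_y = [], []
    for x, y in zip(features, labels):
        for _ in range(copies):
            aug_x.append([v + factor * rng.uniform(-1.0, 1.0) for v in x])
            aug_y.append(y)  # the label is inherited, so it may be noisy
    return aug_x, aug_y


if __name__ == "__main__":
    xs, ys = perturb_augment([[0.1, 0.2], [0.3, 0.4]], [1, 0], factor=0.005, copies=3)
    print(len(xs), ys)
```

Because the label is copied without checking whether the perturbed vector still belongs to that class, some augmented samples can be noisy, which is the problem the selection framework addresses.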
4.4. Pseudo-Labeling and Uncertainty Estimation
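Algorithm 1 accumulates the model output over several stochastic forward passes before computing pseudo-labels and uncertainty. A minimal sketch of that summary step for a single sample is given below. It assumes the pseudo-label is the class with the highest mean probability and the uncertainty is the standard deviation of that class's probability across passes; the article's Equations (9) and (10) are not reproduced in this section, so these definitions and the function name are assumptions.

```python
import math


def mc_summary(pass_probs):
    """Summarise stochastic forward passes for one sample.

    pass_probs[t][c] is the class-c probability from pass t.
    Returns (pseudo_label, confidence, uncertainty).
    """
    n_passes = len(pass_probs)
    n_classes = len(pass_probs[0])
    mean = [sum(p[c] for p in pass_probs) / n_passes for c in range(n_classes)]
    pseudo_label = max(range(n_classes), key=lambda c: mean[c])
    confidence = mean[pseudo_label]
    # Spread of the predicted-class probability across passes.
    variance = sum((p[pseudo_label] - confidence) ** 2 for p in pass_probs) / n_passes
    return pseudo_label, confidence, math.sqrt(variance)


if __name__ == "__main__":
    print(mc_summary([[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]))
```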
4.5. ACPs-ASSF
Algorithm 1 Uncertainty-aware augmented sample selection (ACPs-ASSF)
Require: original training dataset; augmented sample dataset; prediction model with trainable parameters; uncertainty threshold and confidence threshold; number of stochastic forward passes; number of sample-selection iterations; number of training epochs.
1: […]; ▷ obtain training set
2: for each sample-selection iteration do
3:   Initialize […];
4:   if […] then
5:     […]; ▷ merge the selected samples into the training set
6:   for each training epoch do
7:     Train the model using the training set; ▷ using cross-entropy (CE) loss and SGD
8:   end for
9:   for each stochastic forward pass do
10:     […];
11:     Input samples from the augmented sample dataset into the model; ▷ accumulate the output of each pass
12:   end for
13:   Compute the uncertainty and pseudo-labels by Equations (9) and (10);
14:   Use Equation (11) to obtain the selected samples; ▷ select augmented samples
15: end for
16: return […]
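The loop structure of Algorithm 1 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the logistic-regression stand-in for the prediction model, input dropout as the source of stochasticity, the standard deviation as the uncertainty measure, the selection rule (uncertainty below its threshold, confidence above its threshold, pseudo-label equal to the inherited label) and all function and parameter names are hypothetical, because Equations (9) to (11) are not reproduced in this section.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X, y, epochs, lr, rng):
    """Fit a freshly initialised logistic model with per-sample SGD on CE loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            grad = p - y[i]  # gradient of the CE loss with respect to the logit
            w = [wj - lr * grad * xj for wj, xj in zip(w, X[i])]
            b -= lr * grad
    return w, b


def stochastic_pass(model, x, drop, rng):
    """One forward pass with input dropout left on (assumed source of randomness)."""
    w, b = model
    z = b
    for wj, xj in zip(w, x):
        if rng.random() >= drop:
            z += wj * xj / (1.0 - drop)
    return sigmoid(z)  # probability of the positive class


def assf(X, y, X_aug, y_aug, tau_u, tau_c, passes, iters, epochs,
         lr=0.1, drop=0.2, seed=0):
    """Iteratively train, score augmented samples, and keep the trusted ones."""
    rng = random.Random(seed)
    selected = []  # indices of augmented samples kept by the previous iteration
    model = None
    for _ in range(iters):
        # Merge previously selected augmented samples into the training set.
        X_tr = X + [X_aug[i] for i in selected]
        y_tr = y + [y_aug[i] for i in selected]
        model = train(X_tr, y_tr, epochs, lr, rng)
        selected = []
        for i, x in enumerate(X_aug):
            probs = [stochastic_pass(model, x, drop, rng) for _ in range(passes)]
            mean = sum(probs) / passes
            uncertainty = math.sqrt(sum((p - mean) ** 2 for p in probs) / passes)
            pseudo = 1 if mean >= 0.5 else 0
            confidence = max(mean, 1.0 - mean)
            if uncertainty <= tau_u and confidence >= tau_c and pseudo == y_aug[i]:
                selected.append(i)
    return model, selected
```

Returning both the trained model and the indices of the selected samples is a convenience of this sketch; re-initialising the model at the start of every iteration mirrors step 3 of the listing.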
4.6. Experimental Settings
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Sample Availability
Methods | N/R | Accuracy | Specificity | F1-Score | MCC |
---|---|---|---|---|---|
Baseline | - | 75.42 | 76.08 | 76.73 | 51.21 |
TDA | 1 | 76.25 | 71.55 | 78.41 | 52.86 |
TDA | 2 | 75.42 | 68.97 | 77.94 | 51.69 |
TDA | 3 | 73.75 | 68.74 | 76.01 | 47.58 |
TDA | 4 | 73.33 | 65.58 | 76.36 | 47.46 |
TDA | 5 | 75.42 | 69.85 | 77.78 | 51.36 |
ACPs-ASSF | 1 | 78.75 | 78.91 | 79.95 | 57.97 |
ACPs-ASSF | 2 | 77.08 | 79.71 | 77.75 | 54.98 |
ACPs-ASSF | 3 | 77.92 | 79.15 | 78.54 | 55.85 |
ACPs-ASSF | 4 | 77.92 | 77.55 | 79.02 | 55.65 |
ACPs-ASSF | 5 | 79.17 | 77.67 | 80.56 | 58.16 |

Methods | N/R | Accuracy | Specificity | F1-Score | MCC |
---|---|---|---|---|---|
Baseline | - | 73.11 | 70.13 | 74.02 | 46.20 |
TDA | 1 | 72.30 | 69.05 | 73.22 | 44.55 |
TDA | 2 | 71.89 | 69.10 | 72.69 | 44.00 |
TDA | 3 | 71.35 | 66.99 | 72.82 | 42.95 |
TDA | 4 | 72.03 | 69.47 | 73.01 | 44.17 |
TDA | 5 | 71.89 | 68.40 | 73.04 | 43.99 |
ACPs-ASSF | 1 | 75.95 | 75.37 | 76.39 | 52.18 |
ACPs-ASSF | 2 | 76.22 | 73.95 | 76.84 | 52.31 |
ACPs-ASSF | 3 | 75.68 | 75.19 | 75.92 | 51.32 |
ACPs-ASSF | 4 | 77.43 | 74.86 | 78.15 | 54.77 |
ACPs-ASSF | 5 | 77.57 | 76.57 | 77.90 | 55.17 |

Methods | Perturbation Factor | Accuracy | Specificity | F1-Score | MCC |
---|---|---|---|---|---|
Baseline | - | 75.42 | 76.08 | 76.73 | 51.21 |
TDA | 0.001 | 75.83 | 70.00 | 78.21 | 52.43 |
TDA | 0.002 | 75.42 | 70.51 | 77.51 | 51.08 |
TDA | 0.003 | 77.08 | 73.55 | 78.92 | 54.83 |
TDA | 0.004 | 71.67 | 67.86 | 73.84 | 43.93 |
TDA | 0.005 | 75.42 | 69.08 | 77.86 | 51.72 |
ACPs-ASSF | 0.001 | 77.50 | 76.59 | 79.07 | 56.25 |
ACPs-ASSF | 0.002 | 77.50 | 80.71 | 78.19 | 55.91 |
ACPs-ASSF | 0.003 | 78.33 | 79.08 | 79.60 | 57.59 |
ACPs-ASSF | 0.004 | 77.08 | 77.11 | 78.35 | 54.88 |
ACPs-ASSF | 0.005 | 80.00 | 78.88 | 81.07 | 60.32 |

Methods | Perturbation Factor | Accuracy | Specificity | F1-Score | MCC |
---|---|---|---|---|---|
Baseline | - | 73.11 | 70.13 | 74.02 | 46.20 |
TDA | 0.001 | 72.16 | 69.69 | 72.95 | 44.33 |
TDA | 0.002 | 71.35 | 68.61 | 72.36 | 42.80 |
TDA | 0.003 | 72.03 | 69.91 | 72.74 | 44.21 |
TDA | 0.004 | 71.89 | 70.66 | 72.41 | 43.63 |
TDA | 0.005 | 71.89 | 69.58 | 72.54 | 43.84 |
ACPs-ASSF | 0.001 | 75.81 | 74.34 | 76.31 | 51.42 |
ACPs-ASSF | 0.002 | 75.14 | 74.04 | 75.56 | 50.25 |
ACPs-ASSF | 0.003 | 75.81 | 76.30 | 75.84 | 51.77 |
ACPs-ASSF | 0.004 | 76.08 | 76.50 | 76.04 | 52.16 |
ACPs-ASSF | 0.005 | 76.22 | 75.14 | 76.66 | 52.34 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tao, H.; Shan, S.; Fu, H.; Zhu, C.; Liu, B. An Augmented Sample Selection Framework for Prediction of Anticancer Peptides. Molecules 2023, 28, 6680. https://doi.org/10.3390/molecules28186680