R-CRISPR: A Deep Learning Network to Predict Off-Target Activities with Mismatch, Insertion and Deletion in CRISPR-Cas9 System
Abstract
1. Introduction
2. Materials and Methods
2.1. Datasets
2.2. R-CRISPR Model
2.2.1. Encoding Matrix Scheme for gRNA-Target Pair
2.2.2. Preprocessing Module for Feature Extraction
2.2.3. Long Short-Term Memory for Constructing RNN
2.2.4. R-CRISPR Model Construction
2.3. Mainstream Prediction Methods
3. Results
3.1. Performance of R-CRISPR on Mismatch-Only gRNA-Target Prediction
3.2. Performance of R-CRISPR on Multiple gRNA-Target Prediction
3.3. Performance of R-CRISPR with Different Training Datasets
3.4. Hyperparameters Optimization
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
CNN | Convolutional Neural Network |
Cas9 | CRISPR-associated protein 9 |
CRISPR | Clustered Regularly Interspaced Short Palindromic Repeats |
DCNN | Deep Convolutional Neural Network |
gRNA | guide RNA |
LRCN | Long-term Recurrent Convolutional Neural Network |
LSTM | Long Short-Term Memory |
PAM | Protospacer Adjacent Motif |
PRC | Precision-Recall Curve |
RNN | Recurrent Neural Network |
ROC | Receiver Operating Characteristic |
Dataset | Train | Test | Total Sites | Off-Target Sites | Insertion/Deletion | gRNAs |
---|---|---|---|---|---|---|
CIRCLE | √ | - | 584,949 | 7371 | 430 | 10 |
PKD | √ | - | 4853 | 2273 | - | 65 |
PDH | √ | - | 10,129 | 52 | - | 19 |
SITE | √ | - | 217,733 | 3767 | - | 9 |
GUIDE_I | √ | - | 294,534 | 354 | - | 9 |
GUIDE_II | - | √ | 95,829 | 54 | - | 5 |
GUIDE_III | - | √ | 383,463 | 56 | - | 22 |
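The counts in the table above imply a strong class imbalance: in most datasets only a small fraction of candidate sites are validated off-targets. The following minimal sketch computes that fraction per dataset, using the numbers copied from the table. The dictionary layout and the function name `positive_rate` are illustrative assumptions, not part of the original work.

```python
# (total sites, off-target sites) per dataset, copied from the table above.
DATASETS = {
    "CIRCLE": (584_949, 7_371),
    "PKD": (4_853, 2_273),
    "PDH": (10_129, 52),
    "SITE": (217_733, 3_767),
    "GUIDE_I": (294_534, 354),
    "GUIDE_II": (95_829, 54),
    "GUIDE_III": (383_463, 56),
}


def positive_rate(total_sites: int, off_target_sites: int) -> float:
    """Fraction of candidate sites that are labelled as off-targets."""
    return off_target_sites / total_sites


if __name__ == "__main__":
    for name, (total, positives) in DATASETS.items():
        rate = positive_rate(total, positives)
        print(f"{name:10s} positives={positives:5d} rate={rate:.4%}")
```

Because the expected AUPRC of a random classifier is approximately the positive rate, these fractions give a rough baseline against which AUPRC values on the same dataset can be read.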
Hyperparameter | Value |
---|---|
Optimizer | Adam |
Initial learning rate | 0.0001 |
Batch size | 10,000 |
Epochs | 100 |
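The hyperparameters in the table above can be captured in a small configuration object. The sketch below is framework-agnostic: the class name `TrainingConfig`, the helper `steps_per_epoch`, and the choice to sum all five training datasets are assumptions made for illustration, not the authors' training code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameter values as listed in the table above."""
    optimizer: str = "adam"
    learning_rate: float = 1e-4
    batch_size: int = 10_000
    epochs: int = 100


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of mini-batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


if __name__ == "__main__":
    cfg = TrainingConfig()
    # Assumption: all five training datasets from the dataset table are used
    # (CIRCLE, PKD, PDH, SITE, GUIDE_I); other combinations give other totals.
    num_training_sites = 584_949 + 4_853 + 10_129 + 217_733 + 294_534
    per_epoch = steps_per_epoch(num_training_sites, cfg.batch_size)
    print(f"steps per epoch: {per_epoch}, total steps: {per_epoch * cfg.epochs}")
```

With a batch size this large relative to the number of positive sites, a single batch may contain very few off-target examples, which is one reason the positive rate of each dataset is worth checking before training.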
Off-Target Prediction Methods | AUROC | AUPRC |
---|---|---|
AttnToMismatch_CNN | 0.961 | 0.071 |
CRISPR-Net | 0.993 | 0.292 |
Elevation-score | 0.993 | 0.131 |
CFD | 0.925 | 0.066 |
Ensemble SVM | 0.982 | 0.113 |
CNN_std | 0.956 | 0.115 |
R-CRISPR | 0.991 | 0.319 |
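In the table above, every method reaches an AUROC above 0.92 while no AUPRC exceeds 0.32. The sketch below illustrates how such a gap can arise when positives are rare. The two metric functions are generic textbook definitions (pairwise AUROC and step-wise average precision, assuming no tied scores for the latter), not the authors' evaluation code, and the toy data in `toy_example` are made up for illustration.

```python
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    hits = 0
    ap = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at the rank of each positive
    return ap / total_pos


def toy_example() -> Tuple[List[int], List[float]]:
    """Hypothetical data: 1000 negatives and 2 positives ranked near the top."""
    labels = [0] * 1000 + [1, 1]
    scores = [i / 1000 for i in range(1000)] + [0.9895, 0.9795]
    return labels, scores


if __name__ == "__main__":
    y, s = toy_example()
    print(f"AUROC={auroc(y, s):.3f}  AUPRC={average_precision(y, s):.3f}")
```

In this toy case the two positives outrank about 98% of the negatives, so AUROC is high, yet each positive still sits behind ten or more negatives, which keeps precision low. This is why AUPRC is usually the more discriminating column when comparing methods on heavily imbalanced off-target data.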
Training Dataset | CIRCLE | PKD | PDH | SITE | GUIDE_I | AUROC | AUPRC |
---|---|---|---|---|---|---|---|
A | √ | √ | √ | - | √ | 0.989 | 0.254 |
B | - | √ | √ | √ | √ | 0.991 | 0.319 |
C | √ | - | - | - | - | 0.993 | 0.173 |
D | √ | √ | √ | √ | √ | 0.991 | 0.312 |
E | - | - | - | √ | - | 0.991 | 0.251 |
F | - | √ | √ | - | √ | 0.992 | 0.265 |
G | √ | - | - | √ | - | 0.994 | 0.220 |
Benchmark | √ | √ | √ | √ | √ | 0.993 | 0.131 |
Training Dataset | CIRCLE | PKD | PDH | SITE | GUIDE_I | AUROC | AUPRC |
---|---|---|---|---|---|---|---|
B | - | √ | √ | √ | √ | 0.998 | 0.184 |
D | √ | √ | √ | √ | √ | 0.992 | 0.143 |
F | - | √ | √ | - | √ | 0.994 | 0.150 |
Benchmark | √ | √ | √ | √ | √ | 0.996 | 0.119 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Niu, R.; Peng, J.; Zhang, Z.; Shang, X. R-CRISPR: A Deep Learning Network to Predict Off-Target Activities with Mismatch, Insertion and Deletion in CRISPR-Cas9 System. Genes 2021, 12, 1878. https://doi.org/10.3390/genes12121878