Search Results (12)

Search Parameters:
Keywords = Nextflow

7 pages, 396 KiB  
Article
CIEVaD: A Lightweight Workflow Collection for the Rapid and On-Demand Deployment of End-to-End Testing for Genomic Variant Detection
by Thomas Krannich, Dimitri Ternovoj, Sofia Paraskevopoulou and Stephan Fuchs
Viruses 2024, 16(9), 1444; https://doi.org/10.3390/v16091444 - 11 Sep 2024
Viewed by 1119
Abstract
The identification of genomic variants has become a routine task in the age of genome sequencing. In particular, small genomic variants of a single or few nucleotides are routinely investigated for their impact on an organism’s phenotype. Hence, the precise and robust detection of the variants’ exact genomic locations and changes in nucleotide composition is vital in many biological applications. Although a plethora of methods exist for the many key steps of variant detection, thoroughly testing the detection process and evaluating its results is still a cumbersome procedure. In this work, we present a collection of easy-to-apply and highly modifiable workflows to facilitate the generation of synthetic test data, as well as to evaluate the accordance of a user-provided set of variants with the test data. The workflows are implemented in Nextflow and are open-source and freely available on GitHub under the GPL-3.0 license. Full article
(This article belongs to the Special Issue Virus Bioinformatics 2024)

13 pages, 2037 KiB  
Article
Reproducible Bioinformatics Analysis Workflows for Detecting IGH Gene Fusions in B-Cell Acute Lymphoblastic Leukaemia Patients
by Ashlee J. Thomson, Jacqueline A. Rehn, Susan L. Heatley, Laura N. Eadie, Elyse C. Page, Caitlin Schutz, Barbara J. McClure, Rosemary Sutton, Luciano Dalla-Pozza, Andrew S. Moore, Matthew Greenwood, Rishi S. Kotecha, Chun Y. Fong, Agnes S. M. Yong, David T. Yeung, James Breen and Deborah L. White
Cancers 2023, 15(19), 4731; https://doi.org/10.3390/cancers15194731 - 26 Sep 2023
Cited by 5 | Viewed by 2445
Abstract
B-cell acute lymphoblastic leukaemia (B-ALL) is characterised by diverse genomic alterations, the most frequent being gene fusions detected via transcriptomic analysis (mRNA-seq). Due to its hypervariable nature, gene fusions involving the Immunoglobulin Heavy Chain (IGH) locus can be difficult to detect with standard gene fusion calling algorithms and significant computational resources and analysis times are required. We aimed to optimize a gene fusion calling workflow to achieve best-case sensitivity for IGH gene fusion detection. Using Nextflow, we developed a simplified workflow containing the algorithms FusionCatcher, Arriba, and STAR-Fusion. We analysed samples from 35 patients harbouring IGH fusions (IGH::CRLF2 n = 17, IGH::DUX4 n = 15, IGH::EPOR n = 3) and assessed the detection rates for each caller, before optimizing the parameters to enhance sensitivity for IGH fusions. Initial results showed that FusionCatcher and Arriba outperformed STAR-Fusion (85–89% vs. 29% of IGH fusions reported). We found that extensive filtering in STAR-Fusion hindered IGH reporting. By adjusting specific filtering steps (e.g., read support, fusion fragments per million total reads), we achieved a 94% reporting rate for IGH fusions with STAR-Fusion. This analysis highlights the importance of filtering optimization for IGH gene fusion events, offering alternative workflows for difficult-to-detect high-risk B-ALL subtypes. Full article

34 pages, 7541 KiB  
Article
Comparison of Metagenomics and Metatranscriptomics Tools: A Guide to Making the Right Choice
by Laura C. Terrón-Camero, Fernando Gordillo-González, Eduardo Salas-Espejo and Eduardo Andrés-León
Genes 2022, 13(12), 2280; https://doi.org/10.3390/genes13122280 - 3 Dec 2022
Cited by 23 | Viewed by 13358
Abstract
The study of microorganisms is a field of great interest due to their environmental (e.g., soil contamination) and biomedical (e.g., parasitic diseases, autism) importance. The advent of revolutionary next-generation sequencing techniques, and their application to the hypervariable regions of the 16S, 18S or 23S ribosomal subunits, have allowed a large variety of organisms to be studied in greater depth, including bacteria, archaea, eukaryotes and fungi. Additionally, together with the development of analysis software, the creation of specific databases (e.g., SILVA or RDP) has boosted the enormous growth of these studies. As the cost of sequencing per sample has continuously decreased, new protocols have also emerged, such as shotgun sequencing, which allows the profiling of all taxonomic domains in a sample. The sequencing of hypervariable regions and shotgun sequencing are technologies that enable the taxonomic classification of microorganisms from the DNA present in microbial communities. However, they are not capable of measuring what is actively expressed. Conversely, we advocate that metatranscriptomics is a “new” technology that makes the identification of the mRNAs of a microbial community possible, quantifying gene expression levels and active biological pathways. Furthermore, it can also be used to characterise symbiotic interactions between the host and its microbiome. In this manuscript, we examine the three technologies above, and discuss the implementation of different software and databases, which greatly affects the reliability of the results. Finally, we have developed two easy-to-use pipelines leveraging Nextflow technology. These aim to provide everything required for an average user to perform a metagenomic analysis of marker genes with QIIME2 and a metatranscriptomic study using Kraken2/Bracken. Full article

13 pages, 1304 KiB  
Article
hgtseq: A Standard Pipeline to Study Horizontal Gene Transfer
by Simone Carpanzano, Mariangela Santorsola, nf-core community and Francesco Lescai
Int. J. Mol. Sci. 2022, 23(23), 14512; https://doi.org/10.3390/ijms232314512 - 22 Nov 2022
Cited by 1 | Viewed by 3722
Abstract
Horizontal gene transfer (HGT) is well described in prokaryotes: it plays a crucial role in evolution, and has functional consequences in insects and plants. However, less is known about HGT in humans. Studies have reported bacterial integrations in cancer patients, and microbial sequences have been detected in data from well-known human sequencing projects. Few of the existing tools for investigating HGT are highly automated. Thanks to the adoption of Nextflow for life sciences workflows, and to the standards and best practices curated by communities such as nf-core, fully automated, portable, and scalable pipelines can now be developed. Here we present nf-core/hgtseq to facilitate the analysis of HGT from sequencing data in different organisms. We showcase its performance by analysing six exome datasets from five mammals. Hgtseq can be run seamlessly in any computing environment and accepts data generated by existing exome and whole-genome sequencing projects; this will enable researchers to expand their analyses into this area. Fundamental questions are still open about the mechanisms and the extent or role of horizontal gene transfer: by releasing hgtseq we provide a standardised tool which will enable a systematic investigation of this phenomenon, thus paving the way for a better understanding of HGT. Full article
(This article belongs to the Special Issue Data Mining and Bioinformatic Tools for Health)

24 pages, 3447 KiB  
Article
A Comparison of Bioinformatics Pipelines for Enrichment Illumina Next Generation Sequencing Systems in Detecting SARS-CoV-2 Virus Strains
by Afiahayati, Stefanus Bernard, Gunadi, Hendra Wibawa, Mohamad Saifudin Hakim, Marcellus, Arli Aditya Parikesit, Chandra Kusuma Dewa and Yasubumi Sakakibara
Genes 2022, 13(8), 1330; https://doi.org/10.3390/genes13081330 - 26 Jul 2022
Cited by 2 | Viewed by 4430
Abstract
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a newly emerging virus well known as the major cause of the worldwide pandemic due to Coronavirus Disease 2019 (COVID-19). Major breakthroughs in the Next Generation Sequencing (NGS) field were elucidated following the first release of a full-length SARS-CoV-2 genome on the 10 January 2020, with the hope of turning the table against the worsening pandemic situation. Previous studies in respiratory virus characterization require mapping of raw sequences to the human genome in the downstream bioinformatics pipeline as part of metagenomic principles. Illumina, as the major player in the NGS arena, took action by releasing guidelines for improved enrichment kits called the Respiratory Virus Oligo Panel (RVOP) based on a hybridization capture method capable of capturing targeted respiratory viruses, including SARS-CoV-2; therefore, allowing a direct map of raw sequences data to SARS-CoV-2 genome in downstream bioinformatics pipeline. Consequently, two bioinformatics pipelines emerged with no previous studies benchmarking the pipelines. This study focuses on gaining insight and understanding of target enrichment workflow by Illumina through the utilization of different bioinformatics pipelines named as ‘Fast Pipeline’ and ‘Normal Pipeline’ to SARS-CoV-2 strains isolated from Yogyakarta and Central Java, Indonesia. Overall, both pipelines work well in the characterization of SARS-CoV-2 samples, including in the identification of major studied nucleotide substitutions and amino acid mutations. A higher number of reads mapped to the SARS-CoV-2 genome in Fast Pipeline and merely were discovered as a contributing factor in a higher number of coverage depth and identified variations (SNPs, insertion, and deletion). Fast Pipeline ultimately works well in a situation where time is a critical factor. On the other hand, Normal Pipeline would require a longer time as it mapped reads to the human genome. 
Certain limitations were identified in the pipeline algorithms; future studies are therefore encouraged to design the pipeline within an integrated framework, for instance by using Nextflow, a workflow framework, to combine all scripts into one fully integrated pipeline. Full article
(This article belongs to the Section Bioinformatics)

17 pages, 1083 KiB  
Article
SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants
by Houcemeddine Othman, Sherlyn Jemimah and Jorge Emanuel Batista da Rocha
J. Pers. Med. 2022, 12(2), 263; https://doi.org/10.3390/jpm12020263 - 11 Feb 2022
Cited by 2 | Viewed by 3180
Abstract
Recent genomic studies have revealed the critical impact of genetic diversity within small population groups in determining the way individuals respond to drugs. One of the biggest challenges is to accurately predict the effect of single nucleotide variants and to get the relevant information that allows for a better functional interpretation of genetic data. Different conformational scenarios upon the changing in amino acid sequences of pharmacologically important proteins might impact their stability and plasticity, which in turn might alter the interaction with the drug. Current sequence-based annotation methods have limited power to access this type of information. Motivated by these calls, we have developed the Structural Workflow for Annotating ADME Targets (SWAAT) that allows for the prediction of the variant effect based on structural properties. SWAAT annotates a panel of 36 ADME genes including 22 out of the 23 clinically important members identified by the PharmVar consortium. The workflow consists of a set of Python codes of which the execution is managed within Nextflow to annotate coding variants based on 37 criteria. SWAAT also includes an auxiliary workflow allowing a versatile use for genes other than ADME members. Our tool also includes a machine learning random forest binary classifier that showed an accuracy of 73%. Moreover, SWAAT outperformed six commonly used sequence-based variant prediction tools (PROVEAN, SIFT, PolyPhen-2, CADD, MetaSVM, and FATHMM) in terms of sensitivity and has comparable specificity. SWAAT is available as an open-source tool. Full article
(This article belongs to the Special Issue Systems Medicine and Bioinformatics)

20 pages, 1801 KiB  
Article
Side-by-Side Comparison of Post-Entry Quarantine and High Throughput Sequencing Methods for Virus and Viroid Diagnosis
by Marie-Emilie A. Gauthier, Ruvini V. Lelwala, Candace E. Elliott, Craig Windell, Sonia Fiorito, Adrian Dinsdale, Mark Whattam, Julie Pattemore and Roberto A. Barrero
Biology 2022, 11(2), 263; https://doi.org/10.3390/biology11020263 - 8 Feb 2022
Cited by 15 | Viewed by 4126
Abstract
Rapid and safe access to new plant genetic stocks is crucial for primary plant industries to remain profitable, sustainable, and internationally competitive. Imported plant species may spend several years in Post Entry Quarantine (PEQ) facilities, undergoing pathogen testing which can impact the ability of plant industries to quickly adapt to new global market opportunities by accessing new varieties. Advances in high throughput sequencing (HTS) technologies provide new opportunities for a broad range of fields, including phytosanitary diagnostics. In this study, we compare the performance of two HTS methods (RNA-Seq and sRNA-Seq) with that of existing PEQ molecular assays in detecting and identifying viruses and viroids from various plant commodities. To analyze the data, we tested several bioinformatics tools which rely on different approaches, including direct-read, de novo, and reference-guided assembly. We implemented VirReport, a new portable, scalable, and reproducible Nextflow pipeline that analyses sRNA datasets to detect and identify viruses and viroids. We raise awareness of the need to evaluate cross-sample contamination when analyzing HTS data routinely and of using methods to mitigate index cross-talk. Overall, our results suggest that sRNA analyzed using VirReport provides opportunities to improve quarantine testing at PEQ by detecting all regulated exotic viruses from imported plants in a single assay. Full article

12 pages, 2063 KiB  
Article
Comprehensive Analysis of Large-Scale Transcriptomes from Multiple Cancer Types
by Baoting Nong, Mengbiao Guo, Weiwen Wang, Zhou Songyang and Yuanyan Xiong
Genes 2021, 12(12), 1865; https://doi.org/10.3390/genes12121865 - 24 Nov 2021
Cited by 4 | Viewed by 3722
Abstract
Various abnormalities of transcriptional regulation revealed by RNA sequencing (RNA-seq) have been reported in cancers. However, strategies to integrate multi-modal information from RNA-seq, which would help uncover more disease mechanisms, are still limited. Here, we present PipeOne, a cross-platform one-stop analysis workflow for large-scale transcriptome data. It was developed based on Nextflow, a reproducible workflow management system. PipeOne is composed of three modules: data processing and feature matrix construction, disease feature prioritization, and disease subtyping. It first integrates eight different tools to extract different information from RNA-seq data, and then uses a random forest algorithm to study and stratify patients according to evidence from multi-modal information. Its application in five cancers (colon, liver, kidney, stomach, and thyroid; total samples n = 2024) identified various dysregulated key features (such as PVT1 expression and ABI3BP alternative splicing) and pathways (especially liver and kidney dysfunction) shared by multiple cancers. Furthermore, we demonstrated clinically relevant patient subtypes in four of five cancers, with most subtypes characterized by distinct driver somatic mutations, such as TP53, TTN, BRAF, HRAS, MET, KMT2D, and KMT2C mutations. Importantly, these subtyping results were frequently contributed by dysregulated biological processes, such as ribosome biogenesis, RNA binding, and mitochondria functions. PipeOne is efficient and accurate in studying different cancer types to reveal the specificity and cross-cancer contributing factors of each cancer. It could be easily applied to other diseases and is available at GitHub. Full article
(This article belongs to the Section Bioinformatics)

8 pages, 7987 KiB  
Article
ORPER: A Workflow for Constrained SSU rRNA Phylogenies
by Luc Cornet, Anne-Catherine Ahn, Annick Wilmotte and Denis Baurain
Genes 2021, 12(11), 1741; https://doi.org/10.3390/genes12111741 - 29 Oct 2021
Cited by 1 | Viewed by 2717
Abstract
The continuous increase in sequenced genomes in public repositories makes the choice of interesting bacterial strains for future sequencing projects ever more complicated, as it is difficult to estimate the redundancy between these strains and the already available genomes. Therefore, we developed the Nextflow workflow “ORPER”, for “ORganism PlacER”, containerized in Singularity, which allows the determination of the phylogenetic position of a collection of organisms in the genomic landscape. ORPER constrains the phylogenetic placement of SSU (16S) rRNA sequences in a multilocus reference tree based on ribosomal protein genes extracted from public genomes. We demonstrate the utility of ORPER on the Cyanobacteria phylum, by placing 152 strains of the BCCM/ULC collection. Full article

12 pages, 1192 KiB  
Article
FA-nf: A Functional Annotation Pipeline for Proteins from Non-Model Organisms Implemented in Nextflow
by Anna Vlasova, Toni Hermoso Pulido, Francisco Camara, Julia Ponomarenko and Roderic Guigó
Genes 2021, 12(10), 1645; https://doi.org/10.3390/genes12101645 - 19 Oct 2021
Cited by 3 | Viewed by 4910
Abstract
Functional annotation allows adding biologically relevant information to predicted features in genomic sequences, and it is, therefore, an important procedure of any de novo genome sequencing project. It is also useful for proofreading and improving gene structural annotation. Here, we introduce FA-nf, a pipeline implemented in Nextflow, a versatile computational workflow management engine. The pipeline integrates different annotation approaches, such as NCBI BLAST+, DIAMOND, InterProScan, and KEGG. It starts from a protein sequence FASTA file and, optionally, a structural annotation file in GFF format, and produces several files, such as GO assignments, output summaries of the abovementioned programs and final annotation reports. The pipeline can be broken easily into smaller processes for the purpose of parallelization and easily deployed in a Linux computational environment, thanks to software containerization, thus helping to ensure full reproducibility. Full article
(This article belongs to the Special Issue Trends and Future Perspectives in Genome Annotation)

17 pages, 5995 KiB  
Article
RNAflow: An Effective and Simple RNA-Seq Differential Gene Expression Pipeline Using Nextflow
by Marie Lataretu and Martin Hölzer
Genes 2020, 11(12), 1487; https://doi.org/10.3390/genes11121487 - 10 Dec 2020
Cited by 25 | Viewed by 10034
Abstract
RNA-Seq enables the identification and quantification of RNA molecules, often with the aim of detecting differentially expressed genes (DEGs). Although RNA-Seq evolved into a standard technique, there is no universal gold standard for the computational analysis of these data. On top of that, previous studies proved the irreproducibility of RNA-Seq studies. Here, we present a portable, scalable, and parallelizable Nextflow RNA-Seq pipeline to detect DEGs, which assures a high level of reproducibility. The pipeline automatically takes care of common pitfalls, such as ribosomal RNA removal and low abundance gene filtering. Apart from various visualizations for the DEG results, we incorporated downstream pathway analysis for common species such as Homo sapiens and Mus musculus. We evaluated the DEG detection functionality while using qRT-PCR data serving as a reference and observed a very high correlation of the logarithmized gene expression fold changes. Full article
(This article belongs to the Special Issue Algorithms and Workflows in RNA Bioinformatics)

1 page, 141 KiB  
Abstract
VANIR—NextFlow Pipeline for Viral Variant Calling and de Novo Assembly of Nanopore and Illumina Reads for High-Quality dsDNA Viral Genomes
by Joan Martí-Carreras and Piet Maes
Proceedings 2020, 50(1), 117; https://doi.org/10.3390/proceedings2020050117 - 3 Jul 2020
Cited by 2 | Viewed by 1879
Abstract
Human cytomegalovirus (HCMV), like other herpes and dsDNA viruses, possesses unique properties derived from its genome architecture. The HCMV genome is composed of two unique domains: long (L) and short (S). Each domain contains a central unique region (U; thus, UL and US, respectively) and two repeated regions (thus, TRL/IRL and TRS/IRS). Recombination between repetitive regions is possible, yielding four possible genomic isomers, found in equimolar proportion in any viral infective population. Frequent recombination and an altered selective landscape can give rise to the persistence, if not fixation, of diverse variants in cultured HCMV isolates. This phenomenon has already been discovered in AD169 and Towne strains, characterizing a 10 kbp deletion (ΔUL/b’) in commonly used viral strains. Other dsDNA viruses are known for their structural rearrangements and frequent recombination. VANIR (viral variant calling and de novo assembly using nanopore and Illumina reads) is a novel analysis pipeline that benefits from both short-read (Illumina) and long-read sequencing technologies (Oxford Nanopore Technologies Ltd.) to assemble high-quality dsDNA viral genomes and detect variants. Illumina and nanopore sequencing provide complementary information to the assembly and variant discovery. Assembly contiguity and structural variant and repeat calling are greatly improved by nanopore read length, while base-calling and base confidence benefit from Illumina's reduced error rate and increased yield. This specialized bioinformatic analysis pipeline is encoded in the Nextflow pipeline manager and containerized in a Singularity image. This set-up allows for improved traceability, reproducibility, transportability, and speed. Through VANIR, novel point mutations and structural genome rearrangements are called from sequencing data, benefiting diversity research with attenuated lab-strains and wild-type viruses. Full article
(This article belongs to the Proceedings of Viruses 2020—Novel Concepts in Virology)