Article
Peer-Review Record

Genes and Diseases: Insights from Transcriptomics Studies

Genes 2022, 13(7), 1168; https://doi.org/10.3390/genes13071168
by Dmitry S. Kolobkov 1, Darya A. Sviridova 1, Serikbai K. Abilev 1, Artem N. Kuzovlev 2 and Lyubov E. Salnikova 1,2,3,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 6 May 2022 / Revised: 13 June 2022 / Accepted: 23 June 2022 / Published: 28 June 2022
(This article belongs to the Section Human Genomics and Genetic Diseases)

Round 1

Reviewer 1 Report

The authors describe the problem that genes identified as differentially expressed, and therefore associated with a disease, tend to be enriched in immune response, translation and tissue-specific genes. Although the problems of gene set enrichment analysis are well known, the authors do not discuss how their approach differs from, and improves on, previous approaches (e.g. correcting for genes normally expressed in the tissue studied). Each section is poorly written: the introduction lacks clear definitions of the terms used, the methods do not explain what the authors aim to achieve with each step, the results are unclear, and therefore so are the conclusions. Overall, the manuscript is very hard to read, and I could not identify the take-home message or the novelty and importance of this study. Please see a few more specific comments below.

 

The term "expression studies" is not precise enough in my opinion, as many things can be expressed: genes and proteins, but also emotions. Please consider rephrasing the title, e.g. "Genes and diseases: insights from gene expression studies" or "Genes and diseases: insights from transcriptomics studies".

What is the "healthy transcriptomics of a disease" or "healthy expression"? How do the authors define these terms?

Please, rephrase "differential DE"

Please, explain "whole genome expression levels"

 

No gene annotation versions are provided.

What are GEO up/down studies?

Please, define "initial expression level"

 

 

 

Author Response

Dear reviewer, we are grateful to you for the opinion and comments. Please find enclosed our replies, comments and point-by-point description of the changes made to the manuscript in response. We replaced terms that raised questions or explained them and thoroughly revised the whole manuscript to make it more consistent and easier to read. Please see the revised manuscript for more detailed information. 

Author Response File: Author Response.docx

Reviewer 2 Report

Major revision recommended:

 

The manuscript entitled “Genes and diseases: insights from expression studies” studies differential gene expression of disease-related genes using publicly available datasets from various databases such as HPA and GEO. The authors claim to confirm the previously known finding that disease genes have higher expression levels in pathologically linked tissues than in other tissues. In addition, DE genes were highly expressed and enriched in disease genes. Even though the manuscript raises many major concerns, the study provides value to the field.

 

My major concerns are:

  • Gene expression measurements depend on the particular methods of sample and library preparation. The databases that the authors use draw on all kinds of diverse methods for sample and library preparation. The authors neglect these important factors, which weakens their conclusions.

  • The authors study the expression of genes relevant to diseases, first by compiling a disease-related gene list and then by analyzing differential expression (DE). The proper way of analysis would be systemic eQTL analyses.

  • For the healthy transcriptomic data analysis, the claim that “A comparison of expression levels of genes involved in the development of immune system diseases and genes associated with non-immune diseases showed that the former had higher levels of RNA expression in all immune related tissues and some other tissues (38/77 in two tissue sets).” in Lines 168-170 should be supported by a proper p-value calculation in this section.

  • The Jaccard index, which measures the ratio of intersection to union, varies hugely in Line 200, from 0.15 to 0.63, so I am not convinced by the claim that this shows consistency in the transcriptomic data.

  • The current study is based on 9972 genes that are correlated with diseases, which is about half of all genes. If the authors analyzed only the 1801 genes that overlap across the various phenotype databases, I wonder how many of the current conclusions would still hold true.
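
For reference, the Jaccard index discussed in the fourth concern can be sketched as below. This is only an illustration of the definition (intersection size divided by union size): the function name `jaccard_index` and the two gene lists are hypothetical and are not taken from the manuscript or its datasets.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of gene symbols."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention chosen for this sketch: two empty sets give 0.0.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical DE gene lists from two studies of the same disease.
study_1 = {"TNF", "IL6", "ACTB", "STAT1"}
study_2 = {"TNF", "IL6", "GAPDH", "STAT1", "CXCL8"}

# 3 shared genes out of 6 distinct genes overall.
overlap = jaccard_index(study_1, study_2)
```

A value of 1 means the two gene lists are identical and 0 means they share nothing, so a spread from 0.15 to 0.63 corresponds to anything from weak to moderate overlap between studies.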

 

My minor concerns are:

 

  • The English should be edited properly. The meaning is often unclear.

    • Lines 13-14 in the Abstract: in particular, I do not understand what the authors mean by ‘preferential’ and ‘a systematic research’.

    • Line 17: “healthy transcriptomics of disease” might mean “transcriptomic analysis of the healthy controls in the study of a disease”.

    • Line 36: “integrated analysis of DE genes” does not clearly state what the analysis is integrated with.

    • Line 43: “DE modules” should be defined before the term is used.

    • Some hyperlinks in the PDF are not accessible, e.g., Line 83’s http://www.orphadata.org/cgi- bin/rare_free.html   Even deleting the dash right after the cgi did not resolve the issue.

    • The MalaCards link (https://www.malacards.org/) would be better given in Line 88.

    • MalaCards’ elite genes, which are genes associated with the disease as a potential cause and supported by manually curated, trustworthy sources, should be defined.

    • PID in Line 92 should be defined.

  • The TissueEnrich R package should be properly cited (Jain A, Tuteja G (2018). “TissueEnrich: Tissue-specific gene enrichment analysis.” Bioinformatics, bty890. doi: 10.1093/bioinformatics/bty890.)

  • The figure resolutions are not suitable for publication; better-resolution figures need to be provided.

  • 9972 genes are about half of 

    • The reference for the Monaco data is missing.

    • “-log10 P value (adjusted)” is not standard notation. The 10 should be subscripted, and “P value (adjusted)” would be better shown as “Adjusted p-value”.
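
To illustrate the quantity behind the axis label in the last comment, a minimal sketch of computing the negative base-10 logarithm of an adjusted p-value follows. The manuscript's adjustment method is not stated in this record, so the Benjamini-Hochberg procedure used here is an assumption, and the function name and input p-values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used only as an example of a multiple-testing
    adjustment; the manuscript may use a different method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four enrichment terms.
raw_p = [0.001, 0.02, 0.03, 0.5]
adj_p = benjamini_hochberg(raw_p)

# The plotted quantity: -log10(adjusted p-value); larger means more significant.
neg_log10_adj_p = [-math.log10(p) for p in adj_p]
```

On a plot, this quantity would be labelled with the 10 as a subscript, for example as `$-\log_{10}$(Adjusted p-value)` in tools that accept LaTeX-style labels.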

Author Response

Dear reviewer, we are grateful to you for careful reading, important considerations and constructive comments. Please find enclosed our replies, comments and point-by-point description of the changes made to the manuscript in response.

Author Response File: Author Response.docx

Reviewer 3 Report

The authors consolidated lists of disease-related genes from multiple databases, and examined appearance of these genes in differentially expressed gene sets in GEO-registered datasets. Expression level, Gene Ontology terms and phenotype association highlighted characteristics of disease-relevant genes in the datasets comparing between normal and disease samples.

 

The analysis results presented in this manuscript make sense and are consistent with previous reports on disease-related genes. To avoid potentially misleading interpretation of the results, the authors should state the potential bias in the analysis more clearly in the Discussion. The human disease gene databases used in this study tend to contain most of the well-established genes. For example, Actb (beta-actin, which is often considered a “housekeeping gene”) is also described as a human disease-related gene (it is included in the supplementary table). Highly expressed genes tend to be well characterized and reported as disease-relevant. Such genes may have broad tissue specificity. This could affect the majority of the analyses presented in Figures 2, 3 and 4. The general “healthiness” of cells supported by these housekeeping genes might, consequently, affect the tissue function linked to the pathogenic phenotypes described in Figure 5. The results presented in this manuscript are fine, but the authors need to mention the bias in the list of “disease genes” in the Discussion. GEO datasets might also be biased toward better detection of highly expressed genes, as both RNA-seq and microarray transcriptome analyses provide more confident quantitative measurements for highly expressed genes.

Author Response

Dear reviewer, we are grateful to you for the constructive generalization of our results, indicating limitations and recommendations. Please find enclosed our replies, comments and point-by-point description of the changes made to the manuscript in response. 

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

* The authors removed the part related to the Jaccard index from Figure 2 and from the relevant text. This creates a more severe problem: it potentially hides information that could count against the authors' results. The Jaccard index should be shown, and the authors should state explicitly that the Jaccard index results argue against consistency.

 

Major comment 4: The Jaccard index, which measures the ratio of intersection to union, varies hugely in Line 200, from 0.15 to 0.63, so I am not convinced by the claim that this shows consistency in the transcriptomic data.

Reply

Data on the Jaccard index in Figure 2 and in the text were removed. We also removed the statement on the consistency of transcriptomic data.

 

 

* In addition, I am still not convinced by the authors' reply to major comment 5.

Author Response

Dear Reviewer, we thank you for the careful reading of the manuscript and for offering constructive feedback and helpful suggestions.

Author Response File: Author Response.docx
