*2.5. Differentially Expressed Gene (DEG) and Gene Ontology (GO) Analysis for Selection of Transcripts Related to Disease Resistance*

Among the sequences generated from Trinity, sequences with a length of more than 100 amino acids were selected using TransDecoder. Based on these data and onion reference transcript data (NABIC, RDA, JeonJu, Korea, https://nabic.rda.go.kr, accessed on 7 February 2021), the sequences of each onion line were aligned with HISAT2. Read counts representing transcript expression levels were then calculated using the StringTie program. Expression was quantified at the transcript level, and a comparative analysis was performed between the onion lines based on the read count of each transcript. After the lines were divided into resistant and susceptible groups, a DEG analysis was performed using DEGseq (Bioconductor, http://www.bioconductor.org/packages/release/bioc/html/DEGseq.html, accessed on 2 October 2021) [24]. First, the raw read count data were normalized, and a correlation analysis was performed between the onion lines based on the normalized data. The analysis was conducted using Pearson's correlation coefficient and the average linkage method. The R package DEGseq was used to confirm the statistical significance of the expression differences between the resistant and susceptible groups. After the average expression levels of the two groups were compared, the following conditions were set to select genes with significantly different expression. The fold change was calculated as $\log_2\left(\frac{\text{Base Mean of R}}{\text{Base Mean of S}}\right)$, where R and S denote the resistant and susceptible groups, respectively. Transcripts more highly expressed in the susceptible group than in the resistant group therefore had a negative value, and transcripts more highly expressed in the resistant group than in the susceptible group had a positive value. Thereafter, the transcripts satisfying the conditions of |log2 fold change| ≥ 2 and adjusted p-value (PADJ) < 0.05 were selected as DEGs. From the results of the variant analysis and the DEG analysis, transcripts that satisfied the conditions of both analyses were selected.
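The selection criteria above can be illustrated with a minimal Python sketch. It is not the DEGseq implementation: the adjusted p-values are taken as given, and the record layout, function names, transcript IDs, and numeric values are hypothetical and serve only to show the fold-change sign convention, the two thresholds, and the intersection with the variant analysis. Both base means are assumed to be positive; transcripts with a zero mean would need separate handling.

```python
import math

def log2_fold_change(base_mean_r, base_mean_s):
    """log2(Base Mean of R / Base Mean of S); positive means higher in resistant."""
    # Assumption: both base means are > 0 (zero means are not handled here).
    return math.log2(base_mean_r / base_mean_s)

def select_degs(records, lfc_cutoff=2.0, padj_cutoff=0.05):
    """Keep transcripts with |log2 fold change| >= lfc_cutoff and padj < padj_cutoff."""
    selected = []
    for rec in records:
        lfc = log2_fold_change(rec["base_mean_r"], rec["base_mean_s"])
        if abs(lfc) >= lfc_cutoff and rec["padj"] < padj_cutoff:
            selected.append({
                "id": rec["id"],
                "log2fc": lfc,
                "higher_in": "resistant" if lfc > 0 else "susceptible",
            })
    return selected

# Hypothetical example data (not from the study).
records = [
    {"id": "tx_A", "base_mean_r": 80.0, "base_mean_s": 10.0, "padj": 0.001},
    {"id": "tx_B", "base_mean_r": 5.0, "base_mean_s": 40.0, "padj": 0.01},
    {"id": "tx_C", "base_mean_r": 30.0, "base_mean_s": 10.0, "padj": 0.001},
    {"id": "tx_D", "base_mean_r": 64.0, "base_mean_s": 4.0, "padj": 0.2},
]

degs = select_degs(records)

# Hypothetical set of transcripts that passed the variant analysis.
variant_hits = {"tx_A", "tx_D"}

# Transcripts satisfying the conditions of both analyses.
common = [d["id"] for d in degs if d["id"] in variant_hits]

for d in degs:
    print(d["id"], d["log2fc"], d["higher_in"])
print("common:", common)
```

In this toy input, tx_C fails the fold-change threshold and tx_D fails the PADJ threshold, so only tx_A and tx_B pass the DEG filter, and only tx_A is also present in the variant set.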
A gene function analysis was performed to determine whether the selected variants were related to disease resistance mechanisms. To identify functions related to disease resistance, the selected transcripts were analyzed using The Arabidopsis Information Resource (TAIR) IDs derived from Arabidopsis, a model plant. Thereafter, GO annotation was performed using the gene functions confirmed for each transcript through its TAIR ID, and the transcripts with functions related to disease resistance were selected.
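The final selection step can be sketched as a lookup from transcript to TAIR ID to GO terms, followed by a keyword filter. This is only an illustration under stated assumptions: the mapping tables, the placeholder TAIR IDs (deliberately not real loci), the GO term descriptions assigned to them, and the keyword list are all hypothetical, and the text does not specify how resistance-related functions were recognized.

```python
# Hypothetical keywords used to flag resistance-related GO term descriptions.
DEFENSE_KEYWORDS = ("defense response", "response to fungus", "response to bacterium")

def is_resistance_related(go_terms, keywords=DEFENSE_KEYWORDS):
    """Return True if any GO term description contains any keyword."""
    return any(kw in term.lower() for term in go_terms for kw in keywords)

def select_resistance_transcripts(transcript_to_tair, tair_to_go):
    """Keep transcripts whose TAIR ID carries a resistance-related GO term."""
    selected = []
    for transcript, tair_id in transcript_to_tair.items():
        go_terms = tair_to_go.get(tair_id, [])  # no annotation -> empty list
        if is_resistance_related(go_terms):
            selected.append(transcript)
    return selected

# Placeholder mappings (IDs and annotations are illustrative, not real data).
transcript_to_tair = {
    "tx_A": "AT0G00010",
    "tx_B": "AT0G00020",
    "tx_C": "AT0G00030",
    "tx_D": "AT0G00040",
}
tair_to_go = {
    "AT0G00010": ["defense response to fungus", "protein binding"],
    "AT0G00020": ["photosynthesis"],
    "AT0G00030": ["Response to bacterium"],
    # AT0G00040 has no GO annotation in this toy table.
}

resistance_transcripts = select_resistance_transcripts(transcript_to_tair, tair_to_go)
print(resistance_transcripts)
```

Matching on term descriptions keeps the sketch readable; filtering on GO identifiers or on descendants of a parent term such as "defense response" would be a stricter alternative.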
