*4.5. WGCNA Analysis*

#### 4.5.1. Construction of Weighted Gene Co-Expression Networks and Identification of Modules

We use thresholds for different expression data to calculate the number of genes. The WGCNA algorithm was applied to the evaluation of gene expression. The flashClust toolkit (R language) was used to perform cluster analysis on samples and set appropriate thresholds. After background correction and quantile normalization, the top 50% of genes (8056 genes) with the most variant in the analysis of variance were selected for WGCNA analysis. The WGCNA package in R was used to construct the co-expression network of the 8056 genes after verification [30].

The gradient independence method was used to test the scale independence and average connectivity of different power modules (the power value ranging from 1 to 20). According to the scale-independent conditions, the appropriate power value was 0.6. After the power value determined, the module was constructed by the WGCNA algorithm, and the genetic information corresponding to each module was extracted. For high reliability of the results, the minimum number of genes per module was set to 50.

#### 4.5.2. Interactions Analysis of Co-Expression Modules

The interaction relationship between different co-expression modules was calculated by WGCNA, and the topological overlap matrix (TOM) was constructed using the correlation expression values. The resulting topological overlap is a biologically meaningful measure of gene similarity based on the co-expression relationship between two genes. Using each TOM as input, the flashClust function was used for hierarchical cluster analysis. Then the dynamicTreeCut algorithm (minModuleSize = 30) was used to detect the module as a branch of the dendrogram [30]. Random colors were assigned to the modules, and the first principal component was used to calculate the module characteristic genes of each module. The module characteristic genes can be regarded as the representative of

the gene expression pattern in this module, and the module combining highly relevant characteristic genes (merge-CutHeight = 0.07). By calculating the Pearson correlation between the module characteristic genes and the interest characteristics, and using the module characteristic genes to estimate the module-trait relationship, the samples are classified according to the corresponding traits, types, and tissues, and the module with a correlation coefficient ≥ | 0.75 | and *p*-value ≤ 0.01 was selected for further analysis. Through the heat map of the gene network topological overlap, the structure of co-expression module was visualized. Then through the hierarchical clustering tree diagram of the characteristic genes and the heat map of the corresponding characteristic gene network, the relationship between the modules was summarized.
