*2.9. Statistical Analysis*

Data were analyzed in R (version 3.6.0) using RStudio (version 1.1.456). Longitudinal plasma protein abundance was assessed by LMM analysis (lme4 package), with drug response (non-response or response), sampling time (baseline, 1 week, 4 weeks, and 10 weeks), the response-by-time interaction, and technical replicate as fixed effects, and individual patient as a random effect, so that repeated measurements were nested within individuals. For the GEE analysis (geesmv package), we summarized plasma abundance as the median of the two or three technical replicates and then analyzed drug response, sampling time, and the response-by-time interaction. The working correlation structure was set to independence, and a Gaussian model was fitted.
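As an illustration of the replicate-merging step used before the GEE analysis, the following minimal sketch collapses technical replicates to one median per patient and time point. It is written in Python with hypothetical record tuples and a hypothetical helper name (`merge_replicates`); the study itself was carried out in R, so this is not the code used here.

```python
from collections import defaultdict
from statistics import median


def merge_replicates(records):
    """Collapse technical replicates to one median per (patient, time point).

    `records` is an iterable of (patient, time_point, abundance) tuples;
    this layout is an assumption made for the example.
    """
    grouped = defaultdict(list)
    for patient, time_point, abundance in records:
        grouped[(patient, time_point)].append(abundance)
    # With two replicates the median is their mean; with three it is the middle value.
    return {key: median(values) for key, values in grouped.items()}


# Hypothetical measurements: three replicates at baseline, two at 1 week.
records = [
    ("P01", "baseline", 10.0),
    ("P01", "baseline", 12.0),
    ("P01", "baseline", 11.0),
    ("P01", "1 week", 8.0),
    ("P01", "1 week", 9.0),
]
merged = merge_replicates(records)
```

The merged values, one per patient and time point, would then serve as the response variable in the GEE model.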

Clustering analysis was based on median protein concentrations in each group (responders and non-responders) at the four time points. t-Distributed stochastic neighbor embedding (t-SNE) [34] (perplexity = 2, theta = 0, and dims = 2) and affinity propagation (method = symmetric correlation matrix, Spearman) were computed using the Rtsne and apcluster [35] packages, respectively. Other software included ggline for scatter plots and psygenet2r for mapping proteins onto the psychiatric disorders gene association network (PsyGeNet) with database = "ALL" [36]. To control type I error arising from multiple comparisons, we applied the Bayesian sequential goodness of fit metatest (SGoF) method with default options (alpha = 0.05, gamma = 0.05, P0 = 0.5, a0 = 1, b0 = 1) in the SGoF R package [37] to the *p*-values of the response-by-time interaction from the LMM analysis, and the Benjamini–Hochberg procedure [38] to the *p*-values of the MRM paired analysis. For the correlation analysis, we calculated permutation *p*-values [39].
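For readers unfamiliar with the Benjamini–Hochberg procedure, the following minimal sketch shows how the adjusted *p*-values are obtained. It is written in Python with a hypothetical function name and made-up input values; the analysis in this study was performed in R, and the sketch is not the code used here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1),
    # scaling each by m / rank and enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adjusted]
```

A test is declared significant when its adjusted *p*-value does not exceed the chosen false discovery rate, taken here as 0.05 for illustration.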
