**3. Results**

#### *3.1. Characteristics of the Sample*

A total of 508 patients were included, of whom 276 were female (54.3%). The mean age of the total sample was 46.11 years (standard deviation [SD] = 14.47). The sample consisted of 345 (67.9%) patients with BD-I and 163 (32.1%) patients with BD-II. Of all patients, 262 (51.6%) fulfilled the DSM criteria for lifetime SUD of any type. The most commonly used substance was alcohol (42.1%), followed by cannabis (22.6%), cocaine (12%), amphetamine (4.7%), MDMA (4.7%), and hallucinogens (2.1%). A total of 106 patients (20.8%) had AUD with at least one other SUD (AUD+SUD). The sample characteristics are presented in Table 1.

#### *3.2. Missing Data*

Among the variables with missing data, 22 had fewer than 25% missing values; of these, 17 had fewer than 10%. The estimated imputation errors were below 20% for all variables except the number of lifetime hospitalizations (out-of-bag [OOB] error = 0.29).

#### *3.3. Patients with SUD vs. without SUD*

The RF model performance outputs are reported in Table 2. The variables with the highest mean decrease in Gini were "number of total affective episodes", "number of total depressive episodes", "number of total hypomanic episodes", "number of total manic episodes", "number of lifetime hospitalization", "being in a relationship", "diagnosis of cluster B personality disorder", "number of attempted suicides", "number of mixed episodes", and "treatment with benzodiazepines".
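For readers less familiar with the ranking metric, the mean decrease in Gini averages, over all trees in the forest, the reduction in node impurity obtained by splitting on a given variable. The following minimal sketch computes that reduction for a single split. The labels are a hypothetical toy node (not study data), and the function names are illustrative rather than taken from the software used in the study.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum over classes of p_k squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(parent, left, right):
    """Impurity reduction when `parent` is split into `left` and `right`.

    Child impurities are weighted by the share of samples each child receives.
    """
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Hypothetical toy node: 1 = SUD, 0 = no SUD (illustrative only).
parent = [1, 1, 1, 0, 0, 0, 0, 0]
left = [1, 1, 1, 0]   # e.g., samples above some split threshold
right = [0, 0, 0, 0]  # e.g., samples below it

decrease = gini_decrease(parent, left, right)
print(f"Gini decrease for this split: {decrease:.5f}")
```

A variable that repeatedly produces large reductions of this kind across the forest receives a high mean decrease in Gini and therefore ranks near the top of the list.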

The ten variables ranked highest by the RF model were tested in a multiple logistic regression adjusted for relevant covariates (see Methods). The presence of SUD was positively associated with a diagnosis of cluster B personality disorder (OR = 2.31 [95% CI = 1.26–4.23]; *p* = 0.006) and negatively associated with being in a relationship (OR = 0.6 [95% CI = 0.39–0.91]; *p* = 0.015) (Figure 1). The model explained 16.9% of the total variance in the sample of BD with SUD vs. non-SUD.
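The relationship between an odds ratio, its confidence interval, and its *p*-value can be sketched as follows. The example assumes a Wald-type 95% CI (symmetric on the log-odds scale), which is an assumption and is not stated in this section. The function `or_summary` is a hypothetical helper that back-calculates the log-odds coefficient, its standard error, and a two-sided *p*-value from a published OR and CI.

```python
import math


def or_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, SE, z, and two-sided p from an OR and its 95% CI.

    Assumes a Wald interval: CI = exp(beta +/- z_crit * SE).
    """
    beta = math.log(odds_ratio)
    # The CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Values reported above for cluster B personality disorder.
beta, se, z, p = or_summary(2.31, 1.26, 4.23)
print(f"beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Because the published OR and CI are rounded, the recovered *p*-value only approximates the reported one. A check of this kind is mainly useful for confirming that an estimate and its interval are mutually consistent.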

#### *3.4. Patients with AUD vs. without SUD*

RF model performance outputs are reported in Table 2. The variables with the highest mean decrease in Gini were "number of total affective episodes", "number of total depressive episodes", "number of total hypomanic episodes", "number of total manic episodes", "number of mixed episodes", "number of lifetime hospitalization", "number of attempted suicides", "being in a relationship", "diagnosis of cluster B personality disorder, any", and "treatment with mood stabilizers other than lithium".

In a multiple logistic regression model adjusted for relevant covariates (see Methods), none of these variables was significantly associated with the presence of AUD (Figure 1). The model explained 14.2% of the total variance.



#### *3.5. Patients with AUD+SUD vs. without SUD or AUD*

RF model performance outputs are reported in Table 2. The variables with the highest mean decrease in Gini were "number of total hypomanic episodes", "presence of hetero-directed aggressivity", "number of total affective episodes", "number of total depressive episodes", "violent suicide attempt", "number of total manic episodes", "being in a relationship", "number of lifetime hospitalization", "first episode as hypomanic", and "presence of melancholia".

In a multiple logistic regression adjusted for relevant covariates (see Methods), hypomania as the first affective episode (OR = 4.34 [95% CI = 1.42–13.31]; *p* = 0.01) and hetero-directed aggressivity (OR = 3.15 [95% CI = 1.48–6.74]; *p* = 0.003) were positively associated with AUD+SUD (Figure 1). The model explained 31.5% of the total variance in AUD+SUD.

#### *3.6. Patients with AUD+SUD vs. with AUD*

RF model performance outputs are reported in Table 2. The variables with the highest mean decrease in Gini were "number of total affective episodes", "number of total manic episodes", "number of total depressive episodes", "number of total hypomanic episodes", "mood disorders familiarity", "number of lifetime hospitalization", "number of total mixed episodes", "first episode as depressive", "presence of rapid-cycling", and "atypical depression".

These variables were considered in an adjusted multiple logistic regression. The presence of another SUD in the context of AUD was negatively associated with having depression as the first affective episode (OR = 0.41 [95% CI = 0.21–0.81]; *p* = 0.011) (Figure 1). The model explained 30.5% of the total variance.

**Figure 1.** Logistic regression plots of odds ratios (OR) and 95% confidence intervals (CI). Independent variables were selected from the top ten features derived from the random forest (RF) models. The four models predicted (from left to right): any substance use disorder (SUD) in the total sample, alcohol use disorder (AUD) in the total sample, AUD co-occurring with at least one other SUD in the total sample, and AUD co-occurring with at least one other SUD among BD patients with AUD.
