#### DSM-5 [1]

Patients were diagnosed with pathological gambling if they met DSM-IV-TR criteria [26]. We used DSM-IV-TR criteria because the 8th criterion explores the presence of gambling-related illegal acts. Notably, with the release of the DSM-5 [1], the term "pathological gambling" was replaced with "GD". All patient diagnoses were reassessed and recoded post hoc to avoid the confounding effect of increased GD severity in patients with a criminal history. Accordingly, only patients who met DSM-5 criteria for GD were included in the present study. The internal consistency in our study sample was α = 0.818.

#### South Oaks Gambling Screen (SOGS) [31]

The SOGS is a 20-item diagnostic questionnaire that assesses GD severity. It discriminates between probable pathological, problem, and non-problem gamblers. The Spanish validation of this tool shows high reliability and validity [32], with excellent test–retest reliability (R = 0.98, *p* < 0.01) and internal consistency (Cronbach's α = 0.94). In our study sample, this questionnaire achieved adequate internal consistency (α = 0.734).
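As an illustration of the internal consistency coefficient reported for each instrument, the following minimal sketch computes Cronbach's α from a respondents-by-items score matrix using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy responses are hypothetical and are not taken from the study data or from the software used by the authors.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the total (summed) score across respondents
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 5 respondents answering 3 items
toy_responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
    [3, 3, 2],
]
alpha = cronbach_alpha(toy_responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together, which is the sense in which the coefficients in this section are described as adequate, good, or excellent.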

#### Symptom Checklist-Revised (SCL-90-R) [33]

This questionnaire assesses a broad range of psychological problems and psychopathological symptoms. It contains 90 items measuring nine primary symptom dimensions, and it also yields a global score, the Global Severity Index (GSI), which is a widely used index of psychopathological distress. The Spanish validation obtained good psychometric properties, with a mean internal consistency of 0.75 (Cronbach's alpha) [34]. The internal consistency estimated in the study sample for the global scale was excellent (α = 0.98). For the subscales, the values were α = 0.891 for somatization, α = 0.896 for obsession-compulsion, α = 0.877 for interpersonal sensitivity, α = 0.917 for depression, α = 0.895 for anxiety, α = 0.873 for hostility, α = 0.832 for phobic anxiety, α = 0.798 for paranoid ideation, and α = 0.855 for psychoticism.

#### Impulsive Behavior Scale (UPPS-P) [35]

This questionnaire assesses five dimensions of impulsive behavior through self-report on 59 items: lack of premeditation, lack of perseverance, sensation-seeking, negative urgency, and positive urgency. The Spanish adaptation showed good reliability (Cronbach's α between 0.79 and 0.93) and external validity [36]. In our sample, overall internal consistency was α = 0.923, with subscale values of α = 0.854 for negative urgency, α = 0.917 for positive urgency, α = 0.818 for lack of premeditation, α = 0.754 for lack of perseverance, and α = 0.866 for sensation-seeking.

#### Temperament and Character Inventory-Revised (TCI-R) [37]

The TCI-R is a 240-item self-report questionnaire that measures seven personality dimensions: four temperament dimensions (novelty-seeking, harm avoidance, reward dependence, and persistence) and three character dimensions (self-directedness, cooperativeness, and self-transcendence). We used the Spanish version, which showed adequate internal consistency (mean Cronbach's α of 0.87) [38]. In the present study, internal consistency ranged from adequate (α = 0.701 for reward dependence, α = 0.726 for novelty-seeking, α = 0.745 for harm avoidance, and α = 0.772 for cooperativeness) to good (α = 0.819 for self-transcendence, α = 0.846 for self-directedness, and α = 0.862 for persistence).

#### Other sociodemographic and clinical variables

Additional sociodemographic and clinical variables related to gambling, as well as substance use and psychiatric comorbidities, were assessed by means of a semi-structured face-to-face clinical interview described elsewhere [27]. Socioeconomic status was obtained using the Hollingshead Factor Index, based on the educational attainment and occupational prestige domains [39]. Gambling-related crimes were explored through a face-to-face interview designed for this study by two forensic experts in the field.

#### *2.4. Statistical Analysis*

Stata 17 for Windows was used for statistical analysis [40]. Analysis of variance (ANOVA) was used to compare quantitative variables between the groups, and chi-square tests (χ²) to compare categorical variables. For these comparisons, effect sizes were estimated with the standardized Cohen's *d* for mean differences and Cramér's phi (ϕ) for proportion differences. In addition, Finner's correction was used to control the potential increase in Type-I error due to multiple null-hypothesis tests (the Finner method is an alternative to the classic Bonferroni method) [41].
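To make the effect size measures and the multiplicity correction concrete, the sketch below gives plain-Python versions of Cohen's *d* (with a pooled standard deviation), Cramér's phi computed from a contingency table, and Finner-adjusted *p*-values in their usual step-down form, 1 − (1 − p₍ᵢ₎)^(m/i) with monotonicity enforced. The analyses in this study were run in Stata. The function names and the example numbers here are hypothetical and only illustrate the formulas, under the assumption that these standard definitions match the ones applied.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(x, y):
    """Standardized mean difference using the pooled sample standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def cramers_phi(table):
    """Cramer's phi/V from a contingency table given as a list of rows of counts."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    # Pearson chi-square statistic (no continuity correction)
    chi2 = sum(
        (table[i][j] - row_tot[i] * col_tot[j] / n) ** 2 / (row_tot[i] * col_tot[j] / n)
        for i in range(len(row_tot))
        for j in range(len(col_tot))
    )
    return sqrt(chi2 / (n * min(len(row_tot) - 1, len(col_tot) - 1)))


def finner_adjust(p_values):
    """Finner step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        adj = 1 - (1 - p_values[idx]) ** (m / rank)
        running_max = max(running_max, adj)  # keep adjusted values non-decreasing
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical example values
d = cohens_d([2, 4, 6], [1, 3, 5])
phi = cramers_phi([[10, 0], [0, 10]])
adj_p = finner_adjust([0.01, 0.04, 0.03])
print(d, phi, adj_p)
```

Compared with Bonferroni, which multiplies every *p*-value by the number of tests, the Finner adjustment becomes less severe as the rank of the *p*-value increases, which is why it is generally less conservative.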

The Kaplan-Meier product-limit estimator was used to obtain the cumulative survival curves for dropout and relapse, and the log-rank test (Mantel-Cox procedure) was used to compare the resulting functions between the groups [42].
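The survival procedures can be sketched as follows: a minimal product-limit estimator, which multiplies the conditional survival probabilities (1 − d/n) at each event time, and a two-group log-rank chi-square statistic (1 degree of freedom) built from observed minus expected events. This is an illustrative plain-Python version, not the Stata routine used for the analyses. The function names, the 0/1 event coding (1 = dropout or relapse observed, 0 = censored), and the toy follow-up times are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed  # events and censored cases both leave the risk set
    return curve


def logrank_chi2(times1, events1, times2, events2):
    """Two-group log-rank (Mantel-Cox) chi-square statistic."""
    pooled = list(zip(times1, events1)) + list(zip(times2, events2))
    event_times = sorted({t for t, e in pooled if e})
    obs_minus_exp = 0.0
    var = 0.0
    for t in event_times:
        n1 = sum(1 for x in times1 if x >= t)
        n2 = sum(1 for x in times2 if x >= t)
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of the number of events in group 1
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var


# Hypothetical follow-up times (e.g., weeks) and event indicators
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
chi2 = logrank_chi2([1, 2], [1, 1], [3, 4], [1, 1])
print(curve, chi2)
```

Censored cases contribute to the risk set up to their last observed time but never lower the curve, which is what allows patients still in treatment at the end of follow-up to be included.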
