*2.6. Outcomes*

*Children's Sleep Habits Questionnaire (CSHQ)* [34]. This parent-rated questionnaire has been used and validated in multiple studies of ASD [35–39]. It comprises 33 scored questions and additional items intended to provide other relevant information on sleep behavior. Each scored question is rated on a 3-point scale, as occurring 'usually' (i.e., 5–7 times within the past week), 'sometimes' (i.e., 2–4 times within the past week), or 'rarely' (i.e., never or 1 time within the past week). A higher score reflects more significant sleep disturbances. Items are combined to form the following 8 subscales: bedtime resistance, sleep onset delay, sleep duration, sleep anxiety, night waking, parasomnias, sleep-disordered breathing, and daytime sleepiness. A total score is calculated as the sum of all CSHQ scored items and can range from 33 to 99. A total score of 41 or above indicates a pediatric sleep disorder, as this cutoff has been shown to correctly identify 80% of children with a clinically diagnosed sleep disorder [34]. Parents were instructed to answer questions regarding their child's sleep during a typical recent week. The questionnaire was completed at the onset and end of each treatment period. Completed CSHQ questionnaires were excluded from analysis if more than 20% of the data were missing.
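The CSHQ scoring and exclusion rules described above can be illustrated with a minimal Python sketch. It is not the study's analysis code. The function names are hypothetical, each item is assumed to be already coded 1–3 so that a higher value means more disturbance, and the 20% missing-data threshold is assumed to apply to the 33 scored items. The text does not describe how questionnaires with some, but no more than 20%, missing items were scored, so the sketch returns a total only for complete questionnaires.

```python
from typing import Optional, Sequence

N_SCORED_ITEMS = 33          # scored CSHQ questions
CUTOFF = 41                  # total score of 41 or above indicates a sleep disorder
MAX_MISSING_FRACTION = 0.20  # exclude if more than 20% of the data are missing


def is_excluded(responses: Sequence[Optional[int]]) -> bool:
    """Return True if more than 20% of the scored items are missing (None)."""
    if len(responses) != N_SCORED_ITEMS:
        raise ValueError(f"expected {N_SCORED_ITEMS} scored items")
    n_missing = sum(r is None for r in responses)
    return n_missing / N_SCORED_ITEMS > MAX_MISSING_FRACTION


def cshq_total(responses: Sequence[Optional[int]]) -> Optional[int]:
    """Sum the 33 scored items (each 1-3); possible totals run from 33 to 99.

    Returns None when any item is missing, because the handling of partially
    missing questionnaires is an open assumption here, not a rule from the text.
    """
    if len(responses) != N_SCORED_ITEMS:
        raise ValueError(f"expected {N_SCORED_ITEMS} scored items")
    if any(r is None for r in responses):
        return None
    if any(r not in (1, 2, 3) for r in responses):
        raise ValueError("each scored item must be rated 1, 2, or 3")
    return sum(responses)


def meets_cutoff(total: int) -> bool:
    """Return True if the total score is at or above the cutoff of 41."""
    return total >= CUTOFF
```

With 33 scored items, the 20% threshold corresponds to 6.6 items, so under these assumptions a questionnaire with 7 or more missing items would be excluded and one with 6 or fewer would be retained.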

*Clinical Global Impression–Improvement scale (CGI-I)* [40] was used to measure the improvement in disruptive behaviors from baseline. Scores range from 1 (very much improved) through 4 (unchanged) to 7 (very much worse). Scores of 1 (very much improved) or 2 (much improved) were defined as a positive response, and all other scores as a negative response [40]. CGI-I was assessed at the end of each treatment period. Anchoring instructions were used to rate improvement in behavioral difficulties on the CGI-I, rather than improvement in overall ASD symptoms. The same clinician (AA) assessed and rated the CGI-S and CGI-I of all participants. Notably, while the CGI-S and CGI-I were developed to assess 'overall function', we used anchor points that were 'domain-specific' for disruptive behavior.
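The responder definition above reduces to a simple threshold on the CGI-I score. The following Python sketch illustrates it under that reading. The function name is hypothetical and the code is not taken from the study.

```python
def is_cgi_i_responder(score: int) -> bool:
    """Classify a CGI-I rating as a positive (True) or negative (False) response.

    Ratings run from 1 (very much improved) through 4 (unchanged) to
    7 (very much worse). Ratings of 1 or 2 count as a positive response.
    """
    if score not in range(1, 8):
        raise ValueError("CGI-I score must be an integer from 1 to 7")
    return score <= 2
```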

*Social Responsiveness Scale (SRS-2)* [41]. This 65-item caregiver questionnaire quantifies autism symptom severity (total scores range from 0 to 195, with higher scores indicating greater severity). The questionnaire was completed at the onset and end of each treatment period.
