*4.2. Test-Retest Reliability, Practice Effects, and Validity*

Results regarding the reliability and validity of the evaluated measures were mixed. Test-retest reliability was strongest for the SRS-2 subscales, providing evidence that caregivers report social behavior consistently. Social Awareness had the lowest ICC among the SRS-2 subscales (0.70), which was in the moderate range but close to the threshold for the "good" category (0.75). Although moderate test-retest reliability was found for NEPSY-II Theory of Mind raw scores, NEPSY-II Affect Recognition raw scores demonstrated poor reliability. Inconsistent scores across the two-week test-retest interval on the NEPSY-II Affect Recognition suggest that children may be guessing or acquiescing in their responses. Practice effects were negligible for all evaluated measures, both the NEPSY-II subtests and the SRS-2. The lack of improvement at the second study time point suggests that the measures were stable across repeated administrations over a relatively short period.
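As an illustration of how the ICC categories referred to above can be applied, the following minimal sketch maps an ICC value onto a descriptive band. Only the 0.75 boundary for "good" is taken from the text; the 0.50 and 0.90 cutoffs and the function name `icc_band` are assumptions based on commonly used interpretive guidelines, not values reported in this study.

```python
def icc_band(icc: float) -> str:
    """Return a descriptive reliability band for an ICC value.

    The 0.75 cutoff for "good" follows the text; the 0.50 and 0.90
    cutoffs are assumed from commonly used guidelines.
    """
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.90:
        return "good"
    return "excellent"


# SRS-2 Social Awareness ICC reported in the text
print(icc_band(0.70))
```

Under these assumed cutoffs, an ICC of 0.70 falls in the moderate band, five hundredths below the boundary for "good".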

The investigation of convergent validity found no association between parent reports of social cognition/awareness and direct assessments of social cognition. These test modalities may be tapping different constructs or skills, as laboratory-based assessments differ clearly from parent-report measures. Therefore, while the NEPSY-II Theory of Mind shows some good psychometric properties, we need to consider what it is measuring. It may be the case that standardized clinical assessments of theory of mind do not represent parental perceptions of a child's daily abilities in social awareness and understanding. The NEPSY-II Theory of Mind may also have poor ecological validity. Further, the NEPSY-II Theory of Mind is moderately correlated with receptive and expressive language, and overall language abilities may be confounding performance, as has occurred in previous studies [21]. Another plausible interpretation is that the social abilities reported by parents are truly different skills from those assessed in the laboratory. Although there were no associations between the NEPSY-II and SRS-2, there were significant associations among the SRS-2 subscales, which parallels the significant correlations previously reported among SRS subscales in DS [19].

Associations with broader developmental domains varied across measures. First, both NEPSY-II subtests and SRS-2 subscales had significant correlations with cognitive ability in the expected directions, such that higher cognitive abilities were associated with better social cognition and fewer social behavior challenges. Associations between the SRS-2 subscales and cognitive abilities have not been consistently found in previous investigations of the SRS-2 and nonverbal IQ in DS [5,19], but this study does replicate a moderate association found between cognition and SRS Total T-scores [19], despite using different IQ measures. Correlations between the NEPSY-II subtests and ABIQ were markedly stronger than those between the SRS-2 and ABIQ. This reinforces the idea that direct assessments may be tapping similar skills, which are fundamentally different from the behaviors and performance observed by parents in the home environment. Both NEPSY-II subtests were positively correlated with the expressive language measures, but only the NEPSY-II Theory of Mind subtest was associated with receptive language. This highlights the receptive language demands of completing the NEPSY-II Theory of Mind. SRS-2 Social Awareness was the only subscale associated with receptive language, which deviates from previous reports of a significant association between all SRS subscales and receptive vocabulary [19]. This study also replicated previous reports of no correlation between the SRS-2 subscales and expressive language [5]. Finally, associations with chronological age were minimal, corroborating previous reports of a lack of association between age and the SRS [5,19] and suggesting that developmental level is a better indicator of social cognition and behavior than age.
