1. Introduction
Autism spectrum disorder (ASD) is a developmental condition accompanied by impairments in social communication and interaction, as well as restrictive and repetitive behaviours or interests [
1]. This includes differences in how the vocabularies of ASD infants develop compared to their non-ASD peers [
2,
3].
Some aspects of social communication and interaction have been shown to be affected in infants with ASD even prior to diagnosis. Infants who have an older sibling with ASD have a probability of approximately 20% of receiving a diagnosis themselves, which is elevated relative to the general population [
4]. The behavioural and cognitive profiles of these infants with an ‘elevated likelihood’ (EL) of ASD diagnosis can shed light on how the developmental condition manifests prior to diagnosis. In existing literature, while some studies have classified infants solely on the basis of their genetic background (i.e., prospectively), other studies classify infants after diagnosis (i.e., retrospectively). In prospective studies, two groups can be identified: elevated likelihood (EL) and typical likelihood (TL). In retrospective studies, infants can be classified as ASD, non-ASD, or as developing with other conditions such as language delay.
Vocabulary can be defined as the words that the infant understands (or is receptive to) and the words they are able to speak (or express) [
5]. Vocabulary development has been observed to be affected both in prospective and retrospective studies. In prospective studies, EL infants have been shown to have smaller expressive and/or receptive vocabulary compared to their TL peers [
6,
7,
8,
9,
10,
11,
12,
13,
14,
15].
However, there are mixed findings regarding the extent to which the expressive and receptive vocabulary development of EL and TL infants differs and (if so) in which age groups these differences are observable. For example, when comparing EL and TL receptive vocabularies, Iverson et al. [
6] reported significant differences in vocabulary scores of EL and TL infants by 11 months, whereas Nyström et al. [
16] did not find significant differences between the EL and TL groups for expressive or receptive vocabulary at 10 months. Similarly, other studies also found that EL infants did not differ in their receptive and expressive vocabulary sizes from TL peers [
17,
18].
There are a number of methodological differences that may factor into the mixed findings that we observe in the literature. One reason for mixed findings may be variations across studies in the age of testing. Across the first two years of life, there are changes in how large the differences are between the infant groups in their expressive and receptive vocabulary sizes. For example, longitudinal assessments of infants have found larger differences between older compared to younger EL and TL infant groups for both expressive and receptive vocabulary [
6]. Additionally, for expressive vocabulary, it has been observed that although the expressive vocabulary size of EL and TL infants is similar at 6 months of age, by 12 months, EL infants are observed to produce fewer words than their TL peers [
19]. In light of these findings, it is important to assess how age impacts group differences, both between EL and TL infants and between ASD and non-ASD infants, so that we can more clearly map the expressive and receptive vocabulary development of these groups.
Another factor that may impact group differences is the language measure that is used in a study. There are multiple standardised assessment tools available for evaluating the expressive and receptive vocabulary of infants in the first years of life. Two standardised assessments that are frequently used to assess the vocabulary of ASD and non-ASD infants are the MacArthur–Bates Communicative Development Inventory (CDI) [
5] and the Mullen Scales of Early Learning (MSEL); both are available in multiple languages [
20,
21]. Both the CDI and MSEL assess expressive and receptive vocabulary but in different ways. The CDI assesses infants’ vocabulary through a word checklist that is completed by parents. Depending on the age of administration, either the ‘Words and Gestures’ form (around 300 words) or the ‘Words and Sentences’ form (around 700 words) of the CDI is used. The MSEL assesses vocabulary by directly testing infants. It has an expressive scale made up of 28 items and a receptive scale made up of 33 items, which includes checks such as whether the infant ‘coordinates listening and looking’. The test is carried out by clinicians who are trained to assess expressive and receptive vocabulary. There are therefore multiple differences between the CDI and MSEL in how and by whom they are carried out, which can affect the language assessment that each provides.
The influence of language measures on vocabulary outcomes has been examined less frequently than the effect of age but may be worth considering for a few reasons [
22,
23]. Firstly, contextual factors, such as how tired or motivated a child is to participate, can have an impact on the vocabulary scores they receive. Contextual factors may have more impact on the scoring in the MSEL than the CDI, because the MSEL is completed in one sitting of 5–15 min, while the CDI can be completed outside of a fixed time frame and in multiple contexts (e.g., at home or school). Although previous studies have observed high correlations between the CDI and MSEL, these observations were made when assessing children [
22]. When assessing infants, correlations between the CDI and MSEL may be lower because contextual factors are shown to have a larger impact in younger children [
22]. Therefore, the scores that infants receive on questionnaires and assessments may vary more across contexts than the scores that children receive. This increased variability in the scores received across contexts may also impact how the scores of an infant compare when they are assessed on one assessment versus another, i.e., the CDI versus the MSEL.
A further reason why the language measure may affect the vocabulary outcomes that are observed is that clinicians and parents may differ in how reliably they can assess the infant. The CDI is completed by parents who do not receive standardised training on how to complete the questionnaire. In contrast, the MSEL is administered by clinicians who are trained to assess infant expressive and receptive vocabulary. Parents may, due to lack of training, be less able to reliably determine when their child understands and/or produces a word [
23,
24,
25]. On the other hand, parents may also be more acquainted with the words that their infants understand and speak than a clinician who interacts with the child for only a brief period of time. Consequently, we may expect differences between parental and clinician scorings of vocabulary due to differences in familiarity with the infant and in the training received. To discern the magnitude of the differences between EL and TL and between ASD and non-ASD infants’ vocabularies more accurately, we need to better understand the extent to which differences in vocabulary scores derive from the assessment tool used.
This paper aimed to systematically review and meta-analyse the existing literature on the expressive and receptive vocabulary sizes of elevated likelihood (EL) and ASD infants between 0 and 24 months of age. Studies comparing infants prospectively (i.e., comparing EL and TL infants) and studies comparing infants retrospectively (i.e., comparing infants with and without an ASD diagnosis) were both included in this review. Focusing on the earliest stages of vocabulary acquisition (i.e., from 0–24 months of age) allows us to compare the results of studies that classify infants prospectively versus retrospectively, because after 24 months of age, infants belonging to the ‘elevated’ and ‘typical’ likelihood groups often receive diagnoses as ASD or non-ASD. In comparing the results of retrospective and prospective studies, we aimed to further our understanding of how infants classified as ‘elevated likelihood’ (that is, infants with a genetic background of ASD but no diagnosis) differ in their receptive and expressive profiles from infants who go on to receive a diagnosis of ASD (infants with a genetic background and also a diagnosis). Pre-diagnostically, the EL group contains a wider variety of developmental profiles than the post-diagnostic group of infants who are diagnosed with ASD. Roughly 80% of EL infants receive no diagnosis or are diagnosed with a different developmental disorder from ASD. The relatively small proportion of infants in the EL group that receive an ASD diagnosis warrants investigation of whether the group differences observed pre-diagnostically between EL and TL infants are comparable to the group differences observed post-diagnostically between ASD and non-ASD infants.
In this paper, we focused on studies that compare the vocabularies of the infant groups using the CDI and the MSEL as their language measures. Although there are other measures which can be used to assess infants’ vocabulary, such as the Bayley Scales of Infant and Toddler Development [
26], we chose to focus on the CDI and MSEL because they are regularly used as language assessments for ASD and EL populations, both in individual studies and in larger infant cohorts [
27,
28,
29,
30]. Although both measures collect data on expressive and receptive vocabulary, their methodologies differ: whereas the CDI is a parent-completed questionnaire, the MSEL is clinician completed and involves infants completing a set of tasks. By focusing on these measures in the meta-analysis, we aimed to assess whether their differing methodologies impact the results that are obtained in the studies. This impact of methodology has previously been assessed for the CDI and MSEL, but not in 0–24-month-old infants [
22]. Gaining a better understanding of the impact of methodology on language outcomes could be important information for large infant cohorts that test infants on both the CDI and MSEL.
In this paper, we hypothesised that EL and ASD infants will generally have smaller expressive and receptive vocabulary sizes compared to their typical likelihood (TL) and non-ASD peers. We predict this effect to become more pronounced with age, with group differences between EL versus TL and also ASD versus non-ASD infants increasing with age [
31]. We also hypothesised that the method of language measure will moderate the group differences that are observed between EL versus TL infants and also between ASD versus non-ASD infants. We hypothesise this because it can be more challenging to reliably assess the vocabularies of younger compared to older children. Consequently, the method of language measure is more likely to impact the assessment that is made of the infant.
4. Discussion
This paper aimed to examine to what extent differences existed in the expressive and receptive vocabulary sizes of infants with ASD pre-diagnostically and post-diagnostically. Pre-diagnostically, infants at elevated likelihood (EL) for ASD were compared to infants at typical likelihood (TL) for ASD. Post-diagnostically, ASD infants were compared to non-ASD infants. A systematic review and meta-analyses were carried out, aiming to compile the existing empirical research on this topic.
For both the pre-diagnostic and post-diagnostic groups, it was observed that the elevated likelihood and ASD infants had smaller expressive and receptive vocabularies than their typical likelihood and non-ASD peers. This effect size was moderate pre-diagnostically and large post-diagnostically. There was also substantial heterogeneity when comparing both the pre-diagnostic and post-diagnostic groups. Consequently, age and language measure were assessed as moderators of the magnitude of the difference between the infant groups. In the pre-diagnostic infant groups, age was found to be a moderator of the effect size when comparing the receptive scores of the EL and TL infants. In the post-diagnostic groups, age was found to be a moderator of the effect size when comparing the expressive scores of ASD and non-ASD infants. In both instances, as the age of the infants increased, the difference between the atypical and typical groups increased. Language measure did not have an effect on the standardised mean difference between the infant groups, meaning that the size of the group differences did not depend on whether the CDI or MSEL was used, either pre- or post-diagnostically. These results are discussed in more depth below, with suggestions for future research directions.
4.1. Heterogeneity—Large Variability in the Effect Sizes across Studies
When comparing both pre- and post-diagnostic groups, substantial heterogeneity was observed in the effect sizes across studies. These findings were similar to those observed in other meta-analyses conducted previously on ASD populations in similar age groups [
3]. A large proportion of this variability came from ‘true’ effects as opposed to random variability between participants, suggesting that there are a number of factors that could influence the vocabulary sizes of the infants. For example, language background, i.e., the proportions of monolinguals versus multilinguals tested in a study, could be one such factor.
Due to the large heterogeneity, it was not possible to interpret the weighted mean effect size of all the studies. Instead, the lower and upper bounds of the 95% confidence intervals of the weighted mean effect size were interpreted. The lower bounds of the confidence intervals were positive (that is, above zero) in both pre- and post-diagnostic analyses. This indicated that the atypically developing group (EL or ASD) had lower vocabulary scores than their typically developing peers (TL or non-ASD). For the pre-diagnostic groups, this was a moderate effect (expressive: d = 0.27; receptive: d = 0.29), and for the post-diagnostic results, this was a large effect (expressive: d = 0.65; receptive: d = 0.60).
Infant group differences in vocabulary therefore appeared to be larger in post-diagnostic compared to pre-diagnostic groups. One explanation is that the group with elevated risk of ASD (pre-diagnosis) is heterogeneous; only some infants receive a diagnosis of ASD, while others do not, and the developmental profiles of the elevated likelihood infants with no diagnosis may be more similar to that of the typical likelihood infants [
6,
39]. In a study included in this meta-analysis, it was found that infants that are retrospectively diagnosed as ‘elevated likelihood no diagnosis’ do not differ substantially from infants that are classified prospectively as ‘typical likelihood’ [
6]. The majority of ‘elevated likelihood’ infants go on to be classified as ‘elevated likelihood no diagnosis’, meaning that there is likely to be a larger overlap in the language profiles of elevated and typical likelihood infants than in those of ASD and non-ASD infants. This may be why we observed a smaller group difference between the EL and TL infant groups than between the ASD and non-ASD groups.
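To make the quantities discussed in this section concrete, the following is a minimal Python sketch of random-effects pooling. It assumes the DerSimonian-Laird estimator purely for illustration; this is not necessarily the estimator or software used for the analyses reported in this paper. The function name `random_effects_pool` and all input values are hypothetical. The sketch returns the weighted mean effect size, its 95% confidence interval, the between-study variance (tau squared), and I squared, the percentage of variability attributed to between-study differences rather than sampling error.

```python
import math


def random_effects_pool(effects, variances):
    """Pool standardised mean differences (hypothetical helper, DerSimonian-Laird)."""
    k = len(effects)
    w = [1.0 / v for v in variances]              # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    fixed = sum(wi * d for wi, d in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau2 to each study's sampling variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * d for wi, d in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "tau2": tau2,
        "i2": i2,
        "q": q,
    }


# Invented effect sizes (d) and sampling variances for three imaginary studies
example = random_effects_pool([0.10, 0.45, 0.80], [0.02, 0.03, 0.05])
print(example)
```

Under this sketch, a lower confidence bound (`ci_low`) above zero corresponds to the conservative reading used above, in which the lower bound of the interval is interpreted instead of the point estimate.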
4.2. Does Age Moderate the Effect Size, or the Mean Difference, of the Expressive and Receptive Vocabulary Size of the Infant Groups?
Pre-diagnostically. The moderator analysis revealed that, for expressive vocabulary, age was not a significant moderator of effect size. However, for receptive vocabulary, age was a significant moderator of effect size. As the age of the infants increased, the effect size of the difference between the EL and TL infants’ receptive vocabulary size increased. We checked whether age was equally distributed in studies that looked at expressive versus receptive vocabulary. An independent t-test showed that the age groups assessed were comparable. Our finding is in line with studies that have implemented other questionnaires to index language: as the infants get older, the two groups increasingly diverge from each other with regard to receptive vocabulary [
40].
There are several possibilities as to why we observe that age moderates group differences in receptive vocabulary but not expressive vocabulary. One interpretation is that the reliability of assessment is lower for receptive vocabulary than for expressive vocabulary. There could be a number of reasons for this. The parental ratings of vocabulary could be affected by parents’ knowledge of their child’s elevated likelihood status. Since the majority of the elevated likelihood infants are classified as typically developing at 24 or 36 months, differences that are observed between the EL and TL groups may be attributable to how parents rate the child. Biases in assessment are more likely to affect receptive vocabulary ratings, which tend to have a lower reliability over time [
23,
41,
42]. This may be why age moderates differences between the groups in receptive vocabulary size but not expressive vocabulary size.
Additionally, differences in the linguistic environment of the EL versus TL infants may also lead to differences in their vocabulary outcomes. Infants are classified as elevated likelihood because they have an older sibling with a diagnosis of ASD. We may expect familial dynamics to be different in these families, where one child has a developmental disorder, compared to the families of typical likelihood infants, who have an older sibling with no diagnosis. For example, previous research has shown that parental stress is higher in families where a child has a diagnosis of ASD [
43]. This increased parental stress has been shown to affect how parents assess some aspects of their child’s development in child studies [
44]. Although the previously mentioned study did not find stress to affect ratings of expressive and receptive vocabulary, it should be noted that this study assessed parents of older children. It could also be that infants who grow up with a sibling with ASD model their behaviour on that of their sibling and show less ostensive reactions of word understanding. This may be why age moderates differences between the groups in receptive vocabulary size but not expressive vocabulary size.
Post-diagnostically. The moderator analysis revealed that age was not a significant moderator of effect size for receptive vocabulary. However, for expressive vocabulary, age was a significant moderator of effect size. As the age of the infants increased, the standardised mean difference between the ASD and non-ASD infants’ expressive vocabulary size increased. An independent t-test revealed that the age groups in which expressive and receptive vocabulary were assessed were comparable.
An interpretation of this could be that only expressive and not receptive vocabulary is impacted in EL-ASD infants and that this was not clear when looking at the pre-diagnosis groups because the EL group also contains other developmental subgroups (such as EL-no diagnosis). Expressive vocabulary development may be more affected by the motor-related impairments that are observable in ASD children [
45,
46]. The motor-related difficulties that are present in children diagnosed with ASD start to affect their word production abilities prior to the age of 24 months [
47]. In a study by Leonard et al. [
47], the motor delays of infants that later received a diagnosis of ASD were found to predict their expressive but not receptive scores. Infants with a diagnosis of ASD may therefore differ from their non-ASD peers in their expressive but not their receptive vocabulary scores.
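The logic of the age moderator analyses discussed in this section can be illustrated with a simple inverse-variance weighted meta-regression of effect size on age. The sketch below is a fixed-effect simplification assumed for illustration only: a mixed-effects meta-regression would additionally add an estimate of between-study variance to each study's sampling variance, and the sketch is not a reproduction of the analyses reported here. The function name `age_meta_regression` and all data values are hypothetical.

```python
import math


def age_meta_regression(ages, effects, variances):
    """Weighted least-squares regression of effect size on age (hypothetical helper)."""
    w = [1.0 / v for v in variances]              # inverse-variance weights
    sum_w = sum(w)
    mean_age = sum(wi * a for wi, a in zip(w, ages)) / sum_w
    mean_eff = sum(wi * d for wi, d in zip(w, effects)) / sum_w
    sxx = sum(wi * (a - mean_age) ** 2 for wi, a in zip(w, ages))
    sxy = sum(wi * (a - mean_age) * (d - mean_eff)
              for wi, a, d in zip(w, ages, effects))
    slope = sxy / sxx                              # change in effect size per month of age
    intercept = mean_eff - slope * mean_age
    se_slope = math.sqrt(1.0 / sxx)                # valid when sampling variances are treated as known
    z = slope / se_slope
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided p-value from the normal distribution
    return {"slope": slope, "intercept": intercept,
            "se_slope": se_slope, "z": z, "p": p}


# Invented data: mean age in months, effect size, and sampling variance per study
fit = age_meta_regression([6, 12, 18, 24], [0.05, 0.20, 0.35, 0.60],
                          [0.03, 0.02, 0.02, 0.04])
print(fit)
```

A positive slope in this sketch corresponds to the pattern described above, in which the group difference grows as the infants get older.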
4.3. Does Language Measure Moderate the Effect Size, or the Mean Difference, of the Expressive and Receptive Vocabulary Size of the Infant Groups?
Pre-diagnostically and post-diagnostically. No significant differences were found in the effect size depending on whether the CDI or the MSEL was used. This indicates that the magnitude of the difference between the two groups was not influenced by how their expressive or receptive vocabulary was assessed. Both in prospective and retrospective populations, the EL and ASD groups had smaller vocabularies than their TL and non-ASD peers, respectively. This is in line with existing research that has compared the CDI and MSEL scores of children [
22]. In addition, studies that have implemented other measures, such as the Reynell Developmental Language Scales, have found similar effects in the 0–24 months age groups: the language scores of ASD infants are significantly lower than those of their typical likelihood or elevated likelihood no diagnosis peers [
48].
Therefore, factors such as the parents’ more extensive experience with the child do not appear to impact how accurately expressive vocabulary is assessed. Additionally, the time frame and the environment in which the assessment is done do not appear to have an impact on the effect sizes. Interestingly, when studies with an effect size of three or above were included in the meta-analysis, language measure was found to be a significant moderator of effect size. Pre-diagnostically, for receptive language, it was observed that the mean differences between the EL and TL groups were larger when they were assessed with the CDI compared to the MSEL. A statistical reason for this could be that the CDI is much longer than the MSEL, so the variance in CDI scores may be larger as a result. Another reason for the mean differences between groups being larger with the CDI could be the larger variability in how parents assess their children’s vocabulary. Whereas clinicians are required to test infants’ vocabularies in a standard format, parents rely only on their previous experience with their child. This could have led to larger variations in the scores that parents assign their infants within the CDI when compared with the scores assigned by clinicians on the MSEL. Nevertheless, we need to treat these results with caution, as these findings emerged only when we included studies with very large effect sizes.
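A categorical moderator such as language measure can be examined by pooling the effect sizes within each subgroup and testing the difference between the two pooled estimates. The sketch below shows one simple way of doing this, assuming fixed-effect pooling within each subgroup and a z-test on the difference; it is an illustration under those assumptions and not the procedure used in this paper. The function names `pool_fixed` and `subgroup_difference` and all values are hypothetical.

```python
import math


def pool_fixed(effects, variances):
    """Inverse-variance pooled estimate and its variance (hypothetical helper)."""
    w = [1.0 / v for v in variances]
    estimate = sum(wi * d for wi, d in zip(w, effects)) / sum(w)
    return estimate, 1.0 / sum(w)


def subgroup_difference(group_a, group_b):
    """z-test for the difference between two independent pooled estimates."""
    est_a, var_a = pool_fixed(*group_a)
    est_b, var_b = pool_fixed(*group_b)
    diff = est_a - est_b
    z = diff / math.sqrt(var_a + var_b)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value
    return {"difference": diff, "z": z, "p": p}


# Invented (effects, variances) pairs for studies using each measure
cdi_studies = ([0.40, 0.55, 0.30], [0.04, 0.05, 0.03])
msel_studies = ([0.25, 0.35, 0.20, 0.30], [0.02, 0.03, 0.02, 0.04])
print(subgroup_difference(cdi_studies, msel_studies))
```

The test assumes that the two subgroups contain independent samples, an assumption that would not hold for studies that administered both measures to the same infants.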
4.4. Limitations
The meta-analysis has some limitations. First, in our meta-analyses, we treated all collected effect sizes as independent effects, whereas it is unlikely that this is true. There were many studies which yielded multiple effect sizes, collected at various ages or via different methods (i.e., both CDI and Mullen). For instance, Landa and colleagues [
49] followed infants with or without elevated risk of ASD and measured their vocabulary sizes at multiple time points (6, 12, and 24 months). Indeed, the majority of studies assessed vocabulary at multiple ages: of the 57 prospective studies on vocabulary development, 54% (31/57) measured vocabulary more than once. Other studies sampled vocabulary within the same children both via the CDI and via the MSEL. There were nine prospective studies that reported both outcomes, such as Tran et al. [
50]. In both cases, effect sizes were obviously related as they correspond to the same set of children. A third reason as to why some effect sizes are possibly related to each other is that across studies, some (parts of) datasets might have been used multiple times. Finding and testing infants at elevated likelihood of ASD is difficult, time-consuming, and costly, which is why researchers from different research sites often team up to collect data sets large enough to draw valid conclusions [
4,
27]. The difficulty of finding participants is also the reason why some studies first publish data prospectively and, once children’s final diagnoses are known, also retrospectively, zooming in on those infants who are either typically developing or diagnosed with ASD. As a result, while our meta-analyses assume that all effect sizes are independent from each other, the reality is that many effect sizes are related to each other, which raises questions about the generalisability of our results and may obscure true effects. Nevertheless, results indicate significant effects when we conservatively focus on the lower bound of the confidence intervals. Thus, our results suggest that across studies, there is reason to believe that infants at elevated likelihood of ASD develop smaller vocabularies compared to their peers.
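One common way of handling the dependence described in this paragraph is to combine the effect sizes that come from the same sample into a single composite before pooling. The sketch below illustrates this approach, which was not applied in the present meta-analyses. It assumes a single correlation `r` between effect sizes from the same study; this value is rarely reported and is an assumption to be varied in sensitivity checks. The function name `aggregate_within_study`, the study labels, and all values are hypothetical.

```python
import math
from collections import defaultdict


def aggregate_within_study(rows, r=0.5):
    """Average dependent effect sizes per study (hypothetical helper).

    rows: iterable of (study_id, effect_size, variance).
    r: assumed correlation between effect sizes from the same study.
    """
    by_study = defaultdict(list)
    for study_id, d, v in rows:
        by_study[study_id].append((d, v))

    composites = {}
    for study_id, items in by_study.items():
        m = len(items)
        mean_d = sum(d for d, _ in items) / m
        var_sum = sum(v for _, v in items)
        # Covariance terms for every ordered pair of distinct effect sizes
        cov_sum = sum(r * math.sqrt(vi * vj)
                      for i, (_, vi) in enumerate(items)
                      for j, (_, vj) in enumerate(items) if i != j)
        composites[study_id] = (mean_d, (var_sum + cov_sum) / m ** 2)
    return composites


# Invented example: study "A" contributes two time points, study "B" one
rows = [("A", 0.20, 0.04), ("A", 0.40, 0.04), ("B", 0.50, 0.09)]
print(aggregate_within_study(rows, r=0.5))
```

Setting `r` to 0 reproduces the independence assumption, whereas values closer to 1 give the composite a larger variance and therefore less weight, which shows how treating related effect sizes as independent can overstate precision.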
Another limitation of our study is that, while we observed heterogeneous effect sizes, the ages and methods sampled were not perfectly balanced. To illustrate, there was an imbalance in the number of papers that used each assessment type. In total, 75% of studies used the MSEL, whereas 16% used the CDI, and a further 9% of papers tested infants on both assessment types. The majority of the papers therefore used the MSEL rather than the CDI. Additionally, there were few (if any) studies of infants younger than 5 months of age because these time points precede what is considered the earliest stages of infants’ vocabulary comprehension [
5,
20].
Another limitation was that some potentially confounding factors were not controlled for when searching for and excluding papers. These include the socioeconomic status (SES) and the language background of the infants. A number of studies did not report the SES of their participants. However, SES has frequently been shown to influence the receptive and expressive vocabulary scores of infants [
51]. Language background could have also influenced the receptive and expressive vocabulary scores observed in this study. For example, a child raised in a multilingual compared to monolingual home may have had lower receptive or expressive skills in the tested language not due to developmental classification (e.g., EL or TL) but due to the frequency of exposure they had in that language. Some studies have shown that children raised in multilingual homes at certain developmental timepoints may lag behind their peers [
52]. Thus, we could not examine other possible moderators that could explain the heterogeneity in effect sizes. More research on this is needed.
4.5. Future Research Directions
The large heterogeneity in effect sizes suggests that there are other factors contributing to the variance in the study results. Future research could therefore assess the factors leading to the heterogeneity of the effect sizes, including the SES and the language background of the infants.
Furthermore, although language measure was not found to moderate effect size, this could be attributable to studies more frequently testing older infants. Older infants are more often tested on their ability to understand and produce words, whereas younger infants are more often tested on their ability to understand words. It therefore remains unclear whether language measure may moderate group differences between elevated likelihood/ASD and typical likelihood/non-ASD infants when looking at younger infants, whose vocabularies are made up of a larger proportion of words that are understood but not yet produced. Future studies could test empirically whether language measure moderates the vocabulary scores in these younger infant groups.