*2.4. Data Analyses*

This section first describes the video-coding analysis, followed by the reliability analysis and the three-tiered method of analysis.

#### 2.4.1. Video-Coding Analysis

Video coding was performed with Noldus Observer XT 14.0 (Noldus Information Technology BV, Wageningen, The Netherlands), software for behavioral research that codes and visualizes behaviors on a timeline with millisecond accuracy. Figure 2 shows an example visualization of moves and communicative functions on a timeline for one participant. Each video was coded continuously, meaning behaviors were coded whenever they appeared and were stored in the software. The software's built-in statistical analyses calculated the rate per minute of each specified communicative interaction behavior in the dyads. Even with the software's assistance, coding and rechecking each video took about six to eight hours due to the complexity of the coding scheme and the characteristics of the target group.
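The rate-per-minute measure itself is straightforward to reproduce outside the software; the minimal sketch below assumes a hypothetical list of event labels and a known fragment duration (the actual computation was done inside Observer XT):

```python
from collections import Counter

def rates_per_minute(events, duration_seconds):
    """Rate per minute of each coded behavior in one video fragment.

    events: one label per coded occurrence (hypothetical labels below)
    duration_seconds: length of the observed fragment
    """
    minutes = duration_seconds / 60.0
    return {behavior: count / minutes
            for behavior, count in Counter(events).items()}

# E.g., three 'request' moves and one 'comment' in a 2-minute fragment:
print(rates_per_minute(["request", "request", "comment", "request"], 120))
# -> {'request': 1.5, 'comment': 0.5}
```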

**Figure 2.** An example of video coding and visual analysis of moves and communicative functions. Note: screenshot of output from Noldus Observer XT 14.0.

#### 2.4.2. Reliability

Inter-rater reliability was determined by a second rater using the kappa statistic on 10% of each video (i.e., a randomly selected one-minute interval) as a reliability check [34]. Due to the idiosyncratic nature of the communicative behaviors of children/youths with complex needs, videos from all dyads were included in the reliability check, since each fragment was connected with specific challenges.
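As an illustration of the sampling step only, a one-minute interval can be drawn uniformly at random from a video of known length; the sketch below is an assumed implementation, not the authors' documented procedure:

```python
import random

def sample_one_minute(duration_seconds, seed=None):
    """Pick a random one-minute window (start, end), in seconds, from a
    video of the given length, for the inter-rater reliability check."""
    rng = random.Random(seed)
    start = rng.uniform(0, duration_seconds - 60)
    return start, start + 60

# E.g., a random minute from a ten-minute video (10% of the material):
print(sample_one_minute(600, seed=1))
```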

The second rater had a background in speech-language pathology and was experienced in video coding of communication in children with severe disabilities. Before coding started, she received 12 h of training and practice in video coding with the first author, online or face-to-face. Following the training, the second rater conducted pilot coding on two videos; the results were cross-examined and any discrepancies were discussed to clarify the inconsistent coding. After reaching a consensus on the definitions of each code, the second rater independently double-coded all one-minute videos based on the coding guidelines. To avoid judgment bias due to the random selection of each video fragment, the second rater checked the preceding communicative behaviors before assigning a code.

Inter-rater reliability was checked for three categories: moves, communicative functions, and modes of communication. Cohen's kappa was calculated to estimate the degree of consensus between raters [35]; a value above 0.75 indicates good agreement, a value between 0.40 and 0.75 acceptable agreement, and a value below 0.40 low agreement [35,36]. The average kappa values were acceptable for moves (k = 0.72 and 0.64 in the EGAT and NEGAT conditions, respectively), acceptable to good for communicative functions (k = 0.85 and 0.74), and good for modes of communication (k = 0.98 and 0.89).
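For reference, Cohen's kappa corrects the observed agreement between raters for agreement expected by chance; in standard form:

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

where $p_o$ is the observed proportion of agreement and $p_e$ is the chance-expected proportion. For instance, with 86% observed agreement and 50% chance agreement (illustrative numbers only), $\kappa = (0.86 - 0.50)/(1 - 0.50) = 0.72$. In practice the statistic can be computed directly from two raters' aligned code sequences, e.g., with scikit-learn (a sketch with hypothetical codes, not the study data):

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical aligned codes from two raters for the same five events
rater_1 = ["request", "comment", "request", "protest", "comment"]
rater_2 = ["request", "comment", "request", "comment", "comment"]

print(cohen_kappa_score(rater_1, rater_2))  # -> approximately 0.67
```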
