1. Introduction
Closing the achievement gap between students with and without disabilities requires special education practitioners to implement efficient and effective educational practices with fidelity. Preparing career-ready preservice special education teachers therefore requires contextualizing, in authentic classroom settings, the implementation of practices empirically shown to improve students’ learning and behavior (
Sawyer et al., 2017;
Szocik et al., 2024). Despite legal and stakeholder impetus calling for the use of evidence-based practices in the classroom, a research-to-practice gap persists between knowing what works and using it with fidelity (
Cook & Odom, 2013;
IDEA, 2004;
Wang et al., 2024). Closing that gap requires researchers to determine the best approaches to train special education practitioners, using strategies that are both effective and efficient (
Brock et al., 2017). Faculty tasked with training teachers in the use of evidence-based practices with students with disabilities must be equipped with evidence-based practices of their own in order to successfully bridge research to practice.
Fortunately, our understanding of evidence-based training procedures has grown, particularly since 2007 (
Brock et al., 2017). One empirically established training method, behavior skills training (BST), has consistently proven effective in helping special education practitioners perform skills (
Brock et al., 2017;
Li & Alber-Morgan, 2024;
Sawyer et al., 2017). BST facilitates active skill learning through four components: (a) instructions, (b) modeling, (c) role-play, and (d) feedback (
Sawyer et al., 2015). Using a BST package, the trainer provides step-by-step instructions for performing the skill and models each step. The trainee then practices the skill via role-play (also referred to as rehearsal), receiving praise for steps performed correctly and corrective feedback as needed. The elements of BST are conducive to 1:1 training because they are recursive in nature, allowing trainers to make real-time, data-based decisions about how best to proceed as individuals develop and demonstrate skill competence (
Kranak et al., 2018).
BST has a track record of success in teaching a wide variety of trainees (i.e., interventionists) to implement new skills across various settings (
Parsons et al., 2012). For example, BST has been used to teach zoological staff to implement discrete trial training with whales (
MacKellar et al., 2023), to teach medical students to implement behavior-analytic procedures to promote comfort and cooperation during medical treatment among patients with neurodevelopmental disabilities (
Hoang et al., 2024), and to teach law enforcement officers strategies to promote successful response to calls involving individuals with autism spectrum disorder (
Hinkle & Lerman, 2021).
In an investigation on the use of BST with preservice special education teachers,
Sawyer et al. (
2017) found the training package to be effective in improving the performance of evidence-based practices in undergraduate participants. BST sessions were conducted in a campus conference room and continued until all participants “checked out” of training by achieving a mastery criterion of 100% correct performance on trained skills. Using a pre-post design, the researchers improved teacher performance from an average of 10% at pretest to 85% at posttest. Despite the encouraging findings, the study was conducted in a clinical setting and did not evaluate generalization to real classroom settings or the in vivo application of trained skills (
Sawyer et al., 2017).
Other researchers have evaluated the use of BST to improve the in vivo application of trained skills. In a high school functional skills special education classroom,
Carnett et al. (
2021) delivered BST via telehealth to four teaching staff, targeting their implementation fidelity with communication facilitation strategies. Using independent adapted ABAB designs, the researchers demonstrated a functional relation between the BST intervention and staff fidelity with the communication interventions, and an increase in the levels of independent student communication as a function of the interventionists’ improved implementation fidelity. Furthermore, social validity results indicated that the staff found the telehealth procedures to be acceptable and the skills developed to be of value; however, implementation fidelity reverted to baseline levels in the absence of coaching (
Carnett et al., 2021). These findings suggest that the “buy-in” with the procedures may have been insufficient. Promoting choice and shared governance in the selection of targets and intervention procedures may help to ensure social validity and treatment adherence, as evidenced in maintained performance rather than verbal report alone. For example, involving interventionists in the identification of the student issues they want to target in their classrooms and in selecting the procedures to use may improve adherence after the training concludes.
Another validated approach to improving teacher fidelity with educational practices is coaching (
Samudre et al., 2024). Coaching is defined as 1:1 support following initial training (
Stormont et al., 2015). Coaching procedures may include tactics similar to BST and involve the addition of individualized performance-based feedback from authentic implementation efforts and/or goal setting. Oftentimes, coaching includes a review of graphical displays of performance with written and/or vocal discussion. In a tiered system of intervention in which the training of practitioners is the independent variable and the implementation of empirically supported procedures is the dependent variable, individualized coaching may be conceptualized as a secondary or tertiary intervention.
In a study with three first grade teachers,
Kretlow et al. (
2012) found that implementation fidelity with whole-class instruction strategies improved following a 3 h in-service training. Each participant then received one individualized in-class coaching session with specific praise and non-evaluative error correction, after which all three participants further improved performance. Although the researchers did not select the curriculum the teachers used, focusing instead on training instructional strategies, the teachers did not participate in selecting those strategies (
Kretlow et al., 2012). It is possible that social validity and implementation fidelity could have been further enhanced with choice and shared governance of the student outcomes targeted.
Previous research has demonstrated that BST and coaching are promising methods for improving implementation fidelity in special education classrooms. The purpose of the present study was to extend the literature on BST and coaching by improving implementation fidelity and social validity with an undergraduate preservice special education teacher, allowing the participant to choose both the student issues to address and the procedures for which she received BST and coaching. To distinguish the practices employed in the present study, which were selected to meet specific inclusion criteria, the dependent variable was implementation fidelity with empirically supported procedures (ESPs). ESPs were defined as procedures with contextual fit, meaning they were identified as interventions addressing real, practical problems named by the participant, and as procedures supported by peer-reviewed research demonstrating positive outcomes with similar target student behaviors, similar population demographics (i.e., categories of disability recognized by IDEA), and in similar settings. The aim was to address the following research questions:
What are the effects of BST on a teacher’s implementation fidelity of ESPs in an authentic classroom setting with students with moderate to severe disabilities?
To what degree and level of effectiveness is coaching required to further support a teacher’s implementation fidelity with ESPs?
What are the teacher participant’s perceptions of the ESPs and training/coaching procedures?
2. Materials and Methods
2.1. Participant and Setting
After receiving approval for this research from The Ohio State University’s Institutional Review Board, the author sent her group of student teaching supervisees an invitation letter to participate in the study. The author met with the supervisee who was first to respond to provide additional information and the opportunity to ask questions about the study. Once fully informed, the participant signed a consent form to participate in the research, as did the administrator of the school where she was student teaching.
The participant was a White, female, undergraduate student, completing her final student teaching practicum in a special education program at a large midwestern university. Her practicum placement was in a classroom for students with moderate to intensive disabilities at a public high school in an urban school district. Under the supervision of her cooperating teacher, the participant was responsible for planning, teaching, and assessing seven class periods in various subjects (e.g., reading, math), every day of the week. The participant reported general knowledge of ESPs acquired through her coursework but no prior formal experience implementing ESPs in the classroom.
Data collection on the implementation of ESPs took place during the third period science class with six students. Four of the students did not engage in vocal verbal behavior. BST and coaching sessions were conducted before or after school in a classroom with no students present, or on the weekend at a local coffee shop.
2.2. Experimenter and Research Assistant
The experimenter served as the trainer and coach. The experimenter was a White, female, doctoral student in a Special Education and Applied Behavior Analysis (ABA) program at the same large midwestern university as the participant. She was a Board Certified Behavior Analyst with nine years of experience working with students with disabilities and their teachers in special education classroom settings. A graduate student enrolled in the ABA Master’s program at the same university served as a data collector and research assistant. She had an undergraduate degree in psychology and collected data on the dependent and independent variables across all experimental conditions.
2.3. Materials
The experimenter developed all materials utilized in the study. Procedural fidelity checklists were created to outline the steps for the trainer (experimenter) to conduct the pre-baseline interview, baseline observation sessions, task analysis sessions, BST sessions, post-BST sessions, coaching sessions, and post-coaching sessions. During the pre-baseline interview, the experimenter followed a procedural checklist designed to obtain information from the participant concerning the top three perceived needs in the classroom. During BST and coaching sessions, the experimenter followed procedural checklists to ensure and maintain fidelity with BST and coaching procedures.
After pinpointing the classroom issues to target with the participant, the experimenter reviewed relevant research to (a) identify and confirm that each ESP met inclusion criteria, (b) discern the steps required for implementation, and (c) develop a task analysis for each of the selected ESPs.
The procedural checklists used during baseline, post-BST, and post-coaching sessions were identical. None required action by the experimenter; instead, they measured the presence versus absence of specific behaviors (i.e., instructions or feedback) to ensure that additional training or coaching was not provided outside of the experimental conditions in which those independent variables were manipulated.
2.4. Dependent Variables
Data were collected from 9:15 to 10:05 a.m. during the third period science class three times weekly on average, except for during weeks with holidays and unplanned school events. The dependent variable was ESP implementation fidelity, measured using task analyses of the ESPs and calculated as a percentage by dividing the number of steps implemented accurately by the number of steps that should have been implemented during the observation, and multiplying by 100.
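As a concrete illustration, the fidelity calculation described above can be sketched in a few lines of Python. This is a hypothetical helper, not part of the study’s procedures, and the scored steps in the example are invented:

```python
def implementation_fidelity(steps_scored):
    """Percentage of task-analysis steps implemented accurately.

    steps_scored: one boolean per step that should have been implemented
    during the observation (True = step implemented accurately).
    """
    if not steps_scored:
        raise ValueError("at least one step must be scored")
    return 100 * sum(steps_scored) / len(steps_scored)

# Hypothetical observation: 9 of 10 task-analysis steps performed correctly
print(implementation_fidelity([True] * 9 + [False]))  # 90.0
```

Because the denominator is the number of steps that *should have been* implemented, skipped steps count against fidelity rather than being excluded from the calculation.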
Four ESPs were targeted and are summarized in
Table 1. ESP selection criteria included a minimum of three peer-reviewed sources of evidence (i.e., experimental research designs, meta-analyses, empirical literature reviews). At least one of the sources of evidence was required to demonstrate the ESP’s effectiveness with a similar student population; one was required to evidence the ESP’s effectiveness in addressing a similar issue (e.g., science vocabulary, on-task behavior); and one must have shown the ESP’s implementation in a similar setting (i.e., special education classroom). ESPs were academic or behavioral interventions, and preference was given to ESPs that had more extensive literature support and better contextual fit (i.e., considered acceptable and feasible by the participant and her cooperating teacher, addressed as relevant Individualized Education Plan (IEP) goals and objectives).
2.4.1. The Beeper System (TBS)
This momentary time sampling procedure was selected to increase on-task behavior and involved public posting of student on-task behavior at 5-minute intervals. This ESP was implemented with the whole class throughout the period, with the participant recording the percentage of intervals on task for each student each day.
2.4.2. Constant Time Delay (CTD)
This explicit instruction method was used to teach vocabulary to the four students who did not engage in vocal verbal behavior. It was implemented at the end of the period during independent work time at a table in the back of the classroom. The participant recorded each student’s identification of vocabulary terms as correct or incorrect each day and used the data for progress monitoring related to IEP objectives.
2.4.3. Direct Instruction Lesson Plan (DILP)
This lesson plan format was used to facilitate explicit instruction during science. It included 1–3 primary learning objectives, embedded teacher input statements, and questions for evoking active student responding. The participant wrote the lesson plan during her planning period and implemented it during whole class instruction.
2.4.4. Evoking Active Student Responding (EASR)
This procedure for soliciting and responding to student responses involved a 5-step sequence: (a) teacher question, (b) wait time, (c) signal, (d) student response, and (e) affirmative or corrective feedback. It was used during whole class instruction.
2.5. Experimental Design
A multiple baseline across skills design was used to assess the effects of a task analysis, training, and coaching on the participant’s ESP implementation fidelity.
2.5.1. Pre-Baseline
The participant was interviewed to identify areas to target based on both teacher and student needs. The experimenter conducted pre-baseline observations to confirm those needs. Then, the experimenter identified ESPs to target each issue and created a procedural checklist for each one.
2.5.2. Baseline
Prior to collecting baseline data, the experimenter told the participant the names of the ESPs to implement. No instructions or explanations were provided, and the participant was told to try her best (e.g., “To address on-task student behavior, please try implementing the Beeper System using momentary time sampling and public posting.”).
2.5.3. Task Analysis
After establishing low levels of implementation fidelity during baseline, written instructions (i.e., procedural checklists) for all of the ESPs were provided. The participant was told to read the checklists and try her best to implement the ESPs.
2.5.4. BST
Instructions, modeling, role-play, and feedback were implemented in a staggered fashion across ESPs. The experimenter met 1:1 with the participant to conduct one BST session for each ESP. Each BST session lasted 20 to 30 min. Following BST, observations and data collection resumed.
2.5.5. Coaching
If ESP implementation fidelity remained below 90% across two consecutive sessions following BST, coaching sessions were conducted. During 1:1 coaching sessions, the experimenter provided a combination of graphical (i.e., review of the implementation fidelity graph) and verbal (i.e., discussion of steps implemented correctly and steps in need of improvement) feedback to the participant. Coaching sessions continued until the participant demonstrated at least 90% implementation fidelity across two sessions. This was only necessary for one ESP, and a total of five coaching sessions were conducted.
2.5.6. Post-BST/Post-Coaching
Observations identical to baseline conditions, during which neither instructions nor feedback were provided, continued until the end of the semester. Coaching sessions would have been instated (or reinstated) had implementation fidelity fallen below 90% across two consecutive sessions; however, this was not necessary.
2.5.7. Interobserver Agreement
Prior to baseline, the experimenter trained the research assistant to ensure reliability in scoring ESP implementation across skills. Using BST procedures, the experimenter provided instructions and modeling and engaged in role-play with feedback with the research assistant. Interobserver agreement (IOA) was examined for implementation fidelity of each ESP for at least 30% of the sessions of each experimental condition. The mean IOA across ESPs was 98% (range: 92 to 100%) during baseline, 98% (range: 88 to 100%) during task analysis, 98% (range: 95 to 100%) during post-BST, 99% (range: 99 to 100%) during coaching, and 100% during post-coaching.
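The IOA percentages reported above can be illustrated with a short sketch. The study does not specify the agreement algorithm, so this assumes point-by-point (step-by-step) agreement between the two observers’ task-analysis records, a common approach; the observer records below are invented:

```python
def interobserver_agreement(primary, reliability):
    """Point-by-point IOA: percentage of task-analysis steps on which
    two observers' records agree. Both arguments are equal-length lists
    of booleans (True = step scored as implemented accurately)."""
    if not primary or len(primary) != len(reliability):
        raise ValueError("records must be non-empty and equal in length")
    agreements = sum(a == b for a, b in zip(primary, reliability))
    return 100 * agreements / len(primary)

# Hypothetical session: the observers disagree on 1 of 12 steps
primary_obs = [True] * 10 + [False, False]
reliability_obs = [True] * 11 + [False]
print(round(interobserver_agreement(primary_obs, reliability_obs), 1))  # 91.7
```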
2.5.8. Procedural Fidelity
Prior to the pre-baseline, the experimenter trained the research assistant to ensure reliability in fidelity to the procedures in each experimental condition. Using BST procedures, the experimenter provided instructions and written procedural fidelity checklists for the conduct of classroom observation sessions in each condition, as well as for the 1:1 pre-baseline meeting and BST and coaching sessions. The experimenter then modeled and engaged in role-play with feedback with the research assistant.
Procedural fidelity was measured during the pre-baseline meeting, 50% of the sessions during the task analysis condition, 100% of BST sessions, 35% of sessions during the post-BST condition, 100% of coaching sessions, and 33% of sessions during the post-coaching condition. Procedural fidelity was 98% during the pre-baseline meeting and 100% during the task analysis condition. During BST sessions, procedural fidelity was 98.5% (range: 96 to 100%). During coaching sessions, procedural fidelity was 88.8% (range: 86 to 92%). Post-BST and post-coaching sessions were conducted with 100% procedural fidelity.
4. Discussion
The current study examined the use of written instructions (i.e., task analysis), BST, and coaching with a preservice teacher to improve implementation fidelity with four ESPs selected to address participant-identified targets. The results of this study demonstrated a functional relationship between BST and improvement of implementation fidelity with all four ESPs, and between coaching and further improvement of implementation fidelity with one ESP. These results are consistent with previous research (see
Brock et al., 2017 for a comprehensive review and meta-analysis), suggesting that BST is a consistently effective tactic for improving special education practitioners’ implementation fidelity with educational practices for students with disabilities. Additionally, these results support previous research indicating that coaching is an essential component of improving undergraduate preservice teachers’ training in evidence-based practices (
Wang et al., 2024).
These results extend the literature by incorporating social validation both before and after experimental conditions were conducted. The participant selected the classroom issues upon which to intervene prior to the onset of baseline conditions. The experimenter used that input and understanding of the classroom dynamics (i.e., setting and instructional formats, students, and their IEP goals and objectives) to select ESPs with an appropriate contextual fit. This element of social validation was designed to be proactive in ensuring the goals of the study aligned with the participant’s perceived needs in the classroom and intended to allow for shared governance in the planning of treatment (
Rajaraman et al., 2022). Future research should investigate incorporating other measures of social validity via shared governance in the selection, development, and implementation of ESPs throughout the consultative process.
Additionally, the data obtained from the social validity measure administered at the conclusion of the study were positive. The participant rated the ESPs with strong agreement in terms of their suitability, effectiveness, and ease of implementation. Indeed, she indicated an intention to continue implementing them in future settings and to recommend their use to colleagues. However, the participant’s agreement that three of the four ESPs were reasonable in the time they required for implementation was weaker than her other ratings. This suggests that researchers and consultants may be wise to ensure that interventions are as parsimonious and efficient as possible. Although those were the intentions of the first author, it is possible that more concise task analyses could have been developed for the ESPs. Component analyses may help us to better understand the critical features of ESPs necessary to impact student outcomes as intended, thus allowing refinements of task analyses of those procedures.
Overall, the participant’s ratings of the procedures on the social validity measure suggest satisfaction with participation in the study. In particular, she strongly agreed with satisfaction in selecting the issues to address, participating in BST, and being coached. While the notion of shared governance in treatment plans has merit, it is worth noting that the participant disagreed with the item related to selecting the ESPs to implement herself (rather than having the experimenter select them). This is perhaps related to the point regarding time constraints: special education teachers are busy professionals. Future research should evaluate the extent to which practitioners are able to select ESPs independently and the degree to which they find the exercise reasonable and valuable in addition to their already heavy workloads. Relatedly, researchers could examine the balance between outcomes and the degree of shared governance in intervention selection, development, and implementation necessary to achieve socially valid outcomes for both participants and recipients of the interventions that participants are trained to implement.
Despite the positive outcomes of this study, several limitations should be considered. Most significantly, no maintenance data were collected, due to time constraints imposed by the participant’s completion of her final semester of student teaching. This limitation, combined with the participant’s strongly expressed desire for coaching to continue into her first year of teaching, presents an opportunity for future researchers to explore the maintenance and generalization of ESPs initially trained using BST procedures across time and in novel settings.
Future research with preservice teachers should seek participants who are available for continued participation in studies on maintenance and generalization across other skills. For example, it may be of interest to continue with the same participant(s) in subsequent semesters of student teaching to analyze the extent to which taught skills are transferred and maintained in other classrooms. It may also be insightful to assess the extent to which both pre- and in-service teachers are able to adapt previously taught ESPs to meet the needs of new students. Given the brevity commonly associated with student teaching (i.e., only 1–2 college semesters), it would be ideal to include participants who are able to continue with follow-up observations into their initial years as in-service teachers. The participant in this study indicated strong agreement that new and experienced teachers alike should receive coaching throughout their careers. This should be considered positive news for those of us wanting to improve the professional practice of special education teachers: our students want to do better too, possibly even more so.
Another limitation, and potential contribution, of the present study, as well as an opportunity for future research, lies in the first author’s professional relationship with the participant, as she was the participant’s student teaching university supervisor. It is possible that this relationship presented a conflict affecting the participant’s amenability to participation and her performance. Anecdotally, the participant reported enjoying participation throughout the process and was assured that it was in no way tied to the regular student teaching evaluations required by the university; however, it is possible she felt undue pressure to comply and succeed given the circumstances. Alternatively,
Brock et al. (
2017) have noted the potential benefit of incorporating performance feedback, an element of both BST and coaching, into university fieldwork supervision, and this study supports the feasibility of doing so. Researchers should explore how to more explicitly embed performance feedback tied to specific educational practices into special education teacher preparation frameworks.
Although the experimental procedures employed in this study show strong internal validity, the small sample size is an inherent limitation of single case research design that yields uncertainty regarding the external validity of these findings. Future research could be designed to address these concerns. For example, systematic replication using multiple baseline designs across ESPs with multiple teachers could demonstrate the procedures’ effectiveness with teachers who have identified other student issues to address in varied settings. Alternatively, some researchers have explored methods for scaling the use of BST with multiple participants in a group (rather than 1:1) format. For example,
Courtemanche et al. (
2021) incorporated peer feedback during role-play to deliver BST and found 15 of the 18 participants improved their percentage of steps implemented correctly on all four skills immediately after BST. Such similar group investigations could extend the literature on BST and be further enhanced with integrity checks of peer feedback to ensure fidelity with the training procedures (
Courtemanche et al., 2021).
Another limitation of the present investigation is that data on student outcomes were not collected. Given that the ESPs were selected based on inclusion criteria requiring peer-reviewed literature relevant to participant-selected needs and empirical evidence of effectiveness with similar targets in similar settings with similar student populations, it is hoped that they were effective in achieving their desired outcomes. However, future research should explicitly assess the extent to which ESP implementation impacts students’ academic and behavioral performance. For example, in their BST via telehealth study,
Carnett et al. (
2021) reported their findings in teacher–student dyads. By measuring both teacher implementation fidelity with the trained communication strategies and the percentage of independent student mands and frequency of presession mands, the researchers were able to conclude that their methods improved teacher implementation fidelity, which, in turn, improved the targeted student outcomes.
Moreover, researchers should seek to understand the level of implementation fidelity necessary to achieve student goals. For example, BST was effective in improving the implementation of TBS, a momentary time sampling and public posting procedure intended to improve student on-task behavior, to 88% correct implementation fidelity. It is possible that this level of fidelity would have been sufficient to increase on-task behavior, but the participant was moved into the coaching phase per pre-determined (albeit arbitrary) experimental condition criteria. Determining the extent to which implementation fidelity functionally relates to student dependent variables may help refine our understanding of the level of intervention necessary to change the special education practitioner’s behavior and, in turn, their students’ performance.
Notwithstanding its limitations, the current study supports previous research on effective means of training special education preservice teachers to implement empirically supported procedures. Such procedures are important, not only because they are federally mandated (i.e., the
No Child Left Behind Act of 2001 [NCLB], 2006), but because they have earned the status of being empirically supported due to the documented impact they have on outcomes for students with disabilities. It is reasonable to ask that we do what we know works, and, as knowledge develops, we adapt. Additionally, the current study provides several avenues for further empirical investigation that are worthy pursuits in our continued understanding of how best to prepare and support special educators.