*6.2. MDFS*

The Multidimensional Fluency Scale (MDFS) has also been used to assess expressive oral reading. This measure was initially developed by Zutell and Rasinski [18], who were influenced by the work of two groups of researchers: Allington and Brown [38], and Aulls [39]. Zutell and Rasinski noted that both groups had identified specific elements of prosody [18], and that Aulls had created a rough scale for observing stages of reading fluency (word-by-word reading and phrasing, and expression). However, Aulls did not include all of the elements that Zutell and Rasinski described in greater detail in their initial Multidimensional Fluency Scale [18]. Zutell and Rasinski were also influenced by the NAEP prosody rating scale that was being developed for use in 1992. The MDFS was "an elaboration of the fluency rubric used in the NAEP studies of oral reading [14,34] that reported significant correlations (predictive validity) between oral reading prosody and fourth-grade students' silent reading comprehension" [35]. Unlike the NAEP scale, the MDFS used an analytic scoring system with four levels, from low to high prosody, across three domains of expression: phrasing, smoothness, and pacing. Each of the three traits of expressive reading received a separate score.

The MDFS expanded from the original three dimensions of prosody to four in 2003, when the dimension of expression and volume was added [40]. Table 2 shows the current MDFS, which consists of a four-point scale for four specific dimensions of fluency: expression and volume, phrasing, smoothness, and pacing.


**Table 2.** Multidimensional Fluency Scale (MDFS).

To use the MDFS, raters make judgments about individuals' prosodic reading in each of the four dimensions. The descriptions of the levels allow raters to make consistent decisions about readers' performances. However, with so many decisions required of raters, questions arise about the reliability and validity of the scores obtained using this instrument.

The reliability and validity of scores obtained using the MDFS have been established by various researchers. Zutell and Rasinski said that "initially teachers often feel insecure in making 'subjective' judgments; they are concerned about issue of reliability and validity" [18]. To alleviate these concerns, they conducted research to show that fluency ratings are strong predictors of results on standardized reading tests [41]. Further, they showed that with training, university teacher candidates could learn to apply the rubric accurately and consistently [42]. The training of raters is important to ensure reliability of scores, but questions remain about the number of raters, passages, and rating occasions that are required to obtain reliable scores. The combination of raters, passages, and occasions also makes the feasibility of using the MDFS an issue.

Moser, Sudweeks, Morrison, and Wilcox addressed these specific issues in a generalizability study of ratings of 36 fourth-graders' reading. For three days each week over a seven-week period, these students practiced fluent oral reading of both narrative and informational passages. At the conclusion of that practice period, students read four passages (two narrative and two informational) to their teacher, the lead researcher in the study. All readings were recorded so that expressive oral reading could be assessed on different occasions. The 144 readings were evaluated using the MDFS by two trained raters on two separate rating occasions.

Results showed that the mean score for expression and volume was 3.11, phrasing was 3.25, smoothness was 3.12, and pace was 3.06, with an overall mean of 12.54 out of 16 [43].
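The arithmetic behind an MDFS total can be sketched in a few lines of Python. This is a minimal illustration only: the function name `mdfs_total` and the dimension keys are hypothetical, and the 1-to-4 rating range is an assumption inferred from the four-point scale and the maximum total of 16. The example also checks that the four reported dimension means add up to the reported overall mean.

```python
# Hypothetical keys for the four MDFS dimensions named in the text.
DIMENSIONS = ("expression_and_volume", "phrasing", "smoothness", "pace")


def mdfs_total(ratings):
    """Sum the four dimension ratings into a total score (4 to 16).

    Assumes each dimension is rated from 1 (low) to 4 (high).
    """
    missing = [d for d in DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for d in DIMENSIONS:
        if not 1 <= ratings[d] <= 4:
            raise ValueError(f"{d} rating must be between 1 and 4")
    return sum(ratings[d] for d in DIMENSIONS)


# Dimension means reported in the text for the fourth-grade sample.
reported_means = {
    "expression_and_volume": 3.11,
    "phrasing": 3.25,
    "smoothness": 3.12,
    "pace": 3.06,
}

overall_mean = round(mdfs_total(reported_means), 2)
print(overall_mean)  # 12.54
```

Because the total is a simple sum, the mean of the totals equals the sum of the dimension means, which is why the four reported values reproduce the overall mean of 12.54.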

Generalizability theory methods were used to evaluate the rating scores. These procedures provided a way to simultaneously estimate main effects and interaction effects through the analysis of mean ratings. Generalizability theory goes beyond the traditional ANOVA in that it can be used to estimate the relative percentage of measurement error from multiple facets. In this way, researchers were able to estimate the reliability of scores obtained using the MDFS. By using Generalizability theory, researchers also examined effects related to raters, rating occasions, and passages. Results were used to estimate the number of raters, rating occasions, and passages that are required to obtain reliable scores for expressive oral reading.
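To make the idea of partitioning measurement error concrete, the following Python sketch estimates variance components for a deliberately simplified design: readers crossed with raters, with one rating per cell. It is not the analysis used in the study, which also included passages and rating occasions as facets. The function name and the ratings are hypothetical and chosen only for illustration.

```python
def g_study_person_by_rater(scores):
    """Estimate variance components for a crossed persons x raters design.

    scores[p][r] is the rating that rater r gave to person p, with one
    rating per cell. Components are derived from ANOVA mean squares;
    negative estimates are set to zero.
    """
    n_p = len(scores)
    n_r = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    person_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(row[r] for row in scores) / n_p for r in range(n_r)]

    # Sums of squares for persons, raters, and the residual.
    ss_person = n_r * sum((m - grand) ** 2 for m in person_means)
    ss_rater = n_p * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_resid = ss_total - ss_person - ss_rater

    ms_person = ss_person / (n_p - 1)
    ms_rater = ss_rater / (n_r - 1)
    ms_resid = ss_resid / ((n_p - 1) * (n_r - 1))

    components = {
        "person": max((ms_person - ms_resid) / n_r, 0.0),
        "rater": max((ms_rater - ms_resid) / n_p, 0.0),
        "residual": ms_resid,  # person x rater interaction plus error
    }
    total = sum(components.values())
    percent = {name: 100 * value / total for name, value in components.items()}
    return components, percent


# Hypothetical MDFS totals: five readers, each scored by two raters.
ratings = [[12, 13], [9, 10], [15, 15], [7, 8], [11, 11]]
components, percent = g_study_person_by_rater(ratings)
for name in components:
    print(f"{name}: variance={components[name]:.2f}, share={percent[name]:.1f}%")
```

In this made-up data set almost all of the variance is attributable to differences among readers, which is the pattern that supports high reliability; a large rater or residual share would indicate that scores depend heavily on who did the rating.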

Results showed very high MDFS reliability scores, ranging from 0.92 to 0.98. Findings also showed that a minimum of two, and preferably three, equivalent passages, two raters, and one rating occasion are recommended to obtain reliable ratings. Like the research by Zutell and Rasinski, this study also demonstrated the value of training raters and encouraging them to collaborate during training sessions. In addition, this study showed the necessity of using multiple passages along with multiple raters. Most important, this study found that highly reliable expressive oral reading scores can be obtained using the MDFS and assures researchers and teachers that it can be used to measure expressive oral reading.
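Recommendations about the number of raters and passages typically come from a decision study, which projects the generalizability coefficient for different measurement designs. The sketch below assumes a crossed readers x raters x passages design and uses the standard relative-error formula for that design. The variance components are hypothetical placeholders, not the values estimated by Moser et al., and the function name `g_coefficient` is illustrative.

```python
def g_coefficient(var, n_raters, n_passages):
    """Project the generalizability coefficient for relative decisions.

    Assumes a crossed persons x raters x passages design. Each error
    component is divided by the number of conditions averaged over.
    """
    relative_error = (
        var["person_x_rater"] / n_raters
        + var["person_x_passage"] / n_passages
        + var["residual"] / (n_raters * n_passages)
    )
    return var["person"] / (var["person"] + relative_error)


# Hypothetical variance components, for illustration only.
variance = {
    "person": 0.60,
    "person_x_rater": 0.02,
    "person_x_passage": 0.10,
    "residual": 0.08,
}

for n_passages in (1, 2, 3):
    for n_raters in (1, 2):
        coef = g_coefficient(variance, n_raters, n_passages)
        print(f"passages={n_passages}, raters={n_raters}: {coef:.3f}")
```

With these placeholder values the coefficient rises as passages or raters are added, and the gain from an extra passage is larger than the gain from an extra rater because the reader-by-passage component is the larger source of error. The same logic underlies a recommendation to use multiple passages alongside multiple raters.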

Smith and Paige compared the reliability of prosodic reading scores obtained using the NAEP fluency scale with those obtained using the MDFS. Like Moser et al., they used Generalizability theory. They trained four doctoral students to use both the MDFS and the NAEP scale to rate children's oral reading. Children in first, second, and third grade orally read one grade-level narrative passage from the Gray Oral Reading Test-5 [45]. All readings were digitally recorded so that ratings of prosody could be completed. The four raters judged the oral reading of 177 readers on two occasions. Results showed an average NAEP score of 2.54 out of 4 on the first occasion and 2.70 on the second. The average MDFS scores were 10.09 out of 16 on the first occasion and 10.76 on the second [44].

These researchers measured the amount of variance contributed by differences in raters, rating occasions, and students. Reliability coefficients were very similar for the MDFS and the NAEP. Results showed high reliability scores for each of the three grade levels, ranging from 0.91 to 0.94 for both rating instruments. Results showed slightly higher reliability scores for the MDFS than the NAEP, but the two measures were highly correlated with no significant differences in scores obtained from the two instruments.

Although the MDFS and NAEP produced similar results, the MDFS was slightly more efficient than the NAEP in regard to measurement design resources. To obtain the desired reliability, the MDFS required only two raters, as opposed to the three needed when using the NAEP instrument. Because the MDFS is analytic and the NAEP measure is holistic, the MDFS also provided a deeper analysis of the quality of reader fluency. The precision of information from the MDFS can better inform instruction. Training raters was essential to obtain reliable scores, regardless of which rating scale was employed.

The MDFS has been used in a number of studies that examined expressive oral reading. Dutch researchers examined the oral reading of 106 fourth graders to see what aspects are closely associated with reading comprehension. They used the MDFS to measure prosody. Regression analyses showed that prosody, not rate, was most closely linked to comprehension scores [46]. Similarly, Turkish researchers examined the oral reading of 132 fourth graders. Using the MDFS, they showed a strong relationship among attention, reading speed, and prosody [47].

Repeated readings have been suggested as a way to improve oral reading fluency. Guerin and Murphy used the MDFS in their study of struggling adolescent readers. Results showed that over a seven-week period of repeated reading, all aspects of fluency improved, leading to more strategic reading and improved comprehension [48].

In summary, the MDFS measure shows variability in expressive oral reading, with a mean fluency score of 12.54 out of 16 in the Moser et al. study and mean scores of 10.09 and 10.76 on the two rating occasions in Smith and Paige. This instrument had a reliability score of 0.92–0.98 for the Moser et al. study and 0.91–0.94 in Smith and Paige. These reliability scores assume two raters and one rating occasion. Other researchers using this scale have found that prosody better predicts comprehension than rate, even though the two have a strong relationship. Another result is that improved expressive reading can lead to greater reading comprehension.
