*5.3. Instruments*

#### 5.3.1. The Item Pool

Against the theoretical background outlined above, items for assessing word recognition skills were developed and compiled into a comprehensive item pool. Forming a suitable item pool for assessing the word reading skills of children of primary school age requires several considerations that integrate verbal, literary, scientific, and curricular analyses. The formal design of the items was guided by economic and pragmatic factors: the items had to be feasible as a group procedure in class. Given the previously described significance of silent reading experiences [35,50,51], it was decided that the students should identify a real target word among a selection of pseudowords (e.g., "Maelr"–"Maler"–"Melar"–"Mlaer"; target word: "Maler" = painter).

To generate the item corpus used in this study, various common textbooks were analyzed. The intersection of their word material was determined and compared with the available minimum vocabulary for the primary school sector in Germany. On this basis, 1277 words were identified as relevant word material for primary schools. The words of the item pool were structured according to different aspects (word type; number of letters, syllables, and graphemes; phonological, morphological, and orthographic peculiarities) and according to their occurrence by grade level. For each word in the item pool, we designed pseudowords to serve as distractors. Every distractor is visually similar to the target word. For each item, the pseudowords chosen include some with letter combinations that are valid in German as well as some that are unpronounceable in German.
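The distractors in the example item ("Maelr", "Melar", "Mlaer" for the target "Maler") can all be obtained by keeping the first and last letter and reordering the interior letters. The following is a minimal sketch of that idea only. The function name `interior_permutation_distractors` and the `limit` parameter are hypothetical, and the rule itself is an assumption inferred from the single example, not the documented construction procedure of the study. In particular, the sketch does not check whether a candidate is orthographically valid or pronounceable in German, which the item design described above takes into account.

```python
from itertools import permutations


def interior_permutation_distractors(word, limit=3):
    """Return up to `limit` pseudoword candidates for `word`.

    Assumption (not taken from the study): candidates keep the first and
    last letter and reorder the interior letters, as in the "Maler" example.
    """
    if len(word) < 4:
        return []  # fewer than two interior letters: nothing to reorder
    first, interior, last = word[0], word[1:-1], word[-1]
    candidates = []
    for perm in permutations(interior):
        candidate = first + "".join(perm) + last
        # Skip the target word itself and duplicates (repeated letters).
        if candidate != word and candidate not in candidates:
            candidates.append(candidate)
        if len(candidates) == limit:
            break
    return candidates


if __name__ == "__main__":
    print(interior_permutation_distractors("Maler", limit=5))
```

In practice, candidates produced this way would still need manual screening, for example to remove strings that happen to be real words.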

Within the prepilot, a total of 533 children from the first to the fourth grade solved between 40 and 50 items in the described task format, depending on the grade level. The distribution among the different grades is shown in Table 2.


**Table 2.** Descriptive statistics of the sample in the prepilot.

An analysis of the student outcomes (frequency of solutions) and interviews with the teachers and students (difficulty with tasks and possible remarks) indicate that the task format is understandable for primary school students, that teachers consider it appropriate, and that outcomes vary widely, i.e., the format can differentiate between achievement levels.

#### 5.3.2. The ELFE-II Test

In addition to the items of the item pool, some of the children worked on an established instrument for assessing the reading fluency, reading accuracy, and reading comprehension of German-speaking children at the word, sentence, and text level (ELFE-II) [61]. To test reading comprehension at the word level, the children had to choose the correct word for a given picture from a list of four within a limited time span. At the sentence level, the children had to separate the correct word from four given distractors, and at the text level, the children were asked to answer multiple-choice questions about short texts. The ELFE-II test can be used in an individual or group session from the end of the first grade to the beginning of the seventh grade. The reliability (split-half reliability: rtt = 0.96; retest reliability: rtt = 0.93; parallel reliability: rtt = 0.93) and concurrent validity (correlation with another reading test: r = 0.77; correlation with the teacher's judgment: r = 0.70) of the instrument have been demonstrated. Construct validity was determined using structural equation models. In addition, validity studies are available for children with diagnosed reading and spelling disorders and for children from different school types [61].
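For readers unfamiliar with the split-half coefficient reported above, the sketch below shows one common way such a value is computed: the test is split into two halves, the half scores are correlated, and the Spearman-Brown formula projects the correlation to the full test length. The odd-even split, the function names, and the toy data are assumptions for illustration. The source does not state how the split was performed for the ELFE-II.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Project a half-test correlation to the full test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(scores):
    """scores: one list of 0/1 item scores per child.

    Assumption: odd-even split (items 1, 3, 5, ... vs. items 2, 4, 6, ...).
    """
    first_half = [sum(row[0::2]) for row in scores]
    second_half = [sum(row[1::2]) for row in scores]
    return spearman_brown(pearson(first_half, second_half))


if __name__ == "__main__":
    # Hypothetical toy data: 4 children, 6 items each.
    toy_scores = [
        [1, 1, 1, 1, 1, 0],
        [1, 1, 0, 1, 0, 0],
        [1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1],
    ]
    print(round(split_half_reliability(toy_scores), 2))
```

The correction step matters because each half is only half as long as the full test, so the uncorrected correlation between halves understates the reliability of the complete instrument.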
