1. Introduction
Student learning outcomes (SLOs) serve several purposes. The most explicit goal is to describe the unique knowledge, skills, and abilities students acquire as they complete courses and degree programs. Two less explicit goals are to support the design of curriculum and align learning goals with learning activities and assessments [1,2]. An additional goal, which is an important component of the Degree Qualifications Profile, is the articulation of how students who earn advanced degrees differ in professional expertise and skill from students who earn less advanced credentials [3]. Faculty appeal to Bloom’s taxonomy [4] and models of cognitive development in higher education (e.g., Perry [5] and Clinchy [6]) to achieve these goals. This paper examines how well current recommendations for writing SLOs based on Bloom’s taxonomy serve these goals.
Bloom’s taxonomy classifies thinking skills into six hierarchically organized categories that range from lower-level cognitive skills (know and understand) through higher-order cognitive skills (apply, analyze, evaluate, create) based on the verbs teachers select when they describe expectations for students’ thinking skills and behaviors in a learning outcome [4]. Since its publication, teachers have relied on Bloom’s taxonomy to guide how they write learning outcomes, structure learning activities, and assess student learning. Bloom’s taxonomy guides the development of test questions to assess higher-level thinking skills by drawing attention to what test questions and assessment prompts require students to do (retrieve facts, apply knowledge, make a prediction, solve a problem, or evaluate a theory). Thus, Bloom’s taxonomy has influenced how instructors design their courses, how they describe learning outcomes, and how they create assessments of learning.
Ideally, SLOs use language that can be understood by colleagues, students, parents, and potential employers. SLOs written with verbs that describe specific, concrete actions or operations minimize ambiguity. They describe what the student can do in terms of specific, observable behaviors and communicate learning goals in clear, jargon-free language [7].
Student learning outcomes are analogous to operational definitions in the natural and social sciences. The criteria that characterize a measurable SLO are similar to those that characterize the operational definitions researchers create to design experiments and address research questions [8]. Operational definitions describe the procedures researchers create when they manipulate an independent variable and describe the instruments, operations, procedures, and public behaviors they use to measure dependent variables. SLOs that describe learning in operational (measurable) language create a link between faculty intentions for student learning and the strategies faculty will use to assess learning. Student learning outcomes written with objective action verbs imply the kinds of assignments students complete and the kinds of assessment measures these assignments generate. The assessments will be grounded in student behaviors that instructors can observe directly or in a tangible product students create and instructors evaluate on specific dimensions of quality [8].
Guidelines built around Bloom’s taxonomy assume the verbs in each category describe a progressive development of cognitive skills. Verbs at lower levels of Bloom’s taxonomy describe acquisition of knowledge and facts whereas verbs at higher levels of Bloom’s taxonomy describe complex thinking skills, including application of knowledge to practical problems, analysis of competing interpretations, and creation of new knowledge or alternative interpretations of existing findings.
Other taxonomies focus on the relation between learning outcomes, learning activities, and assessments without making claims about the nature of the thinking skills described [1,7]. For example, Fink proposes a taxonomy of six categories he believes are necessary to create “significant learning” [1]. Fink makes no claims about how the verbs from categories in his taxonomy relate to the development of cognitive skill. Fink emphasizes the importance of aligning learning goals with learning activities and assessments, using a process called backward design [2]. An instructor who uses backward design to create a course will identify the course learning outcomes first. The instructor then selects reading materials and designs learning activities and assignments to create opportunities for students to acquire and practice these skills. Finally, the instructor selects assessment instruments that evaluate the learning outcomes. In contrast, an instructor who does not use backward design might begin designing a course by selecting assigned readings and determine afterward what students might learn from those texts. Like Fink, Adelman proposes a taxonomy comprised of nearly 20 categories of verbs based on functional activities aligned with specific cognitive demands of assignments and projects [7]. Instead of organizing verbs into categories of thinking skill, Adelman’s approach focuses on creating categories of verbs that enable faculty to align course and program goals with instructional strategies, assignments, and assessments, as occurs when instructors use backward design to create courses [1,2].
Differentiating between types of degrees within a discipline is a more challenging task. Faculty often define higher order thinking skills that describe learning in terms of Bloom’s taxonomy when they write SLOs for advanced academic work. They justify this practice by arguing that the categories represent a hierarchy of increasingly complex skills [4]. However, Anderson and Krathwohl raised questions about the developmental sequence of categories in Bloom’s hierarchy [9]. Paul also criticized the hierarchical structure of Bloom’s taxonomy, arguing that critical thinking skills permeate every category of Bloom’s taxonomy [10]. Fink and Adelman make no assumptions about how student learning progresses across their categories [1,7]. Adelman argues that contextual information such as how well a student executes skilled behavior and the disciplinary content itself may be necessary to differentiate between learning outcomes for associate’s, bachelor’s and master’s degrees.
The SLOs for different degree programs should illustrate how the skills of a graduate in one program or discipline differ from those of a graduate in another program. SLOs should also describe the knowledge, skills, and abilities that characterize the expertise represented by the award of specific degrees within a discipline (associate’s, bachelor’s, or graduate degrees). In addition, accrediting bodies establish standards that require institutions to differentiate the learning goals they articulate for associate’s, bachelor’s, graduate, and other degree credentials. The SLOs that describe the skills attained by a student who completes a more advanced degree should reflect a different level of expertise than the SLOs written for a less advanced degree. The Degree Qualifications Profile (DQP) aspires to describe how students who merit the award of an associate’s, bachelor’s, or master’s degree differ in the kinds of skills students achieve [3]. The DQP is offered as a model an institution might use to frame program learning outcomes in specific fields and articulate the learning expectations associated with each type of degree program, independent of the discipline in which the degree is awarded [3,11].
The internet includes many resources that offer guidelines on how to write measurable SLOs. Many of these resources provide examples of measurable verbs, usually based on Bloom’s taxonomy. I created a collection of verbs for Bloom’s taxonomy by aggregating words gathered from over a dozen web sites, each of which provided examples of verbs for each level of Bloom’s taxonomy. In 2006, when I created my collection, few web resources posted large lists of words. Most sites identified 8–15 words for each level of Bloom’s taxonomy. Large collections of verbs for Bloom’s taxonomy have proliferated in recent years. A Google search will now produce many examples of what Adelman describes as “formless lumps of verbs” for Bloom’s taxonomy [7]. Although some sites mirror lists created by others, many sites provide their own collection.
I was curious about whether these collections included verbs I might have overlooked when I created a collection of verbs for Bloom’s taxonomy. The sources I consulted sometimes placed a particular verb in different taxonomy categories. As a result, a few verbs in my collection appeared in more than one category because different sources aligned these verbs with different categories. Did the authors of other large collections also encounter this duplication of verbs across categories? If so, how frequently did lists align verbs with the same specific categories? Would an analysis of these collections identify converging opinion about the most appropriate alignment of a verb with a specific category in Bloom’s taxonomy? Thus, I posed the following research questions:
How consistently do authors use specific verbs to describe learning at different levels of Bloom’s taxonomy?
Will an analysis of large collections of verbs for Bloom’s taxonomy produce consensus about the level of thinking skill described by an SLO that uses a specific verb?
Can we make unambiguous recommendations about which verbs correspond to specific levels of cognitive skill implied by the structure of Bloom’s taxonomy?
2. Materials and Methods
Materials were obtained by conducting a Google search based on the string, “action words for Bloom’s taxonomy”, in March 2016 and downloading the lists posted in the top 30 web sites produced by this search. The 30 highest-ranking web sites each contained a unique collection of words. However, sites with lower rankings in the Google search frequently mirrored a list posted by one of the top 30 sites. The web sites that provided a list of Bloom’s taxonomy verbs used for this analysis are identified in Appendix A.
Words harvested from lists posted on the 30 highest-ranking sites in the Google search were aggregated in a single data set without regard to duplicated words within or across levels of Bloom’s taxonomy. Each time a list identified a word for a given level of Bloom’s taxonomy, the word was recorded for that category in the data set. If a list duplicated a word, identifying it in more than one category, the word was recorded in the data set for each category in which it was listed.
Words that were not verbs were deleted (e.g., how, what, when, where, which, who, why). Grammatical variations for verbs were standardized. For example, words that were nominalizations were converted to their verb form (e.g., organization became organize, retrieving became retrieve, theory became theorize, inference became infer).
Citation frequency, the number of lists that assigned a verb to a given level of Bloom’s taxonomy, was recorded for verbs in each of Bloom’s categories. Citation frequency values could range from 1 (only one list aligned the verb with that category) to 30 (every list aligned the verb with that category). After recording the citation frequency for each verb, duplicate copies were deleted within a category (but a duplicate copy of the verb was retained in other categories where it occurred).
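The tally described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used for the analysis: the record layout `(list_id, category, verb)`, the site labels, and the sample records are all hypothetical and invented for the example.

```python
from collections import defaultdict

# Hypothetical input: one (list_id, category, verb) record for each word
# harvested from a source list. These records are invented for illustration.
records = [
    ("site_01", "know", "define"),
    ("site_02", "know", "define"),
    ("site_02", "understand", "define"),
    ("site_03", "know", "define"),
    ("site_01", "apply", "illustrate"),
    ("site_03", "evaluate", "illustrate"),
]


def citation_frequency(records):
    """Count how many distinct lists assign each verb to each category."""
    lists_by_pair = defaultdict(set)
    for list_id, category, verb in records:
        # A set ensures a list is counted once per (category, verb) pair,
        # which mirrors deleting duplicate copies within a category.
        lists_by_pair[(category, verb)].add(list_id)
    return {pair: len(ids) for pair, ids in lists_by_pair.items()}


freq = citation_frequency(records)
```

Keying the result on the (category, verb) pair keeps a separate count for each category in which a verb occurs, so a verb listed under two categories retains two entries.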
Verbs were sorted alphabetically within categories to identify how often a verb was duplicated across categories, that is, how often a verb appeared in two or more categories. The number of categories to which a verb was assigned could vary from 1 (verbs assigned to a single category) to 6 (verbs assigned to every category). The number of verbs assigned to a single category and to varying numbers of categories was recorded. Percentage of verbs assigned to a single category was used as an estimate of consensus across lists about category membership. This analysis was completed with verbs organized in six categories (based on Bloom’s taxonomy) and again with verbs organized in three categories (fact-based/low-level thinking, application, and conceptual/high-level thinking), which were created by merging selected categories.
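As a rough illustration of this step, the following Python sketch counts the number of categories in which each verb appears and reports the percentage of verbs confined to a single category. The function names and the sample (category, verb) pairs are hypothetical; the pairs are invented and do not reproduce the study data.

```python
from collections import Counter, defaultdict

# Hypothetical (category, verb) pairs remaining after within-category
# duplicates have been removed. Invented for illustration only.
pairs = [
    ("know", "define"), ("understand", "define"),
    ("know", "list"),
    ("apply", "illustrate"), ("evaluate", "illustrate"), ("create", "illustrate"),
    ("analyze", "compare"),
]


def category_spread(pairs):
    """Map each verb to the number of distinct categories that include it."""
    categories_by_verb = defaultdict(set)
    for category, verb in pairs:
        categories_by_verb[verb].add(category)
    return {verb: len(cats) for verb, cats in categories_by_verb.items()}


def percent_single_category(spread):
    """Percentage of verbs that appear in exactly one category."""
    single = sum(1 for n in spread.values() if n == 1)
    return 100.0 * single / len(spread)


spread = category_spread(pairs)
# Number of verbs that fall in 1, 2, 3, ... categories.
distribution = Counter(spread.values())
consensus = percent_single_category(spread)
```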
Finally, verbs within each category were sorted by citation frequency to create smaller collections comprised of verbs that multiple lists nominated as members of a category. The percentage of verbs assigned to a single category in these smaller lists evaluated the consensus among lists about category membership. The most lenient criterion selected verbs nominated by four or more lists. A more stringent criterion selected verbs nominated by 10 or more lists. These two analyses determined whether list consensus improved when more lists agreed that the verb belonged to a given category.
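A minimal sketch of this filtering step appears below, assuming citation frequencies are stored in a dictionary keyed by (category, verb). The thresholds of 4 and 10 lists come from the text; the dictionary layout, function names, and frequency values are hypothetical and invented for the example.

```python
from collections import defaultdict

# Hypothetical citation frequencies keyed by (category, verb).
freq = {
    ("know", "define"): 22,
    ("understand", "define"): 5,
    ("know", "list"): 12,
    ("apply", "illustrate"): 4,
    ("evaluate", "illustrate"): 11,
    ("create", "design"): 3,
}


def filter_by_citation(freq, min_lists):
    """Keep (category, verb) pairs nominated by at least min_lists lists."""
    return {pair: n for pair, n in freq.items() if n >= min_lists}


def percent_single_category(pairs):
    """Percentage of verbs in the collection that appear in one category only."""
    categories_by_verb = defaultdict(set)
    for category, verb in pairs:
        categories_by_verb[verb].add(category)
    single = sum(1 for cats in categories_by_verb.values() if len(cats) == 1)
    return 100.0 * single / len(categories_by_verb)


lenient = filter_by_citation(freq, 4)    # four or more lists
strict = filter_by_citation(freq, 10)    # ten or more lists
lenient_consensus = percent_single_category(lenient)
strict_consensus = percent_single_category(strict)
```

Because the filter operates on (category, verb) pairs rather than on verbs, a verb can survive in one category and be dropped from another, which is how a stricter criterion can change the percentage of single-category verbs.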
3. Results
The full collection, based on all verbs included in the 30 lists (788 verbs), included duplications within categories (created when several lists aligned a verb to the Bloom’s taxonomy category) and duplications across categories (created when different lists associated a verb with different categories or when a single list associated a verb with more than one category). No verb was assigned consistently to the same category by all 30 lists in the sample. Many lists in the sample duplicated verbs and listed them in two or more categories.
After eliminating nouns and converting nominalizations to verbs, the final data set was edited to eliminate within-category duplications. This operation reduced the collection to 433 unique verbs, of which 236 verbs (54.5%) appeared in only one category. The remaining 197 verbs (45.5%) appeared in two to six categories: 108 verbs (24.9%) appeared in two categories, 41 verbs (9.5%) appeared in three categories, 30 verbs (6.9%) appeared in four categories, 15 verbs (3.5%) appeared in five categories, and 3 verbs (choose, relate, select) appeared in all six categories. Thus, analysis of the verbs included on 30 lists indicates that list authors vary widely in how they interpret the thinking skills described by these verbs.
Researchers establish 75% agreement as the threshold criterion for acceptable inter-rater agreement for assessment instruments [12,13]. Agreement among Bloom’s taxonomy lists might be defined in two ways. First, inter-list agreement might be defined as the percentage of lists that agree in assigning a verb to a category. Second, agreement might be defined in terms of intra-list (within-list) agreement, or the internal consistency of a collection of verbs. In this case, agreement is defined as the percentage of verbs in a collection assigned to only one category.
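The two definitions of agreement can be written as simple percentages. The Python sketch below is offered only as an illustration: the function names are hypothetical, while the 30 lists, the 75% threshold, and the counts of 236 single-category verbs among 433 unique verbs are taken from the text.

```python
TOTAL_LISTS = 30            # number of source lists in the sample
ACCEPTABLE_AGREEMENT = 75.0  # threshold for acceptable agreement, in percent


def inter_list_agreement(citation_frequency, total_lists=TOTAL_LISTS):
    """Percentage of lists that assign a verb to a given category."""
    return 100.0 * citation_frequency / total_lists


def within_list_agreement(single_category_verbs, total_verbs):
    """Percentage of verbs in a collection assigned to only one category."""
    return 100.0 * single_category_verbs / total_verbs


# A verb nominated for a category by 10 of the 30 lists.
inter = inter_list_agreement(10)
# The full collection: 236 of 433 unique verbs appeared in only one category.
within = within_list_agreement(236, 433)

inter_acceptable = inter >= ACCEPTABLE_AGREEMENT
within_acceptable = within >= ACCEPTABLE_AGREEMENT
```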
Given the minimal level of within-list agreement (the percentage of verbs placed in only one category) observed for the collection based on all 30 lists, the collection was reduced to include only those verbs nominated for a category by at least 4 lists (representing 13.3% inter-list agreement). The reduced collection (176 verbs) appears in Table 1. Although less variable than the larger collection, this collection shows considerable variability in how specific verbs align with categories. This list includes 118 verbs (67%) that appear in only one category. The remaining 58 verbs appear in two to five different categories. Of these, 38 (21.6%) appear in two categories, 13 (7.4%) appear in three categories, 5 (2.8%) appear in four categories, and 2 (1.1%) appear in five categories.
A more conservative selection criterion selected verbs that had been nominated by 10 or more lists for a given category in Bloom’s taxonomy (representing 33.3% inter-list agreement). Table 2 presents this collection of 104 verbs. Although this “stricter” criterion was relatively weak in that it required only a third of the source lists to agree in nominating a verb for inclusion, this criterion eliminated nearly 76% of the verbs included in the original collection of 433 verbs. If we define consensus as the percentage of lists that agree on the alignment of verbs to a category, the resulting agreement (33.3%) is nowhere near acceptable levels established for research purposes. However, if consensus is defined in terms of within-list agreement, the estimate of agreement improves. This list includes 83 verbs (79.8%) that appear in only one category. However, 21 words (20.2%) appear in more than one category: 18 words (17.3%) appear in two categories and 3 words (2.9%) appear in three categories. Thus, adopting a criterion for inclusion that required 33.3% agreement among the lists produced acceptable levels of within-list agreement about the alignment of words with categories in Bloom’s taxonomy but eliminated many serviceable and measurable verbs. The reduced collection continued to give evidence of multiple interpretations of the level of skill described by a verb.
Perhaps the six categories in Bloom’s taxonomy create too many distinctions for consistent judgments about category membership. Would a list based on a smaller set of categories reveal more consistent interpretation of verb meaning? When faced with low reliability on a scoring rubric, we can improve reliability by asking reviewers to make fewer or simpler decisions by assigning observations to fewer categories. One justification for examining a taxonomy based on fewer categories is the fact that publisher test banks sometimes categorize multiple choice questions as fact-based, application, or conceptual questions, implying three levels of thinking skill. Increased consistency may come at the cost of making fewer discriminations. If we create a blunter instrument for describing SLOs, will we gain consistency? To answer this question, the data were examined after combining the categories know and understand to create a single category, equivalent to the concept associated with factual questions in a test bank. In addition, a category based on the top three categories (analyze, evaluate, create) created the equivalent of the concept associated with conceptual questions in a test bank. Verbs in the application category were retained to represent application questions in a test bank.
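The merging of categories described above can be sketched as a lookup table applied before counting. The mapping follows the text (know and understand become fact-based, apply becomes application, and analyze, evaluate, and create become conceptual); the function name and the sample (category, verb) pairs are hypothetical and invented for illustration.

```python
from collections import defaultdict

# Mapping from Bloom's six categories to the three merged categories.
THREE_LEVEL = {
    "know": "fact-based",
    "understand": "fact-based",
    "apply": "application",
    "analyze": "conceptual",
    "evaluate": "conceptual",
    "create": "conceptual",
}

# Hypothetical (category, verb) pairs, invented for illustration only.
pairs = [
    ("know", "define"), ("understand", "define"),
    ("apply", "illustrate"), ("evaluate", "illustrate"), ("create", "illustrate"),
    ("know", "compare"), ("apply", "compare"), ("analyze", "compare"),
]


def merged_spread(pairs, mapping=THREE_LEVEL):
    """Count the merged categories in which each verb appears."""
    levels_by_verb = defaultdict(set)
    for category, verb in pairs:
        levels_by_verb[verb].add(mapping[category])
    return {verb: len(levels) for verb, levels in levels_by_verb.items()}


spread = merged_spread(pairs)
```

In this invented sample, a verb split between know and understand collapses into a single merged category, whereas a verb that spans the lower, middle, and upper levels remains ambiguous after merging.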
Based on a three-category organization, 304 (70.2%) of the 433 verbs in the full collection were associated with a unique category, 91 (21%) were associated with two categories, and 38 (8.8%) were associated with all three categories. Thus, organizing the verbs in three categories produced only a minimal reduction in ambiguity. Moreover, the 38 verbs that aligned least consistently, appearing in all three categories, were verbs that frequently appeared on posted lists. In some cases, these verbs were nominated for category membership by 15 or more lists in the sample. Moreover, nearly all of these verbs were nominated by 10 or more lists to each of two categories. Clearly, authors of SLOs employ multiple meanings for these verbs.
Requiring greater inter-list agreement about including a verb in a category did not improve the internal consistency of the resulting collection. Analysis of the 104 verbs identified by 10 or more lists using the three-category organization (fact-based, application, or conceptual) produced even lower within-list agreement: 28 verbs (26.9%) appeared in all three categories, 39 verbs (37.5%) appeared in two categories, and only 37 verbs (35.6%) appeared in one category.
4. Discussion
Adelman provides a comprehensive analysis of the challenges faculty face when they set out to write meaningful, measurable SLOs [7]. His analysis primarily focuses on the challenge of writing measurable SLOs, although he briefly discusses the role of context for articulating differences in disciplinary skills. SLOs for the Degree Qualifications Profile attempt to differentiate among degree programs by using measurable language for SLOs based on action verbs. Adelman proposes nearly 20 sets of “operational verbs” that establish clear connections between expectations about student learning and strategies for assessing this learning. However, Adelman offers scant advice on how using these categories to write SLOs will help differentiate between learning expectations represented by an associate’s, bachelor’s, or master’s level degree.
If verbs were consistently aligned with specific levels of Bloom’s taxonomy, this taxonomy would reduce ambiguity about levels of expertise instructors describe when they select language to write learning outcomes about the academic goals for a course. However, the analysis reported here suggests that existing lists of words organized by level of Bloom’s taxonomy cannot unambiguously support decisions about the differences in learning expected and distinguish between academic achievement expected for associate’s, bachelor’s, and master’s degree programs.
The challenge faced by verb-based frameworks is that language is notoriously flexible. Context modifies meaning, which explains why a verb associated with a low level in Bloom’s taxonomy in one context (recognize the definition of technical terms) may be associated with a high level of Bloom’s taxonomy in another context (recognize professional situations that produce a conflict of interest). Independent of context, many words have several meanings, which contributes to ambiguity about the level of cognitive skill intended. For example, although we might argue that “rewrite” means “write again”, an author’s intended meaning might correspond to a low-level skill (copy or transcribe) or a high-level skill (revise or edit). Paul makes a similar argument against categorizing “recall” as a low-level cognitive skill [10]. He describes circumstances in which recall represents high-level thinking processes, arguing that unless the intent of recall is limited to rote recall or mechanical repetition or reproduction of content, recall entails complex judgments about the credibility of the content accepted as part of a student’s belief system. Paul’s interpretation of the cognitive skills embedded in one simple verb casts doubt on the power of verbs alone to specify the level of expertise described by an SLO.
One solution might be to abandon verbs with multiple interpretations and select verbs with more precise meanings instead. However, this solution eliminates many wonderful, measurable verbs that serve well to articulate learning in language suitable for a broad audience.
Degree-granting institutions must clearly articulate how learning differs among students who earn the various degrees awarded. Regional accrediting bodies in the United States establish standards that require institutions to differentiate the learning goals articulated for undergraduate, graduate, and other degree programs [14,15,16,17,18,19]. This differentiation includes providing evidence that graduate programs “advance the student substantially beyond the educational accomplishments of a baccalaureate degree program” and “reflect the high level of complexity, specialization, and generalization inherent in advanced academic study” [16]. Learning outcomes for graduate programs should be “progressively more advanced in academic content” [18] and “require greater depth of study and increased demands on student intellectual or creative capacities” [19]. Additional language in accreditation standards describes “knowledge of the literature” of the discipline or field and “ongoing engagement” in research, scholarship, creative expression, and/or professional practice and training [18,19]. These standards allude to increased intellectual demands without mandating particular language for learning outcomes for degree programs.
Academics argue that undergraduate students learn under the guidance of an expert, master’s students learn to teach themselves disciplinary knowledge and skills, and doctoral students learn to create new disciplinary knowledge [20]. Academic programs often try to use Bloom’s taxonomy as a way to describe the expectations for increased expertise students exhibit in advanced academic work. SLOs for introductory courses (lower-level or general education courses) might be described using verbs selected from lower levels of Bloom’s taxonomy (know, understand, apply). SLOs for advanced courses describe higher order thinking skills aligned with the upper levels of Bloom’s taxonomy (analyze, evaluate, create). Based on this framework, graduate programs should describe SLOs that are dominated by verbs from high levels of Bloom’s taxonomy. However, this alignment is not rigid or formulaic. Introductory courses might include SLOs that describe higher-order thinking skills because these courses create opportunities for students to develop and practice skills expected at more advanced levels. In contrast, SLOs written for advanced courses might describe proficient use of these higher-order skills. In addition, advanced courses might include SLOs that use verbs from low levels of Bloom’s taxonomy to describe the initial acquisition of advanced disciplinary knowledge and skills that might be considered to be too challenging to introduce to novices in the field.
Perry and Clinchy describe models for the development of cognitive skill that suggest an alignment between academic programs and increasingly advanced thinking styles that could be described using verbs selected from levels of Bloom’s taxonomy [5,6]. For example, Perry argues that beginning students display dualistic thinking, which treats knowledge as a canon of known “right answers”. Dualistic students focus on acquiring and reproducing the “correct facts”. They rely on an expert authority (the teacher) to tell them which facts or answers are correct and suitable for memorization. Clinchy calls these learners “received knowers” because these students receive knowledge in a passive way from an authority. SLOs that describe activities such as the retention and reproduction of facts, drawing from Bloom’s categories of know and understand, align with the cognitive skills and dispositions of dualistic learners.
More advanced students (those in Perry’s “multiplicity” stage or “subjective knowledge” in Clinchy’s model) discover that authorities consider multiple opinions and points of view as legitimate. The “right” answers might be undetermined. In the multiplicity stage, students treat all opinions as equally valid. More advanced students, who are relativistic thinkers, realize that some opinions are more valued than others. Eventually, students develop agency and make rational choices about the relative merit of competing opinions. These students use discipline-specific criteria about evidence and argument to make decisions and build arguments in favor of one interpretation. Clinchy refers to this cognitive stage in terms of “procedural knowledge”. SLOs that describe learning for students in these later stages would employ verbs associated with higher-level categories in Bloom’s taxonomy (analyze, evaluate, create).
Bloom’s taxonomy is a useful heuristic for describing the cognitive demands imposed by learning tasks and assessments [4]. For example, Bloom argued that teachers who write objective exam questions can use the taxonomy to determine whether questions require only a superficial knowledge of content (recalling facts, retrieving the definitions of technical terms) or require advanced thinking skills (solving a problem, interpreting evidence, predicting an outcome based on a theoretical model). In the context of a specific question or task, the action verbs we associate with levels of Bloom’s taxonomy have specific and unambiguous meanings. Taken out of context, these verbs lose their specificity. However, in the context created by other verbs (e.g., a list of verbs for a taxonomy), a verb might take on one meaning, but may acquire a different meaning in the context created by a different set of verbs identified for a different level in the taxonomy. The context created by the object of an SLO also creates specificity of meaning. Thus, reviewers must consider both the verb and the remaining content of an SLO when they make judgments about the level of expertise an SLO describes.
Another approach to distinguishing between academic degrees is reflected in the language of accreditation standards. Although accreditation standards for graduate programs describe expectations for increased intellectual demands in graduate level programs, they decline to specify language for student learning outcomes. Instead, graduate work is distinguished from undergraduate work because it requires students to make use of the primary literature of the discipline to draw inferences for scholarly activity, whereas students in undergraduate programs might rely on the secondary literature, which presents an expert interpretation of the primary literature. Description of professional scholarship, research, and other creative activity completed by graduate students implies that these students will make a novel contribution to the primary literature. Students in applied graduate programs engage in professional activities and apply principles from the primary literature to solve real world problems in the discipline (professional practice and training).
The DQP Grid articulates SLOs for associate’s, bachelor’s, and master’s degree programs in each of six skill categories [3]. The contextual language of these SLOs rather than the verb selected supports the progressively more advanced nature of learning in degrees. The SLOs articulated for associate level degrees describe the acquisition and use of existing knowledge and methods in the discipline. In contrast, the language in outcomes described for master’s level degrees describes the use of primary sources (disciplinary literature) to complete tasks, generation of novel solutions, and use of disciplinary research methods to make new contributions to the field. Contextual content included in SLOs written for bachelor’s and master’s degree programs describes increasing levels of independent, self-directed learning and professional activities [20].
Evidence for the advanced nature of learning in graduate degrees might also be based on the disciplinary content and skill described in a learning outcome. Some skills and specialized knowledge are reserved for master’s and doctoral level education. For example, access to disciplinary assessment tools (e.g., personality and IQ tests) and training for the skills required to administer and interpret test results may be limited to students at the graduate level. However, the nature of this specialized knowledge is likely to be unique to disciplines and not obvious to broader audiences.