**2. Methods**

To develop a Mastery Rubric for Stewardship (MR-S, Table 1), we completed each of the three elements of a Mastery Rubric (MR), described below. We then sought to determine the value and utility of the construct through three case studies. In each, a Degrees of Freedom Analysis [18,19] was used to assess the alignment of the MR-S KSAs with the professional practice standards of one field, which are relevant for practitioners of that field in or outside of academia.


**Table 1.** The Mastery Rubric for Stewardship.

evidence-based reasoning.

to critically evaluate their work according to disciplinary standards.

contributions to the field are well-reasoned and well-supported.



Note: \* This cell encompasses the entire MR-S, as well as the entire definition of the steward. Many independent practitioners (journeymen) have responsibilities to mentor or instruct, but have not accumulated the evidence of qualification, or the focus on diagnosing and remediating challenges encountered earlier in development, that distinguishes the Master.

## *2.1. KSA Identification*

The first step in any MR is to identify core KSAs. We started by applying a cognitive task analysis (CTA; [20]) to the Carnegie Foundation's original definition of stewardship. This analysis is outlined in the Supplementary Materials. KSAs were taken directly from the main characteristics of stewardship: generation, evaluation, conservation, and responsibility in disciplinary/scholarly writing, teaching, application, and communication. Since the MR is a tool for curriculum development, the "teachability" of KSAs and the observability of learning are prioritized; thus, KSAs were revised until these features were realized. KSA articulation was also informed by whether performance at a given stage in a developmental trajectory could be demonstrated concretely, within a variety of curricula, for the specific KSA. The first-draft KSAs were further separated where this brought clarity to the performance level descriptions, and/or where the evidence supporting achievement of different KSAs could plausibly be separable (by job description or by intention). Ongoing discussion led to consensus on the final KSAs. This process is shown in the Supplementary Materials.

## *2.2. Trajectory Articulation*

Step 2 in developing a Mastery Rubric is the articulation of the stages along the trajectory. The trajectory is designed to ensure that each KSA is learnable and improvable, with concrete opportunities for assessment and demonstration, at each stage. As with all but one MR (see [6]), we used the European guild structure (see [21] (p. 182)), which identifies novice, apprentice, journeyman, and master stages or levels. These developmental stages conveniently map onto higher education generally, as well as onto many professions. Thus, this trajectory can be used for curriculum development or evaluation across educational contexts, and the MR-S can also be implemented in the workplace.

## *2.3. Performance Level Descriptors*

The third step in developing an MR is to describe what each KSA "looks like" when performed at a given stage. Because stewardship KSAs cannot be "tested" but must be observable, we sought to formally specify the level of KSA performance that would be minimally required for an individual to be classified into a given stage on that KSA (following [22], p. 4). The first and last authors did this using a formal approach to performance level description (PLD), combined with the assumption that the performance of a KSA by someone at a given stage can be described using the level of Bloom's taxonomy [23] (see Appendix A) required for the demonstration or performance of the KSA. Bloom's taxonomy is one of the most widely used, empirically and theoretically supported taxonomies for cognitive functions and is featured in every MR to date. Rather than rely on age, career stage, or other such criteria, we drafted *Range* PLDs [24] (pp. 91–92) that could describe the complex behaviors each KSA represents.

The three co-authors, who come from different disciplinary perspectives, were participants in the Stewardship panel in 2016 specifically because of their expertise in the development (CMG) and application (CMR and RET) of the construct of stewardship; the first and last authors served as the "subject matter experts" in this iterative standard setting (PLD drafting) exercise, following a combination of approaches articulated by Kingston and Tiemann (2012) [25]. Since our goal was to develop a tool that could be used by diverse disciplines, we required that PLDs entail teachable behaviors that would be demonstrable within a variety of curricula.

PLDs were articulated through an iterative process using a modified Body of Work procedure [25]: a first draft of each PLD for a given KSA was created based on Bloom's taxonomy by one co-author (RET), and served as the basis for "range finding" for performance of the KSA at a given stage. PLD drafting used Bloom's taxonomy, refined by appealing to the elements of assessment validity outlined by Messick (1994) [26]:


(3) What tasks will elicit these specific actions or behaviors?

The integration of Bloom's taxonomy and the Messick criteria facilitated the inclusion of concrete and observable behaviors in the PLDs that can be developed sequentially—and reinforced iteratively for deeper and sustainable learning over time.

The drafts were then discussed among the co-authors (CMR, RET) for "pinpointing", clarifying how different evidence of performance of a given KSA at a given level would be exhibited by anyone developing their stewardship. The "boundaries" between stages relied on Bloom's levels and our own individual experiences with students and colleagues at different stewardship levels. Discussions were both synchronous and asynchronous, via online meetings (CMR and RET) and email (CMR, CMG, and RET), to finalize the performance level descriptors.

As each of these three steps was initiated, the interim results were used to triangulate results at other steps in the Mastery Rubric development process. That is, discussions about the PLDs led to the identification of a "missing" KSA (see Results), and also reinforced the choice of the guild structure for the developmental trajectory when concrete descriptions of each KSA at each stage were articulated. Refinements of PLDs for one KSA led to revisions and refinements in other KSA PLDs, to ensure that we pinpointed performance in terms of stage and KSA, and also that the PLDs were not redundant.
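One way to picture what these three steps produce is a grid with one cell per KSA and stage, each cell holding a PLD. The following is a minimal sketch under that assumption, not the procedure the authors used: the KSA labels and descriptor strings are hypothetical placeholders, the function names are illustrative, and exact-text matching is only a crude stand-in for the judgment-based redundancy review described above.

```python
# Stages of the guild-structure trajectory named in the text.
STAGES = ["novice", "apprentice", "journeyman", "master"]


def missing_cells(rubric, stages=STAGES):
    """Return (ksa, stage) pairs that have no descriptor text."""
    return [
        (ksa, stage)
        for ksa, row in rubric.items()
        for stage in stages
        if not row.get(stage, "").strip()
    ]


def duplicated_plds(rubric):
    """Return pairs of cells whose descriptor text is identical.

    Exact-match comparison is only a crude proxy for redundancy.
    """
    first_seen = {}
    duplicates = []
    for ksa, row in rubric.items():
        for stage, text in row.items():
            key = text.strip().lower()
            if not key:
                continue
            if key in first_seen:
                duplicates.append((first_seen[key], (ksa, stage)))
            else:
                first_seen[key] = (ksa, stage)
    return duplicates


# Hypothetical KSA labels and placeholder descriptors (not the MR-S text).
# "KSA 2" deliberately lacks a descriptor for the last stage.
rubric = {
    "KSA 1": {s: f"placeholder descriptor for KSA 1 at {s} stage" for s in STAGES},
    "KSA 2": {s: f"placeholder descriptor for KSA 2 at {s} stage" for s in STAGES[:3]},
}

print(missing_cells(rubric))    # cells still lacking a PLD
print(duplicated_plds(rubric))  # cells sharing identical text
```

A check of this kind can only flag empty or verbatim-duplicate cells; whether two differently worded PLDs describe the same performance remains a matter for discussion among subject matter experts.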

## *2.4. Validity Evidence*

Once the Mastery Rubric for Stewardship was created (see Results), case studies were used both to study its validity, as a function of its relevance for professional preparation, and to assess the evidence for our claim that stewardship can be expanded beyond doctoral education to professional preparation. If the MR-S can be used to support training of professionals as well as scholars, there should be considerable alignment between the KSAs and professional practice guidelines, which are intended to guide both scholars and professional practitioners. Degrees of Freedom Analyses were used in all validation analyses; i.e., we aligned the KSAs, treated as the predictions of the MR-S, with the guideline principles, tabulating in the marginals the number of instances of simple alignment of these features. We did not carry out statistical analysis on the marginals, relying only on the qualitative assessment of observed alignment to determine whether there was evidence that the MR-S can be useful in training professionals (to behave in concordance with professional practice guidelines). If there was minimal alignment, then we would conclude that the MR-S would *not* be useful for this training, although it might still be useful for training in stewardship itself, if not in professional practice.
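The tabulation step can be illustrated with a minimal sketch. The KSA labels, principle labels, and alignment marks below are hypothetical placeholders, not data from the case studies, and the function name `marginals` is our own. The sketch shows only how instances of simple alignment are marked in a KSA-by-principle grid and counted in the marginals, with no statistical test applied.

```python
# Hypothetical labels and alignment marks, for illustration only.
ksas = ["KSA 1", "KSA 2", "KSA 3"]
principles = ["Principle A", "Principle B"]

# aligned[(ksa, principle)] is True where a simple alignment was judged present.
aligned = {
    ("KSA 1", "Principle A"): True,
    ("KSA 1", "Principle B"): False,
    ("KSA 2", "Principle A"): True,
    ("KSA 2", "Principle B"): True,
    ("KSA 3", "Principle A"): False,
    ("KSA 3", "Principle B"): True,
}


def marginals(ksas, principles, aligned):
    """Count alignments per KSA (row totals) and per principle (column totals)."""
    row_totals = {
        k: sum(aligned.get((k, p), False) for p in principles) for k in ksas
    }
    col_totals = {
        p: sum(aligned.get((k, p), False) for k in ksas) for p in principles
    }
    return row_totals, col_totals


row_totals, col_totals = marginals(ksas, principles, aligned)

# Print the grid with its marginals for qualitative inspection.
print("".ljust(8) + "".join(p.ljust(14) for p in principles) + "Total")
for k in ksas:
    marks = "".join(("x" if aligned.get((k, p)) else "-").ljust(14) for p in principles)
    print(k.ljust(8) + marks + str(row_totals[k]))
print("Total".ljust(8) + "".join(str(col_totals[p]).ljust(14) for p in principles))
```

A KSA with a row total of zero would align with none of the guideline principles, and a principle with a column total of zero would be unsupported by any KSA; either pattern would be read qualitatively rather than tested.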

The conclusion that *scholarly* stewardship is supported by the Mastery Rubric for Stewardship (MR-S) was explored with one of three case studies, representing the discipline of History (Case 1). The second case represents the discipline of Statistics and Data Science. Both History and Statistics and Data Science have professional practice guidelines that already embrace the professional, as well as the scholarly, practitioner, but History is predominantly a scholarly discipline, while the majority of practitioners of Statistics and Data Science work *outside* of the academy. The third case explores the alignment of the MR-S with neurosciences, which has practice guidelines that emphasize scholarship (rather than general professional practice), although many neuroscience doctorate holders work in, or will go into, industry or other non-academic jobs.

By examining the alignment of the MR-S with these three different disciplines, we explored the relevance of the MR-S generally, to determine whether different disciplines each need their own MR-S (i.e., with discipline-specific PLDs) or whether the MR-S is sufficiently general. The former would be suggested if the alignments across these cases were highly variable. The latter is expected, given that the construct of stewardship was intended to be general (i.e., for all doctoral education), and would be suggested if the alignments of the MR-S KSAs with the diverse practice guidelines in these validating case analyses were similar and high.
