Article

A Comparison of Mixed and Partial Membership Diagnostic Classification Models with Multidimensional Item Response Models

by Alexander Robitzsch 1,2
1 IPN—Leibniz Institute for Science and Mathematics Education, Olshausenstraße 62, 24118 Kiel, Germany
2 Centre for International Student Assessment (ZIB), Olshausenstraße 62, 24118 Kiel, Germany
Information 2024, 15(6), 331; https://doi.org/10.3390/info15060331
Submission received: 9 April 2024 / Revised: 28 May 2024 / Accepted: 4 June 2024 / Published: 5 June 2024
(This article belongs to the Special Issue Second Edition of Predictive Analytics and Data Science)

Abstract
Diagnostic classification models (DCM) are latent structure models with discrete multivariate latent variables. Recently, extensions of DCMs to mixed membership have been proposed. In this article, ordinary DCMs, mixed and partial membership models, and multidimensional item response theory (IRT) models are compared through analytical derivations, three example datasets, and a simulation study. It is concluded that partial membership DCMs are similar, if not structurally equivalent, to sufficiently complex multidimensional IRT models.

1. Introduction

In the social or life sciences, humans (i.e., subjects or persons) respond to multiple tasks (i.e., items) in a test. For example, students may be asked to solve items in a mathematics test, or patients may report whether particular symptoms are present. These tests result in multivariate datasets with dichotomously scored item responses taking values of zero or one.
Let X = ( X 1 , , X I ) { 0 , 1 } I be a random vector containing I items X i ( i = 1 , , I ). In multivariate analysis, the dependency structure of the high-dimensional contingency table P ( X = x ) (for x { 0 , 1 } I ) is represented by sufficiently parsimonious models. A general model is defined in latent structure analysis [1,2,3,4,5]:
P(X = x; γ) = ∫ ∏_{i=1}^{I} P(X_i = x_i | ξ; γ_i) dF_β(ξ),
where γ = (γ_1, …, γ_I, β) is a model parameter. Note that the integration with respect to F_β can be understood with respect to any measure that is either continuous, discrete, or a mixture of both. It should be emphasized that the dependency structure in X is represented by a (finite-dimensional) latent variable ξ. The dimension of ξ is typically lower than I to achieve a convenient model interpretation of the dependency in X. The model (1) imposes a local independence assumption, meaning that the items X_i are conditionally independent given ξ. The functions ξ ↦ P(X_i = x_i | ξ; γ_i) are also referred to as item response functions (IRF).
In item response theory (IRT; Refs. [6,7,8]), the latent variable ξ is a unidimensional or multidimensional latent variable θ that can take values between minus and plus infinity. Diagnostic classification models (DCM; Refs. [9,10,11]) are particular latent class models in which the variable ξ is denoted as α and is a discrete multidimensional latent variable. It is evident that IRT and DCM offer quite distinct interpretations of the underlying latent variable to applied researchers. Recently, the exact allocation (i.e., crisp membership) of subjects to latent classes in DCMs has been weakened, allowing for gradual membership to latent classes [12,13]. Comparisons between unidimensional IRT models and DCMs have been carried out in [14,15,16,17,18,19,20,21]. However, comparisons of DCMs with multidimensional IRT models are scarce in the literature, and to our knowledge, there is no systematic comparison of multidimensional IRT models and mixed membership DCMs.

Purpose

In this article, DCMs, mixed membership DCMs, and multidimensional IRT models are compared using three publicly available datasets and by conducting a simulation study focusing on model selection.
The remainder of the article is structured as follows. Section 2 reviews compensatory and noncompensatory multidimensional IRT models. DCMs are reviewed in Section 3. The extension of ordinary DCMs to DCMs with mixed or partial membership is outlined in Section 4. Section 5 provides a heuristic comparison of multidimensional IRT models, DCMs, and DCM partial membership extensions. In Section 6, three empirical datasets are analyzed using different specifications of the three modeling approaches. Moreover, Section 7 presents findings from a simulation study devoted to model selection. Finally, the article closes with a discussion in Section 8.

2. Multidimensional Item Response Model

Multidimensional IRT (MIRT) models [22,23] assume a vector θ = ( θ 1 , , θ D ) of real-valued ability variables. That is, abilities θ d can range from minus to plus infinity. The MIRT model can be written as:
P(X = x; γ) = ∫ ∏_{i=1}^{I} P(X_i = x_i | θ; γ_i) f_β(θ) dθ,
where f β denotes the multivariate density function of θ that depends on the unknown parameter β . Most frequently, a multivariate normal distribution for θ is assumed, and its density is given as:
f_θ(x; μ, Σ) = (2π)^{−D/2} |Σ|^{−1/2} exp( −(1/2) (x − μ)′ Σ^{−1} (x − μ) ) for x ∈ ℝ^D,
where μ and Σ are the mean vector and the covariance matrix of θ , respectively. Moreover,  | Σ | denotes the determinant of Σ . In MIRT models that estimate item intercepts and item slopes for all items separately, all means are usually fixed to 0, and all standard deviations are fixed to 1 for identification reasons.
In this article, we confine ourselves to confirmatory MIRT models. Items are allocated to dimensions of θ in the specification of the model. The allocation can be encoded in a Q-matrix [24,25] that defines which item loads on which dimension. Two types of loading structures can be distinguished: between-item dimensionality and within-item dimensionality [26,27].

2.1. Between-Item Dimensionality

In between-item dimensionality [26,28], all items load on one and only one dimension. That is, item X_i loads on only one dimension d1[i]. In this case, the multidimensional IRF reduces to a unidimensional IRF:
P ( X i = 1 | θ ; γ i ) = P ( X i = 1 | θ d 1 [ i ] ; γ i ) .

2.2. Within-Item Dimensionality

In within-item dimensionality, an item is allowed to load on more than a single dimension. Let us assume that item X i loads on the two dimensions d 1 [ i ] and d 2 [ i ] . Then, the IRF can be written as:
P ( X i = 1 | θ ; γ i ) = P ( X i = 1 | θ d 1 [ i ] , θ d 2 [ i ] ; γ i ) .
In the rest of the paper, models for within-item dimensionality are only discussed in the case that items load on at most two dimensions. The case in which items load on more than two dimensions is not conceptually different but introduces much more notation. Therefore, we restrict ourselves to only two-dimensional IRT models to simplify the technical aspects of our arguments.
The most frequent choice of the IRF in the case of between-item dimensionality is the multivariate variant of the 2PL model [29,30] and is given as:
P ( X i = 1 | θ d 1 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] θ d 1 [ i ] ,
where λ i 0 denotes the item intercept and λ i 1 , d 1 [ i ] the item slope, respectively, and Ψ denotes the logistic distribution function.
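As a small illustration, this IRF can be evaluated directly in R (plogis() is the logistic distribution function Ψ); the item parameters below are assumed values chosen purely for illustration and are not taken from any of the analyzed datasets.

# Minimal R sketch of the between-item 2PL IRF with assumed item parameters.
irf_2pl <- function(theta, lambda0, lambda1) {
  plogis(lambda0 + lambda1 * theta)
}
irf_2pl(theta = c(-1, 0, 1), lambda0 = -0.5, lambda1 = 1.2)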
There is more flexibility in defining IRFs in the case of within-item dimensionality than for between-item dimensionality. Different alternatives are discussed in the following subsections.

2.2.1. Compensatory Multidimensional IRT Model

In the case of within-item dimensionality, the IRF of item X i depends on the θ values of dimensions d 1 [ i ] and d 2 [ i ] . The compensatory MIRT model [23,30] is defined as:
P ( X i = 1 | θ d 1 [ i ] , θ d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] θ d 1 [ i ] + λ i 1 , d 2 [ i ] θ d 2 [ i ] ,
where λ_{i1,d1[i]} and λ_{i1,d2[i]} are dimension-specific item slopes. It can be seen in (7) that the terms λ_{i1,d1[i]} θ_{d1[i]} and λ_{i1,d2[i]} θ_{d2[i]} can compensate for each other. Hence, for example, low values in θ_{d1[i]} can be compensated by large values in θ_{d2[i]}. Nevertheless, the IRF in (7) allows a flexible relationship of the required abilities for item i. By allowing for compensation in the abilities, the abilities θ_{d1[i]} and θ_{d2[i]} can operate somewhat independently, which is not necessarily the case for the noncompensatory MIRT models discussed in the next subsection.
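A minimal sketch in R (with assumed item parameters) illustrates the compensatory behavior: a deficit on one dimension can be offset by a surplus on the other, leaving the response probability unchanged.

# Compensatory IRF sketch; all parameter values are assumed for illustration.
irf_comp <- function(theta1, theta2, lambda0, lambda11, lambda12) {
  plogis(lambda0 + lambda11 * theta1 + lambda12 * theta2)
}
irf_comp(-2, 2, lambda0 = 0, lambda11 = 1, lambda12 = 1)  # 0.5
irf_comp(0, 0, lambda0 = 0, lambda11 = 1, lambda12 = 1)   # also 0.5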

2.2.2. Noncompensatory Multidimensional IRT Model

In some applications, it can be questioned whether the compensatory MIRT model makes assumptions that are likely to be met. In contrast, it could be argued that low abilities in one dimension cannot be compensated by high abilities in another dimension. For this reason, noncompensatory MIRT models have been proposed. A noncompensatory functioning of the abilities θ_{d1[i]} and θ_{d2[i]} can be obtained by defining the multidimensional IRF as a product of dimension-wise IRFs [31,32,33,34]:
P ( X i = 1 | θ d 1 [ i ] , θ d 2 [ i ] ; γ i ) = Ψ λ i 0 , d 1 [ i ] + λ i 1 , d 1 [ i ] θ d 1 [ i ] Ψ λ i 0 , d 2 [ i ] + λ i 1 , d 2 [ i ] θ d 2 [ i ] .
Note that item intercepts are separately defined for each dimension. Hence, the noncompensatory MIRT model (8) has one additional item parameter compared to the compensatory MIRT model (7). An alternative approach to (8) was proposed in [35], in which the minimum instead of the product of the two probabilities was chosen.
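The product form implies that a deficit on either dimension caps the success probability, which the following R sketch (with assumed item parameters) makes explicit.

# Noncompensatory IRF sketch: product of dimension-wise logistic terms.
irf_noncomp <- function(theta1, theta2, l01, l11, l02, l12) {
  plogis(l01 + l11 * theta1) * plogis(l02 + l12 * theta2)
}
irf_noncomp(-2, 2, l01 = 0, l11 = 1, l02 = 0, l12 = 1)   # approx. 0.105
irf_noncomp(0, 0, l01 = 0, l11 = 1, l02 = 0, l12 = 1)    # 0.25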

2.2.3. Partially Compensatory Multidimensional IRT Model

It might be too restrictive to decide between the extreme cases of the compensatory MIRT model (7) and the noncompensatory MIRT model (8). A partially compensatory MIRT model that contains the former two MIRT models as particular cases has been proposed [36]. By defining the linear terms:
T i , d 1 [ i ] = λ i 0 , d 1 [ i ] + λ i 1 , d 1 [ i ] θ d 1 [ i ] and T i , d 2 [ i ] = λ i 0 , d 2 [ i ] + λ i 1 , d 2 [ i ] θ d 2 [ i ] ,
the IRF of the partially compensatory MIRT model is given as:
P(X_i = 1 | θ_{d1[i]}, θ_{d2[i]}; γ_i) = exp(T_{i,d1[i]} + T_{i,d2[i]}) / [ 1 + exp(T_{i,d1[i]} + T_{i,d2[i]}) + λ_{i12} ( exp(T_{i,d1[i]}) + exp(T_{i,d2[i]}) ) ],
where the interaction parameter λ i 12 ranges between 0 and 1. A value of 0 for λ i 12 corresponds to the compensatory MIRT model (7), while a value of 1 corresponds to the noncompensatory MIRT model (8).
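The following R sketch (with assumed item parameters) evaluates this IRF and illustrates that the two boundary values of the interaction parameter reproduce the compensatory and the noncompensatory model, respectively.

# Partially compensatory IRF sketch; lambda12 = 0 gives the compensatory
# model and lambda12 = 1 the noncompensatory model (assumed parameters).
irf_pc <- function(theta1, theta2, l01, l11, l02, l12, lambda12) {
  T1 <- l01 + l11 * theta1
  T2 <- l02 + l12 * theta2
  exp(T1 + T2) / (1 + exp(T1 + T2) + lambda12 * (exp(T1) + exp(T2)))
}
irf_pc(-2, 2, 0, 1, 0, 1, lambda12 = 0)   # compensatory: 0.5
irf_pc(-2, 2, 0, 1, 0, 1, lambda12 = 1)   # noncompensatory: approx. 0.105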
An alternative partially compensatory MIRT model has been proposed whose IRFs also involve the product of abilities θ d 1 [ i ] θ d 2 [ i ] as an additional term [37]. The multidimensional IRF of this model is given as:
P ( X i = 1 | θ d 1 [ i ] , θ d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] θ d 1 [ i ] + λ i 1 , d 2 [ i ] θ d 2 [ i ] + λ i 2 , d 1 [ i ] d 2 [ i ] θ d 1 [ i ] θ d 2 [ i ] .
The disadvantage of (11) is that the product θ d 1 [ i ] θ d 2 [ i ] can positively contribute to the item response probabilities if both abilities are positive or both are negative. This property can be regarded as unreasonable.

3. Diagnostic Classification Model

In DCMs [9,38] (also referred to as cognitive diagnostic models [39,40]), the latent variables in the latent structure model (1) are discrete. In this article, we consider the case of binary latent variables, which are also referred to as skills in the literature [41]. Let α = ( α 1 , , α D ) { 0 , 1 } D be a D-dimensional latent variable that contains binary components α d { 0 , 1 } ( d = 1 , , D ). A value of 1 of variable α d indicates mastery of dimension d, whereas a value of 0 indicates nonmastery. The DCM is defined as:
P(X = x; γ) = ∑_{α ∈ {0,1}^D} π_α ∏_{i=1}^{I} P(X_i = x_i | α; γ_i),
where the IRFs α ↦ P(X_i = x_i | α; γ_i) are now functions of α and π_α are probabilities. In a saturated probability model for α, there are 2^D − 1 probabilities π_α that can be estimated. Obviously, the probabilities must sum to 1:
∑_{α ∈ {0,1}^D} π_α = 1.
In the case of two skills (i.e., D = 2), the vector α can take the 2^2 = 4 values (0, 0), (1, 0), (0, 1), and (1, 1). For high-dimensional α vectors, positing or estimating a hierarchy among skills [42,43,44] might be convenient, which results in fewer than 2^D possible α vectors in the DCM (12).

3.1. Between-Item Dimensionality

In the case of between-item dimensionality, all items load on one and only one skill d1[i]. The IRF is given as:
P ( X i = 1 | α d 1 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] α d 1 [ i ] ,
where the item parameters λ i 0 and λ i 1 , d 1 [ i ] again denote the item intercept and the item slope, respectively. Note that (14) reparametrizes two probabilities P ( X i = 1 | α d 1 [ i ] = a ; γ i ) for a = 0 , 1 such that:
p i 1 = P ( X i = 1 | α d 1 [ i ] = 1 ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] and
p i 0 = P ( X i = 1 | α d 1 [ i ] = 0 ; γ i ) = Ψ λ i 0 .
The probability 1 − p_{i1} is also referred to as the slipping probability, whereas p_{i0} is denoted as the guessing probability [45]. The DCM with only one skill (i.e., D = 1) appeared as the mastery model in the literature [46,47].
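A short worked sketch in R (with assumed guessing and slipping probabilities) shows how the two probabilities map to the intercept/slope parametrization via the inverse logistic function qlogis().

p_guess <- 0.20                           # assumed P(X_i = 1 | nonmastery)
p_slip  <- 0.10                           # assumed slipping probability
lambda0 <- qlogis(p_guess)                # item intercept: approx. -1.39
lambda1 <- qlogis(1 - p_slip) - lambda0   # item slope: approx. 3.58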

3.2. Within-Item Dimensionality

In the following subsections, different DCMs for items with within-item dimensionality are discussed. In this case, an item X i is assumed to load on only two dimensions d 1 [ i ] and d 2 [ i ] . As argued in Section 2, the situation in which an item loads on more than two dimensions (i.e., two skills) is conceptually similar to the case of two dimensions but just introduces more notation, and it does not add more insights to this article. In analogy to MIRT models, compensatory, noncompensatory, and partially compensatory DCMs can be distinguished.

3.2.1. Compensatory DCM: ADCM

The additive diagnostic classification model (ADCM; Refs. [48,49]) allows the two skills α_{d1[i]} and α_{d2[i]} to compensate for each other. The IRF for the ADCM is defined as:
P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] α d 1 [ i ] + λ i 1 , d 2 [ i ] α d 2 [ i ] .
Note that ( α d 1 [ i ] , α d 2 [ i ] ) P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) imposes a constraint on the IRF because the four probabilities P ( X i = 1 | α d 1 [ i ] = a 1 , α d 2 [ i ] = a 2 ; γ i ) for ( a 1 , a 2 ) { 0 , 1 } 2 are represented by the three parameters λ i 0 , λ i 1 , d 1 [ i ] and λ i 1 , d 2 [ i ] .

3.2.2. Noncompensatory DCM: DINA Model

A noncompensatory DCM appeared as the deterministic inputs, noisy “and” gate (DINA; Refs. [45,50]) model in the literature. The IRF of the DINA model is given by:
P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 2 , d 1 [ i ] d 2 [ i ] α d 1 [ i ] α d 2 [ i ] .
A high probability of getting item X i correct can only be achieved if both skills α d 1 [ i ] and  α d 2 [ i ] are mastered (i.e., they have values of 1).

3.2.3. Partially Compensatory DCM: GDINA Model

The generalized deterministic inputs, noisy “and” gate (GDINA; Refs. [50,51,52]) model is a partially compensatory DCM that has the ADCM (17) and the DINA model (18) as two particular cases. It contains the two main effects of the ADCM as well as the interaction effect of the DINA model. The IRF of the GDINA model is given as:
P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] α d 1 [ i ] + λ i 1 , d 2 [ i ] α d 2 [ i ] + λ i 2 , d 1 [ i ] d 2 [ i ] α d 1 [ i ] α d 2 [ i ] .
The four item response probabilities P ( X i = 1 | α d 1 [ i ] = a 1 , α d 2 [ i ] = a 2 ; γ i ) for ( a 1 , a 2 ) { 0 , 1 } 2 are reparametrized in (19). Hence, in contrast to the ADCM and the DINA model, no constraints are imposed in the GDINA model in its most flexible form.
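The reparametrization can be illustrated with a small R sketch (assumed item parameters); the resulting 2 × 2 table contains the four freely modeled response probabilities.

# GDINA IRF sketch; rows correspond to alpha1 = 0, 1 and columns to alpha2 = 0, 1.
irf_gdina <- function(a1, a2, l0, l1, l2, l12) {
  plogis(l0 + l1 * a1 + l2 * a2 + l12 * a1 * a2)
}
outer(0:1, 0:1, irf_gdina, l0 = -1.5, l1 = 1, l2 = 1.2, l12 = 1.8)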

4. Mixed and Partial Membership Diagnostic Classification Model

Researchers have occasionally argued that the dichotomous classification into masters and nonmasters of the skills in DCMs is not always empirically tenable [53,54,55]. DCMs assume a crisp membership; that is, students can only belong to the class α = 0 or to the class α = 1. Mixed membership models (also referred to as grade of membership models) or partial membership models weaken this assumption [56,57,58,59,60,61,62]. In these models, students are allowed to switch classes (i.e., the mastery and the nonmastery states) across items [63,64,65].
In mixed membership models, the vector α = (α_1, …, α_D) ∈ {0, 1}^D of binary skills is replaced by a vector of continuous bounded variables α = (α_1, …, α_D) ∈ [0, 1]^D. The value α_d can be interpreted as the degree to which a student belongs to the mastery class α_d = 1, while 1 − α_d characterizes the degree of belonging to the class α_d = 0. It should be emphasized that the bounded latent membership vector variable α can be equivalently represented by an unbounded latent vector variable θ such that:
α d = L ( θ d ) for d = 1 , , D ,
where L : R [ 0 , 1 ] is a monotonically increasing and injective link function. The probit link function α d = Φ ( θ d ) has been used in [12]. In this article, the logistic link function  Ψ is utilized [20,66]. In this case, an injective transformation of the latent membership variable α follows a multivariate normal distribution [66]. More formally, assume that θ = ( θ 1 , , θ D ) follows a multivariate normal distribution with mean vector μ and covariance matrix Σ . The variable α is obtained by applying the logistic transformation coordinate-wise:
α_d = Ψ(θ_d), which implies θ_d = Ψ^{−1}(α_d).
We write x = Ψ^{−1}(y) as an abbreviation for the coordinate-wise application of the inverse logistic function for y = (y_1, …, y_D). By using the density transformation theorem [67], we obtain the following:
f_α(y; μ, Σ) = [ ∏_{d=1}^{D} 1 / (y_d (1 − y_d)) ] f_θ( Ψ^{−1}(y); μ, Σ ) for y ∈ [0, 1]^D.
Note that choosing a very large standard deviation for θ_d (e.g., a standard deviation of 1000) corresponds to a membership variable α whose values are concentrated near 0 and 1. That is, the crisp membership employed in DCMs can be obtained as a special case of mixed membership models.
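A minimal R sketch of the logit-normal specification (with an assumed mean vector and an assumed correlation of 0.7) draws θ from a bivariate normal distribution and maps it coordinate-wise to memberships in [0, 1].

library(MASS)                              # for mvrnorm()
set.seed(1)
Sigma <- matrix(c(1, 0.7, 0.7, 1), 2, 2)   # assumed covariance matrix
theta <- MASS::mvrnorm(n = 5, mu = c(0, 0), Sigma = Sigma)
alpha <- plogis(theta)                     # coordinate-wise logistic transformation
alpha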
As an alternative, a discrete grid of α values can be defined, and the mixed membership model is defined based on this grid. This approach is also referred to as a nonparametric assumption of the mixed membership distribution. For example, one could assume α ∈ {0, 0.5, 1}^D. In this specification, the value α_d = 0.5 can be interpreted as a partial mastery of the dth skill.
In the following two sections, we compare mixed membership and partial membership DCMs. These two modeling approaches differ in how their IRFs are defined.

4.1. Mixed Membership DCM

Mixed membership DCMs have been proposed in [12]. In general, IRFs in mixed membership models are obtained by defining a weighted sum of item response probabilities from crisp membership values associated with a corresponding latent class [12,68].

4.1.1. Between-Item Dimensionality

We first discuss the case of between-item dimensionality. The IRF for the crisp membership case (i.e., using the binary latent vector variable in the DCM) is denoted as:
P i , α d 1 [ i ] = P ( X i = 1 | α d 1 [ i ] ; γ i ) .
The IRF for the mixed membership DCM is then defined as
P(X_i = 1 | α_{d1[i]}; γ_i) = α_{d1[i]} P(X_i = 1 | α_{d1[i]} = 1; γ_i) + (1 − α_{d1[i]}) P(X_i = 1 | α_{d1[i]} = 0; γ_i).
Note that (24) can be more compactly written as:
P(X_i = 1 | α_{d1[i]}; γ_i) = α_{d1[i]} P_{i,1} + (1 − α_{d1[i]}) P_{i,0}.
Using the IRF definition of the between-item dimensionality DCM in (14), we arrive at:
P(X_i = 1 | α_{d1[i]}; γ_i) = Ψ(λ_{i0}) + α_{d1[i]} [ Ψ(λ_{i0} + λ_{i1,d1[i]}) − Ψ(λ_{i0}) ].
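A small R sketch (with assumed item parameters) shows that this IRF is simply a linear interpolation in α_{d1[i]} between the guessing and the mastery probability.

irf_mm <- function(alpha, lambda0, lambda1) {
  plogis(lambda0) + alpha * (plogis(lambda0 + lambda1) - plogis(lambda0))
}
irf_mm(alpha = c(0, 0.5, 1), lambda0 = -1, lambda1 = 2)  # linear in alpha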

4.1.2. Within-Item Dimensionality

Now, we discuss mixed membership DCMs in the case of within-item dimensionality. In the DCM with a crisp membership, we denote the IRF as:
P i , α d 1 [ i ] α d 2 [ i ] = P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) .
The IRF can follow the ADCM, the DINA, or the GDINA model. The IRF for the mixed membership variable α is defined as (see [12]):
P(X_i = 1 | α_{d1[i]}, α_{d2[i]}; γ_i) = α_{d1[i]} α_{d2[i]} P_{i,11} + α_{d1[i]} (1 − α_{d2[i]}) P_{i,10} + (1 − α_{d1[i]}) α_{d2[i]} P_{i,01} + (1 − α_{d1[i]}) (1 − α_{d2[i]}) P_{i,00}.
The IRF in (28) holds generally for any specified crisp membership DCM. That is, by using the ADCM, the DINA, or the GDINA model, different specifications of the mixed membership DCM are obtained.
Note that (28) can be rewritten as:
P(X_i = 1 | α_{d1[i]}, α_{d2[i]}; γ_i) = P_{i,00} + α_{d1[i]} (P_{i,10} − P_{i,00}) + α_{d2[i]} (P_{i,01} − P_{i,00}) + α_{d1[i]} α_{d2[i]} (P_{i,11} + P_{i,00} − P_{i,10} − P_{i,01}).
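A minimal R sketch (with assumed crisp probabilities) evaluates this weighted combination for one pair of membership values.

# Within-item mixed membership IRF; P00, P10, P01, P11 are assumed values.
irf_mm2 <- function(a1, a2, P00, P10, P01, P11) {
  a1 * a2 * P11 + a1 * (1 - a2) * P10 + (1 - a1) * a2 * P01 +
    (1 - a1) * (1 - a2) * P00
}
irf_mm2(a1 = 0.7, a2 = 0.3, P00 = 0.2, P10 = 0.5, P01 = 0.6, P11 = 0.9)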

4.2. Partial Membership DCM

Like mixed membership models, partial membership models [69] also weaken the assumption of a crisp membership but define their IRFs differently. While mixed membership models define the IRF as a weighted sum of the IRFs from the crisp membership case, partial membership models define the IRF as a normalized weighted geometric mean of the IRFs from the crisp membership case [68,70]. We now separately describe the cases of between-item and within-item dimensionality in the next two subsections.

4.2.1. Between-Item Dimensionality

The IRFs in the between-item dimensionality in the crisp membership case (i.e., the DCMs) are denoted as:
P_{i,α_{d1[i]}} = P(X_i = 1 | α_{d1[i]}; γ_i) and Q_{i,α_{d1[i]}} = 1 − P(X_i = 1 | α_{d1[i]}; γ_i).
If the logistic link function Ψ is utilized for defining the IRFs in the DCM, they can be written as:
P_{i,α_{d1[i]}} = Ψ(η_{i,α_{d1[i]}}) = exp(η_{i,α_{d1[i]}}) / (1 + exp(η_{i,α_{d1[i]}})) and Q_{i,α_{d1[i]}} = Ψ(−η_{i,α_{d1[i]}}) = 1 / (1 + exp(η_{i,α_{d1[i]}}))
with the linear predictor η_{i,α_{d1[i]}} = λ_{i0} + λ_{i1,d1[i]} α_{d1[i]}; that is, η_{i,0} = λ_{i0} and η_{i,1} = λ_{i0} + λ_{i1,d1[i]}.
The IRF for the partial membership DCM is defined as:
P(X_i = 1 | α_{d1[i]}; γ_i) = P_{i,1}^{α_{d1[i]}} P_{i,0}^{1 − α_{d1[i]}} / C_i(α_{d1[i]}) and P(X_i = 0 | α_{d1[i]}; γ_i) = Q_{i,1}^{α_{d1[i]}} Q_{i,0}^{1 − α_{d1[i]}} / C_i(α_{d1[i]}),
where C_i(α_{d1[i]}) = P_{i,1}^{α_{d1[i]}} P_{i,0}^{1 − α_{d1[i]}} + Q_{i,1}^{α_{d1[i]}} Q_{i,0}^{1 − α_{d1[i]}}.
In Appendix A, the IRF is derived as:
P ( X i = 1 | α d 1 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] α d 1 [ i ] .
This IRF looks identical to the IRF in the between-item dimensionality MIRT model in which θ d 1 [ i ] is replaced with α d 1 [ i ] . We elaborate on the connection of the two approaches in more detail in Section 5.
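The equivalence can also be checked numerically; the following R sketch (with assumed item parameters) evaluates the normalized product definition and the derived logistic form on a grid of membership values.

pm_irf_def <- function(alpha, lambda0, lambda1) {
  P1 <- plogis(lambda0 + lambda1); P0 <- plogis(lambda0)
  num <- P1^alpha * P0^(1 - alpha)
  num / (num + (1 - P1)^alpha * (1 - P0)^(1 - alpha))
}
alpha <- seq(0, 1, by = 0.25)
cbind(definition = pm_irf_def(alpha, -1, 2), closed_form = plogis(-1 + 2 * alpha))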

4.2.2. Within-Item Dimensionality

We now discuss the definition of the IRF for a partial membership DCM in the case of within-item dimensionality. Let again the IRF for the crisp membership DCMs be denoted by:
P_{i,α_{d1[i]} α_{d2[i]}} = P(X_i = 1 | α_{d1[i]}, α_{d2[i]}; γ_i) and Q_{i,α_{d1[i]} α_{d2[i]}} = 1 − P_{i,α_{d1[i]} α_{d2[i]}}.
The IRFs of the DCMs discussed in this paper utilize the logistic link function. The notation in (34) includes the IRFs of the ADCM, DINA, and the GDINA model.
We can rewrite (34) as:
P_{i,α_{d1[i]} α_{d2[i]}} = Ψ(η_{i,α_{d1[i]} α_{d2[i]}}) = exp(η_{i,α_{d1[i]} α_{d2[i]}}) / (1 + exp(η_{i,α_{d1[i]} α_{d2[i]}})).
The IRF for the partial membership DCM can generally be defined as:
P(X_i = 1 | α_{d1[i]}, α_{d2[i]}; γ_i) = P_{i,11}^{α_{d1[i]} α_{d2[i]}} P_{i,10}^{α_{d1[i]} (1 − α_{d2[i]})} P_{i,01}^{(1 − α_{d1[i]}) α_{d2[i]}} P_{i,00}^{(1 − α_{d1[i]}) (1 − α_{d2[i]})} / C_i(α_{d1[i]}, α_{d2[i]}) and
P(X_i = 0 | α_{d1[i]}, α_{d2[i]}; γ_i) = Q_{i,11}^{α_{d1[i]} α_{d2[i]}} Q_{i,10}^{α_{d1[i]} (1 − α_{d2[i]})} Q_{i,01}^{(1 − α_{d1[i]}) α_{d2[i]}} Q_{i,00}^{(1 − α_{d1[i]}) (1 − α_{d2[i]})} / C_i(α_{d1[i]}, α_{d2[i]}),
where the constant C i ( α d 1 [ i ] , α d 2 [ i ] ) is chosen such that the probabilities in (36) and (37) add to 1.
In Appendix B, the IRF is simplified to:
P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] α d 1 [ i ] + λ i 1 , d 2 [ i ] α d 2 [ i ] + λ i 2 , d 1 [ i ] d 2 [ i ] α d 1 [ i ] α d 2 [ i ] .
Interestingly, the IRF in (38) is the same as in the DCM in which the binary skills α d are substituted with continuous skills α d . These kinds of models are also referred to as probabilistic (membership) DCMs [13,71,72,73,74,75]. Hence, this section demonstrated the equivalence of partial membership and probabilistic membership DCMs if the IRFs are based on the logistic link function (see also [20]).

5. Heuristic Comparison of the Different Modeling Approaches

In this section, MIRT models, (crisp membership) DCMs, and partial membership DCMs are heuristically compared (see also [70,76]). Note that the multidimensional real-valued variable θ ∈ ℝ^D from a MIRT model can be obtained from the bounded variable α ∈ [0, 1]^D by applying the inverse logistic transformation (see also [77]):
θ_d = Ψ^{−1}(α_d)
for all dimensions d = 1 , , D . Hence, the IRFs based on θ from the MIRT model can be alternatively formulated as IRFs based on the transformed latent variable α .
For two skills an equivalent representation, and for at least three skills a more parsimonious representation, of α ∈ {0, 1}^D can be obtained by defining the binary skill α_d as a discretization of the continuous membership α_d; that is:
α d = 1 ( α d > 0.5 ) = C ( α d ) ,
where 1 denotes the indicator function. This underlying multivariate normal distribution approach to DCMs has been discussed in [78]. Hence, (crisp membership) DCMs can be interpreted as partial membership models in which the discretization operator C is applied in the IRFs.
We now discuss the different implications of the approaches. The GDINA model for the partial membership DCM has the IRF
P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] α d 1 [ i ] + λ i 1 , d 2 [ i ] α d 2 [ i ] + λ i 2 , d 1 [ i ] d 2 [ i ] α d 1 [ i ] α d 2 [ i ] .
The DCM GDINA specification can be formulated as:
P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] C ( α d 1 [ i ] ) + λ i 1 , d 2 [ i ] C ( α d 2 [ i ] ) + λ i 2 , d 1 [ i ] d 2 [ i ] C ( α d 1 [ i ] ) C ( α d 2 [ i ] ) ,
which involves the particular step function C with a cut point of 0.5 that is fixed across all items. While (38) assumes linear effects of α_{d1[i]}, α_{d2[i]}, and α_{d1[i]} α_{d2[i]}, the DCM specification follows a particular nonlinear step function. To some extent, the linear continuous effects offer more flexibility in modeling the IRFs.
The multidimensional compensatory IRT model can be written as:
P(X_i = 1 | α_{d1[i]}, α_{d2[i]}; γ_i) = Ψ( λ_{i0} + λ_{i1,d1[i]} Ψ^{−1}(α_{d1[i]}) + λ_{i1,d2[i]} Ψ^{−1}(α_{d2[i]}) )
with θ_d = Ψ^{−1}(α_d). The IRF in (43) looks quite similar to (38) when excluding the interaction effect for α_{d1[i]} α_{d2[i]}. However, a different transformation of α is utilized when defining the IRF.
Alternatively, one could also start with (38) and formulate a partially compensatory IRT model as:
P ( X i = 1 | θ d 1 [ i ] , θ d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] Ψ ( θ d 1 [ i ] ) + λ i 1 , d 2 [ i ] Ψ ( θ d 2 [ i ] ) + λ i 2 , d 1 [ i ] d 2 [ i ] Ψ ( θ d 1 [ i ] ) Ψ ( θ d 2 [ i ] ) .
In contrast to (11), the logistically transformed variables Ψ(θ_d) are used instead of the original variables θ_d.
Because the latent variable vector α in (38) is bounded, the IRF allows for partially capturing guessing and slipping effects. Hence, the partial membership DCM can be interpreted as more flexible than a MIRT model. The partial membership DCM can likely accommodate the shape of the IRF used in the four-parameter logistic (4PL; Refs. [79,80,81,82]) model and can attain lower and upper asymptotes different from 0 and 1, respectively. Note that it holds that:
lim_{(θ_{d1[i]}, θ_{d2[i]}) → (−∞, −∞)} P(X_i = 1 | θ_{d1[i]}, θ_{d2[i]}; γ_i) = Ψ(λ_{i0}) and
lim_{(θ_{d1[i]}, θ_{d2[i]}) → (∞, ∞)} P(X_i = 1 | θ_{d1[i]}, θ_{d2[i]}; γ_i) = Ψ( λ_{i0} + λ_{i1,d1[i]} + λ_{i1,d2[i]} + λ_{i2,d1[i]d2[i]} ).
Hence, the probabilistic membership DCMs can be interpreted as approximations of a multidimensional 4PL model [83,84,85,86,87]. Note that the probabilistic membership DCMs also estimate the mean vector μ and the covariance matrix Σ in the logit-normal distribution, while standardized variables are used in MIRT models for identification reasons when using a multivariate normally distributed θ variable.
Previous literature emphasized that there is likely skill continuity in DCMs. That is, the true data-generating latent variable in DCMs is α , but the discrete variable α is applied [88,89,90] (see also [91]). The strong resemblance between MIRT models and DCMs has also been studied by researcher Matthias von Davier [92,93,94,95,96].
To sum up, the similarity of the approaches can be investigated by comparing the distributional assumptions for α ∈ [0, 1]^D and the IRFs that result when the compensatory MIRT model and DCMs are reparametrized in terms of the latent variable α. In this sense, the compensatory MIRT model and DCMs can be viewed as particular cases of a partial membership DCM.

6. Empirical Examples

In this section, we compare different MIRT models, (crisp membership) DCMs, and mixed and partial membership DCMs by analyzing three publicly available datasets. The first dataset has between-item dimensionality, while the second and third datasets have within-item dimensionality. We selected subdatasets that contained only two dimensions (i.e., D = 2) to reduce model complexity and computation time. It can be expected that the main findings would not change if more than two dimensions were used.
The three datasets have frequently been analyzed in DCM applications. Therefore, it is interesting to compare DCMs with alternative model specifications.

6.1. Method

We now describe the three datasets that are used in the following empirical analyses.
The first dataset, acl, contained in the R package mokken [97,98], comprises responses of 433 subjects to items of the Dutch adjective checklist. For this analysis, we chose 20 items on the scales achievement (original Items 11 to 20) and dominance (original Items 21 to 30). This subdataset had between-item dimensionality. Each of the 20 items measured either the first or the second dimension. The dataset was dichotomized in this analysis: values of at least two were set to one, while values smaller than two were set to zero.
The second dataset, data.ecpe, contains item responses of 2922 subjects on the grammar section of the examination for the certificate of proficiency in English (ECPE) test [99,100,101,102]. The data.ecpe dataset can be accessed from the R package CDM [103]. For this analysis, we selected items that measured the first skill (Skill 1: morphosyntactic rules) or the third skill (Skill 3: lexical rules). In total, 22 items were selected; this subdataset had within-item dimensionality. Five and ten items uniquely measured the first and the second dimension, respectively, and seven items had within-item dimensionality.
The third dataset, mcmi, contains item responses of 1208 persons on a Dutch version of the Millon clinical multiaxial inventory (MCMI), designed as a questionnaire to diagnose mental disorders. The mcmi dataset is contained in the R package mokken [97,98]. Different DCMs for the whole dataset were studied in [104,105]. In the analysis, we used items from mcmi that measured the dimensions “H” (somatoform) or “CC” (major depression). The resulting subdataset of 22 items had within-item dimensionality. Eight items measured both dimensions, while four and ten items solely loaded on the first and the second dimension, respectively.
In the following analyses, six model specifications were applied. In the DCMs, the skill space for α was chosen as {0, 1}^2. The mixed membership (MM) and partial membership (PM) specifications defined a distribution on [0, 1]^2 for α. In the first approach, the logit-normal distribution was specified (MM-NO and PM-NO), in which the mean vector μ and the covariance matrix Σ were estimated. In the second approach, a discrete nonparametric distribution on {0, 0.5, 1} was used for MM and PM (denoted as MM-NP and PM-NP). This approach can be interpreted as a crude approximation of a complex continuous distribution on [0, 1]^2. In the case of between-item dimensionality, there was only a single option for specifying the IRFs. In the case of within-item dimensionality, the GDINA, the ADCM, and the DINA model were combined with the DCM, MM-NO, MM-NP, PM-NO, and PM-NP specifications. Finally, MIRT models for θ were specified by assuming a bivariate normal distribution on ℝ^2. The mean vector μ was fixed to 0, and only correlations were estimated in the covariance matrix Σ, while the standard deviations of the components of θ were fixed to 1. In the case of between-item dimensionality, there was only a single option for specifying the multidimensional 2PL model. In the case of within-item dimensionality, the partially compensatory (PC), the compensatory (CO), and the noncompensatory (NC) MIRT models were specified.
The different models are compared through information criteria. The information criteria rely on the deviance −2 l(γ̂) of a fitted model. The most well-known information criteria are the Akaike information criterion (AIC; ref. [106]) and the Bayesian information criterion (BIC; ref. [107]), which are defined as [108]:
AIC = −2 l(γ̂) + 2p and
BIC = −2 l(γ̂) + log(N) p,
where p and N denote the number of estimated model parameters and sample size, respectively. A smaller information criterion indicates a better-fitting model. Note that the information criteria balance model fit and model parsimony.
Notably, AIC and BIC depend on sample size. To obtain a standardized measure of model fit, the Gilula–Haberman penalty (GHP; refs. [109,110,111,112]) is defined as:
GHP = AIC / (2 N I) = −l(γ̂) / (N I) + p / (N I).
The GHP can be interpreted as a standardized measure of a bias-corrected likelihood per case and item. In this article, we use a GHP statistic that is normed for a hypothetical sample size of N = 10^4 = 10,000 and is defined as:
GHP4 = 10^4 · GHP,
where, again, smaller values of GHP or GHP4 indicate a better-fitting model.
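The fit statistics are straightforward to compute once the log-likelihood, the number of parameters, the sample size, and the number of items of a fitted model are available; the numbers in the following R sketch are purely illustrative.

loglik <- -23000; p <- 45; N <- 1208; I <- 22    # assumed illustrative values
AIC  <- -2 * loglik + 2 * p
BIC  <- -2 * loglik + log(N) * p
GHP  <- AIC / (2 * N * I)
GHP4 <- 1e4 * GHP
round(c(AIC = AIC, BIC = BIC, GHP4 = GHP4), 2)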
For the evaluation of alternative models, model differences in statistics AIC, BIC, and GHP 4 can be computed, which are denoted as Δ AIC , Δ BIC , and Δ GHP 4 . For Δ AIC , model differences regarding the AIC larger than 10 (see [108]) can be interpreted as substantial. In line with previous research, GHP 4 differences larger than 10 might be considered as a moderate deviation, while Δ GHP 4 differences between 1 and 10 as a small deviation [112,113].
All models were estimated using the sirt::xxirt() function from the R (Version 4.3.1; [114]) package sirt [115]. The models were estimated using marginal maximum likelihood estimation [116]. In the first 100 iterations, the expectation-maximization algorithm [117] was utilized and the algorithm switched afterward to the Newton-Raphson approach.
The different model specifications were compared regarding model fit using the AIC and the Δ GHP 4 statistics. We did not perform model selection based on BIC because the simulation study presented in Section 7 showed inferior performance of this criterion compared to model selection based on AIC.

6.2. Results

6.2.1. Dataset acl

Table 1 presents Δ AIC and Δ GHP 4 for the dataset acl. It turned out that the discrete mixed membership specification (i.e., MM-NP) fitted best, followed by PM-NO and the MIRT model. Notably, the DCM resulted in a worse model fit. However, there were only small model differences between MM-NP, PM-NO, PM-NP, and the MIRT model in terms of the Δ GHP 4 statistic.
Table 2 presents the estimated item parameters for four selected model specifications for the acl dataset. Overall, the absolute values of MM and PM item parameters were larger compared to the DCM. The correlations of the λ i 0 item parameters were moderate to large but far from perfect (DCM and MM: 0.94, DCM and PM: 0.83, MM and PM: 0.78). While the correlations for the λ i 1 , 1 parameter were quite large (DCM and MM: 0.92, DCM and PM: 0.98, MM and PM: 0.94), they turned out to be small to moderately valued for λ i 1 , 2 (DCM and MM: 0.14, DCM and PM: 0.42, MM and PM: 0.75).
The correlation between the two dimensions was 0.67 in the DCM (computed as a tetrachoric correlation), 0.86 in the MM-NO model, 0.98 in the PM-NO model, and 0.54 in the MIRT model. Note that the correlations in MIRT and PM were quite different, resulting in different conclusions, although the model fit was very similar.
In the DCM, the skill class proportions were 0.28 for ( 0 , 0 ) , 0.09 for ( 1 , 0 ) , 0.25 for ( 0 , 1 ) , and 0.38 for ( 1 , 1 ) , resulting in marginal skill class probabilities of 0.47 for the first skill and 0.63 for the second skill. In the MM-NP specification, the means of α d ( d = 1 , 2 ) were 0.48 and 0.57, and for PM-NP, they were 0.51 and 0.52, respectively. In addition, Table 3 displays the probabilities of α in the MM-NP and PM-NP specification. Although the general probability patterns were similar, there were nonnegligible differences between the probabilities, indicating that the different specifications resulted in slightly different quantitative interpretations regarding the two skills.

6.2.2. Dataset data.ecpe

Table 4 presents model comparisons of the specified DCMs, mixed membership DCMs and MIRT models for the dataset data.ecpe. It turned out that the three PM-NO specifications resulted in the best model fit in terms of AIC. The ADCM for PM-NO was the best-fitting model, slightly better than the GDINA model for PM-NO. In terms of Δ GHP 4 , the model differences between all PM and MIRT model specifications turned out to be small.
Table 5 reports estimated item parameters for the partially compensatory models (i.e., GDINA and PC). We reported the NO specifications for MM and PM because they resulted in a better model fit compared to the nonparametric distribution (i.e., MM-NP and PM-NP).
There were moderate to strong correlations in the item intercept parameter λ_{i0} (DCM and MM: 0.98, DCM and PM: 0.78, MM and PM: 0.75). The correlations between different model specifications for the item slope parameters λ_{i1,1} and λ_{i1,2} were slightly larger (for λ_{i1,1}: DCM and MM: 0.96, DCM and PM: 0.92, MM and PM: 0.89; for λ_{i1,2}: DCM and MM: 0.83, DCM and PM: 0.95, MM and PM: 0.86). The correlations of the item parameter λ_{i2,12} were large (DCM and MM: 0.95, DCM and PM: 0.98, MM and PM: 0.98). Similar to the acl dataset, the absolute values of estimated item parameters were larger for the MM and PM DCMs compared to the crisp membership DCM. The λ_{i2,12} parameters in the PC MIRT model were estimated to be substantially different from 0 and 1, indicating that the items function neither fully compensatorily nor fully noncompensatorily.
The estimated correlations for the GDINA specifications between the two dimensions were 0.90 in the DCM (tetrachoric correlation), 0.96 in MM-NO, and 0.99 in the PM-NO model. Interestingly, the correlation of the ADCM specification for PM-NO was slightly smaller at 0.96. The correlations for the different MIRT models were very similar (MIRT-PC: 0.82, MIRT-CO: 0.82, MIRT-NC: 0.81).

6.2.3. Dataset mcmi

Table 6 reports model comparisons for the mcmi dataset. The GDINA specification of PM-NO fitted best, followed by ADCM of PM-NO. In terms of Δ GHP 4 , the partially compensatory and compensatory MIRT models had only small differences from the PM-NO approaches. The MM-NO resulted in a worse fit, so we instead reported item parameters of the better-fitting MM-NP model. Across all models, the GDINA specifications fitted the data slightly better than the ADCM specifications.
Table 7 displays the estimated item parameters for four selected specifications for the mcmi dataset. Overall, the item parameters correlated moderately to strongly across models.
The correlations for the GDINA specification between the two dimensions were large (DCM: 0.93, MM-NO: 0.99, PM-NO: 1.00, MIRT-PC: 0.95, MIRT-CO: 0.96, MIRT-NC: 0.93).

7. Simulation Study

In this simulation study, we evaluate the accuracy of model selection based on AIC, BIC, and GHP. For simplicity, we only considered the case of between-item dimensionality.

7.1. Method

We used estimated model parameters from the analysis of the acl dataset presented in Section 6.2.1. This dataset had between-item dimensionality and consisted of 20 items. We varied the sample size N as 500, 1000, and 2000. Six data-generating models (DGM) were simulated: the DCM, MM-NO, MM-NP, PM-NO, PM-NP, and the MIRT model. The same models were specified as analysis models for all DGMs.
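As an illustration of a single replication, the following R sketch generates item responses under a between-item two-dimensional 2PL MIRT data-generating model; all parameter values here are assumed for illustration only (the study itself used the estimates obtained from the acl dataset).

library(MASS)
set.seed(123)
N <- 1000; I <- 20; D <- 2
lambda0 <- runif(I, -1, 1)                  # assumed item intercepts
lambda1 <- runif(I, 0.8, 2)                 # assumed item slopes
dim_of  <- rep(1:D, each = I / D)           # between-item loading structure
Sigma   <- matrix(c(1, 0.54, 0.54, 1), 2, 2)
theta   <- MASS::mvrnorm(N, mu = c(0, 0), Sigma = Sigma)
eta     <- sweep(sweep(theta[, dim_of], 2, lambda1, "*"), 2, lambda0, "+")
X       <- matrix(rbinom(N * I, 1, plogis(eta)), nrow = N, ncol = I)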
Model estimation was carried out in the same way as described in Section 6 (see p. 10). We assessed the percentage rate at which a particular analysis model was selected based on the smallest AIC and the smallest BIC. Moreover, we computed the average Δ GHP 4 statistic for the GHP 4 difference between the analysis model and the DGM (see Section 6.1).
In total, 2500 replications were carried out in each of the 3 (sample size) × 6 (DGMs) = 18 cells of the simulation study.

7.2. Results

Table 8 reports the percentage model selection rates and the average Δ GHP 4 statistic as a function of sample size for the six DGMs.
If the DCM was the DGM, model selection based on AIC and BIC was accurate. However, there were only small average model differences of the DCM to the MM-NP and PM-NP specifications in terms of Δ GHP 4 .
If the MM-NO was the DGM, model selection based on the BIC strongly favored the MIRT model. However, there were very small model differences between MM-NO and the other models, making the different analysis models difficult to distinguish. The true DGM MM-NO can only be correctly detected for a large sample size N = 2000 and when using the AIC.
If the data were generated by MM-NP, model selection based on AIC worked well. However, BIC also chose the MIRT model at a nonnegligible rate of 32% for the sample size N = 500 . Note that the PM-NP analysis model was relatively close to the MM-NP model in terms of Δ GHP 4 .
If the DGM was the PM-NO model, the MIRT model was generally favored when the BIC was used as the model selection criterion. The AIC only showed acceptable model selection rates for N = 2000 .
If the PM-NP was the DGM, AIC performed well for all sample sizes, whereas BIC failed for a sample size of N = 500.
Finally, if the DGM was the MIRT model, model selection based on AIC and BIC performed well. However, there were only small model differences between MIRT and the partial membership models PM-NO and PM-NP, but larger differences to the mixed membership specifications MM-NO and MM-NP.

8. Discussion

In this article, we compared the extension of DCMs to mixed and partial memberships with crisp membership DCMs and MIRT models through three empirical datasets and a simulation study. We clarified the different nature of the two types of memberships (i.e., mixed and partial membership). In particular, we compared the partial membership DCM to the MIRT model. Essentially, these two specifications can be interpreted as structurally very similar. Hence, it is up to the researcher to interpret dimensions as partial memberships of classes or as measures of a quantitative continuous latent variable.
In the empirical examples, the DCM extensions fitted the data substantially better than the crisp membership DCMs. Moreover, partial membership DCMs outperformed mixed membership DCMs because they offer more flexibility in modeling IRFs. An anonymous reviewer wondered why an empirical study was needed to demonstrate this finding. However, the finding has previously been pointed out in the mixed membership literature of exploratory mixture and latent class models [68]. Nevertheless, recent generalizations of DCMs only considered mixed and not partial membership [12,13]. Therefore, our simulation study and empirical analysis could give rise to more DCM research devoted to partial memberships. Furthermore, MIRT models frequently showed a similar fit to the partial membership DCMs.
The simulation study showed that model selection based on AIC turned out to be more reliable than that based on BIC. This finding holds for small (i.e., N = 500) and large (i.e., N = 2000) sample sizes. For a fixed number of items, model differences in terms of the GHP penalty are almost independent of the sample size. However, a dependence of the GHP penalty on the number of items cannot be ruled out.
In this article, we confined ourselves to low-dimensional models (i.e., models with only two dimensions). In models with more than two dimensions, marginal maximum likelihood might be computationally demanding, and pairwise likelihood [118,119,120], variational approximation [121,122,123], or Markov chain Monte Carlo estimation [124,125,126] might be considered alternatively.
Given the main findings of our paper, it might be questioned whether DCMs with binary skills should be used at all compared to partial membership DCMs or MIRT models. The model-based classification of persons into masters and nonmasters might be attractive to practitioners in terms of model interpretation and using individual classifications as the outcome of the DCM. Interpreted in this way, DCMs with crisp membership are model-based confirmatory cluster analyses that automatically provide classifications into masters and nonmasters of skills. However, the classification could alternatively be determined by content experts, such as in a standard-setting procedure [127,128,129]. The classifications obtained by DCMs will likely be strongly dependent on the chosen persons and items [130], particularly if DCMs with a crisp membership are not the data-generating model (i.e., in the presence of skill continuity). Importantly, the property of the absolute invariance of item parameters in DCMs only holds for correctly specified models [53,90,131,132]. In MIRT models, there is always scale indeterminacy because the means and the standard deviations of the dimensions of θ cannot be disentangled from item intercepts and item discriminations if they were freely estimated. The absence of such an indeterminacy in DCMs comes at the price of assuming that the DCM holds exactly. Hence, model-based comparisons of groups in cross-sectional settings and of change in longitudinal designs based on DCMs should only be carried out with caution. If researchers still want to apply DCMs for such comparisons, we think that it is advisable to impose invariant item parameters across groups or time points to enable well-defined and interpretable comparisons, even if measurement invariance is certainly violated.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The empirical datasets used in Section 6 can be extracted from the R packages mokken (https://cran.r-project.org/web/packages/mokken; accessed on 9 April 2024) and CDM (https://cran.r-project.org/web/packages/CDM; accessed on 9 April 2024).

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
2PL	two-parameter logistic
4PL	four-parameter logistic
AIC	Akaike information criterion
ADCM	additive diagnostic classification model
BIC	Bayesian information criterion
CO	compensatory
DCM	diagnostic classification model
DGM	data-generating model
DINA	deterministic inputs, noisy “and” gate
GDINA	generalized deterministic inputs, noisy “and” gate
GHP	Gilula–Haberman penalty
IRF	item response function
IRT	item response theory
MIRT	multidimensional item response theory
MM	mixed membership
NC	noncompensatory
PC	partially compensatory
PM	partial membership

Appendix A. Derivation of Equation (33)

The IRFs are defined in Equations (30)–(32). The numerator in (32) can be simplified to:
P_{i,1}^{α_{d1[i]}} P_{i,0}^{1 − α_{d1[i]}} = exp( α_{d1[i]} log(P_{i,1}) + (1 − α_{d1[i]}) log(P_{i,0}) ) = exp( α_{d1[i]} η_{i,1} + (1 − α_{d1[i]}) η_{i,0} − α_{d1[i]} h(η_{i,1}) − (1 − α_{d1[i]}) h(η_{i,0}) ) = exp( α_{d1[i]} η_{i,1} + (1 − α_{d1[i]}) η_{i,0} ) C̃_i(α_{d1[i]}),
where h ( x ) = log ( 1 + exp ( x ) ) is used as an abbreviation and we define:
C̃_i(α_{d1[i]}) = exp( −α_{d1[i]} h(η_{i,1}) − (1 − α_{d1[i]}) h(η_{i,0}) ).
In the same way, we get:
Q_{i,1}^{α_{d1[i]}} Q_{i,0}^{1 − α_{d1[i]}} = exp( α_{d1[i]} log(Q_{i,1}) + (1 − α_{d1[i]}) log(Q_{i,0}) ) = exp( −α_{d1[i]} h(η_{i,1}) − (1 − α_{d1[i]}) h(η_{i,0}) ) = C̃_i(α_{d1[i]}).
Hence, we obtain the simplified IRF for the partial membership case as:
P(X_i = 1 | α_{d1[i]}; γ_i) = Ψ( α_{d1[i]} η_{i,1} + (1 − α_{d1[i]}) η_{i,0} ).
Now, observe that η i , 0 = λ i 0 and η i , 1 = λ i 0 + λ i 1 , d 1 [ i ] . Then, we obtain from (A4):
P ( X i = 1 | α d 1 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] α d 1 [ i ] .

Appendix B. Derivation of Equation (38)

Using the same derivation as in Section 4.2.1, we obtain:
P_{i,11}^{α_{d1[i]} α_{d2[i]}} P_{i,10}^{α_{d1[i]} (1 − α_{d2[i]})} P_{i,01}^{(1 − α_{d1[i]}) α_{d2[i]}} P_{i,00}^{(1 − α_{d1[i]}) (1 − α_{d2[i]})} = exp( α_{d1[i]} α_{d2[i]} η_{i,11} + α_{d1[i]} (1 − α_{d2[i]}) η_{i,10} + (1 − α_{d1[i]}) α_{d2[i]} η_{i,01} + (1 − α_{d1[i]}) (1 − α_{d2[i]}) η_{i,00} ) × exp( −α_{d1[i]} α_{d2[i]} h(η_{i,11}) − α_{d1[i]} (1 − α_{d2[i]}) h(η_{i,10}) − (1 − α_{d1[i]}) α_{d2[i]} h(η_{i,01}) − (1 − α_{d1[i]}) (1 − α_{d2[i]}) h(η_{i,00}) ) = exp( α_{d1[i]} α_{d2[i]} η_{i,11} + α_{d1[i]} (1 − α_{d2[i]}) η_{i,10} + (1 − α_{d1[i]}) α_{d2[i]} η_{i,01} + (1 − α_{d1[i]}) (1 − α_{d2[i]}) η_{i,00} ) C̃_i(α_{d1[i]}, α_{d2[i]}),
where the factor C ˜ i ( α d 1 [ i ] , α d 2 [ i ] ) is properly defined. In addition, we get:
Q_{i,11}^{α_{d1[i]} α_{d2[i]}} Q_{i,10}^{α_{d1[i]} (1 − α_{d2[i]})} Q_{i,01}^{(1 − α_{d1[i]}) α_{d2[i]}} Q_{i,00}^{(1 − α_{d1[i]}) (1 − α_{d2[i]})} = C̃_i(α_{d1[i]}, α_{d2[i]}).
Hence, we get the following IRF:
P(X_i = 1 | α_{d1[i]}, α_{d2[i]}; γ_i) = Ψ( α_{d1[i]} α_{d2[i]} η_{i,11} + α_{d1[i]} (1 − α_{d2[i]}) η_{i,10} + (1 − α_{d1[i]}) α_{d2[i]} η_{i,01} + (1 − α_{d1[i]}) (1 − α_{d2[i]}) η_{i,00} ),
which can be further simplified to:
P ( X i = 1 | α d 1 [ i ] , α d 2 [ i ] ; γ i ) = Ψ λ i 0 + λ i 1 , d 1 [ i ] α d 1 [ i ] + λ i 1 , d 2 [ i ] α d 2 [ i ] + λ i 2 , d 1 [ i ] d 2 [ i ] α d 1 [ i ] α d 2 [ i ] .

References

  1. Lazarsfeld, P.F.; Henry, N.W. Latent Structure Analysis; Houghton Mifflin: Boston, MA, USA, 1968. [Google Scholar]
  2. Andersen, E.B. Latent structure analysis: A survey. Scand. J. Stat. 1982, 9, 1–12. [Google Scholar]
  3. Andersen, E.B. The Statistical Analysis of Categorical Data; Springer: Berlin, Germany, 1994. [Google Scholar] [CrossRef]
  4. Clogg, C.C.; Goodman, L.A. Latent structure analysis of a set of multidimensional contingency tables. J. Am. Stat. Assoc. 1984, 79, 762–771. [Google Scholar] [CrossRef]
  5. Goodman, L.A. On the estimation of parameters in latent structure analysis. Psychometrika 1979, 44, 123–128. [Google Scholar] [CrossRef]
  6. Yen, W.M.; Fitzpatrick, A.R. Item response theory. In Educational Measurement; Brennan, R.L., Ed.; Praeger Publishers: Westport, WT, USA, 2006; pp. 111–154. [Google Scholar]
  7. Cai, L.; Choi, K.; Hansen, M.; Harrell, L. Item response theory. Annu. Rev. Stat. Appl. 2016, 3, 297–321. [Google Scholar] [CrossRef]
  8. Chen, Y.; Li, X.; Liu, J.; Ying, Z. Item response theory—A statistical framework for educational and psychological measurement. arXiv 2021, arXiv:2108.08604. [Google Scholar] [CrossRef]
  9. Rupp, A.A.; Templin, J.L. Unique characteristics of diagnostic classification models: A comprehensive review of the current state-of-the-art. Meas. Interdiscip. Res. Persp. 2008, 6, 219–262. [Google Scholar] [CrossRef]
  10. Chang, H.H.; Wang, C.; Zhang, S. Statistical applications in educational measurement. Annu. Rev. Stat. Appl. 2021, 8, 439–461. [Google Scholar] [CrossRef]
  11. Zhang, S.; Liu, J.; Ying, Z. Statistical applications to cognitive diagnostic testing. Annu. Rev. Stat. Appl. 2023, 10, 651–675. [Google Scholar] [CrossRef]
  12. Shang, Z.; Erosheva, E.A.; Xu, G. Partial-mastery cognitive diagnosis models. Ann. Appl. Stat. 2021, 15, 1529–1555. [Google Scholar] [CrossRef]
  13. Shu, T.; Luo, G.; Luo, Z.; Yu, X.; Guo, X.; Li, Y. An explicit form with continuous attribute profile of the partial mastery DINA model. J. Educ. Behav. Stat. 2023, 48, 573–602. [Google Scholar] [CrossRef]
  14. de la Torre, J.; Santos, K.C. On the relationship between unidimensional item response theory and higher-order cognitive diagnosis models. In Essays on Contemporary Psychometrics; van der Ark, L.A., Emons, W.H.M., Meijer, R.R., Eds.; Springer: New York, NY, USA, 2022; pp. 389–412. [Google Scholar] [CrossRef]
  15. Lee, Y.S.; de la Torre, J.; Park, Y.S. Relationships between cognitive diagnosis, CTT, and IRT indices: An empirical investigation. Asia Pac. Educ. Rev. 2012, 13, 333–345. [Google Scholar] [CrossRef]
  16. Ma, W.; Minchen, N.; de la Torre, J. Choosing between CDM and unidimensional IRT: The proportional reasoning test case. Meas. Interdiscip. Res. Persp. 2020, 18, 87–96. [Google Scholar] [CrossRef]
  17. Maas, L.; Madison, M.J.; Brinkhuis, M.J.S. Properties and performance of the one-parameter log-linear cognitive diagnosis model. Front. Educ. 2024, 9, 1287279. [Google Scholar] [CrossRef]
  18. Madison, M.J.; Wind, S.A.; Maas, L.; Kazuhiro Yamaguchi, S.H. A one-parameter diagnostic classification model with familiar measurement properties. J. Educ. Meas. 2024; epub ahead of print. [Google Scholar] [CrossRef]
  19. Liu, R. Using diagnostic classification models to obtain subskill information and explore its relationship with total scores: The case of the Michigan english test. Psychol. Test Assess. Model. 2020, 62, 487–516. [Google Scholar]
  20. Robitzsch, A. Relating the one-parameter logistic diagnostic classification model to the Rasch model and one-parameter logistic mixed, partial, and probabilistic membership diagnostic classification models. Foundations 2023, 3, 621–633. [Google Scholar] [CrossRef]
  21. von Davier, M.; DiBello, L.; Yamamoto, K.Y. Reporting Test Outcomes with Models for Cognitive Diagnosis; (Research Report No. RR-06-28); Educational Testing Service: Princeton, NJ, USA, 2006. [Google Scholar] [CrossRef]
  22. Bonifay, W. Multidimensional Item Response Theory; Sage: Thousand Oaks, CA, USA, 2020. [Google Scholar] [CrossRef]
  23. Reckase, M.D. Multidimensional Item Response Theory Models; Springer: New York, NY, USA, 2009. [Google Scholar] [CrossRef]
  24. Tatsuoka, K.K. Rule space: An approach for dealing with misconceptions based on item response theory. J. Educ. Meas. 1983, 20, 345–354. [Google Scholar] [CrossRef]
  25. da Silva, M.A.; Liu, R.; Huggins-Manley, A.C.; Bazán, J.L. Incorporating the q-matrix into multidimensional item response theory models. Educ. Psychol. Meas. 2019, 79, 665–687. [Google Scholar] [CrossRef]
  26. Adams, R.J.; Wilson, M.; Wang, W.c. The multidimensional random coefficients multinomial logit model. Appl. Psychol. Meas. 1997, 21, 1–23. [Google Scholar] [CrossRef]
  27. Hartig, J.; Höhler, J. Representation of competencies in multidimensional IRT models with within-item and between-item multidimensionality. Z. Psychol. 2008, 216, 89–101. [Google Scholar] [CrossRef]
  28. Feuerstahler, L.; Wilson, M. Scale alignment in between-item multidimensional Rasch models. J. Educ. Meas. 2019, 56, 280–301. [Google Scholar] [CrossRef]
  29. Birnbaum, A. Some latent trait models and their use in inferring an examinee’s ability. In Statistical Theories of Mental Test Scores; Lord, F.M., Novick, M.R., Eds.; MIT Press: Reading, MA, USA, 1968; pp. 397–479. [Google Scholar]
  30. Reckase, M.D. Logistic multidimensional models. In Handbook of Item Response Theory, Volume 1: Models; van der Linden, W.J., Ed.; CRC Press: Boca Raton, FL, USA, 2016; pp. 189–210. [Google Scholar] [CrossRef]
  31. Bolt, D.M.; Lall, V.F. Estimation of compensatory and noncompensatory multidimensional item response models using Markov chain Monte Carlo. Appl. Psychol. Meas. 2003, 27, 395–414. [Google Scholar] [CrossRef]
  32. Babcock, B. Estimating a noncompensatory IRT model using Metropolis within Gibbs sampling. Appl. Psychol. Meas. 2011, 35, 317–329. [Google Scholar] [CrossRef]
  33. Buchholz, J.; Hartig, J. The impact of ignoring the partially compensatory relation between ability dimensions on norm-referenced test scores. Psychol. Test Assess. Model. 2018, 60, 369–385. [Google Scholar]
  34. Wang, C.; Nydick, S.W. Comparing two algorithms for calibrating the restricted non-compensatory multidimensional IRT model. Appl. Psychol. Meas. 2015, 39, 119–134. [Google Scholar] [CrossRef]
  35. Chalmers, R.P. Partially and fully noncompensatory response models for dichotomous and polytomous items. Appl. Psychol. Meas. 2020, 44, 415–430. [Google Scholar] [CrossRef]
  36. Spray, J.A.; Davey, T.C.; Reckase, M.D.; Ackerman, T.A.; Carlson, J.E. Comparison of Two Logistic Multidimensional Item Response Theory Models; Research Report No. ONR90-8; ACT: Iowa City, IA, USA, 1990. Available online: https://apps.dtic.mil/sti/citations/tr/ADA231363 (accessed on 3 June 2024).
  37. DeMars, C.E. Partially compensatory multidimensional item response theory models: Two alternate model forms. Educ. Psychol. Meas. 2016, 76, 231–257. [Google Scholar] [CrossRef]
  38. DiBello, L.V.; Roussos, L.A.; Stout, W. A review of cognitively diagnostic assessment and a summary of psychometric models. In Handbook of Statistics, Volume 26: Psychometrics; Rao, C.R., Sinharay, S., Eds.; Elsevier: Amsterdam, The Netherlands, 2006; pp. 979–1030. [Google Scholar] [CrossRef]
  39. George, A.C.; Robitzsch, A. Cognitive diagnosis models in R: A didactic. Quant. Meth. Psych. 2015, 11, 189–205. [Google Scholar] [CrossRef]
  40. Chen, Y.; Liang, S. BNMI-DINA: A Bayesian cognitive diagnosis model for enhanced personalized learning. Big Data Cogn. Comput. 2023, 8, 4. [Google Scholar] [CrossRef]
  41. Rupp, A.A.; Templin, J.; Henson, R.A. Diagnostic Measurement: Theory, Methods, and Applications; Guilford Press: New York, NY, USA, 2010; Available online: https://rb.gy/9ix252 (accessed on 3 June 2024).
  42. Ma, C.; Ouyang, J.; Xu, G. Learning latent and hierarchical structures in cognitive diagnosis models. Psychometrika 2023, 88, 175–207. [Google Scholar] [CrossRef]
  43. Martinez, A.J.; Templin, J. Approximate invariance testing in diagnostic classification models in the presence of attribute hierarchies: A Bayesian network approach. Psych 2023, 5, 688–714. [Google Scholar] [CrossRef]
  44. Wang, C.; Lu, J. Learning attribute hierarchies from data: Two exploratory approaches. J. Educ. Behav. Stat. 2021, 46, 58–84. [Google Scholar] [CrossRef]
  45. Junker, B.W.; Sijtsma, K. Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Appl. Psychol. Meas. 2001, 25, 258–272. [Google Scholar] [CrossRef]
  46. Dayton, C.M.; Macready, G.B. A probabilistic model for validation of behavioral hierarchies. Psychometrika 1976, 41, 189–204. [Google Scholar] [CrossRef]
  47. Haertel, E.H. Using restricted latent class models to map the skill structure of achievement items. J. Educ. Meas. 1989, 26, 301–321. [Google Scholar] [CrossRef]
  48. de la Torre, J. The generalized DINA model framework. Psychometrika 2011, 76, 179–199. [Google Scholar] [CrossRef]
  49. Henson, R.A.; Templin, J.L.; Willse, J.T. Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika 2009, 74, 191–210. [Google Scholar] [CrossRef]
  50. de la Torre, J. DINA model and parameter estimation: A didactic. J. Educ. Behav. Stat. 2009, 34, 115–130. [Google Scholar] [CrossRef]
  51. Ma, W.; de la Torre, J. GDINA: An R package for cognitive diagnosis modeling. J. Stat. Softw. 2020, 93, 1–26. [Google Scholar] [CrossRef]
  52. Shi, Q.; Ma, W.; Robitzsch, A.; Sorrel, M.A.; Man, K. Cognitively diagnostic analysis using the G-DINA model in R. Psych 2021, 3, 812–835. [Google Scholar] [CrossRef]
  53. de la Torre, J.; Lee, Y.S. A note on the invariance of the DINA model parameters. J. Educ. Meas. 2010, 47, 115–127. [Google Scholar] [CrossRef]
  54. Huang, Q.; Bolt, D.M. Relative robustness of CDMs and (M)IRT in measuring growth in latent skills. Educ. Psychol. Meas. 2023, 83, 808–830. [Google Scholar] [CrossRef]
  55. Ma, W.; Chen, J.; Jiang, Z. Attribute continuity in cognitive diagnosis models: Impact on parameter estimation and its detection. Behaviormetrika 2023, 50, 217–240. [Google Scholar] [CrossRef]
  56. Chen, L.; Gu, Y. A spectral method for identifiable grade of membership analysis with binary responses. Psychometrika 2024, 89, 626–657. [Google Scholar] [CrossRef]
  57. Erosheva, E.A. Comparing latent structures of the grade of membership, Rasch, and latent class models. Psychometrika 2005, 70, 619–628. [Google Scholar] [CrossRef]
  58. Gu, Y.; Erosheva, E.A.; Xu, G.; Dunson, D.B. Dimension-grouped mixed membership models for multivariate categorical data. J. Mach. Learn. Res. 2023, 24, 1–49. [Google Scholar]
  59. Manton, K.G.; Woodbury, M.A.; Stallard, E.; Corder, L.S. The use of grade-of-membership techniques to estimate regression relationships. Sociol. Methodol. 1992, 22, 321–381. [Google Scholar] [CrossRef]
  60. Qing, H. Estimating mixed memberships in directed networks by spectral clustering. Entropy 2023, 25, 345. [Google Scholar] [CrossRef]
  61. Wang, Y.S.; Erosheva, E.A. Fitting Mixed Membership Models Using Mixedmem; Technical Report; 2020. Available online: https://tinyurl.com/9fxt54v6 (accessed on 3 June 2024).
  62. Woodbury, M.A.; Clive, J.; Garson, A., Jr. Mathematical typology: A grade of membership technique for obtaining disease definition. Comput. Biomed. Res. 1978, 11, 277–298. [Google Scholar] [CrossRef]
  63. Erosheva, E.A.; Fienberg, S.E.; Junker, B.W. Alternative statistical models and representations for large sparse multi-dimensional contingency tables. Ann. Faculté Sci. Toulouse Math. 2002, 11, 485–505. [Google Scholar] [CrossRef]
  64. Erosheva, E.A.; Fienberg, S.E.; Joutard, C. Describing disability through individual-level mixture models for multivariate binary data. Ann. Appl. Stat. 2007, 1, 346–384. [Google Scholar] [CrossRef]
  65. Finch, H.W. Performance of the grade of membership model under a variety of sample sizes, group size ratios, and differential group response probabilities for dichotomous indicators. Educ. Psychol. Meas. 2021, 81, 523–548. [Google Scholar] [CrossRef]
  66. Paisley, J.; Wang, C.; Blei, D.M. The discrete infinite logistic normal distribution. Bayesian Anal. 2012, 7, 997–1034. [Google Scholar] [CrossRef]
  67. Held, L.; Sabanés Bové, D. Applied Statistical Inference; Springer: Berlin, Germany, 2014. [Google Scholar] [CrossRef]
  68. Gruhl, J.; Erosheva, E.A.; Ghahramani, Z.; Mohamed, S.; Heller, K. A tale of two (types of) memberships: Comparing mixed and partial membership with a continuous data example. In Handbook of Mixed Membership Models and Their Applications; Airoldi, E.M., Blei, D., Erosheva, E.A., Fienberg, S.E., Eds.; Chapman & Hall: Boca Raton, FL, USA, 2014; pp. 15–38. [Google Scholar] [CrossRef]
  69. Heller, K.A.; Williamson, S.; Ghahramani, Z. Statistical models for partial membership. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; pp. 392–399. [Google Scholar] [CrossRef]
  70. Ghahramani, Z.; Mohamed, S.; Heller, K. A simple and general exponential family framework for partial membership and factor analysis. In Handbook of Mixed Membership Models and Their Applications; Airoldi, E.M., Blei, D., Erosheva, E.A., Fienberg, S.E., Eds.; Chapman & Hall: Boca Raton, FL, USA, 2014; pp. 101–122. [Google Scholar] [CrossRef]
  71. Liu, Q.; Wu, R.; Chen, E.; Xu, G.; Su, Y.; Chen, Z.; Hu, G. Fuzzy cognitive diagnosis for modelling examinee performance. ACM Trans. Intell. Syst. Technol. 2018, 9, 1–26. [Google Scholar] [CrossRef]
  72. Zhan, P.; Wang, W.C.; Jiao, H.; Bian, Y. Probabilistic-input, noisy conjunctive models for cognitive diagnosis. Front. Psychol. 2018, 9, 997. [Google Scholar] [CrossRef]
  73. Zhan, P.; Tian, Y.; Yu, Z.; Li, F.; Wang, L. A comparative study of probabilistic logic and fuzzy logic in refined learning diagnosis. J. Psychol. Sci. 2020, 43, 1258–1266. [Google Scholar]
  74. Zhan, P. Refined learning tracking with a longitudinal probabilistic diagnostic model. Educ. Meas. 2021, 40, 44–58. [Google Scholar] [CrossRef]
  75. Tian, Y.; Zhan, P.; Wang, L. Joint cognitive diagnostic modeling for probabilistic attributes incorporating item responses and response times. Acta Psychol. Sin. 2023, 55, 1573–1586. [Google Scholar] [CrossRef]
  76. Marini, M.M.; Li, X.; Fan, P.L. Characterizing latent structure: Factor analytic and grade of membership models. Sociol. Methodol. 1996, 26, 133–164. [Google Scholar] [CrossRef]
  77. Van der Linden, W.J. Unidimensional logistic response models. In Handbook of Item Response Theory, Volume 1: Models; van der Linden, W.J., Ed.; CRC Press: Boca Raton, FL, USA, 2016; pp. 11–30. [Google Scholar] [CrossRef]
  78. Templin, J.L.; Henson, R.A. Measurement of psychological disorders using cognitive diagnosis models. Psychol. Methods 2006, 11, 287–305. [Google Scholar] [CrossRef]
  79. Culpepper, S.A. The prevalence and implications of slipping on low-stakes, large-scale assessments. J. Educ. Behav. Stat. 2017, 42, 706–725. [Google Scholar] [CrossRef]
  80. Loken, E.; Rulison, K.L. Estimation of a four-parameter item response theory model. Brit. J. Math. Stat. Psychol. 2010, 63, 509–525. [Google Scholar] [CrossRef]
  81. Reise, S.P.; Waller, N.G. How many IRT parameters does it take to model psychopathology items? Psychol. Methods 2003, 8, 164–184. [Google Scholar] [CrossRef]
  82. Robitzsch, A. Four-parameter guessing model and related item response models. Math. Comput. Appl. 2022, 27, 95. [Google Scholar] [CrossRef]
  83. Fu, Z.; Zhang, S.; Su, Y.H.; Shi, N.; Tao, J. A Gibbs sampler for the multidimensional four-parameter logistic item response model via a data augmentation scheme. Brit. J. Math. Stat. Psychol. 2021, 74, 427–464. [Google Scholar] [CrossRef]
  84. Guo, S.; Chen, Y.; Zheng, C.; Li, G. Mixture-modelling-based Bayesian MH-RM algorithm for the multidimensional 4PLM. Brit. J. Math. Stat. Psychol. 2023, 76, 585–604. [Google Scholar] [CrossRef]
  85. Kalkan, Ö.K.; Çuhadar, I. An evaluation of 4PL IRT and DINA models for estimating pseudo-guessing and slipping parameters. J. Meas. Eval. Educ. Psychol. 2020, 11, 131–146. [Google Scholar] [CrossRef]
  86. Liu, J.; Meng, X.; Xu, G.; Gao, W.; Shi, N. MSAEM estimation for confirmatory multidimensional four-parameter normal ogive models. J. Educ. Meas. 2024, 61, 99–124. [Google Scholar] [CrossRef]
  87. Liu, T.; Wang, C.; Xu, G. Estimating three-and four-parameter MIRT models with importance-weighted sampling enhanced variational auto-encoder. Front. Psychol. 2022, 13, 935419. [Google Scholar] [CrossRef]
  88. Bolt, D.M.; Kim, J.S. Parameter invariance and skill attribute continuity in the DINA model. J. Educ. Meas. 2018, 55, 264–280. [Google Scholar] [CrossRef]
  89. Bolt, D.M. Bifactor MIRT as an appealing and related alternative to CDMs in the presence of skill attribute continuity. In Handbook of Diagnostic Classification Models; von Davier, M., Lee, Y.S., Eds.; Springer: Cham, Switzerland, 2019; pp. 395–417. [Google Scholar] [CrossRef]
  90. Huang, Q.; Bolt, D.M. The potential for interpretational confounding in cognitive diagnosis models. Appl. Psychol. Meas. 2022, 46, 303–320. [Google Scholar] [CrossRef]
  91. Hong, H.; Wang, C.; Lim, Y.S.; Douglas, J. Efficient models for cognitive diagnosis with continuous and mixed-type latent variables. Appl. Psychol. Meas. 2015, 39, 31–43. [Google Scholar] [CrossRef]
  92. Haberman, S.J.; von Davier, M.; Lee, Y.H. Comparison of Multidimensional Item Response Models: Multivariate Normal Ability Distributions versus Multivariate Polytomous Distributions; Research Report No. RR-08-45; Educational Testing Service: Princeton, NJ, USA, 2008. [Google Scholar] [CrossRef]
  93. Von Davier, M. A general diagnostic model applied to language testing data. Brit. J. Math. Stat. Psychol. 2008, 61, 287–307. [Google Scholar] [CrossRef]
  94. Von Davier, M. Mixture Distribution Diagnostic Models; Research Report No. RR-07-32; Educational Testing Service: Princeton, NJ, USA, 2007. [Google Scholar] [CrossRef]
  95. Von Davier, M. Hierarchical mixtures of diagnostic models. Psychol. Test Assess. Model. 2010, 52, 8–28. [Google Scholar]
  96. Xu, X.; von Davier, M. Comparing Multiple-Group Multinomial Log-Linear Models for Multidimensional Skill Distributions in the General Diagnostic Model; Research Report No. RR-08-35; Educational Testing Service: Princeton, NJ, USA, 2008. [Google Scholar] [CrossRef]
  97. Van der Ark, L.A. Mokken scale analysis in R. J. Stat. Softw. 2007, 20, 1–19. [Google Scholar] [CrossRef]
  98. Van der Ark, L.A. New developments in Mokken scale analysis in R. J. Stat. Softw. 2012, 48, 1–27. [Google Scholar] [CrossRef]
  99. Templin, J.; Hoffman, L. Obtaining diagnostic classification model estimates using Mplus. Educ. Meas. 2013, 32, 37–50. [Google Scholar] [CrossRef]
  100. Templin, J.; Bradshaw, L. Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika 2014, 79, 317–339. [Google Scholar] [CrossRef]
  101. Feng, Y.; Habing, B.T.; Huebner, A. Parameter estimation of the reduced RUM using the EM algorithm. Appl. Psychol. Meas. 2014, 38, 137–150. [Google Scholar] [CrossRef]
  102. Robitzsch, A.; George, A.C. The R package CDM for diagnostic modeling. In Handbook of Diagnostic Classification Models; von Davier, M., Lee, Y.S., Eds.; Springer: Cham, Switzerland, 2019; pp. 549–572. [Google Scholar] [CrossRef]
  103. George, A.C.; Robitzsch, A.; Kiefer, T.; Groß, J.; Ünlü, A. The R package CDM for cognitive diagnosis models. J. Stat. Softw. 2016, 74, 1–24. [Google Scholar] [CrossRef]
  104. de la Torre, J.; van der Ark, L.A.; Rossi, G. Analysis of clinical data from a cognitive diagnosis modeling framework. Meas. Eval. Couns. Dev. 2018, 51, 281–296. [Google Scholar] [CrossRef]
  105. Sijtsma, K.; van der Ark, L.A. Measurement Models for Psychological Attributes; CRC Press: Boca Raton, FL, USA, 2020. [Google Scholar] [CrossRef]
  106. Cavanaugh, J.E.; Neath, A.A. The Akaike information criterion: Background, derivation, properties, application, interpretation, and refinements. WIREs Comput. Stat. 2019, 11, e1460. [Google Scholar] [CrossRef]
  107. Neath, A.A.; Cavanaugh, J.E. The Bayesian information criterion: Background, derivation, and applications. WIREs Comput. Stat. 2012, 4, 199–203. [Google Scholar] [CrossRef]
  108. Burnham, K.P.; Anderson, D.R. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach; Springer: New York, NY, USA, 2002. [Google Scholar] [CrossRef]
  109. Gilula, Z.; Haberman, S.J. Conditional log-linear models for analyzing categorical panel data. J. Am. Stat. Assoc. 1994, 89, 645–656. [Google Scholar] [CrossRef]
  110. Gilula, Z.; Haberman, S.J. Prediction functions for categorical panel data. Ann. Stat. 1995, 23, 1130–1142. [Google Scholar] [CrossRef]
  111. Haberman, S.J. The Information a Test Provides on an Ability Parameter; Research Report No. RR-07-18; Educational Testing Service: Princeton, NJ, USA, 2007. [Google Scholar] [CrossRef]
  112. van Rijn, P.W.; Sinharay, S.; Haberman, S.J.; Johnson, M.S. Assessment of fit of item response theory models used in large-scale educational survey assessments. Large-Scale Assess. Educ. 2016, 4, 10. [Google Scholar] [CrossRef]
  113. Robitzsch, A. On the choice of the item response model for scaling PISA data: Model selection based on information criteria and quantifying model uncertainty. Entropy 2022, 24, 760. [Google Scholar] [CrossRef]
  114. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2023; Available online: https://www.R-project.org/ (accessed on 15 March 2023).
  115. Robitzsch, A. sirt: Supplementary Item Response Theory Models; 2024. R Package Version 4.1-15. Available online: https://CRAN.R-project.org/package=sirt (accessed on 6 February 2024).
  116. Glas, C.A.W. Maximum-likelihood estimation. In Handbook of Item Response Theory, Volume 2: Statistical Tools; van der Linden, W.J., Ed.; CRC Press: Boca Raton, FL, USA, 2016; pp. 197–216. [Google Scholar] [CrossRef]
  117. Aitkin, M. Expectation maximization algorithm and extensions. In Handbook of Item Response Theory, Volume 2: Statistical Tools; van der Linden, W.J., Ed.; CRC Press: Boca Raton, FL, USA, 2016; pp. 217–236. [Google Scholar] [CrossRef]
  118. Katsikatsou, M.; Moustaki, I.; Yang-Wallentin, F.; Jöreskog, K.G. Pairwise likelihood estimation for factor analysis models with ordinal data. Comput. Stat. Data Anal. 2012, 56, 4243–4258. [Google Scholar] [CrossRef]
  119. Varin, C.; Reid, N.; Firth, D. An overview of composite likelihood methods. Stat. Sin. 2011, 21, 5–42. [Google Scholar]
  120. Robitzsch, A. Pairwise likelihood estimation of the 2PL model with locally dependent item responses. Appl. Sci. 2024, 14, 2652. [Google Scholar] [CrossRef]
  121. Yamaguchi, K.; Okada, K. Variational Bayes inference algorithm for the saturated diagnostic classification model. Psychometrika 2020, 85, 973–995. [Google Scholar] [CrossRef]
  122. Tamano, H.; Mochihashi, D. Dynamical non-compensatory multidimensional IRT model using variational approximation. Psychometrika 2023, 88, 487–526. [Google Scholar] [CrossRef]
  123. Ulitzsch, E.; Nestler, S. Evaluating Stan’s variational Bayes algorithm for estimating multidimensional IRT models. Psych 2022, 4, 73–88. [Google Scholar] [CrossRef]
  124. Culpepper, S.A. Bayesian estimation of the DINA model with Gibbs sampling. J. Educ. Behav. Stat. 2015, 40, 454–476. [Google Scholar] [CrossRef]
  125. Yamaguchi, K.; Templin, J. A Gibbs sampling algorithm with monotonicity constraints for diagnostic classification models. J. Classif. 2022, 39, 24–54. [Google Scholar] [CrossRef]
  126. Yamaguchi, K. Bayesian analysis methods for two-level diagnosis classification models. J. Educ. Behav. Stat. 2023, 48, 773–809. [Google Scholar] [CrossRef]
  127. Cizek, G.J.; Bunch, M.B.; Koons, H. Setting performance standards: Contemporary methods. Educ. Meas. 2004, 23, 31–50. [Google Scholar] [CrossRef]
  128. Pant, H.A.; Rupp, A.A.; Tiffin-Richards, S.P.; Köller, O. Validity issues in standard-setting studies. Stud. Educ. Eval. 2009, 35, 95–101. [Google Scholar] [CrossRef]
  129. Tiffin-Richards, S.P.; Pant, H.A.; Köller, O. Setting standards for English foreign language assessment: Methodology, validation, and a degree of arbitrariness. Educ. Meas. 2013, 32, 15–25. [Google Scholar] [CrossRef]
  130. Liao, X.; Bolt, D.M. Guesses and slips as proficiency-related phenomena and impacts on parameter invariance. Educ. Meas. 2024; epub ahead of print. [Google Scholar] [CrossRef]
  131. Bradshaw, L.P.; Madison, M.J. Invariance properties for general diagnostic classification models. Int. J. Test. 2016, 16, 99–118. [Google Scholar] [CrossRef]
  132. Ravand, H.; Baghaei, P.; Doebler, P. Examining parameter invariance in a general diagnostic classification model. Front. Psychol. 2020, 10, 2930. [Google Scholar] [CrossRef]
Table 1. Dataset acl: Model comparisons based on ΔAIC and ΔGHP4.
Model | ΔAIC | ΔGHP4
DCM | 167 | 97
MM-NO | 79 | 46
MM-NP | 0 * | 0 *
PM-NO | 5 | 3
PM-NP | 11 | 6
MIRT | 5 | 3
Note: DCM = diagnostic classification model; MM-NO = mixed membership DCM with a logistic normal distribution assumption for α; MM-NP = mixed membership DCM with the discrete distribution {0, 0.5, 1}² for α; PM-NO = partial membership DCM with a logistic normal distribution assumption for α; PM-NP = partial membership DCM with the discrete distribution {0, 0.5, 1}² for α; MIRT = multidimensional item response model; * = reference model used for computing ΔAIC and ΔGHP4. Entries with ΔAIC values smaller than 10 and ΔGHP4 values smaller than 10 are printed in bold font in the published table.
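To make the comparison statistics in Table 1 concrete, the short R sketch below shows how ΔAIC values and a Gilula–Haberman-type penalty difference (scaled by 10^4) can be tabulated once the competing models have been fitted. The AIC values, sample size, and item number used here are hypothetical placeholders, and the exact GHP definition employed in this article may differ; the sketch only illustrates the bookkeeping, not the reported results.

```r
# Hypothetical AIC values for six fitted models (placeholders, not the estimates reported above)
aic <- c(DCM = 10950, `MM-NO` = 10862, `MM-NP` = 10783,
         `PM-NO` = 10788, `PM-NP` = 10794, MIRT = 10788)
N <- 433   # assumed number of persons
I <- 20    # number of items (A1-A10, D1-D10)

delta_aic  <- aic - min(aic)            # Delta AIC relative to the best-fitting model
ghp        <- aic / (2 * N * I)         # a Gilula-Haberman-type penalty per item response (assumed definition)
delta_ghp4 <- 10^4 * (ghp - min(ghp))   # difference scaled by 10^4, as in the tables

round(cbind(delta_aic, delta_ghp4), 1)
```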
Table 2. Dataset acl: Estimated item parameters for selected models.
Item | DCM λ_i0 | DCM λ_i1 | MM-NP λ_i0 | MM-NP λ_i1 | PM-NO λ_i0 | PM-NO λ_i1 | MIRT λ_i0 | MIRT λ_i1
A1 | −1.27 | 2.67 | −3.21 | 6.86 | −4.11 | 7.55 | −0.29 | 2.23
A2 | −0.84 | 1.67 | −2.41 | 5.13 | −2.68 | 4.88 | −0.17 | 1.13
A3 | −0.81 | 1.84 | −2.31 | 5.66 | −2.88 | 5.41 | −0.06 | 1.11
A4 | −1.22 | 1.70 | −3.07 | 4.50 | −2.79 | 4.52 | −0.48 | 1.01
A5 | −1.45 | 2.95 | −3.47 | 7.22 | −4.51 | 8.22 | −0.37 | 2.53
A6 | −0.26 | 1.87 | −0.98 | 5.22 | −2.28 | 5.16 | 0.52 | 1.10
A7 | −0.10 | 1.54 | −0.67 | 4.24 | −1.73 | 4.22 | 0.59 | 0.95
A8 | −0.88 | 2.27 | −2.18 | 6.23 | −3.29 | 6.34 | 0.00 | 1.42
A9 | −1.75 | 1.86 | −3.55 | 4.11 | −3.47 | 5.09 | −0.93 | 1.04
A10 | −1.59 | 2.53 | −4.00 | 6.58 | −4.53 | 7.84 | −0.61 | 1.61
D1 | 0.87 | 2.96 | 0.85 | 2.16 | −0.54 | 5.11 | 1.80 | 0.84
D2 | −0.66 | 4.16 | −1.20 | 5.41 | −3.66 | 7.27 | 0.85 | 1.70
D3 | −1.89 | 1.42 | −3.81 | 4.07 | −3.81 | 3.09 | −1.12 | 0.97
D4 | −1.67 | 3.07 | −2.69 | 2.56 | −2.79 | 1.41 | −1.15 | 0.61
D5 | −1.40 | 2.77 | −2.74 | 6.32 | −4.45 | 7.02 | 0.14 | 1.80
D6 | −1.45 | 2.48 | −3.13 | 5.82 | −4.15 | 6.09 | −0.10 | 1.53
D7 | −0.95 | 3.45 | −3.19 | 4.62 | −2.77 | 3.93 | −0.10 | 1.01
D8 | −1.09 | 1.29 | −2.98 | 4.05 | −2.63 | 3.16 | −0.31 | 0.88
D9 | −1.52 | 2.85 | −2.82 | 6.24 | −4.42 | 6.83 | 0.07 | 1.73
D10 | −1.98 | 2.88 | −3.93 | 5.79 | −4.76 | 6.45 | −0.36 | 1.69
Note: DCM = diagnostic classification model; MM-NP = mixed membership DCM with the discrete distribution {0, 0.5, 1}² for α; PM-NO = partial membership DCM with a logistic normal distribution assumption for α; MIRT = multidimensional item response model. Each item measures a single attribute; λ_i1 denotes the main effect of the measured attribute (λ_i1,1 for items A1–A10 and λ_i1,2 for items D1–D10).
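As a reading aid for Table 2, the intercept and main effect in the DCM column translate into non-mastery and mastery response probabilities through the inverse logit. For item A1, the estimates λ_i0 = −1.27 and λ_i1 = 2.67 imply P(X = 1 | α1 = 0) ≈ 0.22 and P(X = 1 | α1 = 1) ≈ 0.80. The minimal R lines below reproduce this arithmetic; they are only a worked illustration of the logit parameterization, not additional results.

```r
# Item A1 (Table 2, DCM column): implied response probabilities
lambda0 <- -1.27   # intercept
lambda1 <- 2.67    # main effect of attribute 1
plogis(lambda0)             # P(X = 1 | alpha1 = 0), roughly 0.22
plogis(lambda0 + lambda1)   # P(X = 1 | alpha1 = 1), roughly 0.80
```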
Table 3. Dataset acl: Estimated class probabilities P(α1, α2).
α1 | α2 | MM-NP | PM-NP
0 | 0 | 0.14 | 0.09
0.5 | 0 | 0.06 | 0.13
1 | 0 | 0.01 | 0.00
0 | 0.5 | 0.11 | 0.13
0.5 | 0.5 | 0.25 | 0.27
1 | 0.5 | 0.08 | 0.13
0 | 1 | 0.03 | 0.00
0.5 | 1 | 0.18 | 0.14
1 | 1 | 0.14 | 0.11
Note: MM-NP = mixed membership DCM with the discrete distribution {0, 0.5, 1}² for α; PM-NP = partial membership DCM with the discrete distribution {0, 0.5, 1}² for α.
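The class probabilities in Table 3 are defined on the nine membership profiles of the discrete grid {0, 0.5, 1}². The R sketch below sets up such a grid and turns a vector of unnormalized weights into class probabilities; the weights are arbitrary illustrative numbers, not the estimates reported above.

```r
# All nine membership profiles on the grid {0, 0.5, 1}^2
grid <- expand.grid(alpha1 = c(0, 0.5, 1), alpha2 = c(0, 0.5, 1))

# Illustrative (non-estimated) log-weights converted into class probabilities via softmax
logw <- c(0, -0.5, -2.5, -0.3, 0.6, -0.6, -1.5, 0.2, 0)
grid$prob <- exp(logw) / sum(exp(logw))
round(grid, 3)   # the nine class probabilities sum to 1
```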
Table 4. Dataset data.ecpe: Model comparisons based on ΔAIC and ΔGHP4.
      | ΔAIC | | | ΔGHP4 | |
Model | GDINA | ADCM | DINA | GDINA | ADCM | DINA
DCM | 456 | 454 | 599 | 35 | 35 | 47
MM-NO | 169 | 179 | 295 | 13 | 14 | 23
MM-NP | 184 | 206 | 274 | 14 | 16 | 21
PM-NO | 10 | 0 * | 29 | 1 | 0 * | 2
PM-NP | 88 | 80 | 145 | 7 | 6 | 11
      | PC | CO | NC | PC | CO | NC
MIRT | 104 | 75 | 102 | 8 | 6 | 8
Note: DCM = diagnostic classification model; MM-NO = mixed membership DCM with a logistic normal distribution assumption for α; MM-NP = mixed membership DCM with the discrete distribution {0, 0.5, 1}² for α; PM-NO = partial membership DCM with a logistic normal distribution assumption for α; PM-NP = partial membership DCM with the discrete distribution {0, 0.5, 1}² for α; MIRT = multidimensional item response model; GDINA = generalized deterministic inputs, noisy “and” gate; ADCM = additive diagnostic classification model; DINA = deterministic inputs, noisy “and” gate; PC = partially compensatory MIRT model; CO = compensatory MIRT model; NC = noncompensatory MIRT model; * = reference model used for computing ΔAIC and ΔGHP4. The first three columns refer to ΔAIC and the last three columns to ΔGHP4; for the MIRT row, the three variants are PC, CO, and NC. Entries with ΔAIC values smaller than 10 and ΔGHP4 values smaller than 10 are printed in bold font in the published table.
Table 5. Dataset data.ecpe: Estimated item parameters for selected models.
Item | DCM:GDINA (λ_i0, λ_i1,1, λ_i1,2, λ_i2,12) | MM-NO:GDINA (λ_i0, λ_i1,1, λ_i1,2, λ_i2,12) | PM-NO:GDINA (λ_i0, λ_i1,1, λ_i1,2, λ_i2,12) | MIRT:PC (λ_i0,1, λ_i1,1, λ_i0,2, λ_i1,2, λ_i2,12)
E3 | −0.31, 0.98, 0.54, 0.31 | −1.40, 2.07, 0.28, 1.90 | −0.56, 1.81, 1.66, 1.33 | 0.49, 0.82, 1.17, 0.25, 0.51
E4 | −0.12, 1.69 | −1.42, 4.08 | −1.11, 3.81 | 1.03, 0.95
E5 | 0.98, 2.31 | 0.63, 3.93 | −0.11, 3.87 | 2.31, 1.00
E6 | 0.84, 1.77 | 0.27, 3.83 | −0.19, 3.63 | 1.94, 0.87
E7 | −0.14, 1.20, 1.13, 0.70 | −1.51, 3.08, 1.82, 2.57 | −0.78, 2.33, 2.72, 2.01 | 0.91, 1.00, 1.47, 0.72, 0.25
E9 | 0.10, 1.26 | −0.92, 3.32 | −0.81, 3.19 | 0.95, 0.74
E10 | 0.10, 2.01 | −0.84, 4.24 | 0.39, 4.16 | 0.77, 0.92
E11 | −0.08, 1.00, 1.11, 0.59 | −1.34, 2.77, 1.70, 2.46 | −0.65, 2.02, 2.54, 1.77 | 0.84, 0.84, 1.34, 0.63, 0.23
E12 | −1.72, 0.69, 1.31, 1.06 | −3.11, 2.06, 0.09, 3.79 | −2.18, 2.15, 2.94, 2.52 | 0.20, 1.27, 0.70, 1.11, 0.66
E13 | 0.68, 1.59 | −0.08, 3.70 | 1.09, 3.75 | 1.25, 0.80
E14 | 0.23, 1.30 | −0.77, 3.20 | 0.39, 3.30 | 0.68, 0.66
E15 | 0.93, 2.30 | 0.55, 3.98 | −0.24, 3.99 | 2.29, 1.05
E16 | −0.13, 1.07, 1.02, 0.44 | −1.37, 2.83, 1.50, 2.41 | −0.68, 2.07, 2.40, 1.70 | 0.87, 0.86, 1.37, 0.63, 0.28
E18 | 0.86, 1.52 | 0.24, 3.52 | −0.08, 3.38 | 1.83, 0.76
E19 | −0.23, 1.96 | −1.52, 4.40 | −1.25, 4.12 | 1.09, 1.06
E20 | −1.46, 0.89, 1.02, 0.86 | −2.82, 2.34, 0.08, 3.52 | −1.85, 2.35, 2.68, 2.41 | 0.21, 1.28, 0.93, 0.91, 0.65
E21 | 0.14, 0.82, 1.24, 0.31 | −1.16, 2.52, 2.16, 2.21 | −0.41, 1.81, 2.60, 1.52 | 1.02, 0.63, 1.34, 0.79, 0.21
E22 | −0.90, 2.33 | −2.58, 4.67 | −1.89, 4.67 | 0.72, 1.35
E25 | 0.12, 1.14 | −0.86, 2.77 | 0.23, 3.02 | 0.52, 0.57
E26 | 0.15, 1.16 | −0.57, 2.76 | −0.59, 2.77 | 0.92, 0.59
E27 | −0.87, 1.76 | −2.58, 3.74 | −0.66, 4.01 | −0.26, 0.97
E28 | 0.51, 1.90 | −0.16, 4.07 | −0.51, 3.95 | 1.74, 0.99
Note: DCM = diagnostic classification model; GDINA = generalized deterministic inputs, noisy “and” gate; MM-NO = mixed membership DCM with a logistic normal distribution assumption for α; PM-NO = partial membership DCM with a logistic normal distribution assumption for α; MIRT: PC = partially compensatory multidimensional item response model. Items measuring both attributes report all parameters in the order given in the header; items measuring a single attribute report only the intercept and the main effect (slope) of the measured attribute.
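The λ parameters in Table 5 enter item response functions with a logit link, an intercept, two attribute main effects, and an interaction term; in the partial membership variants, the attribute values may also lie between 0 and 1. The following R sketch writes out such a function in the usual LCDM/GDINA-type logit parameterization (the article's exact specification of the membership scores may differ) and evaluates it for item E3 using the PM-NO estimates from the table.

```r
# Logit-link item response function with two main effects and an interaction term
irf <- function(alpha1, alpha2, lambda0, lambda11, lambda12, lambda212) {
  plogis(lambda0 + lambda11 * alpha1 + lambda12 * alpha2 + lambda212 * alpha1 * alpha2)
}

# Item E3, PM-NO estimates from Table 5: lambda_i0, lambda_i1,1, lambda_i1,2, lambda_i2,12
irf(alpha1 = 0,   alpha2 = 0,   -0.56, 1.81, 1.66, 1.33)  # neither attribute present
irf(alpha1 = 0.5, alpha2 = 0.5, -0.56, 1.81, 1.66, 1.33)  # partial membership in both attributes
irf(alpha1 = 1,   alpha2 = 1,   -0.56, 1.81, 1.66, 1.33)  # both attributes fully present
```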
Table 6. Dataset mcmi: Model comparisons based on ΔAIC and ΔGHP4.
      | ΔAIC | | | ΔGHP4 | |
Model | GDINA | ADCM | DINA | GDINA | ADCM | DINA
DCM | 979 | 977 | 1225 | 184 | 184 | 230
MM-NO | 1125 | 1157 | 1373 | 212 | 218 | 258
MM-NP | 281 | 353 | 466 | 53 | 66 | 88
PM-NO | 0 * | 16 | 84 | 0 * | 3 | 16
PM-NP | 250 | 256 | 483 | 47 | 48 | 91
      | PC | CO | NC | PC | CO | NC
MIRT | 31 | 31 | 65 | 6 | 6 | 12
Note: DCM = diagnostic classification model; MM-NO = mixed membership DCM with a logistic normal distribution assumption for α; MM-NP = mixed membership DCM with the discrete distribution {0, 0.5, 1}² for α; PM-NO = partial membership DCM with a logistic normal distribution assumption for α; PM-NP = partial membership DCM with the discrete distribution {0, 0.5, 1}² for α; MIRT = multidimensional item response model; GDINA = generalized deterministic inputs, noisy “and” gate; ADCM = additive diagnostic classification model; DINA = deterministic inputs, noisy “and” gate; PC = partially compensatory MIRT model; CO = compensatory MIRT model; NC = noncompensatory MIRT model; * = reference model used for computing ΔAIC and ΔGHP4. The first three columns refer to ΔAIC and the last three columns to ΔGHP4; for the MIRT row, the three variants are PC, CO, and NC. Entries with ΔAIC values smaller than 10 and ΔGHP4 values smaller than 10 are printed in bold font in the published table.
Table 7. Dataset mcmi: Estimated item parameters for selected models.
Item | DCM:GDINA (λ_i0, λ_i1,1, λ_i1,2, λ_i2,12) | MM-NP:GDINA (λ_i0, λ_i1,1, λ_i1,2, λ_i2,12) | PM-NO:GDINA (λ_i0, λ_i1,1, λ_i1,2, λ_i2,12) | MIRT:PC (λ_i0,1, λ_i1,1, λ_i0,2, λ_i1,2, λ_i2,12)
1 | −2.51, 2.22, 2.27, 0.28 | −3.33, 4.05, 3.45, 2.46 | −5.40, 4.45, 2.79, 5.47 | −0.21, 2.60, 0.27, 1.80, 0.22
2 | −2.31, 2.08, 1.98, 0.26 | −3.17, 3.85, 2.66, 2.39 | −5.05, 4.32, 2.42, 5.11 | −0.41, 2.14, 0.46, 1.68, 0.24
3 | −3.02, 1.77 | −3.75, 2.19 | −3.69, 3.52 | −2.39, 1.03
5 | −1.09, 2.61 | −1.50, 3.89 | −3.20, 6.16 | 0.08, 1.94
6 | −2.85, 2.04 | −3.24, 2.63 | −3.99, 4.05 | −2.07, 1.21
8 | −1.54, 3.61 | −2.05, 5.68 | −4.54, 7.97 | 0.04, 2.90
9 | −2.35, 1.92, 1.79, 0.30 | −3.09, 3.31, 2.23, 1.85 | −4.60, 3.92, 2.18, 4.55 | −0.47, 2.01, 0.41, 1.41, 0.41
15 | −2.33, 1.87, 1.35, −0.01 | −2.98, 3.33, 1.47, 1.19 | −4.20, 3.63, 1.69, 3.69 | −0.74, 2.11, 0.73, 0.89, 0.54
16 | −1.73, 2.64 | −2.28, 3.60 | −3.73, 6.06 | −0.67, 1.79
21 | −2.33, 2.24 | −3.06, 3.30 | −3.93, 5.14 | −1.42, 1.57
22 | −2.82, 1.03, 1.40, 0.25 | −3.39, 1.99, 1.19, 1.55 | −4.35, 2.45, 1.45, 3.19 | −1.05, 1.51, −0.03, 0.85, 0.76
25 | −2.15, 1.03, 1.52, 0.15 | −2.67, 1.50, 2.58, 1.68 | −3.89, 2.41, 2.46, 3.42 | −0.13, 0.83, −0.12, 1.52, 0.48
28 | −3.17, 3.02 | −3.98, 4.09 | −4.41, 5.59 | −1.79, 1.79
29 | −2.78, 1.85, 2.13, 0.35 | −3.80, 2.99, 2.91, 2.94 | −5.19, 3.83, 2.61, 5.38 | −0.44, 1.94, −0.31, 1.92, 0.31
32 | −2.38, 3.58 | −3.27, 5.35 | −4.77, 7.85 | −0.87, 2.92
33 | −1.00, 2.54 | −1.37, 4.07 | −3.25, 6.32 | 0.22, 1.89
35 | −1.90, 1.13, 1.98, 0.14 | −2.32, 1.50, 2.95, 2.69 | −3.90, 2.53, 3.00, 4.31 | −0.34, 0.15, −0.48, 1.87, 0.00
36 | −2.22, 2.15 | −2.83, 3.01 | −3.74, 4.90 | −1.32, 1.47
37 | −3.60, 3.72 | −4.42, 4.99 | −5.41, 7.43 | −2.19, 2.96
38 | −2.43, 2.52 | −2.93, 3.40 | −4.08, 5.55 | −1.34, 1.67
39 | −1.26, 1.27 | −1.63, 1.65 | −2.21, 3.15 | −0.65, 0.81
44 | −2.63, 2.86 | −3.21, 3.93 | −4.47, 6.30 | −1.41, 1.96
Note: DCM = diagnostic classification model; GDINA = generalized deterministic inputs, noisy “and” gate; MM-NP = mixed membership DCM with the discrete distribution {0, 0.5, 1}² for α; PM-NO = partial membership DCM with a logistic normal distribution assumption for α; MIRT: PC = partially compensatory multidimensional item response model. Items measuring both attributes report all parameters in the order given in the header; items measuring a single attribute report only the intercept and the main effect (slope) of the measured attribute.
Table 8. Simulation Study: Model selection rates based on AIC and BIC and average ΔGHP4 statistic for six different data-generating models (displayed in columns).
Data-generating model (each cell: N = 500 / 1000 / 2000)
Crit | Model | DCM | MM-NO | MM-NP | PM-NO | PM-NP | MIRT
AIC | DCM | 97 / 98 / 98 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0
AIC | MM-NO | 0 / 0 / 0 | 9 / 39 / 76 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0
AIC | MM-NP | 2 / 1 / 1 | 26 / 23 / 13 | 97 / 100 / 100 | 1 / 0 / 0 | 1 / 0 / 0 | 0 / 0 / 0
AIC | PM-NO | 0 / 0 / 0 | 5 / 5 / 4 | 0 / 0 / 0 | 31 / 57 / 83 | 0 / 0 / 0 | 3 / 3 / 2
AIC | PM-NP | 1 / 1 / 1 | 6 / 4 / 1 | 2 / 0 / 0 | 2 / 1 / 0 | 98 / 100 / 100 | 2 / 1 / 0
AIC | MIRT | 0 / 0 / 0 | 53 / 27 / 6 | 0 / 0 / 0 | 66 / 42 / 17 | 1 / 0 / 0 | 95 / 97 / 98
BIC | DCM | 100 / 100 / 100 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0
BIC | MM-NO | 0 / 0 / 0 | 0 / 1 / 15 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0
BIC | MM-NP | 0 / 0 / 0 | 0 / 0 / 0 | 67 / 99 / 100 | 0 / 0 / 0 | 1 / 0 / 0 | 0 / 0 / 0
BIC | PM-NO | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 1 / 6 | 1 / 0 / 0 | 0 / 0 / 0
BIC | PM-NP | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 0 / 0 / 0 | 42 / 96 / 100 | 0 / 0 / 0
BIC | MIRT | 0 / 0 / 0 | 100 / 99 / 85 | 32 / 1 / 0 | 100 / 99 / 94 | 57 / 4 / 0 | 100 / 100 / 100
ΔGHP4 | DCM | 0* / 0* / 0* | 16 / 18 / 19 | 100 / 103 / 104 | 35 / 36 / 37 | 105 / 109 / 110 | 64 / 65 / 64
ΔGHP4 | MM-NO | 59 / 56 / 54 | 0* / 0* / 0* | 52 / 48 / 46 | 15 / 12 / 11 | 55 / 52 / 50 | 49 / 44 / 42
ΔGHP4 | MM-NP | 4 / 2 / 1 | −1 / 0 / 1 | 0* / 0* / 0* | 7 / 6 / 6 | 13 / 13 / 13 | 15 / 14 / 13
ΔGHP4 | PM-NO | 30 / 29 / 28 | 0 / 1 / 1 | 20 / 20 / 20 | 0* / 0* / 0* | 19 / 20 / 20 | 4 / 2 / 1
ΔGHP4 | PM-NP | 4 / 2 / 1 | 1 / 2 / 2 | 9 / 9 / 9 | 5 / 5 / 4 | 0* / 0* / 0* | 8 / 7 / 7
ΔGHP4 | MIRT | 48 / 50 / 50 | −2 / 1 / 2 | 18 / 20 / 20 | −1 / 0 / 1 | 14 / 16 / 17 | 0* / 0* / 0*
Note: Crit = criterion; DCM = diagnostic classification model; MM-NO = mixed membership DCM with a logistic normal distribution assumption for α; MM-NP = mixed membership DCM with the discrete distribution {0, 0.5, 1}² for α; PM-NO = partial membership DCM with a logistic normal distribution assumption for α; PM-NP = partial membership DCM with the discrete distribution {0, 0.5, 1}² for α; MIRT = multidimensional item response model; * = reference model used for computing ΔGHP4. Column groups refer to the data-generating model; within each cell, the three values refer to sample sizes N = 500, 1000, and 2000. Entries with model selection rates larger than 50% and entries with ΔGHP4 values smaller than 10 are printed in bold font in the published table. Cells in which the data-generating model coincided with the analysis model are highlighted with a yellow background in the published table.
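The selection rates in Table 8 are, per data-generating condition, the percentage of replications in which a given analysis model attains the lowest AIC (or BIC). A small R sketch of this bookkeeping, using a random placeholder matrix of AIC values instead of actual simulation output, is:

```r
set.seed(1)
models <- c("DCM", "MM-NO", "MM-NP", "PM-NO", "PM-NP", "MIRT")

# Placeholder: AIC values for 100 replications x 6 analysis models (random, for illustration only)
aic_mat <- matrix(rnorm(100 * 6, mean = 5000, sd = 20), nrow = 100,
                  dimnames = list(NULL, models))

# Analysis model with the smallest AIC in each replication
winner <- models[apply(aic_mat, 1, which.min)]

# Selection rates in percent, analogous to the entries reported in Table 8
round(100 * table(factor(winner, levels = models)) / nrow(aic_mat), 0)
```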
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
