| Type: | Package |
| Title: | Variational Bayesian Estimation for Diagnostic Classification Models |
| Version: | 2.0.1 |
| Description: | Enables computationally efficient parameter estimation by variational Bayesian methods for various diagnostic classification models (DCMs). DCMs are a class of discrete latent variable models for classifying respondents into latent classes that typically represent distinct combinations of skills they possess. Recently, to meet the growing need for large-scale diagnostic measurement in the fields of educational, psychological, and psychiatric measurement, variational Bayesian inference has been developed as a computationally efficient alternative to Markov chain Monte Carlo methods, e.g., Yamaguchi and Okada (2020a) <doi:10.1007/s11336-020-09739-w>, Yamaguchi and Okada (2020b) <doi:10.3102/1076998620911934>, Yamaguchi (2020) <doi:10.1007/s41237-020-00104-w>, Oka and Okada (2023) <doi:10.1007/s11336-022-09884-4>, and Yamaguchi and Martinez (2023) <doi:10.1111/bmsp.12308>. To facilitate their application, 'variationalDCM' provides a collection of recently proposed variational Bayesian estimation methods for various DCMs. |
| Maintainer: | Keiichiro Hijikata <k.hijikata.1120@outlook.jp> |
| Depends: | R (≥ 4.2.0) |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| Imports: | mvtnorm, stats |
| Suggests: | knitr |
| VignetteBuilder: | knitr |
| RoxygenNote: | 7.2.2 |
| URL: | https://github.com/khijikata/variationalDCM |
| BugReports: | https://github.com/khijikata/variationalDCM/issues |
| Config/testthat/edition: | 3 |
| LazyData: | true |
| Collate: | 'data.R' 'dina.R' 'dino.R' 'hm_dcm.R' 'mc_dina.R' 'satu_dcm.R' 'variationalDCM.R' 'summary.R' |
| NeedsCompilation: | no |
| Packaged: | 2024-03-25 13:54:29 UTC; khiji |
| Author: | Keiichiro Hijikata [aut, cre], Motonori Oka |
| Repository: | CRAN |
| Date/Publication: | 2024-03-25 14:10:02 UTC |
Artificial data generating function for the DINA model based on the given Q-matrix
Description
dina_data_gen() returns the artificially generated item response data for the DINA model
Usage
dina_data_gen(Q, I, attr_cor = 0.1, s = 0.2, g = 0.2, seed = 17)
Arguments
Q |
the Q-matrix |
I |
the number of assumed respondents |
attr_cor |
the true value of the correlation among attributes (default: 0.1) |
s |
the true value of the slip parameter (default: 0.2) |
g |
the true value of the guessing parameter (default: 0.2) |
seed |
the seed value used for random number generation (default: 17) |
Value
A list including:
- X
the generated artificial item response data
- att_pat
the generated true value of the attribute mastery pattern
References
Oka, M., & Okada, K. (2023). Scalable Bayesian Approach for the DINA Q-Matrix Estimation Combining Stochastic Optimization and Variational Inference. Psychometrika, 88, 302–331. doi:10.1007/s11336-022-09884-4
Examples
# load Q-matrix
Q = sim_Q_J80K5
sim_data = dina_data_gen(Q=Q,I=200)
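The following base R sketch illustrates the kind of data that a DINA generator produces. It is not the package's implementation: `simulate_dina_sketch()` and `Q_toy` are hypothetical names, and attributes are assumed to be independent with mastery probability 0.5, so the `attr_cor` argument of `dina_data_gen()` is not modelled here. A respondent answers an item correctly with probability 1 - s when all attributes required by the item are mastered, and with probability g otherwise.

```r
# Hypothetical helper (not part of variationalDCM): simulate DINA responses.
simulate_dina_sketch <- function(Q, I, s = 0.2, g = 0.2, seed = 17) {
  set.seed(seed)
  J <- nrow(Q)  # number of items
  K <- ncol(Q)  # number of attributes
  # Assumption: independent attributes, each mastered with probability 0.5.
  alpha <- matrix(rbinom(I * K, 1, 0.5), nrow = I, ncol = K)
  # Ideal response: TRUE when respondent i has mastered every attribute
  # that item j requires (mastered-and-required count equals required count).
  required <- matrix(rowSums(Q), nrow = I, ncol = J, byrow = TRUE)
  eta <- (alpha %*% t(Q)) == required
  # Probability of a correct response: 1 - s if eta is TRUE, g otherwise.
  p <- ifelse(eta, 1 - s, g)
  X <- matrix(rbinom(I * J, 1, p), nrow = I, ncol = J)
  list(X = X, att_pat = alpha)
}

# Toy Q-matrix: 3 items, 2 attributes.
Q_toy <- rbind(c(1, 0), c(0, 1), c(1, 1))
sim <- simulate_dina_sketch(Q_toy, I = 100)
dim(sim$X)
```

The returned list mirrors the `X` and `att_pat` components described above, so the sketch can be used to see how slip and guessing parameters shape the response matrix.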
Artificial data generating function for the hidden-Markov DCM based on the given Q-matrix
Description
hm_dcm_data_gen() returns the artificially generated item response data for the HM-DCM
Usage
hm_dcm_data_gen(
I = 500,
Q,
min_theta = 0.2,
max_theta = 0.8,
att_cor = 0.1,
seed = 17
)
Arguments
I |
the number of assumed respondents |
Q |
the list of Q-matrices, one per time point |
min_theta |
the minimum value of the item parameter |
max_theta |
the maximum value of the item parameter |
att_cor |
the true value of the correlation among attributes (default: 0.1) |
seed |
the seed value used for random number generation (default: 17) |
Value
A list including:
- X
the generated artificial item response data
- alpha_true
the generated true value of the attribute mastery pattern, matrix form
- alpha_patt_true
the generated true value of the attribute mastery pattern, string form
References
Yamaguchi, K., & Martinez, A. J. (2024). Variational Bayes inference for hidden Markov diagnostic classification models. British Journal of Mathematical and Statistical Psychology, 77(1), 55–79. doi:10.1111/bmsp.12308
Examples
indT = 3
Q = sim_Q_J30K3
hm_sim_Q = lapply(1:indT,function(time_point) Q)
hm_sim_data = hm_dcm_data_gen(Q=hm_sim_Q,I=200)
Artificial data generating function for the multiple-choice DINA model based on the given Q-matrix
Description
mc_dina_data_gen() returns the artificially generated item response data for the MC-DINA model
Usage
mc_dina_data_gen(I, Q, att_cor = 0.1, seed = 17)
Arguments
I |
the number of assumed respondents |
Q |
the Q-matrix |
att_cor |
the true value of the correlation among attributes (default: 0.1) |
seed |
the seed value used for random number generation (default: 17) |
Value
A list including:
- X
the generated artificial item response data
- att_pat
the generated true value of the attribute mastery pattern
References
Yamaguchi, K. (2020). Variational Bayesian inference for the multiple-choice DINA model. Behaviormetrika, 47(1), 159–187. doi:10.1007/s41237-020-00104-w
Examples
# load a simulated Q-matrix
mc_Q = mc_sim_Q
mc_sim_data = mc_dina_data_gen(Q=mc_Q,I=200)
Artificial Q-matrix for MC-DINA model
Description
Artificial Q-matrix for a 30-item test measuring 5 attributes.
Usage
mc_sim_Q
Format
A matrix with components
- column 1
Item number
- column 2
Stem
- column 3 to end
Attributes
References
Yamaguchi, K. (2020). Variational Bayesian inference for the multiple-choice DINA model. Behaviormetrika, 47(1), 159–187. doi:10.1007/s41237-020-00104-w
Artificial Q-matrix for 30 items 3 attributes
Description
Artificial Q-matrix for a 30-item test measuring 3 attributes
Usage
sim_Q_J30K3
Format
An object of class matrix (inherits from array) with 30 rows and 3 columns.
Source
artificially simulated
Artificial Q-matrix for 80 items 5 attributes
Description
Artificial Q-matrix for an 80-item test measuring 5 attributes
Usage
sim_Q_J80K5
Format
An object of class matrix (inherits from array) with 80 rows and 5 columns.
Source
artificially simulated
Variational Bayesian estimation for DCMs
Description
variationalDCM() fits DCMs by VB algorithms.
Usage
variationalDCM(X, Q, model, max_it = 500, epsilon = 1e-04, verbose = TRUE, ...)
## S3 method for class 'variationalDCM'
summary(object, ...)
Arguments
X |
the item response data |
Q |
the Q-matrix |
model |
specify one of "dina", "dino", "mc_dina", "satu_dcm", and "hm_dcm" |
max_it |
maximum number of iterations (default: 500) |
epsilon |
convergence tolerance for iterations (default: 1e-04) |
verbose |
logical, controls whether to print progress (default: TRUE) |
... |
additional arguments such as hyperparameter values |
object |
the object returned by variationalDCM() |
Value
variationalDCM() returns an object of class
variationalDCM. The summary function summarizes a fitted
result and reports the following information:
- model_params
estimates of posterior means and posterior standard deviations of model parameters
- attr_mastery_pat
MAP estimates of attribute mastery patterns
- ELBO
resulting value of evidence lower bound
- time
time spent in computation
Methods (by generic)
- summary(variationalDCM): print summary information
variationalDCM
The variationalDCM() function performs recently developed
variational Bayesian inference for various DCMs. The current version
supports the DINA, DINO, MC-DINA, saturated DCM, and HM-DCM models. We briefly
introduce the additional arguments that are specific to each model.
DINA model
The DINA model has two types of model parameters: slip
s_j and guessing g_j for j=1,\cdots,J. The
hyperparameters for the DINA model are named as follows. delta_0 is an L-dimensional
vector, which is the hyperparameter \boldsymbol{\delta}^0 of the
Dirichlet distribution for the class mixing parameter
\boldsymbol{\pi} (default: NULL). When delta_0 is specified as
NULL, we set \boldsymbol{\delta}^0=\boldsymbol{1}_L.
alpha_s, beta_s, alpha_g, and beta_g are
positive values. They are the hyperparameters \{\alpha_s, \beta_s,
\alpha_g, \beta_g\} that determine the shapes of the prior beta
distributions for the slip and guessing parameters (default: NULL). When
they are specified as NULL, they are set to 1.
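As a hedged sketch of how these hyperparameters might be passed through `...`, the call below spells out the documented defaults explicitly. It assumes that the number of latent classes is L = 2^K for K attributes and that the beta hyperparameters are accepted as scalars; neither detail is stated in this manual. The fit only runs when the package is installed.

```r
K <- 5        # number of attributes in sim_Q_J80K5
L <- 2^K      # assumption: one latent class per attribute mastery pattern

if (requireNamespace("variationalDCM", quietly = TRUE)) {
  library(variationalDCM)
  Q <- sim_Q_J80K5
  sim_data <- dina_data_gen(Q = Q, I = 200)
  # Explicit values equal to the documented defaults:
  # a flat Dirichlet prior and Beta(1, 1) priors for slip and guessing.
  res <- variationalDCM(
    X = sim_data$X, Q = Q, model = "dina",
    delta_0 = rep(1, L),
    alpha_s = 1, beta_s = 1,
    alpha_g = 1, beta_g = 1,
    verbose = FALSE
  )
  summary(res)
}
```

Larger values of alpha_s and beta_s relative to 1 would make the prior on the slip parameter more informative; the same holds for alpha_g and beta_g and the guessing parameter.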
DINO model
The DINO model has the same model parameters and hyperparameters as the DINA model; see the DINA model section above for details.
MC-DINA model
The MC-DINA model has additional arguments
delta_0 and a_0. a_0 corresponds to the positive hyperparameters
\mathbf{a}_{jc^\prime}^0 for all j and c^\prime. By default a_0 is NULL, in which case
all of its elements are set to 1.
Saturated DCM
The saturated DCM is a generalized model, like
the G-DINA and GDM. In the saturated DCM, we have hyperparameters
\mathbf{A}^0 and \mathbf{B}^0 in addition to
\boldsymbol{\delta}^0, which can be specified as arguments A_0
and B_0. By default they are NULL, in which case we
set weakly informative priors.
HM-DCM
When model is specified as "hm_dcm", users
have additional arguments nondecreasing_attribute,
measurement_model, random_block_design, Test_versions,
Test_order, random_start, A_0, B_0,
delta_0, and omega_0. Users can accommodate the
nondecreasing attribute constraint, which represents the assumption that
mastered attributes are not forgotten, by setting the logical valued
argument nondecreasing_attribute as TRUE (default:
FALSE). Users can also control the measurement model by specifying
measurement_model (default: "general"), and the current
version can deal with the HM-general DCM ("general") and HM-DINA
("dina") models. This function can also handle the datasets
collected by a random block design by specifying the logical valued
argument random_block_design (default: FALSE). When it is
specified as TRUE, users must enter Test_versions and
Test_order. Test_versions is an argument indicating which
version of the test each respondent has been assigned to based on a random
block design, while Test_order indicates the sequence in which items
are rearranged based on the random block design. A_0, B_0,
delta_0, and omega_0 correspond to hyperparameters
\mathbf{A}^0, \mathbf{B}^0, \boldsymbol{\delta}^0, and
\boldsymbol{\Omega}^0. \boldsymbol{\Omega}^0 is a matrix of nonnegative
hyperparameters of the Dirichlet distributions for the attribute transition
probabilities. By default omega_0 is NULL, in which case
we set \boldsymbol{\Omega}^0=\mathbf{1}_L\mathbf{1}_L^\top.
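A hedged sketch of an HM-DCM fit that uses two of the arguments described above is given below. It reuses the data generation example for `hm_dcm_data_gen()` and assumes that the simulated data are suitable for fitting under the nondecreasing attribute constraint, which this manual does not confirm. The remaining hyperparameters are left at their NULL defaults, and the fit only runs when the package is installed.

```r
indT <- 3  # number of time points

if (requireNamespace("variationalDCM", quietly = TRUE)) {
  library(variationalDCM)
  # One Q-matrix per time point, as in the hm_dcm_data_gen() example.
  hm_sim_Q <- lapply(seq_len(indT), function(time_point) sim_Q_J30K3)
  hm_sim_data <- hm_dcm_data_gen(Q = hm_sim_Q, I = 200)
  res_hm <- variationalDCM(
    X = hm_sim_data$X, Q = hm_sim_Q, model = "hm_dcm",
    nondecreasing_attribute = TRUE,  # mastered attributes are not forgotten
    measurement_model = "dina",      # HM-DINA instead of the default "general"
    verbose = FALSE
  )
  summary(res_hm)
}
```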
References
Yamaguchi, K., & Okada, K. (2020). Variational Bayes inference for the DINA model. Journal of Educational and Behavioral Statistics, 45(5), 569–597. doi:10.3102/1076998620911934
Yamaguchi, K. (2020). Variational Bayesian inference for the multiple-choice DINA model. Behaviormetrika, 47(1), 159–187. doi:10.1007/s41237-020-00104-w
Yamaguchi, K., & Okada, K. (2020). Variational Bayes Inference Algorithm for the Saturated Diagnostic Classification Model. Psychometrika, 85(4), 973–995. doi:10.1007/s11336-020-09739-w
Yamaguchi, K., & Martinez, A. J. (2024). Variational Bayes inference for hidden Markov diagnostic classification models. British Journal of Mathematical and Statistical Psychology, 77(1), 55–79. doi:10.1111/bmsp.12308
Examples
# fit the DINA model
Q = sim_Q_J80K5
sim_data = dina_data_gen(Q=Q,I=200)
res = variationalDCM(X=sim_data$X, Q=Q, model="dina")
summary(res)